![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/legal-nlp/06.0.Relation_Extraction.ipynb)

#🎬 Installation

In [None]:
! pip install -q johnsnowlabs

##🔗 Automatic Installation
Using my.johnsnowlabs.com SSO

In [None]:
from johnsnowlabs import nlp, legal, viz

# nlp.install(force_browser=True)

##🔗 Manual downloading
If you are not registered at my.johnsnowlabs.com, received your license by email, or are using Safari, you may need to upload the license manually.

- Go to my.johnsnowlabs.com
- Download your license
- Upload it using the following command

In [None]:
from google.colab import files
print('Please Upload your John Snow Labs License using the button below')
license_keys = files.upload()

- Install it

In [None]:
nlp.install()

#📌 Starting

In [None]:
spark = nlp.start()

#🔎 Legal Relation Extraction (RE) and Zero-shot Relation Extraction

📚Legal relation extraction is a task in natural language processing (NLP) that involves extracting relationships between entities in legal documents. These relationships can be between people, organizations, or legal concepts.

📚Legal relation extraction is useful for a variety of purposes, including legal research, contract analysis, and legal case management. For example, legal relation extraction can be used to identify relationships between parties in a contract, such as the buyer and seller, or to extract clauses in a contract that outline certain obligations or rights.

##✔️ Pretrained Relation Extraction Models and Pipelines for Legal

Here is the list of pretrained Relation Extraction models and pipelines:

📚**Relation Extraction Models**

|index|model|
|-----:|:-----|
| 1| [Legal Relation Extraction (Parties, Alias, Dates, Document Type) (Small, Bidirectional)](https://nlp.johnsnowlabs.com/2022/08/12/legre_contract_doc_parties_en_3_2.html)  | 
| 2| [Legal Relation Extraction (Parties, Alias, Dates, Document Type) (Medium, Unidirectional)](https://nlp.johnsnowlabs.com/2022/11/02/legre_contract_doc_parties_md_en.html)  | 
| 3| [Legal Relation Extraction (Alias)](https://nlp.johnsnowlabs.com/2022/08/17/legre_org_prod_alias_en_3_2.html)  |
| 4| [Legal Relation Extraction (Whereas) (Small, Bidirectional)](https://nlp.johnsnowlabs.com/2022/08/24/legre_whereas_en.html)  | 
| 5| [Legal Relation Extraction (Whereas) (Medium, Unidirectional)](https://nlp.johnsnowlabs.com/2022/11/09/legre_whereas_md_en.html)  | 
| 6| [Legal Relation Extraction (Indemnification) (Small, Bidirectional)](https://nlp.johnsnowlabs.com/2022/09/28/legre_indemnifications_en.html)  |
| 7| [Legal Relation Extraction (Indemnification) (Medium, Unidirectional)](https://nlp.johnsnowlabs.com/2022/11/09/legre_indemnifications_md_en.html)  | 
| 8| [Legal Relation Extraction (Confidentiality) (Small, Bidirectional)](https://nlp.johnsnowlabs.com/2022/10/18/legre_confidentiality_en.html)  |
| 9| [Legal Relation Extraction (Confidentiality) (Medium, Unidirectional)](https://nlp.johnsnowlabs.com/2022/11/09/legre_confidentiality_md_en.html)  |
| 10| [Legal Relation Extraction (Warranty)](https://nlp.johnsnowlabs.com/2022/10/19/legre_warranty_en.html)  |
| 11| [Legal Relation Extraction (Grants) (Medium, Unidirectional)](https://nlp.johnsnowlabs.com/2022/11/09/legre_grants_md_en.html)  |
| 12| [Legal Relation Extraction (Obligations) (Medium, Unidirectional)](https://nlp.johnsnowlabs.com/2022/11/03/legre_obligations_md_en.html)  |
| 13| [Legal Relation Extraction (Notice Clause)](https://nlp.johnsnowlabs.com/2022/12/17/legre_notice_clause_xs_en.html)  |
| 14| [Legal Zero-shot Relation Extraction](https://nlp.johnsnowlabs.com/2022/08/22/legre_zero_shot_en_3_2.html)  |
| 15| [Pretrained Pipeline(Whereas)](https://nlp.johnsnowlabs.com/2022/08/24/legpipe_whereas_en.html)  |


##🔎 NER and Relation Extraction
On its own, NER only extracts isolated entities. However, you can combine a NER model with a Relation Extraction annotator trained for the same entities to find out whether those entities are related to each other.

Let's suppose we want to extract information about **PARTIES**, **ALIAS**, **DATES** and **DOCUMENT_TYPES**. If we don't know where that information is in the document, we can use Text Classifiers to find it.

First, we will download a sample dataset and work with it throughout the notebook.

In [None]:
! wget -q https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/legal-nlp/data/intellectual_property_agreement.txt

In [None]:
with open('intellectual_property_agreement.txt', 'r') as f:
  agreement = f.read()
print(agreement[:1500])

Exhibit 10.2

Execution Version

INTELLECTUAL PROPERTY AGREEMENT

This INTELLECTUAL PROPERTY AGREEMENT (this "Agreement"), dated as of December 31, 2018 (the "Effective Date") is entered into by and between Armstrong Flooring, Inc., a Delaware corporation ("Seller") and AFI Licensing LLC, a Delaware limited liability company ("Licensing" and together with Seller, "Arizona") and AHF Holding, Inc. (formerly known as Tarzan HoldCo, Inc.), a Delaware corporation ("Buyer") and Armstrong Hardwood Flooring Company, a Tennessee corporation (the "Company" and together with Buyer the "Buyer Entities") (each of Arizona on the one hand and the Buyer Entities on the other hand, a "Party" and collectively, the "Parties").

WHEREAS, Seller and Buyer have entered into that certain Stock Purchase Agreement, dated November 14, 2018 (the "Stock Purchase Agreement"); WHEREAS, pursuant to the Stock Purchase Agreement, Seller has agreed to sell and transfer, and Buyer has agreed to purchase and acquire, all

📜We have many classification models for finding relevant pages or clauses. You can find them in our [Models Hub](https://nlp.johnsnowlabs.com/models).

Why use them? Because we don't need to run the pretrained models over the whole document, only over the parts that matter.

First, we will split the document into pages or paragraphs.

Here, we get the paragraphs from the entire agreement. As you can see above, paragraphs are separated by `\n\n` in the agreement, so we use the `setCustomBounds` parameter of `TextSplitter`.

In [None]:
document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

text_splitter = legal.TextSplitter() \
    .setInputCols(["document"]) \
    .setOutputCol("paragraphs")\
    .setCustomBounds(["\n\n"])\
    .setUseCustomBoundsOnly(True)\
    .setExplodeSentences(True)

nlp_pipeline = nlp.Pipeline(stages=[
    document_assembler,
    text_splitter])

empty_df = spark.createDataFrame([[""]]).toDF("text")

model = nlp_pipeline.fit(empty_df)

light_model = nlp.LightPipeline(model)


In [None]:
result = light_model.annotate(agreement)

paragraphs = result['paragraphs']

paragraphs[:10]

['Exhibit 10.2',
 'Execution Version',
 'INTELLECTUAL PROPERTY AGREEMENT',
 'This INTELLECTUAL PROPERTY AGREEMENT (this "Agreement"), dated as of December 31, 2018 (the "Effective Date") is entered into by and between Armstrong Flooring, Inc., a Delaware corporation ("Seller") and AFI Licensing LLC, a Delaware limited liability company ("Licensing" and together with Seller, "Arizona") and AHF Holding, Inc. (formerly known as Tarzan HoldCo, Inc.), a Delaware corporation ("Buyer") and Armstrong Hardwood Flooring Company, a Tennessee corporation (the "Company" and together with Buyer the "Buyer Entities") (each of Arizona on the one hand and the Buyer Entities on the other hand, a "Party" and collectively, the "Parties").',
 'WHEREAS, Seller and Buyer have entered into that certain Stock Purchase Agreement, dated November 14, 2018 (the "Stock Purchase Agreement"); WHEREAS, pursuant to the Stock Purchase Agreement, Seller has agreed to sell and transfer, and Buyer has agreed to purchase an

In [None]:
len(paragraphs)

171
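As a quick cross-check on the splitter output, the same blank-line rule can be approximated in plain Python. This is only a sketch: `split_paragraphs` is a hypothetical helper, not part of Spark NLP, and it may not reproduce `TextSplitter` exactly (for example, in how whitespace is trimmed).

```python
def split_paragraphs(text):
    """Split text on blank lines and drop empty pieces (hypothetical helper)."""
    pieces = (piece.strip() for piece in text.split("\n\n"))
    return [piece for piece in pieces if piece]


# First lines of the agreement, as printed earlier
sample = "Exhibit 10.2\n\nExecution Version\n\nINTELLECTUAL PROPERTY AGREEMENT\n\n"
print(split_paragraphs(sample))
```

Running it on the full `agreement` string and comparing lengths with `len(paragraphs)` is a cheap way to spot unexpected splits.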

In [None]:
candidates = [paragraphs[3]]

candidates

['This INTELLECTUAL PROPERTY AGREEMENT (this "Agreement"), dated as of December 31, 2018 (the "Effective Date") is entered into by and between Armstrong Flooring, Inc., a Delaware corporation ("Seller") and AFI Licensing LLC, a Delaware limited liability company ("Licensing" and together with Seller, "Arizona") and AHF Holding, Inc. (formerly known as Tarzan HoldCo, Inc.), a Delaware corporation ("Buyer") and Armstrong Hardwood Flooring Company, a Tennessee corporation (the "Company" and together with Buyer the "Buyer Entities") (each of Arizona on the one hand and the Buyer Entities on the other hand, a "Party" and collectively, the "Parties").']

##✔️ Using Text Classification to Find Relevant Parts of the Document

In this case, we know that the fourth paragraph (`paragraphs[3]`) is the introduction of the agreement. However, let's suppose we don't know that. In that case, we can use Clause Classification.

To detect the introduction of an agreement, we have a specific model called `legclf_introduction_clause_cuad`.

In [None]:
# Text Classifier

def generic_clf_pipeline(model_name):

  """This pipeline allows you to use different classification models to understand if an input text is of a specific class or is something else.
  It will be used to check where the introduction of the agreement is, where the WHEREAS clauses are, and so on."""

  document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

  embeddings = nlp.BertSentenceEmbeddings.pretrained("sent_bert_base_cased", "en")\
    .setInputCols("document") \
    .setOutputCol("sentence_embeddings")

  doc_classifier = legal.ClassifierDLModel.pretrained(model_name, "en", "legal/models")\
    .setInputCols(["sentence_embeddings"])\
    .setOutputCol("category")

  pipeline = nlp.Pipeline(stages=[
    document_assembler, 
    embeddings,
    doc_classifier
  ])

  empty_df = spark.createDataFrame([[""]]).toDF("text")

  model = pipeline.fit(empty_df)

  return model


In [None]:
model_name = "legclf_introduction_clause_cuad"

model = generic_clf_pipeline(model_name)

df = spark.createDataFrame([candidates]).toDF("text")

result = model.transform(df)

sent_bert_base_cased download started this may take some time.
Approximate size to download 389.1 MB
[OK!]
legclf_introduction_clause_cuad download started this may take some time.
[OK!]


In [None]:
result.select('category.result').show()

+--------------+
|        result|
+--------------+
|[introduction]|
+--------------+



Confirmed: the fourth paragraph is the introduction of the agreement!
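When the position of the introduction is not known in advance, every paragraph has to be classified, which means one DataFrame row per paragraph. Note that `spark.createDataFrame([candidates])` builds a single row, so it works above only because `candidates` holds exactly one paragraph. The sketch below shows the row layout and how matching positions could be picked out afterwards. Both helpers are hypothetical, the demo sentence is invented, and the labels in `predicted_demo` are placeholders (including the assumed `other` label), not real model output.

```python
def to_rows(paragraphs):
    """Wrap each paragraph in its own list so Spark creates one row per paragraph."""
    return [[p] for p in paragraphs]


def indices_with_label(predicted, label):
    """Return the positions whose predicted label list contains `label`."""
    return [i for i, labels in enumerate(predicted) if label in labels]


paragraphs_demo = [
    "Exhibit 10.2",
    "Execution Version",
    "INTELLECTUAL PROPERTY AGREEMENT",
    "This AGREEMENT is entered into by and between the Parties.",  # invented demo text
]
rows = to_rows(paragraphs_demo)

# Placeholder labels standing in for the collected 'category.result' column
predicted_demo = [["other"], ["other"], ["other"], ["introduction"]]
print(indices_with_label(predicted_demo, "introduction"))
```

In the notebook, the rows would be passed on as `spark.createDataFrame(to_rows(paragraphs)).toDF("text")` before calling `model.transform`.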

##📌 Extract Relations Between Parties in an Agreement

NER is the main component for information extraction: it extracts the entities from the text.

This time we will use the `legner_contract_doc_parties` model, which is trained to extract many entities from contracts.

After that, we will extract the relations between these entities using `legre_contract_doc_parties` model.



In [None]:
# Relation Extraction Pipeline Function

def generic_re_pipeline(ner_model, re_model):

  """This pipeline allows you to get relations between the entities."""

  document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

  text_splitter = legal.TextSplitter()\
      .setInputCols(["document"])\
      .setOutputCol("sentence")

  tokenizer = nlp.Tokenizer()\
      .setInputCols(["sentence"])\
      .setOutputCol("token")

  embeddings = nlp.RoBertaEmbeddings.pretrained("roberta_embeddings_legal_roberta_base", "en") \
      .setInputCols("sentence", "token") \
      .setOutputCol("embeddings")\
      .setMaxSentenceLength(512)

  ner_model = legal.NerModel.pretrained(ner_model, "en", "legal/models")\
      .setInputCols(["sentence", "token", "embeddings"])\
      .setOutputCol("ner")

  ner_converter = nlp.NerConverter()\
      .setInputCols(["sentence","token","ner"])\
      .setOutputCol("ner_chunk")

  """
  ONLY NEEDED IF YOU WANT TO FILTER RELATION PAIRS OR SYNTACTIC DISTANCE
  pos_tagger = nlp.PerceptronModel().pretrained() \
      .setInputCols(["document", "token"])\
      .setOutputCol("pos_tags")

  dependency_parser = nlp.DependencyParserModel() \
      .pretrained("dependency_conllu", "en") \
      .setInputCols(["document", "pos_tags", "token"]) \
      .setOutputCol("dependencies")

  Set a filter on pairs of named entities which will be treated as relation candidates
  re_filter = legal.RENerChunksFilter()\
      .setInputCols(["ner_chunk", "dependencies"])\
      .setOutputCol("re_ner_chunks")\
      .setMaxSyntacticDistance(7)\
      .setRelationPairs(['PARTY-ALIAS', 'DOC-PARTY', 'DOC-EFFDATE'])
  """
  re_model = legal.RelationExtractionDLModel.pretrained(re_model, "en", "legal/models")\
      .setPredictionThreshold(0.1)\
      .setInputCols(["ner_chunk", "sentence"])\
      .setOutputCol("relations")

  pipeline = nlp.Pipeline(stages=[
          document_assembler,
          text_splitter,
          tokenizer,
          embeddings,
          ner_model,
          ner_converter,
          re_model
          ])
  empty_df = spark.createDataFrame([[""]]).toDF("text")

  model = pipeline.fit(empty_df)

  return model

In [None]:
# Create Generic Function to Show Relations in Dataframe

import pandas as pd
def get_relations_df (results, col='relations'):
    rel_pairs=[]
    for i in range(len(results)):
        for rel in results[i][col]:
            rel_pairs.append((
              rel.result, 
              rel.metadata['entity1'], 
              rel.metadata['entity1_begin'],
              rel.metadata['entity1_end'],
              rel.metadata['chunk1'], 
              rel.metadata['entity2'],
              rel.metadata['entity2_begin'],
              rel.metadata['entity2_end'],
              rel.metadata['chunk2'], 
              rel.metadata['confidence']
          ))
    rel_df = pd.DataFrame(rel_pairs, columns=['relation','entity1','entity1_begin','entity1_end','chunk1','entity2','entity2_begin','entity2_end','chunk2', 'confidence'])
    return rel_df

📜As an output, you will get the relations linking the different concepts together, if such relation exists. The list of relations is:

- **dated_as**: A document has an effective date
- **has_alias**: The alias of a party all along the document
- **has_collective_alias**: An alias held by several parties at the same time
- **signed_by**: Between a party and the document they signed

In [None]:
ner_model = "legner_contract_doc_parties"

re_model = "legre_contract_doc_parties"

model = generic_re_pipeline(ner_model, re_model)

light_model = nlp.LightPipeline(model)

result = light_model.fullAnnotate(candidates)


roberta_embeddings_legal_roberta_base download started this may take some time.
Approximate size to download 447.2 MB
[OK!]
legner_contract_doc_parties download started this may take some time.
[OK!]
legre_contract_doc_parties download started this may take some time.
[OK!]


In [None]:
# Recognized entities
pd.DataFrame([(x.result, x.metadata["entity"]) for x in result[0]["ner_chunk"]], columns=["text", "ner"])

Unnamed: 0,text,ner
0,INTELLECTUAL PROPERTY AGREEMENT,DOC
1,"December 31, 2018",EFFDATE
2,"Armstrong Flooring, Inc",PARTY
3,Seller,ALIAS
4,AFI Licensing LLC,PARTY
5,Licensing,ALIAS
6,Seller,ALIAS
7,Arizona,ALIAS
8,"AHF Holding, Inc",PARTY
9,Buyer,ALIAS


In [None]:
rel_df = get_relations_df(result)

rel_df[rel_df["relation"] != "no_rel"]

Unnamed: 0,relation,entity1,entity1_begin,entity1_end,chunk1,entity2,entity2_begin,entity2_end,chunk2,confidence
0,dated_as,DOC,5,35,INTELLECTUAL PROPERTY AGREEMENT,EFFDATE,69,85,"December 31, 2018",0.9856822
1,signed_by,DOC,5,35,INTELLECTUAL PROPERTY AGREEMENT,PARTY,141,163,"Armstrong Flooring, Inc",0.78165114
3,signed_by,DOC,5,35,INTELLECTUAL PROPERTY AGREEMENT,PARTY,205,221,AFI Licensing LLC,0.5352147
15,has_alias,PARTY,141,163,"Armstrong Flooring, Inc",ALIAS,192,197,Seller,0.89620024
26,has_alias,PARTY,205,221,AFI Licensing LLC,ALIAS,263,271,Licensing,0.9518907
33,has_collective_alias,ALIAS,292,297,Seller,ALIAS,301,307,Arizona,0.8934925
42,has_alias,PARTY,411,445,Armstrong Hardwood Flooring Company,ALIAS,478,484,Company,0.98353046
51,has_collective_alias,ALIAS,505,509,Buyer,ALIAS,516,529,Buyer Entities,0.72171456
56,has_collective_alias,ALIAS,611,615,Party,ALIAS,641,647,Parties,0.5040901
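The `entity1_begin`/`entity1_end` and `entity2_begin`/`entity2_end` columns appear to be character offsets into the paragraph with an inclusive end, so recovering a chunk by slicing needs `end + 1`. The sketch below checks this against the first row of the table; `chunk_at` is a hypothetical helper, and the `int()` casts are there on the assumption that offsets read from annotation metadata may arrive as strings.

```python
def chunk_at(text, begin, end):
    """Return the substring for inclusive character offsets [begin, end]."""
    return text[int(begin):int(end) + 1]


# Opening of the introduction paragraph shown earlier
intro = ('This INTELLECTUAL PROPERTY AGREEMENT (this "Agreement"), '
         'dated as of December 31, 2018 (the "Effective Date")')

print(chunk_at(intro, 5, 35))
print(chunk_at(intro, "69", "85"))
```

This kind of check is useful when highlighting chunks in the original text or joining relations back to other annotations.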


###✔️ Visualization of Extracted Relations

In [None]:
# from sparknlp_display import RelationExtractionVisualizer

re_vis = viz.RelationExtractionVisualizer()

re_vis.display(result = result[0],
           relation_col = "relations",
           document_col = "document",
           exclude_relations = ["no_rel"],
           show_relations=True
           )

###✔️ Get Relations with Unidirectional Model

Now, let's extract the same relations with a unidirectional REDL model. Unidirectional means that the model returns the left side of the relation (the source) in `chunk1` and the right side (the target) in `chunk2`. For this, we will use the `legre_contract_doc_parties_md` model.

In [None]:
ner_model = "legner_contract_doc_parties"

re_model = "legre_contract_doc_parties_md"

model = generic_re_pipeline(ner_model, re_model)

light_model = nlp.LightPipeline(model)

result = light_model.fullAnnotate(candidates)

roberta_embeddings_legal_roberta_base download started this may take some time.
Approximate size to download 447.2 MB
[OK!]
legner_contract_doc_parties download started this may take some time.
[OK!]
legre_contract_doc_parties_md download started this may take some time.
[OK!]


In [None]:
rel_df = get_relations_df(result)

rel_df[rel_df["relation"] != "other"]

Unnamed: 0,relation,entity1,entity1_begin,entity1_end,chunk1,entity2,entity2_begin,entity2_end,chunk2,confidence
0,dated_as,DOC,5,35,INTELLECTUAL PROPERTY AGREEMENT,EFFDATE,69,85,"December 31, 2018",0.9999635
1,signed_by,DOC,5,35,INTELLECTUAL PROPERTY AGREEMENT,PARTY,141,163,"Armstrong Flooring, Inc",0.9994797
2,signed_by,DOC,5,35,INTELLECTUAL PROPERTY AGREEMENT,ALIAS,192,197,Seller,0.98703974
3,signed_by,DOC,5,35,INTELLECTUAL PROPERTY AGREEMENT,PARTY,205,221,AFI Licensing LLC,0.99916875
4,signed_by,DOC,5,35,INTELLECTUAL PROPERTY AGREEMENT,ALIAS,263,271,Licensing,0.94065416
5,signed_by,DOC,5,35,INTELLECTUAL PROPERTY AGREEMENT,ALIAS,292,297,Seller,0.9914723
6,signed_by,DOC,5,35,INTELLECTUAL PROPERTY AGREEMENT,ALIAS,301,307,Arizona,0.9941164
7,signed_by,DOC,5,35,INTELLECTUAL PROPERTY AGREEMENT,PARTY,315,330,"AHF Holding, Inc",0.99897385
8,dated_as,EFFDATE,69,85,"December 31, 2018",PARTY,141,163,"Armstrong Flooring, Inc",0.7296508
9,dated_as,ALIAS,192,197,Seller,EFFDATE,69,85,"December 31, 2018",0.7790136


In [None]:
# from sparknlp_display import RelationExtractionVisualizer

re_vis = viz.RelationExtractionVisualizer()

re_vis.display(result = result[0],
           relation_col = "relations",
           document_col = "document",
           exclude_relations = ["other"],
           show_relations=True
           )

##✔️ Relation Extraction Model to Infer Relations Between Elements in WHEREAS Clauses

The "whereas" clause is often used in legal documents to provide background information or to set the stage for the document. It is typically used at the beginning of the document and is followed by one or more "therefore" clauses, which outline the actions or decisions that are being taken based on the information provided in the "whereas" clause.

The "whereas" clause is used to provide context and to establish a foundation for the subsequent provisions of the document. It is often used to describe the circumstances that have led to the creation of the document, or to provide other relevant information that is necessary to understand the purpose of the document.

In general, the "whereas" clause is used to provide a clear and concise explanation of the reasons behind the actions or decisions that are being taken in the document. It is an important part of many legal documents and is often used to establish a clear and logical chain of reasoning.

###📌 Firstly, we will get the `whereas` clauses
Let's choose one WHEREAS clause from the agreement.


In [None]:
candidates = paragraphs[4:10]

candidates

['WHEREAS, Seller and Buyer have entered into that certain Stock Purchase Agreement, dated November 14, 2018 (the "Stock Purchase Agreement"); WHEREAS, pursuant to the Stock Purchase Agreement, Seller has agreed to sell and transfer, and Buyer has agreed to purchase and acquire, all of Seller\'s right, title and interest in and to Armstrong Wood Products, Inc., a Delaware corporation ("AWP") and its Subsidiaries, the Company and HomerWood Hardwood Flooring Company, a Delaware corporation ("HHFC," and together with the Company, the "Company Subsidiaries" and together with AWP, the "Company Entities" and each a "Company Entity") by way of a purchase by Buyer and sale by Seller of the Shares, all upon the terms and condition set forth therein;',
 "WHEREAS, Arizona owns certain Copyrights, Know-How, Patents and Trademarks which may be used in the Company Field, and in connection with the transactions contemplated by the Stock Purchase Agreement the Company desires to acquire all of Arizona

In [None]:
candidates = [paragraphs[9]]

candidates

['WHEREAS, the Company Entities own certain Copyrights and Know-How which may be used in the Arizona Field, and in connection with the transactions contemplated by the Stock Purchase Agreement, Arizona desires to obtain a license from the Company Entities to use such Intellectual Property on the terms and subject to the conditions set forth herein.']

Previously, we used the `legclf_introduction_clause_cuad` model to identify the introduction of the agreement. Here we will use the `legclf_cuad_whereas_clause` model to identify WHEREAS clauses.

In [None]:
model_name = "legclf_cuad_whereas_clause"

model = generic_clf_pipeline(model_name)

sent_bert_base_cased download started this may take some time.
Approximate size to download 389.1 MB
[OK!]
legclf_cuad_whereas_clause download started this may take some time.
[OK!]


In [None]:
df = spark.createDataFrame([candidates]).toDF("text")

result = model.transform(df)

In [None]:
result.select('category.result').show()

+---------+
|   result|
+---------+
|[whereas]|
+---------+



📚Now, we will get the relations between elements in **WHEREAS** clauses, more specifically the **SUBJECT**, the **ACTION** and the **OBJECT**. First, we will extract these entities with the `legner_whereas_md` model; then we will extract the relations between them with the `legre_whereas` model. Two relations are possible: **has_subject** and **has_object**.

In [None]:
ner_model = "legner_whereas_md"

re_model = "legre_whereas"

model = generic_re_pipeline(ner_model, re_model)

light_model = nlp.LightPipeline(model)

result = light_model.fullAnnotate(candidates)

roberta_embeddings_legal_roberta_base download started this may take some time.
Approximate size to download 447.2 MB
[OK!]
legner_whereas_md download started this may take some time.
[OK!]
legre_whereas download started this may take some time.
[OK!]


In [None]:
## Recognized entities
pd.DataFrame([(x.result, x.metadata["entity"]) for x in result[0]["ner_chunk"]], columns=["text", "ner"])

Unnamed: 0,text,ner
0,Arizona,WHEREAS_SUBJECT
1,desires to obtain,WHEREAS_ACTION
2,a license from the Company Entities,WHEREAS_OBJECT


In [None]:
rel_df = get_relations_df(result)

rel_df[rel_df["relation"] != "no_rel"]

Unnamed: 0,relation,entity1,entity1_begin,entity1_end,chunk1,entity2,entity2_begin,entity2_end,chunk2,confidence
0,has_subject,WHEREAS_SUBJECT,192,198,Arizona,WHEREAS_ACTION,200,216,desires to obtain,0.6815935
1,has_subject,WHEREAS_SUBJECT,192,198,Arizona,WHEREAS_OBJECT,218,252,a license from the Company Entities,0.6681112
2,has_object,WHEREAS_ACTION,200,216,desires to obtain,WHEREAS_OBJECT,218,252,a license from the Company Entities,0.93285346


In [None]:
# from sparknlp_display import RelationExtractionVisualizer

re_vis = viz.RelationExtractionVisualizer()

re_vis.display(result = result[0],
           relation_col = "relations",
           document_col = "document",
           exclude_relations = ["no_rel"],
           show_relations=True
           )

###✔️ Get Relations with Unidirectional Model

In [None]:
ner_model = "legner_whereas_md"

re_model = "legre_whereas_md"

model = generic_re_pipeline(ner_model, re_model)

light_model = nlp.LightPipeline(model)

result = light_model.fullAnnotate(candidates)

roberta_embeddings_legal_roberta_base download started this may take some time.
Approximate size to download 447.2 MB
[OK!]
legner_whereas_md download started this may take some time.
[OK!]
legre_whereas_md download started this may take some time.
[OK!]


In [None]:
rel_df = get_relations_df(result)

rel_df[rel_df["relation"] != "other"]

Unnamed: 0,relation,entity1,entity1_begin,entity1_end,chunk1,entity2,entity2_begin,entity2_end,chunk2,confidence
0,has_subject,WHEREAS_ACTION,200,216,desires to obtain,WHEREAS_SUBJECT,192,198,Arizona,0.9947596
1,has_subject,WHEREAS_OBJECT,218,252,a license from the Company Entities,WHEREAS_SUBJECT,192,198,Arizona,0.9687109
2,has_object,WHEREAS_ACTION,200,216,desires to obtain,WHEREAS_OBJECT,218,252,a license from the Company Entities,0.67824775
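Because the unidirectional model places the action in `chunk1` for both relation types, the rows above can be folded into (subject, action, object) triples. The sketch below is one way to do that; `build_triples` is a hypothetical helper, and it keys on the action text, so two identical action phrases in one clause would be merged.

```python
def build_triples(rows):
    """Group has_subject / has_object rows by their ACTION chunk into triples."""
    by_action = {}
    for row in rows:
        if row["entity1"] != "WHEREAS_ACTION":
            continue  # only rows whose source is an action contribute to a triple
        slot = by_action.setdefault(row["chunk1"], {"subject": None, "object": None})
        if row["relation"] == "has_subject":
            slot["subject"] = row["chunk2"]
        elif row["relation"] == "has_object":
            slot["object"] = row["chunk2"]
    return [(s["subject"], action, s["object"]) for action, s in by_action.items()]


# Rows copied from the table above (offsets and confidence omitted)
rows = [
    {"relation": "has_subject", "entity1": "WHEREAS_ACTION", "chunk1": "desires to obtain",
     "entity2": "WHEREAS_SUBJECT", "chunk2": "Arizona"},
    {"relation": "has_subject", "entity1": "WHEREAS_OBJECT", "chunk1": "a license from the Company Entities",
     "entity2": "WHEREAS_SUBJECT", "chunk2": "Arizona"},
    {"relation": "has_object", "entity1": "WHEREAS_ACTION", "chunk1": "desires to obtain",
     "entity2": "WHEREAS_OBJECT", "chunk2": "a license from the Company Entities"},
]

print(build_triples(rows))
```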


###📚 Visualization of Extracted Relations

In [None]:
# from sparknlp_display import RelationExtractionVisualizer

re_vis = viz.RelationExtractionVisualizer()

re_vis.display(result = result[0],
           relation_col = "relations",
           document_col = "document",
           exclude_relations = ["other"],
           show_relations=True
           )
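The two WHEREAS models list the same chunks in a different order: `legre_whereas` returned the subject in `chunk1` for `has_subject`, while `legre_whereas_md` returned it in `chunk2`. To compare their outputs regardless of direction, each relation can be reduced to an order-insensitive key. This is a sketch only; `undirected_key` is a hypothetical helper, and the tuples are copied from the two result tables above.

```python
def undirected_key(relation, chunk1, chunk2):
    """Build a key that ignores which chunk is listed first."""
    return (relation, frozenset((chunk1, chunk2)))


obj = "a license from the Company Entities"

# (relation, chunk1, chunk2) as returned by legre_whereas
small_rows = [
    ("has_subject", "Arizona", "desires to obtain"),
    ("has_subject", "Arizona", obj),
    ("has_object", "desires to obtain", obj),
]

# (relation, chunk1, chunk2) as returned by legre_whereas_md
md_rows = [
    ("has_subject", "desires to obtain", "Arizona"),
    ("has_subject", obj, "Arizona"),
    ("has_object", "desires to obtain", obj),
]

small_keys = {undirected_key(*row) for row in small_rows}
md_keys = {undirected_key(*row) for row in md_rows}
print(small_keys == md_keys)
```

For this clause, the comparison shows that both models link the same pairs of chunks and differ only in direction and confidence.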