Legal NLP 1.14.0 (#287)

* 2023-05-17-legclf_nda_assigments_bert_en (#238)
* Add model 2023-05-17-legclf_nda_assigments_bert_en
* Update 2023-05-17-legclf_nda_assigments_bert_en.md
* Add model 2023-05-17-legclf_nda_perpetuity_bert_en
* Update 2023-05-17-legclf_nda_perpetuity_bert_en.md
* Update 2023-05-17-legclf_nda_perpetuity_bert_en.md
* Add model 2023-05-17-legclf_nda_non_compete_items_bert_en
* Update 2023-05-17-legclf_nda_non_compete_items_bert_en.md
---------
Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-05-29-legqa_flant5_finetuned_en (#278)
* Add model 2023-05-29-legqa_flant5_finetuned_en
* Update 2023-05-29-legqa_flant5_finetuned_en.md
* Update 2023-05-29-legqa_flant5_finetuned_en.md
---------
Co-authored-by: gadde5300 <gadde5300@gmail.com>
Co-authored-by: GADDE SAI SHAILESH <69344247+gadde5300@users.noreply.github.com>
---------
Co-authored-by: jsl-models <74001263+jsl-models@users.noreply.github.com>
Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
Co-authored-by: gadde5300 <gadde5300@gmail.com>
Co-authored-by: GADDE SAI SHAILESH <69344247+gadde5300@users.noreply.github.com>
JohnSnowLabs · May 30, 2023 · 30e8e24
1 parent 9a65615
commit 30e8e24
Showing 4 changed files with 449 additions and 0 deletions.
diff --git a/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_assigments_bert_en.md b/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_assigments_bert_en.md
@@ -0,0 +1,121 @@
+---
+layout: model
+title: Understanding Restriction Level of Assignment Clauses (Bert)
+author: John Snow Labs
+name: legclf_nda_assigments_bert
+date: 2023-05-17
+tags: [en, legal, licensed, bert, nda, classification, assigments, tensorflow]
+task: Text Classification
+language: en
+edition: Legal NLP 1.0.0
+spark_version: 3.0
+supported: true
+engine: tensorflow
+annotator: LegalBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Given a clause classified as `ASSIGNMENT` by the `legmulticlf_mnda_sections_paragraph_other` classifier, you can use the `legclf_nda_assigments_bert` model to subclassify its sentences as `PERMISSIVE_ASSIGNMENT`, `RESTRICTIVE_ASSIGNMENT` or `OTHER`. It is a BERT-based sequence classifier trained with a state-of-the-art (SOTA) approach.
+
+## Predicted Entities
+
+`PERMISSIVE_ASSIGNMENT`, `RESTRICTIVE_ASSIGNMENT`, `OTHER`
+
+{:.btn-box}
+<button class="button button-orange" disabled>Live Demo</button>
+<button class="button button-orange" disabled>Open in Colab</button>
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_nda_assigments_bert_en_1.0.0_3.0_1684350248553.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_nda_assigments_bert_en_1.0.0_3.0_1684350248553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
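+## How to use
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+
+```python
+import pandas as pd
+from johnsnowlabs import nlp, legal  # `spark` below is an active session, e.g. spark = nlp.start()
+document_assembler = nlp.DocumentAssembler()\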
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = nlp.Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_assigments_bert", "en", "legal/models")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")\
+    .setCaseSensitive(True)\
+    .setMaxSentenceLength(512)
+
+clf_pipeline = nlp.Pipeline(stages=[
+    document_assembler,
+    tokenizer,
+    sequence_classifier
+])
+
+# No stage needs training data, so fitting on an empty DataFrame only builds the PipelineModel.
+empty_df = spark.createDataFrame([['']]).toDF("text")
+model = clf_pipeline.fit(empty_df)
+
+text_list = [
+"""This Agreement will be binding upon and inure to the benefit of each Party and its respective heirs, successors and assigns""",
+"""All notices and other communications provided for in this Agreement and the other Loan Documents shall be in writing and may (subject to paragraph (b) below) be telecopied (faxed), mailed by certified mail return receipt requested, or delivered by hand or overnight courier service to the intended recipient at the addresses specified below or at such other address as shall be designated by any party listed below in a notice to the other parties listed below given in accordance with this Section.""",
+"""This Agreement is a personal contract for XCorp, and the rights and interests of XCorp hereunder may not be sold, transferred, assigned, pledged or hypothecated except as otherwise expressly permitted by the Company"""
+]
+
+df = spark.createDataFrame(pd.DataFrame({"text" : text_list}))
+
+result = model.transform(df)
+```
+
+</div>
+
+## Results
+
+```bash
++--------------------------------------------------------------------------------+----------------------+
+|                                                                            text|                 class|
++--------------------------------------------------------------------------------+----------------------+
+|This Agreement will be binding upon and inure to the benefit of each Party an...| PERMISSIVE_ASSIGNMENT|
+|All notices and other communications provided for in this Agreement and the o...|                 OTHER|
+|This Agreement is a personal contract for XCorp, and the rights and interests...|RESTRICTIVE_ASSIGNMENT|
++--------------------------------------------------------------------------------+----------------------+
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|legclf_nda_assigments_bert|
+|Compatibility:|Legal NLP 1.0.0+|
+|License:|Licensed|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|406.4 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+In-house annotations on non-disclosure agreements
+
+## Benchmarking
+
+```bash
+label                   precision  recall  f1-score  support 
+OTHER                   0.98       1.00    0.99      124     
+PERMISSIVE_ASSIGNMENT   1.00       0.93    0.97      15      
+RESTRICTIVE_ASSIGNMENT  1.00       0.96    0.98      25      
+accuracy                -          -       0.99      164     
+macro avg               0.99       0.96    0.98      164     
+weighted avg            0.99       0.99    0.99      164     
+```
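As a sanity check on the benchmark table above, the `macro avg` row can be recomputed as the unweighted mean of the per-class scores. The sketch below is plain Python and uses only the rounded figures printed in the table; the helper name `macro_average` is an illustrative choice, not part of Legal NLP. Because the inputs are already rounded to two decimals, the `weighted avg` row is not reproduced here.

```python
# Per-class scores copied from the benchmark table of legclf_nda_assigments_bert.
per_class = {
    "OTHER": {"precision": 0.98, "recall": 1.00, "f1": 0.99, "support": 124},
    "PERMISSIVE_ASSIGNMENT": {"precision": 1.00, "recall": 0.93, "f1": 0.97, "support": 15},
    "RESTRICTIVE_ASSIGNMENT": {"precision": 1.00, "recall": 0.96, "f1": 0.98, "support": 25},
}


def macro_average(scores, metric):
    """Unweighted mean of one metric over all classes (hypothetical helper)."""
    return sum(row[metric] for row in scores.values()) / len(scores)


# The support column should add up to the total reported in the accuracy row.
total_support = sum(row["support"] for row in per_class.values())

# Round to two decimals, as the table does.
macro = {m: round(macro_average(per_class, m), 2) for m in ("precision", "recall", "f1")}

print(total_support)
print(macro)
```

The macro recall is lower than the weighted figure in the table because the two small assignment classes count as much as the much larger `OTHER` class.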
diff --git a/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_non_compete_items_bert_en.md b/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_non_compete_items_bert_en.md
@@ -0,0 +1,118 @@
+---
+layout: model
+title: Understanding Non-compete Items in Non-Compete Clauses (Bert)
+author: John Snow Labs
+name: legclf_nda_non_compete_items_bert
+date: 2023-05-17
+tags: [en, legal, licensed, bert, classification, nda, non_compete, tensorflow]
+task: Text Classification
+language: en
+edition: Legal NLP 1.0.0
+spark_version: 3.0
+supported: true
+engine: tensorflow
+annotator: LegalBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Given a clause classified as `NON_COMP` by the `legmulticlf_mnda_sections_paragraph_other` classifier, you can use the `legclf_nda_non_compete_items_bert` model to subclassify its sentences as `NON_COMPETE_ITEMS` or `OTHER`. It is a BERT-based sequence classifier trained with a state-of-the-art (SOTA) approach.
+
+## Predicted Entities
+
+`NON_COMPETE_ITEMS`, `OTHER`
+
+{:.btn-box}
+<button class="button button-orange" disabled>Live Demo</button>
+<button class="button button-orange" disabled>Open in Colab</button>
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_nda_non_compete_items_bert_en_1.0.0_3.0_1684358961459.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_nda_non_compete_items_bert_en_1.0.0_3.0_1684358961459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
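+## How to use
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+
+```python
+import pandas as pd
+from johnsnowlabs import nlp, legal  # `spark` below is an active session, e.g. spark = nlp.start()
+document_assembler = nlp.DocumentAssembler()\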
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = nlp.Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_non_compete_items_bert", "en", "legal/models")\
+    .setInputCols(["document", "token"])\
+    .setOutputCol("class")\
+    .setCaseSensitive(True)\
+    .setMaxSentenceLength(512)
+
+clf_pipeline = nlp.Pipeline(stages=[
+    document_assembler,
+    tokenizer,
+    sequence_classifier
+])
+
+# No stage needs training data, so fitting on an empty DataFrame only builds the PipelineModel.
+empty_df = spark.createDataFrame([['']]).toDF("text")
+model = clf_pipeline.fit(empty_df)
+
+text_list = [
+"""This Agreement will be binding upon and inure to the benefit of each Party and its respective heirs, successors and assigns""",
+"""Activity that is in direct competition with the Company's business, including but not limited to developing, marketing, or selling products or services that are similar to those of the Company."""
+]
+
+df = spark.createDataFrame(pd.DataFrame({"text" : text_list}))
+
+result = model.transform(df)
+```
+
+</div>
+
+## Results
+
+```bash
++--------------------------------------------------------------------------------+-----------------+
+|                                                                            text|            class|
++--------------------------------------------------------------------------------+-----------------+
+|This Agreement will be binding upon and inure to the benefit of each Party an...|            OTHER|
+|Activity that is in direct competition with the Company's business, including...|NON_COMPETE_ITEMS|
++--------------------------------------------------------------------------------+-----------------+
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|legclf_nda_non_compete_items_bert|
+|Compatibility:|Legal NLP 1.0.0+|
+|License:|Licensed|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|406.4 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+In-house annotations on non-disclosure agreements
+
+## Benchmarking
+
+```bash
+label              precision  recall  f1-score  support 
+NON_COMPETE_ITEMS  1.00       1.00    1.00      10      
+OTHER              1.00       1.00    1.00      64      
+accuracy           -          -       1.00      74      
+macro avg          1.00       1.00    1.00      74      
+weighted avg       1.00       1.00    1.00      74  
+```
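The `text` column in the Results tables of these model cards is cut at 80 characters, with the last three taken up by `...`. The cards do not show the call that printed the tables; assuming they were printed with PySpark's `DataFrame.show(truncate=80)`, the plain-Python sketch below mimics that shortening rule. The function name `truncate_cell` is hypothetical and only illustrates the behaviour.

```python
def truncate_cell(text: str, width: int = 80) -> str:
    """Shorten a cell the way show(truncate=width) is assumed to do (hypothetical helper)."""
    if width > 0 and len(text) > width:
        # Very small widths leave no room for an ellipsis.
        if width < 4:
            return text[:width]
        # Keep width - 3 characters and mark the cut with three dots.
        return text[: width - 3] + "..."
    return text


# First example clause from the "How to use" snippet above.
clause = (
    "This Agreement will be binding upon and inure to the benefit of each Party "
    "and its respective heirs, successors and assigns"
)

print(truncate_cell(clause))
```

The full clause text is still available in the `result` DataFrame; only the printed view is shortened.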
diff --git a/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_perpetuity_bert_en.md b/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_perpetuity_bert_en.md
@@ -0,0 +1,118 @@
+---
+layout: model
+title: Understanding Perpetuity in "Return of Confidential Information" Clauses (Bert)
+author: John Snow Labs
+name: legclf_nda_perpetuity_bert
+date: 2023-05-17
+tags: [en, legal, licensed, bert, nda, classification, perpetuity, tensorflow]
+task: Text Classification
+language: en
+edition: Legal NLP 1.0.0
+spark_version: 3.0
+supported: true
+engine: tensorflow
+annotator: LegalBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Given a clause classified as `RETURN_OF_CONF_INFO` by the `legmulticlf_mnda_sections_paragraph_other` classifier, you can use the `legclf_nda_perpetuity_bert` model to subclassify its sentences as `PERPETUITY` or `OTHER`. It is a BERT-based sequence classifier trained with a state-of-the-art (SOTA) approach.
+
+## Predicted Entities
+
+`PERPETUITY`, `OTHER`
+
+{:.btn-box}
+<button class="button button-orange" disabled>Live Demo</button>
+<button class="button button-orange" disabled>Open in Colab</button>
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_nda_perpetuity_bert_en_1.0.0_3.0_1684353033843.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_nda_perpetuity_bert_en_1.0.0_3.0_1684353033843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
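+## How to use
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+
+```python
+import pandas as pd
+from johnsnowlabs import nlp, legal  # `spark` below is an active session, e.g. spark = nlp.start()
+document_assembler = nlp.DocumentAssembler()\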
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = nlp.Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_perpetuity_bert", "en", "legal/models")\
+    .setInputCols(["document", "token"])\
+    .setOutputCol("class")\
+    .setCaseSensitive(True)\
+    .setMaxSentenceLength(512)
+
+clf_pipeline = nlp.Pipeline(stages=[
+    document_assembler,
+    tokenizer,
+    sequence_classifier
+])
+
+# No stage needs training data, so fitting on an empty DataFrame only builds the PipelineModel.
+empty_df = spark.createDataFrame([['']]).toDF("text")
+model = clf_pipeline.fit(empty_df)
+
+text_list = [
+"""Notwithstanding the return or destruction of all Evaluation Material, you or your Representatives shall continue to be bound by your obligations of confidentiality and other obligations hereunder.""",
+"""There are no intended third-party beneficiaries to this Agreement."""
+]
+
+df = spark.createDataFrame(pd.DataFrame({"text" : text_list}))
+
+result = model.transform(df)
+```
+
+</div>
+
+## Results
+
+```bash
++--------------------------------------------------------------------------------+----------+
+|                                                                            text|     class|
++--------------------------------------------------------------------------------+----------+
+|Notwithstanding the return or destruction of all Evaluation Material, you or ...|PERPETUITY|
+|              There are no intended third-party beneficiaries to this Agreement.|     OTHER|
++--------------------------------------------------------------------------------+----------+
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|legclf_nda_perpetuity_bert|
+|Compatibility:|Legal NLP 1.0.0+|
+|License:|Licensed|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|406.4 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+In-house annotations on non-disclosure agreements
+
+## Benchmarking
+
+```bash
+label         precision  recall  f1-score  support 
+OTHER         0.98       1.00    0.99      60      
+PERPETUITY    1.00       0.89    0.94      9       
+accuracy      -          -       0.99      69      
+macro avg     0.99       0.94    0.97      69      
+weighted avg  0.99       0.99    0.99      69 
+```
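The three model cards in this commit describe the same two-stage flow: a clause is first labeled by `legmulticlf_mnda_sections_paragraph_other`, and the matching sentence-level model is then applied. A minimal sketch of that routing step in plain Python follows. The label and model names are taken from the cards, while the dictionary `SUBCLASSIFIER_BY_SECTION` and the function `pick_subclassifiers` are hypothetical helpers, not part of Legal NLP.

```python
# Section label -> sentence-level sub-classifier, as described in the model cards.
SUBCLASSIFIER_BY_SECTION = {
    "ASSIGNMENT": "legclf_nda_assigments_bert",
    "NON_COMP": "legclf_nda_non_compete_items_bert",
    "RETURN_OF_CONF_INFO": "legclf_nda_perpetuity_bert",
}


def pick_subclassifiers(section_labels):
    """Return the sub-classifier names that apply to a clause, in input order.

    Labels without a dedicated sub-classifier are skipped.
    """
    return [
        SUBCLASSIFIER_BY_SECTION[label]
        for label in section_labels
        if label in SUBCLASSIFIER_BY_SECTION
    ]


# A clause may carry several section labels; "OTHER" has no sub-classifier here.
print(pick_subclassifiers(["ASSIGNMENT", "OTHER"]))
print(pick_subclassifiers(["RETURN_OF_CONF_INFO", "NON_COMP"]))
```

Each returned name can then be passed to `legal.BertForSequenceClassification.pretrained(name, "en", "legal/models")`, as in the snippets above.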