Legal NLP 1.14.0 (#287)
* 2023-05-17-legclf_nda_assigments_bert_en (#238)

* Add model 2023-05-17-legclf_nda_assigments_bert_en

* Update 2023-05-17-legclf_nda_assigments_bert_en.md

* Add model 2023-05-17-legclf_nda_perpetuity_bert_en

* Update 2023-05-17-legclf_nda_perpetuity_bert_en.md

* Update 2023-05-17-legclf_nda_perpetuity_bert_en.md

* Add model 2023-05-17-legclf_nda_non_compete_items_bert_en

* Update 2023-05-17-legclf_nda_non_compete_items_bert_en.md

---------

Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>

* 2023-05-29-legqa_flant5_finetuned_en (#278)

* Add model 2023-05-29-legqa_flant5_finetuned_en

* Update 2023-05-29-legqa_flant5_finetuned_en.md

* Update 2023-05-29-legqa_flant5_finetuned_en.md

---------

Co-authored-by: gadde5300 <gadde5300@gmail.com>
Co-authored-by: GADDE SAI SHAILESH <69344247+gadde5300@users.noreply.github.com>

---------

Co-authored-by: jsl-models <74001263+jsl-models@users.noreply.github.com>
Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
Co-authored-by: gadde5300 <gadde5300@gmail.com>
Co-authored-by: GADDE SAI SHAILESH <69344247+gadde5300@users.noreply.github.com>
6 people committed May 30, 2023
1 parent 9a65615 commit 30e8e24
Showing 4 changed files with 449 additions and 0 deletions.
121 changes: 121 additions & 0 deletions docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_assigments_bert_en.md
@@ -0,0 +1,121 @@
---
layout: model
title: Understanding Restriction Level of Assignment Clauses (Bert)
author: John Snow Labs
name: legclf_nda_assigments_bert
date: 2023-05-17
tags: [en, legal, licensed, bert, nda, classification, assigments, tensorflow]
task: Text Classification
language: en
edition: Legal NLP 1.0.0
spark_version: 3.0
supported: true
engine: tensorflow
annotator: LegalBertForSequenceClassification
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Given a clause classified as `ASSIGNMENT` by the `legmulticlf_mnda_sections_paragraph_other` classifier, the `legclf_nda_assigments_bert` model subclassifies its sentences as `PERMISSIVE_ASSIGNMENT`, `RESTRICTIVE_ASSIGNMENT`, or `OTHER`. It is a BERT-based sequence classifier (`LegalBertForSequenceClassification`) trained with a state-of-the-art approach.
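
The two-stage flow described here (section classifier first, then a clause-specific sub-classifier) can be sketched as a simple lookup. This is only an illustration, not part of the library: the `SUBCLASSIFIERS` table and the `pick_subclassifier` helper are hypothetical names, and the mapping covers only the three NDA sub-classifiers added in this commit.

```python
from typing import Optional

# Hypothetical routing table (not a library API): section label predicted by
# `legmulticlf_mnda_sections_paragraph_other` -> sub-classifier model name.
SUBCLASSIFIERS = {
    "ASSIGNMENT": "legclf_nda_assigments_bert",
    "NON_COMP": "legclf_nda_non_compete_items_bert",
    "RETURN_OF_CONF_INFO": "legclf_nda_perpetuity_bert",
}


def pick_subclassifier(section_label: str) -> Optional[str]:
    """Return the sub-classifier to run for a section label, or None if there is none."""
    # Strip stray whitespace so that "ASSIGNMENT " still routes correctly.
    return SUBCLASSIFIERS.get(section_label.strip())
```

A clause whose section label has no entry in the table returns `None` and would simply skip the second stage.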

## Predicted Entities

`PERMISSIVE_ASSIGNMENT`, `RESTRICTIVE_ASSIGNMENT`, `OTHER`

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_nda_assigments_bert_en_1.0.0_3.0_1684350248553.zip){:.button.button-orange}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_nda_assigments_bert_en_1.0.0_3.0_1684350248553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}

```python
# Assumes the licensed `johnsnowlabs` library is installed and configured.
from johnsnowlabs import nlp, legal
import pandas as pd

# Start a Spark session with the Legal NLP libraries loaded
spark = nlp.start()

# Convert raw text into Spark NLP documents
document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Split each document into tokens
tokenizer = nlp.Tokenizer()\
    .setInputCols(["document"])\
    .setOutputCol("token")

# Load the pretrained assignment-clause classifier
sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_assigments_bert", "en", "legal/models")\
    .setInputCols(["document", "token"])\
    .setOutputCol("class")\
    .setCaseSensitive(True)\
    .setMaxSentenceLength(512)

clf_pipeline = nlp.Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequence_classifier
])

# All stages are pretrained, so fitting on an empty DataFrame only builds the pipeline model
empty_df = spark.createDataFrame([['']]).toDF("text")

model = clf_pipeline.fit(empty_df)

text_list = [
    """This Agreement will be binding upon and inure to the benefit of each Party and its respective heirs, successors and assigns""",
    """All notices and other communications provided for in this Agreement and the other Loan Documents shall be in writing and may (subject to paragraph (b) below) be telecopied (faxed), mailed by certified mail return receipt requested, or delivered by hand or overnight courier service to the intended recipient at the addresses specified below or at such other address as shall be designated by any party listed below in a notice to the other parties listed below given in accordance with this Section.""",
    """This Agreement is a personal contract for XCorp, and the rights and interests of XCorp hereunder may not be sold, transferred, assigned, pledged or hypothecated except as otherwise expressly permitted by the Company"""
]

df = spark.createDataFrame(pd.DataFrame({"text" : text_list}))

result = model.transform(df)
```

</div>

## Results

```bash
+--------------------------------------------------------------------------------+----------------------+
| text| class|
+--------------------------------------------------------------------------------+----------------------+
|This Agreement will be binding upon and inure to the benefit of each Party an...| PERMISSIVE_ASSIGNMENT|
|All notices and other communications provided for in this Agreement and the o...| OTHER|
|This Agreement is a personal contract for XCorp, and the rights and interests...|RESTRICTIVE_ASSIGNMENT|
+--------------------------------------------------------------------------------+----------------------+
```
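
Once predictions like these have been collected from Spark, they can be grouped by label for review. The following is a minimal sketch in plain Python, assuming the rows are already available as `(text, class)` pairs; `group_by_label` is a hypothetical helper, and the texts are the truncated strings from the table above.

```python
from collections import defaultdict

# (text, class) pairs copied from the truncated Results table.
rows = [
    ("This Agreement will be binding upon and inure to the benefit of each Party an...", "PERMISSIVE_ASSIGNMENT"),
    ("All notices and other communications provided for in this Agreement and the o...", "OTHER"),
    ("This Agreement is a personal contract for XCorp, and the rights and interests...", "RESTRICTIVE_ASSIGNMENT"),
]


def group_by_label(pairs):
    """Group sentence texts under their predicted label (hypothetical helper)."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)


by_label = group_by_label(rows)

# Sentences that restrict assignment are usually the ones a reviewer wants first.
restrictive = by_label.get("RESTRICTIVE_ASSIGNMENT", [])
```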

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|legclf_nda_assigments_bert|
|Compatibility:|Legal NLP 1.0.0+|
|License:|Licensed|
|Edition:|Official|
|Input Labels:|[document, token]|
|Output Labels:|[class]|
|Language:|en|
|Size:|406.4 MB|
|Case sensitive:|true|
|Max sentence length:|512|

## References

In-house annotations on the Non-disclosure Agreements

## Benchmarking

```bash
label                   precision  recall  f1-score  support
OTHER                   0.98       1.00    0.99      124
PERMISSIVE_ASSIGNMENT   1.00       0.93    0.97      15
RESTRICTIVE_ASSIGNMENT  1.00       0.96    0.98      25
accuracy                -          -       0.99      164
macro avg               0.99       0.96    0.98      164
weighted avg            0.99       0.99    0.99      164
```
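
To make the relationship between the columns concrete, the sketch below recomputes the table from per-class counts. The commit does not publish a confusion matrix, so the `counts` values are an assumption: they are one set of true-positive, false-positive, and false-negative counts that is consistent with the reported precision, recall, and support (it assumes the two missed assignment sentences were predicted as `OTHER`).

```python
# Hypothetical (tp, fp, fn) counts per class, chosen to match the reported
# table; they are not taken from the commit itself.
counts = {
    "OTHER": (124, 2, 0),
    "PERMISSIVE_ASSIGNMENT": (14, 0, 1),
    "RESTRICTIVE_ASSIGNMENT": (24, 0, 1),
}


def prf(tp, fp, fn):
    """Return (precision, recall, f1) for one class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


report = {label: prf(*c) for label, c in counts.items()}
support = {label: tp + fn for label, (tp, fp, fn) in counts.items()}
total = sum(support.values())

# Accuracy is the share of correct predictions over all sentences.
accuracy = sum(tp for tp, _, _ in counts.values()) / total
# Macro average weights every class equally; weighted average uses support.
macro_f1 = sum(f1 for _, _, f1 in report.values()) / len(report)
weighted_f1 = sum(report[label][2] * support[label] for label in counts) / total

for label, (p, r, f1) in report.items():
    print(f"{label:<24}{p:.2f}  {r:.2f}  {f1:.2f}  {support[label]}")
print(f"accuracy {accuracy:.2f}, macro f1 {macro_f1:.2f}, weighted f1 {weighted_f1:.2f}")
```

Because f1 is computed from unrounded precision and recall, `PERMISSIVE_ASSIGNMENT` comes out at 0.97 even though plugging the rounded 1.00 and 0.93 into the formula would give 0.96.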
@@ -0,0 +1,118 @@
---
layout: model
title: Understanding Non-compete Items in Non-Compete Clauses (Bert)
author: John Snow Labs
name: legclf_nda_non_compete_items_bert
date: 2023-05-17
tags: [en, legal, licensed, bert, classification, nda, non_compete, tensorflow]
task: Text Classification
language: en
edition: Legal NLP 1.0.0
spark_version: 3.0
supported: true
engine: tensorflow
annotator: LegalBertForSequenceClassification
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Given a clause classified as `NON_COMP` by the `legmulticlf_mnda_sections_paragraph_other` classifier, the `legclf_nda_non_compete_items_bert` model subclassifies its sentences as `NON_COMPETE_ITEMS` or `OTHER`. It is a BERT-based sequence classifier (`LegalBertForSequenceClassification`) trained with a state-of-the-art approach.

## Predicted Entities

`NON_COMPETE_ITEMS`, `OTHER`

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_nda_non_compete_items_bert_en_1.0.0_3.0_1684358961459.zip){:.button.button-orange}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_nda_non_compete_items_bert_en_1.0.0_3.0_1684358961459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}

```python
# Assumes the licensed `johnsnowlabs` library is installed and configured.
from johnsnowlabs import nlp, legal
import pandas as pd

# Start a Spark session with the Legal NLP libraries loaded
spark = nlp.start()

# Convert raw text into Spark NLP documents
document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Split each document into tokens
tokenizer = nlp.Tokenizer()\
    .setInputCols(["document"])\
    .setOutputCol("token")

# Load the pretrained non-compete items classifier
sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_non_compete_items_bert", "en", "legal/models")\
    .setInputCols(["document", "token"])\
    .setOutputCol("class")\
    .setCaseSensitive(True)\
    .setMaxSentenceLength(512)

clf_pipeline = nlp.Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequence_classifier
])

# All stages are pretrained, so fitting on an empty DataFrame only builds the pipeline model
empty_df = spark.createDataFrame([['']]).toDF("text")

model = clf_pipeline.fit(empty_df)

text_list = [
    """This Agreement will be binding upon and inure to the benefit of each Party and its respective heirs, successors and assigns""",
    """Activity that is in direct competition with the Company's business, including but not limited to developing, marketing, or selling products or services that are similar to those of the Company."""
]

df = spark.createDataFrame(pd.DataFrame({"text" : text_list}))

result = model.transform(df)
```

</div>

## Results

```bash
+--------------------------------------------------------------------------------+-----------------+
| text| class|
+--------------------------------------------------------------------------------+-----------------+
|This Agreement will be binding upon and inure to the benefit of each Party an...| OTHER|
|Activity that is in direct competition with the Company's business, including...|NON_COMPETE_ITEMS|
+--------------------------------------------------------------------------------+-----------------+
```

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|legclf_nda_non_compete_items_bert|
|Compatibility:|Legal NLP 1.0.0+|
|License:|Licensed|
|Edition:|Official|
|Input Labels:|[document, token]|
|Output Labels:|[class]|
|Language:|en|
|Size:|406.4 MB|
|Case sensitive:|true|
|Max sentence length:|512|

## References

In-house annotations on the Non-disclosure Agreements

## Benchmarking

```bash
label              precision  recall  f1-score  support
NON_COMPETE_ITEMS  1.00       1.00    1.00      10
OTHER              1.00       1.00    1.00      64
accuracy           -          -       1.00      74
macro avg          1.00       1.00    1.00      74
weighted avg       1.00       1.00    1.00      74
```
118 changes: 118 additions & 0 deletions docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_perpetuity_bert_en.md
@@ -0,0 +1,118 @@
---
layout: model
title: Understanding Perpetuity in "Return of Confidential Information" Clauses (Bert)
author: John Snow Labs
name: legclf_nda_perpetuity_bert
date: 2023-05-17
tags: [en, legal, licensed, bert, nda, classification, perpetuity, tensorflow]
task: Text Classification
language: en
edition: Legal NLP 1.0.0
spark_version: 3.0
supported: true
engine: tensorflow
annotator: LegalBertForSequenceClassification
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Given a clause classified as `RETURN_OF_CONF_INFO` by the `legmulticlf_mnda_sections_paragraph_other` classifier, the `legclf_nda_perpetuity_bert` model subclassifies its sentences as `PERPETUITY` or `OTHER`. It is a BERT-based sequence classifier (`LegalBertForSequenceClassification`) trained with a state-of-the-art approach.

## Predicted Entities

`PERPETUITY`, `OTHER`

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_nda_perpetuity_bert_en_1.0.0_3.0_1684353033843.zip){:.button.button-orange}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_nda_perpetuity_bert_en_1.0.0_3.0_1684353033843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}

```python
# Assumes the licensed `johnsnowlabs` library is installed and configured.
from johnsnowlabs import nlp, legal
import pandas as pd

# Start a Spark session with the Legal NLP libraries loaded
spark = nlp.start()

# Convert raw text into Spark NLP documents
document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Split each document into tokens
tokenizer = nlp.Tokenizer()\
    .setInputCols(["document"])\
    .setOutputCol("token")

# Load the pretrained perpetuity classifier
sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_perpetuity_bert", "en", "legal/models")\
    .setInputCols(["document", "token"])\
    .setOutputCol("class")\
    .setCaseSensitive(True)\
    .setMaxSentenceLength(512)

clf_pipeline = nlp.Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequence_classifier
])

# All stages are pretrained, so fitting on an empty DataFrame only builds the pipeline model
empty_df = spark.createDataFrame([['']]).toDF("text")

model = clf_pipeline.fit(empty_df)

text_list = [
    """Notwithstanding the return or destruction of all Evaluation Material, you or your Representatives shall continue to be bound by your obligations of confidentiality and other obligations hereunder.""",
    """There are no intended third-party beneficiaries to this Agreement."""
]

df = spark.createDataFrame(pd.DataFrame({"text" : text_list}))

result = model.transform(df)
```

</div>

## Results

```bash
+--------------------------------------------------------------------------------+----------+
| text| class|
+--------------------------------------------------------------------------------+----------+
|Notwithstanding the return or destruction of all Evaluation Material, you or ...|PERPETUITY|
| There are no intended third-party beneficiaries to this Agreement.| OTHER|
+--------------------------------------------------------------------------------+----------+
```

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|legclf_nda_perpetuity_bert|
|Compatibility:|Legal NLP 1.0.0+|
|License:|Licensed|
|Edition:|Official|
|Input Labels:|[document, token]|
|Output Labels:|[class]|
|Language:|en|
|Size:|406.4 MB|
|Case sensitive:|true|
|Max sentence length:|512|

## References

In-house annotations on the Non-disclosure Agreements

## Benchmarking

```bash
label         precision  recall  f1-score  support
OTHER         0.98       1.00    0.99      60
PERPETUITY    1.00       0.89    0.94      9
accuracy      -          -       0.99      69
macro avg     0.99       0.94    0.97      69
weighted avg  0.99       0.99    0.99      69
```
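
Because `PERPETUITY` accounts for only 9 of the 69 test sentences, it can help to read the accuracy figure against a trivial baseline. The sketch below uses only the support and accuracy values from the table; the majority-class baseline is an illustrative comparison and is not part of the original benchmark.

```python
# Support and accuracy values taken from the benchmark table above.
support = {"OTHER": 60, "PERPETUITY": 9}
reported_accuracy = 0.99

total = sum(support.values())
majority_label = max(support, key=support.get)

# Accuracy of a classifier that always predicts the majority class.
baseline_accuracy = support[majority_label] / total

print(f"majority class: {majority_label}, baseline accuracy: {baseline_accuracy:.2f}")
print(f"reported accuracy: {reported_accuracy:.2f}")
```

Always predicting `OTHER` would already score about 0.87, so the per-class recall for `PERPETUITY` (0.89) says more about the model than the overall accuracy does.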