* 2023-05-17-legclf_nda_assigments_bert_en (#238)
  * Add model 2023-05-17-legclf_nda_assigments_bert_en
  * Update 2023-05-17-legclf_nda_assigments_bert_en.md
  * Add model 2023-05-17-legclf_nda_perpetuity_bert_en
  * Update 2023-05-17-legclf_nda_perpetuity_bert_en.md
  * Update 2023-05-17-legclf_nda_perpetuity_bert_en.md
  * Add model 2023-05-17-legclf_nda_non_compete_items_bert_en
  * Update 2023-05-17-legclf_nda_non_compete_items_bert_en.md
  ---------
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-05-29-legqa_flant5_finetuned_en (#278)
  * Add model 2023-05-29-legqa_flant5_finetuned_en
  * Update 2023-05-29-legqa_flant5_finetuned_en.md
  * Update 2023-05-29-legqa_flant5_finetuned_en.md
  ---------
  Co-authored-by: gadde5300 <gadde5300@gmail.com>
  Co-authored-by: GADDE SAI SHAILESH <69344247+gadde5300@users.noreply.github.com>
---------
Co-authored-by: jsl-models <74001263+jsl-models@users.noreply.github.com>
Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
Co-authored-by: gadde5300 <gadde5300@gmail.com>
Co-authored-by: GADDE SAI SHAILESH <69344247+gadde5300@users.noreply.github.com>
1 parent 9a65615, commit 30e8e24
Showing 4 changed files with 449 additions and 0 deletions.
121 changes: 121 additions & 0 deletions
docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_assigments_bert_en.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,121 @@ | ||
--- | ||
layout: model | ||
title: Understanding Restriction Level of Assignment Clauses (Bert) | ||
author: John Snow Labs | ||
name: legclf_nda_assigments_bert | ||
date: 2023-05-17 | ||
tags: [en, legal, licensed, bert, nda, classification, assigments, tensorflow] | ||
task: Text Classification | ||
language: en | ||
edition: Legal NLP 1.0.0 | ||
spark_version: 3.0 | ||
supported: true | ||
engine: tensorflow | ||
annotator: LegalBertForSequenceClassification | ||
article_header: | ||
type: cover | ||
use_language_switcher: "Python-Scala-Java" | ||
--- | ||
|
||
## Description | ||
|
||
Given a clause classified as `ASSIGNMENT` using the `legmulticlf_mnda_sections_paragraph_other` classifier, you can use the `legclf_nda_assigments_bert` model to subclassify its sentences as `PERMISSIVE_ASSIGNMENT`, `RESTRICTIVE_ASSIGNMENT`, or `OTHER`. The model is a BERT-based sequence classifier trained with a state-of-the-art approach. | ||
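
The two-stage flow described above (a paragraph-level classifier assigns a clause label, then a clause-specific model subclassifies the sentences) can be sketched as a plain lookup. This is only an illustration and not part of the model card's pipeline: `SUBCLASSIFIERS` and `pick_subclassifier` are hypothetical names, and the `NON_COMP` and `RETURN_OF_CONF_INFO` entries are taken from the other two model cards added in this commit.

```python
from typing import Optional

# Clause label produced by legmulticlf_mnda_sections_paragraph_other
# mapped to the sentence-level sub-classifier documented for it.
# The dictionary and function names are hypothetical helpers.
SUBCLASSIFIERS = {
    "ASSIGNMENT": "legclf_nda_assigments_bert",
    "NON_COMP": "legclf_nda_non_compete_items_bert",
    "RETURN_OF_CONF_INFO": "legclf_nda_perpetuity_bert",
}


def pick_subclassifier(clause_label: str) -> Optional[str]:
    """Return the sub-classifier name for a clause label, or None if none applies."""
    # strip() tolerates stray whitespace around the label
    return SUBCLASSIFIERS.get(clause_label.strip())


print(pick_subclassifier("ASSIGNMENT"))
```

The returned name could then be passed to `pretrained(...)` as in the "How to use" section; clauses with no matching entry would simply skip the second stage.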
|
||
## Predicted Entities | ||
|
||
`PERMISSIVE_ASSIGNMENT`, `RESTRICTIVE_ASSIGNMENT`, `OTHER` | ||
|
||
{:.btn-box} | ||
<button class="button button-orange" disabled>Live Demo</button> | ||
<button class="button button-orange" disabled>Open in Colab</button> | ||
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_nda_assigments_bert_en_1.0.0_3.0_1684350248553.zip){:.button.button-orange} | ||
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_nda_assigments_bert_en_1.0.0_3.0_1684350248553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} | ||
|
||
## How to use | ||
|
||
|
||
|
||
<div class="tabs-box" markdown="1"> | ||
{% include programmingLanguageSelectScalaPythonNLU.html %} | ||
|
||
```python | ||
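from johnsnowlabs import nlp, legal | ||
import pandas as pd | ||
|
||
# Assumes a licensed John Snow Labs installation; nlp.start() returns the Spark session | ||
spark = nlp.start() | ||
|
||
document_assembler = nlp.DocumentAssembler()\ | ||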
.setInputCol("text")\ | ||
.setOutputCol("document") | ||
|
||
tokenizer = nlp.Tokenizer()\ | ||
.setInputCols(["document"])\ | ||
.setOutputCol("token") | ||
|
||
sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_assigments_bert", "en", "legal/models")\ | ||
.setInputCols(["document","token"])\ | ||
.setOutputCol("class")\ | ||
.setCaseSensitive(True)\ | ||
.setMaxSentenceLength(512) | ||
|
||
clf_pipeline = nlp.Pipeline(stages=[ | ||
document_assembler, | ||
tokenizer, | ||
sequence_classifier | ||
]) | ||
|
||
empty_df = spark.createDataFrame([['']]).toDF("text") | ||
|
||
model = clf_pipeline.fit(empty_df) | ||
|
||
text_list = [ | ||
"""This Agreement will be binding upon and inure to the benefit of each Party and its respective heirs, successors and assigns""", | ||
"""All notices and other communications provided for in this Agreement and the other Loan Documents shall be in writing and may (subject to paragraph (b) below) be telecopied (faxed), mailed by certified mail return receipt requested, or delivered by hand or overnight courier service to the intended recipient at the addresses specified below or at such other address as shall be designated by any party listed below in a notice to the other parties listed below given in accordance with this Section.""", | ||
"""This Agreement is a personal contract for XCorp, and the rights and interests of XCorp hereunder may not be sold, transferred, assigned, pledged or hypothecated except as otherwise expressly permitted by the Company""" | ||
] | ||
|
||
df = spark.createDataFrame(pd.DataFrame({"text" : text_list})) | ||
|
||
result = model.transform(df) | ||
``` | ||
|
||
</div> | ||
|
||
## Results | ||
|
||
```bash | ||
+--------------------------------------------------------------------------------+----------------------+ | ||
| text| class| | ||
+--------------------------------------------------------------------------------+----------------------+ | ||
|This Agreement will be binding upon and inure to the benefit of each Party an...| PERMISSIVE_ASSIGNMENT| | ||
|All notices and other communications provided for in this Agreement and the o...| OTHER| | ||
|This Agreement is a personal contract for XCorp, and the rights and interests...|RESTRICTIVE_ASSIGNMENT| | ||
+--------------------------------------------------------------------------------+----------------------+ | ||
``` | ||
|
||
{:.model-param} | ||
## Model Information | ||
|
||
{:.table-model} | ||
|---|---| | ||
|Model Name:|legclf_nda_assigments_bert| | ||
|Compatibility:|Legal NLP 1.0.0+| | ||
|License:|Licensed| | ||
|Edition:|Official| | ||
|Input Labels:|[document, token]| | ||
|Output Labels:|[class]| | ||
|Language:|en| | ||
|Size:|406.4 MB| | ||
|Case sensitive:|true| | ||
|Max sentence length:|512| | ||
|
||
## References | ||
|
||
In-house annotations on the Non-disclosure Agreements | ||
|
||
## Benchmarking | ||
|
||
```bash | ||
label precision recall f1-score support | ||
OTHER 0.98 1.00 0.99 124 | ||
PERMISSIVE_ASSIGNMENT 1.00 0.93 0.97 15 | ||
RESTRICTIVE_ASSIGNMENT 1.00 0.96 0.98 25 | ||
accuracy - - 0.99 164 | ||
macro avg 0.99 0.96 0.98 164 | ||
weighted avg 0.99 0.99 0.99 164 | ||
``` |
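
As a quick sanity check on the table above, the `macro avg` row should be the unweighted mean of the three per-class rows. The sketch below recomputes it from the rounded values printed in the table; the variable and function names are illustrative only, and because the inputs are already rounded, the weighted averages cannot be reproduced as reliably this way.

```python
# Per-class rows copied from the benchmark table: (precision, recall, f1, support)
rows = {
    "OTHER": (0.98, 1.00, 0.99, 124),
    "PERMISSIVE_ASSIGNMENT": (1.00, 0.93, 0.97, 15),
    "RESTRICTIVE_ASSIGNMENT": (1.00, 0.96, 0.98, 25),
}


def macro_avg(table):
    """Unweighted mean of precision, recall and f1 over all classes, rounded to 2 places."""
    n = len(table)
    return tuple(round(sum(r[i] for r in table.values()) / n, 2) for i in range(3))


total_support = sum(r[3] for r in rows.values())
print(macro_avg(rows), total_support)
```

The result matches the reported `macro avg` row (0.99, 0.96, 0.98) and the total support of 164.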
118 changes: 118 additions & 0 deletions
docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_non_compete_items_bert_en.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,118 @@ | ||
--- | ||
layout: model | ||
title: Understanding Non-compete Items in Non-Compete Clauses (Bert) | ||
author: John Snow Labs | ||
name: legclf_nda_non_compete_items_bert | ||
date: 2023-05-17 | ||
tags: [en, legal, licensed, bert, classification, nda, non_compete, tensorflow] | ||
task: Text Classification | ||
language: en | ||
edition: Legal NLP 1.0.0 | ||
spark_version: 3.0 | ||
supported: true | ||
engine: tensorflow | ||
annotator: LegalBertForSequenceClassification | ||
article_header: | ||
type: cover | ||
use_language_switcher: "Python-Scala-Java" | ||
--- | ||
|
||
## Description | ||
|
||
Given a clause classified as `NON_COMP` using the `legmulticlf_mnda_sections_paragraph_other` classifier, you can use the `legclf_nda_non_compete_items_bert` model to subclassify its sentences as `NON_COMPETE_ITEMS` or `OTHER`. The model is a BERT-based sequence classifier trained with a state-of-the-art approach. | ||
|
||
## Predicted Entities | ||
|
||
`NON_COMPETE_ITEMS`, `OTHER` | ||
|
||
{:.btn-box} | ||
<button class="button button-orange" disabled>Live Demo</button> | ||
<button class="button button-orange" disabled>Open in Colab</button> | ||
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_nda_non_compete_items_bert_en_1.0.0_3.0_1684358961459.zip){:.button.button-orange} | ||
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_nda_non_compete_items_bert_en_1.0.0_3.0_1684358961459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} | ||
|
||
## How to use | ||
|
||
|
||
|
||
<div class="tabs-box" markdown="1"> | ||
{% include programmingLanguageSelectScalaPythonNLU.html %} | ||
|
||
```python | ||
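from johnsnowlabs import nlp, legal | ||
import pandas as pd | ||
|
||
# Assumes a licensed John Snow Labs installation; nlp.start() returns the Spark session | ||
spark = nlp.start() | ||
|
||
document_assembler = nlp.DocumentAssembler()\ | ||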
.setInputCol("text")\ | ||
.setOutputCol("document") | ||
|
||
tokenizer = nlp.Tokenizer()\ | ||
.setInputCols(["document"])\ | ||
.setOutputCol("token") | ||
|
||
sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_non_compete_items_bert", "en", "legal/models")\ | ||
.setInputCols(["document", "token"])\ | ||
.setOutputCol("class")\ | ||
.setCaseSensitive(True)\ | ||
.setMaxSentenceLength(512) | ||
|
||
clf_pipeline = nlp.Pipeline(stages=[ | ||
document_assembler, | ||
tokenizer, | ||
sequence_classifier | ||
]) | ||
|
||
empty_df = spark.createDataFrame([['']]).toDF("text") | ||
|
||
model = clf_pipeline.fit(empty_df) | ||
|
||
text_list = [ | ||
"""This Agreement will be binding upon and inure to the benefit of each Party and its respective heirs, successors and assigns""", | ||
"""Activity that is in direct competition with the Company's business, including but not limited to developing, marketing, or selling products or services that are similar to those of the Company.""" | ||
] | ||
|
||
df = spark.createDataFrame(pd.DataFrame({"text" : text_list})) | ||
|
||
result = model.transform(df) | ||
``` | ||
|
||
</div> | ||
|
||
## Results | ||
|
||
```bash | ||
+--------------------------------------------------------------------------------+-----------------+ | ||
| text| class| | ||
+--------------------------------------------------------------------------------+-----------------+ | ||
|This Agreement will be binding upon and inure to the benefit of each Party an...| OTHER| | ||
|Activity that is in direct competition with the Company's business, including...|NON_COMPETE_ITEMS| | ||
+--------------------------------------------------------------------------------+-----------------+ | ||
``` | ||
|
||
{:.model-param} | ||
## Model Information | ||
|
||
{:.table-model} | ||
|---|---| | ||
|Model Name:|legclf_nda_non_compete_items_bert| | ||
|Compatibility:|Legal NLP 1.0.0+| | ||
|License:|Licensed| | ||
|Edition:|Official| | ||
|Input Labels:|[document, token]| | ||
|Output Labels:|[class]| | ||
|Language:|en| | ||
|Size:|406.4 MB| | ||
|Case sensitive:|true| | ||
|Max sentence length:|512| | ||
|
||
## References | ||
|
||
In-house annotations on the Non-disclosure Agreements | ||
|
||
## Benchmarking | ||
|
||
```bash | ||
label precision recall f1-score support | ||
NON_COMPETE_ITEMS 1.00 1.00 1.00 10 | ||
OTHER 1.00 1.00 1.00 64 | ||
accuracy - - 1.00 74 | ||
macro avg 1.00 1.00 1.00 74 | ||
weighted avg 1.00 1.00 1.00 74 | ||
``` |
118 changes: 118 additions & 0 deletions
docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_perpetuity_bert_en.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,118 @@ | ||
--- | ||
layout: model | ||
title: Understanding Perpetuity in "Return of Confidential Information" Clauses (Bert) | ||
author: John Snow Labs | ||
name: legclf_nda_perpetuity_bert | ||
date: 2023-05-17 | ||
tags: [en, legal, licensed, bert, nda, classification, perpetuity, tensorflow] | ||
task: Text Classification | ||
language: en | ||
edition: Legal NLP 1.0.0 | ||
spark_version: 3.0 | ||
supported: true | ||
engine: tensorflow | ||
annotator: LegalBertForSequenceClassification | ||
article_header: | ||
type: cover | ||
use_language_switcher: "Python-Scala-Java" | ||
--- | ||
|
||
## Description | ||
|
||
Given a clause classified as `RETURN_OF_CONF_INFO` using the `legmulticlf_mnda_sections_paragraph_other` classifier, you can use the `legclf_nda_perpetuity_bert` model to subclassify its sentences as `PERPETUITY` or `OTHER`. The model is a BERT-based sequence classifier trained with a state-of-the-art approach. | ||
|
||
## Predicted Entities | ||
|
||
`PERPETUITY`, `OTHER` | ||
|
||
{:.btn-box} | ||
<button class="button button-orange" disabled>Live Demo</button> | ||
<button class="button button-orange" disabled>Open in Colab</button> | ||
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_nda_perpetuity_bert_en_1.0.0_3.0_1684353033843.zip){:.button.button-orange} | ||
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_nda_perpetuity_bert_en_1.0.0_3.0_1684353033843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} | ||
|
||
## How to use | ||
|
||
|
||
|
||
<div class="tabs-box" markdown="1"> | ||
{% include programmingLanguageSelectScalaPythonNLU.html %} | ||
|
||
```python | ||
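from johnsnowlabs import nlp, legal | ||
import pandas as pd | ||
|
||
# Assumes a licensed John Snow Labs installation; nlp.start() returns the Spark session | ||
spark = nlp.start() | ||
|
||
document_assembler = nlp.DocumentAssembler()\ | ||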
.setInputCol("text")\ | ||
.setOutputCol("document") | ||
|
||
tokenizer = nlp.Tokenizer()\ | ||
.setInputCols(["document"])\ | ||
.setOutputCol("token") | ||
|
||
sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_perpetuity_bert", "en", "legal/models")\ | ||
.setInputCols(["document", "token"])\ | ||
.setOutputCol("class")\ | ||
.setCaseSensitive(True)\ | ||
.setMaxSentenceLength(512) | ||
|
||
clf_pipeline = nlp.Pipeline(stages=[ | ||
document_assembler, | ||
tokenizer, | ||
sequence_classifier | ||
]) | ||
|
||
empty_df = spark.createDataFrame([['']]).toDF("text") | ||
|
||
model = clf_pipeline.fit(empty_df) | ||
|
||
text_list = [ | ||
"""Notwithstanding the return or destruction of all Evaluation Material, you or your Representatives shall continue to be bound by your obligations of confidentiality and other obligations hereunder.""", | ||
"""There are no intended third party beneficiaries to this Agreement.""" | ||
] | ||
|
||
df = spark.createDataFrame(pd.DataFrame({"text" : text_list})) | ||
|
||
result = model.transform(df) | ||
``` | ||
|
||
</div> | ||
|
||
## Results | ||
|
||
```bash | ||
+--------------------------------------------------------------------------------+----------+ | ||
| text| class| | ||
+--------------------------------------------------------------------------------+----------+ | ||
|Notwithstanding the return or destruction of all Evaluation Material, you or ...|PERPETUITY| | ||
|              There are no intended third party beneficiaries to this Agreement.|     OTHER| | ||
+--------------------------------------------------------------------------------+----------+ | ||
``` | ||
|
||
{:.model-param} | ||
## Model Information | ||
|
||
{:.table-model} | ||
|---|---| | ||
|Model Name:|legclf_nda_perpetuity_bert| | ||
|Compatibility:|Legal NLP 1.0.0+| | ||
|License:|Licensed| | ||
|Edition:|Official| | ||
|Input Labels:|[document, token]| | ||
|Output Labels:|[class]| | ||
|Language:|en| | ||
|Size:|406.4 MB| | ||
|Case sensitive:|true| | ||
|Max sentence length:|512| | ||
|
||
## References | ||
|
||
In-house annotations on the Non-disclosure Agreements | ||
|
||
## Benchmarking | ||
|
||
```bash | ||
label precision recall f1-score support | ||
OTHER 0.98 1.00 0.99 60 | ||
PERPETUITY 1.00 0.89 0.94 9 | ||
accuracy - - 0.99 69 | ||
macro avg 0.99 0.94 0.97 69 | ||
weighted avg 0.99 0.99 0.99 69 | ||
``` |
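
To show how the per-class figures in this table relate to each other, the sketch below derives precision, recall and F1 from raw counts. The counts are an assumption, not data from the model card: they are one confusion matrix consistent with the rounded table (all 60 `OTHER` sentences correct, and 8 of the 9 `PERPETUITY` sentences correct with the remaining one predicted as `OTHER`). The helper name `prf` is hypothetical.

```python
def prf(tp: int, fp: int, fn: int):
    """Precision, recall and F1 from true positives, false positives and false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return round(precision, 2), round(recall, 2), round(f1, 2)


# Assumed counts (not taken from the source): one PERPETUITY sentence misread as OTHER
other = prf(tp=60, fp=1, fn=0)
perpetuity = prf(tp=8, fp=0, fn=1)
accuracy = round((60 + 8) / 69, 2)

print(other, perpetuity, accuracy)
```

Under these assumed counts the values agree with the reported rows: `OTHER` at 0.98 / 1.00 / 0.99, `PERPETUITY` at 1.00 / 0.89 / 0.94, and accuracy 0.99. With only 9 `PERPETUITY` examples, a single error moves recall by about 11 points, so the scores should be read with that small support in mind.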