diff --git a/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_assigments_bert_en.md b/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_assigments_bert_en.md
new file mode 100644
index 0000000000..862e717739
--- /dev/null
+++ b/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_assigments_bert_en.md
@@ -0,0 +1,121 @@
+---
+layout: model
+title: Understanding Restriction Level of Assignment Clauses (Bert)
+author: John Snow Labs
+name: legclf_nda_assigments_bert
+date: 2023-05-17
+tags: [en, legal, licensed, bert, nda, classification, assigments, tensorflow]
+task: Text Classification
+language: en
+edition: Legal NLP 1.0.0
+spark_version: 3.0
+supported: true
+engine: tensorflow
+annotator: LegalBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Given a clause classified as `ASSIGNMENT` by the `legmulticlf_mnda_sections_paragraph_other` classifier, you can use the `legclf_nda_assigments_bert` model to subclassify its sentences as `PERMISSIVE_ASSIGNMENT`, `RESTRICTIVE_ASSIGNMENT`, or `OTHER`. The model was trained using a state-of-the-art (SOTA) approach.
+
+## Predicted Entities
+
+`PERMISSIVE_ASSIGNMENT`, `RESTRICTIVE_ASSIGNMENT`, `OTHER`
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_nda_assigments_bert_en_1.0.0_3.0_1684350248553.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_nda_assigments_bert_en_1.0.0_3.0_1684350248553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+
+```python
+document_assembler = nlp.DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = nlp.Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_assigments_bert", "en", "legal/models")\
+    .setInputCols(["document", "token"])\
+    .setOutputCol("class")\
+    .setCaseSensitive(True)\
+    .setMaxSentenceLength(512)
+
+clf_pipeline = nlp.Pipeline(stages=[
+    document_assembler,
+    tokenizer,
+    sequence_classifier
+])
+
+empty_df = spark.createDataFrame([['']]).toDF("text")
+
+model = clf_pipeline.fit(empty_df)
+
+text_list = [
+"""This Agreement will be binding upon and inure to the benefit of each Party and its respective heirs, successors and assigns""",
+"""All notices and other communications provided for in this Agreement and the other Loan Documents shall be in writing and may (subject to paragraph (b) below) be telecopied (faxed), mailed by certified mail return receipt requested, or delivered by hand or overnight courier service to the intended recipient at the addresses specified below or at such other address as shall be designated by any party listed below in a notice to the other parties listed below given in accordance with this Section.""",
+"""This Agreement is a personal contract for XCorp, and the rights and interests of XCorp hereunder may not be sold, transferred, assigned, pledged or hypothecated except as otherwise expressly permitted by the Company"""
+]
+
+df = spark.createDataFrame(pd.DataFrame({"text" : text_list}))
+
+result = model.transform(df)
+```
+
+
+
+## Results
+
+```bash
++--------------------------------------------------------------------------------+----------------------+
+|                                                                            text|                 class|
++--------------------------------------------------------------------------------+----------------------+
+|This Agreement will be binding upon and inure to the benefit of each Party an...| PERMISSIVE_ASSIGNMENT|
+|All notices and other communications provided for in this Agreement and the o...|                 OTHER|
+|This Agreement is a personal contract for XCorp, and the rights and interests...|RESTRICTIVE_ASSIGNMENT|
++--------------------------------------------------------------------------------+----------------------+
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|legclf_nda_assigments_bert|
+|Compatibility:|Legal NLP 1.0.0+|
+|License:|Licensed|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|406.4 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+In-house annotations on non-disclosure agreements
+
+## Benchmarking
+
+```bash
+label                   precision  recall  f1-score  support
+OTHER                   0.98       1.00    0.99      124
+PERMISSIVE_ASSIGNMENT   1.00       0.93    0.97      15
+RESTRICTIVE_ASSIGNMENT  1.00       0.96    0.98      25
+accuracy                -          -       0.99      164
+macro avg               0.99       0.96    0.98      164
+weighted avg            0.99       0.99    0.99      164
+```
diff --git a/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_non_compete_items_bert_en.md b/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_non_compete_items_bert_en.md
new file mode 100644
index 0000000000..297b8666f4
--- /dev/null
+++ b/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_non_compete_items_bert_en.md
@@ -0,0 +1,118 @@
+---
+layout: model
+title: Understanding Non-compete Items in Non-Compete Clauses (Bert)
+author: John Snow Labs
+name: legclf_nda_non_compete_items_bert
+date: 2023-05-17
+tags: [en, legal, licensed, bert, classification, nda, non_compete, tensorflow]
+task: Text Classification
+language: en
+edition: Legal NLP 1.0.0
+spark_version: 3.0
+supported: true
+engine: tensorflow
+annotator: LegalBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Given a clause classified as `NON_COMP` by the `legmulticlf_mnda_sections_paragraph_other` classifier, you can use the `legclf_nda_non_compete_items_bert` model to subclassify its sentences as `NON_COMPETE_ITEMS` or `OTHER`. The model was trained using a state-of-the-art (SOTA) approach.
+
+## Predicted Entities
+
+`NON_COMPETE_ITEMS`, `OTHER`
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_nda_non_compete_items_bert_en_1.0.0_3.0_1684358961459.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_nda_non_compete_items_bert_en_1.0.0_3.0_1684358961459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+
+```python
+document_assembler = nlp.DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = nlp.Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_non_compete_items_bert", "en", "legal/models")\
+    .setInputCols(["document", "token"])\
+    .setOutputCol("class")\
+    .setCaseSensitive(True)\
+    .setMaxSentenceLength(512)
+
+clf_pipeline = nlp.Pipeline(stages=[
+    document_assembler,
+    tokenizer,
+    sequence_classifier
+])
+
+empty_df = spark.createDataFrame([['']]).toDF("text")
+
+model = clf_pipeline.fit(empty_df)
+
+text_list = [
+"""This Agreement will be binding upon and inure to the benefit of each Party and its respective heirs, successors and assigns""",
+"""Activity that is in direct competition with the Company's business, including but not limited to developing, marketing, or selling products or services that are similar to those of the Company."""
+]
+
+df = spark.createDataFrame(pd.DataFrame({"text" : text_list}))
+
+result = model.transform(df)
+```
+
+
+
+## Results
+
+```bash
++--------------------------------------------------------------------------------+-----------------+
+|                                                                            text|            class|
++--------------------------------------------------------------------------------+-----------------+
+|This Agreement will be binding upon and inure to the benefit of each Party an...|            OTHER|
+|Activity that is in direct competition with the Company's business, including...|NON_COMPETE_ITEMS|
++--------------------------------------------------------------------------------+-----------------+
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|legclf_nda_non_compete_items_bert|
+|Compatibility:|Legal NLP 1.0.0+|
+|License:|Licensed|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|406.4 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+In-house annotations on non-disclosure agreements
+
+## Benchmarking
+
+```bash
+label              precision  recall  f1-score  support
+NON_COMPETE_ITEMS  1.00       1.00    1.00      10
+OTHER              1.00       1.00    1.00      64
+accuracy           -          -       1.00      74
+macro avg          1.00       1.00    1.00      74
+weighted avg       1.00       1.00    1.00      74
+```
diff --git a/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_perpetuity_bert_en.md b/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_perpetuity_bert_en.md
new file mode 100644
index 0000000000..3506ecb5fa
--- /dev/null
+++ b/docs/_posts/bunyamin-polat/2023-05-17-legclf_nda_perpetuity_bert_en.md
@@ -0,0 +1,118 @@
+---
+layout: model
+title: Understanding Perpetuity in "Return of Confidential Information" Clauses (Bert)
+author: John Snow Labs
+name: legclf_nda_perpetuity_bert
+date: 2023-05-17
+tags: [en, legal, licensed, bert, nda, classification, perpetuity, tensorflow]
+task: Text Classification
+language: en
+edition: Legal NLP 1.0.0
+spark_version: 3.0
+supported: true
+engine: tensorflow
+annotator: LegalBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Given a clause classified as `RETURN_OF_CONF_INFO` by the `legmulticlf_mnda_sections_paragraph_other` classifier, you can use the `legclf_nda_perpetuity_bert` model to subclassify its sentences as `PERPETUITY` or `OTHER`. The model was trained using a state-of-the-art (SOTA) approach.
+
+## Predicted Entities
+
+`PERPETUITY`, `OTHER`
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_nda_perpetuity_bert_en_1.0.0_3.0_1684353033843.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_nda_perpetuity_bert_en_1.0.0_3.0_1684353033843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+
+```python
+document_assembler = nlp.DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = nlp.Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_perpetuity_bert", "en", "legal/models")\
+    .setInputCols(["document", "token"])\
+    .setOutputCol("class")\
+    .setCaseSensitive(True)\
+    .setMaxSentenceLength(512)
+
+clf_pipeline = nlp.Pipeline(stages=[
+    document_assembler,
+    tokenizer,
+    sequence_classifier
+])
+
+empty_df = spark.createDataFrame([['']]).toDF("text")
+
+model = clf_pipeline.fit(empty_df)
+
+text_list = [
+"""Notwithstanding the return or destruction of all Evaluation Material, you or your Representatives shall continue to be bound by your obligations of confidentiality and other obligations hereunder.""",
+"""There are no intended third party beneficiaries to this Agreement."""
+]
+
+df = spark.createDataFrame(pd.DataFrame({"text" : text_list}))
+
+result = model.transform(df)
+```
+
+
+
+## Results
+
+```bash
++--------------------------------------------------------------------------------+----------+
+|                                                                            text|     class|
++--------------------------------------------------------------------------------+----------+
+|Notwithstanding the return or destruction of all Evaluation Material, you or ...|PERPETUITY|
+|              There are no intended third party beneficiaries to this Agreement.|     OTHER|
++--------------------------------------------------------------------------------+----------+
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|legclf_nda_perpetuity_bert|
+|Compatibility:|Legal NLP 1.0.0+|
+|License:|Licensed|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|406.4 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+In-house annotations on non-disclosure agreements
+
+## Benchmarking
+
+```bash
+label         precision  recall  f1-score  support
+OTHER         0.98       1.00    0.99      60
+PERPETUITY    1.00       0.89    0.94      9
+accuracy      -          -       0.99      69
+macro avg     0.99       0.94    0.97      69
+weighted avg  0.99       0.99    0.99      69
+```
diff --git a/docs/_posts/gadde5300/2023-05-29-legqa_flant5_finetuned_en.md b/docs/_posts/gadde5300/2023-05-29-legqa_flant5_finetuned_en.md
new file mode 100644
index 0000000000..c1927211dc
--- /dev/null
+++ b/docs/_posts/gadde5300/2023-05-29-legqa_flant5_finetuned_en.md
@@ -0,0 +1,92 @@
+---
+layout: model
+title: Legal FLAN-T5 Question Answering
+author: John Snow Labs
+name: legqa_flant5_finetuned
+date: 2023-05-29
+tags: [en, legal, qa, question_answering, licensed, tensorflow]
+task: Question Answering
+language: en
+edition: Legal NLP 1.0.0
+spark_version: 3.0
+supported: true
+engine: tensorflow
+annotator: LegalQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+This question answering model is a FLAN-T5 model fine-tuned on legal data. FLAN-T5 is a state-of-the-art language model developed by Google AI that uses the T5 architecture for text generation tasks. Given a question and a context passage, this model generates an answer to legal questions.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legqa_flant5_finetuned_en_1.0.0_3.0_1685371188640.zip){:.button.button-orange.button-orange-trans.arr.button-icon.hidden}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legqa_flant5_finetuned_en_1.0.0_3.0_1685371188640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = nlp.MultiDocumentAssembler()\
+    .setInputCols("question", "context")\
+    .setOutputCols("document_question", "document_context")
+
+leg_qa = legal.QuestionAnswering.pretrained("legqa_flant5_finetuned","en","legal/models")\
+    .setInputCols(["document_question", "document_context"])\
+    .setCustomPrompt("question: {QUESTION} context: {CONTEXT}")\
+    .setMaxNewTokens(50)\
+    .setOutputCol("answer")
+
+pipeline = nlp.Pipeline(stages=[document_assembler, leg_qa])
+
+question = 'How often will the incentive rate be reviewed?'
+context = '''
+
+The incentive rate shall remain in effect for a period of one year from the effective date. After the one year period, the incentive rate may be adjusted, or new incentive rates may be put in place, as determined by the governing body of Lincoln Parish, Louisiana.
+The incentive rate shall be reviewed annually by the governing body and any changes or adjustments shall be made in accordance with the terms and conditions of this agreement. Furthermore, the incentive rate shall be adjusted to reflect any changes in the cost of production of the oil or the market price of the oil, as determined by the governing body.
+If an adjustment is necessary, the governing body shall notify the parties of such adjustment in writing.'''
+
+data = spark.createDataFrame([[question, context]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+
+
+
+## Results
+
+```bash
++------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+|result |
++------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+|[The incentive rate shall be reviewed annually by the governing body. ]|
++------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|legqa_flant5_finetuned|
+|Compatibility:|Legal NLP 1.0.0+|
+|License:|Licensed|
+|Edition:|Official|
+|Language:|en|
+|Size:|920.9 MB|
+|Case sensitive:|true|
+
+## References
+
+In-house annotated dataset
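
The `macro avg` and `weighted avg` rows in the Benchmarking tables above can be re-derived from the per-label rows. The following is a minimal sketch in plain Python that uses the `legclf_nda_assigments_bert` table as input. The helper names `macro_avg` and `weighted_avg` are illustrative and not part of any library. Because the per-label figures are already rounded to two decimals, a recomputed weighted average may differ from the published one in the last digit.

```python
# Per-label rows copied from the legclf_nda_assigments_bert Benchmarking table:
# label -> (precision, recall, f1-score, support)
rows = {
    "OTHER": (0.98, 1.00, 0.99, 124),
    "PERMISSIVE_ASSIGNMENT": (1.00, 0.93, 0.97, 15),
    "RESTRICTIVE_ASSIGNMENT": (1.00, 0.96, 0.98, 25),
}

METRICS = ("precision", "recall", "f1-score")


def macro_avg(rows, idx):
    """Unweighted mean of one metric over all labels (hypothetical helper)."""
    return sum(r[idx] for r in rows.values()) / len(rows)


def weighted_avg(rows, idx):
    """Mean of one metric weighted by each label's support (hypothetical helper)."""
    total_support = sum(r[3] for r in rows.values())
    return sum(r[idx] * r[3] for r in rows.values()) / total_support


for idx, name in enumerate(METRICS):
    # Rounded to two decimals to match the precision used in the tables.
    print(name, round(macro_avg(rows, idx), 2), round(weighted_avg(rows, idx), 2))
```

The macro average treats the three labels equally, so the small `PERMISSIVE_ASSIGNMENT` class (15 examples) pulls macro recall down more than weighted recall, where the 124 `OTHER` examples dominate.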