Legal NLP 1.12.0 (#180)

* 2023-04-16-legner_nda_remedies_en (#123)
  * Add model 2023-04-16-legner_nda_remedies_en
  * Update 2023-04-16-legner_nda_remedies_en.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-04-19-legner_nda_return_of_conf_info_en (#132)
  * Add model 2023-04-19-legner_nda_return_of_conf_info_en
  * Update 2023-04-19-legner_nda_return_of_conf_info_en.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* Add model 2023-04-20-legmulticlf_covid19_exceptions_italian_it (#135)
  Co-authored-by: Mary-Sci <meryemyildiz366@gmail.com>
* 2023-04-21-leggen_flant5_base_en (#143)
  * Add model 2023-04-21-leggen_flant5_base_en
  * Update 2023-04-21-leggen_flant5_base_en.md
  Co-authored-by: gadde5300 <gadde5300@gmail.com>
  Co-authored-by: GADDE SAI SHAILESH <69344247+gadde5300@users.noreply.github.com>
* 2023-04-24-legner_nda_req_discl_en (#146)
  * Add model 2023-04-24-legner_nda_req_discl_en
  * Update 2023-04-24-legner_nda_req_discl_en.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-04-25-legner_greek_legislation_el (#148)
  * Add model 2023-04-25-legner_greek_legislation_el
  * Update 2023-04-25-legner_greek_legislation_el.md
  * Update 2023-04-25-legner_greek_legislation_el.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* Add model 2023-04-26-legmulticlf_online_terms_of_service_english_en (#153)
  Co-authored-by: Mary-Sci <meryemyildiz366@gmail.com>
* 2023-04-26-legner_mapa_bg (#155)
  * Add model 2023-04-26-legner_mapa_bg
  * Update 2023-04-26-legner_mapa_bg.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-04-26-legner_mapa_da (#156)
  * Add model 2023-04-26-legner_mapa_da
  * Update 2023-04-26-legner_mapa_da.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-04-27-legner_mapa_de (#159)
  * Add model 2023-04-27-legner_mapa_de
  * Update 2023-04-27-legner_mapa_de.md
  * Add model 2023-04-27-legner_mapa_el
  * Update 2023-04-27-legner_mapa_el.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-04-27-legner_mapa_en (#160)
  * Add model 2023-04-27-legner_mapa_en
  * Update 2023-04-27-legner_mapa_en.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-04-27-legner_mapa_es (#162)
  * Add model 2023-04-27-legner_mapa_es
  * Update 2023-04-27-legner_mapa_es.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-04-27-legner_mapa_fr (#163)
  * Add model 2023-04-27-legner_mapa_fr
  * Update 2023-04-27-legner_mapa_fr.md
  * Add model 2023-04-27-legner_mapa_it
  * Update 2023-04-27-legner_mapa_it.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-04-27-legner_mapa_lt (#166)
  * Add model 2023-04-27-legner_mapa_lt
  * Update 2023-04-27-legner_mapa_lt.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-04-27-legner_mapa_nl (#167)
  * Add model 2023-04-27-legner_mapa_nl
  * Update 2023-04-27-legner_mapa_nl.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-04-27-legner_mapa_pt (#169)
  * Add model 2023-04-27-legner_mapa_pt
  * Update 2023-04-27-legner_mapa_pt.md
  * Add model 2023-04-27-legner_mapa_ro
  * Update 2023-04-27-legner_mapa_ro.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-04-28-legner_mapa_cs (#172)
  * Add model 2023-04-28-legner_mapa_cs
  * Update 2023-04-28-legner_mapa_cs.md
  * Add model 2023-04-28-legner_mapa_ga
  * Update 2023-04-28-legner_mapa_ga.md
  * Update 2023-04-28-legner_mapa_ga.md
  * Add model 2023-04-28-legner_mapa_fi
  * Update 2023-04-28-legner_mapa_fi.md
  * Add model 2023-04-28-legner_mapa_sk
  * Update 2023-04-28-legner_mapa_sk.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-04-29-legpipe_alias_en (#176)
  * Add model 2023-04-29-legpipe_alias_en
  * Update 2023-04-29-legpipe_alias_en.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
* 2023-04-29-leggen_flant5_finetuned_en (#177)
  * Add model 2023-04-29-leggen_flant5_finetuned_en
  * Update 2023-04-29-leggen_flant5_finetuned_en.md
  Co-authored-by: gadde5300 <gadde5300@gmail.com>
  Co-authored-by: GADDE SAI SHAILESH <69344247+gadde5300@users.noreply.github.com>
* Delete 2023-04-29-legpipe_alias_en.md
* 2023-04-30-legpipe_alias_en (#178)
  * Add model 2023-04-30-legpipe_alias_en
  * Update 2023-04-30-legpipe_alias_en.md
  * Update 2023-04-30-legpipe_alias_en.md
  * Update 2023-04-30-legpipe_alias_en.md
  * Update 2023-04-30-legpipe_alias_en.md
  Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
  Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
  Co-authored-by: Juan Martinez <36634572+josejuanmartinez@users.noreply.github.com>

Co-authored-by: jsl-models <74001263+jsl-models@users.noreply.github.com>
Co-authored-by: bunyamin-polat <muhendisbp@gmail.com>
Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
Co-authored-by: Mary-Sci <meryemyildiz366@gmail.com>
Co-authored-by: gadde5300 <gadde5300@gmail.com>
Co-authored-by: GADDE SAI SHAILESH <69344247+gadde5300@users.noreply.github.com>
JohnSnowLabs · May 1, 2023 · 0a84240
1 parent c7c06e6
commit 0a84240
Showing 25 changed files with 3,130 additions and 0 deletions.
diff --git a/docs/_posts/Mary-Sci/2023-04-20-legmulticlf_covid19_exceptions_italian_it.md b/docs/_posts/Mary-Sci/2023-04-20-legmulticlf_covid19_exceptions_italian_it.md
@@ -0,0 +1,126 @@
+---
+layout: model
+title: Legal Multilabel Classifier on Covid-19 Exceptions (Italian)
+author: John Snow Labs
+name: legmulticlf_covid19_exceptions_italian
+date: 2023-04-20
+tags: [it, licensed, legal, multilabel, classification, tensorflow]
+task: Text Classification
+language: it
+edition: Legal NLP 1.0.0
+spark_version: 3.0
+supported: true
+engine: tensorflow
+annotator: MultiClassifierDLModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+This is a multi-label text classification model that assigns one or more of 5 classes to Italian legal texts on COVID-19 exception measures, to facilitate their analysis, discovery, and comparison. The classes are as follows:
+
+ -  Closures/lockdown     
+ -  Government_oversight    
+ -  Restrictions_of_daily_liberties      
+ -  Restrictions_of_fundamental_rights_and_civil_liberties      
+ -  State_of_Emergency
+
+## Predicted Entities
+
+`Closures/lockdown`, `Government_oversight`, `Restrictions_of_daily_liberties`, `Restrictions_of_fundamental_rights_and_civil_liberties`, `State_of_Emergency`
+
+{:.btn-box}
+<button class="button button-orange" disabled>Live Demo</button>
+<button class="button button-orange" disabled>Open in Colab</button>
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legmulticlf_covid19_exceptions_italian_it_1.0.0_3.0_1681985472330.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legmulticlf_covid19_exceptions_italian_it_1.0.0_3.0_1681985472330.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = nlp.DocumentAssembler() \
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = nlp.Tokenizer()\
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+embeddings = nlp.BertEmbeddings.pretrained("bert_embeddings_bert_base_italian_xxl_cased", "it") \
+    .setInputCols(["document", "token"])\
+    .setOutputCol("embeddings")
+
+embeddingsSentence = nlp.SentenceEmbeddings() \
+    .setInputCols(["document", "embeddings"])\
+    .setOutputCol("sentence_embeddings")\
+    .setPoolingStrategy("AVERAGE")
+
+multilabelClfModel = nlp.MultiClassifierDLModel.pretrained('legmulticlf_covid19_exceptions_italian', 'it', "legal/models") \
+    .setInputCols(["sentence_embeddings"])\
+    .setOutputCol("class")
+
+clf_pipeline = nlp.Pipeline(
+    stages=[document_assembler, 
+            tokenizer,
+            embeddings, 
+            embeddingsSentence,
+            multilabelClfModel])
+
+df = spark.createDataFrame([["Al di fuori di tale ultima ipotesi, secondo le raccomandazioni impartite dal Ministero della salute, occorre provvedere ad assicurare la corretta applicazione di misure preventive quali lavare frequentemente le mani con acqua e detergenti comuni."]]).toDF("text")
+
+model = clf_pipeline.fit(df)
+result = model.transform(df)
+
+result.select("text", "class.result").show(truncate=False)
+```
+
+</div>
+
+## Results
+
+```bash
++------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+
+|text                                                                                                                                                                                                                                                  |result                           |
++------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+
+|Al di fuori di tale ultima ipotesi, secondo le raccomandazioni impartite dal Ministero della salute, occorre provvedere ad assicurare la corretta applicazione di misure preventive quali lavare frequentemente le mani con acqua e detergenti comuni.|[Restrictions_of_daily_liberties]|
++------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|legmulticlf_covid19_exceptions_italian|
+|Compatibility:|Legal NLP 1.0.0+|
+|License:|Licensed|
+|Edition:|Official|
+|Input Labels:|[sentence_embeddings]|
+|Output Labels:|[class]|
+|Language:|it|
+|Size:|13.9 MB|
+
+## References
+
+Training dataset available [here](https://huggingface.co/datasets/joelito/covid19_emergency_event)
+
+## Benchmarking
+
+```bash
+label                                                   precision  recall  f1-score  support 
+Closures/lockdown                                       0.88       0.94    0.91      47      
+Government_oversight                                    1.00       0.50    0.67      4       
+Restrictions_of_daily_liberties                         0.88       0.79    0.83      28      
+Restrictions_of_fundamental_rights_and_civil_liberties  0.62       0.62    0.62      16      
+State_of_Emergency                                      0.67       1.00    0.80      6       
+micro-avg                                               0.82       0.83    0.83      101     
+macro-avg                                               0.81       0.77    0.77      101     
+weighted-avg                                            0.83       0.83    0.83      101     
+samples-avg                                             0.81       0.84    0.81      101     
+```
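The `macro-avg` and `weighted-avg` rows of the benchmark table above can be reproduced from the per-class rows. The sketch below is illustrative only. It assumes that `macro-avg` is the unweighted mean over classes and that `weighted-avg` is the support-weighted mean; the model card does not state these definitions explicitly. The helper names `macro_average` and `weighted_average` are hypothetical, and because the inputs are already rounded to two decimals, agreement is only expected at two decimals.

```python
# Per-class rows copied from the benchmark table:
# (label, precision, recall, f1, support)
rows = [
    ("Closures/lockdown", 0.88, 0.94, 0.91, 47),
    ("Government_oversight", 1.00, 0.50, 0.67, 4),
    ("Restrictions_of_daily_liberties", 0.88, 0.79, 0.83, 28),
    ("Restrictions_of_fundamental_rights_and_civil_liberties", 0.62, 0.62, 0.62, 16),
    ("State_of_Emergency", 0.67, 1.00, 0.80, 6),
]

SUPPORT = 4  # index of the support column in each row


def macro_average(rows, column):
    """Unweighted mean of one metric column over all classes."""
    values = [row[column] for row in rows]
    return sum(values) / len(values)


def weighted_average(rows, column):
    """Mean of one metric column, weighted by each class's support."""
    total_support = sum(row[SUPPORT] for row in rows)
    return sum(row[column] * row[SUPPORT] for row in rows) / total_support


# Columns 1, 2, 3 are precision, recall, f1-score.
macro = tuple(round(macro_average(rows, c), 2) for c in (1, 2, 3))
weighted = tuple(round(weighted_average(rows, c), 2) for c in (1, 2, 3))

print("macro-avg   ", macro)
print("weighted-avg", weighted)
```

The macro average treats the 4-example `Government_oversight` class the same as the 47-example `Closures/lockdown` class, which is why it sits below the weighted average on recall and f1-score.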
diff --git a/docs/_posts/Mary-Sci/2023-04-26-legmulticlf_online_terms_of_service_english_en.md b/docs/_posts/Mary-Sci/2023-04-26-legmulticlf_online_terms_of_service_english_en.md
@@ -0,0 +1,131 @@
+---
+layout: model
+title: Legal Multilabel Classifier on Online Terms of Service
+author: John Snow Labs
+name: legmulticlf_online_terms_of_service_english
+date: 2023-04-26
+tags: [en, licensed, multilabel, classification, legal, tensorflow]
+task: Text Classification
+language: en
+edition: Legal NLP 1.0.0
+spark_version: 3.0
+supported: true
+engine: tensorflow
+annotator: MultiClassifierDLModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+This is a multi-label text classification model that identifies potentially unfair clauses in online Terms of Service. The classes are as follows:
+
+ - Arbitration
+ - Choice_of_law
+ - Content_removal
+ - Jurisdiction
+ - Limitation_of_liability
+ - Other
+ - Unilateral_change
+ - Unilateral_termination
+
+## Predicted Entities
+
+`Arbitration`, `Choice_of_law`, `Content_removal`, `Jurisdiction`, `Limitation_of_liability`, `Other`, `Unilateral_change`, `Unilateral_termination`
+
+{:.btn-box}
+<button class="button button-orange" disabled>Live Demo</button>
+<button class="button button-orange" disabled>Open in Colab</button>
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legmulticlf_online_terms_of_service_english_en_1.0.0_3.0_1682519205970.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legmulticlf_online_terms_of_service_english_en_1.0.0_3.0_1682519205970.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = nlp.DocumentAssembler() \
+        .setInputCol('text')\
+        .setOutputCol('document')
+
+tokenizer = nlp.Tokenizer() \
+        .setInputCols(['document'])\
+        .setOutputCol('token')
+
+embeddings = nlp.BertEmbeddings.pretrained("bert_embeddings_sec_bert_base", "en") \
+        .setInputCols(['document', 'token'])\
+        .setOutputCol("embeddings")
+
+embeddingsSentence = nlp.SentenceEmbeddings() \
+        .setInputCols(['document', 'embeddings'])\
+        .setOutputCol('sentence_embeddings')\
+        .setPoolingStrategy('AVERAGE')
+
+classifierdl = nlp.MultiClassifierDLModel.pretrained('legmulticlf_online_terms_of_service_english', 'en', 'legal/models') \
+        .setInputCols(["sentence_embeddings"])\
+        .setOutputCol("class")
+
+clf_pipeline = nlp.Pipeline(stages=[document_assembler, 
+                                    tokenizer, 
+                                    embeddings, 
+                                    embeddingsSentence, 
+                                    classifierdl])
+
+df = spark.createDataFrame([["We are not responsible or liable for (and have no obligation to verify) any wrong or misspelled email address or inaccurate or wrong (mobile) phone number or credit card number."]]).toDF("text")
+
+model = clf_pipeline.fit(df)
+result = model.transform(df)
+
+result.select("text", "class.result").show(truncate=False)
+```
+
+</div>
+
+## Results
+
+```bash
++---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+
+|sentence                                                                                                                                                                         |result                   |
++---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+
+|We are not responsible or liable for (and have no obligation to verify) any wrong or misspelled email address or inaccurate or wrong (mobile) phone number or credit card number.|[Limitation_of_liability]|
++---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|legmulticlf_online_terms_of_service_english|
+|Compatibility:|Legal NLP 1.0.0+|
+|License:|Licensed|
+|Edition:|Official|
+|Input Labels:|[sentence_embeddings]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|13.9 MB|
+
+## References
+
+Training dataset available [here](https://huggingface.co/datasets/joelito/online_terms_of_service)
+
+## Benchmarking
+
+```bash
+label                    precision  recall  f1-score  support 
+Arbitration              1.00       0.50    0.67      4       
+Choice_of_law            0.67       0.67    0.67      3       
+Content_removal          1.00       0.67    0.80      3       
+Jurisdiction             0.80       1.00    0.89      4       
+Limitation_of_liability  0.73       0.73    0.73      15      
+Other                    0.86       0.89    0.88      28      
+Unilateral_change        0.86       1.00    0.92      6       
+Unilateral_termination   1.00       0.80    0.89      5       
+micro-avg                0.84       0.82    0.83      68      
+macro-avg                0.86       0.78    0.81      68      
+weighted-avg             0.85       0.82    0.83      68      
+samples-avg              0.80       0.82    0.81      68      
+```
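The `f1-score` column in the benchmark table above is, by the standard definition, the harmonic mean of precision and recall. As a rough sanity check, the sketch below recomputes it from the two-decimal precision and recall values in the table. The function name `f1_score` is a hypothetical helper, not a library call, and because the inputs are already rounded, the recomputed value can differ from the reported one by up to about 0.01.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported f1-score) copied from the benchmark table.
reported = {
    "Arbitration": (1.00, 0.50, 0.67),
    "Choice_of_law": (0.67, 0.67, 0.67),
    "Content_removal": (1.00, 0.67, 0.80),
    "Jurisdiction": (0.80, 1.00, 0.89),
    "Limitation_of_liability": (0.73, 0.73, 0.73),
    "Other": (0.86, 0.89, 0.88),
    "Unilateral_change": (0.86, 1.00, 0.92),
    "Unilateral_termination": (1.00, 0.80, 0.89),
}

# Absolute difference between the recomputed and the reported f1-score.
gaps = {
    label: abs(f1_score(p, r) - f1)
    for label, (p, r, f1) in reported.items()
}
max_gap = max(gaps.values())

for label, gap in gaps.items():
    print(f"{label:<24} gap={gap:.4f}")
print(f"largest gap: {max_gap:.4f}")
```

With supports as small as 3 to 6 examples for several classes, a single misclassified clause moves these scores by a large amount, so the per-class figures are best read as indicative.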
diff --git a/docs/_posts/bunyamin-polat/2023-04-16-legner_nda_remedies_en.md b/docs/_posts/bunyamin-polat/2023-04-16-legner_nda_remedies_en.md
@@ -0,0 +1,126 @@
+---
+layout: model
+title: Legal NER for NDA (Remedies Clauses)
+author: John Snow Labs
+name: legner_nda_remedies
+date: 2023-04-16
+tags: [en, licensed, ner, legal, nda, remedies]
+task: Named Entity Recognition
+language: en
+edition: Legal NLP 1.0.0
+spark_version: 3.0
+supported: true
+annotator: LegalNerModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+This is a NER model intended to be run **only** after the `REMEDIES` clause has been detected with a suitable classifier (use `legmulticlf_mnda_sections_paragraph_other` for that purpose). It extracts the following entities: `CURRENCY`, `NUMERIC_REMEDY`, and `REMEDY_TYPE`.
+
+## Predicted Entities
+
+`CURRENCY`, `NUMERIC_REMEDY`, `REMEDY_TYPE`
+
+{:.btn-box}
+<button class="button button-orange" disabled>Live Demo</button>
+<button class="button button-orange" disabled>Open in Colab</button>
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legner_nda_remedies_en_1.0.0_3.0_1681687124993.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legner_nda_remedies_en_1.0.0_3.0_1681687124993.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+
+```python
+document_assembler = nlp.DocumentAssembler()\
+        .setInputCol("text")\
+        .setOutputCol("document")
+
+sentence_detector = nlp.SentenceDetector()\
+        .setInputCols(["document"])\
+        .setOutputCol("sentence")
+
+tokenizer = nlp.Tokenizer()\
+        .setInputCols(["sentence"])\
+        .setOutputCol("token")
+
+embeddings = nlp.RoBertaEmbeddings.pretrained("roberta_embeddings_legal_roberta_base","en") \
+        .setInputCols(["sentence", "token"]) \
+        .setOutputCol("embeddings")\
+        .setMaxSentenceLength(512)\
+        .setCaseSensitive(True)
+
+ner_model = legal.NerModel.pretrained("legner_nda_remedies", "en", "legal/models")\
+        .setInputCols(["sentence", "token", "embeddings"])\
+        .setOutputCol("ner")
+
+ner_converter = nlp.NerConverter()\
+        .setInputCols(["sentence", "token", "ner"])\
+        .setOutputCol("ner_chunk")
+
+nlpPipeline = nlp.Pipeline(stages=[
+        document_assembler,
+        sentence_detector,
+        tokenizer,
+        embeddings,
+        ner_model,
+        ner_converter])
+
+empty_data = spark.createDataFrame([[""]]).toDF("text")
+
+model = nlpPipeline.fit(empty_data)
+
+text = ["""The breaching party shall pay the non-breaching party liquidated damages of $ 1,000 per day for each day that the breach of this NDA continues."""]
+
+result = model.transform(spark.createDataFrame([text]).toDF("text"))
+```
+
+</div>
+
+## Results
+
+```bash
++------------------+--------------+
+|chunk             |ner_label     |
++------------------+--------------+
+|liquidated damages|REMEDY_TYPE   |
+|$                 |CURRENCY      |
+|1,000             |NUMERIC_REMEDY|
++------------------+--------------+
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|legner_nda_remedies|
+|Compatibility:|Legal NLP 1.0.0+|
+|License:|Licensed|
+|Edition:|Official|
+|Input Labels:|[sentence, token, embeddings]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|16.3 MB|
+
+## References
+
+In-house annotations on non-disclosure agreements
+
+## Benchmarking
+
+```bash
+label           precision  recall  f1-score  support 
+CURRENCY        1.00       1.00    1.00      11      
+NUMERIC_REMEDY  1.00       1.00    1.00      11      
+REMEDY_TYPE     0.86       0.94    0.90      32      
+micro-avg       0.91       0.96    0.94      54      
+macro-avg       0.95       0.98    0.97      54      
+weighted-avg    0.92       0.96    0.94      54 
+```
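The Results table for `legner_nda_remedies` above lists whole chunks (`liquidated damages`, `$`, `1,000`), whereas a NER model labels individual tokens; the converter stage in the pipeline merges token-level tags into chunks. The plain-Python sketch below illustrates that merging idea only. It is not the library's implementation: the function `iob_to_chunks` is hypothetical, and the token list and the `B-`/`I-`/`O` tags are assumed for illustration rather than taken from the model's actual output.

```python
def iob_to_chunks(tokens, tags):
    """Merge B-/I- tagged tokens into (chunk_text, label) pairs."""
    chunks = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "I" and label == current_label:
            # Continuation of the chunk that is currently open.
            current_tokens.append(token)
            continue
        # Any other tag closes the open chunk, if there is one.
        if current_tokens:
            chunks.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
        # A B- tag (or a stray I- tag) opens a new chunk; O opens nothing.
        if prefix in ("B", "I"):
            current_tokens, current_label = [token], label
    if current_tokens:
        chunks.append((" ".join(current_tokens), current_label))
    return chunks


# Hypothetical token-level tags for part of the example sentence.
tokens = ["liquidated", "damages", "of", "$", "1,000", "per", "day"]
tags = [
    "B-REMEDY_TYPE", "I-REMEDY_TYPE", "O",
    "B-CURRENCY", "B-NUMERIC_REMEDY", "O", "O",
]

chunks = iob_to_chunks(tokens, tags)
for text, label in chunks:
    print(f"{text:<20}{label}")
```

Joining tokens with a single space is a simplification; a real converter would use character offsets to recover the original text span.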