![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/legal-nlp/13.Legal_Summarization.ipynb)

#🎬 Installation

In [None]:
! pip install -q johnsnowlabs

##🔗 Automatic Installation
Using my.johnsnowlabs.com SSO

In [None]:
from johnsnowlabs import nlp, finance, legal

nlp.install(refresh_install=True, visual=True, force_browser=True)

##🔗 Manual downloading
If you are not registered at my.johnsnowlabs.com, received your license by email, or are using Safari, you may need to upload the license manually.

- Go to my.johnsnowlabs.com
- Download your license
- Upload it using the following command

In [None]:
from google.colab import files
print('Please Upload your John Snow Labs License using the button below')
license_keys = files.upload()

- Install it
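
Before installing, it can help to confirm that the upload actually contains a JSON license file. The following is a minimal sketch, assuming `license_keys` has the shape returned by `files.upload()` (a dict of file name to raw bytes). The helper name `pick_license_file` and the example dict are hypothetical and not part of the John Snow Labs library.

```python
import json


def pick_license_file(uploaded):
    """Return the name of the first uploaded file that parses as a JSON object.

    `uploaded` is assumed to be a dict of file name -> raw bytes, which is the
    shape returned by google.colab.files.upload().
    """
    for name, content in uploaded.items():
        if not name.lower().endswith(".json"):
            continue  # skip anything that is not a .json file
        try:
            parsed = json.loads(content.decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError):
            continue  # not valid UTF-8 JSON, so not a usable license file
        if isinstance(parsed, dict):
            return name
    raise ValueError("No JSON license file found among the uploaded files")


# Hypothetical stand-in for the dict returned by files.upload()
example_upload = {
    "notes.txt": b"not a license",
    "my_license.json": b'{"SOME_KEY": "value"}',
}
print(pick_license_file(example_upload))
```

In the notebook, the same helper could be called on `license_keys` to find the file name to use in the install step.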

#📌 Starting

In [3]:
spark = nlp.start()

👌 Launched cpu optimized session with with: 🚀Spark-NLP==4.4.0, 💊Spark-Healthcare==4.4.0, running on ⚡ PySpark==3.1.2


#🔎 Legal Text Summarization

📜Explanation:

Native Legal Text Summarization is a valuable tool for legal professionals who need to quickly understand the key points of a legal document. It can save time and improve accuracy, allowing lawyers to focus on more complex tasks.

The Native Legal Text Summarization feature uses a deep learning model that has been trained on a large corpus of legal documents to automatically generate summaries of legal text. The model uses a combination of natural language processing techniques, such as entity recognition, part-of-speech tagging, and dependency parsing, to extract important information from the text.

The summarization model then uses this extracted information to generate a summary that accurately captures the key points of the document in a concise and understandable manner. The resulting summary can be used to quickly identify relevant information, such as the outcome of a legal case or the main provisions of a regulation.

With the new `legal.Summarizer()` module, you can generate concise, state-of-the-art summaries of your legal documents that aim to retain their key points.
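
One rough way to check how much shorter a summary is than its source is to compare word counts. The sketch below is an illustration only: `compression_ratio` is a hypothetical helper, the word count is a simple whitespace split rather than the model's tokenization, and the example summary is hand-written rather than model output.

```python
def compression_ratio(source_text, summary_text):
    """Return summary length divided by source length, in whitespace-separated words."""
    source_words = len(source_text.split())
    if source_words == 0:
        raise ValueError("source_text must contain at least one word")
    return len(summary_text.split()) / source_words


# Illustrative inputs: the source sentence comes from the foreclosure notice
# used later in this notebook; the summary is hand-written, not model output.
source = (
    "Under the terms of your mortgage agreement, you were required to make "
    "monthly payments of $1,200 on the first day of each month."
)
summary = "Monthly payments of $1,200 were due on the first of each month."

ratio = compression_ratio(source, summary)
print(f"Summary is {ratio:.0%} of the source length")
```

A low ratio only shows that the text got shorter. Whether the key points survived still needs a human read or a task-specific metric.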

Two models are included for legal summarization:

  - **Legal FLAN-T5 Summarization (Base):** The base model, with generic capacities for summarizing legal documents.
  - **Legal Finetuned FLAN-T5 Summarization:** A model fine-tuned specifically to summarize legal agreements. For this task, we fine-tuned our base model on more than 8K sections from different commercial legal agreements.



### Let's see how to get summaries of different legal documents using the `Summarizer()` module.


## Notice of Default and Intent to Foreclose

In [4]:
document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("documents")

flant5 = legal.Summarizer().pretrained('legsum_flant5_legal_augmented','en','legal/models')\
    .setInputCols(["documents"])\
    .setOutputCol("summary")\
    .setMaxNewTokens(1000)

pipeline = nlp.Pipeline(stages=[document_assembler, flant5])

data = spark.createDataFrame([
  [1, """ 
NOTICE OF DEFAULT AND INTENT TO FORECLOSE

PLEASE TAKE NOTICE that you are in default of your mortgage agreement with XYZ Bank, which is secured by the property located at 1234 Elm Street, Anytown, USA 12345. As of the date of this notice, the outstanding balance on your mortgage is $200,000, which includes principal, interest, late fees, and other charges.

Under the terms of your mortgage agreement, you were required to make monthly payments of $1,200 on the first day of each month. However, you have failed to make your payments for the months of January, February, and March 2023. As a result, you are in default of your mortgage agreement, and the entire amount of the outstanding balance has become due and payable.

UNLESS YOU TAKE ACTION TO CURE THIS DEFAULT, XYZ Bank INTENDS TO FORECLOSE ON YOUR PROPERTY. XYZ Bank will file a notice of default with the county recorder's office, which will initiate the foreclosure process. If the foreclosure proceeds, XYZ Bank will sell your property at a public auction to satisfy the outstanding balance on your mortgage.

YOU HAVE THE RIGHT TO CURE THIS DEFAULT BY PAYING THE ENTIRE OUTSTANDING BALANCE OF $200,000, INCLUDING ALL FEES AND CHARGES, ON OR BEFORE APRIL 30th, 2023.

IF YOU ARE UNABLE TO CURE THIS DEFAULT, YOU MAY BE ELIGIBLE FOR ALTERNATIVE FORECLOSURE PREVENTION OPTIONS, SUCH AS LOAN MODIFICATION, SHORT SALE, OR DEED IN LIEU OF FORECLOSURE. YOU MAY CONTACT XYZ BANK TO DISCUSS THESE OPTIONS OR TO SEEK ASSISTANCE FROM A HOUSING COUNSELOR.

IF YOU HAVE ANY QUESTIONS ABOUT THIS NOTICE, PLEASE CONTACT XYZ BANK AS SOON AS POSSIBLE.
"""]]).toDF('id', 'text')

results = pipeline.fit(data).transform(data)

results.select("summary.result").show(truncate=False)

legsum_flant5_legal_augmented download started this may take some time.
[OK!]

## Mutual Non-Disclosure Agreement (MNDA)

In [5]:
data = spark.createDataFrame([
  [2, """NOW, THEREFORE, in consideration of the Company’s disclosure of information to the Recipient
and the promises set forth below, the parties agree as follows:

     1. Confidential Information. “Confidential Information” as used in this
Agreement means all information relating to the Company disclosed to the Recipient by the Company,
including without limitation any business, technical, marketing, financial or other information,
whether in written, electronic or oral form. Any and all reproductions, copies, notes, summaries,
reports, analyses or other material derived by the Recipient or its Representatives (as defined
below) in whole or in part from the Confidential Information in whatever form maintained shall be
considered part of the Confidential Information itself and shall be treated as such. Confidential
Information does not include information that (a) is or becomes part of the public domain other
than as a result of disclosure by the Recipient or its Representatives; (b) becomes available to
the Recipient on a nonconfidential basis from a source other than the Company, provided that source
is not bound with respect to that information by a confidentiality agreement with the Company or is
otherwise prohibited from transmitting that information by a contractual, legal or other
obligation; (c) can be proven by the Recipient to have been in the Recipient’s possession prior to
disclosure of the same by the Company; or (d) is independently developed by the Recipient without
reference to or reliance on any of the Company’s Confidential Information."""]
]).toDF('id', 'text')

results = pipeline.fit(data).transform(data)

results.select("summary.result").show(truncate=False)


## Commercial Agreements

In [6]:
data = spark.createDataFrame([
  [3, """EXHIBIT 99.2 Page 1 of 3 DISTRIBUTOR AGREEMENT Agreement made this 19t h day of March, 2020 Between: Co-Diagnostics, Inc. (herein referred to as "Principal") And PreCheck Health Services, Inc. (herein referred to as "Distributor"). In consideration of the mutual terms, conditions and covenants hereinafter set forth, Principal and Distributor acknowledge and agree to the following descriptions and conditions: DESCRIPTION OF PRINCIPAL The Principal is a company located in Utah, United States and is in the business of research and development of reagents. The Principal markets and sells it products globally through direct sales and distributors. DESCRIPTION OF DISTRIBUTOR The Distributor is a company operating or planning to operate in the United States of America, Latin America, Europe and Russia. The Distributor represents that the Distributor or a subsidiary of the Distributor is or will be fully licensed and registered in the Territory and will provide professional distribution services for the products of the Principal. CONDITIONS: 1. The Principal appoints the Distributor as a non-exclusive distributor, to sell Principal's qPCR infectious disease kits, Logix Smart COVID-19 PCR diagnostic test and Co-Dx Box™ instrument (the "Products"). The Products are described on Exhibit A to this Agreement. 2. The Principal grants Distributor non- exclusive rights to sell these products within the countries of Romania (the "Territory"), which may be amended by mutual written agreement.
  
Source: PRECHECK HEALTH SERVICES, INC., 8-K, 3/20/2020

3. The Distributor accepts the appointment and shall use its commercially reasonable efforts to promote, market and sell the Products within the Territory, devote such time and attention as may be reasonably necessary and abide by the Principal's policies. 4. The Principal shall maintain the right to contact and market its products to potential customers in the Territory; but agrees to pass on all sales leads and orders to the Distributor. 5. The parties agree that the list of Products and/or prices may be amended from time to time. The Principal may unilaterally remove Products from the catalog or change prices. Additions to the Products shall be by mutual agreement. However, in the event the Distributor rejects a new product addition to the product list, the Principal shall then retain the right to market and distribute the new product that is rejected by the Distributor. 6. Unless accepted by the Principal, the Distributor agrees that during the term of this Agreement, the Distributor, either directly or indirectly, shall handle no products that are competitive with the Products within the Territory. 7. The Distributor shall obtain at its own expense, all necessary licenses and permits to allow the Distributor to conduct business as contemplated herein. The Distributor represents and warrants that the Distributor shall conduct business in strict conformity with all local, state and federal laws, rules and regulations. 8. The Principal agrees that the Distributor may employ or engage representatives or sub-distributors in furtherance of this Agreement and the Distributor agrees that the Distributor shall be solely responsible for the payment of wages or commissions to those representatives and sub-distributors, and that under no circumstances shall Distributor's representatives be deemed employees of Principal for any purpose whatsoever. 9. Principal will grant Distributor a discount based on the Products and Prices. The proposed discount is expected to be ¨%. 
Discount may vary depending on product volume ordered or promotions. 10. This Agreement shall be in effect until March 18. 2021, unless sooner terminated by either party upon (30) days written notice, without cause. 11. In the event of termination, the Distributor shall be entitled to receive all orders accepted by the Principal prior to the date of termination and may sell the ordered Products in the Territory. Payment to be made upon shipment"""]
]).toDF('id', 'text')

results = pipeline.fit(data).transform(data)

results.select("summary.result").show(truncate=False)
