# Legal Jargon Simplifier
Legal documents are hard for non-experts to understand. This project uses BART, an encoder-decoder transformer, to convert complex legal jargon into simple English. The checkpoint used here, `facebook/bart-large-cnn`, is fine-tuned for abstractive summarization on CNN/DailyMail news articles. Encoder-only models such as BERT are not designed to generate rewritten text, which is why BART was chosen. The system takes Terms & Conditions text as input and outputs a concise, user-friendly explanation.

In [1]:
!pip install transformers torch sentencepiece --quiet

In [2]:
from transformers import pipeline
import textwrap

In [3]:
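# Load a summarization pipeline backed by the BART checkpoint fine-tuned on CNN/DailyMail.
summarizer = pipeline(
    "summarization",
    model="facebook/bart-large-cnn",
    device=0  # first CUDA GPU; use device=-1 to run on CPU
)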

Device set to use cuda:0
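
The pipeline cell hard-codes `device=0`, which assumes a CUDA GPU is present (as it was in this Colab session). A minimal sketch of a CPU fallback follows, assuming PyTorch is the backend; `pick_device` is a hypothetical helper name, not part of `transformers`.

```python
def pick_device():
    """Return 0 (first CUDA GPU) when one is available, otherwise -1 (CPU)."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no CUDA backend to query; fall back to CPU.
        return -1
    return 0 if torch.cuda.is_available() else -1


device = pick_device()
print("Using GPU 0" if device == 0 else "Using CPU")
```

The result could then be passed as `device=pick_device()` when building the pipeline, so the same notebook runs on a CPU-only runtime, only more slowly.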


In [4]:
legal_text = """
The Company shall not be liable for any indirect, incidental, special, consequential,
or punitive damages, including without limitation, loss of profits, data, use, goodwill,
or other intangible losses, resulting from (i) your access to or use of or inability to
access or use the Service; (ii) any conduct or content of any third party on the Service;
(iii) any content obtained from the Service; and (iv) unauthorized access, use or alteration
of your transmissions or content.
"""

In [5]:
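summary = summarizer(
    legal_text,
    max_length=80,    # upper bound on summary length, in tokens
    min_length=30,    # lower bound on summary length, in tokens
    do_sample=False   # deterministic decoding (no sampling)
)

# The pipeline returns a list with one dict per input.
simplified_text = summary[0]["summary_text"]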

In [6]:
print("📜 ORIGINAL LEGAL TEXT:\n")
print(textwrap.fill(legal_text, width=80))

print("\n\n✅ SIMPLIFIED VERSION:\n")
print(textwrap.fill(simplified_text, width=80))

📜 ORIGINAL LEGAL TEXT:

 The Company shall not be liable for any indirect, incidental, special,
consequential,  or punitive damages, including without limitation, loss of
profits, data, use, goodwill,  or other intangible losses, resulting from (i)
your access to or use of or inability to  access or use the Service; (ii) any
conduct or content of any third party on the Service;  (iii) any content
obtained from the Service; and (iv) unauthorized access, use or alteration  of
your transmissions or content.


✅ SIMPLIFIED VERSION:

The Company shall not be liable for any indirect, incidental, special,
consequential,  punitive damages, including without limitation, loss of profits,
data, use, goodwill, or other intangible losses.
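
The simplified version above is shorter than the original but keeps most of its wording ("shall not be liable" is still there). One rough way to quantify that is to compare word counts and measure how many output words were copied from the input. The sketch below is only an illustration: `words` and `rewrite_stats` are hypothetical helper names, the two example sentences are made up, and word overlap is a crude proxy for readability, not an established metric.

```python
import re


def words(text):
    """Lowercase word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def rewrite_stats(original, simplified):
    """Return (length_ratio, copied_fraction) for a simplified text.

    length_ratio: simplified word count divided by original word count.
    copied_fraction: share of simplified words that also occur in the original.
    """
    orig_words = words(original)
    simp_words = words(simplified)
    if not orig_words or not simp_words:
        return 0.0, 0.0
    vocab = set(orig_words)
    copied = sum(1 for w in simp_words if w in vocab)
    return len(simp_words) / len(orig_words), copied / len(simp_words)


# Made-up sentences for illustration; in the notebook these would be
# legal_text and simplified_text.
ratio, copied = rewrite_stats(
    "The Company shall not be liable for any indirect damages.",
    "The Company is not responsible for indirect harm.",
)
print(f"length ratio: {ratio:.2f}, copied fraction: {copied:.3f}")
```

A copied fraction close to 1 suggests the model mostly trimmed the clause instead of rephrasing it.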


In [7]:
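def simplify_legal_text(text, max_length=80, min_length=30):
    """Return a shorter, plainer version of `text` using the summarizer pipeline.

    max_length and min_length bound the output length in tokens.
    """
    summary = summarizer(
        text,
        max_length=max_length,
        min_length=min_length,
        do_sample=False
    )
    return summary[0]["summary_text"]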

In [8]:
simplify_legal_text(legal_text)

'The Company shall not be liable for any indirect, incidental, special, consequential,  punitive damages, including without limitation, loss of profits, data, use, goodwill, or other intangible losses.'
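
Full Terms & Conditions documents are usually far longer than the single clause used here, and BART accepts at most 1024 input tokens. A minimal sketch of one way to handle that is to split the document into paragraph-based chunks and simplify each one separately. The helper names `chunk_paragraphs` and `simplify_document` and the 400-word budget are assumptions for illustration, and word counts only approximate token counts.

```python
def chunk_paragraphs(text, max_words=400):
    """Group blank-line-separated paragraphs into chunks of at most max_words words.

    A single paragraph longer than max_words is kept as its own chunk.
    """
    chunks, current, count = [], [], 0
    for para in text.split("\n\n"):
        para = " ".join(para.split())  # collapse internal whitespace
        if not para:
            continue
        n = len(para.split())
        if current and count + n > max_words:
            # Adding this paragraph would exceed the budget; close the chunk.
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def simplify_document(text, simplify_fn, max_words=400):
    """Apply simplify_fn to each chunk and join the results with blank lines."""
    return "\n\n".join(simplify_fn(c) for c in chunk_paragraphs(text, max_words))


# str.upper stands in for simplify_legal_text so the sketch runs without a model.
demo = "First clause here.\n\nSecond clause follows.\n\nThird clause ends."
print(chunk_paragraphs(demo, max_words=6))
print(simplify_document(demo, str.upper, max_words=6))
```

In the notebook, a call such as `simplify_document(terms_text, simplify_legal_text)` would run the model once per chunk, where `terms_text` is a hypothetical variable holding the full document. Very short chunks may need a smaller `min_length` than the default of 30.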