<a href="https://colab.research.google.com/github/Starlight0901/LawKey---Law-Constitution-Chatbot/blob/NLP---Practice/Law_Explaination_and_summarization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Using the pre-trained BART model to generate explanations**

In [None]:
from transformers import pipeline, BartTokenizer, BartForConditionalGeneration

# Load pre-trained BART model and tokenizer
model_name = "facebook/bart-large-cnn"
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)

# Function to generate explanation
def generate_explanation(text):
    inputs = tokenizer(text, return_tensors="pt", max_length=1024, truncation=True)
    summary_ids = model.generate(**inputs)
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# Example legal text
legal_text = """
Where a request for information is refused by an information officer, such officer shall specify the following information in the communication to be sent under section 25(1) to the citizen who made the request:
(a) the grounds on which such request is refused and
(b) the period within which and the person to whom an appeal against such refusal may be preferred under section 32 of this Act.
"""

# Generate explanation for the legal text
explanation = generate_explanation(legal_text)

# Print the generated explanation
print("Generated Explanation:")
print(explanation)


Generated Explanation:
Where a request for information is refused by an information officer, such officer shall specify the following information in the communication to be sent under section 25(1) to the citizen who made the request:(a) the grounds on which such request is refused and(b) the period within which and the person to whom an appeal against such refusal may be preferred.


## **Using the pre-trained T5-Base model to generate explanations**

In [None]:
from transformers import pipeline, T5Tokenizer, T5ForConditionalGeneration

# Load pre-trained T5 model and tokenizer
model_name = "t5-base"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Function to generate explanation
def generate_explanation(text):
    input_text = "explain: " + text
    inputs = tokenizer(input_text, return_tensors="pt", max_length=1024, truncation=True)
    summary_ids = model.generate(**inputs)
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# Example legal text
legal_text = """
Where a request for information is refused by an information officer, such officer shall specify the following information in the communication to be sent under section 25(1) to the citizen who made the request:
(a) the grounds on which such request is refused and
(b) the period within which and the person to whom an appeal against such refusal may be preferred under section 32 of this Act.
"""

# Generate explanation for the legal text
explanation = generate_explanation(legal_text)

# Print the generated explanation
print("Generated Explanation:")
print(explanation)


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


Generated Explanation:
a) the grounds on which such request is refused and (b) the period within which


## **Using the pre-trained BERT model to generate explanations**

In [None]:
from transformers import BertTokenizer, BertForSequenceClassification
import torch

def generate_legal_explanation(text, model_name="bert-base-uncased"):
    # Load pre-trained BERT model and tokenizer
    tokenizer = BertTokenizer.from_pretrained(model_name)
    model = BertForSequenceClassification.from_pretrained(model_name)

    # Tokenize input text
    inputs = tokenizer(text, return_tensors="pt", max_length=512, truncation=True)

    # Perform classification (adjust as needed based on your task)
    outputs = model(**inputs)
    logits = outputs.logits

    # You can interpret the logits or use a softmax layer for probabilities

    return logits

# Example legal text
legal_text = """
Where a request for information is refused by an information officer, such officer shall specify the following information in the communication to be sent under section 25(1) to the citizen who made the request:
(a) the grounds on which such request is refused and
(b) the period within which and the person to whom an appeal against such refusal may be preferred under section 32 of this Act.
"""

# Generate explanation for the legal text
explanation_logits = generate_legal_explanation(legal_text)

# Print the logits (or apply softmax for probabilities)
print("Logits:", explanation_logits)


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Logits: tensor([[0.0370, 0.2493]], grad_fn=<AddmmBackward0>)
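The code comment above mentions turning the logits into probabilities with a softmax. A minimal sketch of that step in plain Python follows; the `softmax` helper is illustrative and not part of the notebook. Because the classification head is newly initialized (see the warning above), the resulting probabilities carry no meaning until the model is fine-tuned on a labelled task.

```python
import math

def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Logit values copied from the recorded output above.
logits = [0.0370, 0.2493]
probs = softmax(logits)
print(probs)
```

With the tensor returned by the model, `torch.softmax(logits, dim=-1)` performs the same computation.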


## **Using the pre-trained AdaptLLM/law-LLM model to generate explanations**

AdaptLLM/law-LLM is a large language model pre-trained on legal data. The Colab runtime crashed several times while trying to run it, so the cell below produced no generated answer.

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("AdaptLLM/law-LLM")
tokenizer = AutoTokenizer.from_pretrained("AdaptLLM/law-LLM", use_fast=False)

# Put your input here:
user_input = '''Question: Which of the following is false about ex post facto laws?
Options:
- They make criminal an act that was innocent when committed.
- They prescribe greater punishment for an act than was prescribed when it was done.
- They increase the evidence required to convict a person than when the act was done.
- They alter criminal offenses or punishment in a substantially prejudicial manner for the purpose of punishing a person for some past activity.

Please provide your choice first and then provide explanations if possible.'''

# Simply use your input as the prompt for base models
prompt = user_input

inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).input_ids.to(model.device)
outputs = model.generate(input_ids=inputs, max_length=2048)[0]

answer_start = int(inputs.shape[-1])
pred = tokenizer.decode(outputs[answer_start:], skip_special_tokens=True)

print(f'### User Input:\n{user_input}\n\n### Assistant Output:\n{pred}')


config.json:   0%|          | 0.00/515 [00:00<?, ?B/s]

pytorch_model.bin.index.json:   0%|          | 0.00/25.5k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/33 [00:00<?, ?it/s]

Loading checkpoint shards:   0%|          | 0/33 [00:00<?, ?it/s]
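
The download log for this run listed 33 checkpoint shards: 32 of about 810 MB each and a final one of 524 MB. A rough size estimate, sketched below, suggests why the runtime may have crashed. The RAM figure is an assumption for illustration and should be replaced with the value of the actual runtime.

```python
# Shard sizes in MB as reported by the download log of this run.
full_shards = 32
full_shard_mb = 810
last_shard_mb = 524

total_mb = full_shards * full_shard_mb + last_shard_mb
total_gb = total_mb / 1000

# Assumption: RAM of the runtime in GB; check the value for your own session.
ASSUMED_RUNTIME_RAM_GB = 12.7

print(f"Checkpoint size: about {total_gb:.1f} GB")
print("Fits in assumed RAM:", total_gb <= ASSUMED_RUNTIME_RAM_GB)
```

If the estimate exceeds the available memory, loading the full-precision weights is unlikely to succeed without a larger runtime or a smaller variant of the model.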



## **Law explanation by fine-tuning the pre-trained BART model**

### **Self-Supervised Fine-Tuning - BART**

**Prepare the dataframe**

In [None]:
import pandas as pd
import seaborn as sns
%matplotlib inline

In [None]:
law_df=pd.read_csv("Data.csv")
law_df.head(10)

Unnamed: 0,Key,Law,Unnamed: 2
0,1,"(1)Where any person having sufficient means, n...",Maintainance
1,2,No Order for an allowance for the maintenance ...,Maintainance
2,3,An application for maintenance may be made :-\...,Maintainance
3,4,An application for maintenance may be made to ...,Maintainance
4,5,(1) Where any person against whom neglects to ...,Maintainance
5,6,(1) If on the application of a person entitled...,Maintainance
6,7,(1) Where an order for maintenance is made und...,Maintainance
7,8,On the application of any person receiving or ...,Maintainance
8,9,A copy of the order of maintenance certified u...,Maintainance
9,10,10.Every application for an order of maintenan...,Maintainance


In [None]:
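# Data cleaning
# Move the first column (Key) to the end of the dataframe
firstCol=law_df[law_df.columns[0]]
law_df=law_df.drop(columns=law_df.columns[0])
law_df[firstCol.name] = firstCol

# Remove duplicate rows and rows with missing values
law_df.drop_duplicates(inplace=True)
law_df.dropna(inplace=True)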

law_df.head(10)

Unnamed: 0,Law,Unnamed: 2,Key
0,"(1)Where any person having sufficient means, n...",Maintainance,1
1,No Order for an allowance for the maintenance ...,Maintainance,2
2,An application for maintenance may be made :-\...,Maintainance,3
3,An application for maintenance may be made to ...,Maintainance,4
4,(1) Where any person against whom neglects to ...,Maintainance,5
5,(1) If on the application of a person entitled...,Maintainance,6
6,(1) Where an order for maintenance is made und...,Maintainance,7
7,On the application of any person receiving or ...,Maintainance,8
8,A copy of the order of maintenance certified u...,Maintainance,9
9,10.Every application for an order of maintenan...,Maintainance,10


In [None]:
law_df.tail(10)

Unnamed: 0,Law,Unnamed: 2,Key
371,(1) The Competent Authority may restrict or pr...,Motor Traffic Law,372
372,Direction indicator signals are signals-\n(a) ...,Motor Traffic Law,373
373,(1) Police Officers on traffic duty may use Tr...,Motor Traffic Law,374
374,(1) Any person transporting dangerous goods or...,Motor Traffic Law,375
375,(1) Information which may have to be given des...,Motor Traffic Law,376
376,"(1) Where any instruction, information or symb...",Motor Traffic Law,377
377,On the reverse side of every traffic sign or n...,Motor Traffic Law,378
378,(1) Any sign or component of a sign specified ...,Motor Traffic Law,379
379,"No person shall-\n(a) fix to a sign, to its su...",Motor Traffic Law,380
380,"No person shall erect, exhibit or maintain or ...",Motor Traffic Law,381


#### Fine-Tuning the BART Model

In [None]:
from google.colab import drive

# Mount Google Drive
drive.mount('/content/gdrive')


Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


In [None]:
path = "/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07"
path

'/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07'

In [None]:
from transformers import BartTokenizer, BartForConditionalGeneration
from transformers import pipeline
from torch.utils.data import DataLoader, Dataset
from torch.nn import functional as F
from tqdm import tqdm
import torch
import pandas as pd


# Define your dataset class
class CustomDataset(Dataset):
    def __init__(self, data, tokenizer, max_length):
        self.data = data
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        text = self.data[idx]
        encoding = self.tokenizer(text, max_length=self.max_length, padding="max_length", return_tensors="pt", truncation=True)
        return {'input_ids': encoding['input_ids'].squeeze(), 'attention_mask': encoding['attention_mask'].squeeze()}

# Set the maximum length for padding
max_length = 128

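# Create an instance of the custom dataset
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
# Pass a plain list so that the integer positions requested by the DataLoader always resolve,
# even if drop_duplicates/dropna left gaps in the DataFrame index
custom_dataset = CustomDataset(law_df["Law"].tolist(), tokenizer, max_length)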

# Prepare DataLoader
batch_size = 4
dataloader = DataLoader(custom_dataset, batch_size=batch_size, shuffle=True)

# Initialize BART model
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

# Set up optimizer and training loop
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
num_epochs = 5

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

for epoch in range(num_epochs):
    model.train()
    total_loss = 0

    for batch in tqdm(dataloader, desc=f"Epoch {epoch + 1}/{num_epochs}"):
        input_ids = batch["input_ids"].to(device)
        attention_mask = batch["attention_mask"].to(device)

        outputs = model(input_ids, attention_mask=attention_mask, labels=input_ids)
        loss = outputs.loss

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_loss += loss.item()

    average_loss = total_loss / len(dataloader)
    print(f"Epoch {epoch + 1}/{num_epochs}, Average Loss: {average_loss}")

# Save the fine-tuned model to Google Drive
model.save_pretrained("/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart")
tokenizer.save_pretrained("/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart")


Epoch 1/5: 100%|██████████| 96/96 [00:51<00:00,  1.86it/s]


Epoch 1/5, Average Loss: 0.17519180707373985


Epoch 2/5: 100%|██████████| 96/96 [00:53<00:00,  1.81it/s]


Epoch 2/5, Average Loss: 0.02266784636109757


Epoch 3/5: 100%|██████████| 96/96 [00:52<00:00,  1.83it/s]


Epoch 3/5, Average Loss: 0.02318835431651678


Epoch 4/5: 100%|██████████| 96/96 [00:52<00:00,  1.82it/s]


Epoch 4/5, Average Loss: 0.01350303601854345


Epoch 5/5: 100%|██████████| 96/96 [00:52<00:00,  1.82it/s]
Non-default generation parameters: {'max_length': 142, 'min_length': 56, 'early_stopping': True, 'num_beams': 4, 'length_penalty': 2.0, 'no_repeat_ngram_size': 3, 'forced_bos_token_id': 0, 'forced_eos_token_id': 2}


Epoch 5/5, Average Loss: 0.01724161124487485


('/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart/tokenizer_config.json',
 '/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart/special_tokens_map.json',
 '/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart/vocab.json',
 '/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart/merges.txt',
 '/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart/added_tokens.json')
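
The training loop above passes `labels=input_ids`, so the padded positions (every sequence is padded to 128 tokens) also count towards the loss. A common refinement, sketched here with plain lists and hypothetical token ids, is to replace the labels at padded positions with `-100`, the value that PyTorch's cross-entropy loss ignores by default. The helper name `mask_padding` is illustrative and not part of the notebook.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss by default

def mask_padding(input_ids, attention_mask):
    """Copy input_ids into labels, replacing padded positions with IGNORE_INDEX."""
    return [
        token if keep == 1 else IGNORE_INDEX
        for token, keep in zip(input_ids, attention_mask)
    ]

# Hypothetical example: token id 1 is BART's padding id.
input_ids = [0, 713, 16, 2, 1, 1]
attention_mask = [1, 1, 1, 1, 0, 0]

labels = mask_padding(input_ids, attention_mask)
print(labels)  # [0, 713, 16, 2, -100, -100]
```

With the batch tensors from the loop, the equivalent would be `labels = input_ids.clone()` followed by `labels[attention_mask == 0] = -100`. Loss values obtained this way are not directly comparable with the averages recorded above.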

***Using the fine-tuned BART model to generate a summary.***

In [None]:
from transformers import BartTokenizer, BartForConditionalGeneration

# Load the fine-tuned model and tokenizer
model = BartForConditionalGeneration.from_pretrained("/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart")
tokenizer = BartTokenizer.from_pretrained("/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart")

# Generate text
input_text = """
Where a request for information is refused by an information officer, such officer shall specify the following information in the communication to be sent under section 25(1) to the citizen who made the request:
(a) the grounds on which such request is refused and
(b) the period within which and the person to whom an appeal against such refusal may be preferred under section 32 of this Act.
"""
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(input_ids)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print("Generated Text:", generated_text)


Generated Text: Where a request for information is refused by an information officer, such officer shall specify the following information in the communication to be sent under section 25(1) to the citizen who made the request:
(a) the grounds on which such request is refused and
(b) the period within which and the person to whom an appeal against such refusal may be preferred under section 32 of this Act.
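
The generated text above is identical to the input, which suggests that training the model to reproduce its own input encourages copying rather than explaining. One way to quantify this, sketched below with the standard library, is a similarity score between the input and the output. The helper `copy_ratio` is illustrative and not part of the notebook.

```python
from difflib import SequenceMatcher

def copy_ratio(source: str, generated: str) -> float:
    """Return a similarity score in [0, 1]; 1.0 means the texts match exactly."""
    # Collapse whitespace so that line breaks do not affect the comparison.
    a = " ".join(source.split())
    b = " ".join(generated.split())
    return SequenceMatcher(None, a, b).ratio()

source = "(a) the grounds on which such request is refused and"
paraphrase = "The officer must state why the request was refused."

print(copy_ratio(source, source))      # identical texts
print(copy_ratio(source, paraphrase))  # reworded text scores lower
```

A score close to 1.0 across many inputs would indicate that the model is echoing the statute instead of rewording it.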



In [None]:
from transformers import pipeline, BartTokenizer, BartForCausalLM

# Load the fine-tuned BART model and tokenizer
model = BartForCausalLM.from_pretrained("/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart")
tokenizer = BartTokenizer.from_pretrained("/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart")

# Set up the text generation pipeline
explanation_pipeline = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Generate text
input_text = """
(1)Where any person having sufficient means, neglects or unreasonably refuses to maintain such person's spouse who is unable to maintain himself or herself, the Magistrate may, upon an application being made for maintenance and upon proof of such neglect or unreasonable refusal, order such person to make a monthly allowance for the maintenance of such spouse at such monthly rate as the Magistrate thinks fit, having regard to the income of such 'person and the means and circumstances of such spouse : Provided however, that no such order shall be made if the applicant spouse is living in adultery or both the spouses are living separately by mutual consent.
(2)Where a parent having sufficient means neglects or refuses to maintain his or her child who is unable to maintain himself or herself, the Magistrate may upon an application being made for maintenance and upon proof of such neglect or refusal, order such parent to make a monthly allowance for the maintenance of such child at such monthly rate as the Magistrate thinks fit, having regard to the income of the parents and the means and circumstances of the child. Order for Maintenance of a spouse or child or adult offspring or disabled offspring.
Provided however, that no such order shall be made in the case of a non-marital child unless parentage is established by cogent evidence to the satisfaction of the Magistrate.
(3) Where a parent having sufficient means neglects or refuses to maintain his or her adult offspring who is unable to maintain himself or herself, the Magistrate may upon an application being made for maintenance and upon proof of such neglect or refusal, order such parent to make a monthly allowance for the Maintenance of such adult offspring at such monthly rate as the Magistrate thinks fit, having regard to the income of the parents and the means and circumstances of the adult offspring. Provided however, that no such order shall be made in the case of an non-marital adult offspring unless parentage is established by cogent evidence to the satisfaction of the Magistrate.
(4) where a parent having sufficient means neglects or refuses to maintain his or her disabled offspring who is unable to maintain himself or herself, the Magistrate may upon an application being made for maintenance and upon proof of such neglect or refusal, order such parent to make a monthly allowance for the maintenance of such disabled offspring at such monthly rate as the Magistrate thinks fit, having regard to the income of the parents and the means and circumstances of the disabled offspring. Provided however, that no such order shall be made in the case of a disabled non-marital offspring unless parentage is established by cogent evidence to the satisfaction of the Magistrate.
(5) Where an order is made by a Magistrate for the payment of an allowance pursuant to an application made under subsection (1) or (2) or (3) or (4), such allowance shall be payable from the date on which the application for maintenance was made to such court, unless the Magistrate, for good reasons to be recorded, orders payment from any other date.
(6) Where an application is made for the maintenance of a child, adult offspring or disabled offspring, as the case may be under subsection (2), (3) or (4), as the case may be, the court may, either on the application of the parties or of its own motion, add the other parent as a party to such application and make such order as is appropriate against one or both such parents. (Maintenance)
"""

# Generate an explanation
explanation = explanation_pipeline(input_text, max_length=150, min_length=50, length_penalty=2.0, num_beams=4, early_stopping=True)

# Print the generated explanation
print("Explanation:", explanation[0]['generated_text'])


Some weights of BartForCausalLM were not initialized from the model checkpoint at /content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart and are newly initialized: ['lm_head.weight', 'model.decoder.embed_tokens.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.


Explanation: 
(1)Where any person having sufficient means, neglects or unreasonably refuses to maintain such person's spouse who is unable to maintain himself or herself, the Magistrate may, upon an application being made for maintenance and upon proof of such neglect or unreasonable refusal, order such person to make a monthly allowance for the maintenance of such spouse at such monthly rate as the Magistrate thinks fit, having regard to the income of such 'person and the means and circumstances of such spouse : Provided however, that no such order shall be made if the applicant spouse is living in adultery or both the spouses are living separately by mutual consent.
(2)Where a parent having sufficient means neglects or refuses to maintain his or her child who is unable to maintain himself or herself, the Magistrate may upon an application being made for maintenance and upon proof of such neglect or refusal, order such parent to make a monthly allowance for the maintenance of such chi
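
The output above stops partway through subsection (2), well short of the full input. One possible workaround, which is not part of the original notebook, is to split a section at its numbered subsections and process each piece separately. The sketch below assumes that every subsection starts on a new line with a marker such as `(1)` or `(12)`; the helper `split_subsections` is illustrative.

```python
import re

def split_subsections(text: str) -> list:
    """Split a section of an Act into its numbered subsections."""
    # Split just before every line that begins with a marker like "(1)".
    # Lettered paragraphs such as "(a)" stay inside their subsection.
    parts = re.split(r"(?m)^(?=\(\d+\))", text.strip())
    return [part.strip() for part in parts if part.strip()]

sample = """
(1) First subsection.
(a) a lettered paragraph
(2) Second subsection.
"""

for piece in split_subsections(sample):
    print(repr(piece))
```

Each returned piece could then be passed to the generation function on its own, which keeps every input comfortably within the model's length limit.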

In [None]:
from transformers import pipeline, BartTokenizer, BartForConditionalGeneration

# Load the fine-tuned BART model and tokenizer
model = BartForConditionalGeneration.from_pretrained("/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart")
tokenizer = BartTokenizer.from_pretrained("/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart")

# Set up the summarization pipeline
summarization_pipeline = pipeline("summarization", model=model, tokenizer=tokenizer)

# Generate text
input_text = """
(1) If on the application of a person entitled to receive any payment under an order of maintenance, it appears to the Magistrate that the respondent has defaulted in the payment of maintenance due for a period exceeding two months, the Magistrate may, after inquiry, by an order,(hereinafter referred "to as an attachment of salary order) require the person to whom the order is directed, being a person appearing to the Magistrate to be the respondent's employer, to deduct, for such period as may be specified in the order, such amount from the respondent's salary as may be specified in the order and forthwith to remit that amount to the applicant in the manner directed by Court.
(2) (a) Before an order is made under subsection (1) of this section, the Magistrate shall notice the person on whom he proposes to serve such order, to show cause, if any, why an order should not be made under that subsection, and to require him to furnish to the court, within such period as may be specified in such order, the salary particulars of the respondent. Any order made under subsection (1) of this section may be the subject of an appeal to a High Court established by Article 154P of the Constitution by any person aggrieved by such order, but notwithstanding such appeal, the Magistrate may decide to continue proceedings under this Act. The provisions of section 14 of this Act shall apply to, and in relation to, every such appeal.
(b) The Magistrate may also by an order served on the respondent, require him to furnish to the Court within such period as may be specified in such order,a statement specifying
(i) the name and address of his employer or employers as the case may be, if he has more than one employer
(ii)such particulars as to his salary, inclusive of deductions, as may be within his knowledge and (iii) any other particulars as are required or necessary to enable his employer or employers to identify him.
(3) A statement furnished in compliance with an order made under paragraph (b) of subsection (2) shall, in any proceedings of any court, be received as evidence and be deemed to be prima facie proof of the particulars referred to in the said paragraph, unless the contrary is shown.
(4) The Magistrate shall not make an attachment of salary order, if it appears to him, that the failure of the respondent to make any payment in accordance with the order of maintenance in question, was not due to his wilful refusal or culpable neglect.
(5) In determining the amount to be deducted from the respondent's salary in terms of subsection (1) of this section,the Magistrate shall have regard to the resources and needs of the respondent, and the needs of the person, the payment of whose maintenance is in default.
(6) An attachment of salary order shall not come into force until the expiration of fourteen days from the date on which a copy of the order is served on the person to whom the order is directed.
(7) An attachment of salary order may on the application of the respondent or the person entitled to receive payment under the order of maintenance, be discharged or varied.
(8) A person to whom an attachment of salary order is directed shall be subject to the provisions of this Act, comply with the order or, if the order is subsequently varied under subsection (7), with the order as varied.
(9) Where, on any occasion on which any deductions have to be made from the salary of a respondent in pursuance of an attachment of salary order, there are in force, two or more orders for attachment of salary, relating to such salary, made under this Act or other written law, then , for the purposes of complying with this section, the employer shall, notwithstanding anything to the contrary in any other written law, first give effect to an order of attachment made under this Act and deal with any other order in respect of the residue of the respondent's salary according to the respective dates on which they came into force.
(10)An employer who in pursuance of an attachment of salary order makes any payment shall forthwith give to the respondent a statement in writing specifying the amount deducted from his salary in pursuance of such order.
(11)Any employer who fails or neglects to comply with an attachment of salary order shall be liable on conviction by a Magistrate's Court to a fine not exceeding five hundred rupees and in the case of a second or subsequent conviction in respect of the same attachment of salary order, to a fine not exceeding one thousand rupees : Provided however, it shall be a defense for an employer charged with failing or neglecting to comply with an attachment of salary order, to prove that he took all reasonable steps to comply with such order.
(12)The provisions of this section shall, notwithstanding anything to the contrary in any other written law, have effect in relation to an attachment of salary that may be made by a Magistrate under this Act.
(13)For the purposes of this section -
(a) where the respondent is a public officer or an officer of a provincial public service, the head of the department to which he is for the time being attached shall be deemed to be his employer
(b) where the respondent is a member of the Local Government Service and employed in any local authority, the Commissioner if it be a Municipal Council or the Chairman if it be an Urban Council or a Pradeshiya Sabha, as the case may be, shall be deemed to be his employer
(c) where the respondent is a person employed in any Corporation, Statutory Board or Company, the principal officer of such Corporation, Statutory Board or Company, as the case may be, shall be deemed to be his employer
(d) where the respondent is a person employed in any partnership, the Managing partner or the Manager of such partnership shall be deemed to be his employer and (e) where the respondent is a member of the armed forces, the commander of the unit to which he is attached shall be deemed to be his employer. (Maintenance)Act.
"""

# Generate a summary/explanation
summarization = summarization_pipeline(input_text, max_length=42, min_length=20, length_penalty=2.0, num_beams=4, early_stopping=True)

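# Print the generated summary (the summarization pipeline returns its text under 'summary_text').
# Note: the recorded output below came from printing `explanation`, the previous cell's result,
# so it repeats that cell's text instead of showing a summary.
print("Explanation:", summarization[0]['summary_text'])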


Explanation: 
(1)Where any person having sufficient means, neglects or unreasonably refuses to maintain such person's spouse who is unable to maintain himself or herself, the Magistrate may, upon an application being made for maintenance and upon proof of such neglect or unreasonable refusal, order such person to make a monthly allowance for the maintenance of such spouse at such monthly rate as the Magistrate thinks fit, having regard to the income of such 'person and the means and circumstances of such spouse : Provided however, that no such order shall be made if the applicant spouse is living in adultery or both the spouses are living separately by mutual consent.
(2)Where a parent having sufficient means neglects or refuses to maintain his or her child who is unable to maintain himself or herself, the Magistrate may upon an application being made for maintenance and upon proof of such neglect or refusal, order such parent to make a monthly allowance for the maintenance of such chi
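
The section passed to the pipeline above contains numbered subsections up to (13), yet the summary budget is 42 tokens, and the printed output simply echoes the opening subsections. One option, sketched here as an assumption rather than something the notebook does, is to split the section on its `(1)`, `(2)`, ... markers and summarize each subsection separately. The helper name `split_subsections` is hypothetical, the sample text is a placeholder, and the regular expression assumes each subsection starts on a new line with a parenthesised number.

```python
import re

def split_subsections(section_text: str) -> list[str]:
    """Split a legal section into chunks that each start with a marker like "(1)"."""
    # Zero-width split at every line start that is followed by "(<digits>)".
    # Lettered paragraphs such as "(a)" do not match and stay with their subsection.
    parts = re.split(r"(?m)^(?=\(\d+\))", section_text)
    return [part.strip() for part in parts if part.strip()]

# Placeholder text for illustration only (not taken from any Act).
sample = """
(1)First subsection text.
(a) a lettered paragraph that belongs to subsection (1)
(2)Second subsection text.
"""

chunks = split_subsections(sample)
for number, chunk in enumerate(chunks, start=1):
    print(f"chunk {number}: {chunk!r}")
```

Each chunk could then be passed to the summarization call on its own, which keeps every input well under the model's length limit.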

In [None]:
from transformers import BartTokenizer, BartForConditionalGeneration

# Load fine-tuned BART model and tokenizer
fine_tuned_model_path = "/content/gdrive/MyDrive/Yr_2/DSGP/Data_Science_Group_Project_-_Group_07/self_supervised_fine_tuned_bart"
tokenizer = BartTokenizer.from_pretrained(fine_tuned_model_path)
model = BartForConditionalGeneration.from_pretrained(fine_tuned_model_path)

# Function to generate summary
def generate_summary(document):
    inputs = tokenizer(document, max_length=1024, return_tensors="pt", truncation=True)
    summary_ids = model.generate(inputs["input_ids"], max_length=150, length_penalty=2.0, num_beams=4, early_stopping=True)
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# Example usage
law_text = """
(1) Victim impact statement
A victim of crime shall have the right to make a statement in writing (in this Act referred to as the “victim impact statement”) to the court or Commission to describe the manner in which the offense alleged to have been committed has impacted him physically, emotionally, psychologically, financially, professionally or in any other manner.
(2) Where any victim of crime is unable to make, or incapable of making, such a victim impact statement due to any reason acceptable to the court or Commission, any other person on behalf of the victim of crime as may be permitted by the court or Commission may make such statement.
(3) The victim impact statement shall consist of -
(a) a victim personal statement and
(b) a victim impact report.
A victim personal statement referred to in paragraph (a) -
(a) shall set out the physical, emotional, psychological, financial, professional or other impact of the offense on the victim of crime
(b) may contain a statement, where applicable, whether the offense has been motivated by the age, gender,ethnicity, faith, religion, sexuality or disability of the victim of crime
(c) may state whether the victim wishes to claim compensation or requires any assistance as provided for in this Act.
(5) A victim impact report referred to in paragraph (b) shall be a report issued by a medical expert or psychologist and shall-
(a) contain an opinion on the traumatic impact of the offense on the victim of crime and
(b) contain a report on needs assessment of the victim of crime, consequent to the impact of the offense on the victim of crime.
"""

summary = generate_summary(law_text)

# Print the generated summary
print("Original Document:\n", law_text)
print("\n")
print("\nGenerated Summary:\n", summary)


Original Document:
 
(1) Victim impact statement
A victim of crime shall have the right to make a statement in writing (in this Act referred to as the “victim impact statement”) to the court or Commission to describe the manner in which the offense alleged to have been committed has impacted him physically, emotionally, psychologically, financially, professionally or in any other manner.
(2) Where any victim of crime is unable to make, or incapable of making, such a victim impact statement due to any reason acceptable to the court or Commission, any other person on behalf of the victim of crime as may be permitted by the court or Commission may make such statement.
(3) The victim impact statement shall consist of -
(a) a victim personal statement and
(b) a victim impact report.
A victim personal statement referred to in paragraph (a) -
(a) shall set out the physical, emotional, psychological, financial, professional or other impact of the offense on the victim of crime
(b) may contain 

***Using the pre-trained BART model without fine-tuning***

In [None]:
from transformers import BartTokenizer, BartForConditionalGeneration

# Load pre-trained BART model and tokenizer
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

# Function to generate summary
def generate_summary(document):
    inputs = tokenizer(document, max_length=1024, return_tensors="pt", truncation=True)
    summary_ids = model.generate(inputs["input_ids"], max_length=150, length_penalty=2.0, num_beams=4, early_stopping=True)
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# Example usage
law_text = """
(1) Victim impact statement
A victim of crime shall have the right to make a statement in writing (in this Act referred to as the “victim impact statement”) to the court or Commission to describe the manner in which the offense alleged to have been committed has impacted him physically, emotionally, psychologically, financially, professionally or in any other manner.
(2) Where any victim of crime is unable to make, or incapable of making, such a victim impact statement due to any reason acceptable to the court or Commission, any other person on behalf of the victim of crime as may be permitted by the court or Commission may make such statement.
(3) The victim impact statement shall consist of -
(a) a victim personal statement and
(b) a victim impact report.
A victim personal statement referred to in paragraph (a) -
(a) shall set out the physical, emotional, psychological, financial, professional or other impact of the offense on the victim of crime
(b) may contain a statement, where applicable, whether the offense has been motivated by the age, gender,ethnicity, faith, religion, sexuality or disability of the victim of crime
(c) may state whether the victim wishes to claim compensation or requires any assistance as provided for in this Act.
(5) A victim impact report referred to in paragraph (b) shall be a report issued by a medical expert or psychologist and shall-
(a) contain an opinion on the traumatic impact of the offense on the victim of crime and
(b) contain a report on needs assessment of the victim of crime, consequent to the impact of the offense on the victim of crime.
"""
summary = generate_summary(law_text)

# Print the generated summary
print("Original Document:\n", law_text)
print("\nGenerated Summary:\n", summary)


Original Document:
 
(1) Victim impact statement
A victim of crime shall have the right to make a statement in writing (in this Act referred to as the “victim impact statement”) to the court or Commission to describe the manner in which the offense alleged to have been committed has impacted him physically, emotionally, psychologically, financially, professionally or in any other manner.
(2) Where any victim of crime is unable to make, or incapable of making, such a victim impact statement due to any reason acceptable to the court or Commission, any other person on behalf of the victim of crime as may be permitted by the court or Commission may make such statement.
(3) The victim impact statement shall consist of -
(a) a victim personal statement and
(b) a victim impact report.
A victim personal statement referred to in paragraph (a) -
(a) shall set out the physical, emotional, psychological, financial, professional or other impact of the offense on the victim of crime
(b) may contain 

***Using the pre-trained T5-Base model without fine-tuning***

-------------------------------------------

In [None]:
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load pre-trained T5 model and tokenizer
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Function to generate summary
def generate_summary(document):
    inputs = tokenizer("summarize: " + document, return_tensors="pt", max_length=512, truncation=True)
    summary_ids = model.generate(inputs["input_ids"], max_length=150, length_penalty=2.0, num_beams=4, early_stopping=True)
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# Example usage
law_text = """
(1) Victim impact statement
A victim of crime shall have the right to make a statement in writing (in this Act referred to as the “victim impact statement”) to the court or Commission to describe the manner in which the offense alleged to have been committed has impacted him physically, emotionally, psychologically, financially, professionally or in any other manner.
(2) Where any victim of crime is unable to make, or incapable of making, such a victim impact statement due to any reason acceptable to the court or Commission, any other person on behalf of the victim of crime as may be permitted by the court or Commission may make such statement.
(3) The victim impact statement shall consist of -
(a) a victim personal statement and
(b) a victim impact report.
A victim personal statement referred to in paragraph (a) -
(a) shall set out the physical, emotional, psychological, financial, professional or other impact of the offense on the victim of crime
(b) may contain a statement, where applicable, whether the offense has been motivated by the age, gender,ethnicity, faith, religion, sexuality or disability of the victim of crime
(c) may state whether the victim wishes to claim compensation or requires any assistance as provided for in this Act.
(5) A victim impact report referred to in paragraph (b) shall be a report issued by a medical expert or psychologist and shall-
(a) contain an opinion on the traumatic impact of the offense on the victim of crime and
(b) contain a report on needs assessment of the victim of crime, consequent to the impact of the offense on the victim of crime.
"""
summary = generate_summary(law_text)

# Print the generated summary
print("Original Document:\n", law_text)
print("\nGenerated Summary:\n", summary)


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


Original Document:
 
(1) Victim impact statement
A victim of crime shall have the right to make a statement in writing (in this Act referred to as the “victim impact statement”) to the court or Commission to describe the manner in which the offense alleged to have been committed has impacted him physically, emotionally, psychologically, financially, professionally or in any other manner.
(2) Where any victim of crime is unable to make, or incapable of making, such a victim impact statement due to any reason acceptable to the court or Commission, any other person on behalf of the victim of crime as may be permitted by the court or Commission may make such statement.
(3) The victim impact statement shall consist of -
(a) a victim personal statement and
(b) a victim impact report.
A victim personal statement referred to in paragraph (a) -
(a) shall set out the physical, emotional, psychological, financial, professional or other impact of the offense on the victim of crime
(b) may contain 

***Using the pre-trained BERT model without fine-tuning***

In [None]:
from transformers import BertTokenizer, BertModel
import torch
import numpy as np

# Load pre-trained BERT model and tokenizer
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

# Function to generate summary using extractive summarization
def generate_summary(document, max_sentences=3):
    # Split the document into sentences on full stops, dropping empty pieces
    sentences = [sent.strip() for sent in document.split(".") if sent.strip()]
    if len(sentences) <= max_sentences:
        return ". ".join(sentences) + "."

    # Get one embedding per sentence by mean-pooling BERT's last hidden state
    # over the real (non-padding) tokens
    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    sentence_embeddings = ((outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)).numpy()

    # Score each sentence by cosine similarity to the centroid of all
    # sentence embeddings (one simple centrality heuristic)
    centroid = sentence_embeddings.mean(axis=0)
    norms = np.linalg.norm(sentence_embeddings, axis=1) * np.linalg.norm(centroid)
    sentence_importance = (sentence_embeddings @ centroid) / norms

    # Get the indices of the most important sentences, kept in document order
    top_sentences_indices = sorted(sentence_importance.argsort()[-max_sentences:])

    # Extract and concatenate the top sentences to form the summary
    summary = ". ".join(sentences[i] for i in top_sentences_indices) + "."

    return summary

# Example usage
law_text = """
(1) Victim impact statement
A victim of crime shall have the right to make a statement in writing (in this Act referred to as the “victim impact statement”) to the court or Commission to describe the manner in which the offense alleged to have been committed has impacted him physically, emotionally, psychologically, financially, professionally or in any other manner.
(2) Where any victim of crime is unable to make, or incapable of making, such a victim impact statement due to any reason acceptable to the court or Commission, any other person on behalf of the victim of crime as may be permitted by the court or Commission may make such statement.
(3) The victim impact statement shall consist of -
(a) a victim personal statement and
(b) a victim impact report.
A victim personal statement referred to in paragraph (a) -
(a) shall set out the physical, emotional, psychological, financial, professional or other impact of the offense on the victim of crime
(b) may contain a statement, where applicable, whether the offense has been motivated by the age, gender,ethnicity, faith, religion, sexuality or disability of the victim of crime
(c) may state whether the victim wishes to claim compensation or requires any assistance as provided for in this Act.
(5) A victim impact report referred to in paragraph (b) shall be a report issued by a medical expert or psychologist and shall-
(a) contain an opinion on the traumatic impact of the offense on the victim of crime and
(b) contain a report on needs assessment of the victim of crime, consequent to the impact of the offense on the victim of crime.
"""
summary = generate_summary(law_text)

# Print the generated summary
print("Original Document:\n", law_text)
print("\nGenerated Summary:\n", summary)


Original Document:
 
(1) Victim impact statement
A victim of crime shall have the right to make a statement in writing (in this Act referred to as the “victim impact statement”) to the court or Commission to describe the manner in which the offense alleged to have been committed has impacted him physically, emotionally, psychologically, financially, professionally or in any other manner.
(2) Where any victim of crime is unable to make, or incapable of making, such a victim impact statement due to any reason acceptable to the court or Commission, any other person on behalf of the victim of crime as may be permitted by the court or Commission may make such statement.
(3) The victim impact statement shall consist of -
(a) a victim personal statement and
(b) a victim impact report.
A victim personal statement referred to in paragraph (a) -
(a) shall set out the physical, emotional, psychological, financial, professional or other impact of the offense on the victim of crime
(b) may contain 
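
Each cell above redefines `generate_summary`, so only the most recently run model is available at any time. A small harness like the following could keep the candidates side by side on the same input. It is only a sketch: `compare_summarizers` and the two stand-in summarizers are hypothetical, and in the notebook the dictionary values would instead be the model-backed functions, each saved under its own name.

```python
from typing import Callable, Dict

def compare_summarizers(text: str, summarizers: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Run every summarizer on the same text and collect the outputs by name."""
    results = {}
    for name, summarize in summarizers.items():
        try:
            results[name] = summarize(text)
        except Exception as error:
            # Record the failure and keep going so one model cannot stop the comparison.
            results[name] = f"<failed: {error}>"
    return results

# Stand-in summarizers so the sketch runs without downloading any model.
def first_sentence(text: str) -> str:
    return text.strip().split(".")[0] + "."

def echo_input(text: str) -> str:
    return text.strip()

outputs = compare_summarizers(
    "A victim of crime may make a statement. It describes the impact of the offense.",
    {"first_sentence": first_sentence, "echo_input": echo_input},
)
for name, summary in outputs.items():
    print(f"{name}: {summary}")
```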

### **Fine-tuned Model (BART)**
- Most of the time, the output is identical to the input.
- Sometimes it returns only half of the response.
- Sometimes it generates meaningless words.

### **BART (Without Fine-tuning)**
- The summary is acceptable.

### **T5 (Without Fine-tuning)**
- The summary is acceptable.
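
The observations above are qualitative. To put a number on "the output is the same as the input", one could compare each summary with its source at the word level. The following is only a sketch using Python's standard `difflib`; the function name `copy_metrics` and the two reported measures are assumptions, not something taken from the notebook.

```python
from difflib import SequenceMatcher

def copy_metrics(source: str, summary: str) -> dict:
    """Report how much shorter the summary is and how closely it copies the source."""
    source_words = source.split()
    summary_words = summary.split()
    # Summary length as a fraction of the source length (1.0 means no shortening).
    length_ratio = len(summary_words) / max(len(source_words), 1)
    # Word-sequence similarity in [0, 1]; 1.0 means the word sequences are identical.
    # autojunk=False keeps frequent words from being ignored on long inputs.
    similarity = SequenceMatcher(None, source_words, summary_words, autojunk=False).ratio()
    return {"length_ratio": length_ratio, "similarity": similarity}

# Placeholder strings for illustration; in the notebook these would be law_text and summary.
metrics = copy_metrics("a b c d", "a b")
print(metrics)
```

A `length_ratio` close to 1.0 together with a `similarity` close to 1.0 would correspond to the "same as the input" behaviour noted for the fine-tuned model.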


