## Legal RAG Solution

## 1. Setup

Installing the required libraries.

In [3]:
! pip install langchain_openai langchain whisper langchain_community scikit-learn langchain_pinecone langchain[docarray] docarray pydantic==1.10.8 pytube  python-dotenv tiktoken ruff --quiet


Setting the environment variables the workflow needs. The values are left blank here; fill in your own keys.

In [2]:
import os
os.environ['OPENAI_API_KEY'] = ""
os.environ['PINECONE_API_KEY'] = ""
os.environ['PINECONE_ENV'] = ""
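
Since the keys above are blank placeholders, it can help to fail early when one is missing instead of hitting an authentication error later. The following is a minimal sketch; `require_env` is a hypothetical helper, not part of any library used in this notebook.

```python
import os

def require_env(names):
    """Return the requested environment variables, raising if any is missing or empty."""
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: os.environ[name] for name in names}

# Example call (raises RuntimeError while the keys are still blank):
# keys = require_env(["OPENAI_API_KEY", "PINECONE_API_KEY"])
```

Calling it right after the cell above would flag any key that is still an empty string.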


Let's define the LLM model that we'll use as part of the workflow.

In [4]:
from langchain_openai.chat_models import ChatOpenAI

# Pass the actual key from the environment, not the literal string 'OPENAI_API_KEY'
model = ChatOpenAI(openai_api_key=os.environ['OPENAI_API_KEY'], model="gpt-3.5-turbo")

## 2. Loading the document

In [7]:
! pip install datasets



In [6]:
from datasets import load_dataset
dataset = load_dataset("ninadn/indian-legal")
dataset



DatasetDict({
    train: Dataset({
        features: ['Text', 'Summary'],
        num_rows: 7030
    })
    test: Dataset({
        features: ['Text', 'Summary'],
        num_rows: 100
    })
})

In [8]:
import pandas as pd
df = pd.DataFrame(dataset['test'])
df


Unnamed: 0,Text,Summary
0,Appeal No. 101 of 1959.\nAppeal by special lea...,The appellants who are displaced persons from ...
1,Appeal No. 52 of 1957.\nAppeal from the judgme...,The appellants and the respondents were owners...
2,Appeals Nos. 45 and 46 of 1959.\nAppeal by spe...,The respondents firm claimed exemption from Sa...
3,ION: Criminal Appeal 89 of 1961.\nAppeal by sp...,The appellant was tried for murder.\nThe facts...
4,Civil Appeal No. 50 of 1961.\nAppeal by specia...,"S, employed by the appellant as a cross cutter..."
...,...,...
95,Appeal No. 1367 of 1980.\nFrom the Judgment an...,Proceedings were commenced under Chapter III B...
96,Appeal No. 1695 of 1993.\nFrom the Judgment an...,"The plaintiff, predecessor in interest of the ..."
97,iminal Appeal No. 46 of 1957.\nAppeal by speci...,Conciliation proceedings were started in Janua...
98,N: Criminal Appeal No. 8 of 1951.\nAppeal from...,Sub section (1) of sec.\n19 of the Bombay Rent...


In [9]:
df = df.drop(columns=['Summary'])


In [10]:
df

Unnamed: 0,Text
0,Appeal No. 101 of 1959.\nAppeal by special lea...
1,Appeal No. 52 of 1957.\nAppeal from the judgme...
2,Appeals Nos. 45 and 46 of 1959.\nAppeal by spe...
3,ION: Criminal Appeal 89 of 1961.\nAppeal by sp...
4,Civil Appeal No. 50 of 1961.\nAppeal by specia...
...,...
95,Appeal No. 1367 of 1980.\nFrom the Judgment an...
96,Appeal No. 1695 of 1993.\nFrom the Judgment an...
97,iminal Appeal No. 46 of 1957.\nAppeal by speci...
98,N: Criminal Appeal No. 8 of 1951.\nAppeal from...


In [11]:
df.head(18)

Unnamed: 0,Text
0,Appeal No. 101 of 1959.\nAppeal by special lea...
1,Appeal No. 52 of 1957.\nAppeal from the judgme...
2,Appeals Nos. 45 and 46 of 1959.\nAppeal by spe...
3,ION: Criminal Appeal 89 of 1961.\nAppeal by sp...
4,Civil Appeal No. 50 of 1961.\nAppeal by specia...
5,iminal Appeals Nos.\n160 to 162 of 1960.\nAppe...
6,l Appeals Nos.\n15 to 19 of 1962.\nAppeal from...
7,"Appeals, Nos. 275 276 of 1963.\nAppeals by spe..."
8,Appeals Nos. 884 887 of 1962.\nAppeals from th...
9,Appeal No. 251 of 1963.\nAppeal by special lea...


In [12]:
df.to_csv('data_space_separated.txt', sep=' ', index=False)


In [13]:
df.to_csv('data.csv', index=False, sep=';')


In [14]:
# Colab-only: outside Google Colab this import raises ModuleNotFoundError
from google.colab import files
files.download('data.csv')

ModuleNotFoundError: No module named 'google'

In [15]:
from langchain_community.document_loaders.csv_loader import CSVLoader
import sys
import csv
csv.field_size_limit(sys.maxsize)
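# data.csv was written with sep=';', so the loader must use the same delimiter
loader = CSVLoader(file_path='data.csv', csv_args={'delimiter': ';'})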
data = loader.load()
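
The file was saved with `sep=';'`, so the delimiter used when reading it matters. The sketch below uses only the standard `csv` module and a made-up two-row sample (not the real dataset) to show what happens when a semicolon-delimited file is parsed with the default comma.

```python
import csv
import io

# Hypothetical sample in the same layout as data.csv: one 'Text' column, sep=';'.
# Row 1 contains a ';' so it is quoted; row 2 contains only a ',' so it is not.
sample = 'Text\n"Appeal No. 1; heard in 1959."\nAppeal No. 2, decided in 1960.\n'

def read_rows(text, delimiter):
    """Parse CSV text into a list of dicts using the given delimiter."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

rows_semicolon = read_rows(sample, ";")  # matches how the file was written
rows_comma = read_rows(sample, ",")      # default delimiter, mismatched

print(rows_semicolon[1]["Text"])  # full text of row 2
print(rows_comma[1]["Text"])      # row 2 cut off at the first comma
```

Rows that happen to be quoted (for example because they contain newlines) survive either way, which is why a mismatch like this can go unnoticed.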

In [16]:
with open("data.csv") as file:
    transcription = file.read()

transcription[:100]

'Text\n"Appeal No. 101 of 1959.\nAppeal by special leave from the judgment and order dated November 8, '

## 3. Chunking & Indexing the CSV

In [18]:
from langchain_community.document_loaders import CSVLoader

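# data.csv was written with sep=';', so read it with the same delimiter
loader = CSVLoader("data.csv", csv_args={'delimiter': ';'})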
text_documents = loader.load()
text_documents

[Document(metadata={'source': 'data.csv', 'row': 0}, page_content='Text: Appeal No. 101 of 1959.\nAppeal by special leave from the judgment and order dated November 8, 1957, of the Deputy Custodian General, Evacuee Property, Now Delhi Revision Petition No. 17 R/55 of 1955.\nAchhru Ram and K. L. Mehta for the appellants.\nB.K., Khanna and, T. M. Sen, for the respondent No. 1.\nN.S. Bindra and A. G. Ratnaparkhi, for the respondents Nos.\nMarch 15.\nThe Judgment of the Court was delivered by MUDHOLKAR J.\nThe appellants who are admittedly displaced persons from West Pakistan were granted quasi permanent allotment of 24 standard acres and 15 3/4 units in the village of Raikot in Ludhiana District in 1949.\nTheir father Sardar Nand Singh who was 42 330 found entitled to quasi permanent allotment of 40 standard acres and 5 1/4 units of land was given quasipermanent allotment in another village named Humbran in the same district.\nThe two villages are, however, 25 miles or so distant from eac

In [20]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents=text_splitter.split_documents(text_documents)
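
To see what `chunk_size=1000` and `chunk_overlap=200` mean, here is a simplified sliding-window sketch in plain Python. It is an assumption-laden illustration only: `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraphs and sentences, which this hypothetical `chunk_text` function does not do.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size windows; each window repeats the
    last chunk_overlap characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

# Synthetic 2500-character string standing in for one judgment
sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.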

## 4. Creating embeddings

Let's create the embedding model that will embed both the document chunks and the queries:

In [21]:
from langchain_openai.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()


## 5. Using Pinecone as Vector Database

In [23]:
from langchain_pinecone import PineconeVectorStore

index_name = "tmf17"

vectorstore = PineconeVectorStore.from_documents(
    documents, embeddings, index_name=index_name
)

Ingestion completes here.

### Retrieval & Generation

In [24]:
query = ["How did the court interpret the terms annual charge & capital charge under section 9 (1) (iv) of the Indian Income Tax Act in the case of Appeal No. LXVI of 1949? "]

results = []
for q in query:
    result = vectorstore.similarity_search(q)
    results.append(result)
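
Conceptually, `similarity_search` embeds the query and returns the stored chunks whose vectors are closest to it. The toy sketch below assumes cosine similarity and uses made-up three-dimensional vectors; the real metric is whatever the Pinecone index was created with, and real embeddings have far more dimensions. The names `cosine_similarity`, `top_k`, and `toy_index` are illustrative only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=4):
    """Return (doc_id, score) pairs for the k most similar documents, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Made-up vectors standing in for embedded chunks
toy_index = {
    "income_tax": [0.9, 0.1, 0.0],
    "sales_tax": [0.6, 0.4, 0.1],
    "criminal": [0.0, 0.2, 0.9],
}
hits = top_k([1.0, 0.0, 0.0], toy_index, k=2)
print(hits)
```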


In [25]:
# Retrieve context for each query and join the chunks into one string,
# so the prompt receives plain text instead of a Python list repr
all_contexts = []
for q in query:
    result = vectorstore.similarity_search(q)
    context_text = "\n\n".join(doc.page_content for doc in result)
    all_contexts.append(context_text)

In [27]:
from langchain.prompts import ChatPromptTemplate


# Combine all contexts into a single string
#final_context = "\n\n".join(all_contexts)
#print(final_context)

# Create prompt template
PROMPT_TEMPLATE = """
Answer the question based only on the following context:
{context}
Answer the question based on the above context: {question}.
Provide a detailed answer.
Don’t justify your answers.
Don’t give information not mentioned in the CONTEXT INFORMATION.
Do not say "according to the context" or "mentioned in the context" or similar.
"""

prompt_template = ChatPromptTemplate.from_template(PROMPT_TEMPLATE)
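
The template has two slots, `{context}` and `{question}`. As a rough illustration of how they get filled, the sketch below uses plain `str.format` with a shortened template and made-up chunks; `build_prompt` is a hypothetical helper, and `ChatPromptTemplate` wraps the result in a chat message rather than returning a bare string like this.

```python
TEMPLATE = (
    "Answer the question based only on the following context:\n"
    "{context}\n"
    "Answer the question based on the above context: {question}\n"
)

def build_prompt(chunks, question):
    """Join retrieved chunks with blank lines and fill both template slots."""
    context = "\n\n".join(chunks)
    return TEMPLATE.format(context=context, question=question)

prompt = build_prompt(["Chunk one.", "Chunk two."], "What was decided?")
print(prompt)
```

Joining the chunks first keeps list brackets and quote characters out of the text the model sees.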

In [28]:
model = ChatOpenAI()  # created once; reads OPENAI_API_KEY from the environment

responses = []
for i, q in enumerate(query):
    prompt = prompt_template.format(context=all_contexts[i], question=q)
    # invoke() replaces the deprecated predict(); the reply text is in .content
    response_text = model.invoke(prompt).content
    responses.append(response_text)


for response in responses:
    print(response)



The court interpreted the terms annual charge and capital charge under section 9(1)(iv) of the Indian Income Tax Act in the case of Appeal No. LXVI of 1949 by considering the specific facts and circumstances presented in the case. The court examined the applicability of these terms to the particular situation of the assessee, which was a Hindu undivided family headed by Sri Trivikram Narain Singh, a descendant of Sri Babu Ausan Singh, the original founder of Ausanganj State in the district of Benaras. The court also took into account the historical background related to the Treaty between the East India Company and Nawab Asfuddaula in 1775, which ceded the province of Benaras to the British Government. Based on these facts, the court likely analyzed whether the sum of Rs. 36,396 received by the assessee as an allowance during the previous year of the assessment year 1949-50 constituted revenue income liable to tax under the Indian Income Tax Act of 1922.


In [33]:
df

Unnamed: 0,Text
0,Appeal No. 101 of 1959.\nAppeal by special lea...
1,Appeal No. 52 of 1957.\nAppeal from the judgme...
2,Appeals Nos. 45 and 46 of 1959.\nAppeal by spe...
3,ION: Criminal Appeal 89 of 1961.\nAppeal by sp...
4,Civil Appeal No. 50 of 1961.\nAppeal by specia...
...,...
95,Appeal No. 1367 of 1980.\nFrom the Judgment an...
96,Appeal No. 1695 of 1993.\nFrom the Judgment an...
97,iminal Appeal No. 46 of 1957.\nAppeal by speci...
98,N: Criminal Appeal No. 8 of 1951.\nAppeal from...


In [37]:
import pandas as pd
df = pd.read_csv('qa.csv')

# Initialize lists for queries and responses
queries = df['question']
responses = []

In [39]:
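# Loop through each question, call the model, and store the reply.
# Note: this calls the bare LLM without any retrieved context; replace
# model.invoke with your retrieval chain's invoke to evaluate the RAG pipeline.
for query in queries:
    response = model.invoke(query)  # returns a message object; the text is in response.content
    responses.append(response)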
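
Because each reply is a message object, storing it directly puts the whole representation (`content=... additional_kwargs=...`) into the `answer` column. A minimal sketch of keeping only the text follows; `FakeMessage` is a hypothetical stand-in for the model's reply type, and `to_text` is an illustrative helper.

```python
from dataclasses import dataclass, field

@dataclass
class FakeMessage:
    """Hypothetical stand-in for a chat model reply: text in .content, extras in metadata."""
    content: str
    response_metadata: dict = field(default_factory=dict)

def to_text(reply):
    """Return the reply's text if it has a .content attribute, otherwise str(reply)."""
    return getattr(reply, "content", str(reply))

replies = [
    FakeMessage("The appeal was dismissed.", {"finish_reason": "stop"}),
    "plain string reply",
]
answers = [to_text(r) for r in replies]
print(answers)
```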

In [40]:
#df = df.drop(columns=['id'])
df['answer'] = responses
#df["contexts"]
print("Existing columns:", df.columns)



Existing columns: Index(['question', 'ground_truth', 'answer'], dtype='object')


In [42]:
pd.set_option('display.max_colwidth', None)

In [44]:
df

Unnamed: 0,question,ground_truth,answer
0,What was the main legal issue in the appeal case of Nand Singh and his sons concerning the allotment of land in the village Raikot?,The main legal issue in the appeal case was whether the Deputy Custodian General had the jurisdiction to revise the order canceling the allotment of land in Raikot made in favor of the appellants after the enactment of the 1954 Act and the notification issued thereunder.,"content=""The main legal issue in the appeal case of Nand Singh and his sons concerning the allotment of land in the village Raikot was whether the lower court had erred in finding that the plaintiffs were entitled to the land in question based on their alleged possession and inheritance rights, despite the defendant's claim that they had acquired the land through a valid purchase agreement."" additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 73, 'prompt_tokens': 37, 'total_tokens': 110}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-723bd1ad-4f0e-4854-a03c-e5aa043f9ce0-0' usage_metadata={'input_tokens': 37, 'output_tokens': 73, 'total_tokens': 110}"
1,"On what grounds did the Deputy Custodian General dismiss the appellants' application in the judgment dated November 8, 1957?","The Deputy Custodian General dismissed the appellants' application on the grounds that his jurisdiction to revise the order had been taken away by the provisions of the 1954 Act and the notification issued on March 24, 1955. The decision relied on the precedent set in the case of Bal Mukund vs. The State of Punjab.","content=""The Deputy Custodian General dismissed the appellants' application in the judgment dated November 8, 1957 on the grounds that they had failed to provide sufficient evidence to support their claim for relief. The appellants were unable to demonstrate that they were entitled to the relief sought under the relevant laws and regulations governing their case. Additionally, the Deputy Custodian General found that the appellants had not met the burden of proof required to show that they had a valid legal basis for their application."" additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 98, 'prompt_tokens': 34, 'total_tokens': 132}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-2515a3d1-f519-4f58-a192-6e2104e9010f-0' usage_metadata={'input_tokens': 34, 'output_tokens': 98, 'total_tokens': 132}"
2,What was the main legal issue concerning the timing of when property in the goods passed in the context of the sales on FOB contracts in this case?,"The main legal issue was whether the property in the goods passed to the buyers at the time of shipment (i.e., after crossing the customs frontier) or at some point before shipment. The sellers argued that the property passed on shipment, thus exempting them from sales tax under article 286(1)(b) of the Constitution. The court had to determine if the sales occurred ""in the course of export"" and if the goods were indeed exempt from sales tax based on the timing of the property transfer.","content='The main legal issue concerning the timing of when property in the goods passed in the context of the sales on FOB contracts in this case was whether the property in the goods passed to the buyer at the time the goods were loaded onto the vessel or at the time the vessel sailed. This determination is important because it affects which party bears the risk of loss or damage to the goods during transit.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 78, 'prompt_tokens': 37, 'total_tokens': 115}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-95c84ac8-65d1-42ef-b5be-ef200d46781c-0' usage_metadata={'input_tokens': 37, 'output_tokens': 78, 'total_tokens': 115}"
3,How did the court interpret the provision of section 10(b) of the Bombay Sales Tax Act regarding the levy of purchase tax?,"he court interpreted section 10(b) of the Bombay Sales Tax Act to mean that the term ""a person"" refers specifically to a registered dealer. Therefore, the purchasing dealer was liable for purchase tax if the goods were not dispatched outside the State of Bombay by a registered dealer, despite having furnished a certificate under section 8(b) of the Act. The court concluded that the legislative intent was to ensure that the actual despatch outside the state was carried out by a registered dealer to comply with the provisions and that failing to do so justified the levy of purchase tax.","content=""The court interpreted the provision of section 10(b) of the Bombay Sales Tax Act regarding the levy of purchase tax by examining the language and intent of the statute. In this case, the court may have considered the specific wording of the provision and any relevant case law or precedents to determine the scope and application of the purchase tax. Additionally, the court may have considered the legislative history and purpose of the provision to understand the underlying rationale for imposing the tax on certain transactions. Ultimately, the court's interpretation would be based on a thorough analysis of the statutory language, legislative intent, and relevant legal principles."" additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 121, 'prompt_tokens': 33, 'total_tokens': 154}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-09f4a8b8-0938-4457-a340-2a84699dae1c-0' usage_metadata={'input_tokens': 33, 'output_tokens': 121, 'total_tokens': 154}"
4,What was the basis for the conviction of Ram Singh in the case discussed?,"Ram Singh was convicted of the murder of Sheo Sahai based on several pieces of evidence. The prosecution presented evidence of motive, including a prior dispute between the appellant and the victim, and established that Ram Singh purchased a sword, which was later found stained with human blood. Additionally, Ram Singh made an extra-judicial confession to Ujagar Singh, admitting to the crime. Despite the High Court initially rejecting this confession, the Supreme Court found that the evidence, including the sword's condition and Ram Singh's actions, sufficiently supported his conviction for murder under Section 302, I.P.C.","content=""Ram Singh was convicted based on the evidence presented by the prosecution, which included witness testimonies, forensic evidence, and his own confession. The prosecution argued that Singh was the main perpetrator in the gang rape and murder of Jyoti Singh, also known as Nirbhaya, on a bus in Delhi in 2012. Singh's confession, along with DNA evidence linking him to the crime scene, played a crucial role in his conviction. Additionally, witnesses testified against him, identifying him as one of the perpetrators. These pieces of evidence collectively led to his conviction in the case."" additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 116, 'prompt_tokens': 22, 'total_tokens': 138}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-bb4b5065-8b02-4a20-9d66-4198020066bc-0' usage_metadata={'input_tokens': 22, 'output_tokens': 116, 'total_tokens': 138}"
5,Why did the Supreme Court find it unnecessary to decide on the admissibility of the portions of the report dictated by Ram Singh at the Police Station?,"The Supreme Court found it unnecessary to decide on the admissibility of the portions of the report dictated by Ram Singh at the Police Station because there was substantial independent evidence to support the conviction. This included testimony about the motive, the purchase of the sword, and the sword being stained with human blood. The Court concluded that the independent evidence was sufficient to establish Ram Singh's guilt, making it unnecessary to determine whether the dictated report was admissible as evidence.","content='The Supreme Court found it unnecessary to decide on the admissibility of the portions of the report dictated by Ram Singh at the Police Station because they ultimately determined that the prosecution had failed to prove the guilt of the accused beyond a reasonable doubt. In other words, even if those portions of the report were considered admissible, it would not have changed the outcome of the case. Therefore, the court did not see the need to address that specific issue in their decision.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 93, 'prompt_tokens': 37, 'total_tokens': 130}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-37d840c5-9de0-43f2-b08d-af612f420f5c-0' usage_metadata={'input_tokens': 37, 'output_tokens': 93, 'total_tokens': 130}"
6,"What was the primary contention of the appellant regarding the termination of Sankaran's employment, and how did the Court address this contention?","The primary contention of the appellant was that under Rule 18(a) of the Standing Orders, they were entitled to terminate the services of any employee by providing 14 days' notice or paying 12 days' wages without needing to justify the termination. The appellant argued that this rule allowed them to dispense with Sankaran's services without the need for a departmental inquiry. The Court, however, rejected this contention, stating that the right to terminate employment under Rule 18(a) was not absolute and could not be used in a manner that negates the security of service provided to industrial employees through industrial adjudication. The Court held that even if termination was executed under Rule 18(a), it was subject to scrutiny by the industrial tribunal to ensure that the action was bona fide and not a colourable exercise of powe","content=""The primary contention of the appellant regarding the termination of Sankaran's employment was that it was unjust and arbitrary. The appellant argued that Sankaran's termination was based on false allegations and that he was not given a fair opportunity to defend himself.\n\nThe Court addressed this contention by carefully examining the evidence presented and evaluating the fairness of the termination process. The Court considered the reasons provided by the employer for Sankaran's termination and whether they were supported by evidence. The Court also assessed whether Sankaran was given a chance to refute the allegations and defend himself before being terminated.\n\nUltimately, the Court determined that the termination was not unjust or arbitrary. The employer had provided sufficient evidence to support their decision, and Sankaran had been given an opportunity to present his side of the story. The Court found that the termination was justified based on the evidence and upheld the decision."" additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 175, 'prompt_tokens': 34, 'total_tokens': 209}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-3fe8abab-d116-4224-a138-60cefdef0a08-0' usage_metadata={'input_tokens': 34, 'output_tokens': 175, 'total_tokens': 209}"
7,"What were the reasons provided by the tribunal for setting aside the appellant's termination order, and how did the Supreme Court view these reasons?","The tribunal set aside the termination order on the grounds that the appellant's action was a colourable exercise of power under Rule 18(a). The tribunal found that while the appellant initially intended to take disciplinary action against Sankaran for alleged misconduct, it later abandoned this course and used Rule 18(a) to terminate Sankaran's services without conducting a departmental inquiry. The tribunal also noted that important witnesses who could have corroborated the alleged misconduct were not presented. The Supreme Court upheld the tribunal’s decision, agreeing that the exercise of power under Rule 18(a) was not bona fide and was intended to avoid the procedural requirements of a disciplinary inquiry. The Court emphasized that the industrial tribunal had the jurisdiction to scrutinize the action taken under Rule 18(a) and ensure it was not a result of unfair labour practices or victimization.","content=""The tribunal provided several reasons for setting aside the appellant's termination order. These reasons included procedural irregularities in the termination process, lack of evidence to support the termination, and failure to follow proper disciplinary procedures.\n\nThe Supreme Court viewed these reasons as valid and upheld the tribunal's decision to set aside the termination order. The court agreed that the appellant was not given a fair hearing and that the termination was not supported by sufficient evidence. The court also emphasized the importance of following proper disciplinary procedures in such cases to ensure fairness and due process for the employee."" additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 109, 'prompt_tokens': 34, 'total_tokens': 143}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-37456cf4-f280-4fce-99ad-0cb77431aeef-0' usage_metadata={'input_tokens': 34, 'output_tokens': 109, 'total_tokens': 143}"
8,What were the primary allegations against the appellant in the appeals before the Supreme Court?,"The primary allegations against the appellant were that he either stole or secreted five registered letters and fabricated three receipts to show that these letters were received by the addressees. The appellant was charged under section 52 of the Indian Post Office Act, 1898, and in two cases, also under sections 467 and 471 of the Indian Penal Code.","content='The primary allegations against the appellant in the appeals before the Supreme Court were:\n\n1. Violation of constitutional rights: The appellant was accused of violating the constitutional rights of individuals, such as the right to freedom of speech or the right to privacy.\n\n2. Criminal misconduct: The appellant was accused of engaging in criminal activities, such as fraud, corruption, or other illegal behavior.\n\n3. Breach of contract: The appellant was accused of failing to fulfill their obligations under a contract or agreement.\n\n4. Professional misconduct: The appellant was accused of engaging in unethical or unprofessional behavior in their capacity as a professional, such as a lawyer, doctor, or accountant.\n\n5. Violation of laws or regulations: The appellant was accused of violating specific laws or regulations, such as environmental regulations or labor laws.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 160, 'prompt_tokens': 23, 'total_tokens': 183}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-8d319e87-5cf5-4a09-b3cf-66b7624bac0b-0' usage_metadata={'input_tokens': 23, 'output_tokens': 160, 'total_tokens': 183}"
9,On what grounds did the Supreme Court decide to set aside the conviction and sentences against the appellant?,"The Supreme Court set aside the conviction and sentences on the grounds that the prosecution failed to prove that the five registered letters were in the exclusive possession of the appellant. The Court found that the almirah where the letters were found was not exclusively in the appellant's possession, as the key was produced by his father, and there was no evidence to establish that the appellant had exclusive control over the almirah or the letters. Therefore, the Court concluded that the prosecution had not sufficiently proven the appellant's possession or that he had secreted the letters.","content=""The Supreme Court decided to set aside the conviction and sentences against the appellant on the grounds that there were errors in the lower court's trial proceedings that had a material impact on the outcome of the case. This could include errors in the admission of evidence, jury instructions, or other procedural issues that affected the fairness of the trial. The Supreme Court may have also found that there was insufficient evidence to support the conviction or that the lower court applied the law incorrectly."" additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 91, 'prompt_tokens': 26, 'total_tokens': 117}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-f1ecb68f-4f59-4f46-884b-75acfdd2cafa-0' usage_metadata={'input_tokens': 26, 'output_tokens': 91, 'total_tokens': 117}"


### Evaluation
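
The notebook stops before any evaluation code. As a crude stand-in, and purely as an assumption about what a first check could look like, the sketch below scores an `answer` against its `ground_truth` with token-overlap F1. The helpers `tokenize` and `token_f1` are hypothetical; this is not the author's evaluation method and it does not measure legal correctness.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def token_f1(answer, ground_truth):
    """Harmonic mean of token precision and recall between answer and ground truth."""
    answer_tokens = tokenize(answer)
    truth_tokens = tokenize(ground_truth)
    if not answer_tokens or not truth_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it appears in both
    overlap = sum((Counter(answer_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(answer_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)

score = token_f1("The appeal was allowed", "The appeal was dismissed")
print(score)
```

Applied row by row to the `answer` and `ground_truth` columns, a score like this gives a rough first signal of which answers drift furthest from the reference.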
