<a href="https://colab.research.google.com/github/dishambha/NyayaGPT/blob/main/Agentic_RAG_Application_for_LAW.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
!pip install langchain langchain-openai langchain-community pypdf faiss-cpu sentence-transformers

Collecting langchain-openai
  Downloading langchain_openai-0.3.28-py3-none-any.whl.metadata (2.3 kB)
Downloading langchain_openai-0.3.28-py3-none-any.whl (70 kB)
Installing collected packages: langchain-openai
Successfully installed langchain-openai-0.3.28


In [3]:
!pip install langchain-groq




In [4]:
!pip install sentence-transformers



# Part 1: Importing Necessary Libraries


In [5]:
!pip install langchain langchain-community langchain-groq pypdf faiss-cpu sentence-transformers



In [6]:
# ==============================================================================
# Part 1: Complete and Correct Imports
# ==============================================================================

import os
from google.colab import userdata

# For loading and preparing your documents
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# For the vector database and the free embedding model
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings

# For the Groq LLM which will be the agent's "brain"
from langchain_groq import ChatGroq

# For building the agent and its tools
from langchain.tools.retriever import create_retriever_tool
from langchain import hub
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder # For the custom prompt
from langchain.agents import create_react_agent, AgentExecutor

# For managing the conversation's memory
from langchain_core.messages import HumanMessage, AIMessage

# Part 2 (Updated for Groq): API Key and Configuration

In [7]:

# ----------------------------------------------------

# Retrieve the API key stored as 'GROQ_API_KEY' in Colab's Secrets Manager (🔑).
# Reading it from secrets avoids hard-coding the key in the notebook.
# The secret must have 'Notebook access' enabled, or the lookup will fail.
from google.colab import userdata
import os

os.environ["GROQ_API_KEY"] = userdata.get('GROQ_API_KEY')
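The lookup above only works inside Colab and fails when the secret is missing or notebook access is not granted. A minimal sketch of a fallback chain (Colab secret, then environment variable, then an interactive prompt) follows. The helper name `resolve_api_key` and its parameters are hypothetical and not part of the original notebook.

```python
import os
from getpass import getpass


def resolve_api_key(name, secret_getter=None, env=None, prompt_fn=getpass):
    """Return the key from a secret store, then the environment, then a prompt.

    `secret_getter` is any callable taking the secret name, for example
    `userdata.get` in Colab (hypothetical wiring, shown in the usage note).
    """
    env = os.environ if env is None else env
    if secret_getter is not None:
        try:
            value = secret_getter(name)
            if value:
                return value
        except Exception:
            # Secret missing or access not granted: fall through to the next source.
            pass
    if env.get(name):
        return env[name]
    return prompt_fn(f"Enter {name}: ")
```

In Colab this could be wired up as `os.environ["GROQ_API_KEY"] = resolve_api_key("GROQ_API_KEY", userdata.get)`. Outside Colab, omit the second argument.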

# Part 3: The Reusable "Factory" Function for Tool Creation

In [8]:
# HuggingFaceEmbeddings is already imported in Part 1; it is repeated here so this cell runs on its own.
from langchain_community.embeddings import HuggingFaceEmbeddings

def create_pdf_retriever_tool(pdf_path, tool_name, tool_description):
    """Build a retriever tool over one PDF using a free Hugging Face embedding model."""

    # 1. Loading the document
    loader = PyPDFLoader(pdf_path)
    documents = loader.load()

    # 2. Chunking the document
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
    docs = text_splitter.split_documents(documents)

    # 3. Creating Embeddings
    # Uses a free, locally run Hugging Face model instead of OpenAI embeddings.
    # The first time this runs, it will download the model (roughly 90MB).
    embeddings_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

    # Embed the chunks and index them in an in-memory FAISS vector store
    vector_store = FAISS.from_documents(docs, embeddings_model)

    # 4. Creating the Retriever
    retriever = vector_store.as_retriever(search_kwargs={"k": 3})

    # 5. Wrapping the retriever in a Tool
    retriever_tool = create_retriever_tool(retriever, name=tool_name, description=tool_description)

    print(f"Tool '{tool_name}' for '{pdf_path}' created successfully with free embeddings.")

    return retriever_tool
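To make the `chunk_size=1000, chunk_overlap=150` settings concrete, here is a simplified fixed-window illustration of overlapping chunks. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter prefers to break on separators such as paragraphs and newlines). The function name `fixed_window_chunks` is hypothetical.

```python
def fixed_window_chunks(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "x" * 2500
pieces = fixed_window_chunks(sample, chunk_size=1000, chunk_overlap=150)
print(len(pieces))  # windows start at 0, 850, and 1700
```

The overlap means the last 150 characters of one chunk reappear at the start of the next, which reduces the chance that a provision is cut in half at a chunk boundary.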

# Part 4: Building Your Specific Tools

In [9]:

# ------------------------------------

# Create the first tool for the BNS document.
# We provide the file path, a simple name for the tool, and a detailed description of its purpose.
bns_tool = create_pdf_retriever_tool(
    pdf_path="BNS.pdf", # Make sure this path matches your uploaded file name in Colab
    tool_name="bns_law_search",
    tool_description="Use this tool to search for the definitions of crimes, punishments, and substantive criminal law as defined in the Bharatiya Nyaya Sanhita (BNS)."
)

# Create the second tool for the BNSS document.
# Notice the description is different, telling the agent this tool is for procedural law.
bnss_tool = create_pdf_retriever_tool(
    pdf_path="BNSS.pdf", # Make sure this path matches your uploaded file name in Colab
    tool_name="bnss_procedure_search",
    tool_description="Use this tool to search for legal procedures, rules for investigation, arrest, bail, and trial processes as defined in the Bharatiya Nagarik Suraksha Sanhita (BNSS)."
)

# This line creates the 'tools' variable that your next cell needs
tools = [bns_tool, bnss_tool]

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Tool 'bns_law_search' for 'BNS.pdf' created successfully with free embeddings.
Tool 'bnss_procedure_search' for 'BNSS.pdf' created successfully with free embeddings.
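Both tools read their PDFs from the working directory, so a missing upload only shows up as an error inside the loader. A small pre-check can make that failure easier to read. This is a sketch; the helper name `missing_files` is hypothetical, and the file names are the ones used in the cell above.

```python
from pathlib import Path


def missing_files(paths):
    """Return the subset of paths that do not point to an existing regular file."""
    return [p for p in paths if not Path(p).is_file()]


# Run this before building the tools (file names taken from Part 4).
absent = missing_files(["BNS.pdf", "BNSS.pdf"])
if absent:
    print(f"Upload these files before creating the tools: {absent}")
```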


# Part 5 (Updated for Groq): Creating the Agent and Executor

In [18]:
# ==============================================================================
# Part 5 (Updated with Formatting Rules)
# ==============================================================================

# Uses the 'tools' list built in Part 4 and the imports from Part 1.

# --- 1. Pull the standard ReAct prompt from the hub ---
prompt = hub.pull("hwchase17/react-chat")

# --- 2. Add our new, detailed rules for formatting and citation ---
new_rules = """

**VERY IMPORTANT RULES FOR YOUR FINAL ANSWER**:
1.  Structure your response clearly with markdown. Use bold headings for each part of the user's question (e.g., **Punishment for Assault (BNS)** and **Reporting Procedure (BNSS)**).
2.  Under each heading, first provide a concise, one-sentence summary of the answer.
3.  After the summary, add a bullet point labeled "**Source Quote:**" and provide the single, most relevant quote from the source text that supports your summary. Keep the quote as short as possible.
"""
# We access the .template attribute and append our new, stricter rules.
prompt.template += new_rules

# --- 3. Create the Agent ---
# Using a powerful model is key for following complex formatting rules.
llm = ChatGroq(model_name="llama3-70b-8192", temperature=0)

agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Part 6 runs the agent through 'agent_executor'.
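Appending to `prompt.template` places the rules at the very end of the prompt text. If the pulled template ends with its `{input}` and `{agent_scratchpad}` placeholders (an assumption; print `prompt.template` to check), the rules then follow the scratchpad instead of sitting with the other instructions. Below is a minimal sketch of inserting text before a marker line instead. The helper `insert_before_marker`, the stand-in template, and the `"Begin!"` marker are all hypothetical.

```python
def insert_before_marker(template, extra, marker):
    """Insert `extra` before the first occurrence of `marker`; append it if the marker is absent."""
    index = template.find(marker)
    if index == -1:
        return template + extra
    return template[:index] + extra.strip("\n") + "\n\n" + template[index:]


# Stand-in template for illustration only, not the text pulled from the hub.
stand_in = "Instructions here.\n\nBegin!\n\n{chat_history}\n{input}\n{agent_scratchpad}"
print(insert_before_marker(stand_in, "\nRULES\n", "Begin!"))
```

With the notebook's objects this would read `prompt.template = insert_before_marker(prompt.template, new_rules, "Begin!")`, provided the marker actually appears in the pulled template.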



# Part 6: Running the Agent to Get a Final Output
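The follow-up cell in this part relies on `chat_history`, `query`, and `response` from a first turn that is not shown in the notebook. Here is a minimal sketch of a helper that runs one turn and records it in the history. The name `run_turn` is hypothetical, and the message classes are passed in as arguments so the helper stays independent of any particular import.

```python
def run_turn(executor, chat_history, query, human_cls, ai_cls):
    """Invoke the executor once, then append the question and answer to chat_history."""
    response = executor.invoke({"input": query, "chat_history": chat_history})
    chat_history.append(human_cls(content=query))
    chat_history.append(ai_cls(content=response["output"]))
    return response
```

With the notebook's objects, a first turn could look like `chat_history = []` followed by `response = run_turn(agent_executor, chat_history, query, HumanMessage, AIMessage)`, where `query` holds the opening question. Each later turn reuses the same list, so the manual `append` calls are no longer needed.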

In [16]:
# Continues from a first turn: 'chat_history' (list), 'query' (str), and 'response' (dict) must already exist.
# 1. Update the history with the last interaction
chat_history.append(HumanMessage(content=query))
chat_history.append(AIMessage(content=response["output"]))

# 2. Ask a follow-up question
follow_up_query = "Tell me more about the 'commitment of cases to the Court of Session' that you mentioned."

# 3. Invoke the agent again with the UPDATED history
response_2 = agent_executor.invoke({
    "input": follow_up_query,
    "chat_history": chat_history
})

print(response_2["output"])



> Entering new AgentExecutor chain...
Thought: Do I need to use a tool? Yes
Action: bnss_procedure_search
Action Input: commitment of cases to the Court of Session
is of opinion that it will tend to the general convenience of the parties and witnesses to hold
its sittings at any other place in the sessions division, it may, with the consent of the
prosecution and the accused, sit at that place for the disposal of the case or the examination
of any witness or witnesses therein.
(7) The Sessions Judge may, from time to time, make orders consistent with this
Sanhita, as to the distribution of business among such Additional Sessions Judges.
(8) The Sessions Judge may also make provision for the disposal of any urgent
application, in the event of his absence or inability to act, by an Additional Sessions Judge
or if there be no Additional Sessions Judge, by the Chief Judicial Magistrate, and such
Judge or Magistrate shall be deemed to have jurisdiction

Thought: Do I need to use a tool? No
Final Answer: According to the BNSS, the commitment of cases to the Court of Session involves the notification of the Public Prosecutor, as stated in section 233: "(d) notify the Public Prosecutor of the commitment of the case to the Court of Session:" (BNSS, Section 233). Additionally, the proceedings under this section shall be completed within a period of ninety days from the date of taking cognizance, and such period may be extended by the Magistrate for a period not exceeding one hundred and eighty days for the reasons to be recorded in writing (BNSS, Section 233).

> Finished chain.
According to the BNSS, the commitment of cases to the Court of Session involves the notification of the Public Prosecutor, as stated in section 233: "(d) notify the Public Prosecutor of the commitment of the case to the Court of Session:" (BNSS, Section 233). Additionally, the proceedings under this section shall be completed within a perio