# Legal Assistant Bot

This chatbot answers questions about United States (US) federal laws by retrieving context from two sources: a proprietary datasource and the web. The proprietary datasource is a CSV file of all US federal laws and their revision history. Web data needed to answer the user's questions is retrieved with the You.com API. The chatbot is implemented as a tool-calling agent with LangChain and LangGraph.

## Install all required packages

In [1]:
%%capture
! pip install langgraph==0.0.59
! pip install pandas==2.2.2
! pip install openai==1.30.3
! pip install langchain==0.2.1
! pip install langchain_community==0.2.1
! pip install langchain_openai==0.1.7
! pip install langchain_text_splitters==0.2.0
! pip install langchain_core==0.2.1
! pip install numpy==1.26.4
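
Because every package above is pinned to an exact version, it can help to confirm that the running kernel actually picked up those versions (for example after a kernel restart). The following is a minimal sketch using only the standard library; the helper name `check_pins` is hypothetical, and only a few of the pins from the install cell are repeated here for brevity.

```python
from importlib import metadata

# A subset of the pins from the install cell above (extend as needed).
PINNED = {
    "langgraph": "0.0.59",
    "pandas": "2.2.2",
    "openai": "1.30.3",
    "langchain": "0.2.1",
}


def check_pins(pins):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


# An empty dict means every listed package matches its pin.
print(check_pins(PINNED))
```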

## Load the US laws dataset and build a vector database from it, which is then converted into a LangChain retriever and tool

In [3]:
# Let's take a look at our CSV dataset first
import pandas as pd

# The CSV file can be downloaded from: https://www.nature.com/articles/s41597-023-02758-z#Sec3
df = pd.read_csv("us_laws_dataset.csv")

df.head()

Unnamed: 0,row_number,action,Title,sal_volume,sal_page_start,BillCitation,congress_number,chapter,session_number,pl_no,date_of_passage,secondary_date,dates_conflict,Source,URL,alternate_sal_volume,alternate_sal_page_start,has_alternate_sal_citation
0,1,An Act,To regulate the time and manner of administeri...,1,23.0,,1,1.0,1.0,,1789-06-01,,,HeinOnline,,,,False
1,2,An Act,"For laying a duty on goods, wares, and merchan...",1,24.0,,1,2.0,1.0,,1789-07-04,,,HeinOnline,,,,False
2,3,An Act,Imposing duties on tonnage.,1,27.0,,1,3.0,1.0,,1789-07-20,,,HeinOnline,,,,False
3,4,An Act,For establishing an executive department to be...,1,28.0,,1,4.0,1.0,,1789-07-27,,,HeinOnline,,,,False
4,5,An Act,To regulate the collection of the duties impos...,1,29.0,,1,5.0,1.0,,1789-07-31,,,HeinOnline,,,,False


In [4]:
import openai
import langchain
import os

In [5]:
os.environ["YDC_API_KEY"] = "<Insert your YDC API key here>"
os.environ["OPENAI_API_KEY"] = "<Insert your Open AI API key here>"
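
Hard-coding keys in a notebook makes them easy to leak when the notebook is shared. As an alternative, the sketch below prompts only for keys that are missing or still hold the placeholder text. The helper names `missing_keys` and `prompt_for_keys` are hypothetical, and the placeholder check assumes the `<Insert ...>` strings used in the cell above.

```python
import os
from getpass import getpass

REQUIRED_KEYS = ("YDC_API_KEY", "OPENAI_API_KEY")


def missing_keys(env, names=REQUIRED_KEYS):
    """Names that are unset, empty, or still hold an '<Insert ...>' placeholder."""
    return [n for n in names if not env.get(n) or env[n].startswith("<Insert")]


def prompt_for_keys(env=os.environ):
    """Interactively fill in any missing key without echoing it to the screen."""
    for name in missing_keys(env):
        env[name] = getpass(f"{name}: ")
```

In the notebook, calling `prompt_for_keys()` once before creating the embeddings or the search tool would be enough.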

In [6]:
from langchain_community.document_loaders.csv_loader import CSVLoader

# The CSV file can be downloaded from: https://www.nature.com/articles/s41597-023-02758-z#Sec3
loader = CSVLoader(file_path = "us_laws_dataset.csv")

data = loader.load()
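
Judging by the search output shown later in this notebook, the loader turns each CSV row into one document whose text is a list of `column: value` lines, with the source file and zero-based row index as metadata. The sketch below mimics that format with the standard library only; it is an illustration of the shape of the data, not the loader's actual implementation, and `rows_to_documents` is a hypothetical name.

```python
import csv
import io


def rows_to_documents(csv_text, source):
    """Turn each CSV row into a dict with 'page_content' and 'metadata'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    documents = []
    for i, row in enumerate(reader):
        # One "column: value" line per field, in column order.
        content = "\n".join(f"{key}: {value}" for key, value in row.items())
        documents.append(
            {"page_content": content, "metadata": {"source": source, "row": i}}
        )
    return documents


sample = "row_number,action,Title\n3,An Act,Imposing duties on tonnage.\n"
docs = rows_to_documents(sample, "sample.csv")
print(docs[0]["page_content"])
```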

In [7]:
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# split the document into chunks, and vectorize these chunks in a FAISS database
text_splitter = RecursiveCharacterTextSplitter(chunk_size = 1000, chunk_overlap = 100)
docs = text_splitter.split_documents(data)
embeddings = OpenAIEmbeddings()
db = FAISS.from_documents(documents=docs, embedding=embeddings)
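
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a simplified sliding-window chunker. It is deliberately not the recursive, separator-aware algorithm that `RecursiveCharacterTextSplitter` uses; it only shows how a fixed window with overlap covers a string. Rows like the one shown in the search output below are well under 1,000 characters, so they would likely pass through as a single chunk.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=100):
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# Small numbers make the overlap easy to see: each chunk repeats
# the last character of the previous one.
print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
```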

In [8]:
# test out the similarity search
query = "What laws and amendments relate to perjury?"
response = db.similarity_search(query, k=10)
# let's look at the first 3 retrieved docs
response[:3]

[Document(page_content='row_number: 1885\naction: An Act\nTitle: In addition to the act, entitled "An act for the prompt settlement of public accounts," and for the punishment of the crime of perjury\nsal_volume: 3\nsal_page_start: 770\nBillCitation: NA\ncongress_number: 17\nchapter: 37\nsession_number: 2\npl_no: NA\ndate_of_passage: 1823-03-01\nsecondary_date: NA\ndates_conflict: NA\nSource: HeinOnline\nURL: NA\nalternate_sal_volume: NA\nalternate_sal_page_start: NA\nhas_alternate_sal_citation: FALSE', metadata={'source': 'us_laws_dataset.csv', 'row': 1884}),
 Document(page_content='row_number: 38724\naction: An Act\nTitle: An act to permit the use of unsworn declarations under penalty of perjury as evidence in Federal proceedings\nsal_volume: 90\nsal_page_start: 2534\nBillCitation: H.R. 15531\ncongress_number: 94\nchapter: NA\nsession_number: 2\npl_no: 94-550\ndate_of_passage: 1976-10-18\nsecondary_date: NA\ndates_conflict: FALSE\nSource: NA\nURL: https://www.govinfo.gov/content/pkg/
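
The ranking behind `similarity_search` can be pictured as a nearest-neighbour lookup: the query is embedded, and the stored chunks whose vectors lie closest to it are returned. The toy sketch below is not FAISS. The two-dimensional vectors are made up for illustration (real embeddings have far more dimensions), and the use of Euclidean distance is an assumption about the default metric.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, indexed, k=3):
    """Return the texts of the k entries closest to query_vec."""
    ranked = sorted(indexed, key=lambda item: l2_distance(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Hypothetical (text, vector) pairs standing in for embedded chunks.
toy_index = [
    ("punishment of the crime of perjury", [0.9, 0.1]),
    ("imposing duties on tonnage", [0.1, 0.9]),
    ("unsworn declarations under penalty of perjury", [0.8, 0.3]),
]

print(top_k([1.0, 0.0], toy_index, k=2))
```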

In [9]:
from langchain.tools.retriever import create_retriever_tool

# convert the vector store into a retriever, then wrap the retriever as a tool
db_retriever = db.as_retriever()
db_retriever_tool = create_retriever_tool(
    db_retriever,
    name = "law_dataset_retriever",
    description = "Retrieve relevant context from the US laws dataset."
)

## Instantiating the You.com Tool in LangChain

In [10]:
from langchain_community.tools.you import YouSearchTool
from langchain_community.utilities.you import YouSearchAPIWrapper

api_wrapper = YouSearchAPIWrapper(num_web_results = 10)
ydc_tool = YouSearchTool(api_wrapper=api_wrapper)

In [11]:
# test out the You.com search tool
response = ydc_tool.invoke("Tell me about a recent high-profile case related to antitrust in the USA?")
# let's look at the first 3 results
response[:3]

[Document(page_content='Meijer, Inc. v. Ferring B.V.; Ferring Pharmaceuticals, Inc.; and Aventis Pharmaceuticals [In Re: DDAVP Direct Purchaser Antitrust Litigation] U.S. v. Memphis Board of Realtors', metadata={'url': 'https://www.justice.gov/atr/antitrust-case-filings-alpha', 'thumbnail_url': None, 'title': 'Antitrust Division | Antitrust Case Filings | United States Department of Justice', 'description': 'An official website of the United States government · Official websites use .gov A .gov website belongs to an official government organization in the United States'}),
 Document(page_content="Dentsply International, Inc. v. Antitrust Division of the United States Department of Justice · U.S. v. Freddy Deoliveira · U.S. v. Wilhelm DerMinassian · U.S. v. Eric Descouraux · Leinani Deslandes, Stephanie Turner, et al. v. McDonald's USA, LLC, et al.", metadata={'url': 'https://www.justice.gov/atr/antitrust-case-filings-alpha', 'thumbnail_url': None, 'title': 'Antitrust Division | Antitru

## Instantiate our LLM

In [12]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0.5)

## Tying it all together

In [13]:
from langgraph.prebuilt import chat_agent_executor
from langgraph.checkpoint import MemorySaver

# Create a checkpointer to use memory
memory = MemorySaver()
# the vector store representation of the CSV dataset and the You.com Search tool will both be passed as tools to the agent
tools = [db_retriever_tool, ydc_tool]
agent_executor = chat_agent_executor.create_tool_calling_executor(llm, tools, checkpointer=memory)

## Let's try it out!

In [14]:
agent_executor.invoke(input={"messages": "What laws in the US pertain to perjury and is there a recent case in the US that relates to a violation of these laws?"}, config={"configurable": {"thread_id": "xyz_789"}})["messages"][-1].content

'### US Laws Pertaining to Perjury\n\n1. **Federal Law on Perjury**:\n   - **18 U.S.C. § 1621**: This statute makes it a crime to knowingly make false statements under oath in any federal proceeding. The elements of the crime include:\n     - Making a false statement under oath.\n     - The statement must be material to the proceeding.\n     - The statement must be made with willful intent to deceive.\n\n2. **Historical Acts**:\n   - **An Act to permit the use of unsworn declarations under penalty of perjury as evidence in Federal proceedings** (1976): This act allows for unsworn declarations to be used as evidence under the penalty of perjury in federal proceedings. [Link to the Act](https://www.govinfo.gov/content/pkg/STATUTE-90/pdf/STATUTE-90-Pg2534.pdf)\n   - **An Act for the punishment of the crime of perjury** (1823): This early legislation laid the foundation for the punishment of perjury in public accounts and other matters.\n\n3. **Amendments and Related Acts**:\n   - **An act

In [15]:
agent_executor.invoke(input={"messages": "What is the most famous US Supreme Court perjury case?"}, config={"configurable": {"thread_id": "xyz_789"}})["messages"][-1].content

"The most famous U.S. Supreme Court case related to perjury is **Bronston v. United States (1973)**.\n\n### Bronston v. United States (409 U.S. 352)\n\n**Summary**:\n- **Facts**: Samuel Bronston, a film producer, was questioned under oath during a bankruptcy proceeding about his company's foreign bank accounts. He provided answers that were technically truthful but misleading. For example, when asked if he had any Swiss bank accounts, he responded with information about his company's accounts, not his own.\n- **Issue**: Whether a literally truthful but misleading answer given under oath can constitute perjury.\n- **Decision**: The Supreme Court, in a unanimous decision written by Chief Justice Warren Burger, held that a witness cannot be convicted of perjury for giving a literally truthful answer that is misleading by negative implication. The Court emphasized that it is the responsibility of the questioner to ask precise and follow-up questions to clarify any ambiguities.\n- **Impact*

In [16]:
agent_executor.invoke(input={"messages": "Based on your knowledge of all laws in the US pertaining to perjury, were there any other laws that could have been applied in this case?"}, config={"configurable": {"thread_id": "xyz_789"}})["messages"][-1].content

'In the context of the **Bronston v. United States** case, the primary legal framework applied was the federal perjury statute under **18 U.S.C. § 1621**, which requires that a statement be false and made with willful intent to deceive. Given the specifics of the case, where Bronston\'s statements were misleading but technically truthful, it is challenging to apply other perjury-related laws directly. However, other related statutes and legal concepts might have been considered depending on the circumstances and the nature of the questioning.\n\n### Potentially Applicable Laws and Concepts\n\n1. **18 U.S.C. § 1623 - False Declarations Before Grand Jury or Court**:\n   - This statute is similar to § 1621 but is specifically focused on false declarations made in court or before a grand jury. It also requires the false statement to be material. However, like § 1621, it requires the statement to be false, not just misleading.\n\n2. **18 U.S.C. § 1001 - False Statements**:\n   - This law ma
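
The three calls above share the same `thread_id` (`xyz_789`), which is why the last question can refer to "this case" without naming it: the checkpointer stores the conversation per thread. The sketch below illustrates only that idea with a plain dictionary. `ThreadMemory` is a hypothetical stand-in and does not reflect how `MemorySaver` is implemented.

```python
from collections import defaultdict


class ThreadMemory:
    """Toy conversation store keyed by thread ID."""

    def __init__(self):
        self._threads = defaultdict(list)

    def append(self, thread_id, role, content):
        self._threads[thread_id].append((role, content))

    def history(self, thread_id):
        # Return a copy so callers cannot mutate the stored history.
        return list(self._threads.get(thread_id, []))


memory_demo = ThreadMemory()
memory_demo.append("xyz_789", "user", "What is the most famous perjury case?")
memory_demo.append("xyz_789", "assistant", "Bronston v. United States (1973).")

# The same thread sees the earlier turns; a different thread starts empty.
print(len(memory_demo.history("xyz_789")))
print(memory_demo.history("another_thread"))
```

Using a new `thread_id` in `config` would therefore start a fresh conversation with no knowledge of earlier questions.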

## Let's create a Python class that encapsulates the code above.  This will enable users to easily create chatbots for new use cases with custom CSV datasets

In [19]:
import secrets

class CSV_QA_Bot:
    def __init__(self, llm: ChatOpenAI, csv_files: list[str], num_web_results_to_fetch: int = 10):
        self._llm = llm
        
        docs = self._load_csv_files(csv_files)
        
        # split the docs into chunks, vectorize the chunks and load them into a vector store
        db = self._create_vector_store(docs)
        
        # create a retriever from the vector store
        self._faiss_retriever = db.as_retriever()
        
        # convert this retriever into a Langchain tool
        self._faiss_retriever_tool = create_retriever_tool(
            self._faiss_retriever,
            name = "law_dataset_retriever",
            description = "Retrieve relevant context from the US laws dataset."
        )
        
        # instantiate the YDC search tool in Langchain
        self._ydc_api_wrapper = YouSearchAPIWrapper(num_web_results=num_web_results_to_fetch)
        self._ydc_search_tool = YouSearchTool(api_wrapper=self._ydc_api_wrapper)
        
        
        # create a list of tools that will be supplied to the Langchain agent
        self._tools = [self._faiss_retriever_tool, self._ydc_search_tool]
        
        # Create a checkpointer to use memory
        self._memory = MemorySaver()
        
        # create the agent executor
        self._agent_executor = chat_agent_executor.create_tool_calling_executor(self._llm, self._tools, checkpointer=self._memory)
        
        # generate a thread ID to keep track of conversation history
        self._thread_id = self._generate_thread_id()

    def _load_csv_files(self, csv_files: list[str]) -> list:
        docs = []
        for file in csv_files:
            data_loader = CSVLoader(file)
            docs.extend(data_loader.load())
        return docs
    
    def _create_vector_store(self, docs: list) -> FAISS:
        text_splitter = RecursiveCharacterTextSplitter(chunk_size = 1000, chunk_overlap = 100)
        chunked_docs = text_splitter.split_documents(docs)
        embeddings = OpenAIEmbeddings()
        return FAISS.from_documents(documents=chunked_docs, embedding=embeddings)

    def _generate_thread_id(self) -> str:
        thread_id = secrets.token_urlsafe(16)
        return thread_id
    
    def invoke_bot(self, input_str: str) -> str:
        # named "inputs" to avoid shadowing the built-in input()
        inputs = {"messages": input_str}
        config = {"configurable": {"thread_id": self._thread_id}}
        output = self._agent_executor.invoke(input=inputs, config=config)["messages"][-1].content
        return output
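
Each bot instance generates its own thread ID, so separate instances keep separate conversation histories as long as their IDs differ. The short sketch below shows what `secrets.token_urlsafe(16)` produces; the function name `new_thread_id` is hypothetical and simply mirrors `_generate_thread_id` from the class above.

```python
import secrets


def new_thread_id():
    """Random, URL-safe identifier built from 16 random bytes."""
    return secrets.token_urlsafe(16)


first_id = new_thread_id()
second_id = new_thread_id()

# 16 random bytes encode to 22 URL-safe Base64 characters (padding stripped).
print(len(first_id))
# A collision between two 128-bit random IDs is vanishingly unlikely.
print(first_id != second_id)
```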

## Let's try it out!

In [20]:
conversational_agent = CSV_QA_Bot(llm, csv_files=["us_laws_dataset.csv"])

In [21]:
conversational_agent.invoke_bot("What laws in the US pertain to anti-trust and is there a recent case in the USA that pertains to violations of these laws?")

"### US Anti-Trust Laws\n\n1. **Sherman Act (1890)**: This is the foundational antitrust law in the United States, prohibiting monopolistic practices and promoting fair competition. It addresses and outlaws monopolistic behavior and conspiracies that restrain trade or commerce.\n   - [Full Text](https://www.govinfo.gov/content/pkg/STATUTE-26/pdf/STATUTE-26-Pg209.pdf)\n\n2. **Clayton Act (1914)**: This act builds on the Sherman Act by addressing specific practices that could lead to anticompetitive behavior. It covers topics such as price discrimination, exclusive dealing agreements, and mergers and acquisitions that may substantially lessen competition.\n   - [Full Text](https://www.govinfo.gov/content/pkg/STATUTE-38/pdf/STATUTE-38-Pg730.pdf)\n\n3. **Federal Trade Commission Act (1914)**: This act established the Federal Trade Commission (FTC) and outlaws unfair methods of competition and unfair or deceptive acts or practices in or affecting commerce.\n   - [Full Text](https://www.govi

In [22]:
conversational_agent.invoke_bot("What is the most famous US Supreme Court anti-trust case?")

'One of the most famous U.S. Supreme Court antitrust cases is **Standard Oil Co. of New Jersey v. United States (1911)**. This landmark case led to the breakup of the Standard Oil Company, which was deemed to have violated the Sherman Antitrust Act by engaging in monopolistic practices and restraining trade.\n\n### Key Details of the Case:\n\n- **Background**: Standard Oil, founded by John D. Rockefeller, had grown to dominate the oil industry in the United States through a series of aggressive and often underhanded business practices, including predatory pricing and acquiring competitors.\n  \n- **Legal Issue**: The main issue was whether Standard Oil\'s business practices constituted an illegal monopoly under the Sherman Antitrust Act of 1890.\n\n- **Supreme Court Decision**: In 1911, the Supreme Court ruled that Standard Oil was an illegal monopoly and ordered its dissolution. The Court applied the "rule of reason," which means that only those combinations and contracts unreasonably

In [23]:
conversational_agent.invoke_bot("Based on your knowledge of all laws in the USA relating to anti-trust, were there any other laws that could have been applied in this case?")

"The **Standard Oil Co. of New Jersey v. United States (1911)** case primarily involved the application of the Sherman Antitrust Act of 1890. However, several other antitrust laws could potentially have been applied or considered in similar contexts, had they existed at the time or been relevant to the specifics of the case. Here are some key antitrust laws and their potential applicability:\n\n### 1. **Clayton Act (1914)**\n- **Provisions**: Addresses specific practices that the Sherman Act does not explicitly cover, such as price discrimination, exclusive dealing agreements, and mergers and acquisitions that may substantially lessen competition.\n- **Applicability**: If the Clayton Act had been in effect at the time, it could have been used to address Standard Oil's acquisition of competitors and practices like exclusive dealing and price discrimination.\n\n### 2. **Federal Trade Commission Act (1914)**\n- **Provisions**: Establishes the Federal Trade Commission (FTC) and prohibits u