# Legal Assistant Bot

This chatbot retrieves context from a proprietary data source and from the web to answer questions about federal laws in the United States of America (USA).  The proprietary data source is a CSV file of all federal laws and their revision history in the USA.  The web data needed to answer the user's questions is retrieved using the You.com API.  The chatbot is implemented as an agent using LangChain and LangGraph.

## Install all required packages

In [1]:
%%capture
! pip install langgraph==0.0.59
! pip install pandas==2.2.2
! pip install openai==1.30.3
! pip install langchain==0.2.1
! pip install langchain_community==0.2.1
! pip install langchain_openai==0.1.7
! pip install langchain_text_splitters==0.2.0
! pip install langchain_core==0.2.1
! pip install numpy==1.26.4
! pip install python-dotenv==1.0.1
! pip install faiss-cpu==1.8.0

In [27]:
%%capture
import dotenv
dotenv.load_dotenv(".env", override=True)

## Load in the US Federal Laws dataset and create a vector database representation of this dataset, which will then be converted into a LangChain Retriever and Tool

In [3]:
# Let's take a look at our CSV dataset first
import pandas as pd

# The CSV file can be downloaded from: https://www.nature.com/articles/s41597-023-02758-z#Sec3
df = pd.read_csv("us_laws_dataset.csv")

df.head()

Unnamed: 0,row_number,action,Title,sal_volume,sal_page_start,BillCitation,congress_number,chapter,session_number,pl_no,date_of_passage,secondary_date,dates_conflict,Source,URL,alternate_sal_volume,alternate_sal_page_start,has_alternate_sal_citation
0,1,An Act,To regulate the time and manner of administeri...,1,23.0,,1,1.0,1.0,,1789-06-01,,,HeinOnline,,,,False
1,2,An Act,"For laying a duty on goods, wares, and merchan...",1,24.0,,1,2.0,1.0,,1789-07-04,,,HeinOnline,,,,False
2,3,An Act,Imposing duties on tonnage.,1,27.0,,1,3.0,1.0,,1789-07-20,,,HeinOnline,,,,False
3,4,An Act,For establishing an executive department to be...,1,28.0,,1,4.0,1.0,,1789-07-27,,,HeinOnline,,,,False
4,5,An Act,To regulate the collection of the duties impos...,1,29.0,,1,5.0,1.0,,1789-07-31,,,HeinOnline,,,,False


In [4]:
import openai
import langchain
import os

In [8]:
os.environ["YDC_API_KEY"] = "<Insert your YDC API key here>"
os.environ["OPENAI_API_KEY"] = "<Insert your OpenAI API key here>"
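
Before going further, it can help to confirm that both keys were actually filled in. The snippet below is a minimal sketch, not part of the original notebook: `find_unset_keys` is a hypothetical helper, and the check for `<...>` placeholders simply assumes the placeholder style used in the cell above.

```python
import os

def find_unset_keys(names, env=None):
    """Return the names whose value is missing, empty, or still a '<...>' placeholder."""
    env = os.environ if env is None else env
    unset = []
    for name in names:
        value = env.get(name, "").strip()
        if not value or (value.startswith("<") and value.endswith(">")):
            unset.append(name)
    return unset

# Example with an explicit mapping, so no real key is read or printed
example_env = {
    "YDC_API_KEY": "<Insert your YDC API key here>",
    "OPENAI_API_KEY": "sk-example",
}
print(find_unset_keys(["YDC_API_KEY", "OPENAI_API_KEY"], example_env))
```

Calling `find_unset_keys(["YDC_API_KEY", "OPENAI_API_KEY"])` without the second argument checks the real environment instead.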

In [9]:
from langchain_community.document_loaders.csv_loader import CSVLoader

# The CSV file can be downloaded from: https://www.nature.com/articles/s41597-023-02758-z#Sec3
loader = CSVLoader(file_path = "us_laws_dataset.csv")

data = loader.load()

In [10]:
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter

# split the document into chunks, and vectorize these chunks in a FAISS database
text_splitter = RecursiveCharacterTextSplitter(chunk_size = 1000, chunk_overlap = 100)
docs = text_splitter.split_documents(data)
embeddings = OpenAIEmbeddings()
db = FAISS.from_documents(documents=docs, embedding=embeddings)
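
To build some intuition for `chunk_size` and `chunk_overlap`, here is a toy sliding-window splitter. It is only a sketch: `sliding_window_chunks` is a hypothetical function, and it is not how `RecursiveCharacterTextSplitter` works internally (the real splitter prefers to break on separators such as paragraphs, lines and spaces). It only shows what it means for consecutive chunks to share a few characters.

```python
def sliding_window_chunks(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows where neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

Since `CSVLoader` produces one document per CSV row, rows shorter than `chunk_size` would presumably pass through the real splitter as a single chunk.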

In [11]:
# test out the similarity search
query = "What laws and amendments relate to perjury?"
response = db.similarity_search(query, k=10)
# let's look at the first 3 retrieved docs
response[:3]

[Document(page_content='row_number: 1885\naction: An Act\nTitle: In addition to the act, entitled "An act for the prompt settlement of public accounts," and for the punishment of the crime of perjury\nsal_volume: 3\nsal_page_start: 770\nBillCitation: NA\ncongress_number: 17\nchapter: 37\nsession_number: 2\npl_no: NA\ndate_of_passage: 1823-03-01\nsecondary_date: NA\ndates_conflict: NA\nSource: HeinOnline\nURL: NA\nalternate_sal_volume: NA\nalternate_sal_page_start: NA\nhas_alternate_sal_citation: FALSE', metadata={'source': 'us_laws_dataset.csv', 'row': 1884}),
 Document(page_content='row_number: 38724\naction: An Act\nTitle: An act to permit the use of unsworn declarations under penalty of perjury as evidence in Federal proceedings\nsal_volume: 90\nsal_page_start: 2534\nBillCitation: H.R. 15531\ncongress_number: 94\nchapter: NA\nsession_number: 2\npl_no: 94-550\ndate_of_passage: 1976-10-18\nsecondary_date: NA\ndates_conflict: FALSE\nSource: NA\nURL: https://www.govinfo.gov/content/pkg/
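
Conceptually, `similarity_search` embeds the query and returns the `k` stored chunks whose vectors are closest to it. The sketch below illustrates that idea with hand-written 2-D vectors and Euclidean distance. The function name `top_k_nearest` and the toy vectors are made up for illustration, and the choice of Euclidean distance is an assumption about the default index, not something this notebook configures.

```python
import math

def top_k_nearest(query_vec, doc_vecs, k):
    """Return (index, distance) pairs for the k vectors closest to query_vec."""
    scored = [(i, math.dist(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]

# Toy "embeddings"; real ones come from OpenAIEmbeddings and have many more dimensions
toy_vectors = [
    [1.0, 0.0],  # doc 0
    [0.0, 1.0],  # doc 1
    [0.9, 0.1],  # doc 2
]
nearest = top_k_nearest([1.0, 0.0], toy_vectors, k=2)
print([i for i, _ in nearest])
```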

In [12]:
from langchain.tools.retriever import create_retriever_tool

# create a retriever from the vector store and wrap it as a tool
db_retriever = db.as_retriever()
db_retriever_tool = create_retriever_tool(
    db_retriever,
    name = "law_dataset_retriever",
    description = "Retrieve relevant context from the US laws dataset."
)

## Instantiating the You.com Tool in LangChain

LangChain provides a wrapper around the You.com API and a You.com Tool.  For more information, please visit: https://python.langchain.com/v0.1/docs/integrations/tools/you/

In [13]:
from langchain_community.tools.you import YouSearchTool
from langchain_community.utilities.you import YouSearchAPIWrapper

api_wrapper = YouSearchAPIWrapper(num_web_results = 10)
ydc_tool = YouSearchTool(api_wrapper=api_wrapper)

In [14]:
# test out the You.com search tool
response = ydc_tool.invoke("Tell me about a recent high-profile case related to antitrust in the USA?")
# let's look at the first 3 results
response[:3]

[Document(page_content='Meijer, Inc. v. Ferring B.V.; Ferring Pharmaceuticals, Inc.; and Aventis Pharmaceuticals [In Re: DDAVP Direct Purchaser Antitrust Litigation] U.S. v. Memphis Board of Realtors', metadata={'url': 'https://www.justice.gov/atr/antitrust-case-filings-alpha', 'thumbnail_url': None, 'title': 'Antitrust Division | Antitrust Case Filings | United States Department of Justice', 'description': 'An official website of the United States government · Official websites use .gov A .gov website belongs to an official government organization in the United States'}),
 Document(page_content="Dentsply International, Inc. v. Antitrust Division of the United States Department of Justice · U.S. v. Freddy Deoliveira · U.S. v. Wilhelm DerMinassian · U.S. v. Eric Descouraux · Leinani Deslandes, Stephanie Turner, et al. v. McDonald's USA, LLC, et al.", metadata={'url': 'https://www.justice.gov/atr/antitrust-case-filings-alpha', 'thumbnail_url': None, 'title': 'Antitrust Division | Antitru

## Instantiate our LLM

In [15]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0.5)

## Tying it all together

In [16]:
from langgraph.prebuilt import chat_agent_executor
from langgraph.checkpoint import MemorySaver

# Create a checkpointer to use memory
memory = MemorySaver()
# the vector store representation of the CSV dataset and the You.com Search tool will both be passed as tools to the agent
tools = [db_retriever_tool, ydc_tool]
agent_executor = chat_agent_executor.create_tool_calling_executor(llm, tools, checkpointer=memory)
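
Roughly speaking, a tool-calling agent alternates between asking the model what to do next and running whichever tool it picked, until the model produces a final answer. The toy loop below sketches that pattern with a scripted stand-in for the LLM. It is not LangGraph's implementation, and every name in it (`run_tool_loop`, `scripted_model`, the tool names and their canned results) is hypothetical.

```python
def run_tool_loop(model, tools, question, max_steps=5):
    """Alternate between the model and the tools until the model gives a final answer."""
    messages = [("user", question)]
    for _ in range(max_steps):
        decision = model(messages)
        if decision["type"] == "final":
            messages.append(("assistant", decision["content"]))
            return messages
        # otherwise the model asked for a tool: run it and feed the result back
        result = tools[decision["tool"]](decision["query"])
        messages.append(("tool", f'{decision["tool"]}: {result}'))
    raise RuntimeError("no final answer within max_steps")

# Hypothetical stand-ins for the dataset retriever and the web search tool
toy_tools = {
    "law_dataset_retriever": lambda q: "a matching row from the CSV",
    "web_search": lambda q: "a web snippet about a recent case",
}

def scripted_model(messages):
    """Fake model: call each tool once, then answer."""
    used = [text.split(":")[0] for role, text in messages if role == "tool"]
    for name in toy_tools:
        if name not in used:
            return {"type": "tool", "tool": name, "query": messages[0][1]}
    return {"type": "final", "content": f"answer built from {len(used)} tool results"}

history = run_tool_loop(scripted_model, toy_tools, "What laws pertain to perjury?")
print([role for role, _ in history])
```

In the real agent, the LLM decides which tool to call (if any) based on the tool names and descriptions, which is why the `description` passed to `create_retriever_tool` matters.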

## Let's try it out!

In [25]:
agent_executor.invoke(input={"messages": "What laws in the US pertain to perjury and is there a recent case in the US that relates to a violation of these laws?"}, config={"configurable": {"thread_id": "xyz_789"}})["messages"][-1].content

'### US Laws Pertaining to Perjury\n\nIn the United States, perjury is governed by several federal statutes and legal precedents. Here are the primary laws that pertain to perjury:\n\n1. **18 U.S. Code § 1621 - Perjury Generally**\n   - **Definition**: This statute makes it a crime to willfully make false statements under oath in any case where an oath is authorized by U.S. law.\n   - **Penalties**: Perjury under this statute is classified as a felony and can result in a fine or imprisonment for up to five years, or both.\n   - **Elements**: The elements of perjury include taking an oath, willfully making a false statement, believing the statement to be untrue, and the statement relating to a material fact.\n   - [18 U.S. Code § 1621 - Perjury generally](https://www.law.cornell.edu/uscode/text/18/1621)\n\n2. **18 U.S. Code § 1622 - Subornation of Perjury**\n   - **Definition**: This statute makes it a crime to persuade or induce another person to commit perjury.\n   - **Penalties**: Th

In [18]:
agent_executor.invoke(input={"messages": "What is the most famous US Supreme Court perjury case?"}, config={"configurable": {"thread_id": "xyz_789"}})["messages"][-1].content

'One of the most famous U.S. Supreme Court cases involving perjury is **Bronston v. United States, 409 U.S. 352 (1973)**. This case is seminal because it set a significant precedent regarding the interpretation of the federal perjury statute.\n\n### Bronston v. United States (1973)\n\n#### Case Summary:\n- **Facts**: Samuel Bronston, the president of a film production company, was questioned under oath during a bankruptcy proceeding. He was asked about his personal bank accounts in Switzerland. Bronston\'s answers were technically true but misleading. He stated that his company had an account in Switzerland, but did not mention his personal account there.\n- **Legal Issue**: The issue was whether Bronston\'s misleading but literally true statements could be prosecuted under the federal perjury statute (18 U.S.C. § 1621).\n- **Supreme Court Decision**: The Supreme Court unanimously held that literally true statements, even if misleading, do not constitute perjury under the federal statu

In [19]:
agent_executor.invoke(input={"messages": "Based on your knowledge of all laws in the US pertaining to perjury, were there any other laws that could have been applied in this case?"}, config={"configurable": {"thread_id": "xyz_789"}})["messages"][-1].content

'In the case of **Bronston v. United States**, the primary legal issue revolved around the interpretation of the federal perjury statute, specifically 18 U.S.C. § 1621. However, considering the nature of Bronston\'s testimony and the broader context of perjury laws in the U.S., there are a few other statutes and legal principles that might have been relevant or could have been considered:\n\n### 1. **False Declarations Before Grand Jury or Court (18 U.S.C. § 1623)**\n- **Overview**: This statute addresses knowingly making false material declarations under oath before a grand jury or court.\n- **Relevance**: While Bronston\'s statements were literally true, if it could have been shown that he knowingly made a false material declaration in a more straightforward manner, § 1623 might have been applicable. However, this statute also requires the false statement to be unambiguous.\n\n### 2. **Obstruction of Justice (18 U.S.C. § 1503)**\n- **Overview**: This statute makes it a crime to corru
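
The follow-up question above only says "this case", yet the agent resolves it to Bronston v. United States because all three calls share the same `thread_id`, so the checkpointer can restore the earlier messages. The toy class below sketches that idea with a plain dictionary. It is not how `MemorySaver` is implemented, and `ToyThreadMemory` is a hypothetical name.

```python
class ToyThreadMemory:
    """Keep a separate message history per thread ID."""

    def __init__(self):
        self._threads = {}

    def append(self, thread_id, role, text):
        self._threads.setdefault(thread_id, []).append((role, text))

    def history(self, thread_id):
        # return a copy so callers cannot mutate the stored history
        return list(self._threads.get(thread_id, []))

toy_memory = ToyThreadMemory()
toy_memory.append("xyz_789", "user", "What is the most famous US Supreme Court perjury case?")
toy_memory.append("xyz_789", "assistant", "Bronston v. United States")
toy_memory.append("xyz_789", "user", "Could other laws have been applied in this case?")

print(len(toy_memory.history("xyz_789")))    # messages visible to the third turn
print(toy_memory.history("another_thread"))  # a new thread starts with no history
```

Using a different `thread_id` in `config` would therefore start a fresh conversation with no earlier context.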

## Let's create a Python class that encapsulates the code above.  This will enable users to easily create chatbots for new use cases with custom CSV datasets

In [30]:
import secrets

class CSV_QA_Bot:
    def __init__(self, llm: ChatOpenAI, csv_files: list[str], num_web_results_to_fetch: int = 10):
        self._llm = llm
        
        docs = self._load_csv_files(csv_files)
        
        # split the docs into chunks, vectorize the chunks and load them into a vector store
        db = self._create_vector_store(docs)
        
        # create a retriever from the vector store
        self._faiss_retriever = db.as_retriever()
        
        # convert this retriever into a LangChain tool
        self._faiss_retriever_tool = create_retriever_tool(
            self._faiss_retriever,
            name = "custom_dataset_retriever",
            description = "Retrieve relevant context from custom dataset."
        )
        
        # instantiate the YDC search tool in LangChain
        self._ydc_api_wrapper = YouSearchAPIWrapper(num_web_results=num_web_results_to_fetch)
        self._ydc_search_tool = YouSearchTool(api_wrapper=self._ydc_api_wrapper)
        
        
        # create a list of tools that will be supplied to the LangChain agent
        self._tools = [self._faiss_retriever_tool, self._ydc_search_tool]
        
        # Create a checkpointer to use memory
        self._memory = MemorySaver()
        
        # create the agent executor
        self._agent_executor = chat_agent_executor.create_tool_calling_executor(self._llm, self._tools, checkpointer=self._memory)
        
        # generate a thread ID to keep track of conversation history
        self._thread_id = self._generate_thread_id()

    def _load_csv_files(self, csv_files: list[str]) -> list:
        docs = []
        for file in csv_files:
            data_loader = CSVLoader(file)
            docs.extend(data_loader.load())
        return docs
    
    def _create_vector_store(self, docs: list) -> FAISS:
        text_splitter = RecursiveCharacterTextSplitter(chunk_size = 1000, chunk_overlap = 100)
        chunked_docs = text_splitter.split_documents(docs)
        embeddings = OpenAIEmbeddings()
        return FAISS.from_documents(documents=chunked_docs, embedding=embeddings)

    def _generate_thread_id(self) -> str:
        thread_id = secrets.token_urlsafe(16)
        return thread_id
    
    def invoke_bot(self, input_str: str) -> str:
        # use a name other than `input` to avoid shadowing the built-in function
        payload = {"messages": input_str}
        config = {"configurable": {"thread_id": self._thread_id}}
        output = self._agent_executor.invoke(input=payload, config=config)["messages"][-1].content
        return output

## Let's try it out!

In [31]:
conversational_agent = CSV_QA_Bot(llm, csv_files=["us_laws_dataset.csv"])

In [32]:
conversational_agent.invoke_bot("What laws in the US pertain to anti-trust and is there a recent case in the USA that pertains to violations of these laws?")

"### Key Antitrust Laws in the United States\n\n1. **Sherman Antitrust Act (1890)**:\n   - **Section 1**: Prohibits contracts, combinations, or conspiracies that restrain trade.\n   - **Section 2**: Prohibits monopolization, attempts to monopolize, or conspiracies to monopolize.\n\n2. **Clayton Antitrust Act (1914)**:\n   - Addresses specific practices that the Sherman Act does not clearly prohibit, such as mergers and interlocking directorates.\n   - **Section 7**: Prohibits mergers and acquisitions where the effect may be substantially to lessen competition or to tend to create a monopoly.\n   - **Section 3**: Prohibits exclusive dealing and tying arrangements that may harm competition.\n\n3. **Federal Trade Commission Act (1914)**:\n   - Established the Federal Trade Commission (FTC).\n   - **Section 5**: Prohibits unfair methods of competition and unfair or deceptive acts or practices.\n\n4. **Robinson-Patman Act (1936)**:\n   - Prohibits anticompetitive practices by producers, spe

In [33]:
conversational_agent.invoke_bot("What is the most famous US Supreme Court anti-trust case?")

'One of the most famous U.S. Supreme Court antitrust cases is **Standard Oil Co. of New Jersey v. United States (1911)**.\n\n### Standard Oil Co. of New Jersey v. United States (1911)\n\n#### Background:\n- **Company**: Standard Oil Company, founded by John D. Rockefeller, was the largest oil refiner in the world at the time.\n- **Accusation**: The federal government accused Standard Oil of engaging in monopolistic practices and violating the Sherman Antitrust Act by maintaining a monopoly through a series of abusive and anticompetitive actions.\n\n#### Key Points:\n- **Decision Date**: May 15, 1911\n- **Court\'s Ruling**: The Supreme Court ruled that Standard Oil had indeed violated the Sherman Antitrust Act.\n- **Outcome**: The Court ordered the dissolution of Standard Oil into 34 independent companies. This decision was based on the "rule of reason" standard, which evaluates whether a business practice unreasonably restrains trade.\n\n#### Significance:\n- **Impact**: The case set a

In [34]:
conversational_agent.invoke_bot("Based on your knowledge of all laws in the USA relating to anti-trust, were there any other laws that could have been applied in this case?")

"The **Standard Oil Co. of New Jersey v. United States (1911)** case primarily relied on the Sherman Antitrust Act of 1890. However, other antitrust laws that could potentially have been applied or considered in similar contexts include:\n\n### 1. **Clayton Antitrust Act (1914)**\n   - **Section 2**: Addresses price discrimination, which could be relevant if Standard Oil was found to be selling the same product to different buyers at different prices in a way that lessened competition.\n   - **Section 3**: Prohibits exclusive dealing and tying arrangements that may harm competition.\n   - **Section 7**: Prohibits mergers and acquisitions where the effect may be substantially to lessen competition or to tend to create a monopoly.\n   - **Section 8**: Prohibits interlocking directorates (i.e., the same person making business decisions for competing companies).\n\n### 2. **Federal Trade Commission Act (1914)**\n   - **Section 5**: Prohibits unfair methods of competition and unfair or dece