# Legal Assistant Bot

This chatbot retrieves context from a proprietary data source and from the web to answer questions about federal laws in the United States of America (USA). The proprietary data source is a CSV file of all federal laws and their revision history in the USA. The web data needed to answer the user's questions is retrieved with the You.com API. The chatbot is implemented as an agent using LangChain and LangGraph.

## Install all required packages

In [102]:
%%capture
! pip install langgraph==0.0.59
! pip install pandas==2.2.2
! pip install openai==1.30.3
! pip install langchain==0.2.1
! pip install langchain_community==0.2.1
! pip install langchain_openai==0.1.7
! pip install langchain_text_splitters==0.2.0
! pip install langchain_core==0.2.1
! pip install numpy==1.26.4
! pip install python-dotenv==1.0.1
! pip install faiss-cpu==1.8.0

## Load in the US Federal Laws dataset and create a vector database representation of this dataset, which will then be converted into a LangChain Retriever and Tool

In [103]:
# Let's take a look at our CSV dataset first
import pandas as pd

# The CSV file can be downloaded from: https://www.nature.com/articles/s41597-023-02758-z#Sec3
df = pd.read_csv("us_laws_dataset.csv")

df.head()

Unnamed: 0,row_number,action,Title,sal_volume,sal_page_start,BillCitation,congress_number,chapter,session_number,pl_no,date_of_passage,secondary_date,dates_conflict,Source,URL,alternate_sal_volume,alternate_sal_page_start,has_alternate_sal_citation
0,1,An Act,To regulate the time and manner of administeri...,1,23.0,,1,1.0,1.0,,1789-06-01,,,HeinOnline,,,,False
1,2,An Act,"For laying a duty on goods, wares, and merchan...",1,24.0,,1,2.0,1.0,,1789-07-04,,,HeinOnline,,,,False
2,3,An Act,Imposing duties on tonnage.,1,27.0,,1,3.0,1.0,,1789-07-20,,,HeinOnline,,,,False
3,4,An Act,For establishing an executive department to be...,1,28.0,,1,4.0,1.0,,1789-07-27,,,HeinOnline,,,,False
4,5,An Act,To regulate the collection of the duties impos...,1,29.0,,1,5.0,1.0,,1789-07-31,,,HeinOnline,,,,False


In [104]:
import openai
import langchain
import os

In [120]:
# Insert your API keys here, or add them to a .env file and load them in the cell below
os.environ["YDC_API_KEY"] = "<Insert your YDC API key here>"
os.environ["OPENAI_API_KEY"] = "<Insert your OpenAI API key here>"

In [106]:
import dotenv
dotenv.load_dotenv(".env", override=True)

True
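
Since the keys can come either from the assignments above or from a `.env` file, a quick check before calling any API can save a confusing authentication error later. The following is a minimal sketch; the helper name `missing_api_keys` and the rule of treating `<Insert ...>` placeholder text as missing are assumptions made for illustration, not part of any library.

```python
import os

def missing_api_keys(required, env=None):
    """Return the names in `required` that are unset, empty, or still placeholders."""
    env = os.environ if env is None else env
    missing = []
    for name in required:
        value = env.get(name, "")
        # Assumption: the notebook's "<Insert ...>" placeholder counts as missing.
        if not value or value.startswith("<Insert"):
            missing.append(name)
    return missing

# Use an explicit mapping here so the example does not depend on the real environment.
sample_env = {
    "YDC_API_KEY": "abc123",
    "OPENAI_API_KEY": "<Insert your OpenAI API key here>",
}
print(missing_api_keys(["YDC_API_KEY", "OPENAI_API_KEY"], env=sample_env))
```

In the notebook itself, calling `missing_api_keys(["YDC_API_KEY", "OPENAI_API_KEY"])` without the `env` argument would check `os.environ` directly.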

In [107]:
from langchain_community.document_loaders.csv_loader import CSVLoader

# The CSV file can be downloaded from: https://www.nature.com/articles/s41597-023-02758-z#Sec3
loader = CSVLoader(file_path = "us_laws_dataset.csv")

data = loader.load()
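
Judging by the similarity-search output later in this notebook, each CSV row becomes one document whose text is a list of `column: value` lines, with the source file and row index stored as metadata. The sketch below mimics that layout with the standard library only, to make the shape of `data` easier to picture. It is not the loader's actual implementation, and `rows_to_documents` and the sample CSV text are hypothetical.

```python
import csv
import io

def rows_to_documents(csv_text, source):
    """Mimic a one-document-per-row layout using plain dictionaries."""
    documents = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_index, row in enumerate(reader):
        # One "column: value" line per column, joined with newlines.
        page_content = "\n".join(f"{column}: {value}" for column, value in row.items())
        documents.append({
            "page_content": page_content,
            "metadata": {"source": source, "row": row_index},
        })
    return documents

sample_csv = "row_number,action,Title\n1,An Act,Imposing duties on tonnage.\n"
docs_sketch = rows_to_documents(sample_csv, source="sample.csv")
print(docs_sketch[0]["page_content"])
print(docs_sketch[0]["metadata"])
```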

In [108]:
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# split the document into chunks, and vectorize these chunks in a FAISS database
text_splitter = RecursiveCharacterTextSplitter(chunk_size = 1000, chunk_overlap = 100)
docs = text_splitter.split_documents(data)
embeddings = OpenAIEmbeddings()
db = FAISS.from_documents(documents=docs, embedding=embeddings)
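
To build intuition for the `chunk_size` and `chunk_overlap` parameters, here is a deliberately naive fixed-width chunker. It is only an illustration of what overlap means: `RecursiveCharacterTextSplitter` prefers to split on separators such as newlines rather than at fixed offsets, so its real output will differ. The function `chunk_with_overlap` is hypothetical.

```python
def chunk_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-width windows that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks

sample_text = "abcdefghijklmnopqrstuvwxyz"
print(chunk_with_overlap(sample_text, chunk_size=10, chunk_overlap=3))
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk starts with the last three characters of the previous one. A text shorter than `chunk_size` comes back as a single chunk, which is likely the common case for short CSV rows.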

In [109]:
# test out the similarity search
query = "What laws and acts relate to perjury?"
response = db.similarity_search(query, k=10)
# let's look at the first 3 retrieved docs
response[:3]

[Document(page_content='row_number: 1885\naction: An Act\nTitle: In addition to the act, entitled "An act for the prompt settlement of public accounts," and for the punishment of the crime of perjury\nsal_volume: 3\nsal_page_start: 770\nBillCitation: NA\ncongress_number: 17\nchapter: 37\nsession_number: 2\npl_no: NA\ndate_of_passage: 1823-03-01\nsecondary_date: NA\ndates_conflict: NA\nSource: HeinOnline\nURL: NA\nalternate_sal_volume: NA\nalternate_sal_page_start: NA\nhas_alternate_sal_citation: FALSE', metadata={'source': 'us_laws_dataset.csv', 'row': 1884}),
 Document(page_content='row_number: 38724\naction: An Act\nTitle: An act to permit the use of unsworn declarations under penalty of perjury as evidence in Federal proceedings\nsal_volume: 90\nsal_page_start: 2534\nBillCitation: H.R. 15531\ncongress_number: 94\nchapter: NA\nsession_number: 2\npl_no: 94-550\ndate_of_passage: 1976-10-18\nsecondary_date: NA\ndates_conflict: FALSE\nSource: NA\nURL: https://www.govinfo.gov/content/pkg/
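
Conceptually, a similarity search embeds the query and returns the `k` stored vectors closest to it. The brute-force sketch below shows that idea on made-up three-dimensional vectors. The vectors, the document labels, and the choice of Euclidean distance are all assumptions for illustration. A real vector store works on model-generated embeddings with an optimized index.

```python
import math

def top_k(query_vec, doc_vecs, k):
    """Return the ids of the k vectors closest to the query (Euclidean distance)."""
    scored = []
    for doc_id, vec in doc_vecs.items():
        scored.append((math.dist(query_vec, vec), doc_id))
    scored.sort()  # smallest distance first
    return [doc_id for _, doc_id in scored[:k]]

# Toy vectors: the first axis loosely stands for "about perjury".
toy_vectors = {
    "perjury_act_1823": [0.9, 0.1, 0.0],
    "tonnage_duties_1789": [0.0, 0.2, 0.9],
    "unsworn_declarations_1976": [0.8, 0.3, 0.1],
}
toy_query = [1.0, 0.0, 0.0]
print(top_k(toy_query, toy_vectors, k=2))
# ['perjury_act_1823', 'unsworn_declarations_1976']
```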

In [110]:
from langchain.tools.retriever import create_retriever_tool

# create a retriever from the vector store, then wrap it as a tool
db_retriever = db.as_retriever()
db_retriever_tool = create_retriever_tool(
    db_retriever,
    name = "law_dataset_retriever",
    description = "Retrieve relevant context from the US laws dataset."
)

## Instantiating the You.com Tool in LangChain

LangChain provides a wrapper around the You.com API and a You.com Tool.  For more information, please visit: https://python.langchain.com/v0.1/docs/integrations/tools/you/

In [111]:
from langchain_community.tools.you import YouSearchTool
from langchain_community.utilities.you import YouSearchAPIWrapper

api_wrapper = YouSearchAPIWrapper(num_web_results = 10)
ydc_tool = YouSearchTool(api_wrapper=api_wrapper)

In [112]:
# test out the You.com search tool
response = ydc_tool.invoke("Tell me about a recent high-profile case related to antitrust in the USA.")
# let's look at the first 3 results
response[:3]

[Document(page_content='Meijer, Inc. v. Ferring B.V.; Ferring Pharmaceuticals, Inc.; and Aventis Pharmaceuticals [In Re: DDAVP Direct Purchaser Antitrust Litigation] U.S. v. Memphis Board of Realtors', metadata={'url': 'https://www.justice.gov/atr/antitrust-case-filings-alpha', 'thumbnail_url': None, 'title': 'Antitrust Division | Antitrust Case Filings | United States Department of Justice', 'description': 'An official website of the United States government · Official websites use .gov A .gov website belongs to an official government organization in the United States'}),
 Document(page_content="Dentsply International, Inc. v. Antitrust Division of the United States Department of Justice · U.S. v. Freddy Deoliveira · U.S. v. Wilhelm DerMinassian · U.S. v. Eric Descouraux · Leinani Deslandes, Stephanie Turner, et al. v. McDonald's USA, LLC, et al.", metadata={'url': 'https://www.justice.gov/atr/antitrust-case-filings-alpha', 'thumbnail_url': None, 'title': 'Antitrust Division | Antitru
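
The first two results above come from the same URL, so it can help to group snippets by their source page when inspecting search output. This is a small sketch using plain dictionaries in place of `Document` objects. The helper `group_snippets_by_url` and the sample data are hypothetical.

```python
def group_snippets_by_url(results):
    """Collect the text snippets of each result under its source URL."""
    grouped = {}
    for result in results:
        url = result["metadata"]["url"]
        grouped.setdefault(url, []).append(result["page_content"])
    return grouped

# Stand-ins for search results: each has text plus a metadata dictionary with a URL.
toy_results = [
    {"page_content": "snippet one", "metadata": {"url": "https://example.com/a"}},
    {"page_content": "snippet two", "metadata": {"url": "https://example.com/a"}},
    {"page_content": "snippet three", "metadata": {"url": "https://example.com/b"}},
]
grouped = group_snippets_by_url(toy_results)
print({url: len(snippets) for url, snippets in grouped.items()})
```

With real results, the same grouping could be done by reading `doc.metadata["url"]` and `doc.page_content` from each returned document.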

## Instantiate our LLM

In [113]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0.5)

## Tying it all together

In [114]:
from langgraph.prebuilt import chat_agent_executor
from langgraph.checkpoint import MemorySaver

# Create a checkpointer to use memory
memory = MemorySaver()
# the vector store representation of the CSV dataset and the You.com Search tool will both be passed as tools to the agent
tools = [db_retriever_tool, ydc_tool]
agent_executor = chat_agent_executor.create_tool_calling_executor(llm, tools, checkpointer=memory)
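
The checkpointer is what lets follow-up questions refer back to earlier turns: conversation state is stored per `thread_id`, so calls that reuse the same ID continue the same conversation. The toy class below illustrates only that keying idea with a dictionary. It is not how `MemorySaver` is implemented, and `ToyThreadMemory` is a hypothetical name.

```python
from collections import defaultdict

class ToyThreadMemory:
    """Keep a separate message history for each thread ID."""

    def __init__(self):
        self._threads = defaultdict(list)

    def append(self, thread_id, role, content):
        self._threads[thread_id].append((role, content))

    def history(self, thread_id):
        # Return a copy; unknown thread IDs yield an empty history.
        return list(self._threads.get(thread_id, []))

memory_sketch = ToyThreadMemory()
memory_sketch.append("thread_a", "user", "What laws address economic espionage?")
memory_sketch.append("thread_a", "assistant", "(answer text)")
memory_sketch.append("thread_b", "user", "What laws address insider trading?")

print(len(memory_sketch.history("thread_a")))  # 2
print(len(memory_sketch.history("thread_b")))  # 1
```

Passing a different `thread_id` in the `config` would therefore be expected to start a fresh conversation with no earlier context.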

## Let's try it out!

In [115]:
agent_executor.invoke(input={"messages": "What laws in the US address economic espionage?"}, config={"configurable": {"thread_id": "xyz_789"}})["messages"][-1].content

'Several laws in the United States address economic espionage:\n\n1. **Economic Espionage Act of 1996 (Public Law 104-294)**\n   - **Date of Passage:** October 11, 1996\n   - **Summary:** This act makes the theft or misappropriation of trade secrets a federal crime.\n   - **URL:** [Economic Espionage Act of 1996](https://www.govinfo.gov/content/pkg/STATUTE-110/pdf/STATUTE-110-Pg3488.pdf)\n   \n2. **An act to clarify the scope of the Economic Espionage Act of 1996 (Public Law 112-236)**\n   - **Date of Passage:** December 28, 2012\n   - **Summary:** This act clarifies the scope of the original Economic Espionage Act of 1996.\n   - **URL:** [Clarification of the Economic Espionage Act of 1996](https://www.govinfo.gov/content/pkg/STATUTE-126/html/STATUTE-126-Pg1627.htm)\n\n3. **An act to amend title 18, United States Code, to provide for increased penalties for foreign and economic espionage, and for other purposes (Public Law 112-269)**\n   - **Date of Passage:** January 14, 2013\n   - *

In [116]:
agent_executor.invoke(input={"messages": "What is the most famous US Supreme Court case related to economic espionage?"}, config={"configurable": {"thread_id": "xyz_789"}})["messages"][-1].content

'The most famous U.S. Supreme Court case related to economic espionage is **"Gorin v. United States"** (1941).\n\n### Gorin v. United States (312 U.S. 19)\n- **Date of Decision:** January 13, 1941\n- **Summary:** This case involved the Espionage Act of 1917. The Supreme Court upheld the convictions of the defendants for obtaining and delivering documents related to national defense to a foreign agent, with the intent or reason to believe that the information would be used to the injury of the United States or to the advantage of a foreign nation.\n- **Significance:** The ruling clarified the scope of the Espionage Act, establishing that the act of obtaining and delivering documents connected with national defense with intent to harm the U.S. or benefit a foreign nation constituted a crime under the Espionage Act.\n- **URL:** [Gorin v. United States - Justia](https://supreme.justia.com/cases/federal/us/312/19/)\n\nThis case is a landmark in the context of espionage laws and has had a la

In [117]:
agent_executor.invoke(input={"messages": "Based on your knowledge of federal laws in the US pertaining to economic espionage, were there any other laws that could have been applied in this case?"}, config={"configurable": {"thread_id": "xyz_789"}})["messages"][-1].content

'In the context of the "Gorin v. United States" case and the broader issue of economic espionage, several other federal laws could potentially be applied depending on the specific circumstances of the case. These laws include:\n\n### 1. Economic Espionage Act of 1996 (EEA)\n- **Sections 1831 and 1832 of Title 18, U.S. Code**\n  - **Section 1831:** Addresses economic espionage specifically involving the theft of trade secrets for the benefit of a foreign government, foreign instrumentality, or foreign agent.\n  - **Section 1832:** Addresses the theft of trade secrets for commercial or economic advantage, regardless of the involvement of a foreign entity.\n  - **Summary:** This act criminalizes the theft or misappropriation of trade secrets and provides penalties for individuals and organizations involved in such activities.\n\n### 2. Espionage Act of 1917\n- **Title 18, U.S. Code, Sections 792-798**\n  - **Summary:** This act addresses a wide range of espionage activities, including the

## Let's create a Python class that encapsulates the code above.  This will enable users to easily create chatbots for new use cases with custom CSV datasets

In [118]:
import secrets

class CSV_QA_Bot:
    def __init__(self, llm: ChatOpenAI, csv_files: list[str], num_web_results_to_fetch: int = 10):
        self._llm = llm
        
        docs = self._load_csv_files(csv_files)
        
        # split the docs into chunks, vectorize the chunks and load them into a vector store
        db = self._create_vector_store(docs)
        
        # create a retriever from the vector store
        self._faiss_retriever = db.as_retriever()
        
        # convert this retriever into a LangChain tool
        self._faiss_retriever_tool = create_retriever_tool(
            self._faiss_retriever,
            name = "custom_dataset_retriever",
            description = "Retrieve relevant context from custom dataset."
        )
        
        # instantiate the YDC search tool in LangChain
        self._ydc_api_wrapper = YouSearchAPIWrapper(num_web_results=num_web_results_to_fetch)
        self._ydc_search_tool = YouSearchTool(api_wrapper=self._ydc_api_wrapper)
        
        
        # create a list of tools that will be supplied to the LangChain agent
        self._tools = [self._faiss_retriever_tool, self._ydc_search_tool]
        
        # Create a checkpointer to use memory
        self._memory = MemorySaver()
        
        # create the agent executor
        self._agent_executor = chat_agent_executor.create_tool_calling_executor(self._llm, self._tools, checkpointer=self._memory)
        
        # generate a thread ID to keep track of conversation history
        self._thread_id = self._generate_thread_id()

    def _load_csv_files(self, csv_files: list[str]) -> list:
        docs = []
        for file in csv_files:
            data_loader = CSVLoader(file)
            docs.extend(data_loader.load())
        return docs
    
    def _create_vector_store(self, docs: list) -> FAISS:
        text_splitter = RecursiveCharacterTextSplitter(chunk_size = 1000, chunk_overlap = 100)
        chunked_docs = text_splitter.split_documents(docs)
        embeddings = OpenAIEmbeddings()
        return FAISS.from_documents(documents=chunked_docs, embedding=embeddings)

    def _generate_thread_id(self) -> str:
        thread_id = secrets.token_urlsafe(16)
        return thread_id
    
    def invoke_bot(self, input_str: str) -> str:
        # named `inputs` to avoid shadowing the built-in `input`
        inputs = {"messages": input_str}
        config = {"configurable": {"thread_id": self._thread_id}}
        output = self._agent_executor.invoke(input=inputs, config=config)
        return output["messages"][-1].content
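
Each `CSV_QA_Bot` instance generates its own random thread ID, so every instance keeps its own conversation history. The short sketch below shows what `secrets.token_urlsafe(16)` produces. The wrapper `new_thread_id` is a hypothetical stand-in for the class's private method.

```python
import secrets

def new_thread_id(num_bytes=16):
    """Return a random URL-safe string built from `num_bytes` random bytes."""
    return secrets.token_urlsafe(num_bytes)

sample_ids = [new_thread_id() for _ in range(5)]
print(sample_ids)                    # five different random strings
print({len(i) for i in sample_ids})  # {22}: 16 bytes encode to 22 URL-safe characters
```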

## Let's try it out!

In [119]:
llm = ChatOpenAI(model="gpt-4o", temperature=0.5)
conversational_agent = CSV_QA_Bot(llm, csv_files=["us_laws_dataset.csv"])

In [121]:
conversational_agent.invoke_bot("What laws in the USA address insider trading?")

'In the United States, several laws and acts specifically address insider trading, including:\n\n1. **Securities Act of 1933**:\n   - This act requires the full and fair disclosure of the character of securities sold in interstate and foreign commerce and through the mails to prevent fraud in the sale of securities.\n   - [Link to the Act](https://www.sec.gov/Archives/edgar/data/1164964/000101968715004168/globalfuture_8k-ex9904.htm)\n\n2. **Securities Exchange Act of 1934**:\n   - This act provides for the regulation of securities exchanges and over-the-counter markets operating in interstate and foreign commerce and through the mails to prevent inequitable and unfair practices on such exchanges and markets.\n   - Sections 16(b) and 10(b) of the Securities Exchange Act of 1934 directly and indirectly address insider trading.\n   - [Link to the Act](https://www.sec.gov/Archives/edgar/data/1164964/000101968715004168/globalfuture_8k-ex9904.htm)\n\n3. **Insider Trading Sanctions Act of 198

In [122]:
conversational_agent.invoke_bot("What is the most famous US Supreme Court case of insider trading?")

"One of the most famous U.S. Supreme Court cases involving insider trading is **Salman v. United States**, decided on December 6, 2016. In this case, the Supreme Court upheld the conviction of Bassam Salman for insider trading. The Court ruled that gifts of confidential information from business executives to relatives violate securities laws, even if the tipper does not receive a direct financial benefit.\n\n### Key Points from Salman v. United States:\n- **Issue**: Whether the insider trading laws require that the insider (tipper) receive a tangible benefit in exchange for the tip.\n- **Ruling**: The Supreme Court held that a gift of confidential information to a relative or friend can constitute a violation of insider trading laws, even if the tipper does not receive a direct financial benefit.\n- **Impact**: The decision made it easier to prosecute insider trading cases by clarifying that the benefit to the tipper can be intangible, such as enhancing a personal relationship.\n\n###

In [123]:
conversational_agent.invoke_bot("Based on your knowledge of federal laws in the USA related to insider trading, were there any other laws that could have been applied in this case?")

"In addition to the specific ruling in **Salman v. United States**, several federal laws and regulations in the USA address insider trading and could have been applied in this case. These include:\n\n### 1. **Securities Exchange Act of 1934**\n   - **Section 10(b)**: This section prohibits any manipulative or deceptive device or contrivance in connection with the purchase or sale of any security. It is often enforced through SEC Rule 10b-5.\n   - **Rule 10b-5**: This rule makes it unlawful for any person to employ any device, scheme, or artifice to defraud, to make any untrue statement of a material fact, or to omit to state a material fact necessary in order to make the statements made not misleading, or to engage in any act, practice, or course of business which operates or would operate as a fraud or deceit upon any person, in connection with the purchase or sale of any security.\n\n### 2. **Insider Trading Sanctions Act of 1984**\n   - This act allows the SEC to seek civil penaltie