## Week 2

*Welcome back! Scroll down to Week 3.*

We first install the required software: LangChain and `pypdf`, which LangChain's PDF loaders depend on.

In [1]:
!pip install --upgrade pip 
!pip install --upgrade langchain pypdf

Collecting langchain
  Downloading langchain-0.0.353-py3-none-any.whl.metadata (13 kB)
Collecting langchain-core<0.2,>=0.1.4 (from langchain)
  Downloading langchain_core-0.1.4-py3-none-any.whl.metadata (4.0 kB)
Downloading langchain-0.0.353-py3-none-any.whl (803 kB)
Downloading langchain_core-0.1.4-py3-none-any.whl (205 kB)
Installing collected packages: langchain-core, langchain
  Attempting uninstall: langchain-core
    Found existing installation: langchain-core 0.1.0
    Uninstalling langchain-core-0.1.0:
      Successfully uninstalled langchain-core-0.1.0
  Attempting uninstall: langchain
    Found existing installation: langchain 0.0.352
    Uninstalling langchain-0.0.352:
      Successfully uni

We then import LangChain's `pypdf`-based loaders.

In [2]:
from langchain.document_loaders import PyPDFLoader, PyPDFDirectoryLoader

Let's first load a single PDF, then a whole directory of PDFs.

In [3]:
loader = PyPDFLoader("Data/2021-census-population-occupied-private-dwellings-community-2001-2021.pdf")

In [4]:
single_pdf = loader.load_and_split()

In [5]:
loader = PyPDFDirectoryLoader("Data/Supreme Court opinions 2014/")

In [6]:
many_pdfs = loader.load_and_split()
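
`load_and_split()` returns a list of chunks rather than one document per file, because long texts are cut into smaller pieces before they are embedded. The sketch below illustrates the general idea of splitting with overlap. It is a simplified stand-in, not LangChain's actual splitter; the function name `split_with_overlap` and the sizes are assumptions chosen for illustration.

```python
def split_with_overlap(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Cut text into fixed-size chunks; each chunk repeats the tail of the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


page = "Before an agency makes a rule, it normally must notify the public of the proposal."
for chunk in split_with_overlap(page):
    print(repr(chunk))
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.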

Having loaded both the single PDF and a directory of PDFs, let's now load the CSV. 

In [7]:
from langchain.document_loaders.csv_loader import CSVLoader

In [8]:
loader = CSVLoader("Data/Urban_Design_and_Architecture_Awards_Recipients.csv")

In [9]:
csv = loader.load()
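
Judging by the search results later in this notebook, each CSV row becomes its own document: the text is a block of `COLUMN: value` lines, and the metadata records a zero-based row number and the source path. The sketch below imitates that shape with the standard library only. It is an approximation for illustration, not the loader's implementation; `rows_to_documents`, the sample rows, and `sample.csv` are made up.

```python
import io
from csv import DictReader  # imported by name so the notebook's `csv` variable is not overwritten


def rows_to_documents(csv_text: str, source: str) -> list[dict]:
    """Turn each CSV row into a 'KEY: value' text block plus row/source metadata."""
    documents = []
    for row_number, row in enumerate(DictReader(io.StringIO(csv_text))):
        content = "\n".join(f"{key}: {value}" for key, value in row.items())
        documents.append(
            {"page_content": content, "metadata": {"row": row_number, "source": source}}
        )
    return documents


# Two made-up rows that reuse a few column names from the awards file.
sample = (
    "AWARD_WINNER,AWARD_YEAR,COMMUNITY\n"
    "Example Project A,2007,Hamilton\n"
    "Example Project B,2009,Hamilton\n"
)
docs = rows_to_documents(sample, source="sample.csv")
print(docs[0]["page_content"])
print(docs[0]["metadata"])
```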

Having loaded our data, we'll now download and load the embedding model.

In [10]:
!pip install sentence_transformers > /dev/null

In [11]:
from langchain.embeddings import HuggingFaceEmbeddings, SentenceTransformerEmbeddings

In [12]:
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

Let's try embedding some text. Observe the output. Once you've tried it, scroll down to continue.

In [13]:
text = "This is a test document."

In [14]:
embeddings.embed_query(text)

[-0.03833852708339691,
 0.1234646886587143,
 -0.028642937541007996,
 0.05365273728966713,
 0.00884535163640976,
 -0.03983931988477707,
 -0.07300585508346558,
 0.04777132347226143,
 -0.030462520197033882,
 0.05497976765036583,
 0.08505293726921082,
 0.0366566926240921,
 -0.005319987423717976,
 -0.0022331285290420055,
 -0.06071098893880844,
 -0.027237899601459503,
 -0.011351611465215683,
 -0.042437728494405746,
 0.009129906073212624,
 0.100815549492836,
 0.07578731328248978,
 0.06911718100309372,
 0.009857481345534325,
 -0.0018377420492470264,
 0.026249045506119728,
 0.032902419567108154,
 -0.07177435606718063,
 0.028384260833263397,
 0.061709530651569366,
 -0.052529558539390564,
 0.03366165980696678,
 0.07446815818548203,
 0.07536036521196365,
 0.03538402169942856,
 0.06713403761386871,
 0.010798039846122265,
 0.08167023211717606,
 0.01656291075050831,
 0.03283059597015381,
 0.03632567450404167,
 0.002172845648601651,
 -0.09895741194486618,
 0.005046740174293518,
 0.05089650675654411,
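
The printed list is cut off here; for this model the full vector should contain 384 numbers. A single vector means little on its own. What matters is how close two vectors are, and one common measure of closeness is cosine similarity. The sketch below uses tiny made-up vectors in place of real embeddings, and the helper name `cosine_similarity` is ours rather than part of LangChain.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional vectors standing in for real embeddings.
v_query = [0.2, 0.9, 0.1]
v_close = [0.25, 0.8, 0.05]
v_far = [0.9, -0.1, 0.4]
print(cosine_similarity(v_query, v_close) > cosine_similarity(v_query, v_far))
```

With real embeddings you could pass two `embeddings.embed_query(...)` results to the same function to compare two pieces of text.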
 

We now have a working embedding function. Let's install Chroma.

In [15]:
!pip install -U chromadb

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
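
The warning above is harmless, but it reappears every time a `!` shell command runs after the embedding model has been used. As the message itself suggests, setting the environment variable explicitly silences it. A minimal way to do that from Python, ideally in a cell near the top of the notebook:

```python
import os

# Set this before the embedding model is first used.
# "false" turns tokenizer parallelism off, which is what the library falls back to after a fork anyway.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```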




In [16]:
from langchain.vectorstores import Chroma

Let's make a vector store for our loaded documents!

In [17]:
db = Chroma.from_documents(single_pdf, embeddings)

In [18]:
db.add_documents(csv)

['0428f1fe-a741-11ee-a3e2-acde48001122',
 '0428f29e-a741-11ee-a3e2-acde48001122',
 '0428f2da-a741-11ee-a3e2-acde48001122',
 '0428f302-a741-11ee-a3e2-acde48001122',
 '0428f33e-a741-11ee-a3e2-acde48001122',
 '0428f370-a741-11ee-a3e2-acde48001122',
 '0428f398-a741-11ee-a3e2-acde48001122',
 '0428f3ca-a741-11ee-a3e2-acde48001122',
 '0428f3f2-a741-11ee-a3e2-acde48001122',
 '0428f41a-a741-11ee-a3e2-acde48001122',
 '0428f442-a741-11ee-a3e2-acde48001122',
 '0428f46a-a741-11ee-a3e2-acde48001122',
 '0428f492-a741-11ee-a3e2-acde48001122',
 '0428f4b0-a741-11ee-a3e2-acde48001122',
 '0428f4d8-a741-11ee-a3e2-acde48001122',
 '0428f500-a741-11ee-a3e2-acde48001122',
 '0428f51e-a741-11ee-a3e2-acde48001122',
 '0428f546-a741-11ee-a3e2-acde48001122',
 '0428f56e-a741-11ee-a3e2-acde48001122',
 '0428f58c-a741-11ee-a3e2-acde48001122',
 '0428f5b4-a741-11ee-a3e2-acde48001122',
 '0428f5d2-a741-11ee-a3e2-acde48001122',
 '0428f5fa-a741-11ee-a3e2-acde48001122',
 '0428f622-a741-11ee-a3e2-acde48001122',
 '0428f64a-a741-

In [19]:
db.add_documents(many_pdfs)

['076f1208-a741-11ee-a3e2-acde48001122',
 '076f12da-a741-11ee-a3e2-acde48001122',
 '076f130c-a741-11ee-a3e2-acde48001122',
 '076f1334-a741-11ee-a3e2-acde48001122',
 '076f135c-a741-11ee-a3e2-acde48001122',
 '076f1384-a741-11ee-a3e2-acde48001122',
 '076f13a2-a741-11ee-a3e2-acde48001122',
 '076f13ca-a741-11ee-a3e2-acde48001122',
 '076f13e8-a741-11ee-a3e2-acde48001122',
 '076f1406-a741-11ee-a3e2-acde48001122',
 '076f142e-a741-11ee-a3e2-acde48001122',
 '076f144c-a741-11ee-a3e2-acde48001122',
 '076f1474-a741-11ee-a3e2-acde48001122',
 '076f1492-a741-11ee-a3e2-acde48001122',
 '076f14b0-a741-11ee-a3e2-acde48001122',
 '076f14d8-a741-11ee-a3e2-acde48001122',
 '076f14f6-a741-11ee-a3e2-acde48001122',
 '076f1514-a741-11ee-a3e2-acde48001122',
 '076f153c-a741-11ee-a3e2-acde48001122',
 '076f155a-a741-11ee-a3e2-acde48001122',
 '076f1578-a741-11ee-a3e2-acde48001122',
 '076f15a0-a741-11ee-a3e2-acde48001122',
 '076f15be-a741-11ee-a3e2-acde48001122',
 '076f15dc-a741-11ee-a3e2-acde48001122',
 '076f1604-a741-

Let's try retrieving a relevant document.

In [20]:
query = "An award concerning art."
db.similarity_search(query)

[Document(page_content='\ufeffX: 591976.7816\nY: 4790547.4424\nOBJECTID: 86\nAWARD_WINNER: The James North Art Crawl\nPROJECT_DESCRIPTION: On the second Friday evening of every month this event programs the historic James Street North streetscape from Murray to King Street with an eclectic array of gallery openings, performances and outdoor art reflective of the emerging arts community in t\nRECIPIENT: The Gallery and Studies of the James North Community\nAWARD_YEAR: 2007\nCATEGORY: Award of Merit for Visionary Project\nLOCATION: James Street North between Murray and King Streets\nCOMMUNITY: Hamilton\nLATITUDE: 43.2621251\nLONGITUDE: -79.8667415', metadata={'row': 85, 'source': 'Data/Urban_Design_and_Architecture_Awards_Recipients.csv'}),
 Document(page_content='\ufeffX: 591942.8531\nY: 4790007.4241\nOBJECTID: 45\nAWARD_WINNER: Empire Times\nPROJECT_DESCRIPTION: The project is an adaptive re-use of an historic building into a performing arts centre and affordable housing for artists. T

In [21]:
query = "What exceptions does Rule 606(b)(1) contain?"
db.similarity_search(query)

[Document(page_content='2 PEREZ v. MORTGAGE BANKERS ASSN. \nSCALIA , J., concurring in judgment \nadministrators whose zeal might otherwise have carried \nthem to excesses not contemplated in legislation creating\ntheir offices.” United States v. Morton Salt Co., 338 U. S. \n632, 644 (1950). The Act guards against excesses in rule-\nmaking by requiring notice and comment.  Before an \nagency makes a rule, it normally must notify the public of\nthe proposal, invite them to comment on its shortcomings, \nconsider and respond to their arguments, and explain itsfinal decision in a statement of the rule’s basis and pur-\npose. 5 U. S. C. §553(b)–(c); ante, at 2. \nThe APA exempts interpretive rules from these re-\nquirements.  §553(b)(A). But this concession to agencies\nwas meant to be more modest in its effects than it is today.For despite exempting interpretive rules from notice and\ncomment, the Act provides that “the reviewing court \nshall . . . interpret constituti onal and statutory
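
The top hit here discusses rulemaking under the APA rather than Rule 606(b)(1), a reminder that nearest-neighbour search returns the closest chunks, not necessarily the right ones. Because the census PDF, the awards CSV, and the court opinions all share one store, one possible mitigation is to restrict a search to a single source using the metadata. The toy sketch below shows the idea with made-up two-dimensional vectors. The function `top_k`, the record layout, and the paths are all hypothetical, and it ranks by plain dot product rather than whatever metric the real store uses.

```python
def top_k(query_vec, records, k=2, source_prefix=None):
    """Rank records by dot product with the query, optionally keeping one source only."""
    candidates = [
        r for r in records
        if source_prefix is None or r["source"].startswith(source_prefix)
    ]
    scored = sorted(
        candidates,
        key=lambda r: sum(q * x for q, x in zip(query_vec, r["vector"])),
        reverse=True,
    )
    return [r["text"] for r in scored[:k]]


# Made-up vectors; real ones would come from the embedding model.
records = [
    {"text": "award row", "source": "Data/awards.csv", "vector": [0.9, 0.1]},
    {"text": "opinion chunk A", "source": "Data/opinions/a.pdf", "vector": [0.8, 0.3]},
    {"text": "opinion chunk B", "source": "Data/opinions/b.pdf", "vector": [0.1, 0.9]},
]
query_vec = [1.0, 0.0]
print(top_k(query_vec, records))
print(top_k(query_vec, records, source_prefix="Data/opinions/"))
```

Without the filter, the unrelated award row outranks both opinion chunks; with it, only opinion chunks are considered.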

## Week 3

We'll now try making an agent. We'll start by downloading a library to let us run a local language model.

If you're on Windows, a non-Apple Silicon Mac, or Linux, use this command:

In [22]:
!pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir

Collecting llama-cpp-python
  Downloading llama_cpp_python-0.2.26.tar.gz (8.8 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Collecting typing-extensions>=4.5.0 (from llama-cpp-python)
  Downloading typing_extensions-4.9.0-py3-none-any.whl.metadata (3.0 kB)
Collecting numpy>=1.20.0 (from llama-cpp-python)
  Downloading numpy-1.26.2-cp311-cp311-macosx_10_9_x86_64.whl.metadata (61 kB)
Collecting diskcache>=5.6.1 (from llama-cpp-python)
  Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Downloading diskcache-5.6.3-py3-none-any.whl 

If you're on Apple Silicon, use this command instead:

In [None]:
!CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python  --upgrade --force-reinstall --no-cache-dir
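
If you are unsure which of the two commands applies to your machine, Python can tell you. The sketch below is one way to check; the helper names `is_apple_silicon` and `install_command` are ours. Note that a Python build running under Rosetta reports `x86_64` even on Apple Silicon hardware, so treat the result as a hint rather than a guarantee.

```python
import platform


def is_apple_silicon() -> bool:
    """True when this Python process runs natively on an ARM-based Mac."""
    return platform.system() == "Darwin" and platform.machine() == "arm64"


def install_command(apple_silicon: bool) -> str:
    """Return the pip command from this notebook that matches the platform."""
    base = "pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir"
    if apple_silicon:
        return 'CMAKE_ARGS="-DLLAMA_METAL=on" ' + base
    return base


print(install_command(is_apple_silicon()))
```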

Let's import the library and set up a callback that streams the model's output as it is generated.

In [23]:
from langchain.llms import LlamaCpp
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

With that loaded, let's download a language model.

In [28]:
# wget is not installed by default on macOS, so the next cell uses curl instead
# !wget https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_0.gguf

In [35]:
# -L makes curl follow the redirect that Hugging Face issues for this URL. Without it, curl may save
# only the small redirect response (compare the 1167 bytes in the output below, recorded without -L).
!curl -L -O "https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_0.gguf"

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1167  100  1167    0     0      0      0 --:--:-- --:--:-- --:--:--     0 10911      0 --:--:-- --:--:-- --:--:-- 11670
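
A 7B model quantised to 4 bits should be several gigabytes, so the 1167 bytes reported above suggest that this particular run saved something other than the model (for instance, a redirect response). Before trying to load the file, a quick sanity check can save a confusing error later. The sketch below checks the size and the four-byte `GGUF` marker at the start of the file. The helper name `looks_like_gguf` and the 1 MB threshold are arbitrary choices for illustration.

```python
import os


def looks_like_gguf(path: str, min_bytes: int = 1_000_000) -> bool:
    """Cheap sanity check: the file is reasonably large and starts with the GGUF magic bytes."""
    if not os.path.isfile(path) or os.path.getsize(path) < min_bytes:
        return False
    with open(path, "rb") as handle:
        return handle.read(4) == b"GGUF"


print(looks_like_gguf("mistral-7b-instruct-v0.1.Q4_0.gguf"))
```

If this prints `False`, delete the file and download it again before moving on.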


Okay! Let's try loading it.

In [36]:
llm = LlamaCpp(
    model_path="mistral-7b-instruct-v0.1.Q4_0.gguf",
    temperature=0.1,
    max_tokens=200,
    n_ctx=4096,
    top_p=1,
    n_gpu_layers=-1,
    f16_kv=True,
    verbose=True,
    callback_manager=callback_manager
)

llama_model_loader: loaded meta data with 20 key-value pairs and 291 tensors from mistral-7b-instruct-v0.1.Q4_0.gguf (version GGUF V2)
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = mistralai_mistral-7b-instruct-v0.1
llama_model_loader: - kv   2:                       llama.context_length u32              = 32768
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   4:                          llama.block_count u32              = 32
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 14336
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv   7:                 llama.attent

Let's try it out!

In [37]:
llm("Where is Tokyo?")


Answer: Tokyo is the capital and largest city of Japan, located on the eastern coast of Honshu, the largest island of Japan.


llama_print_timings:        load time =   10436.96 ms
llama_print_timings:      sample time =      10.65 ms /    31 runs   (    0.34 ms per token,  2911.62 tokens per second)
llama_print_timings: prompt eval time =   10436.88 ms /     5 tokens ( 2087.38 ms per token,     0.48 tokens per second)
llama_print_timings:        eval time =   25794.73 ms /    30 runs   (  859.82 ms per token,     1.16 tokens per second)
llama_print_timings:       total time =   36379.50 ms


'\nAnswer: Tokyo is the capital and largest city of Japan, located on the eastern coast of Honshu, the largest island of Japan.'
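
The `llama_print_timings` lines are worth a glance, because they tell you how long later cells will take. Throughput is simply the token count divided by the elapsed time: here, 30 generated tokens in 25794.73 ms is about 1.16 tokens per second. The sketch below pulls those figures out of a log excerpt with a regular expression; the pattern and the helper name `tokens_per_second` are our own and cover only the line shapes shown above.

```python
import re

TIMING = re.compile(
    r"llama_print_timings:\s+(?P<name>[a-z ]+?) time =\s+(?P<ms>[\d.]+) ms"
    r"(?: /\s+(?P<count>\d+) (?:runs|tokens))?"
)


def tokens_per_second(log: str) -> dict[str, float]:
    """Map each timed phase that reports a token count to its throughput."""
    rates = {}
    for match in TIMING.finditer(log):
        if match["count"] is not None:
            rates[match["name"]] = int(match["count"]) / (float(match["ms"]) / 1000)
    return rates


log = """\
llama_print_timings:        load time =   10436.96 ms
llama_print_timings: prompt eval time =   10436.88 ms /     5 tokens
llama_print_timings:        eval time =   25794.73 ms /    30 runs
"""
for phase, rate in tokens_per_second(log).items():
    print(f"{phase}: {rate:.2f} tokens/s")
```

At roughly one token per second, prompts of a thousand tokens or more take many minutes to process, which is worth keeping in mind when the agent starts feeding retrieved documents back to the model.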

Having verified the language model works, let's try making a tool for it to access our vector store.

In [38]:
from langchain.chains import RetrievalQA

Let's import what we need and build the tools our agent will use.

In [39]:
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
from langchain.tools import BaseTool
from langchain.chains import LLMMathChain
from langchain.utilities import SerpAPIWrapper
from langchain.agents.agent_toolkits import create_retriever_tool

In [40]:
retriever = db.as_retriever()

In [41]:
tools = [
    create_retriever_tool(
        retriever,
        name="Search knowledge",
        description="Useful for when you need to answer a question. If the user asks a question concerning the Supreme Court or Hamilton, Ontario, find the answer to their question with this tool. Only use this tool once.",
    ),
]

And finally, let's create the agent from our tools.

In [42]:
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=False
)

And let's try it out.

In [43]:
agent.run("Have any businesses in Hamilton recently won a grant for art? If so, what are they?")

Llama.generate: prefix-match hit


 I should use the Search knowledge tool to find out if any businesses in Hamilton have won a grant for art.
Action: Search knowledge
Action Input: "grant for art" and "Hamilton"


llama_print_timings:        load time =   10436.96 ms
llama_print_timings:      sample time =      18.37 ms /    47 runs   (    0.39 ms per token,  2558.52 tokens per second)
llama_print_timings: prompt eval time =  170380.72 ms /   198 tokens (  860.51 ms per token,     1.16 tokens per second)
llama_print_timings:        eval time =   42918.68 ms /    46 runs   (  933.01 ms per token,     1.07 tokens per second)
llama_print_timings:       total time =  213575.59 ms
Llama.generate: prefix-match hit


 I now know that three businesses in Hamilton have won a grant for art.
Final Answer: The Art Gallery of Hamilton Renewal, David Braley & Nancy Gordon Rock Garden - Visitor Centre at the Royal Botanical Garden, and The Residences of Royal Connaught (Phase 1)


llama_print_timings:        load time =   10436.96 ms
llama_print_timings:      sample time =      20.14 ms /    65 runs   (    0.31 ms per token,  3227.73 tokens per second)
llama_print_timings: prompt eval time = 1026936.35 ms /  1150 tokens (  892.99 ms per token,     1.12 tokens per second)
llama_print_timings:        eval time =   62215.84 ms /    64 runs   (  972.12 ms per token,     1.03 tokens per second)
llama_print_timings:       total time = 1089840.93 ms


'The Art Gallery of Hamilton Renewal, David Braley & Nancy Gordon Rock Garden - Visitor Centre at the Royal Botanical Garden, and The Residences of Royal Connaught (Phase 1)'
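
The transcript above shows the text protocol this kind of agent relies on: the model either names a tool (`Action:` plus `Action Input:`) or commits to a `Final Answer:`. Some code then has to read that text and decide what to do next. The sketch below is a simplified stand-in for that parsing step, not LangChain's own parser; the function name `parse_react_step` and the tuple layout are assumptions for illustration.

```python
import re


def parse_react_step(text: str):
    """Classify one model reply as a final answer or a tool call in ReAct text format."""
    final = re.search(r"Final Answer:\s*(.*)", text, re.DOTALL)
    if final:
        return ("final", final.group(1).strip())
    action = re.search(r"Action:\s*(.*?)\s*\n\s*Action Input:\s*(.*)", text, re.DOTALL)
    if action:
        return ("action", action.group(1).strip(), action.group(2).strip())
    raise ValueError("reply contains neither a tool call nor a final answer")


reply = (
    " I should use the Search knowledge tool to find out if any businesses"
    " in Hamilton have won a grant for art.\n"
    "Action: Search knowledge\n"
    'Action Input: "grant for art" and "Hamilton"\n'
)
print(parse_react_step(reply))
```

A small local model can drift from this format, and a reply that matches neither pattern raises an error here; that is one reason to keep tool names and descriptions short and unambiguous.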