# Week 2

*Welcome back! Scroll down to Week 4.*

We first install the required software: LangChain and `pypdf`, the library LangChain's PDF loaders rely on.

In [None]:
!pip install --upgrade pip 
!pip install --upgrade langchain pypdf

In [None]:
!pip install pypdf

We then import LangChain's PDF loaders, which use `pypdf` under the hood.

In [1]:
from langchain.document_loaders import PyPDFLoader, PyPDFDirectoryLoader

Let's first load a single PDF, then a whole directory of PDFs.

In [22]:
loader = PyPDFLoader("Data/2021-census-population-occupied-private-dwellings-community-2001-2021.pdf")

In [23]:
pdf = loader.load_and_split()

In [24]:
loader = PyPDFDirectoryLoader("Data/Supreme Court opinions 2014/")

In [25]:
many_pdfs = loader.load_and_split()
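`load_and_split` reads each PDF and then cuts the text into smaller chunks, so that each chunk can be embedded and retrieved on its own. The sketch below illustrates the general idea with a plain fixed-size sliding window. It is a simplification for illustration, not LangChain's actual splitter, and the function name `split_text`, the sample sentence, and the sizes are all made up.

```python
def split_text(text, chunk_size=40, overlap=10):
    """Split text into fixed-size character chunks that overlap slightly."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Made-up sample sentence standing in for one page of a PDF.
page = "Hamilton's population grew between 2001 and 2021 according to the census."
chunks = split_text(page)
for chunk in chunks:
    print(repr(chunk))
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval later on.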

Having loaded both the single PDF and a directory of PDFs, let's now load the CSV. 

In [16]:
from langchain.document_loaders.csv_loader import CSVLoader

In [18]:
loader = CSVLoader("Data/Urban_Design_and_Architecture_Awards_Recipients.csv")

In [17]:
csv = loader.load()
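Judging by the retrieval output later in this notebook, the CSV loader produces one document per row, with the row written out as `column: value` lines. A minimal sketch of that idea using only the standard library follows. The function `rows_to_documents` and the sample rows are hypothetical; only the column names are borrowed from the notebook's output.

```python
import csv
import io

# Hypothetical two-row sample. The values are invented for illustration.
sample = io.StringIO(
    "AWARD_WINNER,AWARD_YEAR,CATEGORY\n"
    "Example Gallery Renewal,2005,Award of Merit\n"
    "Example Lofts,2011,Award of Excellence\n"
)


def rows_to_documents(handle):
    """Turn each CSV row into one text block of 'column: value' lines."""
    documents = []
    for row_number, row in enumerate(csv.DictReader(handle)):
        content = "\n".join(f"{column}: {value}" for column, value in row.items())
        documents.append({"page_content": content, "metadata": {"row": row_number}})
    return documents


docs = rows_to_documents(sample)
print(docs[0]["page_content"])
```

Keeping the column names inside the text is what lets a query such as "an award concerning art" match a row whose `AWARD_WINNER` mentions an art gallery.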

Having loaded our data, we'll now download and load the embedding model.

In [9]:
!pip install sentence_transformers > /dev/null

In [19]:
from langchain.embeddings import HuggingFaceEmbeddings, SentenceTransformerEmbeddings

In [20]:
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

Let's try embedding some text. Observe the output. Once you've tried it, scroll down to continue.

In [21]:
text = "This is a test document."

In [13]:
embeddings.embed_query(text)

[-0.03833851218223572,
 0.12346473336219788,
 -0.028642993420362473,
 0.05365273728966713,
 0.008845353499054909,
 -0.039839304983615875,
 -0.0730057880282402,
 0.04777121543884277,
 -0.03046245314180851,
 0.05497969686985016,
 0.08505290001630783,
 0.03665666654706001,
 -0.0053199538961052895,
 -0.002233206294476986,
 -0.060711055994033813,
 -0.027237869799137115,
 -0.011351670138537884,
 -0.04243775084614754,
 0.009129978716373444,
 0.10081548243761063,
 0.07578728348016739,
 0.06911721080541611,
 0.009857481345534325,
 -0.0018377539236098528,
 0.026249082759022713,
 0.03290240466594696,
 -0.07177434116601944,
 0.028384318575263023,
 0.06170952320098877,
 -0.05252953618764877,
 0.03366170823574066,
 0.07446815073490143,
 0.07536040991544724,
 0.035383980721235275,
 0.0671340823173523,
 0.01079810131341219,
 0.08167026191949844,
 0.016562892124056816,
 0.0328306183218956,
 0.03632565215229988,
 0.0021728253923356533,
 -0.09895744174718857,
 0.005046744365245104,
 0.05089646205306053,
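The output above is cut short: `all-MiniLM-L6-v2` returns a list of 384 numbers for each piece of text. Texts with similar meaning end up with vectors that point in similar directions. One common way to measure this is cosine similarity, sketched below with made-up three-dimensional vectors rather than real embeddings. The vector store used later may rely on a different metric, such as Euclidean distance.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration only.
doc_about_art = [0.9, 0.1, 0.0]
doc_about_law = [0.1, 0.9, 0.2]
query_about_art = [0.8, 0.2, 0.1]

art_score = cosine_similarity(query_about_art, doc_about_art)
law_score = cosine_similarity(query_about_art, doc_about_law)
print(f"art: {art_score:.3f}, law: {law_score:.3f}")
```

With real embeddings you would pass two results of `embeddings.embed_query(...)` to the same function.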


We now have a working embedding function. Let's install Chroma.

In [14]:
!pip install -U chromadb

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
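This warning shows up repeatedly in the notebook, apparently whenever a `!` shell command runs after the tokenizer has been used. As the message itself suggests, setting the environment variable explicitly should silence it. A minimal sketch, assuming it runs in a cell before the embedding model is first used:

```python
import os

# Set this before the tokenizer is first used, e.g. in the notebook's first cell.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```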




In [26]:
from langchain.vectorstores import Chroma

Let's make a vector store for our loaded documents!

In [27]:
db = Chroma.from_documents(pdf, embeddings)

In [28]:
db.add_documents(csv)

['1a3503d6-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a350430-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a35044e-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a350462-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a350476-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a35048a-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a35049e-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a3504b2-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a3504c6-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a3504d0-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a3504e4-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a3504f8-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a35050c-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a350520-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a35052a-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a35053e-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a350552-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a350566-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a35057a-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a35058e-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a350598-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a3505ac-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a3505c0-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a3505d4-9231-11ee-9aa0-aaa1e4a6faf4',
 '1a3505e8-9231-

In [29]:
db.add_documents(many_pdfs)

['1ac6bf56-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6bfba-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6bfd8-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6bfec-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c000-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c014-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c028-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c03c-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c046-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c05a-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c06e-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c082-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c096-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c0aa-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c0b4-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c0c8-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c0dc-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c0f0-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c104-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c118-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c12c-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c136-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c14a-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c15e-9231-11ee-9aa0-aaa1e4a6faf4',
 '1ac6c172-9231-

Let's try retrieving a relevant document.

In [None]:
query = "An award concerning art."
db.similarity_search(query)

In [None]:
query = "What exceptions does Rule 606(b)(1) contain?"
db.similarity_search(query)
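Conceptually, `similarity_search` embeds the query and returns the stored documents whose vectors lie closest to it. The sketch below shows that idea as a brute-force search over a tiny in-memory store. `TinyVectorStore` is a hypothetical class written for this illustration, the vectors are invented, and Chroma itself uses an index rather than scanning every item.

```python
import heapq
import math


class TinyVectorStore:
    """A toy in-memory store: keeps (vector, text) pairs and scans them all."""

    def __init__(self):
        self._items = []

    def add(self, vector, text):
        self._items.append((vector, text))

    def similarity_search(self, query_vector, k=2):
        """Return the texts of the k items nearest to query_vector."""
        def distance(item):
            vector, _ = item
            return math.dist(vector, query_vector)

        nearest = heapq.nsmallest(k, self._items, key=distance)
        return [text for _, text in nearest]


store = TinyVectorStore()
# Invented two-dimensional "embeddings" standing in for real ones.
store.add([0.9, 0.1], "Art gallery renewal award")
store.add([0.1, 0.9], "Court opinion on jury rules")
store.add([0.8, 0.3], "Arts centre and lofts award")

results = store.similarity_search([1.0, 0.0], k=2)
print(results)
```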

# Week 3

We'll now try making an agent. We'll start by downloading a library to let us run a local language model.

If you're on Windows, a non-Apple Silicon Mac, or Linux, use this command:

In [None]:
!pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir

If you're on Apple Silicon use this command:

In [None]:
!CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python  --upgrade --force-reinstall --no-cache-dir

Let's load the software.

In [30]:
from langchain.llms import LlamaCpp
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

With that loaded, let's download a language model.

In [20]:
!wget https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_0.gguf

--2023-12-03 17:34:52--  https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_0.gguf
Resolving huggingface.co (huggingface.co)... 3.161.213.25, 3.161.213.58, 3.161.213.11, ...
Connecting to huggingface.co (huggingface.co)|3.161.213.25|:443... connected.


HTTP request sent, awaiting response... 302 Found
Location: https://cdn-lfs.huggingface.co/repos/46/12/46124cd8d4788fd8e0879883abfc473f247664b987955cc98a08658f7df6b826/c0fff3ee02f4b8f7296fbb560155b68a13644b12b9e1e761744c05fb637ade7c?response-content-disposition=attachment%3B+filename*%3DUTF-8%27%27mistral-7b-instruct-v0.1.Q4_0.gguf%3B+filename%3D%22mistral-7b-instruct-v0.1.Q4_0.gguf%22%3B&Expires=1701902093&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcwMTkwMjA5M319LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy5odWdnaW5nZmFjZS5jby9yZXBvcy80Ni8xMi80NjEyNGNkOGQ0Nzg4ZmQ4ZTA4Nzk4ODNhYmZjNDczZjI0NzY2NGI5ODc5NTVjYzk4YTA4NjU4ZjdkZjZiODI2L2MwZmZmM2VlMDJmNGI4ZjcyOTZmYmI1NjAxNTViNjhhMTM2NDRiMTJiOWUxZTc2MTc0NGMwNWZiNjM3YWRlN2M%7EcmVzcG9uc2UtY29udGVudC1kaXNwb3NpdGlvbj0qIn1dfQ__&Signature=WEWdZIYW7hpSFrrSUICu-0aRcZ7QRpmyoGjfMiaKmqj-TfctLqtifZvgSabwE%7ECmS4agHYbeqZGJhirs30RTLJQgWJ4%7EDSYeuM3qQNd89pvyufknRdeCYmrAD5Kv5qyEFOJKjTfaR62E6hk%7EBZJTTKVvy42pnRVvnGvCjqhgrgc

Okay! Let's try loading it.

In [31]:
llm = LlamaCpp(
    model_path="mistral-7b-instruct-v0.1.Q4_0.gguf",
    temperature=0.8,
    max_tokens=200,
    n_ctx=4096,
    top_p=1,
    n_gpu_layers=-1,
    f16_kv=True,
    verbose=True,
    callback_manager=callback_manager
)

llama_model_loader: loaded meta data with 20 key-value pairs and 291 tensors from mistral-7b-instruct-v0.1.Q4_0.gguf (version GGUF V2)
llama_model_loader: - tensor    0:                token_embd.weight q4_0     [  4096, 32000,     1,     1 ]
llama_model_loader: - tensor    1:              blk.0.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]
llama_model_loader: - tensor    2:              blk.0.attn_k.weight q4_0     [  4096,  1024,     1,     1 ]
llama_model_loader: - tensor    3:              blk.0.attn_v.weight q4_0     [  4096,  1024,     1,     1 ]
llama_model_loader: - tensor    4:         blk.0.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]
llama_model_loader: - tensor    5:            blk.0.ffn_gate.weight q4_0     [  4096, 14336,     1,     1 ]
llama_model_loader: - tensor    6:              blk.0.ffn_up.weight q4_0     [  4096, 14336,     1,     1 ]
llama_model_loader: - tensor    7:            blk.0.ffn_down.weight q4_0     [ 14336,  4096,     1,     1 ]
l

Let's try it out!

In [None]:
llm("Where is Tokyo?")

Having verified the language model works, let's try making a tool for it to access our vector store.

In [32]:
from langchain.chains import RetrievalQA

Let's define our agent.

In [33]:
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
from langchain.tools import BaseTool
from langchain.chains import LLMMathChain
from langchain.utilities import SerpAPIWrapper
from langchain.agents.agent_toolkits import create_retriever_tool

In [34]:
retriever = db.as_retriever()

In [35]:
tools = [
    create_retriever_tool(
        retriever,
        name="Search knowledge",
        description="Useful for when you need to answer a question. If the user asks a question concerning the Supreme Court or Hamilton, Ontario, find the answer to their question with this tool. Only use this tool once.",
    ),
]

In [36]:
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
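A ReAct-style agent like this one asks the model to write either an `Action:` and `Action Input:` pair, which names a tool and its argument, or a `Final Answer:`. The agent then runs the tool, feeds the result back as an `Observation:`, and repeats. The sketch below shows how one such step might be parsed. It is a simplified illustration, not LangChain's own parser, and `parse_agent_step` is a hypothetical name.

```python
import re


def parse_agent_step(llm_output):
    """Classify one model reply as a tool call or a final answer."""
    if "Final Answer:" in llm_output:
        answer = llm_output.split("Final Answer:", 1)[1].strip()
        return {"type": "finish", "answer": answer}

    match = re.search(
        r"Action:\s*(.*?)\s*\nAction Input:\s*(.*)", llm_output, re.DOTALL
    )
    if match is None:
        raise ValueError("Could not find an action or a final answer")

    tool = match.group(1).strip()
    tool_input = match.group(2).strip().strip('"')
    return {"type": "act", "tool": tool, "tool_input": tool_input}


# Sample replies modelled on the agent output shown in this notebook.
step = parse_agent_step(
    " We should use the search knowledge tool.\n"
    "Action: Search knowledge\n"
    'Action Input: "awards for art businesses last year"'
)
final = parse_agent_step(
    " We found 4 businesses.\nFinal Answer: The Art Gallery of Hamilton Renewal"
)
print(step)
print(final)
```

When the model's reply fits neither pattern, the parser raises an error. Small local models drift from the expected format fairly often, so this is worth keeping in mind when a run fails.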

And let's try it out.

In [None]:
agent.run("Who, in Hamilton, won an award for art?")

# Week 4

Welcome back! We'll now try making an interface for our agent. First, let's download Gradio.

In [27]:
!pip install gradio==3.48.0

Next, let's define a `predict` function. Gradio calls it with each new message and the chat history, which lets our interface talk to our agent.

In [41]:
import gradio as gr

history = []

def predict(message, history):
    response = agent.run("<s>[INST] " + message + " [/INST]")
    return response
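If the agent raises an exception, for example because the model's reply cannot be parsed, that error propagates out of `predict` into Gradio. One possible way to keep the chat usable is to catch the exception and return a message instead. The sketch below assumes this design. `StubAgent`, `format_instruction`, and `make_predict` are hypothetical names used so the example runs without a model; in the notebook you would pass the real `agent`.

```python
def format_instruction(message):
    """Wrap a user message in the instruction markers used in this notebook."""
    return "<s>[INST] " + message + " [/INST]"


class StubAgent:
    """Stand-in for the real agent so the sketch runs on its own."""

    def run(self, prompt):
        if "fail" in prompt:
            raise ValueError("Could not parse LLM output")
        return "stub answer for: " + prompt


def make_predict(agent):
    """Build a Gradio-style callback that never lets an exception escape."""
    def predict(message, history):
        try:
            return agent.run(format_instruction(message))
        except Exception as error:
            return "Sorry, something went wrong: " + str(error)
    return predict


predict = make_predict(StubAgent())
print(predict("Who won an award?", []))
print(predict("please fail", []))
```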

With that done, let's try loading an interface in just a few lines of code.

In [50]:
gr.ChatInterface(predict, theme="soft", title="My Fancy Chatbot").launch(share=True)

Running on local URL:  http://127.0.0.1:7868
Running on public URL: https://e8d93b29f04146810c.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)

 It does not appear that there was an award given for an artist specifically in Hamilton, Ontario. I should check the source of the data to see if it contains any information about artists specifically from Hamilton.
Action: Search knowledge
Action Input: "Artists from Hamilton, Ontario"
 It does not appear that there was an award given for an artist specifically in Hamilton, Ontario. I should check the source of the data to see if it contains any information about artists specifically from Hamilton.
Action: Search knowledge
Action Input: "Artists from Hamilton, Ontario"
Observation: [Document(page_content="\ufeffX: 591517.3657\nY: 4789987.399\nOBJECTID: 99\nAWARD_WINNER: The Art Gallery of Hamilton Renewal\nPROJECT_DESCRIPTION: Alterations and additions to the 1970's era Art Gallery building to provide more income generating space, resolve building envelope concerns and improve the presence of the building in the city.\nRECIPIENT: Kuwabara Payne McKenna Blumbering Architects Inc., Yolles Partnership Inc., Smith & Anderson Consulting Engineers, Carinci Burt Rogers Engineering, Haisall Associates Ltd.


llama_print_timings:        load time =     234.60 ms
llama_print_timings:      sample time =       8.21 ms /    61 runs   (    0.13 ms per token,  7429.96 tokens per second)
llama_print_timings: prompt eval time =   16901.42 ms /  1122 tokens (   15.06 ms per token,    66.38 tokens per second)
llama_print_timings:        eval time =    1970.18 ms /    60 runs   (   32.84 ms per token,    30.45 tokens per second)
llama_print_timings:       total time =   19220.88 ms
Traceback (most recent call last):
  File "/Users/srhm/Library/jupyterlab-desktop/jlab_server/lib/python3.8/site-packages/gradio/routes.py", line 534, in predict
    output = await route_utils.call_process_api(
  File "/Users/srhm/Library/jupyterlab-desktop/jlab_server/lib/python3.8/site-packages/gradio/route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "/Users/srhm/Library/jupyterlab-desktop/jlab_server/lib/python3.8/site-packages/gradio/blocks.py", line 1550, in process



> Entering new AgentExecutor chain...


Llama.generate: prefix-match hit


 To find out which businesses won awards for art last year, we should use the search knowledge tool.
Action: Search knowledge
Action Input: "awards for art businesses last year"
 To find out which businesses won awards for art last year, we should use the search knowledge tool.
Action: Search knowledge
Action Input: "awards for art businesses last year"
Observation: [Document(page_content="\ufeffX: 591517.3657\nY: 4789987.399\nOBJECTID: 99\nAWARD_WINNER: The Art Gallery of Hamilton Renewal\nPROJECT_DESCRIPTION: Alterations and additions to the 1970's era Art Gallery building to provide more income generating space, resolve building envelope concerns and improve the presence of the building in the city.\nRECIPIENT: Kuwabara Payne McKenna Blumbering Architects Inc., Yolles Partnership Inc., Smith & Anderson Consulting Engineers, Carinci Burt Rogers Engineering, Haisall Associates Ltd., PCL Constructors Canada Inc\nAWARD_YEAR: 2005\nCATEGORY: Award of Merit in


llama_print_timings:        load time =     234.60 ms
llama_print_timings:      sample time =       7.70 ms /    42 runs   (    0.18 ms per token,  5455.25 tokens per second)
llama_print_timings: prompt eval time =     314.22 ms /    16 tokens (   19.64 ms per token,    50.92 tokens per second)
llama_print_timings:        eval time =     767.96 ms /    42 runs   (   18.28 ms per token,    54.69 tokens per second)
llama_print_timings:       total time =    1200.02 ms
Llama.generate: prefix-match hit


 We found 4 businesses that won awards for art last year.
Final Answer: The Art Gallery of Hamilton Renewal, The James North Art Crawl, Arts Centre and Lofts, Empire Times
 We found 4 businesses that won awards for art last year.
Final Answer: The Art Gallery of Hamilton Renewal, The James North Art Crawl, Arts Centre and Lofts, Empire Times

> Finished chain.



llama_print_timings:        load time =     234.60 ms
llama_print_timings:      sample time =       5.46 ms /    43 runs   (    0.13 ms per token,  7879.79 tokens per second)
llama_print_timings: prompt eval time =   13512.20 ms /  1109 tokens (   12.18 ms per token,    82.07 tokens per second)
llama_print_timings:        eval time =    1069.13 ms /    42 runs   (   25.46 ms per token,    39.28 tokens per second)
llama_print_timings:       total time =   14842.22 ms


Super! It works. We can now talk to an agent hooked up to our data through a fancy web interface.