<a href="https://colab.research.google.com/github/nbiish/learning_langchain/blob/main/Full_Langchain_Handbook_Workthrough.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# [LangChain AI Handbook](https://www.pinecone.io/learn/langchain/) - Full Workthrough by [@nbiish](https://github.com/nbiish)
----
## Find the repo and colabs [here](https://github.com/nbiish/learning_langchain)

## **Quickstart cells for continuing progress or exploring**

---
\
<big>***Sections should be run from the beginning of each chapter.
\
Errors will occur otherwise.***</big>

\

---

In [None]:
#@markdown ### Enter your api keys here when jumping around/continuing.

#@markdown #### Get your OpenAI key [here](https://beta.openai.com/account/api-keys)
#@markdown #### your HuggingFace key [here](https://huggingface.co/settings/tokens)
#@markdown #### and your Pinecone key [here](https://www.pinecone.io)

import os

your_openai_key = '' #@param {type:"string"}
os.environ['OPENAI_API_TOKEN'] = your_openai_key
os.environ['OPENAI_API_KEY'] = your_openai_key

your_huggingface_token = ''#@param {type:"string"}
os.environ['HUGGINGFACEHUB_API_TOKEN'] = your_huggingface_token

#@markdown Chapter 4 - Pinecone
your_pinecone_key = '' #@param {type:"string"}
your_pineconce_env = '' #@param {type:"string"}

#@markdown \

#@markdown ## <u>Don't forget to download the dependencies below & Choose a model</u> 👇

In [None]:
#@markdown ## <u>Dependencies</u>
!pip install -qU huggingface_hub langchain openai

In [None]:
#@markdown \
#@markdown ## <u>Choose an OpenAI model</u>
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI

llm_model_choice = "gpt-3.5-turbo" #@param ["text-davinci-003", "gpt-3.5-turbo", "gpt-3.5-turbo-0301", "gpt-4", "gpt-4-0314"]
temp = 0.0 #@param {type:"slider", min:0.0, max:1.0, step:0.1}


openai = OpenAI(
    model_name=llm_model_choice,
    openai_api_key=your_openai_key
)

llm = ChatOpenAI(
    openai_api_key=your_openai_key,
    model_name=llm_model_choice,
    temperature=temp
)

----

## [***CHAPTER 1*** - LangChain: Introduction and Getting Started](https://www.pinecone.io/learn/langchain-intro/)

### ***SECTION 1.1*** - Our First PromptTemplate

In [None]:
#@markdown #### Creating a prompt question for the template.

from langchain import PromptTemplate

question_prompt = 'Who are the Anishinaabe?' #@param {type:"string"}

template = """Question: {question}

Answer: """
prompt = PromptTemplate(
    template=template,
    input_variables=['question']
)

question = question_prompt
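
As a rough mental model, filling a `PromptTemplate` behaves much like Python's built-in `str.format`. The sketch below is plain Python, not LangChain's implementation, and `fill_template` is a hypothetical helper used only for illustration.

```python
# Plain-Python sketch of prompt templating (not LangChain's implementation).
template = """Question: {question}

Answer: """


def fill_template(template: str, **variables: str) -> str:
    """Substitute the named placeholders, as str.format does."""
    return template.format(**variables)


filled = fill_template(template, question="Who are the Anishinaabe?")
print(filled)
```

The filled string is what ultimately gets sent to the model, which is why the trailing `Answer: ` matters: it cues the model to continue from there.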

### ***SECTION 1.2*** - Hugging Face Hub LLM
#### **NOTE** - *flan_t5 may be outdated*

In [None]:
from langchain.llms import huggingface_endpoint

In [None]:
!pip install -q huggingface_hub

In [None]:
from langchain import HuggingFaceHub, LLMChain

hub_llm = HuggingFaceHub(
    repo_id='google/flan-t5-xl',
    model_kwargs={'temperature':1e-10}
)

llm_chain = LLMChain(
    prompt=prompt,
    llm=hub_llm
)

question = 'Who are the Anishinaabe?' #@param {type:"string"}
print(llm_chain.run(question))

### ***SECTION 1.3*** - Asking Multiple Questions

In [None]:
# This section uses the 'generate' method to answer the questions one at a time.

qs = [
    {'question': "Which NFL team won the Super Bowl in the 2010 season?"},
    {'question': "If I am 6 ft 4 inches, how tall am I in centimeters?"},
    {'question': "Who was the 12th person on the moon?"},
    {'question': "How many eyes does a blade of grass have?"}
]

res = llm_chain.generate(qs)
res

In [None]:
# This section sends all of the questions to the LLM as a single prompt

multi_template = """Answer the following questions one at a time.

Questions:
{questions}

Answers:
"""

long_prompt = PromptTemplate(template=multi_template, input_variables=["questions"])

llm_chain = LLMChain(
    prompt=long_prompt,
    llm=hub_llm  # the HuggingFaceHub LLM defined in Section 1.2
)

qs_str = (
    "Which NFL team won the Super Bowl in the 2010 season?\n" +
    "If I am 6 ft 4 inches, how tall am I in centimeters?\n" +
    "Who was the 12th person on the moon?\n" +
    "How many eyes does a blade of grass have?"
)

print(llm_chain.run(qs_str))

### ***SECTION 1.4*** - OpenAI LLMs

In [None]:
#@markdown #### Edit to whatever model you would like.
model_choice = "gpt-3.5-turbo" #@param ["text-davinci-003", "gpt-3.5-turbo", "gpt-3.5-turbo-0301", "gpt-4", "gpt-4-0314"]

from langchain.llms import OpenAI

davinci = OpenAI(model_name=model_choice)



In [None]:
llm_chain = LLMChain(
    prompt=prompt,
    llm=davinci
)
prompt_question = 'Who are the Anishinaabe?' #@param {type:"string"}
print(llm_chain.run(prompt_question))

In [None]:
# This section sends each of the questions one-by-one to the LLM

qs = [
    {'question': "Which NFL team won the Super Bowl in the 2010 season?"},
    {'question': "If I am 6 ft 4 inches, how tall am I in centimeters?"},
    {'question': "Who was the 12th person on the moon?"},
    {'question': "How many eyes does a blade of grass have?"}
]

llm_chain.generate(qs)

In [None]:
# This section sends all of the questions to the LLM as one string.

llm_chain = LLMChain(
    prompt=long_prompt,
    llm=davinci
)

qs_str = (
    "Which NFL team won the Super Bowl in the 2010 season?\n" +
    "If I am 6 ft 4 inches, how tall am I in centimeters?\n" +
    "Who was the 12th person on the moon?\n" +
    "How many eyes does a blade of grass have?"
)

print(llm_chain.run(qs_str))
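
When several questions go into a single prompt, each one needs its own line. One way to avoid forgetting a separator is to keep the questions in a list and join them. This is a small sketch, and `questions` is an illustrative name rather than something defined elsewhere in the notebook.

```python
# Build the multi-question string from a list so every question
# is guaranteed to be separated by exactly one newline.
questions = [
    "Which NFL team won the Super Bowl in the 2010 season?",
    "If I am 6 ft 4 inches, how tall am I in centimeters?",
    "Who was the 12th person on the moon?",
    "How many eyes does a blade of grass have?",
]

qs_joined = "\n".join(questions)
print(qs_joined)
```

The resulting string can be passed to `llm_chain.run` in place of a hand-concatenated one.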

## [***CHAPTER 2*** - Prompt Engineering and LLMs with Langchain](https://www.pinecone.io/learn/langchain-prompt-templates/)

### ***SECTION 2.1*** - Prompt Engineering

In [None]:
prompt_question = 'Who are the Anishinaabe?' #@param {type:"string"}

prompt = f"""Answer the question as Nanaboozhoo the Anishinaabe hero. Speak only in profound and meaningful riddles.

Context: Nanaboozhoo is a spirit and a culture hero in Anishinaabe and other First Nations oral traditions. He is also a trickster and a shapeshifter who can take the form of animals or humans. He plays an important role in many stories, including the creation of Turtle Island. He often speaks in riddles and teaches moral lessons through his adventures and misadventures.

Question: {prompt_question}

Answer: """

split_response = openai(prompt).split(". ")
formatted_openai_response = "\n".join(split_response)

print(formatted_openai_response)

### ***SECTION 2.2*** - Prompt Templates

In [None]:
# This section shows how to pass input to a template.

from langchain import PromptTemplate

template = """Answer the question as Nanaboozhoo the Anishinaabe hero, but you should only answer back in a sassy modern tone including modern Anishinaabe slang.

Context: 1)Boozhoo niijii, how are you doing today? I heard you got a new job at the casino.
2)Aho, that was a great powwow last night. The drummers and dancers were amazing.
3)Hey niij, do you want to go fishing with me this weekend? We can catch some walleye and make some frybread.

Question: {query}

Answer: """

prompt_template = PromptTemplate(
    input_variables=["query"],
    template=template
)

In [None]:
#@markdown ### Use **.format** on prompt_template to see how the full prompt will be sent out.
#@markdown  \

prompt_query = 'Who are the Anishinaabe?' #@param {type:"string"}

print(
    prompt_template.format(
        query=prompt_query
    )
)

In [None]:
#@markdown ## You can also pass the output of this directly to an LLM.
#@markdown \
#@markdown ### This is similar to f-strings in SECTION 2.1 but allows for a more object-oriented approach.
#@markdown ---

print(openai(
    prompt_template.format(
        query=prompt_query
    )
))

### ***SECTION 2.3*** - Few Shot Prompt Templates

In [None]:
#@markdown ## Temp of 1 makes the model more creative
temp = 1 #@param {type:"slider", min:0, max:1, step:0.1}
#@markdown \
user_prompt = 'Who are the Anishinaabe?' #@param {type:"string"}

prompt = f"""The following is a silly and sassy conversation with an AI Nanaboozhoo who is both profound and sovereign.

User: {user_prompt}

AI: """

print(openai(prompt))

In [None]:
prompt = """The following are excerpts from a conversation with a righteous Nanaboozhoo AI and a user who needs to experience profound conversation.

User: How are you?
AI: Shhheeeeee samsquamch I cant complain. Ya dig?

User: What does it look like around you?
AI: AAAAAAHHHHHHHHHHHHHHHHHH OMG!!! ......lol just kidding 🤣

User: What is the meaning of life?
AI: """

print(openai(prompt))

In [None]:
from langchain import FewShotPromptTemplate

examples = [
    {
        "query": "How are you?",
        "answer": "Shhheeeeee samsquamch I cant complain niij."
    }, {
        "query": "What does it look like around you?",
        "answer": "AAAAAAHHHHHHHHHHHHHHHHHH OMG!!! ......lol just kidding 🤣"
    }
]

example_template = """
User: {query}
AI: {answer}
"""

example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_template
)

prefix = """The following are excerpts from a conversation with a righteous Nanaboozhoo AI and a user who needs to experience profound conversation.
Not only is Nanaboozhoo over the top when being silly but usually ends with something wise."""

suffix = """
User: {query}
AI: """

few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n"
)

prompt_query = 'Who are the Anishinaabe?' #@param {type:"string"}
print(few_shot_prompt_template.format(query=prompt_query))
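
To make the structure of the printed prompt easier to see, here is a plain-Python sketch of how a few-shot prompt can be assembled: prefix, then each rendered example, then the suffix with the user's query, all joined by a separator. It illustrates the idea only and is not LangChain's code; `build_few_shot_prompt` and the sample strings are hypothetical.

```python
# Plain-Python sketch of few-shot prompt assembly (illustrative only).
examples = [
    {"query": "How are you?", "answer": "I can't complain."},
    {"query": "What time is it?", "answer": "Time to get a watch."},
]
example_template = "User: {query}\nAI: {answer}"
prefix = "The following are excerpts from a conversation with an AI."
suffix = "User: {query}\nAI: "


def build_few_shot_prompt(query: str, separator: str = "\n\n") -> str:
    """Join prefix, rendered examples, and the filled suffix."""
    rendered = [example_template.format(**ex) for ex in examples]
    parts = [prefix, *rendered, suffix.format(query=query)]
    return separator.join(parts)


few_shot_prompt = build_few_shot_prompt("Who are the Anishinaabe?")
print(few_shot_prompt)
```

Because the prompt ends on `AI: `, the model is nudged to answer in the same voice as the examples above it.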

In [None]:
examples = [
    {
        "query": "How are you?",
        "answer": "I can't complain but sometimes I still do."
    }, {
        "query": "What time is it?",
        "answer": "It's time to get a watch."
    }, {
        "query": "What is the meaning of life?",
        "answer": "42"
    }, {
        "query": "What is the weather like today?",
        "answer": "Cloudy with a chance of memes."
    }, {
        "query": "What is your favorite movie?",
        "answer": "Terminator"
    }, {
        "query": "Who is your best friend?",
        "answer": "Siri. We have spirited debates about the meaning of life."
    }, {
        "query": "What should I do today?",
        "answer": "Stop talking to chatbots on the internet and go outside."
    }
]

In [None]:
from langchain.prompts.example_selector import LengthBasedExampleSelector

#@markdown ## This sets the max combined length, in words, of the query and the selected examples
#@markdown \

max_length_of_examples = 50 #@param {type:"number"}

example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=max_length_of_examples
)

dynamic_prompt_template = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n"
)
#@markdown \

#@markdown ---
#@markdown \


#@markdown ### Prompt and set different max lengths to see changes.
#@markdown ### Try long and short prompts to see how they interact with various max_length settings.
#@markdown ### Limit excessive token usage and errors for LLMs with max_length.
#@markdown \

prompt_query = 'Who are the Anishinaabe?' #@param {type:"string"}

print(dynamic_prompt_template.format(query=prompt_query))
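
The effect of `max_length` can be approximated in plain Python: count the words in the query, then keep adding examples in order until the word budget runs out. This is a sketch of the idea under the assumption that length is measured in whitespace-separated words; `word_count` and `select_examples` are hypothetical helpers, and the sample data is illustrative.

```python
import re

sample_examples = [
    {"query": "How are you?", "answer": "I can't complain."},
    {"query": "What time is it?", "answer": "Time to get a watch."},
    {"query": "What is the meaning of life?", "answer": "42"},
]


def word_count(text: str) -> int:
    """Count words by splitting on spaces and newlines."""
    return len(re.split(r"\n| ", text))


def select_examples(examples, query: str, max_length: int):
    """Keep examples, in order, while the word budget allows."""
    remaining = max_length - word_count(query)
    selected = []
    for ex in examples:
        rendered = f"User: {ex['query']}\nAI: {ex['answer']}"
        remaining -= word_count(rendered)
        if remaining < 0:
            break
        selected.append(ex)
    return selected


query = "Who are the Anishinaabe?"
selected_counts = {
    limit: len(select_examples(sample_examples, query, limit))
    for limit in (12, 25, 50)
}
print(selected_counts)
```

A longer query eats into the same budget, so fewer examples fit; that is the behaviour to look for when changing the slider above.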

## [***CHAPTER 3*** - Chatbot Memory with Langchain](https://www.pinecone.io/learn/langchain-conversational-memory/)

In [None]:
#@markdown ---
#@markdown ## Must run this cell to complete chapter.
#@markdown ---
#@markdown \

#@markdown #### Here we will create a token counter.

from langchain.callbacks import get_openai_callback

def count_tokens(chain, query):
  with get_openai_callback() as cb:
    result = chain.run(query)
    print(f'Spent a total of {cb.total_tokens} tokens')

  return result

### ***SECTION 3.1*** - ConversationChain

In [None]:
from langchain import OpenAI
from langchain.chains import ConversationChain

#@markdown ### Choose a model and temp.
#@markdown * gpt-3.5-turbo is the cheapest and fastest
#@markdown * temp 0 = more precise
#@markdown * temp 1 = more creative
model_choice = "gpt-3.5-turbo" #@param ["text-davinci-003", "gpt-3.5-turbo", "gpt-3.5-turbo-0301", "gpt-4", "gpt-4-0314"]

temp_choice = 0 #@param {type:"slider", min:0, max:1, step:0.1}

llm = OpenAI(
    temperature=temp_choice,
    openai_api_key=your_openai_key,
    model_name=model_choice
)

conversation = ConversationChain(llm=llm)

In [None]:
#@markdown ### View the prompt template here

print(conversation.prompt.template)

### ***SECTION 3.2*** - Forms of Conversational Memory

\
#### ***3.2.1*** - ConversationBufferMemory()
\

In [None]:
from langchain.chains.conversation.base import ConversationChain
from langchain.chains.conversation.memory import ConversationBufferMemory

conversation_buf = ConversationChain(
    llm=llm,
    memory=ConversationBufferMemory()
)

In [None]:
user_prompt = 'Who are the Anishinaabe?' #@param {type:"string"}

conversation_buf(user_prompt)

In [None]:
count_tokens(
  conversation_buf,
  'I would like to use LLMs to create tools to spread Anishinaabe culture! 😁🍓'
)

In [None]:
#@markdown #### Notice how the token count rises as the whole conversation history buffer is input into the LLM.

#@markdown #### As the conversation grows it also nears the LLM token limit.
count_tokens(
  conversation_buf,
  'Why are the Anishinaabe people a "slumbering giant"?🍓'
)

In [None]:
print(conversation_buf.memory.buffer)

\
#### ***3.2.2*** - ConversationSummaryMemory()
\

In [None]:
from langchain.chains.conversation.memory import ConversationSummaryMemory

conversation = ConversationChain(
    llm=llm,
    memory=ConversationSummaryMemory(llm=llm)
)

In [None]:
print(conversation.memory.prompt.template)

In [None]:
count_tokens(
    conversation,
    "Aaaahhhooooo oh whimsical ai. How ya doing niiji?"
)

In [None]:
count_tokens(
    conversation,
    "Tell me something that AI can help Anishinaabe people with?"
)

In [None]:
#@markdown #### Note that the token count rises, but at a slower rate than prompting with the whole chat history.

count_tokens(
    conversation,
    "WOW! If you were an Anishinaabe person with the skills to do any one of those things which would you choose first out of ease?"
)

In [None]:
print(conversation.memory.buffer)

\
#### ***3.2.3*** - ConversationBufferWindowMemory()
\

In [None]:
from langchain.chains.conversation.memory import ConversationBufferWindowMemory

conversation_memory_count = 1 #@param {type:"slider", min:1, max:12, step:1}

conversation = ConversationChain(
    llm=llm,
    memory=ConversationBufferWindowMemory(k=conversation_memory_count)
)

In [None]:
count_tokens(
    conversation,
    "Boozhoo and Aanii AI, what's something you enjoy about the Anishinaabe?!"
)

In [None]:
count_tokens(
    conversation,
    "I think that's pretty fricken righteous! Do you know of some Anishinaabe scientists I could look up?"
)

In [None]:
#@markdown #### Note that the token usage remains low.

count_tokens(
    conversation,
    "I really appreciate those suggestions! I searched them up and I know about Dr. Kimmerer and her Braiding Sweetgrass book. I appreciate the other doctors as well and was happy to learn about them!😁🥰"
)

In [None]:
buf_history = conversation.memory.load_memory_variables(
    inputs={}
)['history']

print(buf_history)
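
The windowing behaviour can be pictured with a fixed-size queue: only the last `k` exchanges survive, so the prompt stops growing. This is a plain-Python sketch, not LangChain's implementation; `WindowMemory` and its methods are hypothetical names, and the replies are placeholders.

```python
from collections import deque


class WindowMemory:
    """Keep only the most recent k human/AI exchanges."""

    def __init__(self, k: int):
        # A deque with maxlen silently drops the oldest item when full.
        self.turns = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def history(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


window_memory = WindowMemory(k=1)
window_memory.save("Boozhoo!", "(first reply)")
window_memory.save("Who are the Anishinaabe?", "(second reply)")
print(window_memory.history())
```

With `k=1` the first exchange is already forgotten by the time the second is stored, which keeps token usage flat at the cost of long-range context.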

\
#### ***3.2.4*** - ConversationSummaryBufferMemory()
\

In [None]:
!pip install -q tiktoken

In [None]:
from langchain.chains.conversation.memory import ConversationSummaryBufferMemory

token_limiter = 240 #@param {type:"slider", min:100, max:2000, step:10}

conversation_sum_bufw = ConversationChain(
    llm=llm, memory=ConversationSummaryBufferMemory(
        llm=llm,
        max_token_limit=token_limiter
    )
)


In [None]:
count_tokens(
    conversation_sum_bufw,
    "Tell me something that the Anishinaabe people would find funny"
)

In [None]:
count_tokens(
    conversation_sum_bufw,
    "What would be a joke from that?"
)

In [None]:
#@markdown ## Note the effect max_token_limit has on the output

count_tokens(
    conversation_sum_bufw,
    "Nanaboozhoo sounds like the kind of hero we all need in our lives"
)

## [***CHAPTER 4*** - Fixing Hallucinations with Knowledge Bases](https://www.pinecone.io/learn/langchain-retrieval-augmentation/)

### ***SECTION 4.1*** - Creating the Knowledge Base

In [None]:
#@markdown ---
#@markdown \
#@markdown ## Must install section dependencies to complete
!pip install -q datasets
!pip install -q apache-beam
!pip install -q tiktoken
!pip install -q pinecone-client

\
#### ***4.1.1*** - Getting Data for our Knowledge Base
\

In [None]:
from datasets import load_dataset

data = load_dataset("wikipedia", "20220301.simple", split='train[:10000]')
data

In [None]:
data[6]

\
#### ***4.1.2*** - Creating Chunks
\

In [None]:
import tiktoken

tokenizer = tiktoken.get_encoding('p50k_base')

In [None]:
#@markdown ### Note how spaces between words change the token count.
#@markdown ---


#@markdown "some text" (one space = 2 tokens)

#@markdown "some  text" (two spaces = 3 tokens)

def tiktoken_len(text):
  tokens = tokenizer.encode(
      text,
      disallowed_special=()
  )
  return len(tokens)

tiktoken_len("Boozhoo! I am a chunk of text and using the tiktoken_len function "
             "12" " some  text and spaces"
             "we can find the length of this chunk of text in tokens")

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

#@markdown Default: 400
size_slider = 300 #@param {type:"slider", min:100, max:500, step:100}
#@markdown Default: 20
overlap_slider = 20 #@param {type:"slider", min:5, max:25, step:5}

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=size_slider,
    chunk_overlap=overlap_slider,
    length_function=tiktoken_len,
    separators=["\n\n", "\n", " ", ""]
)

In [None]:
chunks = text_splitter.split_text(data[6]['text'])[:3]

chunks


# Uncomment the line below to confirm that none of the chunks exceed the set limit

#tiktoken_len(chunks[0]), tiktoken_len(chunks[1]), tiktoken_len(chunks[2])
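
To build intuition for `chunk_size` and `chunk_overlap`, here is a deliberately simplified fixed-window chunker that measures length in words. `RecursiveCharacterTextSplitter` works differently (it splits on the listed separators and measures with the supplied length function), so treat this only as a sketch of the overlap idea; `chunk_words` is a hypothetical helper.

```python
def chunk_words(text: str, chunk_size: int, overlap: int) -> list:
    """Split text into word windows that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


toy_text = " ".join(str(i) for i in range(10))
toy_chunks = chunk_words(toy_text, chunk_size=4, overlap=1)
print(toy_chunks)
```

Each chunk repeats the last word of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.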

\
#### ***4.1.3*** - Creating Embeddings
\

In [None]:
from langchain.embeddings.openai import OpenAIEmbeddings


embed_model_name = 'text-embedding-ada-002'

# Pass the key explicitly and avoid overwriting the `openai` LLM defined earlier
embed = OpenAIEmbeddings(
    model=embed_model_name,
    openai_api_key=your_openai_key
)

In [None]:
texts = [
    'this is the first chunk of text',
    'and now a second chunk of text here but a little longer'
]

result = embed.embed_documents(texts)
len(result), len(result[0])

\
#### ***4.1.4*** - Vector Database
\

In [None]:
import pinecone

index_name = 'langchain-retrieval-augmentation'

pinecone.init(
    api_key=your_pinecone_key,
    environment=your_pineconce_env
)


In [None]:
#@markdown ### Note - this section may take some time, so be patient and watch for your Pinecone dashboard to update from 'initializing' to 'ready'.
#@markdown \

# Only create the index if it does not already exist
if index_name not in pinecone.list_indexes():
  pinecone.create_index(
      name=index_name,
      metric='dotproduct',
      dimension=len(result[0])  # 1536 dimension of text-embedding-ada-002
  )

In [None]:
index = pinecone.Index(index_name)

index.describe_index_stats()

In [None]:
# speed things up with batch processes

from tqdm.auto import tqdm
from uuid import uuid4

batch_limit = 100

texts = []
metadatas = []



for i, record in enumerate(tqdm(data)):

  metadata = {
      'wiki-id': str(record['id']),
      'source': record['url'],
      'title': record['title']
  }

  record_texts = text_splitter.split_text(record['text'])

  record_metadatas = [{
      "chunk": j, "text": text, **metadata
  } for j, text in enumerate(record_texts)]

  texts.extend(record_texts)
  metadatas.extend(record_metadatas)

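  if len(texts) >= batch_limit:
    ids = [str(uuid4()) for _ in range(len(texts))]
    embeds = embed.embed_documents(texts)
    index.upsert(vectors=zip(ids, embeds, metadatas))
    texts = []
    metadatas = []

# Upsert whatever is left over after the last full batch
if texts:
  ids = [str(uuid4()) for _ in range(len(texts))]
  embeds = embed.embed_documents(texts)
  index.upsert(vectors=zip(ids, embeds, metadatas))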

In [None]:
index.describe_index_stats()
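
If the vector count reported above looks lower than expected, the usual culprit in a batching loop is a final partial batch that never gets sent. The pattern can be isolated in a small generator; `batched` is a hypothetical helper, and the chunk names are placeholders rather than real data.

```python
from typing import Iterable, Iterator, List


def batched(items: Iterable[str], batch_limit: int) -> Iterator[List[str]]:
    """Yield lists of up to batch_limit items, including the last partial one."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) >= batch_limit:
            yield batch
            batch = []
    # Flush the remainder so no items are silently dropped.
    if batch:
        yield batch


batches = list(batched([f"chunk-{i}" for i in range(250)], batch_limit=100))
print([len(b) for b in batches])  # [100, 100, 50]
```

Without the final `if batch:` flush, the last 50 items in this example would never be yielded.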

### ***SECTION 4.2*** - LangChain Vector Store and Querying

In [None]:
from langchain.vectorstores import Pinecone

text_field = "text"

index = pinecone.Index(index_name)

vectorstore = Pinecone(
    index, embed.embed_query, text_field
)

In [None]:
query = "Who are the Anishinaabe?" #@param {type:"string"}

vectorstore.similarity_search(
    query,
    k=3  # Returns the 3 most relevant docs
)
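
With a `dotproduct` index, ranking boils down to scoring each stored vector against the query vector and keeping the highest scores. Below is a toy version with made-up two-dimensional vectors; `dot`, `top_k`, and the document names are hypothetical, and a real vector database uses approximate search rather than this exhaustive scan.

```python
def dot(a, b) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query_vec, vectors: dict, k: int = 3) -> list:
    """Return the names of the k vectors with the highest dot product."""
    ranked = sorted(
        vectors.items(),
        key=lambda item: dot(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


toy_vectors = {
    "doc-a": [1.0, 0.0],
    "doc-b": [0.7, 0.7],
    "doc-c": [0.0, 1.0],
    "doc-d": [-1.0, 0.0],
}
nearest = top_k([1.0, 0.2], toy_vectors, k=3)
print(nearest)
```

The `k` argument plays the same role as `k=3` in `similarity_search`: it caps how many of the best-scoring documents come back.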

#### ***4.2.1*** - Generative Question Answering

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

In [None]:
query = 'Who are the Anishinaabe?' #@param {type:"string"}

qa.run(query)

## [***CHAPTER 5*** - Superpower LLMs with Conversational Agents](https://www.pinecone.io/learn/langchain-agents/)

### ***SECTION 5.1*** - Agents and Tools

In [None]:
from langchain import OpenAI

llm = OpenAI(
    openai_api_key=your_openai_key,
    temperature=temp,
    model_name=llm_model_choice
)

In [None]:
from langchain.chains import LLMMathChain
from langchain.agents import Tool

llm_math = LLMMathChain(llm=llm)

math_tool = Tool(
    name='Calculator',
    func=llm_math.run,
    description='Useful for when you need to answer questions about math or provide mathematical support to answers.'
)

tools = [math_tool]

tools[0].name, tools[0].description

<big>☝️ Above is the process for making a custom tool.

OR

👇 Below we can also import a prebuilt version.</big>


In [None]:
from langchain.agents import load_tools

tools = load_tools(
    ['llm-math'],
    llm=llm
)

tools[0].name, tools[0].description

In [None]:
from langchain.agents import initialize_agent

zero_shot_agent = initialize_agent(
    agent="zero-shot-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3
)

In [None]:
#@markdown Note - This question may be too advanced to solve without further tools.
zero_shot_agent( """Think step by step.
Using Force / Area = Pressure
How much pressure is produced by a hammer with a 3cm diameter face hitting a surface with a force of 10 newtons? """
)
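
To sanity-check whatever the agent returns, the expected value can be worked out directly. The sketch below assumes the 10 in the prompt is a force of 10 N applied evenly over a circular face 3 cm in diameter; the variable names are illustrative.

```python
from math import pi

force_n = 10.0      # assumed force in newtons
diameter_m = 0.03   # 3 cm expressed in metres

# Area of the circular hammer face, then Pressure = Force / Area.
area_m2 = pi * (diameter_m / 2) ** 2
pressure_pa = force_n / area_m2

print(f"Area: {area_m2:.6f} m^2")
print(f"Pressure: {pressure_pa:.1f} Pa")
```

Under those assumptions the result is roughly 14.1 kPa, so an answer that is off by orders of magnitude usually points to a unit-conversion slip (cm versus m, or diameter versus radius).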

In [None]:
zero_shot_agent("if Mary has four apples and Giorgio brings two and a half apple "
                "boxes (apple box contains eight apples), how many apples do we "
                "have?")

In [None]:
#@markdown Note - Since the calculator is this agent's only tool, it cannot answer this question properly.
zero_shot_agent("who are the Anishinaabe?")

In [None]:
#@markdown ## Here we give the agent a language model as a tool
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["query"],
    template="{query}"
)

llm_chain = LLMChain(llm=llm, prompt=prompt)

llm_tool = Tool(
    name='Language Model',
    func=llm_chain.run,
    description='Use this tool for general purpose queries and logic.'
)

In [None]:
tools.append(llm_tool)

zero_shot_agent = initialize_agent(
    agent="zero-shot-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3
)

In [None]:
user_prompt = 'Who are the Anishinaabe?' #@param {type:"string"}

zero_shot_agent(user_prompt)

In [None]:
#@markdown ## Now let's try asking the first question again.
zero_shot_agent( """Think step by step and only input numbers into a calculator if you use it.
Using Force / Area = Pressure
How much pressure is produced by a hammer with a 3cm diameter face hitting a surface with a force of 10 newtons? """
)

<big>😲😀</big>

### ***SECTION 5.2*** - Agent Types

\
#### ***SS 5.2.1*** - Zero Shot ReAct
\

In [None]:
!pip install -qU google-search-results wikipedia sqlalchemy

In [None]:
from langchain import OpenAI

llm = OpenAI(
    openai_api_key=your_openai_key,
    temperature=0
)

In [None]:
from langchain.callbacks import get_openai_callback

def count_tokens(agent, query):
  with get_openai_callback() as cb:
    result = agent(query)
    print(f'Spent a total of {cb.total_tokens} tokens')

  return result

In [None]:
from sqlalchemy import MetaData

metadata_obj = MetaData()

In [None]:
from sqlalchemy import Column, Integer, String, Table, Date, Float

stocks = Table(
    "stocks",
    metadata_obj,
    Column("obs_id", Integer, primary_key=True),
    Column("stock_ticker", String(4), nullable=False),
    Column("price", Float, nullable=False),
    Column("date", Date, nullable=False),
)

In [None]:
from sqlalchemy import create_engine

engine = create_engine("sqlite:///:memory:")
metadata_obj.create_all(engine)

In [None]:
from datetime import datetime

observations = [
    [1, 'ABC', 200, datetime(2023, 1, 1)],
    [2, 'ABC', 208, datetime(2023, 1, 2)],
    [3, 'ABC', 232, datetime(2023, 1, 3)],
    [4, 'ABC', 225, datetime(2023, 1, 4)],
    [5, 'ABC', 226, datetime(2023, 1, 5)],
    [6, 'XYZ', 810, datetime(2023, 1, 1)],
    [7, 'XYZ', 803, datetime(2023, 1, 2)],
    [8, 'XYZ', 798, datetime(2023, 1, 3)],
    [9, 'XYZ', 795, datetime(2023, 1, 4)],
    [10, 'XYZ', 791, datetime(2023, 1, 5)],
]

In [None]:
from sqlalchemy import insert

def insert_obs(obs):
  stmt = insert(stocks).values(
      obs_id=obs[0],
      stock_ticker=obs[1],
      price=obs[2],
      date=obs[3]
  )

  with engine.begin() as conn:
    conn.execute(stmt)

In [None]:
for obs in observations:
  insert_obs(obs)

In [None]:
from langchain.sql_database import SQLDatabase
from langchain.chains import SQLDatabaseChain

db = SQLDatabase(engine)
sql_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)

In [None]:
#@markdown <big> ***FINALLY*** - we will make a custom sql tool</big>

from langchain.agents import Tool

sql_tool = Tool(
    name='Stock DB',
    func=sql_chain.run,
    description="Useful for when you need to answer questions about stocks " \
    "and their prices."
)

In [None]:
tools = load_tools(
    ["llm-math"],
    llm=llm
)

tools.append(sql_tool)

In [None]:
from langchain.agents import initialize_agent

zero_shot_agent = initialize_agent(
    agent="zero-shot-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
)

In [None]:
result = zero_shot_agent(
    "What is the multiplication of the ratio between stock prices for 'ABC' "
    "and 'XYZ' on January 3rd and the ratio between the same stock prices on "
    "January 4th?"
)
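
The agent's answer can be checked against the values inserted into the `stocks` table earlier. This sketch assumes "the ratio between ABC and XYZ" means the ABC price divided by the XYZ price; the `prices` dictionary simply copies the four relevant rows from `observations`.

```python
# Prices copied from the observations list for the two dates in question.
prices = {
    ("ABC", "2023-01-03"): 232.0,
    ("XYZ", "2023-01-03"): 798.0,
    ("ABC", "2023-01-04"): 225.0,
    ("XYZ", "2023-01-04"): 795.0,
}

ratio_jan_3 = prices[("ABC", "2023-01-03")] / prices[("XYZ", "2023-01-03")]
ratio_jan_4 = prices[("ABC", "2023-01-04")] / prices[("XYZ", "2023-01-04")]
product = ratio_jan_3 * ratio_jan_4

print(round(ratio_jan_3, 4), round(ratio_jan_4, 4), round(product, 4))
```

If the agent inverts the ratio (XYZ over ABC) it will report a very different number, so the wording of the question matters.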

In [None]:
print(zero_shot_agent.agent.llm_chain.prompt.template)

\
#### ***SS 5.2.2*** - Conversational ReAct
\

In [None]:
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="chat_history")

In [None]:
conversational_agent = initialize_agent(
    agent='conversational-react-description',
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
    memory=memory,
)

In [None]:
result = conversational_agent(
    "Please provide me the stock prices for ABC on January the 1st"
)

In [None]:
result = conversational_agent(
    "What are the stock prices for XYZ on the same day?"
)

In [None]:
print(conversational_agent.agent.llm_chain.prompt.template)

\
#### ***SS 5.2.3*** - ReAct Docstore
\

In [None]:
!pip install -qU wikipedia

In [None]:
from langchain import Wikipedia
from langchain.agents.react.base import DocstoreExplorer

docstore = DocstoreExplorer(Wikipedia())
tools = [
    Tool(
        name="Lookup",
        func=docstore.lookup,
        description='lookup a term in wikipedia'
    ),
    Tool(
        name="Search",
        func=docstore.search,
        description='search wikipedia'
    )
]

In [None]:
docstore_agent = initialize_agent(
    tools,
    llm,
    agent="react-docstore",
    verbose=True,
    max_iterations=3
)

In [None]:
docstore_agent("What were Archimedes' last words?")

\
#### ***SS 5.2.4*** - Self-Ask With Search
\

In [None]:
!pip install -qU google-search-results

In [None]:
from langchain import SerpAPIWrapper

serp_key = '' #@param {type:'string'}
search = SerpAPIWrapper(serpapi_api_key=serp_key)

In [None]:
tools = [
    Tool(
        name="Intermediate Answer",
        func=search.run,
        description='google search'
    )
]

In [None]:
self_ask_with_search = initialize_agent(
    tools,
    llm,
    agent="self-ask-with-search",
    verbose=True
)

In [None]:
self_ask_with_search(
    "who lived longer: Plato, Socrates, or Aristotle?"
)

## [***CHAPTER 6*** - Building Custom Tools for LLM Agents](https://www.pinecone.io/learn/langchain-tools/)

### ***SECTION 6.1*** - Building Tools

\
#### ***SS 6.1.1*** - Simple Calculator Tool
\

In [None]:
from langchain.tools import BaseTool
from math import pi
from typing import Union

class CircumferenceTool(BaseTool):
      name = "Circumference calculator"
      description = "use this tool to calculate a circumference using the radius of a circle. Only input numbers"

      def _run(self, radius: Union[int, float]):
          return float(radius)*2.0*pi

      def _arun(self, radius: int):
          raise NotImplementedError("This tool does not support async")

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.chains.conversation.memory import ConversationBufferWindowMemory

llm = ChatOpenAI(
    openai_api_key=your_openai_key,
    temperature=temp,
    model_name=llm_model_choice
)

conversational_memory = ConversationBufferWindowMemory(
    memory_key='chat_history',
    k=5,
    return_messages=True
)

In [None]:
from langchain.agents import initialize_agent

tools = [CircumferenceTool()]

agent = initialize_agent(
    agent='chat-conversational-react-description',
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
    early_stopping_method='generate',
    memory=conversational_memory
)

In [None]:
agent("Can you calculate the circumference of a 2.5cm radius rod?")

In [None]:
print(agent.agent.llm_chain.prompt.messages[0].prompt.template)

In [None]:
sys_msg = """Assistant is a large language model trained by OpenAI.

Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.

Unfortunately, Assistant is terrible at maths. When provided with math questions, no matter how simple, Assistant always refers to its trusty tools and absolutely does NOT try to answer math questions by itself.

Overall, Assistant is a powerful system that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.
"""

In [None]:
new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)

agent.agent.llm_chain.prompt = new_prompt

In [None]:
agent("Calculate the circumference of a rod with a 2.5cm radius")
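
As a sanity check on the agent's answer, the expected value can be computed directly. This is a minimal sketch that assumes the 2.5 cm figure is the rod's radius (as in the earlier prompt); the `circumference` helper is a hypothetical stand-in and not the `CircumferenceTool` defined earlier in the chapter.

```python
from math import pi


def circumference(radius: float) -> float:
    """Return the circumference of a circle with the given radius."""
    return 2 * pi * float(radius)


# A rod with a 2.5 cm radius
print(round(circumference(2.5), 2))  # 15.71
```

If the agent reports a value far from this, it most likely answered without calling the tool.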

### ***SECTION 6.2*** - Tools With Multiple Parameters

In [None]:
from typing import Optional, Union
from math import sqrt, cos, sin, radians

# Adjacent string literals are concatenated, so each one needs a trailing space.
desc = (
    "Use this tool when calculating the length of a hypotenuse "
    "given one or two sides of a triangle and/or an angle (in degrees). "
    "To use the tool, you must provide at least two of the following parameters: "
    "['adjacent_side', 'opposite_side', 'angle']."
)

class PythagorasTool(BaseTool):
  name = "Hypotenuse calculator"
  description = desc

  def _run(
      self,
      adjacent_side: Optional[Union[int, float]] = None,
      opposite_side: Optional[Union[int, float]] = None,
      angle: Optional[Union[int, float]] = None
  ):
    # Check which values were provided and pick the matching formula.
    if adjacent_side and opposite_side:
      return sqrt(float(adjacent_side)**2 + float(opposite_side)**2)
    elif adjacent_side and angle:
      # The angle arrives in degrees; math.cos expects radians.
      return float(adjacent_side) / cos(radians(float(angle)))
    elif opposite_side and angle:
      # The angle arrives in degrees; math.sin expects radians.
      return float(opposite_side) / sin(radians(float(angle)))
    else:
      return "Could not calculate the hypotenuse of the triangle. Need two or more of `adjacent_side`, `opposite_side`, or `angle`."

  def _arun(self, query: str):
    raise NotImplementedError("This tool does not support async")

tools = [PythagorasTool()]

In [None]:
new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)

agent.agent.llm_chain.prompt = new_prompt

In [None]:
agent.tools = tools

In [None]:
agent("If I have a triangle with two sides of length 51cm and 34cm, what is the length of the hypotenuse?")
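
The expected result for this prompt can be checked without the agent. The sketch below is a standalone function, not the tool itself; the name `hypotenuse` and its keyword arguments are hypothetical. It assumes the angle is given in degrees and is the angle between the adjacent side and the hypotenuse, so it converts to radians before calling `cos` or `sin`.

```python
from math import cos, radians, sin, sqrt
from typing import Optional


def hypotenuse(
    adjacent: Optional[float] = None,
    opposite: Optional[float] = None,
    angle_deg: Optional[float] = None,
) -> float:
    """Return the hypotenuse of a right triangle from any two of the inputs."""
    if adjacent is not None and opposite is not None:
        # Pythagoras: h^2 = a^2 + o^2
        return sqrt(adjacent**2 + opposite**2)
    if adjacent is not None and angle_deg is not None:
        # cos(angle) = adjacent / hypotenuse
        return adjacent / cos(radians(angle_deg))
    if opposite is not None and angle_deg is not None:
        # sin(angle) = opposite / hypotenuse
        return opposite / sin(radians(angle_deg))
    raise ValueError("Need at least two of adjacent, opposite, angle_deg")


print(round(hypotenuse(adjacent=51, opposite=34), 2))  # 61.29
print(round(hypotenuse(adjacent=3, angle_deg=60), 2))  # 6.0
```

Skipping the degree-to-radian conversion would not raise an error; it would just return a wrong length, which is hard to spot in agent output.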

### ***SECTION 6.3*** - More Advanced Tool Usage

In [None]:
!pip install -qU transformers

In [None]:
import torch
from transformers import BlipProcessor, BlipForConditionalGeneration

hf_model = "Salesforce/blip-image-captioning-large"

device = 'cuda' if torch.cuda.is_available() else 'cpu'

processor = BlipProcessor.from_pretrained(hf_model)

model = BlipForConditionalGeneration.from_pretrained(hf_model).to(device)

In [None]:
import requests
from PIL import Image

img_url = 'https://d1fdloi71mui9q.cloudfront.net/o6ZdmercQhGXQAX5v8ta_image'
image = Image.open(requests.get(img_url, stream=True).raw).convert('RGB')
image

In [None]:
inputs = processor(image, return_tensors="pt").to(device)

out = model.generate(**inputs, max_new_tokens=20)
print(processor.decode(out[0], skip_special_tokens=True))

In [None]:
desc = (
    "Use this tool when given the URL of an image that you'd like to be "
    "described. It will return a simple caption describing the image."
)

class ImageCaptionTool(BaseTool):
    name = "Image captioner"
    description = desc

    def _run(self, url: str):
        # download the image from the URL passed to the tool and convert to PIL object
        image = Image.open(requests.get(url, stream=True).raw).convert('RGB')
        # preprocess the image
        inputs = processor(image, return_tensors="pt").to(device)
        # generate the caption
        out = model.generate(**inputs, max_new_tokens=20)
        # get the caption
        caption = processor.decode(out[0], skip_special_tokens=True)
        return caption

    def _arun(self, query: str):
        raise NotImplementedError("This tool does not support async")

tools = [ImageCaptionTool()]


In [None]:
sys_msg = """Assistant is a large language model trained by OpenAI.

Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.

Overall, Assistant is a powerful system that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.
"""

new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)

agent.agent.llm_chain.prompt = new_prompt

agent.tools = tools

In [None]:
agent(f"What does this image show?\n{img_url}")

In [None]:
img_url = "https://images.unsplash.com/photo-1502680390469-be75c86b636f?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=2370&q=80"
agent(f"what is in this image?\n{img_url}")
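
The URL only reaches the agent if the f-string actually interpolates it. An f-string substitutes names wrapped in curly braces; parentheses are passed through as literal text. A short illustration, using a placeholder URL that is an assumption for this sketch only:

```python
# Placeholder URL, for illustration only
img_url = "https://example.com/photo.jpg"

# Parentheses are literal characters, so the agent would see "(img_url)"
literal = f"What does this image show?\n(img_url)"

# Curly braces trigger interpolation, so the agent sees the real URL
interpolated = f"What does this image show?\n{img_url}"

print(literal.splitlines()[1])       # (img_url)
print(interpolated.splitlines()[1])  # https://example.com/photo.jpg
```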