
# Building A Language Model Application

Installing all the necessary packages...

In [3]:
!pip -q install langchain huggingface_hub openai



#### Setting up OpenAI model

We will need to load some tokens. I've put them in a file of the following structure:
```
{
"OPENAI_API_KEY": "...",
"HUGGINGFACEHUB_API_TOKEN": "...",
"SERPAPI_API_KEY":"..."
}

```
Then I added this file to the notebook's working directory as `creds.json`.

In [84]:
import json
import os

with open('creds.json') as file:
  creds = json.load(file)

os.environ["OPENAI_API_KEY"] = creds["OPENAI_API_KEY"]
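A slightly more defensive variant of the cell above, as a sketch: it checks that the expected keys are present before exporting them. The helper name `load_creds` and the `DEMO_TOKEN` key are made up for illustration and are not part of LangChain.

```python
import json
import os
import tempfile


def load_creds(path, required=("OPENAI_API_KEY",)):
    """Read a JSON credentials file and export the required keys as environment variables."""
    with open(path) as file:
        creds = json.load(file)
    # Fail early with a readable message instead of a KeyError deep inside a library.
    missing = [key for key in required if not creds.get(key)]
    if missing:
        raise KeyError(f"Missing keys in {path}: {missing}")
    for key in required:
        os.environ[key] = creds[key]
    return creds


# Demo with a throwaway file and a placeholder value (not a real token).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "creds.json")
    with open(demo_path, "w") as file:
        json.dump({"DEMO_TOKEN": "placeholder"}, file)
    demo_creds = load_creds(demo_path, required=("DEMO_TOKEN",))

print(os.environ["DEMO_TOKEN"])
```

In the notebook itself the call would presumably be `load_creds('creds.json')`, with the three key names from the file structure shown earlier.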

Most LLMs expose a temperature parameter, which controls how random the generated text is. A higher temperature produces more varied, creative text, while a lower temperature produces more predictable text.

Let's make our outputs quite predictable, but with a bit of creativity.

The default temperature is typically 0.7 (for example, in LangChain's `OpenAI` wrapper).

In [58]:
overal_temperature = 0.1
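To build some intuition for what this number does, here is a toy sketch of the usual mechanism: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The scores below are made up, and this is an illustration of the idea rather than the actual implementation behind the OpenAI API.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.1)
warm = softmax_with_temperature(logits, 0.7)
hot = softmax_with_temperature(logits, 2.0)
print(cold, warm, hot, sep="\n")
```

At 0.1 almost all of the probability mass sits on the top-scoring token, which is why the outputs in this notebook are close to deterministic.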

You can find a description of the parameters of OpenAI models [here](https://api.python.langchain.com/en/latest/llms/langchain.llms.openai.OpenAI.html).

For the LangChain integration with OpenAI models, see [here](https://python.langchain.com/docs/integrations/llms/openai).

In [99]:
from langchain.llms import OpenAI

llm = OpenAI(temperature=overal_temperature)

### Getting predictions

In [104]:
llm('who am I talking to?')

'\n\nYou are talking to a person who is using this website.'

In [102]:
request = "What are 5 vacation destinations for someone who likes to eat pasta?"
print(llm(request))



1. Rome, Italy
2. Bologna, Italy
3. Naples, Italy
4. Florence, Italy
5. Venice, Italy


### Prompt Templates: Manage prompts for LLMs

In [70]:
from langchain.prompts import PromptTemplate

In [105]:
prompt = PromptTemplate(
    input_variables=["food"],
    template="What are 5 vacation destinations for someone who likes to eat {food}?",
)

In [108]:
print(prompt.format(food="donuts"))

What are 5 vacation destinations for someone who likes to eat donuts?
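For a simple template like this one, the formatting step behaves much like Python's built-in `str.format`. The sketch below uses a hypothetical helper, `format_prompt`, to show the idea without LangChain; the real `PromptTemplate` does more, such as keeping track of the declared input variables.

```python
template = "What are 5 vacation destinations for someone who likes to eat {food}?"


def format_prompt(template, **variables):
    """Fill the named placeholders of a template string."""
    return template.format(**variables)


formatted = format_prompt(template, food="donuts")
print(formatted)
```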


In [107]:
print(llm.predict(prompt.format(food="donuts")))



1. Portland, Oregon - Home to the famous Voodoo Doughnuts.
2. Berlin, Germany - Home to the world-famous Berliner donuts.
3. Tokyo, Japan - Home to the unique and delicious taiyaki donuts.
4. New York City, USA - Home to the iconic Doughnut Plant.
5. Montreal, Canada - Home to the famous St-Viateur Bagel & Donut shop.


### Chains: Combine LLMs and prompts in multi-step workflows

In [None]:
from langchain.chains import LLMChain

In [109]:
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run("fresh fish"))



1. Vancouver, Canada
2. Cancun, Mexico
3. Honolulu, Hawaii
4. San Sebastian, Spain
5. Sydney, Australia
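Conceptually, this chain does two things in sequence: it fills in the prompt template, then sends the result to the model. A minimal sketch of that idea, with a stand-in `fake_llm` function instead of a real model and a hypothetical `SimpleChain` class (neither is a LangChain API):

```python
def fake_llm(prompt_text):
    """Stand-in for a real model call: just reports the prompt it received."""
    return f"[model answer to: {prompt_text}]"


class SimpleChain:
    """Hypothetical two-step chain: format the prompt, then call the model."""

    def __init__(self, llm, template):
        self.llm = llm
        self.template = template

    def run(self, **variables):
        prompt_text = self.template.format(**variables)
        return self.llm(prompt_text)


toy_chain = SimpleChain(
    fake_llm,
    "What are 5 vacation destinations for someone who likes to eat {food}?",
)
result = toy_chain.run(food="fresh fish")
print(result)
```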


### Agents: Dynamically call chains based on user input

In [110]:
!pip install google-search-results



To make the search API available (100 searches per month), just register [here](https://serpapi.com/users/welcome).

In [111]:
from langchain.agents import load_tools
from langchain.agents import initialize_agent

In [89]:
os.environ["SERPAPI_API_KEY"] = creds["SERPAPI_API_KEY"]

In [90]:
tools = load_tools(["serpapi", "llm-math"], llm=llm)

In [97]:
# Finally, let's initialize an agent with:
# 1. The tools
# 2. The language model
# 3. The type of agent we want to use.

agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

See the list of agent types [here](https://python.langchain.com/docs/modules/agents/agent_types/).
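The `zero-shot-react-description` agent follows the ReAct pattern: the model writes a thought plus an `Action` and `Action Input`, the executor runs the named tool, and the tool's result is fed back as an `Observation` until the model emits a `Final Answer`. The sketch below shows only the parsing step in a heavily simplified form; `parse_step` is a hypothetical helper, not LangChain's actual output parser.

```python
import re


def parse_step(llm_output):
    """Split one model reply into either a tool call or a final answer."""
    if "Final Answer:" in llm_output:
        return ("finish", llm_output.split("Final Answer:", 1)[1].strip())
    match = re.search(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)", llm_output)
    if match is None:
        raise ValueError(f"Could not parse model output: {llm_output!r}")
    # (tool name, tool input)
    return (match.group(1).strip(), match.group(2).strip())


step = parse_step(
    "I should use a search engine to find out the leader's age first.\n"
    "Action: Search\n"
    "Action Input: Who is the current leader of China?"
)
done = parse_step("I now know the final answer\nFinal Answer: done")
print(step)
print(done)
```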

In [98]:
# Now let's test it out!
agent.run("Who is the current leader of China? What is the largest prime number that is smaller than their age?")



> Entering new AgentExecutor chain...
 I should use a search engine to find out the leader's age first.
Action: Search
Action Input: Who is the current leader of China?
Observation: Xi Jinping (Chinese: 习近平; pinyin: Xí Jìnpíng; born 15 June 1953) is a Chinese politician who has been serving as the general secretary of the Chinese Communist Party (CCP) and chairman of the Central Military Commission (CMC), and thus as the paramount leader of China, since 2012.
Thought: I should use a calculator to find the largest prime number that is smaller than the leader's age.
Action: Calculator
Action Input: The age of Xi Jinping (66)
Observation: Answer: 66
Thought: I now know the final answer
Final Answer: The largest prime number smaller than the age of Xi Jinping is 66.

> Finished chain.


'The largest prime number smaller than the age of Xi Jinping is 66.'
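Note that this final answer is wrong: 66 is not a prime number, and the Calculator step simply echoed its input. The age 66 was also supplied by the model in its `Action Input`; the search observation only gives a birth date. Taking the age of 66 at face value, a quick check of what the answer should have been (the helper names are illustrative):

```python
def is_prime(n):
    """Trial division; fine for small numbers such as ages."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def largest_prime_below(n):
    """Return the largest prime strictly smaller than n, or None if there is none."""
    for candidate in range(n - 1, 1, -1):
        if is_prime(candidate):
            return candidate
    return None


print(is_prime(66))             # False, since 66 = 2 * 3 * 11
print(largest_prime_below(66))  # 61
```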

### Memory: Add state to chains and agents


In [112]:
from langchain import ConversationChain

In [113]:
conversation = ConversationChain(llm=llm, verbose=True)

In [114]:
conversation.predict(input="Hi there!")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

Human: Hi there!
AI:[0m

[1m> Finished chain.[0m


" Hi there! It's nice to meet you. My name is AI. What's your name?"

In [115]:
conversation.predict(input="My name is Hanna. What date is it today?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi there!
AI:  Hi there! It's nice to meet you. My name is AI. What's your name?
Human: My name is Hanna. What date is it today?
AI:[0m

[1m> Finished chain.[0m


" Today is April 15th, 2021. It's a Tuesday."

In [116]:
conversation.predict(input="What was the first thing I said to you?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi there!
AI:  Hi there! It's nice to meet you. My name is AI. What's your name?
Human: My name is Hanna. What date is it today?
AI:  Today is April 15th, 2021. It's a Tuesday.
Human: What was the first thing I said to you?
AI:[0m

[1m> Finished chain.[0m


' You said "Hi there!"'
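As the formatted prompts above show, the "memory" here amounts to replaying the earlier turns inside each new prompt. A minimal sketch of that buffer idea, using a hypothetical `ConversationBuffer` class rather than LangChain's own memory classes:

```python
class ConversationBuffer:
    """Hypothetical buffer memory: keeps every turn and replays it in the next prompt."""

    def __init__(self):
        self.turns = []

    def build_prompt(self, user_input):
        history = "\n".join(self.turns)
        return f"Current conversation:\n{history}\nHuman: {user_input}\nAI:"

    def save(self, user_input, ai_output):
        self.turns.append(f"Human: {user_input}")
        self.turns.append(f"AI: {ai_output}")


memory = ConversationBuffer()
memory.save("Hi there!", "Hi there! It's nice to meet you.")
next_prompt = memory.build_prompt("What was the first thing I said to you?")
print(next_prompt)
```

Because the whole history is resent each time, the prompt grows with every turn, which is one reason longer conversations eventually need some form of summarising or truncation.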

# Finally: Comparing and Evaluating LLMs

In [10]:
os.environ["HUGGINGFACEHUB_API_TOKEN"] = creds['HUGGINGFACEHUB_API_TOKEN']

## Setting Up the LLMs

#### Setting up Flan models


In [5]:
from langchain import HuggingFaceHub

flan_20B = HuggingFaceHub(repo_id="google/flan-ul2",
                         model_kwargs={"temperature":overal_temperature,
                                       "max_new_tokens":200}
                         )

In [6]:
flan_t5xxl = HuggingFaceHub(repo_id="google/flan-t5-xxl",
                         model_kwargs={"temperature":overal_temperature,
                                       "max_new_tokens":200}
                         )

## Set up a comparison lab

In [12]:
from langchain.model_laboratory import ModelLaboratory

In [117]:
from langchain.prompts import PromptTemplate

template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])

In [118]:
models_list = [
    flan_20B,
    flan_t5xxl,
    llm
]

In [119]:
lab = ModelLaboratory.from_llms(models_list, prompt=prompt)
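The idea behind the laboratory is simple: format one prompt and send it to every model in the list. A rough sketch of that loop, with plain functions standing in for the real models (the `compare` helper and `toy_models` dictionary are illustrative only, not how `ModelLaboratory` is implemented):

```python
def compare(models, template, question):
    """Run the same formatted prompt through every model and collect the answers."""
    prompt_text = template.format(question=question)
    return {name: model(prompt_text) for name, model in models.items()}


# Stand-in "models": plain functions instead of real LLM clients.
toy_models = {
    "echo": lambda text: text,
    "shout": lambda text: text.upper(),
}

answers = compare(
    toy_models,
    "Question: {question}\n\nAnswer:",
    "What is the opposite of up?",
)
for name, answer in answers.items():
    print(f"{name}: {answer!r}")
```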

Let's run it on some questions and compare!

In [120]:
lab.compare("What is the opposite of up?")

Input:
What is the opposite of up?

HuggingFaceHub
Params: {'repo_id': 'google/flan-ul2', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
Down is the opposite of up. Up is a direction. Down is a location. The answer: down.

HuggingFaceHub
Params: {'repo_id': 'google/flan-t5-xxl', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
Down is the opposite of up. Down is the direction of a downward motion. The answer: down.

OpenAI
Params: {'model_name': 'text-davinci-003', 'temperature': 0.1, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'request_timeout': None, 'logit_bias': {}}
 The opposite of up is down.



In [121]:
lab.compare("Answer the following question by reasoning step by step. The cafeteria had 23 apples. \
If they used 20 for lunch, and bought 6 more, how many apple do they have?")

Input:
Answer the following question by reasoning step by step. The cafeteria had 23 apples. If they used 20 for lunch, and bought 6 more, how many apple do they have?

HuggingFaceHub
Params: {'repo_id': 'google/flan-ul2', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
They had 23 - 20 = 3 apples left. They have 3 + 6 = 9 apples. So the answer is 9.

HuggingFaceHub
Params: {'repo_id': 'google/flan-t5-xxl', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
The cafeteria had 23 - 20 = 3 apples after lunch. They bought 6 + 3 = 7 apples. The cafeteria has 7 - 3 = 2 apples.

OpenAI
Params: {'model_name': 'text-davinci-003', 'temperature': 0.1, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'request_timeout': None, 'logit_bias': {}}

Step 1: The cafeteria had 23 apples. 

Step 2: They used 20 for lunch. 

Step 3: They
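For reference, the expected result can be checked directly. Of the two complete answers above, flan-ul2 arrives at it, while the intermediate steps in the flan-t5-xxl answer (for example 6 + 3 = 7) do not hold; the OpenAI answer is cut off in this capture.

```python
apples = 23   # starting stock
apples -= 20  # used for lunch
apples += 6   # bought afterwards
print(apples)  # 9
```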

In [34]:
lab.compare('''
Can Elon Musk have a conversation with George Washington? Give the rationale before answering.
''')

Input:

Can Elon Musk have a conversation with George Washington? Give the rationale before answering.


HuggingFaceHub
Params: {'repo_id': 'google/flan-ul2', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
George Washington died in 1799. Elon Musk was born in 1971. So, the final answer is no.

HuggingFaceHub
Params: {'repo_id': 'google/flan-t5-xxl', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
Elon Musk is a billionaire entrepreneur who has developed a time machine. The time machine allows him to travel back in time and communicate with people who have died. George Washington died in 1799. So the final answer is no.

OpenAIChat
Params: {'model_name': 'gpt-3.5-turbo-16k', 'temperature': 0.1, 'max_tokens': 256}
First, we need to establish that George Washington is no longer alive. He passed away in 1799. Therefore, it is not possible for Elon Musk t

Let's change the prompt.

In [39]:
template = """You are a professional social media manager who can write great posts in linkedin to increase appeal of persons profile: {request}

Story:"""
prompt = PromptTemplate(template=template, input_variables=["request"])

lab = ModelLaboratory.from_llms(models_list, prompt=prompt)

In [41]:
lab.compare('''I have passed a course in large language models (1 month duration). Write a post about that.''')

Input:
I have passed a course in large language models (1 month duration). Write a post about that.

HuggingFaceHub
Params: {'repo_id': 'google/flan-ul2', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
I have passed a course in large language models (1 month duration)

HuggingFaceHub
Params: {'repo_id': 'google/flan-t5-xxl', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
I have passed a course in large language models (1 month duration)

OpenAIChat
Params: {'model_name': 'gpt-3.5-turbo-16k', 'temperature': 0.1, 'max_tokens': 256}
🎉 Exciting News! 🎉

I am thrilled to announce that I have successfully completed a comprehensive course in large language models! 📚✨ Over the past month, I have delved deep into the world of cutting-edge technology and honed my skills in harnessing the power of these incredible models. 🌟

During this course, I have gained a

In [43]:
template = """Answer the question to the best of your abilities but if you are not sure then answer you don't know: {question}

Answer:"""
prompt = PromptTemplate(template=template, input_variables=["question"])

lab = ModelLaboratory.from_llms(models_list, prompt=prompt)

In [45]:
lab.compare('''I am riding a bicycle. The pedals are moving fast. I look into the mirror and I am not moving. Why is this?''')


Input:
I am riding a bicycle. The pedals are moving fast. I look into the mirror and I am not moving. Why is this?

HuggingFaceHub
Params: {'repo_id': 'google/flan-ul2', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
I am stationary

HuggingFaceHub
Params: {'repo_id': 'google/flan-t5-xxl', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
I am looking at the wrong thing.

OpenAIChat
Params: {'model_name': 'gpt-3.5-turbo-16k', 'temperature': 0.1, 'max_tokens': 256}
I don't know.



### Named Entity Recognition



In [46]:
template = """{question}

Answer:"""
prompt = PromptTemplate(template=template, input_variables=["question"])

lab = ModelLaboratory.from_llms(models_list, prompt=prompt)

In [51]:
lab.compare('''Extract names and cities from the text.\n\n
Output in the format: {"names": list of names in text, "cities": list of cities in text}, surname]\n\n

Mark Dickey, 40, began suffering from severe gastric pain last week after descending
more than 3,600 feet into Morca Cave in Southern Turkey, according to a statement
from the European Cave Rescue Association, but he wasn’t able to be rescued until yesterday,
when doctors deemed him “transportable.”
''')

Input:
Extract names and cities from the text.


Output in the format: {"names": list of names in text, "cities": list of cities in text}, surname]


Mark Dickey, 40, began suffering from severe gastric pain last week after descending more than 3,600 feet into Morca Cave in Southern Turkey, according to a statement from the European Cave Rescue Association, but he wasn’t able to be rescued until yesterday, when doctors deemed him “transportable.”




HuggingFaceHub
Params: {'repo_id': 'google/flan-ul2', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
names, surname]

HuggingFaceHub
Params: {'repo_id': 'google/flan-t5-xxl', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
names[Mark Dickey, cities[Southern Turkey]

OpenAIChat
Params: {'model_name': 'gpt-3.5-turbo-16k', 'temperature': 0.1, 'max_tokens': 256}
{"names": ["Mark Dickey"], "cities": ["Souther

### Question answering based on text

In [53]:
lab.compare('''Is Mark Dickey alive?\n\n
Output in the format: Yes or No, facts that prove that.\n\n

Mark Dickey, 40, began suffering from severe gastric pain last week after descending
more than 3,600 feet into Morca Cave in Southern Turkey, according to a statement
from the European Cave Rescue Association, but he wasn’t able to be rescued until yesterday,
when doctors deemed him “transportable.”
''')

Input:
Is Mark Dickey alive?


Output in the format: Yes or No, facts that prove that.



Mark Dickey, 40, began suffering from severe gastric pain last week after descending 
more than 3,600 feet into Morca Cave in Southern Turkey, according to a statement 
from the European Cave Rescue Association, but he wasn’t able to be rescued until yesterday, 
when doctors deemed him “transportable.”


HuggingFaceHub
Params: {'repo_id': 'google/flan-ul2', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
Yes

HuggingFaceHub
Params: {'repo_id': 'google/flan-t5-xxl', 'task': None, 'model_kwargs': {'temperature': 0.1, 'max_new_tokens': 200}}
Yes

OpenAIChat
Params: {'model_name': 'gpt-3.5-turbo-16k', 'temperature': 0.1, 'max_tokens': 256}
Yes, Mark Dickey is alive. He was rescued from Morca Cave in Southern Turkey after suffering from severe gastric pain and deemed "transportable" by doctors.