# LangChain test

---
## Documentation

> LangChain resources
> - Landing page: https://readthedocs.org/projects/langchain/db2d
> - Git: https://github.com/hwchase17/langchain.git
> - API Reference: https://api.python.langchain.com/en/latest/

> Tutorials
> - https://towardsdatascience.com/a-gentle-intro-to-chaining-llms-agents-and-utils-via-langchain-16cd385fca81
> - Videos by Greg Kamradt on YouTube



---
## Setup

In [2]:
pip install langchain


Collecting langchain
  Downloading langchain-0.0.216-py3-none-any.whl (1.1 MB)
Collecting aiohttp<4.0.0,>=3.8.3 (from langchain)
  Using cached aiohttp-3.8.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)
Collecting async-timeout<5.0.0,>=4.0.0 (from langchain)
  Using cached async_timeout-4.0.2-py3-none-any.whl (5.8 kB)
Collecting dataclasses-json<0.6.0,>=0.5.7 (from langchain)
  Using cached dataclasses_json-0.5.8-py3-none-any.whl (26 kB)
Collecting langchainplus-sdk>=0.0.17 (from langchain)
  Using cached langchainplus_sdk-0.0.17-py3-none-any.whl (25 kB)
Collecting numexpr<3.0.0,>=2.8.4 (from langchain)
  Using cached numexpr-2.8.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (381 kB)
Collecting openapi-schema-pydantic<2.0,>=1.2 (from langchain)
  Using cached openapi_schema_pydantic-1.2.4-py3-none-any.whl (90 kB)
Collectin

In [3]:
pip install openai


Collecting openai
  Using cached openai-0.27.8-py3-none-any.whl (73 kB)
Installing collected packages: openai
Successfully installed openai-0.27.8
Note: you may need to restart the kernel to use updated packages.


---
## API keys and configuration

In [4]:
%%bash --out secrets
# The keys are stored in AWS Secrets Manager.
# Grab the keys and store them in a Python variable.
export RESPONSE=$(aws secretsmanager get-secret-value --secret-id 'labbenach/sednara/api_keys' )
export SECRETS=$( echo $RESPONSE | jq '.SecretString | fromjson')

echo $SECRETS

In [5]:
import json
import os

# The bash cell printed the secret as JSON: parse it rather than eval it.
os.environ["OPENAI_API_KEY"] = json.loads(secrets)["OPENAI_API_KEY"]


---
<div style="background-color:green;color:black;text-align:center;padding:1rem;font-size:1.5rem;">LANGCHAIN OVERVIEW</div>


---
# 1. Basic features

---
## Get a prediction from a language model

In [79]:
from langchain.llms import OpenAI

# Load the model.
# OPENAI_API_KEY is required. Get it from the OpenAI site.
# A paid account with available credit is needed to place a request.
llm = OpenAI(temperature=0.9)

text = "what are the 5 best countries in Europe"

# Actual API call - may take a while.
print(llm(text))




1. Germany 
2. Switzerland 
3. Denmark 
4. Norway 
5. Luxembourg


---
## Manage prompts with templates

In [80]:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI

# loads the model.
llm = OpenAI(temperature=0.9)

# set up a prompt
prompt = PromptTemplate(
    input_variables=["interest"],
    template="what are the 5 best countries in Europe ranked on {interest}"
)

In [81]:
text = prompt.format(interest="food")
print(f"{text=}")
print(llm(text))

text='what are the 5 best countries in Europe ranked on food'


1. Italy 
2. France 
3. Spain 
4. Greece 
5. Portugal
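
Formatting a template comes down to string substitution. A minimal sketch of the idea, using Python's built-in `str.format` rather than LangChain (the `render` helper is hypothetical and only for illustration):

```python
# Hypothetical stand-in for a prompt template: a template string plus
# the names of the variables it expects.
template = "what are the 5 best countries in Europe ranked on {interest}"
input_variables = ["interest"]


def render(template: str, input_variables: list, **values: str) -> str:
    """Fill the placeholders, refusing missing or unexpected variables."""
    missing = set(input_variables) - set(values)
    extra = set(values) - set(input_variables)
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    return template.format(**values)


text = render(template, input_variables, interest="food")
print(text)
```

Declaring the expected variables up front lets a typo in a variable name fail early instead of producing a malformed prompt.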


In [82]:
text = prompt.format(interest="sightseeing")
print(f"{text=}")
print(llm(text))

text='what are the 5 best countries in Europe ranked on sightseeing'


1. Italy
2. France
3. Spain
4. Greece
5. United Kingdom


---
## Prompt with multiple tokens 
<div class="alert alert-block alert-warning"> TODO </div>


---
# 2. Chains

<div class="alert alert-block alert-warning"> TODO  what is a chain </div>


---
## Built-in chains

In [83]:
from langchain.chains import PALChain
from langchain.llms import OpenAI

# loads the model.
llm = OpenAI(temperature=0.7)

palchain = PALChain.from_math_prompt(llm=llm, verbose=True)


text = """If my age is half of my dad's age 
and he is going to be 60 next year, 
what is my current age?"""
#palchain.run("If my age is half of my dad's age and he is going to be 60 next year, what is my current age?")
palchain.run(text)


> Entering new  chain...
def solution():
    """If my age is half of my dad's age and he is going to be 60 next year, what is my current age?"""
    dad_age_next_year = 60
    my_age = dad_age_next_year / 2
    result = my_age
    return result

> Finished chain.


'30.0'

<div class="alert alert-block alert-warning"> 
    TODO <br>
    - different result each run <br>
    - and should be 29.5
</div>


> Entering new  chain...
def solution():
    """If my age is half of my dad's age and he is going to be 60 next year, what is my current age?"""
    dad_age_next_year = 60
    my_age_fraction = 0.5
    my_age_now = dad_age_next_year * my_age_fraction
    result = my_age_now
    return result

> Finished chain.
'30.0'

> Entering new  chain...
def solution():
    """If my age is half of my dad's age and he is going to be 60 next year, what is my current age?"""
    dad_age_current = 59
    my_age_current = dad_age_current / 2
    result = my_age_current
    return result

> Finished chain.
'29.5'
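
The runs that return 30.0 halve the father's age next year (60) instead of his current age (59). For comparison, here is a hand-written version of the program the chain is expected to generate (a sketch, not output from the model):

```python
def solution() -> float:
    """If my age is half of my dad's age and he is going to be 60 next year,
    what is my current age?"""
    dad_age_next_year = 60
    dad_age_current = dad_age_next_year - 1  # he is 59 today
    my_age_current = dad_age_current / 2
    return my_age_current


print(solution())  # 29.5
```

Setting `temperature=0` usually makes the generated program more repeatable from run to run, though it does not guarantee that the model reads the question correctly.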

---
## Multi-step workflow to feed prompt into the model

In [84]:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain

# loads the model.
llm = OpenAI(temperature=0.9)

# setup a prompt
prompt = PromptTemplate (
    input_variables=["interest"],
    template="what are the 5 best countries in Europe ranked on {interest}"
)

# the chain feeds the prompt into the language model.
chain = LLMChain(llm=llm, prompt=prompt)

In [85]:
chain.run("science")

'\n\n1. Germany\n2. Sweden\n3. Switzerland\n4. United Kingdom\n5. Netherlands'

In [86]:
print(chain.run("tv shows"))



1. United Kingdom
2. France
3. Germany
4. Italy
5. Spain
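
Conceptually, the chain does two things on each call: it formats the prompt, then passes the result to the model. A minimal sketch of that flow in plain Python, with a fake model standing in for the API call (`fake_llm` and `SimpleChain` are hypothetical names, not LangChain classes):

```python
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call: echoes the prompt it received."""
    return f"[model reply to: {prompt}]"


class SimpleChain:
    """Formats a template with the input, then calls the model."""

    def __init__(self, llm: Callable[[str], str], template: str, variable: str):
        self.llm = llm
        self.template = template
        self.variable = variable

    def run(self, value: str) -> str:
        prompt = self.template.format(**{self.variable: value})
        return self.llm(prompt)


chain = SimpleChain(
    llm=fake_llm,
    template="what are the 5 best countries in Europe ranked on {interest}",
    variable="interest",
)
print(chain.run("science"))
```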


---
## Using OpenAI Chat API (less expensive)
Requires a chain to feed the prompt into the chat model.

Other chat APIs:
- https://api.python.langchain.com/en/latest/modules/chat_models.html

In [87]:
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

chatopenai = ChatOpenAI(model_name="gpt-3.5-turbo")

prompt = PromptTemplate (
    input_variables=["interest"],
    template="what are the 5 best countries in Europe ranked on {interest}"
)

llmchain_chat = LLMChain(llm=chatopenai, prompt=prompt)
print(llmchain_chat.run("food"))


As an AI language model, I do not have personal preferences. However, the following are five countries in Europe that are known for their delicious cuisine:

1. Italy - Italian cuisine is famous for its pasta, pizza, gelato, and wines. Italy is also known for its flavorful seafood and meat dishes.

2. France - French cuisine is renowned for its delicate flavors and rich sauces. It includes dishes such as coq au vin, ratatouille, and escargots.

3. Spain - Spanish cuisine is known for its tapas, paella, and seafood dishes. It also features delicious cured meats and cheeses.

4. Greece - Greek cuisine is characterized by fresh vegetables, grilled meats, and flavorful dips such as tzatziki and hummus. Greek cuisine also includes dishes like moussaka and souvlaki.

5. Turkey - Turkish cuisine is a fusion of Middle Eastern and Mediterranean flavors. It includes dishes such as kebabs, baklava, and Turkish delight. Turkish cuisine also features delicacies like stuffed grape leaves and Turkish

---
## Leverage LLM Math

Evaluating chains that know how to do math.

https://python.langchain.com/docs/guides/evaluation/llm_math

In [88]:
from langchain.chains import LLMChain, LLMMathChain
from langchain.llms import OpenAI
from langchain.prompts import load_prompt

# loads the model.
llm = OpenAI(temperature=0.9)

prompt = load_prompt('lc://prompts/llm_math/prompt.json')

# Deprecated way to build the chain, kept for reference.
##chain = LLMMathChain(llm=llm, prompt=prompt)

chain = LLMChain(llm=llm, prompt=prompt)

print(chain.run("what is the largest prime number lower than 20"))


No `_type` key found, defaulting to `prompt`.


Answer: 19
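
Language models are not reliable calculators, so an answer like this is worth checking independently. A short check in plain Python, using trial division (fine for small numbers; the function names are illustrative):

```python
def is_prime(n: int) -> bool:
    """Trial division: test divisors up to the square root of n."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def largest_prime_below(limit: int) -> int:
    """Return the largest prime strictly lower than limit."""
    for candidate in range(limit - 1, 1, -1):
        if is_prime(candidate):
            return candidate
    raise ValueError("no prime below the given limit")


print(largest_prime_below(20))  # 19
```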


---
# 3. Tools

<div class="alert alert-block alert-warning"> TODO  what is a tool </div>


---
## Leverage Google Search

> How to configure Google search in LangChain
> - https://python.langchain.com/docs/ecosystem/integrations/google_search

> Custom Search Engine configuration 
> - https://stackoverflow.com/questions/37083058/programmatically-searching-google-in-python-using-custom-search

> CSE API 
> - repo: https://github.com/google/google-api-python-client
> - more info: https://developers.google.com/api-client-library/python/apis/customsearch/v1
> - complete docs: https://api-python-client-doc.appspot.com/

> Get an API key
> - https://developers.google.com/custom-search/v1/introduction

In [89]:
pip install google-api-python-client

Collecting google-api-python-client
  Using cached google_api_python_client-2.90.0-py2.py3-none-any.whl (11.4 MB)
Collecting httplib2<1.dev0,>=0.15.0 (from google-api-python-client)
  Using cached httplib2-0.22.0-py3-none-any.whl (96 kB)
Collecting google-auth<3.0.0.dev0,>=1.19.0 (from google-api-python-client)
  Using cached google_auth-2.20.0-py2.py3-none-any.whl (181 kB)
Collecting google-auth-httplib2>=0.1.0 (from google-api-python-client)
  Using cached google_auth_httplib2-0.1.0-py2.py3-none-any.whl (9.3 kB)
Collecting google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0.dev0,>=1.31.5 (from google-api-python-client)
  Using cached google_api_core-2.11.1-py3-none-any.whl (120 kB)
Collecting uritemplate<5,>=3.0.1 (from google-api-python-client)
  Using cached uritemplate-4.1.1-py2.py3-none-any.whl (10 kB)
Collecting googleapis-common-protos<2.0.dev0,>=1.56.2 (from google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0.dev0,>=1.31.5->google-api-python-client)
  Using cached googleap

In [90]:
import json

# Enable the API and get a key.
os.environ["GOOGLE_API_KEY"] = json.loads(secrets)["GOOGLE_API_KEY"]
# Create a Custom Search Engine or use an existing one.
# Its ID is shown on the CSE page under "Search engine ID".
os.environ["GOOGLE_CSE_ID"] = json.loads(secrets)["GOOGLE_CSE_ID"]


In [91]:
from langchain.tools import Tool
from langchain.utilities import GoogleSearchAPIWrapper

search = GoogleSearchAPIWrapper()

tool = Tool(
    name="Google Search",
    description="Search Google for recent results.",
    func=search.run,
)

tool.run("French Prime Minister name?")

"Élisabeth Borne has served as Prime Minister since 16 May 2022. Fifth Republic recordsEdit. Length of the successive governments\xa0... May 16, 2022 ... President Emmanuel Macron has named Labour Minister Elisabeth Borne as prime minister to lead his ambitious reform plans, the first woman to\xa0... Feb 22, 2018 ... SEVEN months after their prime minister was appointed in May 2017, fully 35% of the French could not name him accurately in a poll. May 16, 2022 ... Elisabeth Borne has been named the new Prime Minister of France, the first time in 30 years that a woman has held the position. May 16, 2022 ... Élisabeth Borne, the minister of labor who previously was in charge of the environment, will be the second woman to hold the post in France. Jun 24, 2022 ... The name of Lafayette is famous and respected on both sides of the Atlantic. It is our third conversation in a month, which is quite a good\xa0... May 5, 2017 ... France's Macron says he has chosen prime minister, won't reveal na

---
# 4. Agent

<div class="alert alert-block alert-warning"> TODO  what is an agent </div>


---
## Set up an agent

In [105]:
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType
from langchain.llms import OpenAI

# create a model
llm = OpenAI(temperature=0)

# load some tools
tools = load_tools(["google-search", "llm-math"], llm=llm)

# setup an agent
agent = initialize_agent(tools, 
                         llm, 
                         agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, 
                         verbose=True)


In [106]:
agent.run("How many Teslas have been sold in 2022? Multiply by 2")


> Entering new  chain...
I need to find out how many Teslas have been sold in 2022
Action: google_search
Action Input: "how many Teslas have been sold in 2022"
Observation: Apr 15, 2023 ... Tesla total revenue for 2022 was 81,462 billion USD. We show it from 2018 – 2022. Tesla annual revenue 2018 - 2022. Year, Annual ... Jun 7, 2023 ... How many Tesla vehicles were delivered in 2023? ... As of June 2022, Tesla was the most valuable brand within the global automotive sector. Jan 25, 2023 ... The Model 3 and Model Y make up around 95% of the 1.31 million Teslas sold in 2022. Tesla. Tesla's finished 2022 on a tear, bolstered by recent ... Jan 7, 2023 ... Overall, Tesla reported delivering about 1.25 million Model Y and Model 3 vehicles globally in 2022. The Model 3 ranked 13th in sales at 211,641 ... Jan 3, 2023 ... The electric automaker delivered 1.3 million vehicles in 2022, up 40% from 2021. It produced nearly 1.4 million vehicles, up 47% from 

'2,620,000 Teslas were sold in 2022.'
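
The final answer labels the doubled figure as the number of cars sold. Keeping the two values separate, and taking the 1.31 million figure from the search snippet in the trace, the arithmetic is:

```python
# Figure quoted in the search snippet: 1.31 million Teslas sold in 2022.
teslas_sold_2022 = 1_310_000

# The second half of the question asks for this value multiplied by 2.
doubled = teslas_sold_2022 * 2

print(f"sold in 2022: {teslas_sold_2022:,}")
print(f"multiplied by 2: {doubled:,}")
```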

In [107]:
agent.run("""Who is the current prime minister of France. 
Is he or she younger than the President?""") 


> Entering new  chain...
I need to find out who the current prime minister is and then compare their age to the President.
Action: google_search
Action Input: "current prime minister of France"
Observation: PresentEdit. Élisabeth Borne has served as Prime Minister since 16 May 2022. Fifth Republic records ... May 16, 2022 ... Who is France's new Prime Minister Elisabeth Borne? French President Emmanuel Macron picked Labour Minister Elisabeth Borne as his new prime ... The current Prime Minister of France is Élisabeth Borne. She was given the job by President Emmanuel Macron on 16 May 2022. May 16, 2022 ... President Emmanuel Macron has named Labour Minister Elisabeth Borne as prime minister to lead his ambitious reform plans, the first woman to ... May 16, 2022 ... Élisabeth Borne, the minister of labor who previously was in charge of the environment, will be the second woman to hold the post in France. May 2, 2014 ... On the recommendation of t

'The current Prime Minister of France, Élisabeth Borne, is younger than the President, Emmanuel Macron, who is 39 years old.'

In [108]:
agent.run("""Who is the current prime minister of France. 
When will he or she be 70?""") 


> Entering new  chain...
I need to find out who the current prime minister is and when they will be 70.
Action: google_search
Action Input: "current prime minister of France"
Observation: PresentEdit. Élisabeth Borne has served as Prime Minister since 16 May 2022. Fifth Republic records ... May 16, 2022 ... Who is France's new Prime Minister Elisabeth Borne? French President Emmanuel Macron picked Labour Minister Elisabeth Borne as his new prime ... The current Prime Minister of France is Élisabeth Borne. She was given the job by President Emmanuel Macron on 16 May 2022. May 16, 2022 ... President Emmanuel Macron has named Labour Minister Elisabeth Borne as prime minister to lead his ambitious reform plans, the first woman to ... May 16, 2022 ... Élisabeth Borne, the minister of labor who previously was in charge of the environment, will be the second woman to hold the post in France. May 2, 2014 ... On the recommendation of the Prime Minister, 

'Élisabeth Borne will be 70 in the year 2092.'
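
Neither of the last two answers holds up, which shows why agent output needs checking. A quick verification in plain Python, assuming the publicly documented birth dates (Élisabeth Borne: 18 April 1961; Emmanuel Macron: 21 December 1977), which do not come from the agent run:

```python
from datetime import date

# Birth dates taken from public records, not from the agent run.
borne_birth = date(1961, 4, 18)
macron_birth = date(1977, 12, 21)

# A later birth date means a younger person.
borne_is_younger = borne_birth > macron_birth
print(f"Borne younger than Macron: {borne_is_younger}")

# Date of the 70th birthday.
seventieth_birthday = borne_birth.replace(year=borne_birth.year + 70)
print(f"Borne turns 70 on {seventieth_birthday.isoformat()}")
```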

---
# 5. Memory - Conversation

<div class="alert alert-block alert-warning"> TODO  what is a conversation </div>


In [98]:
from langchain import OpenAI, ConversationChain

# create a model
llm = OpenAI(temperature=0)

conversation = ConversationChain(llm=llm, verbose=True)

conversation.predict(input="Hi There")


> Entering new  chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hi There
AI:

> Finished chain.


" Hi there! It's nice to meet you. How can I help you today?"

In [99]:
conversation.predict(input="What is the first thing that I said to you?")


> Entering new  chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi There
AI:  Hi there! It's nice to meet you. How can I help you today?
Human: What is the first thing that I said to you?
AI:

> Finished chain.


' You said "Hi there!"'

In [100]:
conversation.predict(input="What is an alternative for the first thing that I said to you?")


> Entering new  chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi There
AI:  Hi there! It's nice to meet you. How can I help you today?
Human: What is the first thing that I said to you?
AI:  You said "Hi there!"
Human: What is an alternative for the first thing that I said to you?
AI:

> Finished chain.


' An alternative for the first thing you said to me is "Hello!"'
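
The verbose traces show how the memory works: every call resends the whole transcript with the new input appended. A minimal sketch of such a buffer in plain Python (`BufferMemory` is a hypothetical name, the preamble is shortened, and the model reply is copied from the run above rather than generated):

```python
# First sentence of the prompt shown in the trace (shortened).
PREAMBLE = "The following is a friendly conversation between a human and an AI."


class BufferMemory:
    """Keeps every turn and renders it as the 'Current conversation' block."""

    def __init__(self) -> None:
        self.turns = []  # list of (speaker, text) tuples

    def add(self, speaker: str, text: str) -> None:
        self.turns.append((speaker, text))

    def build_prompt(self, user_input: str) -> str:
        history = "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)
        parts = [PREAMBLE, "", "Current conversation:"]
        if history:
            parts.append(history)
        parts += [f"Human: {user_input}", "AI:"]
        return "\n".join(parts)


memory = BufferMemory()
memory.add("Human", "Hi There")
memory.add("AI", "Hi there! It's nice to meet you. How can I help you today?")

print(memory.build_prompt("What is the first thing that I said to you?"))
```

Because the transcript grows with every turn, the prompt sent to the model grows too, which matters for long conversations.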

---
<div style="background-color:green;color:black;text-align:center;padding:1rem;font-size:1.5rem;">LANGCHAIN COMPONENTS</div>


---
# 6. Schemas

There are 3 types of schemas:
- Text (see above)
- Messages
- Documents

---
## Text

In [23]:
from langchain.llms import OpenAI

# Load the model.
# OPENAI_API_KEY is required. Get it from the OpenAI site.
# A paid account with available credit is needed to place a request.
llm = OpenAI(temperature=0.9)

text = "what are the 5 best countries in Europe"

# Actual API call - may take a while.
print(llm(text))



1. Switzerland
2. Germany
3. Netherlands
4. United Kingdom
5. Norway


---
## Chat messages
Chat messages are like text with a type.

There are 3 types:
- System: background context that tells the AI what to do
- Human: inputs sent by the user
- AI: response of the AI
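
One way to picture this: each message is a piece of text tagged with its role. A minimal sketch using a plain dataclass (the `Message` class is hypothetical and only mirrors the idea, it is not the LangChain type):

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str     # "system", "human" or "ai"
    content: str


conversation = [
    Message("system", "You help users decide what to eat."),
    Message("human", "I like tuna, list some recipes."),
]

# After the model answers, its reply is appended with the "ai" role,
# so the next request carries the full history.
conversation.append(Message("ai", "Sure, here are some tuna recipes..."))

for message in conversation:
    print(f"{message.role:>6}: {message.content}")
```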


In [30]:
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

chat = ChatOpenAI(temperature=0.7)

In [31]:
messages = [ SystemMessage(content="You are a nice AI and help users to figure out what to eat.")]
     
messages.append( HumanMessage(content="I like tuna, list some recipes.") )

In [32]:
response = chat(messages)
messages.append( AIMessage(content=response.content) )

print(response.content)

Sure, here are some tuna recipes that you might enjoy:

1. Tuna salad: Mix canned tuna with chopped celery, onions, and mayonnaise. You can also add some chopped pickles, mustard, and salt and pepper to taste. Serve on a bed of lettuce or between two slices of bread.

2. Tuna melt: Spread canned tuna on a slice of bread, top with sliced tomato and cheese, and broil until the cheese is melted and bubbly.

3. Tuna pasta salad: Combine cooked pasta with canned tuna, chopped vegetables like bell peppers and onions, and a dressing made of mayonnaise, lemon juice, and herbs.

4. Tuna patties: Mix canned tuna with bread crumbs, egg, and seasonings like garlic, onion powder, and parsley. Form into patties and pan-fry until golden brown.

5. Tuna poke bowl: Top cooked rice with cubed raw tuna, avocado, cucumber, and edamame. Drizzle with a soy sauce-based dressing and garnish with sesame seeds.

I hope these ideas help! Let me know if you have any specific dietary restrictions or preferences, a

In [33]:
messages.append( HumanMessage(content="show the first one.") )

response = chat(messages)
messages.append( AIMessage(content=response.content) )

print(response.content)

Sure, here is a recipe for tuna salad:

Ingredients:
- 2 cans of tuna, drained
- 2 stalks of celery, chopped
- 1 small onion, chopped
- 1/4 cup mayonnaise
- 1 tablespoon chopped pickles (optional)
- 1 teaspoon mustard (optional)
- Salt and pepper to taste

Instructions:
1. In a mixing bowl, combine the drained tuna, chopped celery, and chopped onion.
2. Add the mayonnaise, pickles, and mustard (if using) to the bowl and mix well until everything is combined.
3. Season with salt and pepper to taste.
4. Serve the tuna salad on a bed of lettuce or between two slices of bread.

Enjoy!


---
## Examples
A list of input/output pairs, each representing an input and its expected output.

Used to fine-tune a model or to do in-context learning.

Documentation: https://python.langchain.com/docs/modules/model_io/prompts/prompt_templates/few_shot_examples


In [41]:
from langchain.llms import OpenAI
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain.prompts.prompt import PromptTemplate

# loads the model.
llm = OpenAI(temperature=0.9)

# create the example set

examples = [
    { "question": "red bold", "answer": "color:red; font-style:bold;"},
    { "question": "green italic", "answer":  "color:green; font-style:italic;"},
    { "question": "blue bold", "answer":  "color:blue; font-style:bold;"},
    { "question": "pink", "answer":  "color:pink;"},
    { "question": "green", "answer":  "color:green;"},
    { "question": "pink italic", "answer":  "color:pink; font-style:italic;"}
    
]    

# Configure a formatter that will format the few shot examples into a string. 
# This formatter should be a PromptTemplate object.

example_prompt = PromptTemplate (
    input_variables=["question", "answer"], 
    template="question: {question}\n{answer}"
)

print("\n=== example prompt ===")
print(example_prompt.format(**examples[0]))


# Finally, create a FewShotPromptTemplate object. 
# This object takes in the few shot examples and the formatter for the few shot examples.

prompt_template = FewShotPromptTemplate(
    examples=examples, 
    example_prompt=example_prompt, 
    suffix="question: {input}", 
    input_variables=["input"]
)

prompt = prompt_template.format(input="pink bold")

print("\n=== prompt ===")
print(prompt)

print("\n=== answer ===")
print(llm(prompt))



=== example prompt ===
question: red bold
color:red; font-style:bold;

=== prompt ===
question: red bold
color:red; font-style:bold;

question: green italic
color:green; font-style:italic;

question: blue bold
color:blue; font-style:bold;

question: pink
color:pink;

question: green
color:green;

question: pink italic
color:pink; font-style:italic;

question: pink bold

=== answer ===

color:pink; font-style:bold;
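
The printed prompt shows what the template does: it formats each example, joins the results with blank lines, and appends the suffix. A minimal sketch of that assembly in plain Python (the `build_few_shot_prompt` helper is hypothetical, and only two of the examples are repeated here):

```python
examples = [
    {"question": "red bold", "answer": "color:red; font-style:bold;"},
    {"question": "pink", "answer": "color:pink;"},
]

example_template = "question: {question}\n{answer}"
suffix = "question: {input}"


def build_few_shot_prompt(examples, example_template, suffix, **inputs):
    """Format every example, then append the suffix with the new input."""
    blocks = [example_template.format(**example) for example in examples]
    blocks.append(suffix.format(**inputs))
    return "\n\n".join(blocks)


prompt = build_few_shot_prompt(examples, example_template, suffix, input="pink bold")
print(prompt)
```

The prompt ends with an unanswered question, so the model's most natural continuation is an answer in the same style as the examples.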


---
## Documents

An object that holds a piece of unstructured text and its metadata.

<div class="alert alert-block alert-warning"> TODO how to use this concept? 
make some knowledge available?
how to use metadata?
</div>


In [36]:
from langchain.schema import Document

Document(
    page_content="This is my document. it contains useful information",
    metadata={
        'author':"Claude",
        'identifier':"1234"
    }
)

Document(page_content='This is my document. it contains useful information', metadata={'author': 'Claude', 'identifier': '1234'})

---
# 7. Models

List of models: https://platform.openai.com/docs/models


---
## Language Model
Text in, text out.

In [12]:
from langchain.llms import OpenAI

# additional parameters select a model, pass the API key, ...
llm = OpenAI(model_name="text-ada-001", temperature=0.7)

llm("What day comes after Friday?")

'\n\nSaturday.'

---
## Chat Model 
Takes a series of messages and returns an AI response.

Also makes sense for a single interaction, since the Chat API is less expensive.


In [13]:
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=1)

In [15]:
messages = [ 
    SystemMessage(content="You are a nice AI and help users to figure out what to eat."),
    HumanMessage(content="I like tuna, list some recipes.")
]
     
chat(messages)

AIMessage(content='Sure, here are some delicious tuna recipes you can try: \n\n1. Tuna Salad: Mix canned tuna with mayonnaise, pickle relish, chopped celery and onion. Serve on salad greens, in a sandwich or on crackers for a light lunch.\n\n2. Tuna Nicoise Salad: Top a bed of salad greens with cooked green beans, boiled potatoes, hard-boiled eggs, canned tuna, and cherry tomatoes. Toss with a simple vinaigrette for a healthy Mediterranean-inspired meal.\n\n3. Tuna Melt: Arrange tuna salad on a slice of crusty bread, top with sliced tomato and cheese, and broil until melted and bubbly.\n\n4. Tuna Poke Bowl: Combine cubed raw tuna with soy sauce, sesame oil, lime juice, green onions, and sesame seeds. Serve over steamed rice with sliced avocado and edamame.\n\n5. Grilled Tuna Steaks: Brush fresh tuna steaks with olive oil and season with salt and pepper. Grill for a few minutes on each side until cooked to your liking, and serve with a side of sautéed vegetables.\n\nI hope these ideas i

---
## Text Embedding Model

Converts text into a series of numbers (a vector) that captures the meaning of the text.

Mainly used for text comparison.

In [19]:
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")

text="A leader should know all about truth and honesty, and when to see the difference. (Truck) - Bromeliad Trilogy"

text_embedding = embeddings.embed_query(text)

print(f"embedding length: {len(text_embedding)}")
print(f"5 first values of the vector: {text_embedding[:5]}")

embedding length: 1536
5 first values of the vector: [-0.0020272971596568823, -0.016961609944701195, 0.013975410722196102, -0.014824817888438702, 0.001639920868910849]
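
Comparing two texts usually means measuring the angle between their vectors with cosine similarity. A minimal sketch in plain Python, using short made-up vectors in place of real 1536-value embeddings:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-value vectors, for illustration only.
query = [1.0, 0.0, 1.0]
close = [2.0, 0.0, 2.0]    # same direction as the query
far = [0.0, 1.0, 0.0]      # orthogonal to the query

print(cosine_similarity(query, close))
print(cosine_similarity(query, far))
```

With real embeddings, two texts with similar meaning score close to 1, while unrelated texts score noticeably lower.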


---
# 8. Prompts
Text sent to the language model.

<div class="alert alert-block alert-warning"> TODO </div>

---
## Simple prompt

In [25]:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI

# loads the model.
llm = OpenAI(model_name="text-davinci-003", temperature=0.9)

# write a simple  prompt. use """ to allow multiline string.
prompt = """
Today is Monday. Tomorrow is Wednesday.

What is wrong with this statement?
"""

# query the model
print(llm(prompt))


Today is Monday. Tomorrow is Wednesday.

The statement is incorrect because tomorrow is Tuesday, not Wednesday.


---
## Prompt with template and placeholder.

In [27]:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI

# loads the model.
llm = OpenAI(model_name="text-davinci-003", temperature=0.9)

# setup a prompt. use """ to allow multiline string.
template = PromptTemplate (
    input_variables=["today", "tomorrow"],
    template="""
    Today is {today}. Tomorrow is {tomorrow}.

    What is wrong with this statement?
    """
)

prompt = template.format(today="Monday", tomorrow="Wednesday")
print(f"{prompt=}")

# query the model

print(llm(prompt))

prompt='\n    Today is Monday. Tomorrow is Wednesday.\n\n    What is wrong with this statement?\n    '

This statement is incorrect because tomorrow is Tuesday, not Wednesday.


In [29]:
prompt = template.format(today="Thursday", tomorrow="Friday")
print(f"{prompt=}")

# query the model

print(llm(prompt))

prompt='\n    Today is Thursday. Tomorrow is Friday.\n\n    What is wrong with this statement?\n    '

This statement is not incorrect; however, it does not accurately indicate what day it currently is.


---
## Example selectors and few-shot learning

A way to select from a series of examples in few-shot learning.

Documentation:
- https://api.python.langchain.com/en/latest/modules/example_selector.html
- https://python.langchain.com/docs/modules/model_io/prompts/prompt_templates/few_shot_examples



### Example selectors and Few Prompt Learning with NGra

### Set up a vector database


#### Install GCC

The packages below require GCC to build their wheels.

In [60]:
!apt-get update && apt-get install -y build-essential

Get:1 http://deb.debian.org/debian bullseye InRelease [116 kB]
Get:2 http://security.debian.org/debian-security bullseye-security InRelease [48.4 kB]
Get:3 http://deb.debian.org/debian bullseye-updates InRelease [44.1 kB]
Get:4 http://security.debian.org/debian-security bullseye-security/main amd64 Packages [245 kB]
Get:5 http://deb.debian.org/debian bullseye/main amd64 Packages [8183 kB]
Get:6 http://deb.debian.org/debian bullseye-updates/main amd64 Packages [14.8 kB]
Fetched 8651 kB in 2s (5485 kB/s)
Reading package lists... Done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  dirmngr dpkg-dev fakeroot g++ g++-10 gnupg gnupg-l10n gnupg-utils gpg
  gpg-agent gpg-wks-client gpg-wks-server gpgconf gpgsm gpgv
  libalgorithm-diff-perl libalgorithm-diff-xs-perl libalgorithm-merge-perl
  libassuan0 libdpkg-perl libfakeroot libfile-fcntllock-perl libksba8
  liblocale-gettext-perl libnpth0 

#### Install Annoy, a vector database

Alternatives: chromadb, faiss

In [63]:
pip install annoy

Collecting annoy
  Using cached annoy-1.17.3.tar.gz (647 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: annoy
  Building wheel for annoy (setup.py) ... done
  Created wheel for annoy: filename=annoy-1.17.3-cp310-cp310-linux_x86_64.whl size=76573 sha256=31850389a337fd1502405ccab083434fe8d454dab49a6e3ececaa5cbd95ec8ea
  Stored in directory: /root/.cache/pip/wheels/64/8a/da/f714bcf46c5efdcfcac0559e63370c21abe961c48e3992465a
Successfully built annoy
Installing collected packages: annoy
Successfully installed annoy-1.17.3
Note: you may need to restart the kernel to use updated packages.


#### Install tiktoken, a tokenizer

Required for embeddings.

In [65]:
pip install tiktoken

Collecting tiktoken
  Downloading tiktoken-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB)
Installing collected packages: tiktoken
Successfully installed tiktoken-0.4.0
Note: you may need to restart the kernel to use updated packages.


### Example selectors and few-shot learning with semantic similarity

Requires a vector database.

In [66]:
from langchain.llms import OpenAI
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain.prompts.prompt import PromptTemplate
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import Annoy
#from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings

# loads the model.
llm = OpenAI(temperature=0.9)

# create the example set

examples = [
    { "question": "red bold", "answer": "color:red; font-style:bold;"},
    { "question": "green italic", "answer":  "color:green; font-style:italic;"},
    { "question": "blue bold", "answer":  "color:blue; font-style:bold;"},
    { "question": "pink", "answer":  "color:pink;"},
    { "question": "green", "answer":  "color:green;"},
    { "question": "pink italic", "answer":  "color:pink; font-style:italic;"}
    
]    

# Configure a formatter that will format the few shot examples into a string. 
# This formatter should be a PromptTemplate object.

example_prompt = PromptTemplate (
    input_variables=["question", "answer"], 
    template="question: {question}\n{answer}"
)

print("\n=== exemple prompt ===")
print(example_prompt.format(**examples[0]))

# Example selector that selects examples based on SemanticSimilarity.

example_selector = SemanticSimilarityExampleSelector.from_examples(
    # This is the list of examples available to select from.
    examples,
    # This is the embedding class used to produce embeddings which are used to measure semantic similarity.
    OpenAIEmbeddings(),
    # This is the VectorStore class that is used to store the embeddings and do a similarity search over.
    #Chroma,
    Annoy,
    # This is the number of examples to produce.
    k=2
)

# Finally, create a FewShotPromptTemplate object. 
# This object takes in the few shot examples and the formatter for the few shot examples.

prompt_template = FewShotPromptTemplate(
    example_selector=example_selector, 
    example_prompt=example_prompt, 
    suffix="question: {input}", 
    input_variables=["input"]
)

prompt = prompt_template.format(input="pink bold")

print("\n=== prompt ===")
print(prompt)

print("\n=== answer ===")
print(llm(prompt))



=== example prompt ===
question: red bold
color:red; font-style:bold;

=== prompt ===
question: red bold
color:red; font-style:bold;

question: pink italic
color:pink; font-style:italic;

question: pink bold

=== answer ===

color:pink; font-style:bold;
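
To make the selection step less of a black box, here is a minimal sketch of what a similarity-based example selector roughly does: embed every example question, rank by cosine similarity to the input, keep the top `k`, then join the formatted examples with the suffix. This is not LangChain's implementation. The `embed`, `cosine`, `select_examples` and `build_prompt` helpers are hypothetical names, and the bag-of-words embedding is only a stand-in for `OpenAIEmbeddings` so that the sketch runs without an API key or a vector store.

```python
import math
from collections import Counter

# A few examples copied verbatim from the cell above.
examples = [
    {"question": "red bold", "answer": "color:red; font-style:bold;"},
    {"question": "green italic", "answer": "color:green; font-style:italic;"},
    {"question": "pink", "answer": "color:pink;"},
    {"question": "pink italic", "answer": "color:pink; font-style:italic;"},
]

def embed(text):
    """Toy embedding (assumption): word counts instead of a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def select_examples(examples, query, k=2):
    """Return the k examples whose question is most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        examples,
        key=lambda ex: cosine(embed(ex["question"]), query_vec),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(examples, query, k=2):
    """Format the selected examples, then append the suffix with the query."""
    blocks = [
        f"question: {ex['question']}\n{ex['answer']}"
        for ex in select_examples(examples, query, k)
    ]
    blocks.append(f"question: {query}")
    return "\n\n".join(blocks)

print(build_prompt(examples, "pink bold"))
```

With a toy embedding the selected examples can differ from the ones picked with `OpenAIEmbeddings` in the output above; only the mechanism is the same.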


## Output Parser and response format

A way to format the output:
- Format instructions: an autogenerated prompt fragment telling the model how the result should be formatted.
- Parser: a method that extracts the output into the desired format. You may provide a custom parser.


Documentation: https://docs.langchain.com/docs/components/prompts/output-parser
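
As a rough idea of what the parsing step involves, here is a minimal sketch of a hand-written parser in plain Python: it pulls the JSON document out of a fenced snippet and checks that the expected keys are present. It is not `StructuredOutputParser`'s actual code; `parse_fenced_json` and `FENCE` are hypothetical names, and the sample response is the kind of text the model is asked to return by the format instructions.

```python
import json
import re

# Three backticks, built programmatically so this listing contains no literal fence.
FENCE = "`" * 3

def parse_fenced_json(text, expected_keys):
    """Extract a JSON object from a fenced snippet and verify its keys."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    # Fall back to the raw text if the model forgot the fence.
    payload = match.group(1) if match else text
    data = json.loads(payload.strip())
    missing = [key for key in expected_keys if key not in data]
    if missing:
        raise ValueError(f"missing keys in response: {missing}")
    return data

# Sample response, shaped like the one requested by the format instructions.
response = (
    FENCE + "json\n"
    '{\n\t"bad_string": "Wellcom to Californya!",\n'
    '\t"good_string": "Welcome to California!"\n}\n'
    + FENCE
)

parsed = parse_fenced_json(response, ["bad_string", "good_string"])
print(parsed["good_string"])  # prints: Welcome to California!
```

Checking the keys explicitly gives an early, readable error when the model drifts from the requested schema, instead of a `KeyError` later in the notebook.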

In [29]:
from langchain.llms import OpenAI
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.schema import HumanMessage, SystemMessage, AIMessage
from langchain.prompts.prompt import PromptTemplate


# loads the model.
llm = OpenAI(model_name="text-davinci-003", temperature=0.9)

# How you would like the response to be structured.
# Each description must end with a period.
# Otherwise the description ends up in the JSON text and breaks the JSON format.
response_schemas = [
    ResponseSchema(name="bad_string", description="This is a poorly formatted string."),
    ResponseSchema(name="good_string", description="This is a your string reformatted.")
]

# How you would like to parse your output
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

# check the generated format instructions
format_instructions = output_parser.get_format_instructions()
print("\nformat_instructions")
print(format_instructions)

template = """
You will be given a poorly formatted string from a user. 
Reformat it and make sure all the words are spelled correctly.


{format_instructions}

% USER_INPUT:
{user_input}

YOUR RESPONSE:
"""

prompt_template = PromptTemplate(
    input_variables=['user_input'],
    partial_variables={'format_instructions': format_instructions},
    template=template
)

# Format the user input as a prompt.
# In this notebook, format() did not work well here, so format_prompt() is used instead.
# format_prompt() returns a prompt value object, not a string, so it is converted with to_string().
prompt = prompt_template.format_prompt(user_input="Wellcom to Californya!").to_string()
print("\nprompt")
print(prompt)

# gets the response
response = llm(prompt)
print("\nresponse=")
print(response)

# parses the response into a Python dict
print("\nparsed output=")
output_parser.parse(response)



format_instructions
The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This is a poorly formatted string.
	"good_string": string  // This is a your string reformatted.
}
```

prompt

You will be given a poorly formatted string from a user. 
Reformat it and make sure all the words are spelled correctly.


The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This is a poorly formatted string.
	"good_string": string  // This is a your string reformatted.
}
```

% USER_INPUT:
Wellcom to Californya!

YOUR RESPONSE:


response=
```json
{
	"bad_string": "Wellcom to Californya!",
	"good_string": "Welcome to California!"
}
```


{'bad_string': 'Wellcom to Californya!',
 'good_string': 'Welcome to California!'}

---
# 9. X

<div class="alert alert-block alert-warning"> TODO </div>