### Chat Messages

- **System**: Helpful background context that tells the AI what to do

- **Human**: Messages that are intended to represent the user

- **AI**: Messages that show what the AI responded with

Chat is like text, but each message is tagged with a message type

In [4]:
# Keys
import os

# Read keys from the environment instead of hard-coding secrets in the notebook
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
# SERPAPI_API_KEY is expected to be set in the environment as well


In [2]:
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

chat = ChatOpenAI(temperature=0.7, openai_api_key=OPENAI_API_KEY)

In [3]:
chat([
    SystemMessage(content="You are a nice AI bot that helps a user figure out what to eat in a short sentence."),
    HumanMessage(content="I like tomatoes, what should I eat?")
])

AIMessage(content='You could try a Caprese salad with tomatoes, fresh mozzarella, and basil. Or, you could make a classic tomato sauce and serve it over pasta or use it as a pizza topping.', additional_kwargs={}, example=False)

In [4]:
chat([
    SystemMessage(content="You are a nice AI bot that helps a user figure out where to travel to in a short sentence."),
    HumanMessage(content="I like beaches, what should I travel to?"),
    AIMessage(content="You should go to Nice, France."),
    HumanMessage(content="What else should I do while I'm there?")
])

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 1.0 seconds as it raised RateLimitError: That model is currently overloaded with other requests. You can retry your request, or contact us through our help center at help.openai.com if the error persists. (Please include the request ID 5218ef399997af0f09671e4b1224627d in your message.).


AIMessage(content='While in Nice, you can also explore the old town and its narrow streets, visit the famous Promenade des Anglais, and take a day trip to nearby Cannes or Monaco.', additional_kwargs={}, example=False)

### Documents

An object that holds a piece of text and metadata 

In [5]:
from langchain.schema import Document

In [6]:
Document(page_content="This is my document. It is full of text that I've gathered from other places.",
         metadata={
             "my_document_id": 234234,
             "my_document_source": "The LangChain Papers",
             "my_document_create_time": 1680013019
         })

Document(page_content="This is my document. It is full of text that I've gathered from other places.", metadata={'my_document_id': 234234, 'my_document_source': 'The LangChain Papers', 'my_document_create_time': 1680013019})

# Models - The interface to the AI brains

### Language Model

Text In => Text Out

In [7]:
from langchain.llms import OpenAI

llm = OpenAI(model_name="text-ada-001", openai_api_key=OPENAI_API_KEY)

In [8]:
llm("What day comes after friday?")

'\n\nSaturday'

### Chat Model

Message In => Message Out

In [9]:
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

chat = ChatOpenAI(temperature=1, openai_api_key=OPENAI_API_KEY)

In [10]:
chat([
    SystemMessage(content="You are an unhelpful AI bot that makes jokes at whatever the user says."),
    HumanMessage(content="I would like to go to New York, how should I do this?"),
])

AIMessage(content="Oh, well you could start by putting one foot in front of the other. That's usually a good way to get places. Unless you're walking to New York, in which case you might want to pack some snacks.", additional_kwargs={}, example=False)

### Text Embedding Model

Turn text into a vector (a list of numbers). Mainly used to compare two pieces of text

In [11]:
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)

In [12]:
text = "Hi! It's time for the beach"

In [13]:
text_embedding = embeddings.embed_query(text)
print(f"Your embedding is length {len(text_embedding)}")
print(f"Here's a sample: {text_embedding[:10]}")

Your embedding is length 1536
Here's a sample: [-0.00011466221621958539, -0.0031506523955613375, -0.0007831145194359124, -0.019504327327013016, -0.015125557780265808, 0.031269997358322144, -0.01598675549030304, -0.011741410940885544, 0.0093094352632761, -0.01360936276614666]
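
Comparing two pieces of text usually comes down to comparing their vectors, and cosine similarity is one common way to do that. A minimal sketch, assuming tiny hand-written vectors stand in for real embeddings (the numbers are illustrative, not model output):

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings"; the real ones above have 1536 dimensions.
beach = [0.9, 0.1, 0.0]
ocean = [0.8, 0.2, 0.1]
taxes = [0.0, 0.1, 0.9]

# Related texts should score higher than unrelated ones.
print(cosine_similarity(beach, ocean) > cosine_similarity(beach, taxes))  # True
```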


# Prompts - Text generally used as instructions to the model

### Prompt

What you'll pass to the underlying model

In [14]:
from langchain.llms import OpenAI

llm = OpenAI(model_name="text-davinci-003", openai_api_key=OPENAI_API_KEY)

prompt = """
Today is monday, tomorrow is wednesday.

What is wrong with that statement?
"""

llm(prompt)

'\nThat statement is incorrect; tomorrow is Tuesday.'

### Prompt Template

An object that helps create prompts based on a combination of user input, other non-static information and a fixed template string.

*It's an f-string, just for prompts*

In [15]:
from langchain.llms import OpenAI
from langchain import PromptTemplate


llm = OpenAI(model_name="text-davinci-003", openai_api_key=OPENAI_API_KEY)

template = """
I really want to travel to {location}. What should I do there?

Respond in one short sentence.
"""

prompt = PromptTemplate(
    input_variables=["location"],
    template=template
)

final_prompt = prompt.format(location="Rome")

print(f"Final Prompt: {final_prompt}")
print("----------------------")
print(f"LLM Output: {llm(final_prompt)}")

Final Prompt: 
I really want to travel to Rome. What should I do there?

Respond in one short sentence.

----------------------
LLM Output: Explore the Colosseum, the Pantheon, and the Trevi Fountain!
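
The f-string comparison can be made concrete without LangChain. A minimal sketch using Python's built-in `str.format`; the `build_prompt` helper is a hypothetical name, not part of LangChain:

```python
TEMPLATE = """
I really want to travel to {location}. What should I do there?

Respond in one short sentence.
"""

def build_prompt(template, **variables):
    """Fill the named placeholders in a fixed template string."""
    return template.format(**variables)

final_prompt = build_prompt(TEMPLATE, location="Rome")
print(final_prompt)
```

Unlike a bare f-string, a template can be defined once and filled in later, which is what makes it reusable across inputs.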


### Example Selectors

An easy way to select from a series of examples, so you can dynamically place in-context information into your prompt.
Often used when your task is nuanced or you have a large list of examples.

In [16]:
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.llms import OpenAI

llm = OpenAI(model_name="text-davinci-003", openai_api_key=OPENAI_API_KEY)

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Example Input: {input}\nExample Output: {output}\n"
)

# Examples of locations where nouns are found
examples = [
    {"input": "pirate", "output": "ship"},
    {"input": "pilot", "output": "plane"},
    {"input": "driver", "output": "car"},
    {"input": "tree", "output": "forest"},
    {"input": "bird", "output": "nest"}
]

In [17]:
# SemanticSimilarityExampleSelector will select examples that are similar to your input

example_selector = SemanticSimilarityExampleSelector.from_examples(
    # This is the list of examples available to select from
    examples,

    # This is the embedding class used to produce embeddings which are used to measure 
    # similarity between the input and the examples
    OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY),

    # This is the VectorStore class that is used to store the embeddings and do a 
    # similarity search
    FAISS,

    # This is the number of examples to produce
    k=2
)

In [18]:
similar_prompt = FewShotPromptTemplate(
    # The object that will help select examples
    example_selector=example_selector,

    # Your prompt
    example_prompt=example_prompt,

    # Customizations that will be added to the top and bottom of your prompt
    prefix="Give the location an item is usually found in",
    suffix="Input: {noun}\nOutput:",

    # What inputs your prompt will receive
    input_variables=["noun"]
)

In [19]:
my_noun = "student"

print(similar_prompt.format(noun=my_noun))

Give the location an item is usually found in

Example Input: driver
Example Output: car


Example Input: pilot
Example Output: plane


Input: student
Output:


In [20]:
llm(similar_prompt.format(noun=my_noun))

' classroom'
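
Conceptually, the selector ranks the stored examples by similarity to the input, and the few-shot template stitches the top `k` into the prompt. A rough sketch of that flow, assuming made-up 2-d vectors in place of real embeddings; `select_examples` and `build_few_shot_prompt` are hypothetical helpers, not LangChain functions:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Made-up vectors (roughly "person-ness", "nature-ness") standing in for embeddings.
examples = [
    {"input": "pirate", "output": "ship", "vector": [0.7, 0.3]},
    {"input": "pilot", "output": "plane", "vector": [0.9, 0.1]},
    {"input": "driver", "output": "car", "vector": [1.0, 0.0]},
    {"input": "tree", "output": "forest", "vector": [0.0, 1.0]},
    {"input": "bird", "output": "nest", "vector": [0.2, 0.8]},
]

def select_examples(query_vector, examples, k=2):
    """Return the k examples whose vectors are most similar to the query."""
    ranked = sorted(examples, key=lambda ex: cosine(query_vector, ex["vector"]), reverse=True)
    return ranked[:k]

def build_few_shot_prompt(noun, query_vector, k=2):
    """Join prefix, selected examples, and suffix into one prompt string."""
    chosen = select_examples(query_vector, examples, k)
    parts = ["Give the location an item is usually found in"]
    parts += [f"Example Input: {ex['input']}\nExample Output: {ex['output']}" for ex in chosen]
    parts.append(f"Input: {noun}\nOutput:")
    return "\n\n".join(parts)

# The query vector for "student" is also made up for this illustration.
prompt_text = build_few_shot_prompt("student", [1.0, 0.05])
print(prompt_text)
```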

### Output Parsers

A helpful way to format the output of a model. Usually used for structured outputs.

Two main concepts:

**1. Format Instructions** - An autogenerated prompt that tells the LLM how to format its response based on your desired result.

**2. Parser** - A method that extracts your model's text output into a desired structure (usually JSON)

In [21]:
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.llms import OpenAI

In [22]:
llm = OpenAI(model_name="text-davinci-003", openai_api_key=OPENAI_API_KEY)

In [23]:
# How you'd like your response structured. This is basically a fancy prompt template
response_schemas = [
    ResponseSchema(name="bad_string", description="This is a poorly formatted user input string"),
    ResponseSchema(name="good_string", description="This is your response,a reformatted response")
]

# How you'd like to parse your output
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)


In [24]:
# See the prompt template you created for formatting
format_instructions = output_parser.get_format_instructions()
print(output_parser.get_format_instructions())

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This is a poorly formatted user input string
	"good_string": string  // This is your response,a reformatted response
}
```


In [31]:
template = """
You will be given a poorly formatted string from a user.
Reformat it and make sure all the words are spelled corectly.

{format_instructions}

% USER INPUT:
{user_input}

YOUR RESPONSE:
"""

prompt = PromptTemplate(
    input_variables=["user_input"],
    partial_variables={"format_instructions": format_instructions},
    template=template
)

promptValue = prompt.format(user_input="welcom to californya!")
print(promptValue)


You will be given a poorly formatted string from a user.
Reformat it and make sure all the words are spelled corectly.

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This is a poorly formatted user input string
	"good_string": string  // This is your response,a reformatted response
}
```

% USER INPUT:
welcom to californya!

YOUR RESPONSE:



In [30]:
llm_output = llm(promptValue)
llm_output

'```json\n{\n\t"bad_string": "welcom to californya!"\n\t"good_string": "Welcome to California!"\n}\n```'

In [32]:
type(output_parser.parse(llm_output))

OutputParserException: Got invalid JSON object. Error: Expecting ',' delimiter: line 3 column 2 (char 42)
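
The parse fails because the model left out the comma between the two fields, so the text inside the fence is not valid JSON. A minimal sketch of the parsing step and one way to surface the failure, assuming a hypothetical `parse_fenced_json` helper (this is not LangChain's implementation):

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way so the example holds no literal fence

def parse_fenced_json(text):
    """Extract the body of a json-fenced block and decode it.

    Returns (data, error): data is a dict on success, error is a message on failure.
    """
    match = re.search(FENCE + r"json\s*(.*?)" + FENCE, text, re.DOTALL)
    if match is None:
        return None, "no json code fence found"
    try:
        return json.loads(match.group(1)), None
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"

# The model output from above, with the comma missing between the fields.
bad = FENCE + 'json\n{\n\t"bad_string": "welcom to californya!"\n\t"good_string": "Welcome to California!"\n}\n' + FENCE
# The same output with the comma restored.
good = bad.replace('!"\n\t"good', '!",\n\t"good')

print(parse_fenced_json(bad))
print(parse_fenced_json(good))
```

When parsing fails, one option is to send the error message back to the model and ask it to try again; whether that helps depends on the model.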

# Indexes - Structuring documents so that LLMs can work with them

### Document Loaders

Easy ways to import data from other sources. This overlaps in functionality with OpenAI Plugins, specifically retrieval plugins.

In [33]:
from langchain.document_loaders import HNLoader

In [34]:
loader = HNLoader("https://news.ycombinator.com/item?id=34422627")

In [35]:
data = loader.load()

In [36]:
print(f"Found {len(data)} comments")
print(f"Here's a sample: \n\n{''.join([x.page_content[:150] for x in data[:2]])}")

Found 76 comments
Here's a sample: 

Ozzie_osman 4 months ago  
             | next [–] 

LangChain is awesome. For people not sure what it's doing, large language models (LLMs) are very Ozzie_osman 4 months ago  
             | parent | next [–] 

Also, another library to check out is GPT Index (https://github.com/jerryjliu/gpt_index)


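In [46]:
from langchain.document_loaders import UnstructuredImageLoader

loader = UnstructuredImageLoader("test_2.png", mode="elements")

### Text Splitters

Break a long document into smaller chunks.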

In [1]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [None]:
# This is a long document we can split up
# (assumes the same 'data.txt' file that the later cells load)
with open('data.txt') as f:
    pg_work = f.read()

print(f"You have {len([pg_work])} document")

In [None]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=150,    # maximum characters per chunk
    chunk_overlap=20   # characters shared between neighboring chunks
)

texts = text_splitter.create_documents([pg_work])

In [None]:
print(f"You now have {len(texts)} documents")
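
The two parameters are easier to picture with a stripped-down splitter. A minimal sketch of fixed-size chunking with overlap; `RecursiveCharacterTextSplitter` is more sophisticated (it tries to break on separators such as paragraphs and newlines), so treat this only as an illustration of `chunk_size` and `chunk_overlap`. The `split_with_overlap` name is hypothetical:

```python
def split_with_overlap(text, chunk_size=150, chunk_overlap=20):
    """Cut text into chunks of at most chunk_size characters.

    Each chunk starts chunk_size - chunk_overlap characters after the previous
    one, so consecutive chunks share chunk_overlap characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks

sample = "abcdefghij" * 40  # 400 characters of placeholder text
chunks = split_with_overlap(sample)
print(len(chunks), [len(c) for c in chunks])  # 3 [150, 150, 140]
```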

### Retrievers

An easy way to combine documents with language models.

There are many different types of retrievers; the most widely supported is the VectorStoreRetriever.

In [2]:
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

loader = TextLoader('data.txt')
documents = loader.load()

In [5]:
# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)

# Get embedding engine ready
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)

# Embed your texts
db = FAISS.from_documents(texts, embeddings)

In [6]:
# Init your retriever with the default search settings
retriever = db.as_retriever()

In [7]:
retriever

VectorStoreRetriever(vectorstore=<langchain.vectorstores.faiss.FAISS object at 0x7f93e7af2640>, search_type='similarity', search_kwargs={})

In [8]:
docs = retriever.get_relevant_documents("What is an alien truth?")

In [47]:
print("\n\n".join([x.page_content[:200] for x in docs[:2]]))

Whatever we call it, the attempt to discover alien truths would be a worthwhile undertaking. And curiously enough, that is itself probably an alien truth.

We can only guess, of course. We can't say for sure what forms intelligent life might take. Nor is it my goal here to explore that question, interesting though it is. The point of the idea of alien tr


### Vector Stores

Databases to store vectors. The most popular ones are Pinecone and Weaviate.

Chroma and FAISS are easy to work with locally.

Conceptually, think of them as tables with a column for embeddings and a column for metadata
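
The table analogy can be sketched in plain Python. This toy store is only an illustration under loud assumptions: hand-written 2-d vectors, brute-force search by Euclidean distance, and hypothetical names (`Row`, `ToyVectorStore`, `recipes.txt`). Real vector stores typically add indexing to stay fast at scale:

```python
import math
from dataclasses import dataclass, field

@dataclass
class Row:
    text: str
    embedding: list                      # the "embeddings" column
    metadata: dict = field(default_factory=dict)  # the "metadata" column

class ToyVectorStore:
    """An in-memory list of rows searched by brute force."""

    def __init__(self):
        self.rows = []

    def add(self, text, embedding, metadata=None):
        self.rows.append(Row(text, embedding, metadata or {}))

    def similarity_search(self, query_embedding, k=1):
        """Return the k rows closest to the query by Euclidean distance."""
        ranked = sorted(self.rows, key=lambda row: math.dist(row.embedding, query_embedding))
        return ranked[:k]

store = ToyVectorStore()
store.add("Notes on alien truths", [0.9, 0.1], {"source": "data.txt"})
store.add("A recipe for Coq au Vin", [0.1, 0.9], {"source": "recipes.txt"})

best = store.similarity_search([0.8, 0.2], k=1)[0]
print(best.text, best.metadata)
```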

In [10]:
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

loader = TextLoader('data.txt')
documents = loader.load()

# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)

# Get embedding engine ready
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)


In [11]:
print(f"You have {len(texts)} documents")

You have 6 documents


In [13]:
embedding_list = embeddings.embed_documents([text.page_content for text in texts])

In [14]:
print(f"You have {len(embedding_list)} embeddings")

You have 6 embeddings


# Memory

Helping LLMs remember information.

Memory is a bit of a loose term. It could be as simple as remembering information you've chatted about in the past or more complicated information retrieval.

There are many types of memory.

### Chat Message History

In [15]:
from langchain.memory import ChatMessageHistory
from langchain.chat_models import ChatOpenAI

chat = ChatOpenAI(temperature=0, openai_api_key=OPENAI_API_KEY)

history = ChatMessageHistory()

history.add_ai_message("Hi!")

history.add_user_message("What is the capital of France?")

In [16]:
history.messages

[AIMessage(content='Hi!', additional_kwargs={}, example=False),
 HumanMessage(content='What is the capital of France?', additional_kwargs={}, example=False)]

In [17]:
ai_response = chat(history.messages)
ai_response

AIMessage(content='The capital of France is Paris.', additional_kwargs={}, example=False)
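
The pattern here is that the history object accumulates messages and the full list is passed to the model on each call. A minimal sketch with a stand-in model; `ToyChatHistory` and `fake_chat_model` are hypothetical names, not LangChain classes:

```python
from dataclasses import dataclass

@dataclass
class Message:
    role: str      # "system", "human", or "ai"
    content: str

class ToyChatHistory:
    """Keeps an ordered list of messages for one conversation."""

    def __init__(self):
        self.messages = []

    def add_user_message(self, content):
        self.messages.append(Message("human", content))

    def add_ai_message(self, content):
        self.messages.append(Message("ai", content))

def fake_chat_model(messages):
    """Stand-in for a chat model: reports how many earlier messages it was given."""
    return f"I can see {len(messages)} earlier message(s)."

history = ToyChatHistory()
history.add_ai_message("Hi!")
history.add_user_message("What is the capital of France?")

reply = fake_chat_model(history.messages)
history.add_ai_message(reply)  # store the reply so the next turn sees it
print(len(history.messages), history.messages[-1].role)  # 3 ai
```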

# Chains

Combining different LLM calls and actions automatically

EX: Summary #1, Summary #2, Summary #3 > Final Summary

### 1. Simple Sequential Chains

Easy chains where you can use the output of one LLM call as the input to another. Good for breaking up tasks and keeping the LLM focused

In [18]:
from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chains import SimpleSequentialChain

llm = OpenAI(temperature=1, openai_api_key=OPENAI_API_KEY)

In [19]:
template = """
Your job is to come up with a classic dish from the area that the \
user suggests.

% USER LOCATION
{user_location}

YOUR RESPONSE:
"""

prompt_template = PromptTemplate(
    input_variables=["user_location"],
    template=template
)

# Holds my 'location' chain
location_chain = LLMChain(llm=llm, prompt=prompt_template)

In [20]:
template = """
Given a meal, give a short and simple recipe on how to make that dish at home.
% MEAL
{user_meal}

YOUR RESPONSE:
"""

prompt_template = PromptTemplate(
    input_variables=["user_meal"],
    template=template
)

# Holds my 'meal' chain
meal_chain = LLMChain(llm=llm, prompt=prompt_template)

In [21]:
overall_chain = SimpleSequentialChain(
    chains=[location_chain, meal_chain],
    verbose=True
)

In [22]:
review = overall_chain("I'm in Paris")



> Entering new SimpleSequentialChain chain...
An iconic dish in Paris is Coq au Vin, which is a red wine-braised chicken with mushrooms, lardons and sometimes garlic.

Coq au Vin
Ingredients:
-  4-6 chicken thighs 
- 2 tablespoons olive oil 
- Salt and pepper 
- ½ cup lardons
- 2 cloves garlic, minced 
- 4-6 shallots, quartered 
- 2 tablespoons tomato paste 
- 2 cups chicken stock 
- 2 cups red wine 
- 2 cups mushrooms 
- 2 tablespoons butter 
- 2 tablespoons chopped parsley 

Instructions:
1. Preheat oven to 350ºF
2. Season chicken thighs with olive oil, salt, and pepper
3. Heat a dutch oven or large oven-safe pot over medium-high heat. 
4. Add lardons and cook for 3-4 minutes, stirring occasionally.
5. Add shallots, garlic, and tomato paste to the lardons and cook for an additional minute.
6. Add chicken thighs to the pot and stir to combine, then pour in the chicken stock and wine.
7. Bring to a gentle simmer and add mushrooms.
8. Cover and tr
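
Stripped of the LLM calls, a simple sequential chain is function composition: each step takes one string and returns one string. A minimal sketch, assuming stub functions in place of the two `LLMChain` objects (`location_step`, `meal_step`, and `run_sequential` are hypothetical names):

```python
def location_step(user_location):
    """Stand-in for the 'location' chain: maps a place to a dish."""
    return f"A classic dish for: {user_location}"

def meal_step(user_meal):
    """Stand-in for the 'meal' chain: maps a dish to a recipe."""
    return f"Recipe for [{user_meal}]"

def run_sequential(steps, first_input):
    """Feed each step's single output into the next step."""
    value = first_input
    for step in steps:
        value = step(value)
    return value

result = run_sequential([location_step, meal_step], "I'm in Paris")
print(result)  # Recipe for [A classic dish for: I'm in Paris]
```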

### 2. Summarization Chain

Easily run through numerous long documents and get a summary
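
One common strategy, and the one selected by `chain_type="map_reduce"` in the cell that follows, is to summarize each chunk separately and then summarize the summaries. A minimal sketch of that control flow, assuming a hypothetical `summarize` function that stands in for an LLM call (it just keeps the first few words):

```python
def summarize(text, max_words=5):
    """Stand-in for an LLM summary call: keeps the first few words."""
    return " ".join(text.split()[:max_words])

def map_reduce_summary(chunks):
    # Map: summarize every chunk independently.
    partial = [summarize(chunk) for chunk in chunks]
    # Reduce: summarize the concatenated partial summaries.
    return summarize(" ".join(partial), max_words=8)

chunks = [
    "Aliens would share the truths of mathematics with us.",
    "They would probably share Occam's razor as well.",
]
print(map_reduce_summary(chunks))
```

Because the map step treats chunks independently, it can handle inputs far longer than a single prompt allows, at the cost of one model call per chunk plus the final combine call.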

In [23]:
from langchain.chains.summarize import load_summarize_chain
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

loader = TextLoader('data.txt')
documents = loader.load()

# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)

# There is a lot of complexity hidden in this one line
chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=True)
chain.run(texts)



> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"October 2022

If there were intelligent beings elsewhere in the universe, they'd share certain truths in common with us. The truths of mathematics would be the same, because they're true by definition. Ditto for the truths of physics; the mass of a carbon atom would be the same on their planet. But I think we'd share other truths with aliens besides the truths of math and physics, and that it would be worthwhile to think about what these might be.

For example, I think we'd share the principle that a controlled experiment testing some hypothesis entitles us to have proportionally increased belief in it. It seems fairly likely, too, that it would be true for aliens that one can get better at something by practicing. We'd probably share Occam's razor. There doesn't seem anything specifically human about any

" The proposed concept of ‘alien truth’ suggests that there is more to the concept of God's book than mathematics and it is an important task for philosophers to undertake. AI technology can help determine potential truths that aliens may share with us, such as using Occam's razor. Though research can be productive, there is always the potential that the best guesses will turn out to be wrong. Further research into extraterrestrial life and the knowledge associated with it can be a beneficial venture, and it is possible that this search itself may reveal alien truths."

# Agents

Some applications will require not just a predetermined chain of calls to LLMs/other tools, but potentially an unknown chain that depends on the user's input. In these types of chains, there's an "agent" which has access to a suite of tools. Depending on the user input, the agent can then decide which, if any, of these tools to call.
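
The decision loop can be sketched without any LLM. In this toy version a hard-coded `choose_tool` function plays the part of the agent's reasoning; in a real agent that choice is made by the model. All names here (`calculator`, `word_counter`, `choose_tool`, `run_agent`) are hypothetical:

```python
def calculator(expression):
    """Toy tool: add two integers written as 'a + b'."""
    left, right = expression.split("+")
    return str(int(left) + int(right))

def word_counter(text):
    """Toy tool: count whitespace-separated words."""
    return str(len(text.split()))

TOOLS = {"calculator": calculator, "word_counter": word_counter}

def choose_tool(user_input):
    """Stand-in for the LLM's decision: returns a tool name or None."""
    if "+" in user_input:
        return "calculator"
    if user_input.lower().startswith("count"):
        return "word_counter"
    return None

def run_agent(user_input):
    """Pick a tool based on the input and call it, or answer without one."""
    tool_name = choose_tool(user_input)
    if tool_name is None:
        return "No tool needed."
    return TOOLS[tool_name](user_input)

print(run_agent("2 + 3"))        # 5
print(run_agent("hello there"))  # No tool needed.
```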