# LangChain

_Heavily cribbed from https://github.com/gkamradt/langchain-tutorials/_

## Concepts

> LangChain is a framework for developing apps with language models.

It makes development easier in two ways:

1. __Integration__: Links external data, such as files, other apps, or APIs, with the LLM
2. __Agency__: Allows LLMs to interact with their environment via decision making. Use LLMs to decide which action to take next.

## References

[Tutorials](https://python.langchain.com/v0.1/docs/additional_resources/tutorials/)

[Use Cases](https://python.langchain.com/v0.1/docs/use_cases/)

* Q&A with RAG
* Extracting structured output
* Chatbots
* Tool use and agents
* Query analysis
* Q&A over SQL + CSV
* More

[Tool List](https://python.langchain.com/v0.1/docs/integrations/tools/)

## Components

### 1. Schema: The Building Blocks for working with LLMs

#### 1.1 Text

#### 1.2 Chat Messages

Like text, but with a message type:

* System: Background context
* Human
* AI

```python
%pip install python-dotenv
%pip install langchain

from dotenv import load_dotenv
import os

load_dotenv()

openai_api_key=os.getenv('OPENAI_API_KEY', 'YourAPIKey')

%pip install openai
%pip install langchain-community langchain-core

from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

# This is the language model we'll use. We'll talk about what we're doing in the next section
chat = ChatOpenAI(temperature=.7, openai_api_key=openai_api_key)
```

```python
chat(
    [
        SystemMessage(content="You are a nice AI bot that helps a user figure out what to eat in one short sentence"),
        HumanMessage(content="I like tomatoes, what should I eat?")
    ]
)
```

> AIMessage(content='You could try a caprese salad with fresh tomatoes, mozzarella, and basil.')

You can also pass more chat history, including earlier responses from the AI:

```python
chat(
    [
        SystemMessage(content="You are a nice AI bot that helps a user figure out where to travel in one short sentence"),
        HumanMessage(content="I like the beaches where should I go?"),
        AIMessage(content="You should go to Nice, France"),
        HumanMessage(content="What else should I do when I'm there?")
    ])
```

#### 1.3 Documents

An object that holds text and metadata about the text.

```python
from langchain.schema import Document

Document(page_content="This is my document. It is full of text that I've gathered from other places",
         metadata={
             'my_document_id' : 234234,
             'my_document_source' : "The LangChain Papers",
             'my_document_create_time' : 1680013019
         })
```

### 2. Models

Models are the interface to the AI brains.

* __2.1 Language Model__: Text in --> Text out
* __2.2 Chat Model__: Takes a series of messages --> return message output
* __2.3 Function Calling Model__: Fine-tuned to give structured data output. Useful when making an API call to an external service or doing data extraction.
```python
chat = ChatOpenAI(model='gpt-3.5-turbo-0613', temperature=1, openai_api_key=openai_api_key)

output = chat(messages=
     [
         SystemMessage(content="You are a helpful AI bot"),
         HumanMessage(content="What’s the weather like in Boston right now?")
     ],
     functions=[{
         "name": "get_current_weather",
         "description": "Get the current weather in a given location",
         "parameters": {
             "type": "object",
             "properties": {
                 "location": {
                     "type": "string",
                     "description": "The city and state, e.g. San Francisco, CA"
                 },
                 "unit": {
                     "type": "string",
                     "enum": ["celsius", "fahrenheit"]
                 }
             },
             "required": ["location"]
         }
     }
     ]
)
output
```

```
AIMessage(content='', additional_kwargs={'function_call': {'name': 'get_current_weather', 'arguments': '{\n  "location": "Boston, MA"\n}'}})
```

* __2.4 Text Embedding__: Turns the text into a vector, useful when comparing text.

```python
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
text = "Hi! It's time for the beach"
text_embedding = embeddings.embed_query(text)
```
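To make "comparing text" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embedding vectors. The three-dimensional vectors are made up for illustration; real embeddings returned by `embed_query` are much longer lists of floats.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional "embeddings"; real ones have far more dimensions
beach = [0.9, 0.1, 0.0]
ocean = [0.8, 0.2, 0.1]
taxes = [0.0, 0.1, 0.9]

print(cosine_similarity(beach, ocean))  # close to 1: similar direction
print(cosine_similarity(beach, taxes))  # close to 0: nearly unrelated
```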

### 3. Prompts

#### 3.1 Prompts

What you pass to the underlying model

```python
from langchain.llms import OpenAI

llm = OpenAI(model_name="text-davinci-003", openai_api_key=openai_api_key)

# I like to use three double quotation marks for my prompts because it's easier to read
prompt = """
Today is Monday, tomorrow is Wednesday.

What is wrong with that statement?
"""

print(llm(prompt))
```
 
#### 3.2 Prompt Templates

An object that helps create prompts based on a combination of user input, other non-static information and a fixed template string.

Think of it as an f-string in Python, but for prompts.

Advanced: Check out [LangSmith Hub](https://smith.langchain.com/hub) for many more community prompt templates.

```python
from langchain.llms import OpenAI
from langchain import PromptTemplate

llm = OpenAI(model_name="text-davinci-003", openai_api_key=openai_api_key)

# Notice "location" below, that is a placeholder for another value later
template = """
I really want to travel to {location}. What should I do there?

Respond in one short sentence
"""

prompt = PromptTemplate(
    input_variables=["location"],
    template=template,
)

final_prompt = prompt.format(location='Rome')

print (f"Final Prompt: {final_prompt}")
print ("-----------")
print (f"LLM Output: {llm(final_prompt)}")
```
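The f-string analogy can be sketched in plain Python, without LangChain. The main difference is timing: an f-string is filled in where it is written, while a template is filled in later, so one template can serve many inputs. This is only an illustration of the idea, not how `PromptTemplate` is implemented.

```python
# A plain-Python analogue of a prompt template: a string with a named placeholder
template = "I really want to travel to {location}. What should I do there?"

# An f-string is filled in immediately, at the point where it is written
location = "Rome"
eager = f"I really want to travel to {location}. What should I do there?"

# str.format defers filling until call time, so one template serves many inputs
deferred = [template.format(location=city) for city in ("Rome", "Lima")]

print(eager == deferred[0])
print(deferred[1])
```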

#### 3.3 Example Selectors

An easy way to select from a series of examples so you can dynamically place in-context information into your prompt. Often used when your task is nuanced or you have a large list of examples.

```python
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.llms import OpenAI

llm = OpenAI(model_name="text-davinci-003", openai_api_key=openai_api_key)

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Example Input: {input}\nExample Output: {output}",
)

# Examples of locations that nouns are found
examples = [
    {"input": "pirate", "output": "ship"},
    {"input": "pilot", "output": "plane"},
    {"input": "driver", "output": "car"},
    {"input": "tree", "output": "ground"},
    {"input": "bird", "output": "nest"},
]

# SemanticSimilarityExampleSelector will select examples that are similar to your input by semantic meaning

example_selector = SemanticSimilarityExampleSelector.from_examples(
    # This is the list of examples available to select from.
    examples, 
    
    # This is the embedding class used to produce embeddings which are used to measure semantic similarity.
    OpenAIEmbeddings(openai_api_key=openai_api_key), 
    
    # This is the VectorStore class that is used to store the embeddings and do a similarity search over.
    Chroma, 
    
    # This is the number of examples to produce.
    k=2
)

similar_prompt = FewShotPromptTemplate(
    # The object that will help select examples
    example_selector=example_selector,
    
    # Your prompt
    example_prompt=example_prompt,
    
    # Customizations that will be added to the top and bottom of your prompt
    prefix="Give the location an item is usually found in",
    suffix="Input: {noun}\nOutput:",
    
    # What inputs your prompt will receive
    input_variables=["noun"],
)

# Select a noun!
my_noun = "plant"
# my_noun = "student"

print(similar_prompt.format(noun=my_noun))

```

```
Give the location an item is usually found in

Example Input: tree
Example Output: ground

Example Input: bird
Example Output: nest

Input: plant
Output:
```

```python
llm(similar_prompt.format(noun=my_noun))
```


`pot`


#### 3.4 Output Parsers: Prompt Instructions & String Parsing

A helpful way to format the output of a model. Usually used for structured output. LangChain has a bunch more output parsers listed on their documentation.

1. __Format Instructions__: An autogenerated prompt that tells the LLM how to format its response based on your desired result
2. __Parser__: A method that extracts your model's text output into a desired structure (usually JSON)
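The two halves can be sketched with the standard library alone. The schema, function names, and the fake model reply below are all hypothetical; LangChain's own parsers generate different instructions and handle more edge cases.

```python
import json

# Hypothetical schema: field name -> description
schema = {"city": "The name of the city", "country": "The country the city is in"}

def get_format_instructions(schema):
    """Build prompt text telling the model to answer with a JSON object."""
    fields = "\n".join(f'  "{name}": string  // {desc}' for name, desc in schema.items())
    return "Respond only with a JSON object of this shape:\n{\n" + fields + "\n}"

def parse(text, schema):
    """Extract the outermost {...} span from the model's text and check the expected keys."""
    start, end = text.index("{"), text.rindex("}") + 1
    data = json.loads(text[start:end])
    missing = [key for key in schema if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data

# Stand-in for a model reply; models often wrap the JSON in extra prose
fake_reply = 'Sure! {"city": "Nice", "country": "France"}'

print(get_format_instructions(schema))
print(parse(fake_reply, schema))
```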

#### 3.5 Output Parsers: OpenAI Functions

When OpenAI released function calling, the game changed. This is the recommended method when starting out.

OpenAI trained models specifically for outputting structured data. It became super easy to specify a Pydantic schema and get structured output.
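With function calling, the structured data arrives as a JSON string in the `arguments` field, as in the `get_current_weather` output in section 2.3. A minimal sketch of turning that string into a typed object follows. It uses a standard-library dataclass instead of Pydantic, and `WeatherQuery` and its default unit are assumptions made for illustration.

```python
import json
from dataclasses import dataclass

@dataclass
class WeatherQuery:
    """Hypothetical typed target for the get_current_weather arguments."""
    location: str
    unit: str = "fahrenheit"  # assumed default; the schema only requires location

# Shape of the function_call payload shown in section 2.3
function_call = {
    "name": "get_current_weather",
    "arguments": '{\n  "location": "Boston, MA"\n}',
}

# The arguments arrive as a JSON string, so decode them before use
args = json.loads(function_call["arguments"])
query = WeatherQuery(**args)
print(query)
```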


### 4. Indexes

Indexes are used to structure documents so LLMs can work with them.

#### 4.1 Document Loaders

Allow you to import documents from other sources. E.g., hacker news, wiki, web pages.

```python
from langchain.document_loaders import UnstructuredURLLoader

urls = [
    "http://www.paulgraham.com/",
]

loader = UnstructuredURLLoader(urls=urls)

data = loader.load()

data[0].page_content
```

#### 4.2 Text Splitters

Often times your document is too long (like a book) for your LLM or Vector DB. You need to split it up into chunks. Text splitters help with this.

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

# This is a long document we can split up.
with open('data/PaulGrahamEssays/worked.txt') as f:
    pg_work = f.read()
    
# 1 Document
print (f"You have {len([pg_work])} document")

text_splitter = RecursiveCharacterTextSplitter(
    # Set a really small chunk size, just to show.
    chunk_size = 150,
    chunk_overlap  = 20,
)

# 610 Documents
texts = text_splitter.create_documents([pg_work])
```
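To see what `chunk_size` and `chunk_overlap` mean, here is a naive fixed-window splitter in plain Python. It is a simplification: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and newlines, which this sketch ignores. The function name `split_text` is hypothetical.

```python
def split_text(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters.

    Each window starts chunk_size - chunk_overlap characters after the previous
    one, so neighbours share chunk_overlap characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice.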

#### 4.3 Retrievers

An easy way to combine documents with large language models.

There are many different types of retrievers; the most widely supported is the VectorStoreRetriever.

```python
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

loader = TextLoader('data/PaulGrahamEssays/worked.txt')
documents = loader.load()

# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)

# Get embedding engine ready
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)

# Embed your texts
db = FAISS.from_documents(texts, embeddings)

# Init your retriever. It returns several documents by default; pass search_kwargs={"k": 1} to as_retriever to ask for just 1
retriever = db.as_retriever()

docs = retriever.get_relevant_documents("what types of things did the author want to build?")


print("\n\n".join([x.page_content[:200] for x in docs[:2]]))
```

#### 4.4 VectorStores
Databases to store vectors. The most popular ones are [Pinecone](https://www.pinecone.io/) & [Weaviate](https://weaviate.io/). More examples are listed in OpenAI's [retriever documentation](https://github.com/openai/chatgpt-retrieval-plugin#choosing-a-vector-database). [Chroma](https://www.trychroma.com/) & [FAISS](https://engineering.fb.com/2017/03/29/data-infrastructure/faiss-a-library-for-efficient-similarity-search/) are easy to work with locally.

Conceptually, think of them as tables w/ a column for embeddings (vectors) and a column for metadata.

Example

| Embedding      | Metadata |
| ----------- | ----------- |
| [-0.00015641732898075134, -0.003165106289088726, ...]      | {'date' : '1/2/23'}       |
| [-0.00035465431654651654, 1.4654131651654516546, ...]   | {'date' : '1/3/23'}        |
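The table view can be sketched as a list of rows plus a nearest-neighbour lookup. This is a toy stand-in with made-up two-dimensional vectors and Euclidean distance; real vector stores use much longer vectors and approximate search indexes.

```python
import math

# Each row pairs an embedding with its metadata, like the table above
rows = [
    ([0.9, 0.1], {"date": "1/2/23"}),
    ([0.1, 0.9], {"date": "1/3/23"}),
    ([0.8, 0.3], {"date": "1/4/23"}),
]

def nearest(query, rows, k=1):
    """Return the metadata of the k rows whose embeddings are closest to query."""
    ranked = sorted(rows, key=lambda row: math.dist(query, row[0]))
    return [metadata for _, metadata in ranked[:k]]

print(nearest([1.0, 0.0], rows, k=2))
```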

### 5. Memory

Helping LLMs remember information.

Memory is a bit of a loose term. It could be as simple as remembering information you've chatted about in the past or more complicated information retrieval.

There are many types of memory, explore [the documentation](https://python.langchain.com/en/latest/modules/memory/how_to_guides.html) to see which one fits your use case.

#### 5.1 Chat Message History
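
At its simplest, chat message history is an ordered list of messages that gets rendered back into the prompt on every turn. The class below is a hypothetical minimal buffer written for illustration, not LangChain's implementation.

```python
class ChatHistory:
    """Hypothetical minimal message buffer; not LangChain's implementation."""

    def __init__(self):
        self.messages = []  # (role, content) tuples, oldest first

    def add(self, role, content):
        self.messages.append((role, content))

    def as_prompt(self, last_n=None):
        """Render the history as text, optionally keeping only the last_n messages."""
        start = 0 if last_n is None else max(0, len(self.messages) - last_n)
        return "\n".join(f"{role}: {content}" for role, content in self.messages[start:])

history = ChatHistory()
history.add("Human", "Hi, I like tomatoes.")
history.add("AI", "Try a caprese salad.")
history.add("Human", "What did I say I like?")

print(history.as_prompt())
```

Trimming with `last_n` is one simple way to keep the prompt within the model's context window as the conversation grows.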

### 6. Chains

Combining different LLM calls and actions automatically.

Ex: Summary #1, Summary #2, Summary #3 > Final Summary

Check out [this video](https://www.youtube.com/watch?v=f9_BWhCI4Zo&t=2s) explaining different summarization chain types

There are [many applications of chains](https://python.langchain.com/en/latest/modules/chains/how_to_guides.html); search them to see which are best for your use case.

#### 6.1 Simple Sequential Chains

Easy chains where you can use the output of an LLM as an input into another. Good for breaking up tasks (and keeping your LLM focused)


```python
from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chains import SimpleSequentialChain

llm = OpenAI(temperature=1, openai_api_key=openai_api_key)

template = """Your job is to come up with a classic dish from the area that the user suggests.
% USER LOCATION
{user_location}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["user_location"], template=template)

# Holds my 'location' chain
location_chain = LLMChain(llm=llm, prompt=prompt_template)

template = """Given a meal, give a short and simple recipe on how to make that dish at home.
% MEAL
{user_meal}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["user_meal"], template=template)

# Holds my 'meal' chain
meal_chain = LLMChain(llm=llm, prompt=prompt_template)

overall_chain = SimpleSequentialChain(chains=[location_chain, meal_chain], verbose=True)

review = overall_chain.run("Rome")
```

```
> Entering new SimpleSequentialChain chain...

A classic dish from Rome is Spaghetti alla Carbonara, featuring egg, Parmesan cheese, black pepper, and pancetta or guanciale.

Ingredients:
- 8oz spaghetti 
- 4 tablespoons olive oil
- 4oz diced pancetta or guanciale
- 2 cloves garlic, minced
- 2 eggs, lightly beaten
- 2 tablespoons parsley, chopped 
- ½ cup grated Parmesan 
- Salt and black pepper to taste

Instructions:
1. Bring a pot of salted water to a boil and add the spaghetti. Cook according to package directions. 
2. Meanwhile, add the olive oil to a large skillet over medium-high heat. Add the diced pancetta and garlic, and cook until pancetta is browned and garlic is fragrant.
3. In a medium bowl, whisk together the eggs, parsley, Parmesan, and salt and pepper.
4. Drain the cooked spaghetti and add it to the skillet with the pancetta and garlic. Remove from heat and pour the egg mixture over the spaghetti, stirring to combine. 
5. Serve the spaghetti alla carbonara with additional Parmesan cheese and black pepper.

> Finished chain.
```
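Stripped of the LLM calls, a simple sequential chain is function composition: each step's output becomes the next step's input. In this sketch the two steps are plain functions standing in for the location and meal chains; `run_sequential`, `pick_dish`, and `write_recipe` are hypothetical names.

```python
def run_sequential(steps, first_input):
    """Feed first_input to the first step, then pass each output to the next step."""
    value = first_input
    for step in steps:
        value = step(value)
    return value

# Stand-ins for the two LLM calls; a real chain would call a model here
def pick_dish(location):
    return {"Rome": "Spaghetti alla Carbonara"}.get(location, "a local specialty")

def write_recipe(dish):
    return f"Recipe for {dish}: ..."

print(run_sequential([pick_dish, write_recipe], "Rome"))
```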

#### 6.2 Summarization Chain

Easily run through long or numerous documents and get a summary. Check out [this video](https://www.youtube.com/watch?v=f9_BWhCI4Zo) for other chain types besides map-reduce.

```python
from langchain.chains.summarize import load_summarize_chain
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

loader = TextLoader('data/PaulGrahamEssays/disc.txt')
documents = loader.load()

# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)

# There is a lot of complexity hidden in this one line. I encourage you to check out the video above for more detail
chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=True)
chain.run(texts)
```
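The map-reduce pattern behind that one line can be sketched as follows: summarize each chunk on its own (map), then summarize the combined summaries (reduce). The summarizer here is a toy that keeps the first sentence, standing in for an LLM call, and all names are hypothetical.

```python
calls = []

def fake_summarize(text):
    """Stand-in for an LLM summary call: keep only the first sentence."""
    calls.append(text)
    return text.split(".")[0] + "."

def map_reduce_summary(chunks, summarize):
    # Map: summarize every chunk independently
    partial = [summarize(chunk) for chunk in chunks]
    # Reduce: summarize the concatenated partial summaries into one final summary
    return summarize(" ".join(partial))

chunks = [
    "Apples are red. They grow on trees.",
    "Boats float. They carry cargo.",
    "Cats nap. They also purr.",
]
final = map_reduce_summary(chunks, fake_summarize)
print(final)
print(len(calls))  # one call per chunk, plus one for the final summary
```

Because the map step treats chunks independently, those calls can run in parallel, which is one reason this chain type suits long documents.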

### 7. Agents

Official LangChain Documentation describes agents:

> Some applications will require not just a predetermined chain of calls to LLMs/other tools, but potentially an **unknown chain** that depends on the user's input. In these types of chains, there is a “agent” which has access to a suite of tools. Depending on the user input, the agent can then **decide which, if any, of these tools to call**.


Basically, you use the LLM not just for text output, but also for decision making. The coolness and power of this functionality can't be overstated.

#### 7.1 Agents

The language model that drives decision making.

Takes an input and returns a response corresponding to an action to take along with an action input.

You can see different types of agents (which are better for different use cases) [here](https://python.langchain.com/en/latest/modules/agents/agents/agent_types.html).

#### 7.2 Tools

The capability of an agent. An abstraction on top of a function that makes it easy for LLMs to interact with it. E.g. Google Search.

#### 7.3 Toolkit

A group of tools that your agent can select from.
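
How the three pieces fit together can be sketched without an LLM. Below, `decide` is a hard-coded stand-in for the model's decision, and both tools are toy functions; every name here is hypothetical and the real agent loop in LangChain is more involved.

```python
# Tools: plain functions the agent is allowed to call
def calculator(expression):
    """Toy tool: handles 'a + b' or 'a * b' with integers."""
    a, op, b = expression.split()
    return str(int(a) + int(b)) if op == "+" else str(int(a) * int(b))

def echo(text):
    """Toy tool: returns its input unchanged."""
    return text

# Toolkit: tool name -> (description the model would read, function)
toolkit = {
    "Calculator": ("adds or multiplies two integers, e.g. '2 + 3'", calculator),
    "Echo": ("returns its input unchanged", echo),
}

def decide(user_input):
    """Stand-in for the LLM's decision: return (tool name, tool input)."""
    if any(ch.isdigit() for ch in user_input):
        return "Calculator", user_input
    return "Echo", user_input

def run_agent(user_input):
    tool_name, tool_input = decide(user_input)
    _, tool = toolkit[tool_name]
    return tool_name, tool(tool_input)

print(run_agent("2 + 3"))
print(run_agent("hello"))
```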

## LangChain Use Cases

### Main Use Cases

1. __Summarization__
2. __Document Q&A__
3. __Extraction__: Pull structured data out of a body of text or query
4. __Evaluation__: Understand the quality of output from your application
5. __Query Data__: Pull data from DBs or other tabular sources
6. __Code Understanding__
7. __Interact with APIs__
8. __Chatbots__
9. __Agents__

See the [LangChain Project Gallery](https://github.com/gkamradt/langchain-tutorials) for examples of these use cases.



### 1. Summarization

You can pass in short text via a prompt:

```python
from langchain.llms import OpenAI
from langchain import PromptTemplate

# Note, the default model is already 'text-davinci-003' but I call it out here explicitly so you know where to change it later if you want
llm = OpenAI(temperature=0, model_name='text-davinci-003', openai_api_key=openai_api_key)

# Create our template
template = """
%INSTRUCTIONS:
Please summarize the following piece of text.
Respond in a manner that a 5 year old would understand.

%TEXT:
{text}
"""

# Create a LangChain prompt template that we can insert values to later
prompt = PromptTemplate(
    input_variables=["text"],
    template=template,
)

final_prompt = prompt.format(text="Some text to summarize")

output = llm(final_prompt)
print (output)
```

You can also split up longer text into documents, and then summarize all of them.

```python
from langchain.llms import OpenAI
from langchain.chains.summarize import load_summarize_chain
from langchain.text_splitter import RecursiveCharacterTextSplitter

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

with open('data/PaulGrahamEssays/good.txt', 'r') as file:
    text = file.read()
    
# Note: Other text splitters exist.
text_splitter = RecursiveCharacterTextSplitter(separators=["\n\n", "\n"], chunk_size=5000, chunk_overlap=350)
docs = text_splitter.create_documents([text])

# Get your chain ready to use
chain = load_summarize_chain(llm=llm, chain_type='map_reduce') # verbose=True optional to see what is getting sent to the LLM

# Use it. This will run through the 4 documents, summarize the chunks, then get a summary of the summary.
output = chain.run(docs)
print (output)
```

### 2. Document Q&A

In order to use LLMs for question and answer we must:

1. Pass the LLM relevant context it needs to answer a question
2. Pass it our question that we want answered

Simplified, this process looks like this "llm(your context + your question) = your answer"

```python
from langchain.llms import OpenAI

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

context = """
Rachel is 30 years old
Bob is 45 years old
Kevin is 65 years old
"""

question = "Who is under 40 years old?"

output = llm(context + question)

# I strip the text to remove the leading and trailing whitespace
print (output.strip())
```

The hard part comes in when you need to be selective about *which* data you put in your context. This field of study is called "[document retrieval](https://python.langchain.com/en/latest/modules/indexes/retrievers.html)" and is tightly coupled with AI Memory.

#### 2.1 Using Embeddings

The high-level VectorStore process:

* Split your text
* Embed the chunks
* Put embeddings in a DB
* Query them

For a full video on this check out [How To Question A Book](https://www.youtube.com/watch?v=h0DHDp1FbmQ)

The goal is to select relevant chunks of our long text, but which chunks do we pull? The most popular method is to pull similar texts based off comparing vector embeddings.

```python
from langchain import OpenAI

# The vectorstore we'll be using
from langchain.vectorstores import FAISS

# The LangChain component we'll use to get the documents
from langchain.chains import RetrievalQA

# The easy document loader for text
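from langchain.document_loaders import TextLoader

# The splitter used below to break the document into chunks
from langchain.text_splitter import RecursiveCharacterTextSplitter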

# The embedding engine that will convert our text to vectors
from langchain.embeddings.openai import OpenAIEmbeddings

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

loader = TextLoader('data/PaulGrahamEssays/worked.txt')
doc = loader.load()

# Split the doc into smaller pieces
text_splitter = RecursiveCharacterTextSplitter(chunk_size=3000, chunk_overlap=400)
docs = text_splitter.split_documents(doc)

num_total_characters = sum([len(x.page_content) for x in docs])

# Get your embeddings engine ready
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)

# Embed your documents and combine with the raw text in a pseudo db. Note: This will make an API call to OpenAI
docsearch = FAISS.from_documents(docs, embeddings)

# Create retrieval engine
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=docsearch.as_retriever())

query = "What does the author describe as good work?"
qa.run(query)
```

If you wanted to go further, you could hook up the data to a cloud Vector DB.


### 3. Extraction

Parse some data from a piece of text. Often used to structure the data. This could be extracting data from a document to insert into a DB, or extract params from a user request to make an API call.

A popular library for extraction is [Kor](https://eyurtsev.github.io/kor/).

#### 3.1 Simple Extraction

```python
# To help construct our Chat Messages
from langchain.schema import HumanMessage
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate

# We will be using a chat model, defaults to gpt-3.5-turbo
from langchain.chat_models import ChatOpenAI

# To parse outputs and get structured data back
from langchain.output_parsers import StructuredOutputParser, ResponseSchema

chat_model = ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo', openai_api_key=openai_api_key)

instructions = """
You will be given a sentence with fruit names, extract those fruit names and assign an emoji to them
Return the fruit name and emojis in a python dictionary
"""

fruit_names = """
Apple, Pear, this is an kiwi
"""
# Make your prompt which combines the instructions w/ the fruit names
prompt = (instructions + fruit_names)

# Call the LLM
output = chat_model([HumanMessage(content=prompt)])

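# Parse the model's text into a dict. ast.literal_eval only accepts Python literals,
# so it is safer than eval on model output
import ast
output_dict = ast.literal_eval(output.content)

# > {'Apple': '🍎', 'Pear': '🍐', 'kiwi': '🥝'}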

```

#### 3.2 LangChain's Response Schema

LangChain's response schema does two things for us:

1. Autogenerates a prompt with bona fide format instructions. This is great because I don't need to worry about the prompt engineering side; I'll leave that up to LangChain!
2. Reads the output from the LLM and turns it into a proper Python object for me.

```python
# The schema I want out
response_schemas = [
    ResponseSchema(name="artist", description="The name of the musical artist"),
    ResponseSchema(name="song", description="The name of the song that the artist plays")
]

# The parser that will look for the LLM output in my schema and return it back to me
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

format_instructions = output_parser.get_format_instructions()

# The prompt template that brings it all together
# Note: This is a different prompt template than before because we are using a Chat Model

prompt = ChatPromptTemplate(
    messages=[
        HumanMessagePromptTemplate.from_template("Given a command from the user, extract the artist and song names \n \
                                                    {format_instructions}\n{user_prompt}")  
    ],
    input_variables=["user_prompt"],
    partial_variables={"format_instructions": format_instructions}
)

fruit_query = prompt.format_prompt(user_prompt="I really like So Young by Portugal. The Man")
print (fruit_query.messages[0].content)

fruit_output = chat_model(fruit_query.to_messages())
output = output_parser.parse(fruit_output.content)
```

> {'artist': 'Portugal. The Man', 'song': 'So Young'}



### 4. Evaluation

Evaluation is the process of doing quality checks on the output of your applications. Normal, deterministic code has tests we can run, but judging the output of LLMs is more difficult because of the unpredictability and variability of natural language. LangChain provides tools that aid us in this journey.

E.g. Run quality check on summarization or Q&A pipeline answers.

```python
# Embeddings, store, and retrieval
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA

# Model and doc loader
from langchain import OpenAI
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Eval!
from langchain.evaluation.qa import QAEvalChain

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

# Our long essay from before
loader = TextLoader('data/PaulGrahamEssays/worked.txt')
doc = loader.load()

print (f"You have {len(doc)} document")
print (f"You have {len(doc[0].page_content)} characters in that document")

text_splitter = RecursiveCharacterTextSplitter(chunk_size=3000, chunk_overlap=400)
docs = text_splitter.split_documents(doc)

# Embeddings and docstore
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
docsearch = FAISS.from_documents(docs, embeddings)

# input_key is the dict key below
chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=docsearch.as_retriever(), input_key="question")

question_answers = [
    {'question' : "Which company sold the microcomputer kit that his friend built himself?", 'answer' : 'Healthkit'},
    {'question' : "What was the small city he talked about in the city that is the financial capital of USA?", 'answer' : 'Yorkville, NY'}
]

predictions = chain.apply(question_answers)
predictions

# > [{'question': 'Which company sold the microcomputer kit that his friend built himself?',
#   'answer': 'Healthkit',
#   'result': ' The microcomputer kit was sold by Heathkit.'},
#  {'question': 'What was the small city he talked about in the city that is the financial capital of USA?',
#   'answer': 'Yorkville, NY',
#   'result': ' The small city he talked about is New York City, which is the financial capital of the United States.'}]

# Now let's have the LLM grade itself

# Start your eval chain
eval_chain = QAEvalChain.from_llm(llm)

# Have it grade itself. The code below helps the eval_chain know where the different parts are
graded_outputs = eval_chain.evaluate(question_answers,
                                     predictions,
                                     question_key="question",
                                     prediction_key="result",
                                     answer_key='answer')

graded_outputs

# > [{'text': ' CORRECT'}, {'text': ' INCORRECT'}]
```
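For contrast, here is a non-LLM baseline grader built on fuzzy string matching. It is not how `QAEvalChain` works (that chain asks a model to grade); it is a sketch showing why exact matching is too strict for natural-language answers. The function names and the 0.8 cutoff are assumptions.

```python
import difflib
import string

def tokens(text):
    """Lowercase words with surrounding punctuation stripped."""
    return [word.strip(string.punctuation).lower() for word in text.split()]

def loosely_matches(expected, prediction, cutoff=0.8):
    """True if every word of the expected answer has a near match in the prediction."""
    predicted = tokens(prediction)
    return all(
        difflib.get_close_matches(word, predicted, n=1, cutoff=cutoff)
        for word in tokens(expected)
    )

# (expected answer, model result) pairs taken from the predictions above
examples = [
    ("Healthkit", " The microcomputer kit was sold by Heathkit."),
    ("Yorkville, NY", " The small city he talked about is New York City, "
                      "which is the financial capital of the United States."),
]

grades = ["CORRECT" if loosely_matches(a, p) else "INCORRECT" for a, p in examples]
print(grades)
```

A word-level heuristic like this tolerates small spelling differences but cannot judge paraphrases, which is where an LLM grader earns its keep.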

### 5. Querying Data

For further reading check out "Agents + Tabular Data" ([Pandas](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/pandas.html), [SQL](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/sql_database.html), [CSV](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/csv.html))

```python
from langchain import OpenAI, SQLDatabase, SQLDatabaseChain

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

sqlite_db_path = 'data/San_Francisco_Trees.db'
db = SQLDatabase.from_uri(f"sqlite:///{sqlite_db_path}")

db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)

db_chain.run("How many Species of trees are there in San Francisco?")

# > Entering new SQLDatabaseChain chain...
# How many Species of trees are there in San Francisco?
# SQLQuery:SELECT COUNT(DISTINCT "qSpecies") FROM "SFTrees";
# SQLResult: [(578,)]
# Answer:There are 578 Species of trees in San Francisco.
# > Finished chain.
```

There are quite a few steps going on here:

1. Find which table to use
2. Find which column to use
3. Construct the correct SQL
4. Execute the query
5. Get the result
6. Return the natural language response back
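
The steps above can be walked through by hand with the standard-library `sqlite3` module. This sketch uses a tiny in-memory table instead of the real trees database, and hard-codes the SQL and the answer phrasing that the LLM would normally produce. The `TreeID` column and the sample rows are made up.

```python
import sqlite3

# In-memory stand-in for the trees database; TreeID and the rows are made up
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "SFTrees" ("TreeID" INTEGER, "qSpecies" TEXT)')
conn.executemany(
    'INSERT INTO "SFTrees" VALUES (?, ?)',
    [(1, "Oak"), (2, "Oak"), (3, "Maple")],
)

# Steps 1-2: discover the tables and columns the model gets to choose from
tables = [row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")]
columns = [row[1] for row in conn.execute('PRAGMA table_info("SFTrees")')]

# Step 3: in the real chain the LLM writes this SQL; here it is hard-coded
sql = 'SELECT COUNT(DISTINCT "qSpecies") FROM "SFTrees"'

# Steps 4-5: execute the query and read the result
(count,) = conn.execute(sql).fetchone()

# Step 6: phrase the result in natural language (the LLM does this in the real chain)
answer = f"There are {count} species of trees in the sample table."
print(tables, columns)
print(answer)
conn.close()
```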

### 6. Code Understanding

Code files can be loaded as documents, just like any other text, and then queried.

```python
# Helper to read local files
import os

# Vector Support
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings

# Model and chain
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# Text splitters
from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders import TextLoader

llm = ChatOpenAI(model_name='gpt-3.5-turbo', openai_api_key=openai_api_key)

embeddings = OpenAIEmbeddings(disallowed_special=(), openai_api_key=openai_api_key)

root_dir = 'data/thefuzz'
docs = []

# Load each file as a doc...

# Go through each folder
for dirpath, dirnames, filenames in os.walk(root_dir):
    
    # Go through each file
    for file in filenames:
        try: 
            # Load up the file as a doc and split
            loader = TextLoader(os.path.join(dirpath, file), encoding='utf-8')
            docs.extend(loader.load_and_split())
        except Exception as e: 
            pass
        
# Embed and store in a docstore
docsearch = FAISS.from_documents(docs, embeddings)

# Get our retriever ready
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=docsearch.as_retriever())

query = "What function do I use if I want to find the most similar item in a list of items?"
output = qa.run(query)

# > You can use the `process.extractOne()` function from `thefuzz` package to find the most similar item in a list of items. Here's an example:

query = "Can you write the code to use the process.extractOne() function? Only respond with code. No other text or explanation"
output = qa.run(query)

# > code...
```

### 7. Interacting With APIs

This is closely related to Agents and Plugins.

LangChain's APIChain can read API documentation and understand which endpoint it needs to call.

```python
from langchain.chains import APIChain
from langchain.llms import OpenAI

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)


api_docs = """

BASE URL: https://restcountries.com/

API Documentation:

The API endpoint /v3.1/name/{name} Used to find information about a country. All URL parameters are listed below:
    - name: Name of country - Ex: italy, france
    
The API endpoint /v3.1/currency/{currency} Used to find information about a region. All URL parameters are listed below:
    - currency: 3 letter currency. Example: USD, COP
    
Woo! This is my documentation
"""

chain_new = APIChain.from_llm_and_api_docs(llm, api_docs, verbose=True)

chain_new.run('Can you tell me information about france?')

# > Entering new APIChain chain...
#  https://restcountries.com/v3.1/name/france
# > ...
# > France is an officially-assigned, ...
```
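The URL in that trace is the product of two decisions: which endpoint to use and what value to put in its placeholder. In `APIChain` the LLM makes both from the documentation text. The sketch below hard-codes the endpoint table and only shows the URL assembly; `ENDPOINTS` and `build_url` are hypothetical names.

```python
from urllib.parse import quote, urljoin

BASE_URL = "https://restcountries.com/"

# Endpoint templates copied from the API documentation string above
ENDPOINTS = {
    "name": "/v3.1/name/{name}",
    "currency": "/v3.1/currency/{currency}",
}

def build_url(endpoint, value):
    """Fill the endpoint's placeholder with a URL-escaped value and join it to the base URL."""
    path = ENDPOINTS[endpoint].format(**{endpoint: quote(value)})
    return urljoin(BASE_URL, path)

print(build_url("name", "france"))
print(build_url("currency", "USD"))
```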

### 8. Chatbots

Again, memory becomes important. There are a ton of different [types of memory](https://python.langchain.com/en/latest/modules/memory/how_to_guides.html); tinker to see which is best for you.

```python
from langchain.llms import OpenAI
from langchain import LLMChain
from langchain.prompts.prompt import PromptTemplate

# Chat specific components
from langchain.memory import ConversationBufferMemory

template = """
You are a chatbot that is unhelpful.
Your goal is to not help the user but only make jokes.
Take what the user is saying and make a joke out of it

{chat_history}
Human: {human_input}
Chatbot:"""

prompt = PromptTemplate(
    input_variables=["chat_history", "human_input"], 
    template=template
)
memory = ConversationBufferMemory(memory_key="chat_history")

llm_chain = LLMChain(
    llm=OpenAI(openai_api_key=openai_api_key), 
    prompt=prompt, 
    verbose=True, 
    memory=memory
)

llm_chain.predict(human_input="Is a pear a fruit or vegetable?")

> ...

llm_chain.predict(human_input="What was one of the fruits I first asked you about?")
```

### 9. Agents

Agents are the decision makers that can look at data, reason about what the next action should be, and execute that action for you via tools.

Examples of advanced uses of agents appear in [BabyAGI](https://github.com/yoheinakajima/babyagi) and [AutoGPT](https://github.com/Significant-Gravitas/Auto-GPT).

Examples of what you can do with AutoGPT:

1. Reddit Marketing Agent

* This agent reads comments on Reddit.
* It looks for people asking about your product.
* It then automatically responds to them.

2. YouTube Content Repurposing Agent

* This agent subscribes to your YouTube channel.
* When you post a new video, it transcribes it.
* It uses AI to write a search engine optimized blog post.
* Then, it publishes this blog post to your Medium account.

```python
# Helpers
import os
import json

from langchain.llms import OpenAI

# Agent imports
from langchain.agents import load_tools
from langchain.agents import initialize_agent

# Tool imports
from langchain.agents import Tool
from langchain.utilities import GoogleSearchAPIWrapper
from langchain.utilities import TextRequestsWrapper

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

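# Read the Google credentials from the environment (the variable names are assumptions)
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY", "YourGoogleAPIKey")
GOOGLE_CSE_ID = os.getenv("GOOGLE_CSE_ID", "YourGoogleCSEID")

search = GoogleSearchAPIWrapper(google_api_key=GOOGLE_API_KEY, google_cse_id=GOOGLE_CSE_ID)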

requests = TextRequestsWrapper()

toolkit = [
    Tool(
        name = "Search",
        func=search.run,
        description="useful for when you need to search google to answer questions about current events"
    ),
    Tool(
        name = "Requests",
        func=requests.get,
        description="Useful for when you need to make a request to a URL"
    ),
]

# Create your agent by giving it 1. tools, 2. the LLM, and 3. What kind of agent it should be
agent = initialize_agent(toolkit, llm, agent="zero-shot-react-description", verbose=True, return_intermediate_steps=True)

response = agent({"input":"What is the capital of canada?"})
response['output']

> > Entering new AgentExecutor chain...
 I need to find out what the capital of Canada is.
Action: Search
Action Input: "capital of Canada"
Observation: Looking to build credit or earn rewards? Compare our rewards, Guaranteed secured and other Guaranteed credit cards. Canada's capital is Ottawa and its three largest metropolitan areas are Toronto, Montreal, and Vancouver. Canada. A vertical triband design (red, white, red) ... Browse available job openings at Capital One - CA. ... Together, we will build one of Canada's leading information-based technology companies – join us, ... Ottawa is the capital city of Canada. It is located in the southern portion of the province of Ontario, at the confluence of the Ottawa River and the Rideau ... Shopify Capital offers small business funding in the form of merchant cash advances to eligible merchants in Canada. If you live in Canada and need ... Download Capital One Canada and enjoy it on your iPhone, iPad and iPod touch. ... Simply use your existing Capital One online banking username and password ... A leader in the alternative asset space, TPG was built for a distinctive approach, managing assets through a principled focus on innovation. We're Canada's largest credit union by membership because we prioritize people, not profits. Let's build the right plan to reach your financial goals, together. The national capital is Ottawa, Canada's fourth largest city. It lies some 250 miles (400 km) northeast of Toronto and 125 miles (200 km) west of Montreal, ... Finding Value Across the Capital Structure: Limited Recourse Capital Notes. Limited Recourse Capital Notes are an evolving segment of the Canadian fixed-income ...
Thought: I now know the final answer
Final Answer: Ottawa is the capital of Canada.

response = agent({"input":"Tell me what the comments are about on this webpage https://news.ycombinator.com/item?id=34425779"})
response['output']

> > Entering new AgentExecutor chain...
 I need to find out what the comments are about
Action: Search
Action Input: "comments on https://news.ycombinator.com/item?id=34425779"
Observation: About a month after we started Y Combinator we came up with the phrase that ... Action Input: "comments on https://news.ycombinator.com/item?id=34425779" .
Thought: I now know the comments are about Y Combinator
Final Answer: The comments on the webpage are about Y Combinator.
```

# Conceptual Guide

From [official docs](https://python.langchain.com/v0.2/docs/concepts/).

## LCEL

LangChain Expression Language: A declarative way to chain LangChain components.

## Runnable interface

Many LangChain components implement `Runnable`:

* `stream`: Stream back chunks
* `invoke`: Call the chain on an input
* `batch`: Call the chain on a list of inputs

(These also include async versions)
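
To make the three methods and the chaining idea concrete, here is a minimal pure-Python sketch. `SimpleRunnable` is a hypothetical toy class written for these notes, not LangChain's actual `Runnable`; it only mimics the shape of `invoke`, `batch`, `stream`, and `|` composition.

```python
from typing import Any, Callable, Iterator, List


class SimpleRunnable:
    """Toy stand-in for the Runnable idea; not LangChain's actual class."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        # Call the wrapped function on a single input.
        return self.func(value)

    def batch(self, values: List[Any]) -> List[Any]:
        # Call the wrapped function on each input in a list.
        return [self.invoke(v) for v in values]

    def stream(self, value: Any) -> Iterator[Any]:
        # Yield the output in chunks (here: word by word for strings).
        result = self.invoke(value)
        if isinstance(result, str):
            yield from result.split()
        else:
            yield result

    def __or__(self, other: "SimpleRunnable") -> "SimpleRunnable":
        # `a | b` builds a new runnable that feeds a's output into b.
        return SimpleRunnable(lambda value: other.invoke(self.invoke(value)))


make_prompt = SimpleRunnable(lambda topic: f"Tell me a joke about {topic}")
shout = SimpleRunnable(lambda text: text.upper())
chain = make_prompt | shout

print(chain.invoke("cats"))
print(chain.batch(["cats", "dogs"]))
print(list(chain.stream("cats")))
```

The `|` operator is what makes the composition declarative: the chain is described as a pipeline of steps, and the same three methods work on the composed object.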

## Components

### 1. Chat models

Language models that use a sequence of messages as inputs and return chat messages as outputs.

### 2. LLMs

Language models that take a string as input and return a string. These are generally older models, and newer models are generally chat models.

### 3. Messages

Some language models take a list of messages as input and return a message.

### 4. Prompt templates

Helps translate user input and parameters into an instruction for the language model.

Input is a dictionary, where each key is a variable for the prompt template to fill in.

Output is a PromptValue.

```python

from langchain_core.prompts import ChatPromptTemplate, PromptTemplate

# String PromptTemplate

prompt_template = PromptTemplate.from_template("Tell me a joke about {topic}")

prompt_template.invoke({"topic": "cats"})

# ChatPromptTemplate
# Used to format a list of messages.
prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    ("user", "Tell me a joke about {topic}")
])

prompt_template.invoke({"topic": "cats"})
```

### 5. Example Selectors

Lets you select and format examples to pass into prompts.

### 6. Output Parsers

Takes the output from a model and formats it.

E.g.,

* JSON
* XML
* CSV
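
As a rough illustration of what a JSON output parser does, the sketch below pulls a JSON object out of raw model text. The function name `parse_json_output` and the sample string are made up for this example; LangChain ships its own parser classes, which this does not reproduce.

```python
import json
import re
from typing import Any


def parse_json_output(text: str) -> Any:
    """Extract and parse the JSON object embedded in a model's raw text output."""
    # Models often wrap JSON in extra prose, so search from the first "{" to the last "}".
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in model output")
    return json.loads(match.group(0))


raw_output = 'Sure! Here is the data:\n{"dish": "caprese salad", "vegetarian": true}\nEnjoy.'
parsed = parse_json_output(raw_output)
print(parsed["dish"])
```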

### 7. Chat History

Can wrap an arbitrary chain to keep track of inputs and outputs of the underlying chain, and append them as messages to a message database.

They can then be loaded and passed into the chain as part of the input.
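
A minimal sketch of the wrapping idea, using only the standard library. `WithHistory`, `echo_chain`, and the in-memory dictionary standing in for a message database are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)


def echo_chain(messages: List[Message]) -> str:
    """Stand-in for a chain or model: reports how many earlier messages it saw."""
    history_len = len(messages) - 1
    return f"Seen {history_len} earlier message(s); you said: {messages[-1][1]}"


class WithHistory:
    """Wraps a chain, storing each exchange per session and replaying it on the next call."""

    def __init__(self, chain: Callable[[List[Message]], str]) -> None:
        self.chain = chain
        self.store: Dict[str, List[Message]] = {}  # in-memory "message database"

    def invoke(self, user_input: str, session_id: str) -> str:
        history = self.store.setdefault(session_id, [])
        # Load past messages and pass them in as part of the input.
        output = self.chain(history + [("human", user_input)])
        # Append the new input and output as messages.
        history.append(("human", user_input))
        history.append(("ai", output))
        return output


chat = WithHistory(echo_chain)
print(chat.invoke("Hi", session_id="abc"))
print(chat.invoke("Still there?", session_id="abc"))
```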

### 8. Documents

Information about some data: a piece of text (`page_content`) plus associated metadata (`metadata`).

### 9. Document Loaders

Allows you to load data from data sources such as Google Drive, CSV files, etc.

### 10. Text Splitters

Splits a document into smaller chunks that can fit into a context window.

See the relevant [how-to guides](https://python.langchain.com/v0.2/docs/how_to/#text-splitters): there are nuances as we want to keep semantically related pieces of text together.

### 11. Embedding Models

Create a vector representation of some text.

### 12. Vector Stores

A way to store embedding vectors.
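
To show how embeddings and a vector store fit together, here is a toy sketch. The bag-of-words `embed` function is an assumption standing in for a real embedding model (which would return dense float vectors), and `ToyVectorStore` is a hypothetical class, not a LangChain integration.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy embedding: a bag-of-words count vector."""
    return dict(Counter(text.lower().split()))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    def __init__(self) -> None:
        self.entries: List[Tuple[str, Dict[str, float]]] = []

    def add_texts(self, texts: List[str]) -> None:
        # Store each text alongside its vector.
        for text in texts:
            self.entries.append((text, embed(text)))

    def similarity_search(self, query: str, k: int = 1) -> List[str]:
        # Embed the query and return the k most similar stored texts.
        query_vec = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(query_vec, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add_texts([
    "Ottawa is the capital of Canada",
    "A caprese salad uses tomatoes and mozzarella",
    "Paris is the capital of France",
])
print(store.similarity_search("what is the capital of France", k=1))
```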

### 13. Retrievers

Returns documents given an unstructured query.

A retriever does not need to be able to store documents, only return them. It can be created from a vector store but is more general than one; for example, a retriever could wrap Wikipedia search.

### 14. Tools

Tools are utilities to be called by a model.

E.g., could be a call to an external API.

### 15. Toolkits

A collection of tools designed to be used together for specific tasks.

### 16. Agents

Systems that use an LLM as a reasoning engine to determine which actions to take and what the inputs to those actions should be.

LangGraph is a LangChain extension aimed at creating highly controllable agents.

#### 16.1 ReAct Agents

Reason and Act agents:

* "Think" what step to take
* Choose an action from available tools
* Generate arguments to that tool
* Call the tool with generated arguments
* Return the tool results back as an observation
* Repeat until complete
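
The loop above can be sketched in plain Python. Everything here is a stand-in: `scripted_llm` replaces a real model call with hard-coded logic, and `lookup_capital` is a hypothetical tool, so this shows only the control flow, not how LangChain or LangGraph implement agents.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools: plain functions keyed by name.
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup_capital": lambda country: {"canada": "Ottawa", "france": "Paris"}.get(
        country.lower(), "unknown"
    ),
}


def scripted_llm(question: str, observations: List[str]) -> Tuple[str, str]:
    """Stand-in for a model call. Returns (action, argument)."""
    if not observations:
        # "Think": no information yet, so pick a tool and generate its argument.
        country = question.rstrip("?").split()[-1]
        return "lookup_capital", country
    # An observation is available, so finish with an answer.
    return "finish", f"The capital is {observations[-1]}."


def run_react(question: str, max_steps: int = 5) -> str:
    observations: List[str] = []
    for _ in range(max_steps):
        action, argument = scripted_llm(question, observations)
        if action == "finish":
            return argument
        # Call the chosen tool and feed the result back as an observation.
        observations.append(TOOLS[action](argument))
    return "Stopped: step limit reached."


print(run_react("What is the capital of Canada?"))
```

The `max_steps` cap is a design choice worth keeping in any real agent loop, since a model that never decides to finish would otherwise run forever.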

## Techniques

### Retrieval (RAG)

Techniques for RAG can be split up into the following:

1. __Query translation__: Use an LLM to review and optionally modify the input
2. __Routing__: Use an LLM to decide which data source to route the query to
3. __Query construction__: Convert user input into query syntax
4. __Indexing__: How you index your documents/text that is being retrieved
5. __Search Improvement__: Improve similarity searching. E.g. include keyword search as well as semantic similarity.
6. __Post Processing__: E.g. Rank or compress the documents you found that match.
7. __Generation__: E.g. check for errors or search the web if no relevant documents found.

(also see retrieval image below)

#### Retrieval: Query Translation

Use an LLM to review and optionally modify the input. This optimizes the raw user inputs for the retrieval system.

For example, this could be extracting keywords, or generating multiple sub-questions.

* __Multi-query__: Rewrite the question from multiple perspectives
* __Decomposition__: Break the question into multiple smaller subproblems
* __Step-back__: When a higher level understanding is required, ask the LLM to ask a generic step-back question about higher-level concepts, to ground the answer
* __HyDE__: Use an LLM to convert questions into hypothetical documents that answer the question, and then retrieve real documents with the premise of doc-doc similarity
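
As an illustration of the multi-query idea, the sketch below retrieves for the original question and for each rewritten variant, then merges the results without duplicates. `fake_rewrite` is a stand-in for the LLM rewriting step and `retrieve` is a toy word-overlap retriever; both are assumptions for this example.

```python
from typing import Callable, List

DOCS = [
    "Ottawa is the capital of Canada",
    "Canada's seat of government is in Ottawa",
    "Paris is the capital of France",
]


def retrieve(query: str) -> List[str]:
    """Toy retriever: return the single document sharing the most words with the query."""
    query_words = set(query.lower().split())
    return [max(DOCS, key=lambda doc: len(query_words & set(doc.lower().split())))]


def fake_rewrite(question: str) -> List[str]:
    """Stand-in for an LLM that rephrases the question from another perspective."""
    return ["seat of government in Canada"]


def multi_query_retrieve(
    question: str,
    rewrite: Callable[[str], List[str]],
    retriever: Callable[[str], List[str]],
) -> List[str]:
    # Run retrieval for the original question and every variant, keeping unique docs in order.
    results: List[str] = []
    for query in [question] + rewrite(question):
        for doc in retriever(query):
            if doc not in results:
                results.append(doc)
    return results


print(multi_query_retrieve("capital of Canada", fake_rewrite, retrieve))
```

Here the rewritten query surfaces a second relevant document that the original wording misses.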

#### Retrieval: Routing

Use the LLM to review the input and route the query to the correct data source in the application.

* __Logical routing__: Prompt the LLM to reason using rules to decide where to route the input
* __Semantic routing__: Embeds the query and set of prompts and then chooses the appropriate prompt based on similarity
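
Semantic routing can be sketched as "embed everything, pick the closest prompt". The word-count `embed` function below is a toy substitute for a real embedding model, and the two route prompts are invented for the example.

```python
import math
from collections import Counter
from typing import Dict

# Hypothetical prompts for two routes.
ROUTE_PROMPTS: Dict[str, str] = {
    "physics": "You are a physics professor. Answer questions about force, energy, mass and motion.",
    "cooking": "You are a chef. Answer questions about recipes, ingredients, baking and cooking.",
}


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real router would call an embedding model."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def route(query: str) -> str:
    """Return the name of the prompt most similar to the query."""
    query_vec = embed(query)
    return max(ROUTE_PROMPTS, key=lambda name: cosine(query_vec, embed(ROUTE_PROMPTS[name])))


print(route("How much energy does a moving mass have?"))
print(route("Which ingredients do I need for baking bread?"))
```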

#### Retrieval: Query Construction

Convert the LLM input from natural language into query syntax

* __Text to SQL__
* __Text to Cypher__: If asking questions from data in a graph database
* __Self query__: If asking questions that are better answered by fetching documents based on metadata rather than text similarity. Transforms user input into 1) a string to look up semantically and 2) a metadata filter to go along with it (queries are often about metadata rather than the document content itself)

#### Retrieval: Indexing

The design of the document index for retrieval. A powerful idea is to __decouple the documents that you index for retrieval from the documents you pass to the LLM for generation__.

We chunk documents before indexing, but chunk size is hard to get right: chunks that do not provide full context for the LLM hurt the results.

Below are some techniques:

* __Vector store__: Quick and easy, create embeddings for each piece of text
* __ParentDocument__: Vector store + document store. If your pages have lots of smaller pieces of distinct info that are best indexed individually, but retrieved all together. Index multiple chunks for each document; once a matching chunk is found, return the whole document that chunk belongs to.
* __Multi Vector__: Vector store + document store. If you are able to extract info from the documents that is more relevant to index than the text itself. E.g., create a summary of the document or Q&A of the document and index that, but then return the whole document.
* __Time-Weighted Vector store__: If you have timestamps associated with your documents and want to retrieve the most recent ones.
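
The decoupling idea behind the ParentDocument approach can be sketched as follows: search over small chunks, but return the full parent document. The word-overlap score is a toy stand-in for vector similarity, and the document IDs and texts are invented.

```python
from typing import Dict, List, Tuple

documents: Dict[str, str] = {
    "doc_canada": "Ottawa is the capital of Canada. Toronto is its largest city.",
    "doc_france": "Paris is the capital of France. Lyon is known for its food.",
}

# Index small chunks (here: sentences), remembering which parent each came from.
chunk_index: List[Tuple[str, str]] = []
for doc_id, text in documents.items():
    for sentence in text.split(". "):
        chunk_index.append((sentence.strip(". "), doc_id))


def overlap(query: str, chunk: str) -> int:
    """Toy relevance score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def retrieve_parent(query: str) -> str:
    # Search over the small chunks, then return the whole parent document.
    _, best_doc_id = max(chunk_index, key=lambda item: overlap(query, item[0]))
    return documents[best_doc_id]


print(retrieve_parent("largest city toronto"))
```

The query matches only the second sentence, yet the caller receives both sentences, which is the extra context the LLM needs for generation.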

#### Retrieval: Improving Similarity Search

In some cases, irrelevant content can dilute semantic usefulness of the embedding.

(Will skip for now, but consider: embeddings are good at semantic search, but may not work well for keyword-based searching. Hybrid search is often available in vector stores and can improve vector search. Other techniques include ColBERT and MMR.)

#### Retrieval: Post-processing

Filtering and ranking retrieved documents. Very useful if you are combining documents returned from multiple sources, so you can down-rank less relevant docs or compress similar docs.

* __Contextual Compression__: When retrieved docs contain too much irrelevant info. Adds a post-processing step that extracts only the most relevant info from the documents.
* __Ensemble__: If you have multiple retrieval methods and want to combine them. Fetches docs from multiple retrievers and combines them.
* __Re-ranking__: Rank the documents based on relevance.
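
One common way to combine rankings from several retrievers is reciprocal rank fusion (RRF). The sketch below is a generic version of that technique with invented document IDs; the constant `k=60` is a conventional default, not something these notes prescribe.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list get a larger contribution.
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


keyword_results = ["doc_a", "doc_b", "doc_c"]   # e.g. from a keyword retriever
semantic_results = ["doc_b", "doc_d", "doc_a"]  # e.g. from a vector store retriever
print(reciprocal_rank_fusion([keyword_results, semantic_results]))
```

Documents that appear in both lists (`doc_a`, `doc_b`) rise above those found by only one retriever.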

#### Retrieval: Generation

Building self-correction into the RAG system.

Low quality systems can suffer from hallucinations or low quality retrieval (if a question is outside of the domain for the index). You can try to detect or self-correct these errors.

This is a relatively new area.

* __Self-RAG__: Fixes answers with hallucinations or irrelevant content. Checks the docs during the answer generation flow, iteratively building an answer and self-correcting errors.
* __Corrective-RAG__: When you need a fallback for low-relevance docs. E.g. fall back to web search if the retrieved docs are not relevant to the query.
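
The fallback idea can be sketched in a few lines. The `relevance` grader here is a toy word-overlap score (a real system might ask an LLM to grade each document), and `fake_web_search` and the `0.5` threshold are assumptions for illustration.

```python
from typing import Callable, List


def relevance(query: str, doc: str) -> float:
    """Toy grader: fraction of query words that appear in the document."""
    words = set(query.lower().split())
    if not words:
        return 0.0
    return len(words & set(doc.lower().split())) / len(words)


def corrective_retrieve(
    query: str,
    retrieved_docs: List[str],
    web_search: Callable[[str], List[str]],
    threshold: float = 0.5,
) -> List[str]:
    # Keep only documents graded as relevant; fall back to web search if none survive.
    relevant = [doc for doc in retrieved_docs if relevance(query, doc) >= threshold]
    return relevant if relevant else web_search(query)


def fake_web_search(query: str) -> List[str]:
    """Stand-in for a real web search tool."""
    return [f"(web result for: {query})"]


index_docs = ["A caprese salad uses tomatoes and mozzarella"]
print(corrective_retrieve("caprese salad", index_docs, fake_web_search))
print(corrective_retrieve("capital of Canada", index_docs, fake_web_search))
```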


### Text Splitting

There are various types of text splitters, some of which add metadata, some of which don't.

* __Recursive__: Splits on user defined characters. Tries to keep related pieces of text next to each other. Recommended way to start.
* __HTML__: Splits on HTML characters. Adds relevant information about where the chunk came from.
* __Markdown__: Splits on Markdown characters. 
* __Code__
* __Tokens__: Splits on tokens. A few exist with different ways to measure tokens.
* __Character__: Splits on user defined character. Simple.
* __Semantic Chunker _(experimental)___: First splits on sentences, then combines adjacent sentences if they are semantically similar enough.
* __AI21 Semantic__: Identifies distinct topics that form coherent pieces of text and splits along those.
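
To illustrate the recursive approach, here is a simplified sketch: try the coarsest separator first, keep neighbouring pieces together while they fit, and fall back to finer separators only for pieces that are still too long. This is a hand-written approximation for these notes, not LangChain's splitter, and it omits features such as chunk overlap.

```python
from typing import List, Sequence


def recursive_split(
    text: str, chunk_size: int, separators: Sequence[str] = ("\n\n", "\n", " ")
) -> List[str]:
    """Split text into chunks of at most chunk_size characters, preferring coarse separators."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard cut.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    separator, remaining = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            # Keep neighbouring pieces together while they still fit.
            current = candidate
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: recurse with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, remaining))
            current = ""
    if current:
        chunks.append(current)
    return chunks


text = "First paragraph is short.\n\nSecond paragraph is quite a bit longer than the first one."
for chunk in recursive_split(text, chunk_size=40):
    print(repr(chunk))
```

The short paragraph stays intact, while the long one is broken on spaces only because no coarser separator could make it fit.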

### Evaluation

Assessing the performance of the LLM-powered app.

Involves testing the model's responses against a set of predefined criteria or benchmarks.
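
A minimal sketch of criteria-based evaluation: run the app on a small test set and report the fraction of responses that contain the expected text. `toy_app` stands in for an LLM-powered app, and the "must contain" check is just one simple criterion chosen for illustration.

```python
from typing import Callable, Dict, List


def toy_app(question: str) -> str:
    """Stand-in for an LLM-powered app."""
    answers = {"capital of canada": "Ottawa is the capital of Canada."}
    return answers.get(question.lower().rstrip("?"), "I don't know.")


test_cases: List[Dict[str, str]] = [
    {"input": "Capital of Canada?", "must_contain": "Ottawa"},
    {"input": "Capital of France?", "must_contain": "Paris"},
]


def evaluate(app: Callable[[str], str], cases: List[Dict[str, str]]) -> float:
    """Return the fraction of cases whose response contains the expected text."""
    passed = sum(
        case["must_contain"].lower() in app(case["input"]).lower() for case in cases
    )
    return passed / len(cases)


print(evaluate(toy_app, test_cases))
```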

![RAG landscape](https://python.langchain.com/v0.2/assets/images/rag_landscape-627f1d0fd46b92bc2db0af8f99ec3724.png)

### Multivector Retrieval

Documents might consist of text, tables, and images. To search all of these with RAG, we can turn each element into a set of vectors in the same semantic space and search them together, so a query could return text, a chart, etc.

Certain [tools](https://unstructured.io/?ref=blog.langchain.dev) can be used to split up documents into these different embeddings.

Think of it as a way to do multimodal RAG.

Ref: https://blog.langchain.dev/semi-structured-multi-modal-rag/

![Multi-vector retrieval overview](https://blog.langchain.dev/content/images/2023/10/mvr_overview.png)