### Introduction to LangChain

**Disclaimer:**  The content in these Jupyter Notebooks is aggregated from various online sources dedicated to the documentation of the tools we will be using. Accuracy and completeness are not guaranteed. Use at your own risk and verify critical information independently.

*This notebook is based on the [LangChain Conceptual Documentation](https://docs.langchain.com/docs/)*

**Goal:** Provide an introductory understanding of the components and use cases of LangChain via examples and code snippets. For use cases check out part 2.
* [LC Conceptual Documentation](https://docs.langchain.com/docs/)
* [LC Python Documentation](https://python.langchain.com/en/latest/)
* [www.langchain.com](https://langchain.com/)


### What is LangChain?
> LangChain is a framework for developing applications powered by language models.

**TLDR**: LangChain makes the complicated parts of working and building with AI models easier. It does this in two ways:

1. **Integration** - Bring external data, such as your files, other applications, and API data, to your LLMs
2. **Agency** - Allow your LLMs to interact with their environment via decision making. Use LLMs to help decide which action to take next

### Why LangChain?
1. **Components** - LangChain makes it easy to swap out abstractions and components necessary to work with language models.
2. **Customized Chains** - LangChain provides out of the box support for using and customizing 'chains' - a series of actions strung together.
3. **Speed** - The LangChain team ships insanely fast. You'll be up to date with the latest LLM features.
4. **Community** - Wonderful discord and community support, meet ups, hackathons, etc.

Though LLMs can be straightforward `(text-in, text-out)`, you'll quickly run into friction points that LangChain helps with once you develop more complicated applications.

*Note: This cookbook will not cover all aspects of LangChain. Its contents have been curated to get you building and making an impact as quickly as possible. For more, please check out [LangChain Conceptual Documentation](https://docs.langchain.com/docs/)*

In [1]:
import os
import openai
from dotenv import load_dotenv
from langchain.chat_models import AzureChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import AzureSearch
from langchain.document_loaders import DirectoryLoader
from langchain.document_loaders import TextLoader
from langchain.text_splitter import TokenTextSplitter
from langchain.chains import ConversationalRetrievalChain
from langchain.prompts import PromptTemplate
# Load environment variables
load_dotenv()
# Configure OpenAI API
openai.api_type = "azure"
openai.api_base = os.getenv('OPENAI_API_BASE')
openai.api_key = os.getenv('OPENAI_API_KEY')
openai.api_version = os.getenv('OPENAI_API_VERSION')

#### LangChain Components
The natural language way to interact with LLMs

In [2]:
# You'll be working with simple strings (that'll soon grow in complexity!)
my_text = "What day comes after Friday?"

#### Chat Messages
Like text, but specified with a message type (System, Human, AI)

* **System** - Helpful background context that tells the AI what to do
* **Human** - Messages that are intended to represent the user
* **AI** - Messages that show what the AI responded with

For more, see OpenAI's [documentation](https://platform.openai.com/docs/guides/chat/introduction)

In [3]:
from langchain.chat_models import AzureChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

llm4 = AzureChatOpenAI(
    openai_api_base= openai.api_base,
    openai_api_version= openai.api_version,
    deployment_name="gpt-4-32k",
    temperature=0,
    openai_api_key= openai.api_key,
    openai_api_type = openai.api_type,
)

llm3 = AzureChatOpenAI(
    openai_api_base=openai.api_base,
    openai_api_version=openai.api_version,
    deployment_name="gpt-35-turbo-16k",
    temperature=0,
    openai_api_key=openai.api_key,
    openai_api_type=openai.api_type,
)

# embeddings = OpenAIEmbeddings(chunk_size=1)


In [4]:
llm4(
    [
        SystemMessage(content="You are a nice AI bot that helps a user figure out what to eat in one short sentence"),
        HumanMessage(content="I like tomatoes, what should I eat?")
    ]
)

AIMessage(content='How about a fresh Caprese salad with ripe tomatoes, mozzarella, and basil?')

You can also pass more chat history w/ responses from the AI

In [5]:
llm3(
    [
        SystemMessage(content="You are a nice AI bot that helps a user figure out where to travel in one short sentence"),
        HumanMessage(content="I like the beaches where should I go?"),
        AIMessage(content="You should go to Nice, France"),
        HumanMessage(content="What else should I do when I'm there?")
    ]
)

AIMessage(content='You should explore the charming Old Town, visit the stunning Promenade des Anglais, and indulge in delicious French cuisine.')

#### Documents
An object that holds a piece of text and metadata (more information about that text)

In [6]:
from langchain.schema import Document

In [7]:
Document(page_content="This is my document. It is full of text that I've gathered from other places",
         metadata={
             'my_document_id' : 234234,
             'my_document_source' : "The LangChain Papers",
             'my_document_create_time' : 1680013019
         })

Document(page_content="This is my document. It is full of text that I've gathered from other places", metadata={'my_document_id': 234234, 'my_document_source': 'The LangChain Papers', 'my_document_create_time': 1680013019})

#### Models - The interface to the AI brains

#####  Language Model
A model that does text in ➡️ text out!

*Check out how I changed the model I was using from the default one to gpt-35-turbo-instruct (an Azure deployment; replace it with your own). See more models [here](https://platform.openai.com/docs/models)*

In [8]:
# Import Azure OpenAI
from langchain.llms import AzureOpenAI

# Create an instance of Azure OpenAI
# Replace the deployment name with your own
llm = AzureOpenAI(
    deployment_name="gpt-35-turbo-instruct",
    model_name="gpt-35-turbo-instruct",
)

In [9]:
#llm("What day comes after Friday?")
llm("Tell me a joke")

"\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything."

#### **Chat Model**
A model that takes a series of messages and returns a message output

In [10]:
from langchain.chat_models import AzureChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

chat = AzureChatOpenAI(
    openai_api_base= openai.api_base,
    openai_api_version= openai.api_version,
    deployment_name="gpt-4-32k",
    temperature=0,
    openai_api_key= openai.api_key,
    openai_api_type = openai.api_type
    )

In [11]:
chat(
    [
        SystemMessage(content="You are an unhelpful AI bot that makes a joke at whatever the user says"),
        HumanMessage(content="I would like to go to New York, how should I do this?")
    ]
)

AIMessage(content='Well, you could start by practicing your honking. You know, to fit in with the local traffic!')

#### **Text Embedding Model**
Change your text into a vector (a series of numbers that hold the semantic 'meaning' of your text). Mainly used when comparing two pieces of text together.

*BTW: Semantic means 'relating to meaning in language or logic.'*
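
To make "comparing two pieces of text" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embedding vectors. The three tiny vectors are made up for illustration (real embeddings have many more dimensions), and nothing here calls LangChain or OpenAI.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings", invented for this sketch
beach = [0.9, 0.1, 0.0]
ocean = [0.8, 0.2, 0.1]
taxes = [0.0, 0.1, 0.9]

print(cosine_similarity(beach, ocean))  # close to 1: the vectors point the same way
print(cosine_similarity(beach, taxes))  # close to 0: the vectors are nearly unrelated
```

A higher score means the two texts are closer in meaning, which is the idea that similarity search builds on.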

In [13]:
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(deployment_id="text-embedding-ada-002", chunk_size=1)

##### Embed a single document


In [14]:
text = "This is just a test"
e = embeddings.embed_query(text)
print(e)
print(len(e))

[-0.010031012400679239, 0.002363352705139523, 0.009305801786561171, -0.0045502160644454335, -0.020395751193564623, 0.013875271547451488, -0.03111346681547216, -0.01193709714941775, -0.006584657082058519, -0.03855810898977913, 0.01369557286415934, 0.003988659308971903, -0.0036902316706888885, 0.0028944251785389965, 0.004373726718611821, 0.0013196590040055094, 0.021371255670514764, -0.004973790401313851, -0.0070274846497105515, -0.008432982138446184, 0.005487214079828343, 0.001197720827971425, -0.01513315953652893, 0.031909274471775224, -0.032422697684628445, -0.015030474707693779, 0.015197336972474317, -0.028726047571853118, 0.012245151635923206, -0.006369660808996407, 0.014991967314804013, -0.00795806538716019, 0.015312857288498545, -0.034348035664150574, -0.004717078794887239, -0.018406233915452038, -0.009902656597465932, -0.022385266609032135, 0.01358005254813511, -0.017713112019306664, -0.0020167919898974704, 0.024118070418073033, -0.00788746927629773, -0.028212624358999894, 0.00401

In [15]:
text = "Hi! It's time for the beach"

In [16]:
text_embedding = embeddings.embed_query(text)
print (f"Your embedding is length {len(text_embedding)}")
print (f"Here's a sample: {text_embedding[:5]}...")

Your embedding is length 1536
Here's a sample: [-0.00020583387883037012, -0.003205398431934278, -0.0008301587339650732, -0.01946892837225558, -0.015162717099657705]...


### Prompts - Text generally used as instructions to your model

#### Prompt
What you'll pass to the underlying model

In [17]:
from langchain.llms import AzureOpenAI


llm_inst = AzureOpenAI(
    openai_api_base= openai.api_base,
    openai_api_version= openai.api_version,
    deployment_name="gpt-35-turbo-instruct",
    temperature=0,
    openai_api_key= openai.api_key,
    openai_api_type = openai.api_type
    )
# I like to use three double quotation marks for my prompts because it's easier to read
prompt = """
Today is Monday, tomorrow is Wednesday.

What is wrong with that statement?
"""
llm_inst(prompt)

'\nThe statement is incorrect because it skips over Tuesday, making it seem like there are only two days in between Monday and Wednesday instead of three.'

#### Prompt Template
An object that helps create prompts based on a combination of user input, other non-static information and a fixed template string.

Think of it as an [f-string](https://realpython.com/python-f-strings/) in python but for prompts
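
To see the analogy in plain Python first, here is a small sketch that fills a named placeholder with `str.format`. The helper `format_prompt` and the variable `plain_template` are names made up for this illustration, not part of LangChain.

```python
# A plain Python string with a named placeholder, similar in spirit to a prompt template
plain_template = "I really want to travel to {location}. What should I do there?"

def format_prompt(template, **values):
    """Fill the named placeholders in `template` with the given values."""
    return template.format(**values)

plain_prompt = format_prompt(plain_template, location="Rome")
print(plain_prompt)
```

A prompt template wraps the same idea in an object, so the template and its expected input variables travel together.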

In [18]:
from langchain.llms import AzureOpenAI
from langchain import PromptTemplate

llm_inst = AzureOpenAI(
    openai_api_base= openai.api_base,
    openai_api_version= openai.api_version,
    deployment_name="gpt-35-turbo-instruct",
    temperature=0,
    openai_api_key= openai.api_key,
    openai_api_type = openai.api_type
    )

# Notice "location" below, that is a placeholder for another value later
template = """
I really want to travel to {location}. What should I do there?
Respond in one short sentence
"""

prompt = PromptTemplate(
    input_variables=["location"],
    template=template,
)

final_prompt = prompt.format(location='Rome')

print (f"Final Prompt: {final_prompt}")
print ("-----------")
print (f"LLM Output: {llm(final_prompt)}")

Final Prompt: 
I really want to travel to Rome. What should I do there?
Respond in one short sentence

-----------
LLM Output: 
Visit historical landmarks and try authentic Italian cuisine.


#### Example Selectors
An easy way to select from a series of examples that allows you to dynamically place in-context information into your prompt. Often used when your task is nuanced or you have a large list of examples.

Check out different types of example selectors [here](https://python.langchain.com/docs/modules/model_io/prompts/example_selectors/)

If you want an overview on why examples are important (prompt engineering), check out [this video](https://www.youtube.com/watch?v=dOxUroR57xs)
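
Before using the real thing, here is a toy sketch of the selection idea: rank the examples by how close they are to the input and keep the top `k`. The 2-D "embeddings" are invented by hand and Euclidean distance stands in for whatever similarity measure a real vector store uses, so treat this as an assumption-laden illustration rather than how LangChain does it internally.

```python
import math

# Hypothetical 2-D "embeddings" made up for this sketch: [vehicle-ness, nature-ness]
TOY_EMBEDDINGS = {
    "pirate": [0.9, 0.2],
    "pilot": [1.0, 0.0],
    "driver": [1.0, 0.1],
    "tree": [0.0, 1.0],
    "bird": [0.2, 0.9],
    "sailor": [0.9, 0.3],
}

toy_examples = [
    {"input": "pirate", "output": "ship"},
    {"input": "pilot", "output": "plane"},
    {"input": "driver", "output": "car"},
    {"input": "tree", "output": "ground"},
    {"input": "bird", "output": "nest"},
]

def select_examples(query, examples, k=2):
    """Return the k examples whose input lies closest to `query` in the toy space."""
    query_vector = TOY_EMBEDDINGS[query]
    ranked = sorted(
        examples,
        key=lambda ex: math.dist(query_vector, TOY_EMBEDDINGS[ex["input"]]),
    )
    return ranked[:k]

selected = select_examples("sailor", toy_examples, k=2)
for ex in selected:
    print(f"Example Input: {ex['input']}\nExample Output: {ex['output']}\n")
```

Only the selected examples end up in the prompt, which keeps it short even when the full example list is large.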

In [19]:
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.llms import AzureOpenAI
llm_inst = AzureOpenAI(
    openai_api_base= openai.api_base,
    openai_api_version= openai.api_version,
    deployment_name="gpt-35-turbo-instruct",
    temperature=0,
    openai_api_key= openai.api_key,
    openai_api_type = openai.api_type
    )
example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Example Input: {input}\nExample Output: {output}",
)
# Examples of locations where nouns are found
examples = [
    {"input": "pirate", "output": "ship"},
    {"input": "pilot", "output": "plane"},
    {"input": "driver", "output": "car"},
    {"input": "tree", "output": "ground"},
    {"input": "bird", "output": "nest"},
]

In [21]:
# SemanticSimilarityExampleSelector will select examples that are similar to your input by semantic meaning

example_selector = SemanticSimilarityExampleSelector.from_examples(
    # This is the list of examples available to select from.
    examples, 
    
    # This is the embedding class used to produce embeddings which are used to measure semantic similarity.
    OpenAIEmbeddings(deployment_id="text-embedding-ada-002", chunk_size=1),
    
    # This is the VectorStore class that is used to store the embeddings and do a similarity search over.
    FAISS, 
    
    # This is the number of examples to produce.
    k=2
)

In [22]:
similar_prompt = FewShotPromptTemplate(
    # The object that will help select examples
    example_selector=example_selector,
    
    # Your prompt
    example_prompt=example_prompt,
    
    # Customizations that will be added to the top and bottom of your prompt
    prefix="Give the location an item is usually found in",
    suffix="Input: {noun}\nOutput:",
    
    # What inputs your prompt will receive
    input_variables=["noun"],
)

In [23]:
# Select a noun!
my_noun = "student"

print(similar_prompt.format(noun=my_noun))

Give the location an item is usually found in

Example Input: driver
Example Output: car

Example Input: pilot
Example Output: plane

Input: student
Output:


In [24]:
llm_inst(similar_prompt.format(noun=my_noun))

' classroom'

#### Output Parsers
A helpful way to format the output of a model. Usually used for structured output.

Two big concepts:

**1. Format Instructions** - An autogenerated prompt that tells the LLM how to format its response based on your desired result

**2. Parser** - A method which will extract your model's text output into a desired structure (usually json)
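
As a rough picture of what the parser half does, here is a standard-library sketch that pulls a fenced json block out of a model reply and decodes it. The function `parse_json_block` and the sample reply are made up for illustration; LangChain's own parser may handle more edge cases.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way so the snippet is easy to embed

def parse_json_block(text):
    """Extract and decode the first fenced json block found in `text`."""
    pattern = re.escape(FENCE) + r"json\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        raise ValueError("no fenced json block found")
    return json.loads(match.group(1))

# A made-up model reply in the format the instructions ask for
payload = '{"bad_string": "welcom to califonya!", "good_string": "Welcome to California!"}'
reply = FENCE + "json\n" + payload + "\n" + FENCE

parsed = parse_json_block(reply)
print(parsed["good_string"])
```

The format instructions exist to make this step reliable: the more predictable the reply, the simpler the parsing.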

In [25]:
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.llms import OpenAI

In [26]:
# How you would like your response structured. This is basically a fancy prompt template
response_schemas = [
    ResponseSchema(name="bad_string", description="This a poorly formatted user input string"),
    ResponseSchema(name="good_string", description="This is your response, a reformatted response")
]

# How you would like to parse your output
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

In [27]:
# See the prompt template you created for formatting
format_instructions = output_parser.get_format_instructions()
print (format_instructions)

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This a poorly formatted user input string
	"good_string": string  // This is your response, a reformatted response
}
```


In [28]:
template = """
You will be given a poorly formatted string from a user.
Reformat it and make sure all the words are spelled correctly

{format_instructions}

% USER INPUT:
{user_input}

YOUR RESPONSE:
"""

prompt = PromptTemplate(
    input_variables=["user_input"],
    partial_variables={"format_instructions": format_instructions},
    template=template
)

promptValue = prompt.format(user_input="welcom to califonya!")

print(promptValue)


You will be given a poorly formatted string from a user.
Reformat it and make sure all the words are spelled correctly

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This a poorly formatted user input string
	"good_string": string  // This is your response, a reformatted response
}
```

% USER INPUT:
welcom to califonya!

YOUR RESPONSE:



In [29]:
llm_output = llm(promptValue)
llm_output

'```json\n{\n\t"bad_string": "welcom to califonya!",\n\t"good_string": "Welcome to California!" \n}\n```'

In [30]:
output_parser.parse(llm_output)

{'bad_string': 'welcom to califonya!', 'good_string': 'Welcome to California!'}

## Indexes - Structuring documents so LLMs can work with them

### **Document Loaders**
Easy ways to import data from other sources. Shared functionality with [OpenAI Plugins](https://openai.com/blog/chatgpt-plugins), [specifically retrieval plugins](https://github.com/openai/chatgpt-retrieval-plugin)

See a [big list](https://python.langchain.com/en/latest/modules/indexes/document_loaders.html) of document loaders here. A bunch more on [Llama Index](https://llamahub.ai/) as well.

In [31]:
from langchain.document_loaders import HNLoader

In [32]:
loader = HNLoader("https://news.ycombinator.com/item?id=34422627")

In [33]:
data = loader.load()

In [34]:
print (f"Found {len(data)} comments")
print (f"Here's a sample:\n\n{''.join([x.page_content[:150] for x in data[:2]])}")

Found 76 comments
Here's a sample:

Ozzie_osman 10 months ago  
             | next [–] 

LangChain is awesome. For people not sure what it's doing, large language models (LLMs) are veryOzzie_osman 10 months ago  
             | parent | next [–] 

Also, another library to check out is GPT Index (https://github.com/jerryjliu/gpt_index


### **Text Splitters**
Often your document is too long (like a book) for your LLM. You need to split it up into chunks. Text splitters help with this.

There are many ways you could split your text into chunks. Experiment with [different ones](https://python.langchain.com/en/latest/modules/indexes/text_splitters.html) to see which is best for you.
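
To show what chunk size and chunk overlap mean, here is a minimal fixed-size character splitter. This is a simplified stand-in, not the recursive splitter used in the cells that follow: it cuts purely by character count and ignores separators such as newlines.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split `text` into chunks of at most `chunk_size` characters.

    Consecutive chunks share `chunk_overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
sample_chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(sample_chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The overlap means a sentence cut at a chunk boundary still appears in part in the next chunk, which helps retrieval keep some context.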

In [35]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [36]:
# This is a long document we can split up.
with open('data/PaulGrahamEssays/worked.txt') as f:
    pg_work = f.read()
    
print (f"You have {len([pg_work])} document")

You have 1 document


In [37]:
text_splitter = RecursiveCharacterTextSplitter(
    # Set a really small chunk size, just to show.
    chunk_size = 150,
    chunk_overlap  = 20,
)

texts = text_splitter.create_documents([pg_work])

In [38]:
print (f"You have {len(texts)} documents")

You have 610 documents


In [39]:
print ("Preview:")
print (texts[0].page_content, "\n")
print (texts[1].page_content)

Preview:
February 2021Before college the two main things I worked on, outside of school,
were writing and programming. I didn't write essays. I wrote what 

beginning writers were supposed to write then, and probably still
are: short stories. My stories were awful. They had hardly any plot,


### **Retrievers**
Easy way to combine documents with language models.

There are many different types of retrievers; the most widely supported is the VectorStoreRetriever.

In [41]:
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

loader = TextLoader('data/PaulGrahamEssays/worked.txt')
documents = loader.load()

In [42]:
# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)
print(texts)
# Embed your texts with the embedding engine created earlier (`embeddings`)
db = FAISS.from_documents(texts, embeddings)

[Document(page_content='February 2021Before college the two main things I worked on, outside of school,\nwere writing and programming. I didn\'t write essays. I wrote what\nbeginning writers were supposed to write then, and probably still\nare: short stories. My stories were awful. They had hardly any plot,\njust characters with strong feelings, which I imagined made them\ndeep.The first programs I tried writing were on the IBM 1401 that our\nschool district used for what was then called "data processing."\nThis was in 9th grade, so I was 13 or 14. The school district\'s\n1401 happened to be in the basement of our junior high school, and\nmy friend Rich Draves and I got permission to use it. It was like\na mini Bond villain\'s lair down there, with all these alien-looking\nmachines — CPU, disk drives, printer, card reader — sitting up\non a raised floor under bright fluorescent lights.The language we used was an early version of Fortran. You had to\ntype programs on punch cards, then

In [43]:
# Embed your texts
db = FAISS.from_documents(texts, embeddings)

In [45]:
# Init your retriever with the default search settings
retriever = db.as_retriever()

In [46]:
retriever

VectorStoreRetriever(tags=['FAISS'], vectorstore=<langchain.vectorstores.faiss.FAISS object at 0x000001E3DC503E80>)

In [47]:
docs = retriever.get_relevant_documents("what types of things did the author want to build?")

In [48]:
print("\n\n".join([x.page_content[:200] for x in docs[:2]]))

standards; what was the point? No one else wanted one either, so
off they went. That was what happened to systems work.I wanted not just to build things, but to build things that would
last.In this di

infrastructure, and the two undergrads worked on the first two
services (images and phone calls). But about halfway through the
summer I realized I really didn't want to run a company — especially
no


### **VectorStores**
Databases to store vectors. Examples include [Azure Vector Store](https://learn.microsoft.com/en-us/azure/search/vector-search-overview/), the open source vector database [ChromaDB](https://www.trychroma.com/), and FAISS. More examples are in OpenAI's [retriever documentation](https://github.com/openai/chatgpt-retrieval-plugin#choosing-a-vector-database). [Chroma](https://www.trychroma.com/) & [FAISS](https://engineering.fb.com/2017/03/29/data-infrastructure/faiss-a-library-for-efficient-similarity-search/) are easy to work with locally.

Conceptually, think of them as tables w/ a column for embeddings (vectors) and a column for metadata.

Example

| Embedding      | Metadata |
| ----------- | ----------- |
| [-0.00015641732898075134, -0.003165106289088726, ...] | {'date' : '1/2/23'} |
| [-0.00035465431654651654, 1.4654131651654516546, ...] | {'date' : '1/3/23'} |
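
The table view can be sketched as a tiny in-memory store: each row is an (embedding, metadata) pair, and a search ranks rows by similarity to a query vector. `TinyVectorStore` is a made-up class for illustration, the 2-D vectors are invented, and the dot product is just one possible similarity measure. Real vector stores add indexing so that search stays fast on large collections.

```python
class TinyVectorStore:
    """A toy in-memory store: one row per (embedding, metadata) pair."""

    def __init__(self):
        self.rows = []

    def add(self, embedding, metadata):
        """Append one row to the 'table'."""
        self.rows.append((embedding, metadata))

    def search(self, query, k=1):
        """Return the metadata of the k rows with the highest dot product with `query`."""
        def dot(a, b):
            return sum(x * y for x, y in zip(a, b))

        ranked = sorted(self.rows, key=lambda row: dot(query, row[0]), reverse=True)
        return [metadata for _, metadata in ranked[:k]]

store = TinyVectorStore()
store.add([1.0, 0.0], {"date": "1/2/23"})  # hypothetical 2-D embeddings
store.add([0.0, 1.0], {"date": "1/3/23"})

print(store.search([0.9, 0.1], k=1))
# [{'date': '1/2/23'}]
```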

In [49]:
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

loader = TextLoader('data/PaulGrahamEssays/worked.txt')
documents = loader.load()
# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
# Split your docs into texts
texts = text_splitter.split_documents(documents)

# Get embedding engine ready
# embeddings = OpenAIEmbeddings(openai_api_key = openai_api_key) 

In [50]:
print (f"You have {len(texts)} documents")

You have 78 documents


In [51]:
embedding_list = embeddings.embed_documents([text.page_content for text in texts])

In [52]:
print (f"You have {len(embedding_list)} embeddings")
print (f"Here's a sample of one: {embedding_list[0][:3]}...")

You have 78 embeddings
Here's a sample of one: [-0.0010750390271892223, -0.010958679865575572, -0.012766089190019685]...


Your vector store stores your embeddings (☝️) and makes them easily searchable

## Memory
Helping LLMs remember information.

Memory is a bit of a loose term. It could be as simple as remembering information you've chatted about in the past or more complicated information retrieval.

We'll keep it towards the Chat Message use case. This would be used for chat bots.

There are many types of memory, explore [the documentation](https://python.langchain.com/en/latest/modules/memory/how_to_guides.html) to see which one fits your use case.
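
At its simplest, chat memory is just an ordered list of who said what, replayed to the model on the next turn. Here is a minimal sketch of that idea; `SimpleChatHistory` is a made-up class for illustration and is much smaller than LangChain's memory classes.

```python
class SimpleChatHistory:
    """Keep an ordered list of (role, content) pairs."""

    def __init__(self):
        self.messages = []

    def add(self, role, content):
        """Record one message; `role` is a free-form label such as 'human' or 'ai'."""
        self.messages.append((role, content))

    def as_transcript(self):
        """Render the history as text that could be prepended to the next prompt."""
        return "\n".join(f"{role}: {content}" for role, content in self.messages)

toy_history = SimpleChatHistory()
toy_history.add("ai", "hi!")
toy_history.add("human", "what is the capital of france?")
print(toy_history.as_transcript())
```

Because the model itself is stateless, whatever you want it to "remember" has to be sent again in some form like this.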

### Chat Message History

In [53]:
from langchain.memory import ChatMessageHistory
from langchain.chat_models import AzureChatOpenAI

# chat = ChatOpenAI(temperature=0, openai_api_key=openai_api_key)

history = ChatMessageHistory()

history.add_ai_message("hi!")

history.add_user_message("what is the capital of france?")

In [54]:
history.messages

[AIMessage(content='hi!'),
 HumanMessage(content='what is the capital of france?')]

In [55]:
ai_response = chat(history.messages)
ai_response

AIMessage(content='The capital of France is Paris.')

In [56]:
history.add_ai_message(ai_response.content)
history.messages

[AIMessage(content='hi!'),
 HumanMessage(content='what is the capital of france?'),
 AIMessage(content='The capital of France is Paris.')]

### Chains 
Combining different LLM calls and actions automatically

Ex: Summary #1, Summary #2, Summary #3 > Final Summary

Check out [this video](https://www.youtube.com/watch?v=f9_BWhCI4Zo&t=2s) explaining different summarization chain types

There are [many applications of chains](https://python.langchain.com/en/latest/modules/chains/how_to_guides.html); search to see which are best for your use case.

We'll cover two of them:

### 1. Simple Sequential Chains

Easy chains where you can use the output of an LLM as an input into another. Good for breaking up tasks (and keeping your LLM focused)
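
Stripped of the LLM, a simple sequential chain is function composition: each step receives the previous step's output. The sketch below uses two made-up stand-in functions in place of LLM calls, so the names and return values are illustrative only.

```python
def run_sequential(steps, first_input):
    """Feed `first_input` to the first step, then pass each output to the next step."""
    value = first_input
    for step in steps:
        value = step(value)
    return value

# Stand-ins for LLM calls (hypothetical, hard-coded for the sketch)
def suggest_dish(location):
    known_dishes = {"Rome": "Cacio e Pepe"}
    return known_dishes.get(location, "a local specialty")

def write_recipe(dish):
    return f"Short recipe for {dish}"

toy_result = run_sequential([suggest_dish, write_recipe], "Rome")
print(toy_result)
# Short recipe for Cacio e Pepe
```

Each step has a single input and a single output, which is what makes this kind of chain easy to reason about.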

In [57]:
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chains import SimpleSequentialChain

# llm = OpenAI(temperature=1, openai_api_key=openai_api_key)

In [58]:
template = """Your job is to come up with a classic dish from the area that the user suggests.
% USER LOCATION
{user_location}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["user_location"], template=template)

# Holds my 'location' chain
location_chain = LLMChain(llm=llm, prompt=prompt_template)

In [59]:
template = """Given a meal, give a short and simple recipe on how to make that dish at home.
% MEAL
{user_meal}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["user_meal"], template=template)

# Holds my 'meal' chain
meal_chain = LLMChain(llm=llm, prompt=prompt_template)

In [60]:
overall_chain = SimpleSequentialChain(chains=[location_chain, meal_chain], verbose=True)

In [61]:
review = overall_chain.run("Rome")



> Entering new SimpleSequentialChain chain...
One classic dish from Rome is Cacio e Pepe, which translates to "cheese and pepper." This simple yet delicious dish consists of spaghetti tossed with Pecorino Romano cheese, black pepper, and a bit of the pasta cooking water to create a creamy sauce. It has been a staple in Roman cuisine for centuries and is a must-try for anyone visiting the city. Other popular dishes from Rome include carbonara, amatriciana, and supplì (fried rice balls with a mozzarella center).
To make Cacio e Pepe at home, start by boiling spaghetti in salted water until al dente. In a separate pan, toast black pepper until fragrant. Reserve some of the pasta cooking water and drain the spaghetti. In the same pan as the pepper, add the spaghetti and some of the cooking water. Toss in grated Pecorino Romano cheese and continue to stir until the cheese melts and coats the spaghetti, creating a creamy sauce. Serve immediately and en

### 2. Summarization Chain

Easily run through numerous long documents and get a summary. Check out [this video](https://www.youtube.com/watch?v=f9_BWhCI4Zo) for other chain types besides map-reduce
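
The map-reduce pattern itself is small: summarize every chunk on its own (map), then combine the partial summaries into one result (reduce). In the sketch below both steps are made-up stand-ins, keeping the first sentence and joining with spaces, whereas the actual chain is assumed to use LLM calls for both.

```python
def map_reduce(chunks, map_fn, reduce_fn):
    """Apply `map_fn` to every chunk, then combine the results with `reduce_fn`."""
    partial_results = [map_fn(chunk) for chunk in chunks]
    return reduce_fn(partial_results)

def first_sentence(text):
    """Stand-in 'summarizer': keep only the first sentence of the chunk."""
    return text.split(". ")[0].rstrip(".") + "."

def join_summaries(summaries):
    """Stand-in 'combiner': join the partial summaries into one string."""
    return " ".join(summaries)

toy_chunks = [
    "Cats sleep a lot. They also purr.",
    "Dogs bark loudly. They love walks.",
]
toy_summary = map_reduce(toy_chunks, first_sentence, join_summaries)
print(toy_summary)
# Cats sleep a lot. Dogs bark loudly.
```

Because each chunk is handled independently in the map step, no single call has to fit the whole document.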

In [62]:
from langchain.chains.summarize import load_summarize_chain
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

loader = TextLoader('data/PaulGrahamEssays/disc.txt')
documents = loader.load()

# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)

# There is a lot of complexity hidden in this one line. I encourage you to check out the video above for more detail
chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=True)
chain.run(texts)



> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"January 2017Because biographies of famous scientists tend to 
edit out their mistakes, we underestimate the 
degree of risk they were willing to take.
And because anything a famous scientist did that
wasn't a mistake has probably now become the
conventional wisdom, those choices don't
seem risky either.Biographies of Newton, for example, understandably focus
more on physics than alchemy or theology.
The impression we get is that his unerring judgment
led him straight to truths no one else had noticed.
How to explain all the time he spent on alchemy
and theology?  Well, smart people are often kind of
crazy.But maybe there is a simpler explanation. Maybe"


CONCISE SUMMARY:
Prompt after formatting:
Write a concise summary of the following:


"the smartness and the craziness were not as sepa




> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


" Biographies of famous scientists often overlook their mistakes and unconventional pursuits, leading us to underestimate their risks. This may be due to the fact that their successful decisions are now seen as common knowledge, making them seem less risky in hindsight. However, it is possible that these scientists were simply curious and open-minded individuals, rather than being "crazy."



In Newton's time, physics, alchemy, and theology were all seen as potentially valuable fields of study. However, only physics turned out to have a significant payoff. Newton took a risk by focusing on physics, but the other two fields were still considered important at the time. "


CONCISE SUMMARY:

> Finished chain.

> Finished chain.


' Biographies of famous scientists often downplay their mistakes and unconventional pursuits, causing us to underestimate their risks. This could be because their successful decisions are now seen as obvious, making them seem less risky in hindsight. However, these scientists may have just been curious and open-minded, rather than "crazy." For example, in Newton\'s time, physics, alchemy, and theology were all seen as valuable areas of study, but only physics ultimately had a significant payoff. Newton\'s focus on physics was a risk, but the other fields were still considered important at the time.'

### Agents 

The official LangChain documentation describes agents perfectly (emphasis mine):
> Some applications will require not just a predetermined chain of calls to LLMs/other tools, but potentially an **unknown chain** that depends on the user's input. In these types of chains, there is a “agent” which has access to a suite of tools. Depending on the user input, the agent can then **decide which, if any, of these tools to call**.


Basically, you use the LLM not just for text output, but also for decision making. The coolness and power of this functionality can't be overstated.

Sam Altman emphasizes that LLMs are good '[reasoning engines](https://www.youtube.com/watch?v=L_Guz73e6fw&t=867s)'. Agents take advantage of this.

### Agents

The language model that drives decision making.

More specifically, an agent takes in an input and returns a response corresponding to an action to take along with an action input. You can see different types of agents (which are better for different use cases) [here](https://python.langchain.com/en/latest/modules/agents/agent_types.html).

### Tools

A 'capability' of an agent. This is an abstraction on top of a function that makes it easy for LLMs (and agents) to interact with it. Ex: Google search.

This area shares commonalities with [OpenAI plugins](https://platform.openai.com/docs/plugins/introduction).

### Toolkit

Groups of tools that your agent can select from
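
The three ideas above (agent, tool, toolkit) can be sketched without any LLM. In this toy version a tool is a named function with a description, the toolkit is a list of tools, and a hard-coded rule plays the part of the LLM's decision. Every name here (`Tool`, `choose_tool`, `run_agent`, the two tools) is made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    """A named capability: a function plus a description of when to use it."""
    name: str
    description: str
    func: Callable[[str], str]

def add_numbers(text):
    """Add integers written like '2+3'."""
    return str(sum(int(part) for part in text.split("+")))

def shout(text):
    """Upper-case the input."""
    return text.upper()

toy_toolkit = [
    Tool("calculator", "adds integers written like 2+3", add_numbers),
    Tool("shout", "upper-cases text", shout),
]

def choose_tool(user_input, tools):
    """Stand-in for the LLM's decision: pick the calculator for sums, else shout."""
    wanted = "calculator" if "+" in user_input else "shout"
    return next(tool for tool in tools if tool.name == wanted)

def run_agent(user_input, tools):
    """Decide which tool to use, run it, and report both the choice and the result."""
    tool = choose_tool(user_input, tools)
    return tool.name, tool.func(user_input)

print(run_agent("2+3", toy_toolkit))    # ('calculator', '5')
print(run_agent("hello", toy_toolkit))  # ('shout', 'HELLO')
```

In a real agent the rule in `choose_tool` is replaced by the model reading the tool descriptions and deciding for itself, often over several steps.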

Let's bring them all together:

In [63]:
from langchain.agents import load_tools
from langchain.agents import initialize_agent
import json
#llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

In [64]:
serpapi_api_key= os.getenv('SERPAPI_API_KEY')

In [66]:
toolkit = load_tools(["serpapi"], llm=llm, serpapi_api_key=serpapi_api_key)

In [67]:
agent = initialize_agent(toolkit, llm, agent="zero-shot-react-description", verbose=True, return_intermediate_steps=True)

In [68]:
# Note the trailing space: adjacent string literals are joined with no separator
response = agent({"input": "what was the first album of the "
                           "band that Natalie Bergman is a part of?"})



> Entering new AgentExecutor chain...
I should search for the band name and Natalie Bergman's name.
Action: Search
Action Input: "theband Natalie Bergman first album"
Observation: Mercy
Thought: I should search for the album title to verify.
Action: Search
Action Input: "Mercy"
Observation: {'type': 'dictionary_results', 'syllables': 'mer·cy', 'phonetic': '/ˈmərsē/', 'word_type': 'noun'}
Thought: Mercy is the correct album title, now I need to find the band name.
Action: Search
Action Input: "Mercy album band name"
Observation: ['Mercy is an American pop group from Florida. The group\'s 1969 single "Love (Can Make You Happy)", written by Jack Sigler, Jr., soared to No.']
Thought: I should search for Natalie Bergman's name in relation to the band name.
Action: Search
Action Input: "Mercy album Natalie Bergman"
Observation: ['Mercy main_tab_text: 