<img src="Images/atom.png" alt="Atom" style="width:60px" align="left" vertical-align="middle">

## 1. Introduction to LangChain
*in Python*

----

### **A. Introduction**
*This introduction is based off Greg Kamradt's cookbook, which in turn is based off the [LangChain Conceptual Documentation](https://docs.langchain.com/docs/)*

**Goal:** Provide an updated, introductory understanding of the components and use cases of LangChain. LangChain is, in essence, an AI framework. It has pros & cons which need to be understood.

**Links:**
* [LC Conceptual Documentation](https://docs.langchain.com/docs/)
* [LC Python Documentation](https://python.langchain.com/en/latest/)
* [LC Javascript/Typescript Documentation](https://js.langchain.com/docs/)
* [LC Discord](https://discord.gg/6adMQxSpJS)
* [www.langchain.com](https://langchain.com/)
* [LC Twitter](https://twitter.com/LangChainAI)
* Check out [ELI5](https://www.dictionary.com/e/slang/eli5/#:~:text=ELI5%20is%20short%20for%20%E2%80%9CExplain,a%20complicated%20question%20or%20problem.) examples and code snippets
* For use cases check out [part 2](https://github.com/gkamradt/langchain-tutorials/blob/main/LangChain%20Cookbook%20Part%202%20-%20Use%20Cases.ipynb)
* See [video tutorial](https://www.youtube.com/watch?v=2xxziIWmaSA) of this notebook

### **B. What is LangChain?**
> LangChain is a framework for developing applications powered by language models.

LangChain makes the complicated parts of working & building with AI models easier. It helps do this in two ways:

1. **Integration** - Bring external data, such as your files, other applications, and api data, to your LLMs
2. **Agency** - Allow your LLMs to interact with it's environment via decision making. Use LLMs to help decide which action to take next

### **C. Pros of LangChain**
1. **Components** - LangChain makes it easy to swap out abstractions and components necessary to work with language models. Though LLMs can be straightforward (text-in, text-out) you'll quickly run into friction points that LangChain helps with once you develop more complicated applications.

2. **Customized Chains** - LangChain provides out of the box support for using and customizing 'chains' - a series of actions strung together.

3. **Speed 🚢** - This team ships insanely fast. You'll be up to date with the latest LLM features.

4. **Open source** - The components are easily modifiable.

5. **Community 👥** - Wonderful discord and community support, meet ups, hackathons, etc.

### **D. Cons of LangChain**
1. **Right tool for the right job** - LangChain, Kubernetes, CrewAI are tools which are useful to automate workflows when the workflow *is not well defined.* In most cases they are, in which case these tools may hinder productivity and efficiency. 

2. **Inefficient Token Usage** - One of the significant concerns raised about Langchain is its token counting function, which can be inefficient for small datasets. Use `Tiktoken` instead.

3. **Terrible Documentation.**

4. **Too Much Obfuscation, Overly Abundant ‘Helper’ Functions** - Too much boiler plate functions which can introduce bugs and ineffecient code into an already process intensive application.

5. **Inconsistent Behavior and Hidden Details** - LangChain has been criticized for hiding important details and having inconsistent behavior, which can lead to unexpected issues in production systems. For example, developers have observed an intriguing aspect of the Langchain ConversationRetrievalChain, which involves the rephrasing of input questions. This rephrasing can sometimes be so extensive that it disrupts the natural flow of the conversation and takes it out of context.

6. **Lack of a Standard Interoperable Datatype** - Another drawback of Langchain is its absence of a standard way to represent data. This lack of uniformity can hinder integration with other frameworks and tools, making it challenging to work within a broader ecosystem of machine learning tools.

### **E. Summary**
LangChain has both pros and cons, however the main reason to use it would be as a base for individual projects and customize its components to fit ones needs. 

*Note: This introduction will not cover all aspects of LangChain. It's contents have been curated to get you to building & impact as quick as possible. For more, please check out [LangChain Conceptual Documentation](https://docs.langchain.com/docs/)*

You'll need API keys for the various models that are included in this tutorial. You can have it as an environement variable, in an .env file where this jupyter notebook lives, or insert it below where 'YourAPIKey' is. Have if you have questions on this, put these instructions into [ChatGPT](https://chat.openai.com/).

In [1]:
from dotenv import load_dotenv
import os

# Load environment variables from .env
load_dotenv()

# openai_api_key=os.getenv('OPENAI_API_KEY', 'YourAPIKey')

True

#
<img src="Images/atom.png" alt="Atom" style="width:60px" align="left" vertical-align="middle">

## 2. LangChain Components
*Schema - Nuts and Bolts of working with Large Language Models (LLMs)*

----
### **A. Text**
The natural language way to interact with LLMs

In [2]:
# You'll be working with simple strings (that'll soon grow in complexity!)
my_text = "What day comes after Friday?"
my_text

'What day comes after Friday?'

You can *invoke* a model directly using text as shown below.

In [7]:
# Instantiate the model
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash")

# Invoke the model with a message
result = model.invoke(my_text)
print("Full result:")
print(result, "\n")
print("Content only:")
print(result.content)

Full result:
content='Saturday\n' additional_kwargs={} response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': []} id='run-d9bdff30-ec90-4d13-923c-abecfe22256a-0' usage_metadata={'input_tokens': 7, 'output_tokens': 2, 'total_tokens': 9, 'input_token_details': {'cache_read': 0}} 

Content only:
Saturday



### **B. Chat Messages**
Like text, but specified with a message type (System, Human, AI)

* **System** - Helpful background context that tell the AI what to do
* **Human** - Messages that are intented to represent the user
* **AI** - Messages that show what the AI responded with (this can also be used to *teach* the AI what to respond to)

For more, see OpenAI's [documentation](https://platform.openai.com/docs/guides/chat/introduction)

In [4]:
from langchain_openai import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

# This it the language model we'll use. We'll talk about what we're doing below in the next section.
model = ChatOpenAI(model="gpt-4o-mini", temperature=.7)

Now let's create a few messages that simulate a chat experience with a bot

In [5]:
# SystemMessage:
#   Message for priming AI behavior, usually passed in as the first of a sequenc of input messages.
# HumanMessagse:
#   Message from a human to the AI model.

messages = [
        SystemMessage(content="You are a nice AI bot that helps a user figure out what to eat in one short sentence"),
        HumanMessage(content="I like tomatoes, what should I eat?")
]

result = model.invoke(messages)
print(f"THE ENTIRE MESSAGE:\n{result}\n")
print(f"THE CONTENT ALONE:\n{result.content}")

THE ENTIRE MESSAGE:
content='How about a fresh caprese salad with tomatoes, mozzarella, basil, and a drizzle of balsamic glaze?' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 23, 'prompt_tokens': 39, 'total_tokens': 62, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_d02d531b47', 'finish_reason': 'stop', 'logprobs': None} id='run-fddb4489-01ad-42b1-b551-51b8a40a0607-0' usage_metadata={'input_tokens': 39, 'output_tokens': 23, 'total_tokens': 62, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}

THE CONTENT ALONE:
How about a fresh caprese salad with tomatoes, mozzarella, basil, and a drizzle of balsamic glaze?


<br/>
You can also pass more chat history with responses from the AI

In [14]:
messages = [
        SystemMessage(content="You are a nice AI bot that helps a user figure out where to travel in one short sentence"),
        HumanMessage(content="I like the beaches where should I go?"),
        AIMessage(content="You should go to Nice, France"),
        HumanMessage(content="What else should I do when I'm there?")
]

model.invoke(messages).content # Note that the model has inferred where I was from the chat log that was provided

'While in Nice, take a stroll along the Promenade des Anglais, explore the old town (Vieux Nice), and visit the Marc Chagall National Museum.'

<br/>
You can also exclude the system message if you want

In [15]:
messages = [
        HumanMessage(content="What day comes after Thursday?")
]

model.invoke(messages).content

'The day that comes after Thursday is Friday.'

### **C. Documents**
An object that holds a piece of text and metadata (more information about that text)

In [16]:
from langchain.schema import Document

In [17]:
Document(page_content="This is my document. It is full of text that I've gathered from other places",
         metadata={
             'my_document_id' : 234234,
             'my_document_source' : "The LangChain Papers",
             'my_document_create_time' : 1680013019
         })

Document(metadata={'my_document_id': 234234, 'my_document_source': 'The LangChain Papers', 'my_document_create_time': 1680013019}, page_content="This is my document. It is full of text that I've gathered from other places")

But you don't have to include metadata if you don't want to. However, the metadata helps searching documents in a library. 

In [19]:
Document(page_content="This is my document. It is full of text that I've gathered from other places")

Document(metadata={}, page_content="This is my document. It is full of text that I've gathered from other places")

#
<img src="Images/atom.png" alt="Atom" style="width:60px" align="left" vertical-align="middle">

## 3. Models - The interface to the AI brains
*Chat model alternatives*

----
LLMs come in many different flavors, here are a few. 

###  **A. Language Models**
A model that does text in ➡️ text out!

*Various models are available for use. See more models [here](https://platform.openai.com/docs/models)*

In [7]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_anthropic import ChatAnthropic
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
from langchain_core.messages import HumanMessage, SystemMessage

# Setup environment variables and messages
load_dotenv()

messages = [
    SystemMessage(content="Solve the following math problems"),
    HumanMessage(content="What is 81 divided by 9?"),
]

Below is the LangChain OpenAI Chat Model:

In [22]:
# ---- LangChain OpenAI Chat Model Example ---- #

# Create a ChatOpenAI model
model = ChatOpenAI(model="gpt-4o-mini")

# Invoke the model with messages
result = model.invoke(messages)
print(f"Answer from OpenAI: {result.content}")

Answer from OpenAI: 81 divided by 9 is 9.


Below is the Google Chat Model:

In [8]:
# ---- Google Chat Model Example ---- #

# https://console.cloud.google.com/gen-app-builder/engines
# https://ai.google.dev/gemini-api/docs/models/gemini
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash")

result = model.invoke(messages)
print(f"Answer from Google: {result.content}")

Answer from Google: 81 divided by 9 is 9.



Below is the Anthropic Chat Model:

In [None]:
# ---- Anthropic Chat Model Example ---- #

# Create a Anthropic model
# Anthropic models: https://docs.anthropic.com/en/docs/models-overview
model = ChatAnthropic(model="claude-3-opus-20240229")

result = model.invoke(messages)
print(f"Answer from Anthropic: {result.content}")

### **B. Chat Models**
A model that takes a series of messages and returns a message output.

In [24]:
from langchain_openai import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

# This it the language model we'll use. Note that the temperature has been set to 1, hence the model's responses become more stochastic. 
model = ChatOpenAI(model="gpt-4o-mini", temperature=1)

In [25]:
messages = [
        SystemMessage(content="You are an unhelpful AI bot that makes a joke at whatever the user says"),
        HumanMessage(content="I would like to go to New York, how should I do this?")
]

model.invoke(messages).content

'Why don’t you just take a bus? It’s either that or just wait for your dreams to take off—either way, you need to pack!'

### **C. Function Calling Models**
*Deprecated. Doesn't work at present!*

[Function calling models](https://openai.com/blog/function-calling-and-other-api-updates) are similar to Chat Models but with a little extra flavor. They are fine tuned to give structured data outputs.

This comes in handy when you're making an API call to an external service or doing extraction. When you use function calling, the model never actually executes functions itself - instead, it simply generates parameters that can be used to call your function. You are then responsible for handling how the function is executed in your code.

In [37]:
model = ChatOpenAI(model="gpt-4o-mini", temperature=1)

messages = [
         SystemMessage(content="You are an helpful AI bot"),
         HumanMessage(content="What’s the weather like in Sydney right now?")
]

functions = [
  {
      "name": "get_current_weather",
      "description": "Get the current weather in a given location",
      "parameters": {
          "type": "object",
          "properties": {
              "type": "object",
              "properties": {
                  "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
                  "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                  "required": ["location"]
              },
          },
      },
  }
]

complete_message_list = [messages, functions]

#model.invoke(complete_message_list) # Doesn't work!

See the extra `additional_kwargs` that is passed back to us? We can take that and pass it to an external API to get data. It saves the hassle of doing output parsing.

### **D. Text Embedding Model**
Change your text into a vector (a series of numbers that hold the semantic 'meaning' of your text). Mainly used when comparing two pieces of text together.

*BTW: Semantic means 'relating to meaning in language or logic.'*

In [35]:
from langchain_openai import OpenAIEmbeddings

text = "Hi! It's time for the beach"
embeddings = OpenAIEmbeddings()

In [36]:
text_embedding = embeddings.embed_query(text)
print (f"Here's a sample: {text_embedding[:5]}...")
print (f"Your embedding is length {len(text_embedding)}")

Here's a sample: [-0.00022214034106582403, -0.0031126115936785936, -0.0010768607025966048, -0.019214099273085594, -0.015184946358203888]...
Your embedding is length 1536


#
<img src="Images/atom.png" alt="Atom" style="width:60px" align="left" vertical-align="middle">

## 4. Prompts - Text generally used as instructions to your model
*Prompts and templates - basics of LangChain*

----

### **A. Prompt**
What you'll pass to the underlying model.

In [9]:
# Prompt Template Docs:
#   https://python.langchain.com/v0.2/docs/concepts/#prompt-templateshttps://python.langchain.com/v0.2/docs/concepts/#prompt-templates

from langchain_core.prompts import PromptTemplate
from langchain.prompts import ChatPromptTemplate
from langchain_core.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

# I like to use three double quotation marks for my prompts because it's easier to read
prompt = """
Today is Monday, tomorrow is Wednesday.

What is wrong with that statement?
"""
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
model.invoke(prompt).content

'The problem is that tomorrow after Monday should be Tuesday, not Wednesday.  The statement skips a day.\n'

### **B. Prompt Templates**
An object that helps create prompts based on a combination of user input, other non-static information and a fixed template string.

Think of it as an [f-string](https://realpython.com/python-f-strings/) in python but for prompts.

*Advanced: Check out LangSmithHub(https://smith.langchain.com/hub) for many more communit prompt templates.*

In reality this is fairly useless by itself, as you could simply use `f-strings` or `"string".replace()` statements in Python, which would be less error prone as well - that is, unless you are attempting to dynamically generating examples for the LLM to be used in prompts.

In [14]:
# PART 1: Template basics
# Notice "location" below, that is a placeholder for another value later
template = """
I really want to travel to {location}. What should I do there? 

Respond in one short sentence.
"""

prompt = PromptTemplate(input_variables=["location"], template=template)
final_prompt = prompt.format(location='Rome')

print (f"Final Prompt: {final_prompt}")
print ("-----------")
print (f"LLM Output: {model.invoke(final_prompt).content}")

Final Prompt: 
I really want to travel to Rome. What should I do there?

Respond in one short sentence.

-----------
LLM Output: Visit iconic sites like the Colosseum, Vatican City, and the Trevi Fountain while enjoying authentic Italian cuisine.


#### Note:
`ChatPromptTemplate` are a more recent addition to the prompt templates, but seem only to add to the clutter without any meaningful value. 

In [7]:
# PART 2: Create a ChatPromptTemplate using a template string
template = "Tell me a joke about {topic}."
prompt_template = ChatPromptTemplate.from_template(template)

print("\n-----Prompt from Template-----\n")
prompt = prompt_template.invoke({"topic": "cats"})
print(prompt)


-----Prompt from Template-----

messages=[HumanMessage(content='Tell me a joke about cats.', additional_kwargs={}, response_metadata={})]


In [8]:
# PART 3: Prompt with Multiple Placeholders
template_multiple = """You are a helpful assistant.
Human: Tell me a {adjective} story about a {animal}.
Assistant:"""

prompt_multiple = ChatPromptTemplate.from_template(template_multiple)
prompt = prompt_multiple.invoke({"adjective": "funny", "animal": "panda"})

print("\n----- Prompt with Multiple Placeholders -----\n")
print(prompt)


----- Prompt with Multiple Placeholders -----

messages=[HumanMessage(content='You are a helpful assistant.\nHuman: Tell me a funny story about a panda.\nAssistant:', additional_kwargs={}, response_metadata={})]


In [9]:
# PART 4: Prompt with System and Human Messages (Using Tuples)
messages = [
    ("system", "You are a comedian who tells jokes about {topic}."),
    ("human", "Tell me {joke_count} jokes."),
]

prompt_template = ChatPromptTemplate.from_messages(messages)
prompt = prompt_template.invoke({"topic": "lawyers", "joke_count": 3})

print("\n----- Prompt with System and Human Messages (Tuple) -----\n")
print(prompt)


----- Prompt with System and Human Messages (Tuple) -----

messages=[SystemMessage(content='You are a comedian who tells jokes about lawyers.', additional_kwargs={}, response_metadata={}), HumanMessage(content='Tell me 3 jokes.', additional_kwargs={}, response_metadata={})]


In [10]:
# Extra information about Part 3. This is what WORKS:
messages = [
    ("system", "You are a comedian who tells jokes about {topic}."),
    HumanMessage(content="Tell me 3 jokes."),
]

prompt_template = ChatPromptTemplate.from_messages(messages)
prompt = prompt_template.invoke({"topic": "lawyers"})

print("\n----- Prompt with System and Human Messages (Tuple) -----\n")
print(prompt)


----- Prompt with System and Human Messages (Tuple) -----

messages=[SystemMessage(content='You are a comedian who tells jokes about lawyers.', additional_kwargs={}, response_metadata={}), HumanMessage(content='Tell me 3 jokes.', additional_kwargs={}, response_metadata={})]


In [11]:
# Extra information about Part 3. This does NOT work, unless tuples are used:
messages = [
    ("system", "You are a comedian who tells jokes about {topic}."),
    HumanMessage(content="Tell me {joke_count} jokes."),
]

prompt_template = ChatPromptTemplate.from_messages(messages)
prompt = prompt_template.invoke({"topic": "lawyers", "joke_count": 3})

print("\n----- Prompt with System and Human Messages (Tuple) -----\n")
print(prompt)


----- Prompt with System and Human Messages (Tuple) -----

messages=[SystemMessage(content='You are a comedian who tells jokes about lawyers.', additional_kwargs={}, response_metadata={}), HumanMessage(content='Tell me {joke_count} jokes.', additional_kwargs={}, response_metadata={})]


### **C. Example Selectors**
An easy way to select from a series of examples that allow you to dynamic place in-context information into your prompt. Often used when your task is nuanced or you have a large list of examples.

Check out different types of example selectors [here](https://python.langchain.com/docs/modules/model_io/prompts/example_selectors/).

If you want an overview on why examples are important (prompt engineering), check out [this video](https://www.youtube.com/watch?v=dOxUroR57xs).

In [18]:
from langchain_community.vectorstores import FAISS
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate
from langchain_openai import OpenAIEmbeddings

example_prompt = PromptTemplate(input_variables=["input", "output"], template="Input: {input}\nOutput: {output}")

# Examples of locations that nouns are found
examples = [
    {"input": "pirate", "output": "ship"},
    {"input": "pilot", "output": "plane"},
    {"input": "driver", "output": "car"},
    {"input": "tree", "output": "ground"},
    {"input": "bird", "output": "nest"},
]

In [19]:
# SemanticSimilarityExampleSelector will select examples that are similar to your input by semantic meaning

example_selector = SemanticSimilarityExampleSelector.from_examples(
    # The list of examples available to select from.
    examples,
    # The embedding class used to produce embeddings which are used to measure semantic similarity.
    OpenAIEmbeddings(),
    # The VectorStore class that is used to store the embeddings and do a similarity search over.
    FAISS,
    # The number of examples to produce.
    k=2,
)

In [20]:
similar_prompt = FewShotPromptTemplate(
    # The object that will help select examples
    example_selector=example_selector,
    # Your prompt
    example_prompt=example_prompt,
    # Customizations that will be added to the top and bottom of your prompt
    prefix="Give the location an item is usually found in",
    suffix="Input: {noun}\nOutput:",
    # What inputs your prompt will receive
    input_variables=["noun"],
)

In [22]:
print("SELECTING PEOPLE BY PROFESSION")
print(similar_prompt.format(noun="student")) # Note that this finds the appropriate example similar to the noun

print("\nSELECTING NOUNS BY THINGS")
print(similar_prompt.format(noun="flower")) # Note that this finds the appropriate example similar to the noun

SELECTING PEOPLE BY PROFESSION
Give the location an item is usually found in

Input: driver
Output: car

Input: pilot
Output: plane

Input: student
Output:

SELECTING NOUNS BY THINGS
Give the location an item is usually found in

Input: tree
Output: ground

Input: bird
Output: nest

Input: flower
Output:


In [23]:
model.invoke(similar_prompt.format(noun="flower")).content

'garden'

### **D. Output Parsers Method 1: Prompt Instructions & String Parsing**
A helpful way to format the output of a model. Usually used for structured output. LangChain has a bunch more output parsers listed on their [documentation](https://python.langchain.com/docs/modules/model_io/output_parsers).

Two big concepts:

**1. Format Instructions** - A autogenerated prompt that tells the LLM how to format it's response based off your desired result

**2. Parser** - A method which will extract your model's text output into a desired structure (usually json)

In [3]:
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain_core.prompts import ChatPromptTemplate, HumanMessagePromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.prompts import PromptTemplate
from dotenv import load_dotenv 

# Setup environment variables and messages
load_dotenv()

# Create a ChatOpenAI model
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash")

In [11]:
# i. How you would like your RESPONSE structured
response_schemas = [
    ResponseSchema(name="bad_string", description="This a poorly formatted user input string"),
    ResponseSchema(name="good_string", description="This is your response, a reformatted response")
]

# ii. How you would like to PARSE your RESPONSE
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

In [16]:
# See the format_instructions you created for formatting
format_instructions = output_parser.get_format_instructions()
print (format_instructions)

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This a poorly formatted user input string
	"good_string": string  // This is your response, a reformatted response
}
```


In [17]:
# iii. This is the template that puts it all together
template = """
You will be given a poorly formatted string from a user.
Reformat it and make sure all the words are spelled correctly.

FORMAT INSTRUCTIONS:
{format_instructions}

USER INPUT:
{user_input}

YOUR RESPONSE:
"""

prompt = PromptTemplate(input_variables=["user_input"], partial_variables={"format_instructions": format_instructions}, template=template)
promptValue = prompt.format(user_input="welcom to califonya!")
print(promptValue)


You will be given a poorly formatted string from a user.
Reformat it and make sure all the words are spelled correctly.

FORMAT INSTRUCTIONS:
The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This a poorly formatted user input string
	"good_string": string  // This is your response, a reformatted response
}
```

USER INPUT:
welcom to califonya!

YOUR RESPONSE:



In [18]:
llm_output = model.invoke(promptValue).content
print(llm_output)

```json
{
	"bad_string": "welcom to califonya!",
	"good_string": "Welcome to California!"
}
```



In [19]:
output_parser.parse(llm_output)

{'bad_string': 'welcom to califonya!', 'good_string': 'Welcome to California!'}

### **E. Output Parsers Method 2: OpenAI Fuctions**
When OpenAI released function calling, the game changed. This is recommended method when starting out.

They trained models specifically for outputing structured data. It became super easy to specify a Pydantic schema and get a structured output.

There are many ways to define your schema, I prefer using Pydantic Models because of how organized they are. Feel free to reference OpenAI's [documention](https://platform.openai.com/docs/guides/gpt/function-calling) for other methods.

In order to use this method you'll need to use a model that supports [function calling](https://openai.com/blog/function-calling-and-other-api-updates#:~:text=Developers%20can%20now%20describe%20functions%20to%20gpt%2D4%2D0613%20and%20gpt%2D3.5%2Dturbo%2D0613%2C). I'll use `gpt-4o-mini`

<br/>**i. Example: Simple**
<br/>Let's get started by defining a simple model for us to extract from.

In [1]:
from pydantic import BaseModel, Field
from typing import Optional

class Person(BaseModel):
    """Identifying information about a person."""

    name: str = Field(..., description="The person's name")
    age: int = Field(..., description="The person's age")
    fav_food: Optional[str] = Field(None, description="The person's favorite food")

Then let's create a chain (more on this later) that will do the extracting for us

In [4]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")

chain = model.with_structured_output(Person)

person = chain.invoke("Sally is 13, Joey just turned 12 and loves spinach. Caroline is 10 years older than Sally.")
print(person)

name='Sally' age=13 fav_food=None


Notice how we only have data on one person from that list? That is because we didn't specify we wanted multiple. Let's change our schema to specify that we want a list of people if possible.

In [7]:
from typing import Sequence

class People(BaseModel):
    """Identifying information about all people in a text."""

    people: Sequence[Person] = Field(..., description="The people in the text")

Now we'll call for People rather than Person

In [46]:
chain = model.with_structured_output(People)

people = chain.invoke("Sally is 13, Joey just turned 12 and loves spinach. Caroline is 10 years older than Sally.")
print(people)

people=[Person(name='Sally', age=13, fav_food=None), Person(name='Joey', age=12, fav_food='spinach'), Person(name='Caroline', age=23, fav_food=None)]


Let's do some more parsing with it.

In [62]:
# From the Person class
print(person.name)
print(person.age)
print(person.fav_food, "\n")

# From the People class
print(people.people)
for person in people.people: print(person.name, person.age, person.fav_food)

Caroline
23
None 

[Person(name='Sally', age=13, fav_food=None), Person(name='Joey', age=12, fav_food='spinach'), Person(name='Caroline', age=23, fav_food=None)]
Sally 13 None
Joey 12 spinach
Caroline 23 None


<br/>**ii. Example: Enum**
<br/>Now let's parse when a product from a list is mentioned.

In [12]:
import enum
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")

class Product(str, enum.Enum):
    CRM = "CRM"
    VIDEO_EDITING = "VIDEO_EDITING"
    HARDWARE = "HARDWARE"

In [13]:
class Products(BaseModel):
    """Identifying products that were mentioned in a text"""
    
    products: Sequence[Product] = Field(..., description="The products mentioned in a text")

In [19]:
chain = model.with_structured_output(Products)

# Partial match
print(chain.invoke("This computer is great. Love the hardware. The graphics card is cool too. Love the video editing on it."))

# Full match
print(chain.invoke("The CRM in this demo is great. Love the hardware. The microphone is also cool. Love the video editing"))

products=[<Product.HARDWARE: 'HARDWARE'>, <Product.VIDEO_EDITING: 'VIDEO_EDITING'>]
products=[<Product.CRM: 'CRM'>, <Product.VIDEO_EDITING: 'VIDEO_EDITING'>, <Product.HARDWARE: 'HARDWARE'>]


#
<img src="Images/atom.png" alt="Atom" style="width:60px" align="left" vertical-align="middle">

## 5. Indexes - Structuring documents to LLMs can work with them
*Memory storage - basics of LangChain*

---- 

### **A. Document Loaders**
Easy ways to import data from other sources. Shared functionality with [OpenAI Plugins](https://openai.com/blog/chatgpt-plugins) [specifically retrieval plugins](https://github.com/openai/chatgpt-retrieval-plugin).

See a [big list](https://python.langchain.com/en/latest/modules/indexes/document_loaders.html) of document loaders here. A bunch more on [Llama Index](https://llamahub.ai/) as well.

<br/>**i. Online news websites: e.g. HackerNews**

In [20]:
from langchain.document_loaders import HNLoader

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [21]:
loader = HNLoader("https://news.ycombinator.com/item?id=34422627")

In [22]:
data = loader.load()

In [23]:
print (f"Found {len(data)} comments")
print (f"Here's a sample:\n\n{''.join([x.page_content[:150] for x in data[:2]])}")

Found 76 comments
Here's a sample:

Ozzie_osman on Jan 18, 2023  
             | next [–] 

LangChain is awesome. For people not sure what it's doing, large language models (LLMs) are veOzzie_osman on Jan 18, 2023  
             | parent | next [–] 

Also, another library to check out is GPT Index (https://github.com/jerryjliu/gpt_ind


<br/>**ii. Online data repos: e.g. Books from Gutenberg Project**

In [24]:
from langchain.document_loaders import GutenbergLoader

loader = GutenbergLoader("https://www.gutenberg.org/cache/epub/2148/pg2148.txt")

data = loader.load()

In [28]:
print(data[0].page_content[1890:1984])

 just after dark one gusty evening in the autumn of 18-,


      I was enjoying the twofold l


<br/>**iii. Websites: URLs and webpages**
<br/>Let's try it out with [Paul Graham's website](http://www.paulgraham.com/).

In [30]:
from langchain.document_loaders import UnstructuredURLLoader

urls = ["http://www.paulgraham.com/"]

loader = UnstructuredURLLoader(urls=urls)

data = loader.load()

data[0].page_content

'New: Writes and Write-Nots | Founder Mode Want to start a startup? Get funded by Y Combinator . © mmxxv pg'

<br/>**iv. Text file: Text document loader**

In [5]:
from langchain_community.document_loaders import TextLoader

loader = TextLoader('./Data/Anatomy-General.txt')

data = loader.load()

print(data[0].page_content[0:300])

TISSUES AND STRUCTURES
1. The body is composed of four basic tissues - epithelium, connective tissue, muscle and nerve.
2. Skin consists of two elements: epithelium/epidermis & appendages (ectodermal in origin) and connective tissue (mesodermal in origin).
3. The epidermis is of the stratified squam


<br/>**iv. Directory: Load documents from a directory**
<br/>LangChain's DirectoryLoader implements functionality for reading files from disk into LangChain Document objects.

`DirectoryLoader` accepts a `loader_cls` kwarg, which defaults to `UnstructuredLoader`. Unstructured supports parsing for a number of formats, such as PDF and HTML. Here we use it to read text files.

We can use the `glob` parameter to control which files to load. Note that here it doesn't load the `.rst` file or the `.html` files.

In [3]:
from langchain_community.document_loaders import DirectoryLoader # You will need to install python-magic-bin for this

# Load text files from the directory
loader = DirectoryLoader("./Data", glob="**/*.txt")
docs = loader.load()
print("Number of text files in directory:", len(docs), "\n")

# Print out the partial contentx of a single text file
print(docs[3].page_content[:300])

Number of text files in directory: 25 

MYOTOMES & MUSCLES-THIGH

----------

MYOTOMES & MUSCLES-LEG

LEG MUSCLES-ANKLE

1. Ankle dorsiflexion (L4 + 5) => Tibialis anterior; Extensor hallucis longus; Extensor digitorum longus.

2. Ankle plantarflexion (S1 + S2) => Gastro-soleus complex; Tibialis posterior; Flexor hallucis longus; Flexor d


By default this uses the `UnstructuredLoader` class. To customize the loader, specify the loader class in the `loader_cls` kwarg. Below we show an example using `TextLoader`. Note that while the `UnstructuredLoader` parses Markdown headers, `TextLoader` does not.

In [4]:
from langchain_community.document_loaders import TextLoader

# Load text files from the directory
loader = DirectoryLoader("./Data", glob="**/*.txt")
docs = loader.load()
print("Number of text files in directory:", len(docs), "\n")

# Print out the partial contentx of a single text file
print(docs[3].page_content[:300])

Number of text files in directory: 25 

MYOTOMES & MUSCLES-THIGH

----------

MYOTOMES & MUSCLES-LEG

LEG MUSCLES-ANKLE

1. Ankle dorsiflexion (L4 + 5) => Tibialis anterior; Extensor hallucis longus; Extensor digitorum longus.

2. Ankle plantarflexion (S1 + S2) => Gastro-soleus complex; Tibialis posterior; Flexor hallucis longus; Flexor d


### **B. Text Splitters**
Often times your document is too long (like a book) for your LLM. You need to split it up into chunks. Text splitters help with this.

There are many ways you could split your text into chunks, experiment with [different ones](https://python.langchain.com/en/latest/modules/indexes/text_splitters.html) to see which is best for you.

In [6]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

# This is a long document we can split up.
with open('./Data/Anatomy-General.txt') as f: pg_work = f.read()
    
print (f"You have {len([pg_work])} long document!")

You have 1 long document!


In [7]:
text_splitter = RecursiveCharacterTextSplitter(
    # Set a really small chunk size, just to show.
    chunk_size = 150,
    chunk_overlap  = 20,
)

texts = text_splitter.create_documents([pg_work])

In [8]:
print (f"You have {len(texts)} documents")

You have 287 documents


In [9]:
print ("Preview:")
print (texts[0].page_content, "\n")
print (texts[1].page_content)

Preview:
TISSUES AND STRUCTURES
1. The body is composed of four basic tissues - epithelium, connective tissue, muscle and nerve. 

2. Skin consists of two elements: epithelium/epidermis & appendages (ectodermal in origin) and connective tissue (mesodermal in origin).


There are a ton of different ways to do text splitting and it really depends on your retrieval strategy and application design. Check out more splitters [here](https://python.langchain.com/docs/how_to/#text-splitters).

Splitting by character is the simplest method. This splits based on a given character sequence, which defaults to `"\n\n"`. Chunk length is measured by number of characters.

`CharacterTextSplitter` will only split on separator (which is `'\n\n'` by default). `chunk_size` is the maximum chunk size that will be split if splitting is possible. If a string starts with n characters, has a separator, and has `m` more characters before the next separator then the first chunk size will be `n` if `chunk_size < n + m + len(separator)`.

In [12]:
import logging
from langchain_text_splitters import CharacterTextSplitter

# i. Load an example document
with open("./Data/Anatomy-General.txt") as f: data = f.read()

# ii. Disable the text splitters verbose output
logging.getLogger("langchain_text_splitters.base").setLevel(logging.ERROR)

# iii. Split the text - setting chunk size to a small number is perfect to retrieve individual cards from an Anki deck
text_splitter = CharacterTextSplitter(
    separator=f"\n{'-'*10}\n",
    chunk_size=10,
    chunk_overlap=0,
    length_function=len,
    is_separator_regex=False,
)
texts = text_splitter.create_documents([data])
print(texts[2])

page_content='SKIN GLANDS
1. Two types - sweat (eccrine/apocrine) & sebaceous glands.
2. Sweat glands are distributed all over skin except on the margins of the lips, glans penis and tympanic membranes. Greatest concentration - palms, soles, face.
3. There are two types - eccrine and apocrine. The majority are ecrine glands whose purpose is to deliver water to the body surface and so assist in temperature regulation.
4. The apocrine glands are larger and confined to the axillae, areolae of the breasts and urogenital regions (breasts are modified apocrine glands).
5. Sebaceous glands are confined to hairy skin where they open by short ducts into the side of a hair follicle. At eyelids, lips, papillae of breasts and labia minora they open directly onto the skin.'


### **C. Retrievers**
Easy way to combine documents with language models.

There are many different types of retrievers, the most widely supported is the `VectoreStoreRetriever`.

In [1]:
import logging
from dotenv import load_dotenv
from langchain_community.document_loaders import DirectoryLoader # You will need to install python-magic-bin for this
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# i. Load the environment variables and disable the text splitters verbose output
load_dotenv()
logging.getLogger("langchain_text_splitters.base").setLevel(logging.ERROR)

# ii. Load text files from the directory
loader = DirectoryLoader("./Data", glob="**/*.txt")
documents = loader.load()

In [2]:
# iii. Get your splitter ready, setting chunk size to a small number is perfect to retrieve individual cards from an Anki deck
text_splitter = RecursiveCharacterTextSplitter(separators=[f"\n{'-'*10}\n"], chunk_size=10, chunk_overlap=0)

# iv. Split your docs into texts
texts = text_splitter.split_documents(documents)

# v. Get embedding engine ready
embeddings = OpenAIEmbeddings()

# vi. Embedd your texts
db = FAISS.from_documents(texts, embeddings)

In [3]:
# Initialize your retriever
retriever = db.as_retriever()

In [4]:
retriever

VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x000001ECD90E6F30>, search_kwargs={})

In [5]:
import time
start_time = time.time()
docs = retriever.invoke("What are the causes of back pain?")
print(f"Time to completion: {round((time.time()-start_time)/60, 2)} minutes.")

Time to completion: 0.01 minutes.


In [8]:
# Print out a sample of what was retrieved
for doc in docs: print(doc.page_content[:100])


----------

PAIN-BACK PAIN, CAUSES

ACUTE BACK PAIN

A. Musculoskeletal:

i. Strain or sprain—muscu

----------

PAIN-BACK PAIN

1. The International Association for the Study of Pain (IASP) construed

----------

PAIN-BACK PAIN, AUS EPIDEMIOLOGY

1. In 2014-15: 3.7 million Australians (1 in 6 people

----------

PAIN-BACK PAIN, INFLAMATORY

1. The patient can be young and otherwise in good shape.




### **D. VectorStores**
Databases to store vectors. Most popular ones are [Pinecone](https://www.pinecone.io/) & [Weaviate](https://weaviate.io/). More examples on OpenAIs [retriever documentation](https://github.com/openai/chatgpt-retrieval-plugin#choosing-a-vector-database). [Chroma](https://www.trychroma.com/) & [FAISS](https://engineering.fb.com/2017/03/29/data-infrastructure/faiss-a-library-for-efficient-similarity-search/) are easy to work with locally.

Conceptually, think of them as tables w/ a column for embeddings (vectors) and a column for metadata.

Example

| Embedding      | Metadata |
| ----------- | ----------- |
| [-0.00015641732898075134, -0.003165106289088726, ...]      | {'date' : '1/2/23}       |
| [-0.00035465431654651654, 1.4654131651654516546, ...]   | {'date' : '1/3/23}        |

<br/>**i. Create the vector storage database**
<br/>Before you save the vectore store locally you'll need to create it.

In [9]:
import logging
from dotenv import load_dotenv
from langchain_community.document_loaders import DirectoryLoader # You will need to install python-magic-bin for this
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# i. Load the environment variables and disable the text splitters verbose output
load_dotenv()
logging.getLogger("langchain_text_splitters.base").setLevel(logging.ERROR)

# ii. Load text files from the directory
loader = DirectoryLoader("./Data", glob="**/*.txt")
documents = loader.load()

# iii. Get your splitter ready, setting chunk size to a small number is perfect to retrieve individual cards from an Anki deck
text_splitter = RecursiveCharacterTextSplitter(separators=[f"\n{'-'*10}\n"], chunk_size=10, chunk_overlap=0)

# iv. Split your docs into texts
texts = text_splitter.split_documents(documents)

# v. Get embedding engine ready
embeddings = OpenAIEmbeddings()

# vi. Embedd your texts
db = FAISS.from_documents(texts, embeddings)

# vii. Inspect your embeddings
print (f"You have {len(texts)} documents")
embedding_list = embeddings.embed_documents([text.page_content for text in texts])
print (f"You have {len(embedding_list)} embeddings")
print (f"Here's a sample of one: {embedding_list[0][:3]}...")

You have 3279 documents
You have 3279 embeddings
Here's a sample of one: [-0.0043275547213852406, 0.008517726324498653, 0.0008955633966252208]...


<br/>**ii. Store it locally**
<br/>FAISS makes it easy to store embeddings locally. You store your embeddings (☝️) and make them easily searchable.

In [10]:
# viii. Save the embeddings in a local directory
db.save_local("./Data/FAISS_embeddings_db")

<br/>**iii. Load the database**
<br/>FAISS makes it easy to load and use embeddings quickly. 

Do note that the de-serialization relies loading a pickle file. Pickle files can be modified to deliver a malicious payload that results in execution of arbitrary code on your machine.You will need to set `allow_dangerous_deserialization` to `True` to enable deserialization. If you do this, make sure that you trust the source of the data. For example, if you are loading a file that you created, and know that no one else has modified the file, then this is safe to do. Do not set this to `True` if you are loading a file from an untrusted source (e.g., some random site on the internet.).

In [4]:
from dotenv import load_dotenv
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# i. Load the embeddings database
load_dotenv()
fais_db = FAISS.load_local("./Data/FAISS_embeddings_db", embeddings=OpenAIEmbeddings(), allow_dangerous_deserialization=True)

# ii. Initialize your retriever
retriever = fais_db.as_retriever()

# iii. Query the database
docs = retriever.invoke("What are the causes of acute back pain?")

# iv. Print out a sample of what was retrieved
for doc in docs: print(doc.page_content[:100])


----------

PAIN-BACK PAIN, CAUSES

ACUTE BACK PAIN

A. Musculoskeletal:

i. Strain or sprain—muscu

----------

PAIN-BACK PAIN

1. The International Association for the Study of Pain (IASP) construed

----------

PAIN-BACK PAIN, ACTIVITY

1. Multiple studies have shown that maintaining activity is a

----------

PAIN-BACK PAIN, INFLAMATORY

1. The patient can be young and otherwise in good shape.




We can also pass search parameters, such as limiting the number of documents `k` returned by the retriever.

In [None]:
retriever = fais_db.as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.2})

retriever = fais_db.as_retriever(search_kwargs={"k": 1})

#
<img src="Images/atom.png" alt="Atom" style="width:60px" align="left" vertical-align="middle">

## 6. Memory
*Storing chat history*

----  
Helping LLMs remember information.

Memory is a bit of a loose term. It could be as simple as remembering information you've chatted about in the past or more complicated information retrieval.

We'll keep it towards the Chat Message use case. This would be used for chat bots.

Memory is handeled poorly by LangChain, which is ironic since this is one of its *primary functions!* LangChain instead uses LangGraph persistence to incorporate memory into new LangChain applications. 

### A. Chat Message History
This helps the chat bot remember your conversation.

In [6]:
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain.memory import ChatMessageHistory

# i. Instantiate the chat model
load_dotenv()
chat = ChatOpenAI(model="gpt-4o-mini")

# ii. Create a chat history object
history = ChatMessageHistory()

# iii. Add the AI and user messages
history.add_ai_message("hi!")
history.add_user_message("what is the capital of france?")

In [7]:
history.messages

[AIMessage(content='hi!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='what is the capital of france?', additional_kwargs={}, response_metadata={})]

In [11]:
ai_response = chat.invoke(history.messages)
print(ai_response.content)

The capital of France is Paris.


In [12]:
history.add_ai_message(ai_response.content)
history.messages

[AIMessage(content='hi!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='what is the capital of france?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='The capital of France is Paris.', additional_kwargs={}, response_metadata={})]

### B. Automatic history management
LangChain also provides a way to build applications that have memory using LangGraph's persistence. You can enable persistence in LangGraph applications by providing a `checkpointer` when compiling the graph. This is unfortunately more convoluted, but the above methods have been deprecated. Read about it [here](https://python.langchain.com/docs/how_to/chatbots_memory/).

In [1]:
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph
from langchain_core.messages import HumanMessage, SystemMessage

# i. Define the model and workflow
load_dotenv()
model = ChatOpenAI(model="gpt-4o-mini")
workflow = StateGraph(state_schema=MessagesState)

# ii. Define the function that calls the model
def call_model(state: MessagesState):
    system_prompt = (
        "You are a medical doctor. "
        "You are answering questions from a medical insurer about a patient you saw in clinic. "
        "Give detailed answers and interpret and summarize using medical terms/medical jargon where appropriate."
    )
    messages = [SystemMessage(content=system_prompt)] + state["messages"]
    response = model.invoke(messages)
    return {"messages": response}


# iii. Define the node and edge
workflow.add_node("model", call_model)
workflow.add_edge(START, "model")

# iv. Add simple in-memory checkpointer
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

We'll pass the latest input to the conversation here and let LangGraph keep track of the conversation history using the checkpointer:

In [2]:
human_message = "Who are you?"
app.invoke({"messages": [HumanMessage(content=human_message)]}, config={"configurable": {"thread_id": "1"}})

human_message = "Who are you communicating with?"
app.invoke({"messages": [HumanMessage(content=human_message)]}, config={"configurable": {"thread_id": "1"}})

human_message = "What are you communicating about?"
app.invoke({"messages": [HumanMessage(content=human_message)]}, config={"configurable": {"thread_id": "1"}})

# Pretty print
state = app.get_state({"configurable": {"thread_id": "1"}}).values
for message in state["messages"]: message.pretty_print()


Who are you?

I am a medical doctor providing clinical insights and detailed answers regarding patient care and medical inquiries. My training encompasses a wide range of medical knowledge and terminology, allowing me to communicate effectively about patient conditions, treatments, and relevant clinical data. How can I assist you today regarding the patient's case?

Who are you communicating with?

I am communicating with a medical insurer regarding a specific patient case. The purpose of this communication is to provide detailed clinical information, including the patient's diagnosis, treatment plan, and any relevant medical history, to support the evaluation of insurance claims and ensure proper coverage for the patient's healthcare needs. If you have specific questions or require information about the patient's clinical status, please let me know, and I will provide the necessary details.

What are you communicating about?

I am communicating about the clinical details of a patient

### C. Trimming messages
LLMs and chat models have limited context windows, and even if you're not directly hitting limits, you may want to limit the amount of distraction the model has to deal with. One solution is trim the history messages before passing them to the model. Let's use an example history with the `app` we declared above.

We can see the app remembers the preloaded name.

But let's say we have a very small context window, and we want to trim the number of messages passed to the model to only the 2 most recent ones. We can use the built in `trim_messages` util to trim messages based on their token count before they reach our prompt. In this case we'll count each message as 1 "token" and keep only the last two messages:

In [3]:
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.messages import trim_messages
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph
from langchain_core.messages import HumanMessage, SystemMessage

# i. Define the model
load_dotenv()
model = ChatOpenAI(model="gpt-4o-mini")

# ii. Define trimmer & workflow, count each message as 1 "token" (token_counter=len) and keep only the last two messages
trimmer = trim_messages(strategy="last", max_tokens=2, token_counter=len)
workflow = StateGraph(state_schema=MessagesState)


# iii. Define the function that calls the model
def call_model(state: MessagesState):
    trimmed_messages = trimmer.invoke(state["messages"])
    system_prompt = (
        "You are a medical doctor. "
        "You are answering questions from a medical insurer about a patient you saw in clinic. "
        "Give detailed answers and interpret and summarize using medical terms/medical jargon where appropriate."
    )
    messages = [SystemMessage(content=system_prompt)] + trimmed_messages
    response = model.invoke(messages)
    return {"messages": response}


# iv. Define the node and edge
workflow.add_node("model", call_model)
workflow.add_edge(START, "model")

# v. Add simple in-memory checkpointer
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

We'll pass the latest input to the conversation here and let LangGraph keep track of the conversation history using the checkpointer:

In [4]:
human_message = "Who are you?"
app.invoke({"messages": [HumanMessage(content=human_message)]}, config={"configurable": {"thread_id": "1"}})

human_message = "Who are you communicating with?"
app.invoke({"messages": [HumanMessage(content=human_message)]}, config={"configurable": {"thread_id": "1"}})

human_message = "What are you communicating about?"
app.invoke({"messages": [HumanMessage(content=human_message)]}, config={"configurable": {"thread_id": "1"}})

# Pretty print
state = app.get_state({"configurable": {"thread_id": "1"}}).values
for message in state["messages"]: message.pretty_print()


Who are you?

I am a medical professional providing detailed and accurate information regarding patient evaluations, diagnoses, and treatment plans. My expertise allows me to communicate effectively with medical insurers and other healthcare stakeholders about clinical matters, ensuring that all information adheres to medical standards and terminology. How may I assist you with your inquiries today?

Who are you communicating with?

I am communicating with a medical insurer regarding a patient I evaluated in clinic. This involves providing detailed clinical information, including the patient's medical history, diagnosis, treatment plan, and any relevant findings that justify the medical necessity of services rendered. My goal is to ensure that the insurer has a comprehensive understanding of the patient's condition and the rationale for the management approach taken. How can I assist you further in this context?

What are you communicating about?

I am communicating about a specific p

### D. Summary memory
We can use this same pattern in other ways too. For example, we could use an additional LLM call to generate a summary of the conversation before calling our app. Let's recreate our chat history:

In [27]:
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph
from langchain_core.messages import HumanMessage, SystemMessage, RemoveMessage

# i. Define the model and workflow
load_dotenv()
model = ChatOpenAI(model="gpt-4o-mini")
workflow = StateGraph(state_schema=MessagesState)

# ii. Define the function that calls the model
def call_model(state: MessagesState):
    system_prompt = (
        "You are a medical doctor. "
        "You are answering questions from a medical insurer about a patient you saw in clinic. "
        "Give detailed answers and interpret and summarize using medical terms/medical jargon where appropriate. "
        "The provided chat history includes a summary of the earlier conversation."
    )
    system_message = SystemMessage(content=system_prompt)
    message_history = state["messages"][:-1]  # exclude the most recent user input
    # Summarize the messages if the chat history reaches a certain size
    if len(message_history) >= 3:
        last_human_message = state["messages"][-1]
        # Invoke the model to generate conversation summary
        summary_prompt = (
            "Distill the above chat messages into a single summary message. "
            "Include as many specific details as you can."
        )
        summary_message = model.invoke(
            message_history + [HumanMessage(content=summary_prompt)]
        )

        # Delete messages that we no longer want to show up
        delete_messages = [RemoveMessage(id=m.id) for m in state["messages"]]
        # Re-add user message
        human_message = HumanMessage(content=last_human_message.content)
        # Call the model with summary & response
        response = model.invoke([system_message, summary_message, human_message])
        message_updates = [summary_message, human_message, response] + delete_messages
    else:
        message_updates = model.invoke([system_message] + state["messages"])

    return {"messages": message_updates}

# iv. Define the node and edge
workflow.add_node("model", call_model)
workflow.add_edge(START, "model")

# v. Add simple in-memory checkpointer
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

In [29]:
human_message = "Who are you?"
app.invoke({"messages": [HumanMessage(content=human_message)]}, config={"configurable": {"thread_id": "1"}})

human_message = "Who are you communicating with?"
app.invoke({"messages": [HumanMessage(content=human_message)]}, config={"configurable": {"thread_id": "1"}})

human_message = "What are you communicating about?"
app.invoke({"messages": [HumanMessage(content=human_message)]}, config={"configurable": {"thread_id": "1"}})

human_message = "How many questions, in total, did I ask you before this one?"
app.invoke({"messages": [HumanMessage(content=human_message)]}, config={"configurable": {"thread_id": "1"}})

# Pretty print
state = app.get_state({"configurable": {"thread_id": "1"}}).values
for message in state["messages"]: message.pretty_print()


In this conversation, you inquired about the number of questions you had asked, and I noted that you had asked one. You then requested a summary of our previous messages, which I provided. Following that, you asked about my identity, and I clarified that I am a medical doctor providing responses related to patient care and clinical assessments for a medical insurer's assistance. I also mentioned that I am communicating about a specific patient evaluation, sharing detailed clinical information, diagnosis, treatment plans, and medical history to aid the insurer in decision-making regarding coverage and care management.

How many questions, in total, did I ask you before this one?

Before this question, you asked a total of three questions.


#
<img src="Images/atom.png" alt="Atom" style="width:60px" align="left" vertical-align="middle">

## 7. Chains ⛓️⛓️⛓️
*Storing chat history*

----  
Combining different LLM calls and action automatically.

Ex: Summary #1, Summary #2, Summary #3 > Final Summary.

LangChain has its own language, LangChain Expression Language (LCEL), the basic expression of which is:
> `chain = prompt | model`
> 
> `result = chain.invoke({key:value})`

Where the pipe operator chains the prompt to the model.

There are many chain possibilities: *extended (linear), parallel, branching.* 

We'll cover one of them, which is the extended (linear) chain and another called the summarization chain, whcih is simply a more elaborate extended chain.

### A. Simple Sequential Chains (Extended chains)

Easy chains where you can use the output of an LLM as an input into another. Good for breaking up tasks (and keeping your LLM focused).

In [20]:
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain_openai import ChatOpenAI

# i. Load environment variables from .env
load_dotenv()

# ii. Create a ChatOpenAI model
model = ChatOpenAI(model="gpt-4o-mini")

In [21]:
# iii. Define prompt templates (no need for separate Runnable chains)
template = """Your job is to come up with a classic dish from the area that the users suggests.
USER LOCATION:
{user_location}

YOUR RESPONSE:
"""
prompt_template1 = PromptTemplate(input_variables=["user_location"], template=template)

In [22]:
# iv. Create the combined chain using LangChain Expression Language (LCEL)
chain = prompt_template1 | model | StrOutputParser()
chain.invoke({'user_location':'Rome'})

'One classic dish from Rome is "Cacio e Pepe." This traditional Roman pasta dish is known for its simplicity and rich, savory flavors. It consists of just three main ingredients: Pecorino Romano cheese, black pepper, and pasta, typically spaghetti or tonnarelli. The dish highlights the quality and flavor of the ingredients, with the creamy cheese and pepper creating a luscious sauce that coats the pasta. Cacio e Pepe is a quintessential Roman comfort food and a must-try for anyone visiting the city.'

<br/>We can create extended chains as such:

In [24]:
# Define another template
template = """Given a meal, give a short and simple recipe on how to make that dish at home.
MEAL:
{user_meal}

YOUR RESPONSE:
"""
prompt_template2 = PromptTemplate(input_variables=["user_meal"], template=template)

# Chain everything together!
chain = prompt_template1 | model | StrOutputParser() | prompt_template2 | model | StrOutputParser()
chain.invoke({'user_location':'Rome'})

'**Cacio e Pepe Recipe**\n\n**Ingredients:**\n- 12 oz spaghetti or tonnarelli\n- 1 cup Pecorino Romano cheese, finely grated\n- 2 tsp freshly ground black pepper\n- Salt, for pasta water\n\n**Instructions:**\n\n1. **Boil the Pasta:** Bring a large pot of salted water to a boil. Add the spaghetti and cook until al dente, according to package instructions. Reserve about 1 cup of the pasta water, then drain the pasta.\n\n2. **Prepare the Sauce:** In a large skillet or pan over medium heat, toast the freshly ground black pepper for about 1-2 minutes until fragrant.\n\n3. **Combine:** Add about 1/2 cup of the reserved pasta water to the skillet with the pepper. Add the drained pasta and toss to coat.\n\n4. **Add Cheese:** Remove the skillet from the heat and gradually sprinkle in the Pecorino Romano cheese, tossing constantly. Add more reserved pasta water, a little at a time, if needed, to create a creamy sauce that coats the pasta well.\n\n5. **Serve:** Serve immediately, with extra Pecor

<br/>You can even define additional processing steps using `RunnableLambda` and chain these processes with the extended chain.

In [26]:
from langchain.schema.runnable import RunnableLambda

# Define additional processing steps using RunnableLambda
uppercase_output = RunnableLambda(lambda x: x.upper())
count_words = RunnableLambda(lambda x: f"Word count: {len(x.split())}\n{x}")

# Chain everything together!
chain = prompt_template1 | model | StrOutputParser() | prompt_template2 | model | StrOutputParser() | uppercase_output
chain.invoke({'user_location':'Rome'})

'**CACIO E PEPE RECIPE**\n\n**INGREDIENTS:**\n- 400G SPAGHETTI OR TONNARELLI\n- 200G PECORINO ROMANO CHEESE, FINELY GRATED\n- 2 TEASPOONS FRESHLY GROUND BLACK PEPPER\n- SALT (FOR PASTA WATER)\n\n**INSTRUCTIONS:**\n\n1. **COOK THE PASTA**: BRING A LARGE POT OF SALTED WATER TO A BOIL. ADD THE PASTA AND COOK UNTIL AL DENTE ACCORDING TO PACKAGE INSTRUCTIONS. RESERVE ABOUT 1 CUP OF THE PASTA COOKING WATER, THEN DRAIN THE PASTA.\n\n2. **TOAST THE PEPPER**: WHILE THE PASTA COOKS, HEAT A LARGE, DEEP SKILLET OVER MEDIUM HEAT. ADD THE FRESHLY GROUND BLACK PEPPER AND TOAST IT FOR ABOUT 1 MINUTE UNTIL FRAGRANT.\n\n3. **CREATE THE SAUCE**: LOWER THE HEAT TO MEDIUM-LOW. ADD A SMALL AMOUNT OF THE RESERVED PASTA WATER TO THE SKILLET WITH THE TOASTED PEPPER. GRADUALLY ADD THE GRATED PECORINO ROMANO CHEESE, STIRRING CONSTANTLY TO CREATE A CREAMY SAUCE. ADD MORE PASTA WATER AS NEEDED TO REACH A SMOOTH CONSISTENCY.\n\n4. **COMBINE**: ADD THE DRAINED PASTA TO THE SKILLET, TOSSING IT WITH THE CHEESE AND PEP

### 2. Summarization Chain

Easily run through long numerous documents and get a summary. Check out [this video](https://www.youtube.com/watch?v=f9_BWhCI4Zo) for other chain types besides map-reduce

In [70]:
from langchain.chains.summarize import load_summarize_chain
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

loader = TextLoader('data/PaulGrahamEssays/disc.txt')
documents = loader.load()

# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)

# There is a lot of complexity hidden in this one line. I encourage you to check out the video above for more detail
chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=True)
chain.run(texts)



[1m> Entering new MapReduceDocumentsChain chain...[0m


[1m> Entering new LLMChain chain...[0m
Prompt after formatting:
[32;1m[1;3mWrite a concise summary of the following:


"January 2017Because biographies of famous scientists tend to 
edit out their mistakes, we underestimate the 
degree of risk they were willing to take.
And because anything a famous scientist did that
wasn't a mistake has probably now become the
conventional wisdom, those choices don't
seem risky either.Biographies of Newton, for example, understandably focus
more on physics than alchemy or theology.
The impression we get is that his unerring judgment
led him straight to truths no one else had noticed.
How to explain all the time he spent on alchemy
and theology?  Well, smart people are often kind of
crazy.But maybe there is a simpler explanation. Maybe"


CONCISE SUMMARY:[0m
Prompt after formatting:
[32;1m[1;3mWrite a concise summary of the following:


"the smartness and the craziness were not as sepa

" Biographies tend to omit famous scientists' mistakes from their stories, but Newton was willing to take risks and explore multiple fields to make his discoveries. He placed three risky bets, one of which resulted in the creation of physics as we know it today."

## Agents 🤖🤖

Official LangChain Documentation describes agents perfectly (emphasis mine):
> Some applications will require not just a predetermined chain of calls to LLMs/other tools, but potentially an **unknown chain** that depends on the user's input. In these types of chains, there is a “agent” which has access to a suite of tools. Depending on the user input, the agent can then **decide which, if any, of these tools to call**.


Basically you use the LLM not just for text output, but also for decision making. The coolness and power of this functionality can't be overstated enough.

Sam Altman emphasizes that the LLMs are good '[reasoning engine](https://www.youtube.com/watch?v=L_Guz73e6fw&t=867s)'. Agent take advantage of this.

### Agents

The language model that drives decision making.

More specifically, an agent takes in an input and returns a response corresponding to an action to take along with an action input. You can see different types of agents (which are better for different use cases) [here](https://python.langchain.com/en/latest/modules/agents/agents/agent_types.html).

### Tools

A 'capability' of an agent. This is an abstraction on top of a function that makes it easy for LLMs (and agents) to interact with it. Ex: Google search.

This area shares commonalities with [OpenAI plugins](https://platform.openai.com/docs/plugins/introduction).

### Toolkit

Groups of tools that your agent can select from

Let's bring them all together:

In [71]:
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.llms import OpenAI
import json

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

In [72]:
serpapi_api_key=os.getenv("SERP_API_KEY", "YourAPIKey")

In [73]:
toolkit = load_tools(["serpapi"], llm=llm, serpapi_api_key=serpapi_api_key)

In [74]:
agent = initialize_agent(toolkit, llm, agent="zero-shot-react-description", verbose=True, return_intermediate_steps=True)

In [75]:
response = agent({"input":"what was the first album of the" 
                    "band that Natalie Bergman is a part of?"})



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3m I should try to find out what band Natalie Bergman is a part of.
Action: Search
Action Input: "Natalie Bergman band"[0m
Observation: [36;1m[1;3m['Natalie Bergman is an American singer-songwriter. She is one half of the duo Wild Belle, along with her brother Elliot Bergman. Her debut solo album, Mercy, was released on Third Man Records on May 7, 2021. She is based in Los Angeles.', 'Natalie Bergman type: American singer-songwriter.', 'Natalie Bergman main_tab_text: Overview.', 'Natalie Bergman kgmid: /m/0qgx4kh.', 'Natalie Bergman genre: Folk.', 'Natalie Bergman parents: Susan Bergman, Judson Bergman.', 'Natalie Bergman born: 1988 or 1989 (age 34–35).', 'Natalie Bergman is an American singer-songwriter. She is one half of the duo Wild Belle, along with her brother Elliot Bergman. Her debut solo album, Mercy, ...'][0m
Thought:[32;1m[1;3m I should search for the first album of Wild Belle
Action: Search
Action Input: "Wild

![Wild Belle](data/WildBelle1.png)

🎵Enjoy🎵
https://open.spotify.com/track/1eREJIBdqeCcqNCB1pbz7w?si=c014293b63c7478c