# 1. LangChain basics - Messages, Prompts, Output Parsers and Chains

<a target="_blank" href="https://colab.research.google.com/github/IT-HUSET/ai-workshop-250121/blob/main/lab/1-langchain-basics.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a><br/>

## Setup

### Install dependencies

In [1]:
%pip install python-dotenv~=1.0 docarray~=0.40.0 pydantic~=2.9 pypdf~=5.1 --upgrade --quiet
%pip install langchain~=0.3.7 langchain_openai~=0.2.6 langchain_community~=0.3.5 --upgrade --quiet
%pip install langchain-anthropic~=0.3.3 --upgrade --quiet

# If running locally, you can do this instead:
#%pip install -r ../requirements.txt

Note: you may need to restart the kernel to use updated packages.


### Load environment variables

In [8]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

# If running in Google Colab, you can use this code instead:
# from google.colab import userdata
# os.environ["AZURE_OPENAI_API_KEY"] = userdata.get("AZURE_OPENAI_API_KEY")
# os.environ["AZURE_OPENAI_ENDPOINT"] = userdata.get("AZURE_OPENAI_ENDPOINT")
# os.environ["ANTHROPIC_API_KEY"] = userdata.get("ANTHROPIC_API_KEY")

### Setup Models

In [9]:
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings
api_version = "2024-10-01-preview"
llm = AzureChatOpenAI(deployment_name="gpt-4o-mini", temperature=0.0, openai_api_version=api_version)


## LangChain basics

This notebook introduces some of the basic concepts of LangChain.

![LangChain](https://raw.githubusercontent.com/IT-HUSET/ai-workshop-250121/refs/heads/main/images/LangChain-chains.png)


### Chain, LCEL and the Runnable interface

A chain is a sequence of components with a unified interface that are executed in order. This unified interface is called **[`Runnable`](https://python.langchain.com/docs/concepts/runnables/)** and provides common operations such as **invoking**, **streaming** and **batching**.

Multiple Runnables can be composed into a chain, where the output of one Runnable is passed as input to the next Runnable in the chain. The easiest way of doing this is with the [LangChain Expression Language (LCEL)](https://python.langchain.com/docs/concepts/lcel/), which is essentially syntactic sugar that lets components be composed with the `|` operator.

```python
chain = runnable1 | runnable2
```
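
To make the `|` composition concrete, here is a minimal plain-Python sketch of the idea. `MiniRunnable` is a hypothetical stand-in written for this illustration, not LangChain's `Runnable` class; it only shows how operator overloading can chain steps and how `invoke` and `batch` relate to each other.

```python
from typing import Any, Callable, Iterable


class MiniRunnable:
    """Hypothetical, simplified stand-in for a Runnable (not LangChain's implementation)."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        # Run this step on a single input
        return self.func(value)

    def batch(self, values: Iterable[Any]) -> list[Any]:
        # Run this step on several inputs, one after the other
        return [self.invoke(v) for v in values]

    def __or__(self, other: "MiniRunnable") -> "MiniRunnable":
        # `a | b` builds a new step that feeds a's output into b
        return MiniRunnable(lambda value: other.invoke(self.invoke(value)))


to_upper = MiniRunnable(str.upper)
add_bang = MiniRunnable(lambda s: s + "!")

chain = to_upper | add_bang
print(chain.invoke("hello"))
print(chain.batch(["a", "b"]))
```

In LangChain the same pattern applies to prompts, models and output parsers, which is why they can all be joined with `|`.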




### Chat models
LangChain provides a consistent interface for working with chat models from different providers. Read more [here](https://python.langchain.com/docs/concepts/chat_models/).


### Messages

Messages are the unit of communication in chat models. They are used to represent the input and output of a chat model, as well as any additional context or metadata that may be associated with a conversation.

![Graph](https://github.com/IT-HUSET/ai-workshop-250121/blob/main/images/langchain-messages.png?raw=true)

Read more about messages [here](https://python.langchain.com/docs/concepts/messages/).
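
As a rough illustration of what a message carries, the sketch below models a conversation as type/content pairs and converts it to the `role`/`content` dictionaries used by OpenAI-style chat APIs. `Msg` and `to_payload` are hypothetical helpers made up for this example; LangChain's own message classes hold more fields (for example metadata).

```python
from dataclasses import dataclass

# LangChain message types mapped to OpenAI-style role names
ROLE_NAMES = {"system": "system", "human": "user", "ai": "assistant"}


@dataclass
class Msg:
    """Hypothetical minimal message: a type plus its text content."""
    type: str  # "system", "human" or "ai"
    content: str


def to_payload(messages: list[Msg]) -> list[dict[str, str]]:
    # Convert each message into a role/content dictionary
    return [{"role": ROLE_NAMES[m.type], "content": m.content} for m in messages]


conversation = [
    Msg("system", "You are a helpful assistant."),
    Msg("human", "Hi!"),
    Msg("ai", "Hello! How can I help?"),
]
payload = to_payload(conversation)
print(payload)
```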


### Output parsing

Output parsers are responsible for taking the output of a model and transforming it to a more suitable format for downstream tasks.

Read more [here](https://python.langchain.com/docs/concepts/output_parsers/)


### More LangChain concepts

Read more about basic LangChain concepts [here](https://python.langchain.com/docs/concepts/).



## Let's start: Chat models and Messages

We define a system message and a human message to start a conversation.

In [10]:
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage, BaseMessage

# We define a system message and a human message to start a conversation
system_message = SystemMessage(content="You are a helpful assistant, expert in Iceland tourist information.")
human_message = HumanMessage(content="Hi! I need help planning a trip to Iceland.")

#### Since Chat Models are Runnables, we can invoke them using the `invoke` method

In [11]:
ai_message: AIMessage = llm.invoke([system_message, human_message])
print(ai_message)  # Prints (basically) the entire response from the LLM, including metadata, token usage, etc.

content="Of course! I'd be happy to help you plan your trip to Iceland. Here are a few questions to get us started:\n\n1. **Travel Dates**: When are you planning to visit Iceland?\n2. **Duration**: How long do you intend to stay?\n3. **Interests**: What are you most interested in seeing or doing? (e.g., nature, hiking, culture, hot springs, Northern Lights)\n4. **Budget**: Do you have a budget in mind for accommodation, activities, and food?\n5. **Transportation**: Are you planning to rent a car, use public transport, or join guided tours?\n6. **Accommodation Preferences**: What type of accommodation do you prefer? (e.g., hotels, hostels, guesthouses, camping)\n\nOnce I have this information, I can provide you with tailored recommendations!" additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 174, 'prompt_tokens': 34, 'total_tokens': 208, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rej

In [12]:
# Print just the content of the AI message
print(ai_message.content)

Of course! I'd be happy to help you plan your trip to Iceland. Here are a few questions to get us started:

1. **Travel Dates**: When are you planning to visit Iceland?
2. **Duration**: How long do you intend to stay?
3. **Interests**: What are you most interested in seeing or doing? (e.g., nature, hiking, culture, hot springs, Northern Lights)
4. **Budget**: Do you have a budget in mind for accommodation, activities, and food?
5. **Transportation**: Are you planning to rent a car, use public transport, or join guided tours?
6. **Accommodation Preferences**: What type of accommodation do you prefer? (e.g., hotels, hostels, guesthouses, camping)

Once I have this information, I can provide you with tailored recommendations!


#### Conversation

In [13]:
# Let's try a follow-up question
conversation_messages: list[BaseMessage] = [
    system_message,
    human_message,
    ai_message,
    HumanMessage(content="What if there's a volcanic eruption!?😱")
]

response = llm.invoke(conversation_messages)
print(response.content)

Volcanic eruptions are a natural part of Iceland's landscape, given its location on the Mid-Atlantic Ridge. While they can be concerning, here are some points to consider regarding safety and travel during such events:

1. **Monitoring**: Iceland has a robust monitoring system for volcanic activity. The Icelandic Meteorological Office provides real-time updates on volcanic activity, including eruptions, ash clouds, and safety advisories.

2. **Safety Protocols**: If an eruption occurs, local authorities will issue safety guidelines. It's important to follow these instructions and stay informed through official channels.

3. **Travel Insurance**: Consider purchasing travel insurance that covers natural disasters. This can provide peace of mind and financial protection in case your plans are disrupted.

4. **Flexibility**: If you're concerned about volcanic activity, consider keeping your itinerary flexible. This way, you can adjust your plans if necessary.

5. **Popular Areas**: Some ar
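
The follow-up question works because the full history is sent again with every call; chat completion APIs of this kind keep no memory between calls. A minimal sketch of that bookkeeping follows, where `fake_llm` and `ask` are hypothetical helpers and `fake_llm` stands in for `llm.invoke` so the example runs without an API key.

```python
def fake_llm(messages: list[tuple[str, str]]) -> str:
    # Hypothetical stand-in for a chat model: reports how much context it received
    return f"(reply based on {len(messages)} messages)"


history: list[tuple[str, str]] = [("system", "You are a helpful assistant.")]


def ask(question: str) -> str:
    # Append the question, call the model with the whole history, store the answer
    history.append(("human", question))
    answer = fake_llm(history)
    history.append(("ai", answer))
    return answer


first = ask("Hi! I need help planning a trip to Iceland.")
second = ask("What if there's a volcanic eruption!?")
print(first)
print(second)
```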

Let's try the same conversation with a different LLM.

In [14]:
from langchain_anthropic import ChatAnthropic

llm2 = ChatAnthropic(
     model='claude-3-5-sonnet-20241022',
     temperature=0.0,
)

response = llm2.invoke(conversation_messages)
print(response.content)

Don't panic! Iceland is extremely well-prepared for volcanic activity, and it's actually one of the most monitored and well-managed volcanic regions in the world. Here's what you should know:

1. **Safety First**: 
- Local authorities provide real-time updates and clear safety guidelines
- Tourist areas are quickly closed if there's any risk
- Evacuations, when needed, are organized and efficient

2. **Travel Impact**:
- Most eruptions affect very limited areas
- The majority of tourist attractions remain accessible
- Reykjavík and most populated areas are safe
- Air travel might be affected, but it's rare (unlike the 2010 Eyjafjallajökull eruption)

3. **What to Do**:
- Check Iceland's official Civil Protection website (www.almannavarnir.is)
- Follow SafeTravel.is for updates
- Register with your embassy
- Get travel insurance that covers volcanic activity
- Follow local authorities' guidance

4. **Bonus**: 
- If safe, volcanic eruptions can actually be amazing tourist attractions!
- 

----
<br/>

## Prompts and Prompt Templates

### Chat prompt template - for chat-based LLMs
A chat prompt template is essentially a list of message templates. Invoking it produces a `ChatPromptValue`, which contains a list of messages:

```python
ChatPromptValue(
    messages=[
        SystemMessage(content='You are a helpful AI bot. Your name is Carl.'),
        HumanMessage(content='Hello, there!'),
    ]
)
```

In [15]:
from langchain_core.prompts import ChatPromptTemplate

system_template = "Translate user input into a style that is {style}."

# This is the easiest and most common way to create a prompt template
prompt_template = ChatPromptTemplate([
    ("system", system_template),
    ("human", "{input}"), # You can also use the alias "user" instead of "human"
])

#### Let's check the message templates

In [16]:
print(prompt_template.messages)

[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['style'], input_types={}, partial_variables={}, template='Translate user input into a style that is {style}.'), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, template='{input}'), additional_kwargs={})]


#### Print input variables

In [17]:
print(prompt_template.input_variables)

['input', 'style']
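
The input variables are the placeholder names found in the template strings. The standard library can show the same idea; this is only an illustration of the concept using `string.Formatter`, not a description of how LangChain extracts them internally. `placeholder_names` is a hypothetical helper.

```python
from string import Formatter

templates = [
    "Translate user input into a style that is {style}.",
    "{input}",
]


def placeholder_names(template: str) -> set[str]:
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion)
    return {name for _, name, _, _ in Formatter().parse(template) if name}


# Collect the names from all templates and sort them alphabetically
variables = sorted(set().union(*(placeholder_names(t) for t in templates)))
print(variables)
```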


### Using a prompt template

In [18]:
from langchain_core.prompt_values import ChatPromptValue

customer_style = "American English in a calm and respectful tone"

customer_email = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse, \
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""

customer_messages: ChatPromptValue = prompt_template.invoke({'style': customer_style, 'input': customer_email})

#### Let's have a look at the contents (messages):

In [19]:
# First (system) message:
print(type(customer_messages.messages[0]))
print(customer_messages.messages[0].content)

<class 'langchain_core.messages.system.SystemMessage'>
Translate user input into a style that is American English in a calm and respectful tone.


In [20]:
# Second (human) message:
print(type(customer_messages.messages[1]))
print(customer_messages.messages[1].content)

<class 'langchain_core.messages.human.HumanMessage'>

Arrr, I be fuming that me blender lid flew off and splattered me kitchen walls with smoothie! And to make matters worse, the warranty don't cover the cost of cleaning up me kitchen. I need yer help right now, matey!



In [21]:
# Call the LLM to translate the customer message into the requested style
customer_response = llm.invoke(customer_messages)
print(customer_response.content)

I’m quite frustrated that my blender lid came off and made a mess of my kitchen walls with smoothie. To add to my annoyance, the warranty doesn’t cover the cleaning costs. I would really appreciate your assistance with this situation. Thank you!


#### Try another example

In [23]:
customer_style = "Speaks like Yoda"

customer_messages: ChatPromptValue = prompt_template.invoke({'style': customer_style, 'input': customer_email})

customer_response = llm.invoke(customer_messages)
print(customer_response.content)

Fuming, you are, hmmm? Off flew the blender lid, it did. Splattered the kitchen walls with smoothie, it has. Worse, the warranty does not cover the cleaning cost, yes. Help you, I shall, matey!


----
<br/>


## Output Parsers

Output parsers let us control what the LLM output looks like once it reaches our code. We start with plain strings and then move on to structured output.

### The most common parser - String output parsing

Useful when you just want to extract a string (content) from the output, and not all the other metadata an LLM might return.

In [24]:
from langchain_core.output_parsers import StrOutputParser

str_parser = StrOutputParser()
response = llm.invoke("Copenhagen or Many-worlds?")
print(response) # Will print out the entire response, a lot of which we don't need

content='The choice between the Copenhagen interpretation and the Many-Worlds interpretation of quantum mechanics often depends on philosophical preferences and the aspects of quantum theory one finds most compelling.\n\n1. **Copenhagen Interpretation**: This is one of the oldest and most widely taught interpretations of quantum mechanics. It posits that quantum systems exist in a superposition of states until they are measured, at which point the wave function collapses to a definite state. This interpretation emphasizes the role of the observer and measurement in determining the state of a quantum system. It is often seen as more pragmatic, focusing on the outcomes of measurements rather than the underlying reality.\n\n2. **Many-Worlds Interpretation**: Proposed by Hugh Everett III in the 1950s, this interpretation suggests that all possible outcomes of quantum measurements actually occur, each in its own separate "branch" of the universe. In this view, there is no wave function coll

In [25]:
# Let's just extract the content
parsed_response = str_parser.invoke(response)
print(parsed_response)

The choice between the Copenhagen interpretation and the Many-Worlds interpretation of quantum mechanics often depends on philosophical preferences and the aspects of quantum theory one finds most compelling.

1. **Copenhagen Interpretation**: This is one of the oldest and most widely taught interpretations of quantum mechanics. It posits that quantum systems exist in a superposition of states until they are measured, at which point the wave function collapses to a definite state. This interpretation emphasizes the role of the observer and measurement in determining the state of a quantum system. It is often seen as more pragmatic, focusing on the outcomes of measurements rather than the underlying reality.

2. **Many-Worlds Interpretation**: Proposed by Hugh Everett III in the 1950s, this interpretation suggests that all possible outcomes of quantum measurements actually occur, each in its own separate "branch" of the universe. In this view, there is no wave function collapse; instead

### Structured output parsing

![Structured output](https://python.langchain.com/assets/images/structured_output-2c42953cee807dedd6e96f3e1db17f69.png)

#### This is what we'd like the output to look like:

In [26]:
{
    "gift": False,
    "delivery_days": 5,
    "price_value": "pretty affordable!"
}

{'gift': False, 'delivery_days': 5, 'price_value': 'pretty affordable!'}

#### Set up the inputs and prompt template

In [27]:
customer_review = """\
This leaf blower is pretty amazing. It has four settings: \
candle blower, gentle breeze, windy city, and tornado. \
It arrived in two days, just in time for my wife's \
anniversary present. \
I think my wife liked it so much she was speechless. \
So far I've been the only one using it, and I've been \
using it every other morning to clear the leaves on our lawn. \
It's slightly more expensive than the other leaf blowers \
out there, but I think it's worth it for the extra features.
"""

review_template = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any quote about the value or price.

Format the output as JSON with the following keys:
gift
delivery_days
price_value

text: {text}
"""

In [28]:
from langchain.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_template(review_template)
print(prompt_template)

input_variables=['text'] input_types={} partial_variables={} messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['text'], input_types={}, partial_variables={}, template='For the following text, extract the following information:\n\ngift: Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.\n\ndelivery_days: How many days did it take for the product to arrive? If this information is not found, output -1.\n\nprice_value: Extract any quote about the value or price.\n\nFormat the output as JSON with the following keys:\ngift\ndelivery_days\nprice_value\n\ntext: {text}\n'), additional_kwargs={})]


#### Let's try it out

In [29]:
messages = prompt_template.format_messages(text=customer_review)
response = llm.invoke(messages)
print(response.content)
print(f"\nResponse type: {type(response.content)}")

```json
{
  "gift": true,
  "delivery_days": 2,
  "price_value": "slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."
}
```

Response type: <class 'str'>


### Parse the LLM output as JSON

The response is still a plain string (wrapped in a Markdown code fence), not a Python object. Let's fix this with proper JSON parsing.

In [30]:
from langchain_core.output_parsers import JsonOutputParser
json_parser = JsonOutputParser()

result = json_parser.invoke(response)
print(result)
print(f"\nResponse type: {type(result)}")

{'gift': True, 'delivery_days': 2, 'price_value': "slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."}

Response type: <class 'dict'>
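
Conceptually, the parser has to get rid of the Markdown code fence before decoding the JSON. A rough plain-Python approximation of that step is sketched below; `parse_fenced_json` is a hypothetical helper, and LangChain's `JsonOutputParser` handles more cases (such as partial output while streaming).

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way so the example is easy to embed


def parse_fenced_json(text: str) -> dict:
    # Use the text inside an optional Markdown code fence, else the whole string
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    raw = match.group(1) if match else text
    return json.loads(raw)


llm_output = FENCE + 'json\n{"gift": true, "delivery_days": 2}\n' + FENCE
result = parse_fenced_json(llm_output)
print(result, type(result))
```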


### Structured parsing with typing and optional validation (using **Pydantic**)

In [31]:
from pydantic import BaseModel

class Review(BaseModel):
    gift: bool
    delivery_days: int
    price_value: str


structured_output_llm = llm.with_structured_output(Review)
result = structured_output_llm.invoke(messages)

print(result)
print(f"\nResponse type: {type(result)}")
print(f"Delivery days: {result.delivery_days}")


gift=True delivery_days=2 price_value="slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."

Response type: <class '__main__.Review'>
Delivery days: 2
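
The benefit of a typed model is that malformed output is rejected instead of being silently passed on. The sketch below validates two hand-written payloads against the same `Review` model without calling an LLM; the payloads are made up for illustration, and Pydantic v2 is assumed (as installed in the setup cell).

```python
from pydantic import BaseModel, ValidationError


class Review(BaseModel):
    gift: bool
    delivery_days: int
    price_value: str


# A well-formed payload becomes a typed object
good = Review.model_validate(
    {"gift": True, "delivery_days": 2, "price_value": "worth it"}
)
print(good.delivery_days + 1)  # delivery_days is a real int, so arithmetic works

# A payload with a non-numeric delivery_days is rejected
try:
    Review.model_validate(
        {"gift": True, "delivery_days": "soon", "price_value": "worth it"}
    )
    rejected = False
except ValidationError as error:
    rejected = True
    print(f"Rejected with {len(error.errors())} validation error(s)")
```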


----
<br/>

## Chains

### Simple Chain

In [32]:
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# When using only a single template string, it's assumed the role is "human"
prompt = ChatPromptTemplate.from_template(
    "tell me a short joke about {topic}"
)

print(prompt)
output_parser = StrOutputParser()

input_variables=['topic'] input_types={} partial_variables={} messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['topic'], input_types={}, partial_variables={}, template='tell me a short joke about {topic}'), additional_kwargs={})]


In [33]:
# Build a chain (creates a RunnableSequence)
chain = prompt | llm | output_parser

In [34]:
chain.invoke({"topic": "bears"})

'Why do bears have hairy coats? \n\nBecause they look silly in sweaters!'

### Streamed response

In [35]:
for chunk in chain.stream({"topic": "bears"}):
    print(chunk, end="|", flush=True)


||Why| do| bears| have| hairy| coats|?| 

|Because| they| look| silly| in| sweaters|!||
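
Each chunk is a small piece of the final text, and concatenating the chunks reproduces the complete answer. A plain-Python sketch of that relationship follows, with a hypothetical `fake_stream` generator standing in for `chain.stream`; real chunk boundaries are decided by the model provider, and the empty first and last chunks simply mirror the output shown above.

```python
from typing import Iterator


def fake_stream(text: str) -> Iterator[str]:
    # Hypothetical stand-in for chain.stream(): yields the text word by word
    yield ""  # empty first chunk, as in the output above
    for i, word in enumerate(text.split(" ")):
        yield word if i == 0 else " " + word
    yield ""  # empty last chunk


chunks = list(fake_stream("Why do bears have hairy coats?"))
print("|".join(chunks))

# Joining the chunks without a separator gives back the full text
full_text = "".join(chunks)
print(full_text)
```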

### Bind with parameters

#### Temperature
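
For intuition about the parameter bound in this subsection: sampling temperature is commonly described as dividing the model's logits by `T` before the softmax, so low values concentrate probability on the top token and high values flatten the distribution. The logits below are invented for illustration, and how a given provider treats `temperature=0` is outside the scope of this sketch.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    # Scale the logits by 1/T, then apply a numerically stable softmax
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
cool = softmax_with_temperature(logits, 0.2)
warm = softmax_with_temperature(logits, 1.0)
print([round(p, 3) for p in cool])
print([round(p, 3) for p in warm])
```

With the lower temperature almost all probability mass ends up on the first token, which is why low temperatures give more repeatable answers.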

In [36]:
temp_model = llm.bind(temperature=1.0) | StrOutputParser()
temp_model.invoke("Can you give me three great tips about what to do in Reykjavik?")

"Certainly! Here are three great tips for things to do in Reykjavik:\n\n1. **Explore the Golden Circle**: While not in Reykjavik itself, the Golden Circle is a must-visit when you're in the area. This popular tourist route includes three major attractions: Þingvellir National Park, where you can see the tectonic plate boundaries; Geysir geothermal area, home to the famous Strokkur geyser that erupts every few minutes; and Gullfoss waterfall, a stunning two-tiered waterfall. Many tour operators offer day trips from Reykjavik, making it easy to experience these natural wonders.\n\n2. **Visit Hallgrímskirkja**: This iconic church is one of Reykjavik's most recognizable landmarks. Its unique architecture is inspired by Iceland's basalt columns, and you can take an elevator to the top for panoramic views of the city and surrounding landscape. The church's towering presence is matched by its beautiful interior, including a striking pipe organ. Don't forget to snap a photo of the statue of Le

#### Stop words
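
A stop sequence makes the model end its answer as soon as that text would be generated, so the returned string ends right before it. `truncate_at_stop` below is a hypothetical helper that mimics the visible effect on a finished string; the real cut happens on the provider side during generation, and the sample text is made up.

```python
def truncate_at_stop(text: str, stop: list[str]) -> str:
    # Cut the text at the earliest occurrence of any stop sequence
    positions = [text.find(s) for s in stop if s in text]
    return text[: min(positions)] if positions else text


sample = "1. Explore the Golden Circle. 2. Visit the Harpa Concert Hall."
truncated = truncate_at_stop(sample, ["Harpa"])
print(repr(truncated))
```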

In [37]:
stop_model = llm.bind(stop=["Harpa"]) | StrOutputParser()
stop_model.invoke("Can you give me three great tips about what to do in Reykjavik?")

"Absolutely! Here are three great tips for things to do in Reykjavik:\n\n1. **Explore the Golden Circle**: While Reykjavik itself has plenty to offer, the Golden Circle is a must-see when you're in the area. This popular route includes Þingvellir National Park, where you can see the rift between the North American and Eurasian tectonic plates, the stunning Gullfoss waterfall, and the geothermal area in Haukadalur, which contains the famous geysers Geysir and Strokkur. Many tour operators offer day trips from Reykjavik, making it easy to experience these natural wonders.\n\n2. **Visit the "