# 1: LangChain basics

Make sure you are using the python binary in the virtual environment:

In [1]:
!which python

/Users/cesargmx/llm-workshop/llm-workshop/bin/python


In [4]:
!ollama list

NAME                           ID              SIZE      MODIFIED     
llama3.2:latest                a80c4f17acd5    2.0 GB    2 days ago      
llama3.1:8b-instruct-q4_0      62757c860e01    4.7 GB    3 months ago    
llama3.1:70b                   d729c66f84de    39 GB     3 months ago    
llama3.1:latest                62757c860e01    4.7 GB    3 months ago    
llama3-groq-tool-use:latest    36211dad2b15    4.7 GB    3 months ago    
llama3:latest                  365c0bd3c000    4.7 GB    3 months ago    


## Basic LLM Query

In [5]:
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.1", temperature=1)
response = llm.invoke("who wrote A Song of Ice and Fire?")
print(response.content)


A great question for all the fans of Westeros!

The author who wrote "A Song of Ice and Fire" is George R. R. Martin. This series of fantasy novels is his magnum opus, and it has become one of the most popular and critically acclaimed works in modern fantasy.

George R. R. Martin published the first book in the series, "A Game of Thrones", in 1996. Since then, he has written five more books in the series:

1. A Game of Thrones (1996)
2. A Clash of Kings (1999)
3. A Storm of Swords (2000)
4. A Feast for Crows (2005)
5. A Dance with Dragons (2011)

There are two more planned books in the series, "The Winds of Winter" and "A Dream of Spring", but they have yet to be published.

Martin's work has been adapted into the hit HBO television series "Game of Thrones", which was based on the first book and subsequent novels in the series.


## Part 1: Chains

LLMs can be combined with other components, such as external data sources or other LLMs, to create more complex applications.

A chain is made up of links, which can be either primitives or other chains. Primitives are prompts, LLMs, or utilities.

You may find more information in the official documentation: https://python.langchain.com/v0.1/docs/expression_language/

**Goals:** 
* Review the difference between invoke and stream,
* Review the text and structured text output, and
* Understand the difference between string prompt and chat prompt composition. 

First we instantiate the Llama 3.1 model through Ollama. The available models are listed in the [Ollama model library](https://github.com/ollama/ollama#model-library).

In [6]:
from langchain_ollama import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate 

llm = ChatOllama(
    model="llama3.1",
    # You can experiment with the following parameters:
    temperature=0,
    num_predict=512,  # maximum number of tokens to generate
)

output_parser = StrOutputParser()

#### Example 1: Text Output, String Prompt

Here we create a prompt from a string template and build the chain. Note the following line, which pieces the components together into a single chain using the LangChain Expression Language (LCEL):

```
chain = prompt | llm | output_parser
```

The `|` symbol works like the [Unix pipe operator](https://en.wikipedia.org/wiki/Pipeline_%28Unix%29): it chains the components together, feeding the output of one component as input to the next.
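To build intuition for the `|` operator, here is a minimal sketch of pipe-style composition in plain Python. It is not LangChain's implementation; the `Step` class and the three toy steps are hypothetical stand-ins for a prompt, a model, and an output parser.

```python
class Step:
    """Hypothetical stand-in for a chain component (not a LangChain class)."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Toy components: format a prompt, fake a model reply, clean up the text.
toy_prompt = Step(lambda inputs: f"Write a jingle about {inputs['product']}")
toy_llm = Step(lambda text: f"  [model reply to: {text}]  ")
toy_parser = Step(lambda reply: reply.strip())

toy_chain = toy_prompt | toy_llm | toy_parser
result = toy_chain.invoke({"product": "home insurance"})
print(result)
```

Because `__or__` returns another `Step`, a composed chain can itself be piped into further steps, which mirrors the idea that a link can be a primitive or another chain.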

In [7]:
prompt = PromptTemplate.from_template("Write me the lyrics for a 30 seconds jingle about {product}")

# using LangChain Expression Language (LCEL) chain syntax
chain = prompt | llm | output_parser

When we execute the chain, we need to pass the parameters required by the prompt. Note that if we execute the chain with `invoke`, the text is returned only after it has been fully generated.

In [8]:
print(chain.invoke({"product": "home insurance"}))

Here's a possible 30-second jingle for home insurance:

(Upbeat, catchy tune)
"Protect your place, with a smile on your face
Home insurance from us, will show you a safer space
We've got coverage that's strong and true
So your home and family are safe, through and through
Call us today, and let's get it right
Home insurance from [Company Name], shining bright!"

(Note: You can replace [Company Name] with the actual name of the company)


Now execute the chain with `stream`. Notice how we can update the output after each token is generated. 

In [9]:
for chunk in chain.stream({"product": "cat food"}):
    print(chunk, end="", flush=True)

Here's a catchy 30-second jingle for cat food:

(Upbeat, energetic tune)
"Whiskers happy, purrs so sweet
Meow's favorite treat to eat
Tasty kibbles, crunchy too
[Brand Name] cat food, made just for you!
Purr-fectly delicious, every single day
[Brand Name] cat food, the best way!"

(Outro music: 2-3 seconds of a short musical phrase that fades out)

This jingle is designed to be catchy and easy to remember, with a simple melody that can stick in listeners' heads. The lyrics highlight the key benefits of the cat food (whiskers happy, purrs sweet) and emphasize its delicious taste. Feel free to modify or adjust as needed!
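The difference between `invoke` and `stream` can be sketched without a model. The example below is an illustration only: `fake_stream` and `fake_invoke` are hypothetical functions that imitate a model producing a reply in chunks.

```python
import time
from typing import Iterator


def fake_stream(text: str, delay: float = 0.0) -> Iterator[str]:
    """Yield the reply piece by piece, as a streaming model would."""
    for word in text.split(" "):
        time.sleep(delay)  # simulated generation time per chunk
        yield word + " "


def fake_invoke(text: str) -> str:
    """Collect every chunk first, then return the full reply at once."""
    return "".join(fake_stream(text))


reply = "Whiskers happy, purrs so sweet"

# Streaming: each chunk can be shown as soon as it is produced.
streamed_chunks = list(fake_stream(reply))

# Invoking: the caller sees nothing until the whole reply is ready.
full_text = fake_invoke(reply)
```

Both paths produce the same text; streaming only changes when the caller gets to see it, which matters for perceived latency in interactive applications.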

#### Example 2: Structured Output, Message-Based Prompt

Structured output lets us use the model's response as input for other services or models. We can also build a prompt from messages, which is useful for chat-based agents.

First we define the [output schema](https://json-schema.org/understanding-json-schema/reference/object) and instantiate the model, specifying JSON as the output format.


In [10]:
import json
from langchain_core.output_parsers import JsonOutputParser

output_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "favorite_food": {
            "type": "string",
        },
    },
    "required": ["name", "age"],
}

output_schema_as_string = json.dumps(output_schema, indent=2)

In [11]:
llm = ChatOllama(
    model="llama3.1",
    format="json",
    temperature=0.1,
    num_predict=512,  # maximum number of tokens to generate
)

We create the prompt with a list of messages. Each message has content and an additional parameter called `role`. For example, a chat message can be associated with an AI assistant, a human, or a system role. You can read more about the different types of messages [here](https://python.langchain.com/v0.1/docs/modules/model_io/chat/message_types/).

This association is necessary to help with [few-shot prompting](https://www.promptingguide.ai/techniques/fewshot). While large-language models demonstrate remarkable zero-shot capabilities, they still fall short on more complex tasks when using the zero-shot setting. Few-shot prompting can be used as a technique to enable in-context learning where we provide demonstrations in the prompt to steer the model to better performance. The demonstrations serve as conditioning for subsequent examples where we would like the model to generate a response.

In this example, we don't provide examples to the LLM; we use message-based prompting only to compare it with string prompting.
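For reference, a few-shot version of such a message list could look like the sketch below. The two demonstrations are made up for illustration, and plain `str.format` stands in for the prompt template. The literal JSON braces are doubled on the assumption that the template uses f-string style formatting, where single braces mark variables.

```python
# Few-shot demonstrations as (role, content) pairs, in the same shape as a
# chat message list. The demonstrations themselves are invented examples.
few_shot_messages = [
    ("system", "You extract information about a person as JSON."),
    ("human", "Ana is 29 and loves tacos"),
    ("ai", '{{"name": "Ana", "age": 29, "favorite_food": "tacos"}}'),
    ("human", "Bob just turned 41"),
    ("ai", '{{"name": "Bob", "age": 41}}'),
    ("human", "{person_info}"),
]

# str.format stands in for the template engine: doubled braces become
# literal braces, and {person_info} is filled in.
rendered = [
    (role, content.format(person_info="Cesar is 37 and loves sushi"))
    for role, content in few_shot_messages
]

for role, content in rendered:
    print(f"{role}: {content}")
```

The alternating human and AI turns are what the `role` association makes possible: the model sees worked input/output pairs before the real input.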

In [12]:
from langchain_core.prompts import ChatPromptTemplate 

messages = [
    ("system", 
         "You are a helpful assistant that will extract information about a person and produce "
         "an output using the following json schema: {json_schema}"),
    ("human", "{person_info}"),
]

prompt = ChatPromptTemplate.from_messages(messages)
chain = prompt | llm | JsonOutputParser()

In [13]:
response = chain.invoke(
    {
        "json_schema": output_schema_as_string,
        "person_info": "Cesar is 37 and loves sushi"
    })

print(response)
print(type(response))

{'name': 'Cesar', 'age': 37, 'favorite_food': 'sushi'}
<class 'dict'>


#### Example 3: Message Templates

Additionally, LangChain provides different types of `MessagePromptTemplate`. The most commonly used are `AIMessagePromptTemplate`, `SystemMessagePromptTemplate` and `HumanMessagePromptTemplate`, which create an AI message, system message and human message respectively. 

In [14]:
from langchain_core.prompts import ChatPromptTemplate 
from langchain_core.prompts import SystemMessagePromptTemplate, HumanMessagePromptTemplate

messages = [
    SystemMessagePromptTemplate.from_template(
         "You are a helpful assistant that will extract information about a person and produce "
         "an output using the following json schema: {json_schema}"),
    HumanMessagePromptTemplate.from_template("{person_info}"),
]

prompt = ChatPromptTemplate.from_messages(messages)
chain = prompt | llm | JsonOutputParser()

Notice that if we swap the output parser for `StrOutputParser`, we get the same content, but as a string instead of a dictionary.

In [15]:
chain_str_parser = prompt | llm | StrOutputParser()
response = chain_str_parser.invoke(
    {
        "json_schema": output_schema_as_string,
        "person_info": "Cesar is 37 and loves sushi"
    })
print(response)
print(type(response))

{
  "name": "Cesar",
  "age": 37,
  "favorite_food": "sushi"
}
<class 'str'>
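The difference between the two result types matters downstream. As a simplified picture (the real JSON parser may do more than this), going from the string form to the dictionary form amounts to a JSON parsing step:

```python
import json

# The same model reply, as a string parser would return it.
raw_reply = '{\n  "name": "Cesar",\n  "age": 37,\n  "favorite_food": "sushi"\n}'

# Parsing the string yields a dictionary with typed values.
parsed = json.loads(raw_reply)

print(type(raw_reply).__name__, type(parsed).__name__)
print(parsed["age"] + 1)  # arithmetic works because age is an int, not text
```

With the string form, code that needs a single field has to parse it first; with the dictionary form, fields can be accessed and validated directly.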


The power of LLMs here is that they can extract information from natural language and produce output in a format suitable for integrations.

In [16]:
person_info_text = """As Laura walked into the cozy café, her bright smile illuminated the room, and 
her sparkling eyes seemed to dance with a youthful energy. Her long, curly brown hair bounced with 
each step, framing her heart-shaped face and emphasizing her petite nose ring. She was dressed in 
a flowy sundress, its vibrant floral pattern reflecting her playful personality. As she scanned the 
menu, her eyes widened with excitement, and she couldn't help but let out a little squeal of delight
when she spotted her favorite dish: a decadent chocolate lava cake. She had always been a 
self-proclaimed chocoholic, and the mere mention of the rich, gooey treat was enough to transport 
her back to her childhood days of baking with her grandmother. With a spring in her step, she 
ordered her beloved dessert and settled in for a delightful afternoon of indulgence, to celebrate  
her 30th birthday."""

response = chain.invoke(
    {
        "json_schema": output_schema_as_string,
        "person_info": person_info_text
    })

response

{'name': 'Laura', 'age': 30, 'favorite_food': 'decadent chocolate lava cake'}

A PE use case for this type of prompt is to extract information about HiVA Tasks, so they can be properly triaged. 
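Before an extracted record is handed to another system, it is worth checking it against the schema. The sketch below is a deliberately small, hand-rolled check that covers only `required` and the `string`/`integer` types used in this notebook; `check_record` is a hypothetical helper, not a full JSON Schema validator.

```python
output_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "favorite_food": {"type": "string"},
    },
    "required": ["name", "age"],
}

# Only the JSON types that appear in the schema above are mapped.
JSON_TO_PYTHON = {"string": str, "integer": int}


def check_record(schema: dict, record: dict) -> list:
    """Return a list of human-readable problems; empty means the record passed."""
    problems = []
    for key in schema.get("required", []):
        if key not in record:
            problems.append(f"missing required field: {key}")
    for key, spec in schema.get("properties", {}).items():
        expected = JSON_TO_PYTHON.get(spec.get("type"))
        if key in record and expected and not isinstance(record[key], expected):
            problems.append(f"{key} should be {spec['type']}")
    return problems


good = {"name": "Laura", "age": 30, "favorite_food": "decadent chocolate lava cake"}
print(check_record(output_schema, good))
print(check_record(output_schema, {"name": "Laura", "age": "thirty"}))
```

A check like this gives a clear failure signal when the model drops a field or returns the wrong type, instead of letting the bad record propagate.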

## Part 2: Chaining two prompts

In this example we use the output of one LLM prompt as the input for another. There are two approaches: the first uses the [LangChain Expression Language (LCEL)](https://python.langchain.com/v0.1/docs/expression_language/), and the second uses `SimpleSequentialChain`. The second approach is legacy (the `LLMChain` class it relies on is deprecated in recent LangChain releases) and is included here only for completeness.

#### Example 4: LCEL Chaining

In [17]:
llm = ChatOllama(
    model="llama3.1",
    temperature=0.7,
    num_predict=512,  # maximum number of tokens to generate
)

template = """Your job is to come up with a classic dish from the geographic area that the user suggests.
% GEOGRAPHIC AREA: {user_input_area}

YOUR RESPONSE:
"""
prompt_location = ChatPromptTemplate.from_template(template)

template = """Given a meal, give a short and simple recipe on how to make that dish at home.
% MEAL: {link_output_meal}

YOUR RESPONSE:
"""
prompt_meal = ChatPromptTemplate.from_template(template)

# Parse the first response to plain text before it fills the second prompt
chain = prompt_location | llm | output_parser | prompt_meal | llm | output_parser

response = chain.invoke("Rome")
print(response)

To make Carpaccio di Manzo at home, follow these simple steps:

**Ingredients:**

* Thinly sliced raw beef (such as ribeye or sirloin)
* Fresh arugula
* Shaved Parmesan cheese
* Extra virgin olive oil
* Lemon wedges
* Black pepper

**Instructions:**

1. Place the thinly sliced beef on a plate.
2. Top with fresh arugula leaves.
3. Sprinkle shaved Parmesan cheese over the arugula.
4. Drizzle extra virgin olive oil over the dish.
5. Serve with lemon wedges and black pepper on the side.

**Enjoy your Carpaccio di Manzo!**
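In a two-step chain, the second prompt receives whatever the previous link returns. If that is a message object rather than plain text, its string representation (metadata included) may end up in the prompt. The sketch below illustrates the effect with a hypothetical `FakeMessage` dataclass standing in for a model reply; it is not a LangChain class.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a chat model reply (not a LangChain class)."""

    content: str
    metadata: dict = field(default_factory=dict)


meal_template = "Given a meal, give a short recipe.\n% MEAL: {link_output_meal}"

first_reply = FakeMessage(content="Carbonara", metadata={"model": "llama3.1"})

# Interpolating the whole object drags its metadata into the second prompt.
noisy_prompt = meal_template.format(link_output_meal=first_reply)

# Extracting the text first keeps the second prompt clean.
clean_prompt = meal_template.format(link_output_meal=first_reply.content)

print(noisy_prompt)
print(clean_prompt)
```

Placing a string output parser between the first model call and the second prompt is one way to get the clean variant.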


#### Example 5: Using SimpleSequentialChain

In [18]:
from langchain.chains import LLMChain, SimpleSequentialChain
from langchain.prompts import PromptTemplate

llm = ChatOllama(
    model="llama3.1",
    temperature=0.7,
    num_predict=512,  # maximum number of tokens to generate
)

template = """Your job is to come up with a classic dish from the area that the user suggests.
% USER LOCATION: {user_input_area}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["user_input_area"], template=template)

# Holds the 'location' chain
location_chain = LLMChain(llm=llm, prompt=prompt_template)

template = """Given a meal, give a short and simple recipe on how to make that dish at home.
% MEAL: {link_output_meal}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["link_output_meal"], template=template)

# Holds the 'meal' chain
meal_chain = LLMChain(llm=llm, prompt=prompt_template)

overall_chain = SimpleSequentialChain(chains=[location_chain, meal_chain], verbose=True)

review = overall_chain.invoke("Rome")

> Entering new SimpleSequentialChain chain...
A culinary challenge!

Given that our user is located in Rome, I'd like to suggest a classic Roman dish...

**Carbonara**

A rich and creamy pasta dish made with spaghetti, bacon or pancetta, eggs, parmesan cheese, and black pepper. The simplicity of its ingredients belies the depth of flavor it achieves, making Carbonara a beloved staple of Roman cuisine.

How does that sound?

Carbonara is an excellent choice for Rome! Here's a simple recipe to make this classic dish at home:

**Classic Roman Carbonara Recipe**

 Servings: 4

Ingredients:

* 12 oz spaghetti
* 6 slices of pancetta or bacon, diced
* 3 large eggs
* 1/2 cup grated Parmesan cheese
* Black pepper, freshly ground

Instructions:

1. Bring a large pot of salted water to a boil and cook the spaghetti according to package instructions until al dente.
2. While the pasta cooks, cook the pancetta or bacon in a pan over medium heat until crispy. Re

## Part 3: Llama prompt template format

In the previous example we used the following templates:

```
template = """Your job is to come up with a classic dish from the area that the user suggests.
% USER LOCATION: {user_input_area}

YOUR RESPONSE:
"""
```

```
template = """Given a meal, give a short and simple recipe on how to make that dish at home.
% MEAL: {link_output_meal}

YOUR RESPONSE:
"""
```

These templates follow an ad hoc layout of their own. Llama 3 models, however, have an official [prompt format](https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_2).

Following the official prompt format is a good idea for several reasons:
1. **Consistency:** Using a standardized format ensures consistency across different prompts, making it easier to compare and analyze results.
2. **Efficiency:** The official format is designed to be efficient, allowing you to provide all necessary information in a concise and organized manner.
3. **Clarity:** The format helps to clarify the task and requirements, reducing the risk of misinterpretation or misunderstandings.
4. **Reproducibility:** By following the same format, you can easily reproduce and build upon previous experiments, facilitating collaboration and research.
5. **Optimization:** The official format is optimized for the Llama 3 models, ensuring that the input is processed efficiently and effectively by the LLM.
6. **Community alignment:** Using the official format aligns with the broader Llama 3 community, making it easier to share and discuss results with others.


The template makes use of special tokens: https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1/#-special-tokens-
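As a rough illustration of how those special tokens frame a conversation, the sketch below renders a list of `(role, content)` pairs into a single prompt string. `format_llama3_prompt` is a hypothetical helper written for this notebook, and the layout is an assumption based on the linked format page; check that page for the authoritative details.

```python
def format_llama3_prompt(messages):
    """Render (role, content) pairs with Llama 3 special tokens."""
    parts = ["<|begin_of_text|>"]
    for role, content in messages:
        # Each turn: a role header, a blank line, the content, an end-of-turn token.
        parts.append(
            f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"
        )
    # Open an assistant turn so the model continues from here.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


prompt_text = format_llama3_prompt(
    [
        ("system", "You are a smart assistant."),
        ("user", "Who wrote A Song of Ice and Fire?"),
    ]
)
print(prompt_text)
```

The trailing assistant header is left open on purpose: the model's reply is the continuation of that turn.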


#### Example 6: Llama template + Pydantic

Pydantic is a popular Python library used for data validation, serialization, and deserialization. It provides a simple and intuitive way to define data models using Python type annotations.

In our case, we will use it to get an object out of the LLM response. 

In [19]:
from pydantic.v1 import BaseModel, Field


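# Pydantic schema for the structured response.
# Note: `required` is not a validation argument of pydantic.v1's Field; it is passed
# through as extra schema metadata. A field is optional only if it has a default,
# so favorite_food is still listed as required in the generated schema.
class Person(BaseModel):
    name: str = Field(description="The person's name", required=True)
    age: float = Field(description="The person's age", required=True)
    favorite_food: str = Field(description="The person's favorite food", required=False)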


parser = JsonOutputParser(pydantic_object=Person)


In [21]:
parser.get_format_instructions()

'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"properties": {"name": {"title": "Name", "description": "The person\'s name", "required": true, "type": "string"}, "age": {"title": "Age", "description": "The person\'s age", "required": true, "type": "number"}, "favorite_food": {"title": "Favorite Food", "description": "The person\'s favorite food", "required": false, "type": "string"}}, "required": ["name", "age", "favorite_food"]}\n```'

In [20]:
prompt_template = """<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a smart assistant. Take the following context and answer the following question.

{format_instructions}

<|eot_id|>
<|start_header_id|>user<|end_header_id|>
QUESTION: {question} \n
CONTEXT: {context} \n
JSON:
<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
"""

prompt = PromptTemplate(
    template=prompt_template,
    partial_variables={"format_instructions":parser.get_format_instructions()},
)
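The idea behind `partial_variables` is to fill in the values that never change once, and supply the rest at call time. A minimal plain-Python analogue, using `functools.partial` over `str.format` (an illustration of the concept, not how LangChain implements it):

```python
from functools import partial

template = (
    "{format_instructions}\n"
    "QUESTION: {question}\n"
    "CONTEXT: {context}\n"
)

# Pre-fill the variable that never changes between calls.
render = partial(template.format, format_instructions="Reply with a JSON object.")

# Only the remaining variables are supplied at call time.
prompt_text = render(question="Tell me about Hermione", context="Hermione turned 17.")
print(prompt_text)
```

This keeps the call site small: the caller only has to know about `question` and `context`.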


In [22]:
llm = ChatOllama(
    model="llama3.1",
    # temperature=0.7,
    num_predict=512,  # maximum number of tokens to generate
)

chain = prompt | llm | parser

In [23]:
context = """It was a special day for Hermione, as she was celebrating her 17th birthday with her two best friends, Harry and Ron. As they sat down at the Three Broomsticks, Hermione couldn't help but feel grateful for her wonderful friends and another year of life. To celebrate, they ordered Hermione's favorite food, treacle tart."""
question = "Tell me about Hermione"

In [24]:
result = chain.invoke({
    'context': context,
    'question': question,
})

result

{'name': 'Hermione', 'age': 17, 'favorite_food': 'treacle tart'}

In [25]:
Person(**result)

Person(name='Hermione', age=17.0, favorite_food='treacle tart')

#### Question

Try to run the same example, but use Llama 3.2. What do you observe? Why do you think that happens?

In [26]:
llm = ChatOllama(
    model="llama3.2",
    # temperature=0.7,
    num_predict=512,  # maximum number of tokens to generate
)

chain = prompt | llm | parser

In [27]:
result = chain.invoke({
    'context': context,
    'question': question,
})

result

{'name': {'title': 'Hermione',
  'description': "The person's name",
  'required': True,
  'type': 'string'},
 'age': {'title': 'Age',
  'description': "The person's age",
  'required': True,
  'type': 'number'},
 'favorite_food': {'title': 'Favorite Food',
  'description': "The person's favorite food",
  'required': False,
  'type': 'string'}}
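A result like this parses as valid JSON, so the parser does not complain. One way to catch it programmatically is a heuristic that looks for schema keywords where plain values are expected. `looks_like_schema_echo` below is a hypothetical helper, and the keyword set is an assumption tuned to this example.

```python
def looks_like_schema_echo(result: dict) -> bool:
    """Heuristic: True if any value is a schema fragment rather than data."""
    schema_keys = {"type", "description", "title"}
    return any(
        isinstance(value, dict) and bool(schema_keys.intersection(value))
        for value in result.values()
    )


instance = {"name": "Hermione", "age": 17, "favorite_food": "treacle tart"}
echoed = {
    "name": {"title": "Hermione", "description": "The person's name", "type": "string"},
    "age": {"title": "Age", "description": "The person's age", "type": "number"},
}

print(looks_like_schema_echo(instance))
print(looks_like_schema_echo(echoed))
```

A check of this kind can gate a retry or a fallback model before the result reaches code such as `Person(**result)`.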

Why is this important?
LangChain is a framework that manages the templating behind the scenes. It lets us switch from one model to another, with the input to the LLM adapted accordingly.
