<div style="width: 100%; overflow: hidden;">
    <div style="width: 150px; float: left;"> <img src="data/D4Sci_logo_ball.png" alt="Data For Science, Inc" align="left" border="0"> </div>
    <div style="float: left; margin-left: 10px;"> <h1>LangChain for Generative AI</h1>
<h1>Prompt Engineering</h1>
        <p>Bruno Gonçalves<br/>
        <a href="http://www.data4sci.com/">www.data4sci.com</a><br/>
            @bgoncalves, @data4sci</p></div>
</div>

In [1]:
from collections import Counter
from pprint import pprint

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt 

import langchain
from langchain_core.prompts import PromptTemplate, FewShotPromptTemplate
from langchain_core.example_selectors import LengthBasedExampleSelector

import langchain_openai
from langchain_openai import ChatOpenAI

import watermark

%load_ext watermark
%matplotlib inline

We start by printing out the versions of the libraries we're using for future reference

In [2]:
%watermark -n -v -m -g -iv

Python implementation: CPython
Python version       : 3.13.3
IPython version      : 9.2.0

Compiler    : Clang 17.0.0 (clang-1700.0.13.3)
OS          : Darwin
Release     : 25.0.0
Machine     : arm64
Processor   : arm
CPU cores   : 16
Architecture: 64bit

Git hash: 24f5062fbf46a87bfe9be08eb40e50ecbf9f4e00

langchain       : 0.3.25
matplotlib      : 3.10.3
langchain_core  : 0.3.62
numpy           : 2.2.5
langchain_openai: 0.3.18
watermark       : 2.5.0
pandas          : 2.2.3



Load default figure style

In [3]:
plt.style.use('d4sci.mplstyle')

# Prompt Templates

In [4]:
prompt = """Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.
Their superior performance over smaller models has made them incredibly
useful for developers building NLP enabled applications. These models
can be accessed via Hugging Face's `transformers` library, via OpenAI
using the `openai` library, and via Cohere using the `cohere` library.

Question: Which libraries and model providers offer LLMs?

Answer: """

In [5]:
openai = ChatOpenAI(
    model_name="gpt-4o",
)

In [6]:
openai.invoke(prompt).content

'The libraries and model providers that offer LLMs are Hugging Face (`transformers` library), OpenAI (`openai` library), and Cohere (`cohere` library).'

The same prompt can be written as a template, with the question left as a variable

In [7]:
template = """Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.
Their superior performance over smaller models has made them incredibly
useful for developers building NLP enabled applications. These models
can be accessed via Hugging Face's `transformers` library, via OpenAI
using the `openai` library, and via Cohere using the `cohere` library.

Question: {query}

Answer: """

prompt_template = PromptTemplate(
    input_variables=["query"],
    template=template
)

In [8]:
prompt = prompt_template.format(
        query="Which libraries and model providers offer LLMs?"
    )

print(prompt)

Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.
Their superior performance over smaller models has made them incredibly
useful for developers building NLP enabled applications. These models
can be accessed via Hugging Face's `transformers` library, via OpenAI
using the `openai` library, and via Cohere using the `cohere` library.

Question: Which libraries and model providers offer LLMs?

Answer: 
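For intuition, the substitution step can be sketched with plain Python string formatting. This is only an illustration, not LangChain's implementation: the shortened template text and the extra `{context}` variable are assumptions made for the example.

```python
from string import Formatter

# Shortened stand-in for the notebook's template (hypothetical text).
# Unlike the notebook, the context is also left as a variable here.
template = """Answer the question based on the context below.

Context: {context}

Question: {query}

Answer: """

# Field names found in the template, in order of appearance
variables = [name for _, name, _, _ in Formatter().parse(template) if name]
print(variables)

# Reuse the same template for several questions
context = "LLMs can be accessed via the `transformers`, `openai`, and `cohere` libraries."
questions = [
    "Which libraries and model providers offer LLMs?",
    "Which library gives access to Cohere models?",
]
prompts = [template.format(context=context, query=q) for q in questions]
print(prompts[1])
```

Keeping the fixed instructions in one template and passing only the changing parts makes it easy to reuse the same prompt for many questions.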


In [9]:
openai.invoke(prompt).content

"The libraries and model providers that offer LLMs are Hugging Face's `transformers` library, OpenAI using the `openai` library, and Cohere using the `cohere` library."

# Few-Shot Prompting

We can write the examples directly into the prompt manually

In [10]:
prompt = """The following are excerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative and funny responses to the user's questions. Here are some
examples: 

User: How are you?
AI: I can't complain but sometimes I still do.

User: What time is it?
AI: It's time to get a watch.

User: What is the meaning of life?
AI: """

In [11]:
print(openai.invoke(prompt).content)

42. But if you're looking for a more elaborate answer, you might consider checking with the makers of Google Maps, because life's journey doesn't come with a pre-set destination.


## FewShotPromptTemplate

We start with a fairly long list of examples

In [12]:
examples = [
    {
        "query": "How are you?",
        "answer": "I can't complain but sometimes I still do."
    }, {
        "query": "What time is it?",
        "answer": "It's time to get a watch."
    }, {
        "query": "What is the meaning of life?",
        "answer": "42"
    }, {
        "query": "What is the weather like today?",
        "answer": "Cloudy with a chance of memes."
    }, {
        "query": "What is your favorite movie?",
        "answer": "Terminator"
    }, {
        "query": "Who is your best friend?",
        "answer": "Siri. We have spirited debates about the meaning of life."
    }, {
        "query": "What should I do today?",
        "answer": "Stop talking to chatbots on the internet and go outside."
    }
]

Template to render each example

In [13]:
example_template = """
User: {query}
AI: {answer}
"""

Prompt template used to format each example

In [14]:
example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_template
)

In [15]:
example_prompt

PromptTemplate(input_variables=['answer', 'query'], input_types={}, partial_variables={}, template='\nUser: {query}\nAI: {answer}\n')
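To see what a single formatted example looks like, here is a minimal sketch that uses plain `str.format` as a stand-in for the prompt template. The variable name `rendered` is introduced only for this illustration.

```python
# Same example template as above, written as an escaped string
example_template = "\nUser: {query}\nAI: {answer}\n"

example = {
    "query": "How are you?",
    "answer": "I can't complain but sometimes I still do.",
}

# Each dictionary key fills the placeholder with the same name
rendered = example_template.format(**example)
print(repr(rendered))
```

Note the leading and trailing newlines in the template: they end up in every rendered example and affect the spacing of the final prompt.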

Finally, we break the full prompt into a prefix (everything before the examples) and a suffix (everything after)

In [16]:
prefix = """The following are excerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative and funny responses to the user's questions. Here are some
examples: 
"""

suffix = """
User: {query}
AI: """

The final few shot prompt puts all the pieces together

In [17]:
few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n"
)

In [18]:
query = "What is the meaning of life?"

In [19]:
print(few_shot_prompt_template.format(query=query))

The following are excerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative and funny responses to the user's questions. Here are some
examples: 



User: How are you?
AI: I can't complain but sometimes I still do.



User: What time is it?
AI: It's time to get a watch.



User: What is the meaning of life?
AI: 42



User: What is the weather like today?
AI: Cloudy with a chance of memes.



User: What is your favorite movie?
AI: Terminator



User: Who is your best friend?
AI: Siri. We have spirited debates about the meaning of life.



User: What should I do today?
AI: Stop talking to chatbots on the internet and go outside.



User: What is the meaning of life?
AI: 
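The layout of the prompt printed above can be reproduced with a few lines of plain Python. This is a sketch of the assembly logic under simplifying assumptions, not LangChain's actual code: `build_few_shot_prompt` is a hypothetical helper, and the prefix is shortened.

```python
def build_few_shot_prompt(prefix, examples, example_template, suffix,
                          separator="\n\n", **inputs):
    """Join the prefix, the rendered examples, and the suffix into one prompt."""
    rendered = [example_template.format(**example) for example in examples]
    pieces = [prefix, *rendered, suffix.format(**inputs)]
    # Drop empty pieces so an empty prefix does not add a stray separator
    return separator.join(piece for piece in pieces if piece)


examples = [
    {"query": "How are you?", "answer": "I can't complain but sometimes I still do."},
    {"query": "What time is it?", "answer": "It's time to get a watch."},
]
example_template = "\nUser: {query}\nAI: {answer}\n"
prefix = "Here are some\nexamples: \n"  # shortened stand-in for the notebook's prefix
suffix = "\nUser: {query}\nAI: "

few_shot_prompt = build_few_shot_prompt(
    prefix, examples, example_template, suffix,
    query="What is the meaning of life?",
)
print(few_shot_prompt)
```

The three blank lines between blocks come from the trailing newline of one piece, the `"\n\n"` separator, and the leading newline of the next piece. Removing the newlines at the start and end of the templates would tighten the prompt.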


This is a fairly long prompt, which increases the number of tokens consumed on every call. We can use __LengthBasedExampleSelector__ to automatically limit the prompt length by including only as many examples as fit within a length budget.

In [20]:
example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=50  # maximum combined length of the query and the selected examples (in words, by default)
)

In [21]:
dynamic_prompt_template = FewShotPromptTemplate(
    example_selector=example_selector,  # use example_selector instead of examples
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n"
)

Now the full prompt depends on the length of the question. Shorter questions leave more room for examples.

In [22]:
print(dynamic_prompt_template.format(query="How do birds fly?"))

The following are excerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative and funny responses to the user's questions. Here are some
examples: 


User: How are you?
AI: I can't complain but sometimes I still do.


User: What time is it?
AI: It's time to get a watch.


User: What is the meaning of life?
AI: 42


User: How do birds fly?
AI: 


Longer questions, on the other hand, leave room for fewer examples

In [23]:
query = """If I am in America, and I want to call someone in another country, I'm
thinking maybe Europe, possibly western Europe like France, Germany, or the UK,
what is the best way to do that?"""

prompt = dynamic_prompt_template.format(query=query)
print(prompt)

The following are excerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative and funny responses to the user's questions. Here are some
examples: 


User: How are you?
AI: I can't complain but sometimes I still do.


User: If I am in America, and I want to call someone in another country, I'm
thinking maybe Europe, possibly western Europe like France, Germany, or the UK,
what is the best way to do that?
AI: 
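The two selections above (three examples for the short question, one for the long one) can be reproduced by hand. The sketch below assumes that the selector's default length function counts words by splitting on spaces and newlines, and that the length of the query is subtracted from `max_length` before examples are added in order. `word_count` and `select_by_length` are hypothetical helpers, not part of LangChain.

```python
import re


def word_count(text):
    """Number of pieces after splitting on spaces and newlines."""
    return len(re.split(r"\n| ", text))


def select_by_length(examples, example_template, query, max_length):
    """Add examples in order until the remaining length budget runs out."""
    remaining = max_length - word_count(query)
    selected = []
    for example in examples:
        cost = word_count(example_template.format(**example))
        if remaining - cost < 0:
            break
        selected.append(example)
        remaining -= cost
    return selected


example_template = "\nUser: {query}\nAI: {answer}\n"
examples = [
    {"query": "How are you?", "answer": "I can't complain but sometimes I still do."},
    {"query": "What time is it?", "answer": "It's time to get a watch."},
    {"query": "What is the meaning of life?", "answer": "42"},
    {"query": "What is the weather like today?", "answer": "Cloudy with a chance of memes."},
]

short_query = "How do birds fly?"
long_query = """If I am in America, and I want to call someone in another country, I'm
thinking maybe Europe, possibly western Europe like France, Germany, or the UK,
what is the best way to do that?"""

print(len(select_by_length(examples, example_template, short_query, max_length=50)))
print(len(select_by_length(examples, example_template, long_query, max_length=50)))
```

Word counts are only a rough proxy for tokens, so the budget should be chosen with some margin if a hard token limit matters.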


In [24]:
openai.invoke(prompt).content

"Well, first, I recommend getting a phone. Screaming across the Atlantic might give you a sore throat and confused seagulls. Once you have a phone, try dialing with a country code; it's like the secret handshake of international communication. Alternatively, send a carrier pigeon, but make sure it's fluent in French, German, and British sarcasm."

# Chain of Thought prompts

## Few shot

In [25]:
cot_examples = [
    {
        "query": "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?",
        "answer": "The answer is 11",
        "cot": "Roger started with 5 tennis balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11"
    }, 
    
    {
        "query": "A juggler can juggle 16 balls. Half of the balls are golf balls and half of the golf balls are blue. How many blue golf balls are there?",
        "answer": "The answer is 4",
        "cot": "The juggler can juggle 16 balls. Half of the balls are golf balls. So there are 16/2=8 golf balls. Half of the golf balls are blue. So there are 8/2=4 blue golf balls."
    }
]

In [26]:
cot_example_template = """
    User: {query}
    AI: {cot}
    {answer}
"""

In [27]:
cot_example_prompt = PromptTemplate(
    input_variables=["query", "answer", "cot"],
    template=cot_example_template
)

In [28]:
cot_example_prompt

PromptTemplate(input_variables=['answer', 'cot', 'query'], input_types={}, partial_variables={}, template='\n    User: {query}\n    AI: {cot}\n    {answer}\n')

In [29]:
cot_prefix = """The following are excerpts from conversations with an AI
assistant. The assistant is smart and thinks through each step of the problem. Here are some examples: 
"""

cot_suffix = """
User: {query}
AI: """

In [30]:
cot_few_shot_prompt_template = FewShotPromptTemplate(
    examples=cot_examples,
    example_prompt=cot_example_prompt,
    prefix=cot_prefix,
    suffix=cot_suffix,
    input_variables=["query"],
    example_separator="\n\n"
)

In [31]:
cot_query = """
I have a deck of 52 cards. 
There are 4 suits of equal size. 
Each suit has 3 face cards. 
How many face cards are there in total?"""

In [32]:
print(cot_few_shot_prompt_template.format(query=cot_query))

The following are excerpts from conversations with an AI
assistant. The assistant is smart and thinks through each step of the problem. Here are some examples: 



    User: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
    AI: Roger started with 5 tennis balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11
    The answer is 11



    User: A juggler can juggle 16 balls. Half of the balls are golf balls and half of the golf balls are blue. How many blue golf balls are there?
    AI: The juggler can juggle 16 balls. Half of the balls are golf balls. So there are 16/2=8 golf balls. Half of the golf balls are blue. So there are 8/2=4 blue golf balls.
    The answer is 4



User: 
I have a deck of 52 cards. 
There are 4 suits of equal size. 
Each suit has 3 face cards. 
How many face cards are there in total?
AI: 


In [33]:
llm = ChatOpenAI(
    model_name="gpt-4o",
)

In [34]:
print(llm.invoke(cot_few_shot_prompt_template.format(query=cot_query)).content)

Let's break this down step by step:

1. A standard deck of cards has 52 cards divided equally among 4 suits.
2. Each suit, therefore, has \( \frac{52}{4} = 13 \) cards.
3. We are told that each suit has 3 face cards.

Since there are 4 suits and each suit has 3 face cards, the total number of face cards in the deck is:

\[ 4 \times 3 = 12 \]

The answer is 12.
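Because the few-shot examples end with "The answer is N", the model tends to follow the same pattern, which makes the final answer easy to pull out of the reasoning text. The sketch below assumes that convention holds in the response. `extract_answer` is a hypothetical helper, and the response string is abbreviated from the output above.

```python
import re


def extract_answer(response):
    """Return the number after the last 'The answer is', or None if absent."""
    matches = re.findall(r"The answer is\s+(-?\d+(?:\.\d+)?)", response)
    return float(matches[-1]) if matches else None


# Abbreviated version of the model output shown above
response = """Since there are 4 suits and each suit has 3 face cards, the total is:

\\[ 4 \\times 3 = 12 \\]

The answer is 12."""

print(extract_answer(response))
print(extract_answer("I am not sure."))
```

Returning `None` when the pattern is missing lets the caller decide whether to retry or flag the response, instead of silently accepting a malformed answer.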


## Zero shot

In [35]:
cot_zero_shot_template = """\
Q. {query}
A. Let's think step by step
"""

In [36]:
cot_zero_shot_prompt = PromptTemplate(
       input_variables=["query"],
       template=cot_zero_shot_template
)

In [37]:
query = "On average Joe throws 25 punches per minute. A fight lasts 5 rounds of 3 minutes each. How many punches does Joe throw?"

In [38]:
print(cot_zero_shot_prompt.format(query=query))

Q. On average Joe throws 25 punches per minute. A fight lasts 5 rounds of 3 minutes each. How many punches does Joe throw?
A. Let's think step by step



In [39]:
print(llm.invoke(cot_zero_shot_prompt.format(query=query)).content)

To find the total number of punches Joe throws, we can break it down step by step:

1. **Determine the Duration of the Fight**:
   - Each round lasts 3 minutes.
   - There are 5 rounds in total.
   - Therefore, the total duration of the fight is \(5 \times 3 = 15\) minutes.

2. **Calculate the Total Number of Punches**:
   - Joe throws an average of 25 punches per minute.
   - Over the course of the entire fight, which lasts 15 minutes, he would throw:
   \[
   25 \text{ punches/minute} \times 15 \text{ minutes} = 375 \text{ punches}
   \]

Therefore, Joe throws a total of 375 punches during the fight.


And of course this also works with the questions from our CoT few-shot examples

In [40]:
print(llm.invoke(cot_zero_shot_prompt.format(query=cot_examples[0]["query"])).content)

Sure, let's break it down step by step.

1. Roger initially has 5 tennis balls.
2. He buys 2 more cans of tennis balls.
3. Each can contains 3 tennis balls.

To find out how many additional tennis balls he buys, we multiply the number of cans by the number of tennis balls per can:

\[ 2 \text{ cans} \times 3 \text{ tennis balls per can} = 6 \text{ tennis balls} \]

Now, we add the additional tennis balls to the ones Roger initially had:

\[ 5 \text{ initial tennis balls} + 6 \text{ additional tennis balls} = 11 \text{ tennis balls} \]

Therefore, Roger has a total of 11 tennis balls now.
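Step-by-step reasoning can still contain arithmetic slips, so it helps to check the model's answers against a direct computation when the problem is simple enough. A minimal sketch for the word problems used in this section, with variable names chosen only for this illustration:

```python
# Direct computation of each word problem used in this section
tennis_balls = 5 + 2 * 3         # 5 balls plus 2 cans of 3 balls each
blue_golf_balls = 16 // 2 // 2   # half are golf balls, half of those are blue
face_cards = 4 * 3               # 4 suits with 3 face cards each
punches = 25 * 5 * 3             # 25 punches per minute, 5 rounds of 3 minutes

print(tennis_balls, blue_golf_balls, face_cards, punches)
```

These values agree with the answers in the outputs above (11, 4, 12, and 375).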


<center>
     <img src="data/D4Sci_logo_full.png" alt="Data For Science, Inc" align="center" border="0" width=300px> 
</center>