# Chain-of-Thought Prompting

In this notebook, you will learn about one of the best-known prompting techniques: Chain-of-Thought prompting.

---

## Objectives

By the time you complete this notebook you will:

- Learn Chain-of-Thought prompting
- Encounter and appreciate LLM hallucination

---

## Imports

In [1]:
!pip install groq langchain-groq

Collecting groq
  Downloading groq-0.31.1-py3-none-any.whl.metadata (16 kB)
Collecting langchain-groq
  Downloading langchain_groq-0.3.8-py3-none-any.whl.metadata (2.6 kB)
Downloading groq-0.31.1-py3-none-any.whl (134 kB)
Downloading langchain_groq-0.3.8-py3-none-any.whl (16 kB)
Installing collected packages: groq, langchain-groq
Successfully installed groq-0.31.1 langchain-groq-0.3.8


In [2]:
import os
import getpass

os.environ["GROQ_API_KEY"] = getpass.getpass("GROQ API Key:\n")

GROQ API Key:
··········


In [3]:
from langchain_groq import ChatGroq
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

---

## Create a Model Instance

In [4]:
llm = ChatGroq(model_name="llama-3.3-70b-versatile", temperature=0)

---

## LLMs Jump to Conclusions

Since LLMs are primarily designed to generate what is most likely to come next in a stream of text, it's no surprise that they can often "jump to conclusions" in scenarios that actually require complex reasoning.

Consider the following example, which challenges the LLM to do some multiplication. Before invoking the LLM, we'll use Python to provide us with the actual answer.

In [11]:
342345*88888

30430362360

In [12]:
print(llm.invoke('What is 342345 * 88888?').content)

To calculate this, I'll multiply the two numbers:

342345 * 88888 = 30,419,061,160


This answer is incorrect. Given that LLMs are designed primarily to generate the most likely response based on the data they were trained on, that may not come as a surprise.
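A comparison like this can also be automated rather than done by eye. The sketch below uses only the standard library: it pulls the last integer out of a response string and compares it with Python's own product. The helper name `extract_last_integer` and the hard-coded response string (copied from the output above) are illustrative assumptions, not part of the notebook's workflow.

```python
import re

def extract_last_integer(text):
    """Return the last integer found in text, or None if there is none.

    Accepts comma-grouped digits such as 30,419,061,160.
    """
    matches = re.findall(r"\d{1,3}(?:,\d{3})+|\d+", text)
    if not matches:
        return None
    return int(matches[-1].replace(",", ""))

# Response text copied from the LLM output above (hard-coded for illustration).
llm_response = "342345 * 88888 = 30,419,061,160"
expected = 342345 * 88888

llm_answer = extract_last_integer(llm_response)
print(llm_answer == expected)  # False: the LLM's product does not match Python's
```

Treating the last number in the response as "the answer" is a simplifying assumption. It holds for the outputs shown in this notebook, but a response that ends with some other number would need a stricter parsing rule.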

Significant changes are likely in the coming years, but as it stands, and in spite of the common use of terms like "Artificial **Intelligence**", LLMs are not well suited to deliberate step-by-step reasoning, at least not without additional user support. Rather, because of their design, LLMs are well suited to jumping to conclusions.

---

## Hallucination

This phenomenon of LLMs generating incorrect information, often with what appears to outside observers to be "confidence", is called **hallucination**. Because of their design, an LLM will usually say something incorrect (unbeknownst to the LLM itself) rather than say nothing at all. While LLMs are hallucinating less over time and will continue to improve in this regard, hallucination is an inherent characteristic of LLM generation.

Regarding LLM hallucination, you need to commit the following golden rules to memory:
1. All LLMs hallucinate.
2. You are responsible for what your LLMs generate in your applications.

---

## Chain-of-Thought Prompting

Returning to the main subject of this notebook: while LLMs are not well suited to long-form deliberate reasoning, there are techniques we can use to guide them towards a more deliberate step-by-step approach when the task demands it. One such technique is called Chain-of-Thought prompting.

[Chain-of-Thought (CoT) prompting](https://arxiv.org/pdf/2201.11903) is still one of the most popular prompting techniques. It supports complex reasoning by encouraging LLMs to break down complex problems into intermediate steps. CoT prompting involves providing examples that demonstrate step-by-step reasoning to solve a problem.

Here is the example from the paper that introduced CoT prompting.

---

## Chain-of-Thought Multiplication

Let's try a CoT prompt for 3-digit long multiplication. We will start with an example multiplication prompt.

In [13]:
example_problem = 'What is 678 * 789?'

Next, we'll supply a chain-of-thought example of working through the problem step by step.

In [14]:
example_cot = '''\
Let me break this down into steps. First I'll break down 789 into hundreds, tens, and ones:

789 -> 700 + 80 + 9

Next I'll multiply 678 by each of these values, storing the intermediate results:

678 * 700 -> 678 * 7 * 100 -> 4746 * 100 -> 474600

My first intermediate result is 474600.

678 * 80 -> 678 * 8 * 10 -> 5424 * 10 -> 54240

My second intermediate result is 54240.

678 * 9 -> 6102

My third intermediate result is 6102.

My three intermediate results are 474600, 54240, and 6102.

Adding the first two intermediate results I get 474600 + 54240 -> 528840.

Adding 528840 to the last intermediate result I get 528840 + 6102 -> 534942

The final result is 534942.
'''

With our example problem and example CoT response, we can construct a one-shot prompt template.

In [15]:
multiplication_template = ChatPromptTemplate.from_messages([
    ('human', example_problem),
    ('ai', example_cot),
    ('human', '{long_multiplication_prompt}')
])

We'll now use this template in a simple chain.

In [16]:
multiplication_chain = multiplication_template | llm | StrOutputParser()

Let's give the LLM a 3-digit version of the earlier multiplication task to see if it does any better with our CoT example in place.

In [17]:
print(multiplication_chain.invoke('What is 345 * 888?'))

To calculate this, I'll break down 888 into hundreds, tens, and ones, but since 888 is close to 900 and also close to 1000 - 112, or 800 + 80 + 8, I can break it down as follows:

888 -> 800 + 80 + 8

Now I'll multiply 345 by each of these values:

345 * 800 -> 345 * 8 * 100 -> 2760 * 100 -> 276000

My first intermediate result is 276000.

345 * 80 -> 345 * 8 * 10 -> 2760 * 10 -> 27600

My second intermediate result is 27600.

345 * 8 -> 2760

My third intermediate result is 2760.

Now I'll add these intermediate results together:

276000 + 27600 -> 303600

303600 + 2760 -> 306360

The final result is 306360.


In [18]:
345*888

306360

As you can see, the LLM followed the example in our CoT prompt, taking the problem step by step, and in this case, generated the correct response.
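The decomposition used in the CoT example can be reproduced in a few lines of plain Python, which gives a way to check each intermediate result in a response and not just the final answer. This is a minimal sketch: the function names `place_value_parts` and `partial_products` are hypothetical helpers introduced here for illustration.

```python
def place_value_parts(n):
    """Split a non-negative integer into place-value parts, e.g. 789 -> [700, 80, 9]."""
    digits = str(n)
    return [
        int(d) * 10 ** (len(digits) - i - 1)
        for i, d in enumerate(digits)
        if d != "0"
    ]

def partial_products(a, b):
    """Multiply a by each place-value part of b, mirroring the CoT example's steps."""
    return [a * part for part in place_value_parts(b)]

# The worked example from the one-shot prompt.
steps = partial_products(678, 789)
print(steps)       # [474600, 54240, 6102]
print(sum(steps))  # 534942

# The problem we gave the LLM.
steps = partial_products(345, 888)
print(steps)       # [276000, 27600, 2760]
print(sum(steps))  # 306360
```

Each list matches the intermediate results shown in the example CoT and in the LLM's response, so for this run every step of the model's reasoning checks out.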

---

## Zero-shot Chain-of-Thought Prompting

CoT prompting can get really verbose, and as we mentioned earlier, you shouldn't be shy about writing long prompts when needed to accomplish your goals.

That said, there are more elegant ways to leverage CoT methods with LLMs. One such variant is called [Zero-shot CoT](https://arxiv.org/abs/2205.11916). This prompting technique simply adds "Let's think step by step" to the prompt without supplying verbose CoT examples, and it has been shown to work quite well.

Let's construct a new prompt template to try Zero-shot CoT on long multiplication, following the example in the paper, simply supplying the language "Let's think step by step".

In [19]:
zero_shot_cot_prompt = ChatPromptTemplate([
    ("human", "{long_multiplication_prompt} Let's think step by step.")
])

In [20]:
zero_shot_multiplication_chain = zero_shot_cot_prompt | llm | StrOutputParser()

In [21]:
print(zero_shot_multiplication_chain.invoke('What is 345 * 888?'))

To calculate 345 * 888, let's break it down step by step.

First, we can multiply 345 by 800, which is close to 888. 

345 * 800 = 276,000

Next, we need to multiply 345 by the remaining 88 (since 888 - 800 = 88).

345 * 88 = 30,360

Now, we add the results of these two multiplications together:

276,000 + 30,360 = 306,360

So, 345 * 888 = 306,360.


In [22]:
345*888

306360

For our 3-digit long multiplication problem, simply prompting the model to think step by step performed as well as our more verbose CoT prompt.

---

## CoT Prompting In Practice

As with many other aspects of prompt engineering, and especially given the variety of LLMs available (both now and in the future), it's difficult to give hard and fast rules about when and how exactly to employ CoT prompting. That said, we can offer some good general guidelines to follow.

- Develop your prompts iteratively. Start simple, try zero-shot CoT when you think it might be helpful, and expand to use more verbose example-based CoT prompting when the need arises.
- Consider the use of external, non-LLM tools for tasks that LLMs may not be well suited for (like math). We are going to look at tool use later in the workshop, but the general idea is that LLMs are incredible, just not necessarily for every single task. Reflect, in the case of long multiplication, on how fast, reliable, and effective simple Python is. Remember the old adage _"If all you have is a hammer, everything looks like a nail,"_ and don't fall into that trap. LLMs are only one of the many tools available to you while building LLM-based applications.
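To make the second guideline concrete, here is a minimal sketch of handing arithmetic to plain Python instead of the LLM. It is not the tool-use mechanism covered later in the workshop. The function name `evaluate_arithmetic` is a hypothetical helper, and the choice to support only `+`, `-`, `*`, and `/` over numeric literals is an assumption made to keep the example short and to avoid calling `eval()` on arbitrary text.

```python
import ast
import operator

# Only these binary operators are allowed; anything else raises ValueError.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate_arithmetic(expression):
    """Evaluate a basic arithmetic expression without using eval()."""
    def _eval(node):
        # Numeric literals (bool is excluded even though it subclasses int).
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        # Supported binary operations, evaluated recursively.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](_eval(node.left), _eval(node.right))
        # Unary minus, e.g. "-5".
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)

print(evaluate_arithmetic("342345 * 88888"))  # 30430362360
```

The result matches the value Python produced at the start of the notebook, with no prompting technique required.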

---

## Exercise: Use an LLM to Solve a Word Problem

For this exercise, use what you've learned in this notebook to get a correct response from our LLM to the following word problem.

In [23]:
word_problem = """Michael's car travels at 40 miles per hour. He is driving from 1 PM to 4 PM and then \
travels back at a rate of 25 miles per hour due to heavy traffic. How long in \
terms of minutes did it take him to get back?"""

Before beginning your work, here is the actual solution to the word problem.

Michael drove 40 miles per hour for 3 hours, which means he drove 120 miles.

To come back at a rate of 25 miles per hour, it would have taken him 4.8 hours (120 miles / 25 mph) which is equivalent to **288** minutes (4.8 hours * 60 minutes/hour).

The correct answer is **288**.
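The same arithmetic can be confirmed in Python, which is handy for checking the LLM's answer once you have it. The variable names below are illustrative. The conversion multiplies by 60 before dividing, so the result is not affected by floating-point rounding of 4.8.

```python
outbound_speed_mph = 40
outbound_hours = 4 - 1  # driving from 1 PM to 4 PM
return_speed_mph = 25

# Distance covered on the way out, which is also the distance back.
distance_miles = outbound_speed_mph * outbound_hours

# Return time in minutes: (miles / mph) hours * 60 minutes per hour.
return_minutes = distance_miles * 60 / return_speed_mph

print(distance_miles)  # 120
print(return_minutes)  # 288.0
```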

Feel free to check out the *Solution* below if you get stuck.

### Your Work Here

### Solution

There are lots of ways to solve this, but we opted to include a system message encouraging the LLM to always break its work into smaller tasks and show its work, and then to append the canonical zero-shot CoT prompt to whatever the human user passes in.

In [24]:
template = ChatPromptTemplate.from_messages([
    ('system', 'You are an expert word problem solver. You always break your problem down into smaller tasks and show your work.'),
    ('human', '{prompt}\n\nLet\'s think step by step.')
])

In [25]:
chain = template | llm | StrOutputParser()

In [26]:
print(chain.invoke(word_problem))

To solve this problem, let's break it down into smaller tasks.

**Task 1: Calculate the distance traveled from 1 PM to 4 PM**

First, we need to calculate the time Michael spent driving from 1 PM to 4 PM. 
Time = 4 PM - 1 PM = 3 hours

Since his car travels at 40 miles per hour, we can calculate the distance traveled:
Distance = Speed x Time = 40 miles/hour x 3 hours = 120 miles

**Task 2: Calculate the time it took to travel back**

Now, we know the distance Michael needs to travel back is also 120 miles (since he's returning to the starting point). 
The speed at which he travels back is 25 miles per hour due to heavy traffic.

We can use the formula: Time = Distance / Speed
Time = 120 miles / 25 miles/hour = 4.8 hours

**Task 3: Convert the time from hours to minutes**

Finally, we need to convert the time from hours to minutes:
1 hour = 60 minutes
4.8 hours = 4.8 x 60 minutes = 288 minutes

Therefore, it took Michael 288 minutes to travel back.


---

## Summary

In this notebook you experienced LLM hallucination, reflected on LLMs' natural inclination to jump to conclusions, and learned several Chain-of-Thought prompting techniques to assist LLMs in performing tasks requiring step-by-step reasoning.

In the next notebook, you are going to learn how to make a chatbot using the prompt engineering techniques you now have at your disposal.