<a href="https://colab.research.google.com/github/dwopdm/as3/blob/main/llm-prompts/6_how_to_use_chain_of_thought_prompt%20mbpp.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


<img src="https://github.com/dmatrix/genai-cookbook/blob/main/llm-prompts/images/llm_prompt_req_resp.png?raw=1" height="35%" width="65%">

## Chain of thought (CoT) prompting

Chain of thought prompting for LLMs involves providing a sequence of reasoning steps in the prompt to guide the model toward a solution. This technique helps the model to process complex problems by breaking them down into intermediate steps, much like a human would. By mimicking human-like reasoning, chain of thought prompting improves the model's ability to handle tasks that require logic and deduction.

[Wei et al.](https://arxiv.org/abs/2201.11903) (2022) introduced chain-of-thought (CoT) prompting, which includes intermediate reasoning steps in few-shot examples to help the model solve complex problems. It works best on tasks that need careful thinking before answering, because it gives the model time to "think." A similar effect can often be achieved without examples, simply by instructing the LLM: "Let's think through this step by step. Solve each step and explain how you arrived at your answer." Such instructions remove the need to explicitly provide "few-shot" examples. The two approaches can also be combined to tackle more difficult tasks.
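The two prompting styles described above can be sketched as plain string builders. This is only an illustration: the helper names `zero_shot_cot` and `few_shot_cot` and the sample question are hypothetical, and the single exemplar is borrowed from the 8-shot prompt used later in this notebook.

```python
# Illustrative sketch: two ways to elicit step-by-step reasoning.
# Helper names and the sample question are hypothetical.

COT_TRIGGER = ("Let's think through this step by step. "
               "Solve each step and explain how you arrived at your answer.")


def zero_shot_cot(question: str) -> str:
    """Append a reasoning instruction; no worked examples are supplied."""
    return f"{question}\n\n{COT_TRIGGER}"


def few_shot_cot(question: str, exemplars: list[tuple[str, str]]) -> str:
    """Prefix worked (question, reasoning) pairs, then pose the new question."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in exemplars)
    return f"{shots}\nQ: {question}\nA:"


exemplars = [(
    "If there are 3 cars in the parking lot and 2 more cars arrive, "
    "how many cars are in the parking lot?",
    "There are 3 cars in the parking lot already. 2 more arrive. "
    "Now there are 3 + 2 = 5 cars. The answer is 5.",
)]

question = "A shelf holds 4 boxes with 6 books each. How many books are there?"
print(zero_shot_cot(question))
print("-" * 40)
print(few_shot_cot(question, exemplars))
```

The zero-shot form costs fewer prompt tokens, while the few-shot form also shows the model the answer format ("The answer is ...") that makes replies easier to parse.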

Let's look at a few of those examples below 👇

**Note**:
To run any of these notebooks you will need an account on Anyscale Endpoints, Anthropic, or OpenAI, depending on which model you choose, along with the respective environment file. Use the template environment files to create the respective `.env` file for Anyscale Endpoints, Anthropic, or OpenAI.
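As a rough idea of what loading a `.env` file involves, here is a minimal standard-library sketch. The function `load_env_file` and the variable name `EXAMPLE_API_KEY` are assumptions for illustration; the template files mentioned above may use different names.

```python
# Minimal sketch of reading KEY=VALUE pairs from a .env file using only the
# standard library. EXAMPLE_API_KEY is a placeholder name, not a real setting.
import os
import tempfile
from pathlib import Path


def load_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Demonstrate with a throwaway file so no real credentials are involved.
with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    Path(env_path).write_text('# demo\nEXAMPLE_API_KEY="not-a-real-key"\n')
    settings = load_env_file(env_path)
    os.environ.update(settings)

print(sorted(settings))
```

In practice a dedicated package such as `python-dotenv` handles more edge cases (multi-line values, variable expansion) than this sketch does.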

Load the environment

In [4]:
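import os

from openai import OpenAI

# Read the key from the environment rather than hard-coding it in the notebook.
# The variable name DEEPSEEK_API_KEY is an assumption; match it to your `.env` file.
client = OpenAI(api_key=os.environ["DEEPSEEK_API_KEY"], base_url="https://api.deepseek.com")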

# Query using DeepSeek
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    stream=False
)

print(response.choices[0].message.content)

ImportError: cannot import name 'OpenAI' from 'openai' (/usr/local/lib/python3.10/dist-packages/openai/__init__.py)

In [5]:
# Utility function to send a request and fetch the response text

def get_commpletion(clnt: object, system_content: str, user_content: str) -> str:
    # Use the client passed in as `clnt` rather than relying on a global
    chat_completion = clnt.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "system", "content": system_content},
                  {"role": "user", "content": user_content}],
        temperature=0.8)

    response = chat_completion.choices[0].message.content
    return response
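To try a helper like this without network access or an API key, a stand-in client can mimic the response shape. The sketch below is hypothetical: `FakeClient` and `get_completion` are illustrative names, and the fake only reproduces the attribute path `client.chat.completions.create(...)` and the `choices[0].message.content` field that the helper reads.

```python
# Sketch: exercising a completion helper offline with a stand-in client.
# FakeClient and get_completion are hypothetical names for illustration.
from types import SimpleNamespace


class FakeClient:
    """Returns a canned reply in the same shape as a chat completion."""

    def __init__(self, reply: str):
        self._reply = reply
        self.last_messages = None
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create))

    def _create(self, model, messages, temperature=1.0):
        # Record the messages so a test can inspect what was sent.
        self.last_messages = messages
        message = SimpleNamespace(content=self._reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


def get_completion(clnt, system_content: str, user_content: str) -> str:
    """Send a system and a user message; return the reply text."""
    chat_completion = clnt.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "system", "content": system_content},
                  {"role": "user", "content": user_content}],
        temperature=0.8)
    return chat_completion.choices[0].message.content


fake = FakeClient("The answer is 5.")
reply = get_completion(fake, "You are a helpful assistant", "What is 3 + 2?")
print(reply)
```

Passing the client as a parameter is what makes this kind of substitution possible.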

In [None]:
system_content = """You are a supreme repository of knowledge and an engine
of reason. You can solve complex problems by breaking them into steps and
solving each step to arrive at a solution."""

In [None]:
!pip install datasets

In [None]:
from datasets import load_dataset
gsm8k = load_dataset("gsm8k", "main", cache_dir='/tmp')
gsm8k_train, gsm8k_test = gsm8k['train'], gsm8k['test']

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


README.md:   0%|          | 0.00/7.94k [00:00<?, ?B/s]

train-00000-of-00001.parquet:   0%|          | 0.00/2.31M [00:00<?, ?B/s]

test-00000-of-00001.parquet:   0%|          | 0.00/419k [00:00<?, ?B/s]

Generating train split:   0%|          | 0/7473 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/1319 [00:00<?, ? examples/s]

In [None]:

import re


def find_numbers(x: str) -> list[str]:
  """Finds all numbers in a string."""
  # Search for a number, possibly negative (hyphen), with thousand separators
  # (comma), and with a decimal point (period in between digits).
  numbers = re.compile(
      r'-?[\d,]*\.?\d+',
      re.MULTILINE | re.DOTALL | re.IGNORECASE,
  ).findall(x)
  return numbers


def find_number(x: str,
                answer_delimiter: str = 'The answer is') -> str:
  """Finds the most relevant number in a string."""
  # If model uses the answer delimiter, then select the first number following
  # that format.
  if answer_delimiter in x:
    answer = x.split(answer_delimiter)[-1]
    numbers = find_numbers(answer)
    if numbers:
      return numbers[0]

  # In general, select the last number in the string.
  numbers = find_numbers(x)
  if numbers:
    return numbers[-1]
  return ''


def maybe_remove_comma(x: str) -> str:
  # Example: 5,600 -> 5600
  return x.replace(',', '')
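A few sample inputs make the behavior of these helpers easier to see. The sketch below restates the helpers compactly so it runs on its own; the sample strings are made up for illustration.

```python
# Sketch: how the answer-extraction helpers behave on sample strings.
# The helpers are restated compactly here so the example is self-contained.
import re

NUMBER_RE = re.compile(r'-?[\d,]*\.?\d+')


def find_numbers(x: str) -> list[str]:
    return NUMBER_RE.findall(x)


def find_number(x: str, answer_delimiter: str = 'The answer is') -> str:
    # Prefer the first number after the last occurrence of the delimiter.
    if answer_delimiter in x:
        numbers = find_numbers(x.split(answer_delimiter)[-1])
        if numbers:
            return numbers[0]
    # Otherwise fall back to the last number anywhere in the string.
    numbers = find_numbers(x)
    return numbers[-1] if numbers else ''


def maybe_remove_comma(x: str) -> str:
    return x.replace(',', '')


with_delimiter = find_number("So 74 - 35 = 39 chocolates. The answer is 39.")
with_comma = maybe_remove_comma(find_number("They spent $5,600 in total"))
no_spaces = find_numbers("21-15=6")
nothing = find_number("No digits here")

print(with_delimiter, with_comma, no_spaces, repr(nothing))
```

Note the third case: a hyphen written directly before digits is read as a minus sign, so `21-15` yields `-15` rather than `15`.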

In [None]:




PREAMBLE = """As an expert problem solver solve step by step the following mathematical questions."""

# The default gsm8k prompt from the CoT paper
# https://arxiv.org/pdf/2201.11903.pdf page 35.

PROMPT = """Q: There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there will be 21 trees. How many trees did the grove workers plant today?
A: We start with 15 trees. Later we have 21 trees. The difference must be the number of trees they planted. So, they must have planted 21 - 15 = 6 trees. The answer is 6.
Q: If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?
A: There are 3 cars in the parking lot already. 2 more arrive. Now there are 3 + 2 = 5 cars. The answer is 5.
Q: Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in total?
A: Leah had 32 chocolates and Leah's sister had 42. That means there were originally 32 + 42 = 74 chocolates. 35 have been eaten. So in total they still have 74 - 35 = 39 chocolates. The answer is 39.
Q: Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?
A: Jason had 20 lollipops. Since he only has 12 now, he must have given the rest to Denny. The number of lollipops he has given to Denny must have been 20 - 12 = 8 lollipops. The answer is 8.
Q: Shawn has five toys. For Christmas, he got two toys each from his mom and dad. How many toys does he have now?
A: He has 5 toys. He got 2 from mom, so after that he has 5 + 2 = 7 toys. Then he got 2 more from dad, so in total he has 7 + 2 = 9 toys. The answer is 9.
Q: There were nine computers in the server room. Five more computers were installed each day, from monday to thursday. How many computers are now in the server room?
A: There are 4 days from monday to thursday. 5 computers were added each day. That means in total 4 * 5 = 20 computers were added. There were 9 computers in the beginning, so now there are 9 + 20 = 29 computers. The answer is 29.
Q: Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?
A: Michael initially had 58 balls. He lost 23 on Tuesday, so after that he has 58 - 23 = 35 balls. On Wednesday he lost 2 more so now he has 35 - 2 = 33 balls. The answer is 33.
Q: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?
A: She bought 5 bagels for $3 each. This means she spent 5 * $3 = $15 on the bagels. She had $23 in beginning, so now she has $23 - $15 = $8. The answer is 8."""


# Extension of the default 8-shot prompt, page 35 in
# https://arxiv.org/pdf/2201.11903.pdf
# The extension is intended to improve performance on
# more complicated gsm8k examples.

EXTRA_3_SHOTS = """As an expert problem solver solve step by step the following mathematical questions.
Q: Tina makes $18.00 an hour.  If she works more than 8 hours per shift, she is eligible for overtime, which is paid by your hourly wage + 1/2 your hourly wage.  If she works 10 hours every day for 5 days, how much money does she make?
A: Here's how to calculate Tina's earnings:
**Regular Time:**
- Hours per shift: 8 hours
- Wage per hour: $18.00
- Regular pay per shift: 8 hours * $18.00/hour = $144.00
**Overtime:**
- Overtime hours per shift: 10 hours - 8 hours = 2 hours
- Overtime pay per hour: $18.00 + ($18.00 / 2) = $27.00
- Overtime pay per shift: 2 hours * $27.00/hour = $54.00
**Total per day:**
- Regular pay + overtime pay: $144.00/shift + $54.00/shift = $198.00/day
**Total for 5 days:**
- 5 days * $198.00/day = $990.00
**Therefore, Tina will make $990.00 in 5 days.** The answer is 990.
Q: Abigail is trying a new recipe for a cold drink. It uses 1/4 of a cup of iced tea and 1 and 1/4 of a cup of lemonade to make one drink. If she fills a pitcher with 18 total cups of this drink, how many cups of lemonade are in the pitcher?
A: ## Ambiguity in the Problem Statement:
There is one main ambiguity in the problem statement:
**Total volume vs. Number of servings:** The statement "18 total cups of this drink" could be interpreted in two ways:
  * **18 cups of the combined volume:** This would mean Abigail used a total of 18 cups of liquid, including both iced tea and lemonade.
  * **18 individual servings:** This would mean Abigail made 18 individual drinks, each containing 1/4 cup of iced tea and 1 1/4 cup of lemonade.
Let us assume the interpretation "18 cups of the combined volume".
## Solution assuming 18 cups of combined volume:
**Step 1: Find the proportion of lemonade in one drink:**
* Lemonade: 1 1/4 cups
* Iced tea: 1/4 cup
* Total: 1 1/4 + 1/4 = 1 1/2 cups
* Lemonade proportion: (1 1/4) / (1 1/2) = 5/6
**Step 2: Calculate the amount of lemonade in the pitcher:**
* Total volume: 18 cups
* Lemonade proportion: 5/6
* Volume of lemonade: 18 * (5/6) = 15 cups
Therefore, there are 15 cups of lemonade in the pitcher. The answer is 15.
Q: A deep-sea monster rises from the waters once every hundred years to feast on a ship and sate its hunger. Over three hundred years, it has consumed 847 people. Ships have been built larger over time, so each new ship has twice as many people as the last ship. How many people were on the ship the monster ate in the first hundred years?
A: Let us solve it using algebra. Let x be the number of people on the ship the monster ate in the first hundred years.
The number of people on the ship eaten in the second hundred years is 2x, and in the third hundred years is 4x.
Therefore, the total number of people eaten over three hundred years is x + 2x + 4x = 847.
Combining like terms, we get 7x = 847.
Dividing both sides by 7, we find x = 121.
Therefore, there were 121 people on the ship the monster ate in the first hundred years. The answer is 121."""
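Because few-shot exemplars teach the model by example, it is worth confirming that their arithmetic holds. The following sketch recomputes the three answers above with exact fractions; the variable names are illustrative only.

```python
# Sketch: re-deriving the answers of the three extra exemplars.
from fractions import Fraction

# Tina: 8 regular hours plus 2 overtime hours per day at 1.5x pay, for 5 days.
hourly = 18
regular_pay = 8 * hourly                       # per shift
overtime_rate = hourly + Fraction(hourly, 2)   # hourly wage + half of it
overtime_pay = 2 * overtime_rate               # per shift
tina_total = 5 * (regular_pay + overtime_pay)

# Abigail: lemonade share of one drink, applied to 18 cups of combined volume.
lemonade, iced_tea = Fraction(5, 4), Fraction(1, 4)
lemonade_share = lemonade / (lemonade + iced_tea)
lemonade_cups = 18 * lemonade_share

# Monster: x + 2x + 4x = 847, so x = 847 / 7.
first_ship = Fraction(847, 1 + 2 + 4)

print(tina_total, lemonade_cups, first_ship)
```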

In [None]:

# @title Python imports
import os
import re

import datasets

In [2]:
!git clone https://github.com/Qlalq/MBPP.git

!pip install openai==0.28


Cloning into 'MBPP'...
remote: Enumerating objects: 36, done.
remote: Counting objects: 100% (36/36), done.
remote: Compressing objects: 100% (25/25), done.
remote: Total 36 (delta 14), reused 31 (delta 9), pack-reused 0 (from 0)
Receiving objects: 100% (36/36), 119.22 KiB | 7.01 MiB/s, done.
Resolving deltas: 100% (14/14), done.
Collecting openai==0.28
  Downloading openai-0.28.0-py3-none-any.whl.metadata (13 kB)
Downloading openai-0.28.0-py3-none-any.whl (76 kB)
Installing collected packages: openai
  Attempting uninstall: openai
    Found existing installation: openai 1.54.4
    Uninstalling openai-1.54.4:
      Successfully uninstalled openai-1.54.4
Successfully installed openai-0.28.0


In [1]:
!pip install human_eval

Collecting human_eval
  Downloading human_eval-1.0.3-py3-none-any.whl.metadata (153 bytes)
Collecting fire (from human_eval)
  Downloading fire-0.7.0.tar.gz (87 kB)
  Preparing metadata (setup.py) ... done
Downloading human_eval-1.0.3-py3-none-any.whl (52 kB)
Building wheels for collected packages: fire
  Building wheel for fire (setup.py) ... done
  Created wheel for fire: filename=fire-0.7.0-py3-none-any.whl size=114249 sha256=44d7a2e4b7a1ed6f83c598b6b87b81af8084699f9d4a0f00056f10bdc1ce6810
  Stored in directory: /root/.cache/pip/wheels/19/39/2f/2d3cadc408a8804103f1c34ddd4b9f6a93497b11fa96fe738e
Successfully built fire
Installing collected packages: fire, human_eval
Successfully installed fire-0.7.0 human_eval-1.0.

In [3]:
cd /content/MBPP

/content/MBPP


In [None]:
!python /content/MBPP/MBPP_completion.py

Streaming output truncated to the last 5000 lines.
2. **generate_palindrome(s)**: This helper function generates a palindrome by mirroring the left half of the string `s` to the right half. If the length of `s` is odd, it also includes the middle character.
3. **Main function logic**:
   - Increment the input number `num` by 1.
   - Check if the incremented number is a palindrome. If it is, return it.
   - If not, generate the next possible palindrome by incrementing the left half (and the middle character if the length is odd) and mirroring it.
   - If the generated palindrome is greater than the current number, return it. Otherwise, increment the number and repeat the process.

This approach ensures that we find the next smallest palindrome efficiently.
06:03:16 task_id: 101
Certainly! To find the kth element in a given array using 1-based indexing, we need to ensure that we correctly handle the indexing conversion from 1-based to 0-based, as Python arrays are typically 0-based.

Here's the fu

In [None]:


%%time
all_responses = {}
short_responses = {}
idx = 0
correct = 0

TEMPLATE = """
Q: {question}
A:"""

for task_id, problem in enumerate(gsm8k_test):

  # Skip problems that already have a stored response
  if task_id in all_responses:
    continue

  print(f"task_id {task_id}")

  # Build the full prompt: preamble, 8 worked examples, then the new question
  user_content = (PREAMBLE + '\n\n' + PROMPT + '\n' +
                  TEMPLATE.format(question=problem['question']))

  response = get_commpletion(client, system_content, user_content)
  print(f"{response}\n")

  # The response is a plain string; keep only the text before any
  # follow-up question ('\nQ:') the model may have generated
  all_responses[task_id] = response.split('\nQ:')[0]

  short_responses[task_id] = maybe_remove_comma(find_number(all_responses[task_id]))
  print(f"Short answer: {short_responses[task_id]}")

  # Compare numerically when both values parse as floats;
  # otherwise fall back to comparing the extracted strings
  ground_truth = maybe_remove_comma(find_number(problem['answer']))
  try:
    correct += float(ground_truth) == float(short_responses[task_id])
  except ValueError:
    correct += ground_truth == maybe_remove_comma(
        find_number(short_responses[task_id]))
  print('-'*40)
  print(f"Ground truth answer {problem['answer']}")
  print(f"Short ground truth answer {find_number(problem['answer'])}")
  print(f"Correct: {correct} out of {idx+1}")
  print("="*40)
  idx += 1


Streaming output truncated to the last 5000 lines.
### Step 1: Determine the number of groups
- There are 10 students in each grade who get to try the escape room.
- Only 8 students can try the escape room at a time.

To find out how many groups of 8 students can be formed from 10 students, we use the ceiling function (since we can't have a fraction of a group):

\[
\text{Number of groups} = \left\lceil \frac{10}{8} \right\rceil = \left\lceil 1.25 \right\rceil = 2
\]

So, there will be 2 groups of 8 students each.

### Step 2: Calculate the total time for all groups
- Each group has 45 minutes to try the escape room.
- There are 2 groups.

\[
\text{Total time} = 2 \text{ groups} \times 45 \text{ minutes/group} = 90 \text{ minutes}
\]

### Conclusion
It will take 90 minutes for everyone to try the escape room. The answer is 90.

To solve this problem, we need to determine how many groups of 8 students can be formed from the 10 students in each grade and then calculate the total time required for 

#### Example 1: Chain of Thought

#### Example 2: Chain of Thought

In [None]:
user_content = """At the recent holiday party, I got a coupon to join a health club
for wellness. If I joined before December 31, 2023, I get a 35% discount on monthly subscription fees
of $55.00 for one year, and the first three months' fee payments of $55.00 will be waived.

The monthly payment for the health club subscription is $55.00.

If I joined in January 2024, I get a 25% discount, and only one month's fee is waived.

Compute the best scenarios for saving costs for a one year subscription.

Let's think through this step by step. Solve each step and explain how you arrived
at your answer.
"""
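As a cross-check on whatever the model returns, the two offers can be computed directly. This is a sketch under one stated assumption: waived months cost nothing and every remaining month is charged at the discounted rate. The function name `yearly_cost` is hypothetical.

```python
# Sketch: comparing the two membership offers under a stated assumption.
# Assumption: waived months are free; the other months pay the discounted fee.
from decimal import Decimal

MONTHLY_FEE = Decimal("55.00")


def yearly_cost(discount: Decimal, waived_months: int, months: int = 12) -> Decimal:
    """Total paid over `months` when `waived_months` of them are free."""
    discounted_fee = MONTHLY_FEE * (1 - discount)
    return discounted_fee * (months - waived_months)


join_by_dec_2023 = yearly_cost(Decimal("0.35"), waived_months=3)
join_in_jan_2024 = yearly_cost(Decimal("0.25"), waived_months=1)

print(f"Join by Dec 31, 2023: ${join_by_dec_2023:.2f}")
print(f"Join in Jan 2024:     ${join_in_jan_2024:.2f}")
print(f"Difference:           ${join_in_jan_2024 - join_by_dec_2023:.2f}")
```

The prompt is ambiguous on this point. Reading the waiver instead as the full $55.00 fee subtracted from the discounted annual total gives different amounts ($264.00 versus $440.00) but the same ranking: joining before December 31 is cheaper.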

In [None]:
response = get_commpletion(client,system_content, user_content)
print(f"{response}\n")

Sure, let's break down the problem step by step to determine the best scenario for saving costs on a one-year health club subscription.

### Step 1: Calculate the Total Cost for Joining Before December 31, 2023

1. **Monthly Subscription Fee**: $55.00
2. **Discount**: 35%
3. **Waived Fees**: First three months' fees ($55.00 each)

#### Calculation:
- **Monthly Fee After Discount**: 
  \[
  55.00 \times (1 - 0.35) = 55.00 \times 0.65 = 35.75
  \]
- **Total Monthly Fees for 12 Months**:
  \[
  35.75 \times 12 = 429.00
  \]
- **Waived Fees for First Three Months**:
  \[
  55.00 \times 3 = 165.00
  \]
- **Total Cost for One Year**:
  \[
  429.00 - 165.00 = 264.00
  \]

### Step 2: Calculate the Total Cost for Joining in January 2024

1. **Monthly Subscription Fee**: $55.00
2. **Discount**: 25%
3. **Waived Fees**: First month's fee ($55.00)

#### Calculation:
- **Monthly Fee After Discount**:
  \[
  55.00 \times (1 - 0.25) = 55.00 \times 0.75 = 41.25
  \]
- **Total Monthly Fees for 12 Month

#### Example 3: Chain of Thought

In [None]:
!pip install datasets

In [None]:
from datasets import load_dataset

# Load datasets
gsm8k = load_dataset("gsm8k","main", split="test[:100]")  # Subset for quick testing
mbpp = load_dataset("mbpp", split="test[:100]")


README.md:   0%|          | 0.00/9.06k [00:00<?, ?B/s]

train-00000-of-00001.parquet:   0%|          | 0.00/87.2k [00:00<?, ?B/s]

test-00000-of-00001.parquet:   0%|          | 0.00/116k [00:00<?, ?B/s]

validation-00000-of-00001.parquet:   0%|          | 0.00/25.1k [00:00<?, ?B/s]

prompt-00000-of-00001.parquet:   0%|          | 0.00/7.88k [00:00<?, ?B/s]

Generating train split:   0%|          | 0/374 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/500 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/90 [00:00<?, ? examples/s]

Generating prompt split:   0%|          | 0/10 [00:00<?, ? examples/s]

In [None]:
def evaluate(dataset):
    """Return the fraction of items whose reference answer appears in the reply."""
    correct = 0
    total = len(dataset)

    for item in dataset:
        problem = item["question"] if "question" in item else item["text"]
        ground_truth = item["answer"] if "answer" in item else item["test"]

        # Generate an answer with the completion helper
        generated_answer = get_commpletion(client, system_content, problem)

        # Plain substring match. For GSM8K the `answer` field holds the full
        # worked solution ending in `#### <number>`, so this check is very
        # strict: a reply rarely contains that text verbatim.
        if str(ground_truth).strip() in generated_answer.strip():
            correct += 1

    accuracy = correct / total
    return accuracy


# Evaluate on the GSM8K subset
gsm8k_accuracy = evaluate(gsm8k)


In [None]:
print(gsm8k_accuracy)

0.0
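A score of 0.0 is what a verbatim substring match tends to produce on GSM8K, because each reference `answer` is a full worked solution that ends with `#### <number>`. One possible alternative, sketched below, compares only the final numbers. The helper names and both sample strings are hypothetical and are not taken from the dataset.

```python
# Sketch: scoring by final number instead of verbatim substring match.
# Helper names and sample strings are hypothetical.
import re


def gsm8k_target(answer_field: str) -> str:
    """Return the final number after the '####' marker, without commas."""
    return answer_field.split("####")[-1].strip().replace(",", "")


def last_number(text: str) -> str:
    """Return the last number found in free text, without commas."""
    numbers = re.findall(r'-?[\d,]*\.?\d+', text)
    return numbers[-1].replace(",", "") if numbers else ""


def is_correct(answer_field: str, generated: str) -> bool:
    """Compare the reference and predicted final numbers numerically."""
    try:
        return float(gsm8k_target(answer_field)) == float(last_number(generated))
    except ValueError:
        return False


reference = "She has 3 + 2 = 5 apples.\n#### 5"
reply = "There are 3 apples and 2 more arrive, so 3 + 2 = 5. The answer is 5."

print("substring match:", reference.strip() in reply)
print("numeric match:  ", is_correct(reference, reply))
```

Taking the last number in the reply is a heuristic; it can misfire when the model adds a remark containing another number after its answer.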


In [None]:
user_content = """Three girls, Emmy, Kasima, and Lina, had a fresh lemon juice stand
at the local community fair.

Emmy had 45 medium glasses of lemon juice. She sold 43 glasses at $1.25 per glass.

Kasima had 50 small glasses, and she sold all of them at $1.15 per glass.

And Lina had 25 large glasses and she sold only 11 glasses but at $1.75 per glass.

Of the three girls, who made the most money, and how many glasses did each girl sell?
How many unsold glasses were left for each girl?

And finally, looking at all the numbers, which girl benefited most? That is, which
girl cleared her stock?

Let's think through this step by step. Solve each step and explain how you arrived
at your answer"""
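The expected figures for this problem can be computed directly, which gives a reference to compare the model's reasoning against. The sketch below uses only the numbers stated in the prompt; the dictionary layout and variable names are illustrative.

```python
# Sketch: computing the lemon juice stand figures from the prompt's numbers.
from decimal import Decimal

stalls = {
    "Emmy": {"stock": 45, "sold": 43, "price": Decimal("1.25")},
    "Kasima": {"stock": 50, "sold": 50, "price": Decimal("1.15")},
    "Lina": {"stock": 25, "sold": 11, "price": Decimal("1.75")},
}

# Revenue is glasses sold times price per glass.
revenue = {name: s["sold"] * s["price"] for name, s in stalls.items()}
# Unsold glasses are the starting stock minus glasses sold.
unsold = {name: s["stock"] - s["sold"] for name, s in stalls.items()}

top_earner = max(revenue, key=revenue.get)
cleared_stock = [name for name, left in unsold.items() if left == 0]

for name in stalls:
    print(f"{name}: revenue ${revenue[name]:.2f}, unsold {unsold[name]}")
print("Most money:", top_earner)
print("Cleared stock:", cleared_stock)
```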

In [None]:
response = get_commpletion(client, system_content, user_content)
print(f"{response}\n")

### Step 1: Calculate the total earnings for each girl

**Emmy:**
- Emmy had 45 medium glasses.
- She sold 43 glasses at $1.25 per glass.
- Total earnings for Emmy: \( 43 \times 1.25 = 53.75 \) dollars.

**Kasima:**
- Kasima had 50 small glasses.
- She sold all 50 glasses at $1.15 per glass.
- Total earnings for Kasima: \( 50 \times 1.15 = 57.50 \) dollars.

**Lina:**
- Lina had 25 large glasses.
- She sold 11 glasses at $1.75 per glass.
- Total earnings for Lina: \( 11 \times 1.75 = 19.25 \) dollars.

### Step 2: Determine who made the most money

- Emmy made $53.75.
- Kasima made $57.50.
- Lina made $19.25.

**Conclusion:** Kasima made the most money.

### Step 3: Calculate the number of glasses sold by each girl

- Emmy sold 43 glasses.
- Kasima sold 50 glasses.
- Lina sold 11 glasses.

### Step 4: Calculate the number of unsold glasses for each girl

- Emmy had 45 glasses and sold 43, so she had \( 45 - 43 = 2 \) unsold glasses.
- Kasima had 50 glasses and sold all 50, so she had \

## All this is amazing! 😜 Feel the wizardry in Chain of Thought reasoning 🧙‍♀️