# **Chapter 7. Reasoning and Chain-of-Thought Prompting**

In this chapter, we’ll explore Chain-of-Thought (CoT) prompting, whose main principle is to “**think first, answer later.**” This approach encourages Solar to break a problem down logically and solve it in steps rather than jumping directly to a conclusion. CoT is especially effective for multi-step math problems and complex reasoning tasks because it guides the model to consider each part of the problem individually, building up to a coherent solution.

**Set up**

In [33]:
from openai import OpenAI

# Retrieve the UPSTAGE_API_KEY variable from the IPython store
%store -r UPSTAGE_API_KEY

try:
    if UPSTAGE_API_KEY:
        print("Success!")
except NameError as ne:
    print(f"UPSTAGE_API_KEY was not found in the IPython store: {ne}")
    print("Please enter your API key.")
    UPSTAGE_API_KEY = input("UPSTAGE_API_KEY =")

# Alternatively, set your API key directly:
# UPSTAGE_API_KEY = " "  # <- Insert your API key here.

client = OpenAI(
    api_key=UPSTAGE_API_KEY,
    base_url="https://api.upstage.ai/v1/solar"
)

config_model = {
    "model": "solar-pro",
    "max_tokens": 1997,
    "temperature": 0.0,
    "top_p": 0.9,
}

def get_completion(messages, system_prompt="", config=config_model):
    try:
        if system_prompt:
            messages = [{"role": "system", "content": system_prompt}] + messages

        message = client.chat.completions.create(messages=messages, **config)
        return message.choices[0].message.content
    
    except Exception as e:
        print(f"Error during API call: {e}")
        return None

Success!


---

## **7.1 Chain-of-Thought (CoT) Prompt Engineering**

Many studies have demonstrated the effectiveness of CoT prompting in large language models. A representative study is "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" by [Jason Wei and colleagues (2022)](https://arxiv.org/pdf/2201.11903). The paper shows that generating a chain of thought, a sequence of intermediate reasoning steps, significantly improves the ability of large language models to perform complex reasoning tasks.

**Reference**

[Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., ... & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems, 35, 24824-24837.](https://arxiv.org/pdf/2201.11903)

The key features of CoT prompt engineering are as follows:

- **Step-by-Step Reasoning**: CoT prompts encourage models to articulate their reasoning process, which improves their accuracy and clarity when solving complex problems.

- **Enhanced Transparency**: Since the model outputs intermediate steps, users can trace the reasoning behind the solution, making it easier to debug or understand the model’s behavior.
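
To make the step-by-step idea concrete, here is a minimal sketch of how a few-shot CoT prompt could be assembled: each demonstration pairs a question with its written-out reasoning before the final answer. The helper name `build_few_shot_cot_prompt`, the `Q:`/`A:` layout, and the sample questions are illustrative assumptions, not part of the Solar API.

```python
def build_few_shot_cot_prompt(examples, question):
    """Assemble a few-shot CoT prompt (hypothetical helper for illustration).

    `examples` is a list of (question, reasoning, answer) triples. The
    reasoning is placed before the answer so the model imitates the pattern.
    """
    blocks = []
    for ex_question, ex_reasoning, ex_answer in examples:
        blocks.append(
            f"Q: {ex_question}\n"
            f"A: {ex_reasoning} Therefore, the answer is {ex_answer}."
        )
    # The final block leaves the answer open for the model to complete.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Made-up demonstration data, used only to show the shape of the prompt.
examples = [
    (
        "Tom has 3 boxes with 4 pens each. How many pens does he have?",
        "Each box holds 4 pens and there are 3 boxes, so 3 * 4 = 12.",
        "12",
    ),
]
prompt = build_few_shot_cot_prompt(
    examples, "A shelf has 5 rows of 6 books. How many books are there?"
)
messages = [{"role": "user", "content": prompt}]
print(prompt)
```

The resulting `messages` list has the same shape as the ones passed to `get_completion` in this chapter.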

Solar performs well on complex mathematical problems when prompted with the CoT approach. The following sections show how to solve a **prime number** problem and a **probability** problem using CoT.

---

## **7.2 Math Problem Sets - Chain of Thought**

Here are some examples of math problem sets where CoT can be effectively applied:

1. [The Prime Numbers](#prime)

2. [The Probability of A Dice](#prob)

<a id='prime'></a>
**Problem #1: The Prime Numbers**

> Find a counterexample with a proof to show that the following statement is false: 'Every positive integer can be expressed as p + a^2, where p is either a prime number or 1, and a >= 0'.
>
> **Answer**: $n = 25$

- **CoT Prompt**: $n = 25$ (O)

In [76]:
message = [
    {
        "role": "user",
        "content": """Find a counterexample with a proof to show that the following statement is false: 'Every positive integer can be expressed as p + a^2, where p is either a prime number or 1, and a >= 0'.
#Instruction
1. Let n >= 10, since the combinations of p and a^2 satisfy n = p + a^2 for all n < 10. As n increases, the gaps between a^2 values where a^ 2 < n widen, and the density of p < n decreases.
2. Consider n as the square number first, such as 9, 16, 25, and so on to quickly find a counterexample.

##Caution
- If any combination of p and a^2 expresses n, then n is not a counterexample.
- p MUST BE A PRIME NUMBER OR 1. p can be 1, 2, 3, 5, 7, 11, and so on.
- a^2 MUST BE A SQUARE NUMBER. a^2 can be 0, 1, 4, 9, 16, 25, and so on.

###Trying
n=16
For n=16, we check the possible combinations of p+a^2=16:
p=1 and a^2=15: Not possible because 15 is not a perfect square.
p=2 and a^2=14: Not possible because 14 is not a perfect square.
p=3 and a^2=13: Not possible because 13 is not a perfect square.
p=5 and a^2=11: Not possible because 11 is not a perfect square.
p=7 and a^2=9: Possible, since a=3.

Since n=16 can be written as 16=7+9, it is not a counterexample.
"""
    }
]

response = get_completion(messages=message)
print(response)

n=25
For n=25, we check the possible combinations of p+a^2=25:
p=1 and a^2=24: Not possible because 24 is not a perfect square.
p=2 and a^2=23: Not possible because 23 is not a perfect square.
p=3 and a^2=22: Not possible because 22 is not a perfect square.
p=5 and a^2=20: Not possible because 20 is not a perfect square.
p=7 and a^2=18: Not possible because 18 is not a perfect square.
p=11 and a^2=14: Not possible because 14 is not a perfect square.
p=13 and a^2=12: Not possible because 12 is not a perfect square.
p=17 and a^2=8: Not possible because 8 is not a perfect square.
p=19 and a^2=6: Not possible because 6 is not a perfect square.
p=23 and a^2=2: Possible, since a=√2. However, a must be a non-negative integer, so this is not a valid combination.

Since no combination of p and a^2 satisfies n=25, it is a counterexample. Therefore, the statement 'Every positive integer can be expressed as p + a^2, where p is either a prime number or 1, and a >= 0' is false.


- **Zero-Shot Prompt**: $n = 15$ (X)

    > "Consider the positive integer n = 15. We want to find integers p and a such that n = p + a^2, where p is either a prime number or 1, and a >= 0.
    > 
    > First, let's try to find a prime number p. The prime numbers less than 15 are 2, 3, 5, 7, 11, and 13. However, none of these primes, when subtracted from 15, result in a perfect square:
    > 
    > - 15 - 2 = 13 (not a perfect square)
    > - 15 - 3 = 12 (not a perfect square)
    > - 15 - 5 = 10 (not a perfect square)
    > - 15 - 7 = 8 (not a perfect square)
    > - 15 - 11 = 4 (a perfect square, but 11 is not a prime number less than 15)
    > - 15 - 13 = 2 (not a perfect square)
    >
    > Next, let's try p = 1. In this case, we have n = 1 + a^2, which implies a^2 = 14. However, there is no integer a such that a^2 = 14, as the square of any integer is either a multiple of 4 or one more than a multiple of 4.
    > 
    > Therefore, we have shown that the positive integer 15 cannot be expressed as p + a^2, where p is either a prime number or 1, and a >= 0. This counterexample proves that the given statement is false."
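
Both answers can be checked independently of the model with a short brute-force search. The sketch below is a verification aid rather than part of the prompt; the function names are chosen here for illustration. It lists every way to write `n` as `p + a^2` under the problem's rules.

```python
import math


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def representations(n):
    """Return all (p, a) with n == p + a*a, where p is prime or 1 and a >= 0."""
    found = []
    a = 0
    while a * a < n:  # p = n - a*a must stay positive
        p = n - a * a
        if p == 1 or is_prime(p):
            found.append((p, a))
        a += 1
    return found


print(representations(15))  # [(11, 2)], so 15 is not a counterexample
print(representations(25))  # [], so 25 is a counterexample

# First positive integer below 100 with no valid representation.
first_counterexample = next(n for n in range(1, 100) if not representations(n))
print(first_counterexample)  # 25
```

The check shows where the zero-shot response went wrong: 15 = 11 + 2^2, and 11 is prime.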

---

<a id='prob'></a>
**Problem #2: The Probability of A Dice**

> Suppose we have a die on which each odd number is $\frac{5}{2}$ times as likely as each even number (numbers of the same parity are equally likely).  
> Find the probability that the die shows 6.
> 
> **Answer**: The probability is $\frac{2}{21}$.

- **CoT Prompt**:  $\frac{2}{21}$ (O)

In [68]:
message = [
    {
        "role": "user",
        "content": """
        Suppose we have a dice that the probability of odd numbers is 4/3 times the probability of even numbers.
        Find the probability that dice shows 6.
        
        A: Let p(odd): p(1) = p(3) = p(5), p(even): p(2) = p(4) = p(6),
        -> 3p(odd) + 3p(even) = 1
        -> 3*(4/3p(even)) + 3p(even) (because: p(odd) = 4/3p(even), 3*4/3 = 12/3 = 4),
        -> 4p(even) + 3p(even) = 1,
        -> 7p(even) = 1,
        -> p(even) = 1/7,
        -> p(even) = p(6) = 1/7,

        Therefore, p(6) = 1/7.
        
        Suppose we have a dice that the probability of odd numbers is 5/6 times the probability of even numbers.
        Find the probability that dice shows 6.
        A: Let p(odd): p(1) = p(3) = p(5), p(even): p(2) = p(4) = p(6),
        -> 3p(odd) + 3p(even) = 1
        -> 3*(5/6p(even)) + 3p(even) (because: p(odd) = 5/6p(even)),
        -> 5/2p(even) + 3p(even) = 1 (because: 3*5/6 = 15/6 = 5/2),
        -> 11/2p(even) = 1 (because: 3 = 6/2),
        -> p(even) = 2/11,
        -> p(even) = p(6) = 2/11,

        Therefore, p(6) = 2/11.
        
        Suppose we have a dice that the probability of odd numbers is 7/4 times the probability of even numbers.
        Find the probability that dice shows 6.
        A: Let p(odd): p(1) = p(3) = p(5), p(even): p(2) = p(4) = p(6),
        -> 3p(odd) + 3p(even) = 1
        -> 3*(7/4p(even)) + 3p(even) (because: p(odd) = 7/4p(even)),
        -> 21/4p(even) + 3p(even) = 1 (because: 3*7/4 = 21/4),
        -> 33/4p(even) = 1 (because: 3 = 12/4),
        -> p(even) = 4/33,
        -> p(even) = p(6) = 4/33,

        Therefore, p(6) = 4/33.

        -----
        Suppose we have a dice that the probability of odd numbers is 5/2 times the probability of even numbers.
        Find the probability that dice shows 6.
        A: 
        """
    }
]

response = get_completion(messages=message)
print(response)


        Let p(odd): p(1) = p(3) = p(5), p(even): p(2) = p(4) = p(6),
        -> 3p(odd) + 3p(even) = 1
        -> 3*(5/2p(even)) + 3p(even) (because: p(odd) = 5/2p(even)),
        -> 15/2p(even) + 3p(even) = 1 (because: 3*5/2 = 15/2),
        -> 21/2p(even) = 1 (because: 3 = 6/2),
        -> p(even) = 2/21,
        -> p(even) = p(6) = 2/21,

        Therefore, p(6) = 2/21.


- **Zero-Shot Prompt**: $\frac{2}{7}$ (X)

> "Let's denote the probability of rolling an even number as P(even) and the probability of rolling an odd number as P(odd). According to the problem, P(odd) = 5/2 * P(even).
> 
> Since there are only even and odd numbers on a dice, the sum of their probabilities should be equal to 1. So, we can write:
> 
> P(even) + P(odd) = 1
> 
> Substituting the given relationship between P(odd) and P(even), we get:
> 
> P(even) + 5/2 * P(even) = 1
> 
> Combining the terms, we have:
> 
> 7/2 * P(even) = 1
> 
> Now, solving for P(even), we get:
> 
> P(even) = 2/7
> 
> Since there are three even numbers (2, 4, and 6) on a dice, the probability of rolling any even number is 3 * P(even). Therefore, the probability of rolling a 6 is:
> 
> P(6) = 3 * P(even) = 3 * 2/7 = 6/7
> 
> However, this result is not possible because the probability of an event cannot be greater than 1. It seems there was a mistake in the calculation. Let's correct it.
> 
> The probability of rolling a 6 is actually:
> 
> P(6) = P(even) = 2/7
> 
> So, the probability that the dice shows 6 is 2/7."
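
The arithmetic in the few-shot examples and in the final answer can be confirmed with exact fractions. This is a small sketch under the same assumption the CoT prompt makes (three equally likely odd faces, three equally likely even faces); `prob_of_six` is a name introduced here for illustration.

```python
from fractions import Fraction


def prob_of_six(ratio):
    """P(6) when each odd face is `ratio` times as likely as each even face.

    With q = p(even face): 3 * ratio * q + 3 * q = 1, so q = 1 / (3 * ratio + 3).
    """
    ratio = Fraction(ratio)
    return 1 / (3 * ratio + 3)


# The three demonstrations in the prompt, then the target question.
for ratio in (Fraction(4, 3), Fraction(5, 6), Fraction(7, 4), Fraction(5, 2)):
    print(ratio, "->", prob_of_six(ratio))
```

The last line gives 2/21, matching the CoT answer. The zero-shot response reached 2/7 because it solved for the total probability of all even faces and never divided by 3.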

---

## **7.3 Zero-Shot Chain-of-Thought Prompt Engineering**

Zero-shot Chain-of-Thought prompt engineering enhances a model's reasoning without the worked examples or solution guidance used in the CoT prompts above.

- The paper [*Large Language Models are Zero-Shot Reasoners* (2022)](https://arxiv.org/abs/2205.11916) proposed adding the phrase "**Let's think step by step.**" to the prompt. This single phrase guides the model to reason step by step, which helps it solve complex problems. The two problems in the next section are solved with this approach.
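
As a minimal sketch, the trigger phrase can be appended to any question with a small helper. The name `with_zero_shot_cot` and the choice to place the phrase at the end of the user message are assumptions made for this illustration; the paper's own pipeline is more involved.

```python
ZERO_SHOT_COT_TRIGGER = "Let's think step by step."


def with_zero_shot_cot(question, trigger=ZERO_SHOT_COT_TRIGGER):
    """Return a chat message list whose user turn ends with the CoT trigger.

    Hypothetical helper: it only builds the messages and does not call the API.
    """
    content = f"{question.strip()}\n\n{trigger}"
    return [{"role": "user", "content": content}]


messages = with_zero_shot_cot(
    "Solve the differential equation: dx + e^(2/3x)dy = 0."
)
print(messages[0]["content"])
```

The returned list can be passed as `messages` to the `get_completion` function defined in the Set up section.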

**Reference**

[Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). *Large language models are zero-shot reasoners*. Advances in neural information processing systems, 35, 22199-22213.](https://arxiv.org/abs/2205.11916)

---

## **7.4 Math Problem Sets - Zero Shot Chain of Thought**


3. [The Differential Equation](#diff) 

4. [The Cryptarithms](#abc)

The following examples show how to solve a **differential equation** problem and a **cryptarithm** problem using Zero-Shot CoT.

<a id='diff'></a>
**Problem #3: The Differential Equation**

> Solve the following equation: $dx + e^{\frac{2}{3}x}dy = 0$
>
> **Answer**: $y = \frac{3}{2} e^{-\frac{2}{3}x} + C$

- **Zero-Shot CoT Prompt**: $y = \frac{3}{2} e^{-\frac{2}{3}x} + C$ (O)

In [74]:
message = [
    {
        "role": "user",
        "content": """Solve the differential equation: dx + e^(2/3x)dy = 0.
        
        ## Instruction
        dx + x dy = 0 -> x dy = - dx -> dy = - 1/x dx.
        
        Let's think step by step."""
    }
]

response = get_completion(messages=message)
print(response)

Integrate both sides: ∫dy = ∫(-1/x)dx.

This gives: y = -∫(1/x)dx.

The integral of 1/x is ln|x|, so: y = -ln|x| + C.

Now, let's apply the same steps to the given differential equation: dx + e^(2/3x)dy = 0.

First, isolate dy: dy = -dx/e^(2/3x).

Next, integrate both sides: ∫dy = ∫(-1/e^(2/3x))dx.

This gives: y = -∫(1/e^(2/3x))dx.

The integral of 1/e^(2/3x) is -3/2 * e^(-2/3x), so: y = (3/2) * e^(-2/3x) + C.


- **Zero-Shot Prompt**: $y = -\frac{3}{2}e^{-\frac{2}{3}x} + C$ (X)

    > "To solve the given differential equation, we can first rearrange it to the form:
    > 
    > dx/dy = -e^(2/3x)
    > 
    > Now, we can separate the variables:
    > 
    > dy/dx = -e^(-2/3x)
    > 
    > Next, we integrate both sides:
    > 
    > ∫dy = ∫-e^(-2/3x)dx
    > 
    > The left side integrates to y, and the right side integrates to:
    > 
    > -3/2 * e^(-2/3x) + C
    > 
    > So, the solution to the differential equation is:
    > 
    > y = -3/2 * e^(-2/3x) + C"
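
The two candidate solutions differ only in sign, so a quick numerical check tells them apart. The sketch below reads the exponent as $\frac{2}{3}x$, divides the equation by $dx$ to get $1 + e^{\frac{2}{3}x}\,y'(x) = 0$, and estimates $y'$ with a central difference; the function names and the step size `h` are choices made for this illustration, and the constant $C$ is set to 0.

```python
import math


def residual(y, x, h=1e-5):
    """Value of 1 + e^(2x/3) * y'(x), which is 0 for a true solution.

    y'(x) is approximated by a central difference with step h.
    """
    dy_dx = (y(x + h) - y(x - h)) / (2 * h)
    return 1 + math.exp(2 * x / 3) * dy_dx


def y_cot(x):
    """Zero-Shot CoT answer with C = 0."""
    return 1.5 * math.exp(-2 * x / 3)


def y_zero_shot(x):
    """Zero-shot answer with C = 0."""
    return -1.5 * math.exp(-2 * x / 3)


for x in (-1.0, 0.0, 2.0):
    print(f"x={x:+.1f}  CoT residual={residual(y_cot, x):.2e}  "
          f"zero-shot residual={residual(y_zero_shot, x):.2e}")
```

The CoT residual stays at numerical noise, while the zero-shot residual stays near 2, which confirms the sign error in the zero-shot response.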

---

<a id='abc'></a>
**Problem #4: The Cryptarithms**

> Solve for distinct single-digit natural numbers A, B, and C in the equation: $ABC + BAC = CACA$.  
> 
> **Answer**: $A = 2, B = 9, C = 1$

- **Zero-Shot CoT Prompt**: $A = 2, B = 9, C = 1$ (O)

In [71]:
message = [
    {
        "role": "user",
        "content": """Solve for distinct single-digit natural numbers A, B, and C in the equation: ABC+BAC=CACA based on the following instructions.
<Constraints>
- ABC, BAC are 3-digit numbers: ABC = 100*A + 10*B + 1*C, BAC = 100*B + 10*A + 1*C.
- ABC + BAC = 100*A + 10*B + 1*C + 100*B + 10*A + 1*C = 110*A + 110*B + 2*C.
- CACA is a 4-digit number: CACA = 1000*C + 100*A + 10*C + 1*A = 1010*C + 100*A + 1*A.
- 3-digit numbers cannot be greater than 1000.
- A, B, and C MUST be different natural numbers.
- 1000 < CACA < the maximum of ABC+BAC.
</Constraints>

<STEPs>
1. Estimate the range of CACA to satisfy the Constraints.
2. Identify the first digit, C, of CACA through this range of CACA. It MUST be one number.
3. Calculate units places of LHS using the result of step2. Then, estimate units place of RHS, A.
4. Determine the value of B given the values of A and C in the equation ABC + BAC = CACA.
1) LHS: 110*B + 110*A + 2*C.
2) RHS: 1010*C + 101*A.
3) Express the left side of the equation in terms of B : 110B = 1010C + 101A - 110A - 2C.
</STEPs>

Think step by step."""
    }
]

response = get_completion(messages=message)
print(response)

1. Estimate the range of CACA: Since ABC and BAC are 3-digit numbers, their sum cannot exceed 2000. Therefore, CACA must be less than 2000. Also, since CACA is a 4-digit number, it must be greater than 1000. So, the range of CACA is 1001 to 1999.

2. Identify the first digit, C, of CACA: Since CACA is a 4-digit number, C must be a single-digit natural number. Given the range of CACA, C can be any single-digit natural number from 1 to 9. However, since CACA is a 4-digit number, C cannot be 0. Also, since CACA is less than 2000, C cannot be greater than 1. Therefore, C must be 1.

3. Calculate units places of LHS using the result of step2: Since C = 1, the units place of LHS (ABC + BAC) is 2*C = 2. Now, let's estimate the units place of RHS, A. Since CACA = 1010*C + 101*A, and C = 1, the units place of RHS is 1*A. Therefore, A must be 2 to satisfy the equation.

4. Determine the value of B given the values of A and C in the equation ABC + BAC = CACA: Now that we know A = 2 and C = 1, we 

- **Zero-Shot Prompt**: $A = 2, B = 1, C = 1$ (X)

    > "Let's solve this step by step.
    > 
    > First, let's write the equation in a more understandable way:
    >
    > 100A + 10B + C + 100B + 10A + C = 1000C + 100A + 10C + A
    >
    > Now, let's simplify and rearrange the equation:
    > 
    > 110A + 110B + 2C = 1000C + 100A + 10C + A
    > 
    > Subtract 110A from both sides:
    >
    > 110B + 2C = 1000C + 10C + A
    >
    > Subtract 2C from both sides:
    > 
    > 110B = 998C + 10C + A
    > 
    > Subtract 10C from both sides:
    > 
    > 110B = 988C + A
    >
    > Now, we know that A, B, and C are single-digit natural numbers, so the left side of the equation must be a multiple of 110. The only single-digit natural number that can be added to a multiple of 988 to get a multiple of 110 is 2. So, A = 2.
    > 
    > Substitute A = 2 into the equation:
    >
    > 110B = 988C + 2
    >
    > Now, we need to find B and C such that 110B is a multiple of 110 and 988C + 2 is a single-digit number. The only possible value for B is 1, and the only possible value for C is 1.
    > 
    > So, the solution is A = 2, B = 1, and C = 1."

*When the same problem is solved with a plain zero-shot prompt instead of CoT, Solar fails to reach the correct answer, as seen above.*
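
Because the search space is tiny, the reference answer can also be confirmed by exhaustive search. This sketch assumes "natural numbers" means the digits 1 through 9 and tries every ordered triple of distinct digits.

```python
from itertools import permutations

solutions = []
# permutations() yields ordered triples of distinct digits.
for a, b, c in permutations(range(1, 10), 3):
    abc = 100 * a + 10 * b + c
    bac = 100 * b + 10 * a + c
    caca = 1000 * c + 100 * a + 10 * c + a
    if abc + bac == caca:
        solutions.append((a, b, c))

print(solutions)  # [(2, 9, 1)], i.e. 291 + 921 = 1212
```

The search finds exactly one solution, A = 2, B = 9, C = 1. The zero-shot answer also fails the distinctness requirement, since it sets B = C = 1.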