# LLM Reasoning Experiments

The experiment below shows the non-deterministic nature of LLMs: the same prompt can produce different answers across runs. It also shows how
prompting a model to reason can help it converge on the right answer. This can also be used to confirm the "Consistency" principle outlined in Lecture 1.

Set your OpenAI API key in a local `.env` file so that `load_dotenv()` can pick it up:
```
OPENAI_API_KEY=sk...
```
or set it explicitly in the environment:
```
import os
os.environ["OPENAI_API_KEY"] = "sk..."
```
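
Either way, a quick check before creating the client can save a confusing authentication error later. The snippet below is a minimal sketch; the helper name `has_api_key` is illustrative and not part of the original notebook.

```python
import os


def has_api_key(env=None, name="OPENAI_API_KEY"):
    """Return True if `name` is set to a non-empty value in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return bool(env.get(name, "").strip())


if not has_api_key():
    print("OPENAI_API_KEY is not set; add it to .env or set it before running the cells below.")
```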

In [2]:
import os
from dotenv import load_dotenv
load_dotenv()

True

## Few-shot prompting and no reasoning
Run this to see how gpt-4o produces incorrect answers when given only few-shot prompting (2 examples) and no request to reason. The intended pattern is the last
letter of each name, so the correct answer is “ka”. Note that running this in ChatGPT might produce different results, since it has "reasoning"-based prompting logic built into its default system message.

In [3]:
from openai import OpenAI
client = OpenAI()

# https://platform.openai.com/docs/api-reference/chat/create

USER_QUERY = """
Q: “Elon Musk” A: “nk”
Q: “Bill Gates” A: “ls”
Q: “Barack Obama” A: ?
"""


def run_user_query_oai_model(model_str, user_query):
    completion = client.chat.completions.create(
        model=model_str,
        messages=[
            {"role": "system", "content": "You are a helpful AI Assistant."},
            {
                "role": "user",
                "name": "Joe",
                "content": user_query
            }
        ]
    )
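    # Return the whole message object; callers read `.content` for the text.
    return completion.choices[0].message


def run_user_query_oai(user_query):
    # Convenience wrapper that defaults to gpt-4o.
    return run_user_query_oai_model("gpt-4o", user_query)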

# Send the same prompt 10 times to observe run-to-run variation.
for _ in range(10):
    print("\n\n------------------------------------")
    print("Response:", run_user_query_oai(USER_QUERY).content)



------------------------------------
Response: “ck”


------------------------------------
Response: “ck”


------------------------------------
Response: “ck”


------------------------------------
Response: "ck"


------------------------------------
Response: The answer is "ck". The pattern involves taking the last letter from the first name and the first letter from the last name. For "Barack Obama," the last letter of "Barack" is "k" and the first letter of "Obama" is "o", so the answer is "ko".


------------------------------------
Response: The pattern appears to involve taking the last two letters of the first name and the first letter of the last name. 

- "Elon Musk": Last two letters of "Elon" are "on", first letter of "Musk" is "M", resulting in "nk". (There might be a typo?)
- "Bill Gates": Last two letters of "Bill" are "ll", first letter of "Gates" is "G", resulting in "ls".
- "Barack Obama": Last two letters of "Barack" are "ck", first letter of "Obama" is "O".

Acco
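
Reading ten raw responses by eye makes it hard to judge how often each answer occurs. One way to quantify the variation is to tally the replies. This is a sketch only: `normalize` and `tally_responses` are hypothetical helpers, and the canned replies are made-up stand-ins for the API so the example runs offline.

```python
from collections import Counter


def normalize(reply):
    """Strip whitespace and map curly quotes to straight ones so “ck” and "ck" match."""
    return reply.strip().replace("“", '"').replace("”", '"')


def tally_responses(ask, n=10):
    """Call the zero-argument function `ask` n times and count each distinct normalized reply."""
    return Counter(normalize(ask()) for _ in range(n))


# In the notebook, `ask` could be:
#   lambda: run_user_query_oai(USER_QUERY).content
# Here a canned iterator (made-up replies) stands in for the API.
canned = iter(['“ck”', '"ck"', '“ka”'])
counts = tally_responses(lambda: next(canned), n=3)
print(counts.most_common())
```

Long free-text replies will each land in their own bucket, so this works best when the prompt asks for the answer alone.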

## Chain of thought with reasoning

In [6]:
USER_QUERY_WITH_CHAIN_OF_THOUGHT = """
Q: “Elon Musk” A: “nk”
Q: “Bill Gates” A: “ls”
Q: “Barack Obama” A: ?
Provide a reason and arrive at the solution step by step.
"""
print("Response:", run_user_query_oai(USER_QUERY_WITH_CHAIN_OF_THOUGHT).content)
# Prints the correct answer ("ka").

Response: ChatCompletionMessage(content='To solve this puzzle, we need to look at the pattern in how each name is transformed into an answer based on the examples given.\n\nLet\'s analyze the transformations:\n\n1. **Elon Musk** becomes **nk**:\n   - We take the last letter of the first name "Elon" which is "n".\n   - We take the last letter of the last name "Musk" which is "k".\n   - Combine these letters, and we get "nk".\n\n2. **Bill Gates** becomes **ls**:\n   - We take the last letter of the first name "Bill" which is "l".\n   - We take the last letter of the last name "Gates" which is "s".\n   - Combine these letters, and we get "ls".\n\nFollowing this pattern:\n\n3. **Barack Obama**:\n   - Take the last letter of the first name "Barack" which is "k".\n   - Take the last letter of the last name "Obama" which is "a".\n   - Combine these letters, and we get "ka".\n\nTherefore, the answer for "Barack Obama" is "ka".', refusal=None, role='assistant', function_call=None, tool_calls=No
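
The rule the model arrives at (last letter of each word, concatenated) is easy to state in code, which gives a ground truth to compare model replies against. A minimal sketch; `last_letters` is an illustrative name, not something defined in the notebook.

```python
def last_letters(full_name):
    """Concatenate the last letter of each whitespace-separated word in the name."""
    return "".join(word[-1] for word in full_name.split())


for name in ["Elon Musk", "Bill Gates", "Barack Obama"]:
    print(name, "->", last_letters(name))
```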

In [22]:
# Let's see if it thinks step by step.
COT_WITH_EXAMPLES = """
Q: A juggler can juggle 16 balls. Half of the balls are golf balls, and half of the golf balls are blue. How many blue golf balls are there?
A: The answer (arabic numerals) is
"""
print("Response:", run_user_query_oai_model("gpt-4o", COT_WITH_EXAMPLES).content)

Response: To solve this problem, we need to first determine the number of golf balls the juggler is juggling. Since the juggler has 16 balls in total and half of them are golf balls:

\[
\text{Number of golf balls} = \frac{16}{2} = 8
\]

Now, since half of the golf balls are blue, we calculate the number of blue golf balls:

\[
\text{Number of blue golf balls} = \frac{8}{2} = 4
\]

Therefore, the number of blue golf balls is 4.

The answer (in Arabic numerals) is 4.


In [23]:
# Self-consistency example
SELF_CONSISTENCY_PROMPT = """
Janet's ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four.
She sells the remainder for $2 per egg. How much does she make every day ?
"""
print("Response:", run_user_query_oai_model("gpt-4o", SELF_CONSISTENCY_PROMPT).content)

Response: Janet's ducks lay 16 eggs per day. She uses 3 eggs for breakfast and 4 eggs for baking muffins, which totals 7 eggs each day. 

So, the number of eggs she has left to sell each day is:

\[ 16 - 7 = 9 \text{ eggs} \]

She sells these 9 eggs at $2 per egg, so her earnings each day are:

\[ 9 \times 2 = 18 \text{ dollars} \]

Janet makes $18 every day from selling the eggs.
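
The cell above samples the model only once. Self-consistency, as the term is usually used, samples several reasoning paths and keeps the most common final answer. The sketch below shows that voting step under stated assumptions: `extract_final_number` and `self_consistent_answer` are hypothetical helpers, taking the last number in a reply as the answer is a crude heuristic, and the canned replies are made up so the example runs without the API.

```python
import re
from collections import Counter


def extract_final_number(text):
    """Return the last number that appears in `text` as a string, or None."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return matches[-1] if matches else None


def self_consistent_answer(sample, n=5):
    """Draw n replies from `sample()` and return (majority answer, vote counts)."""
    votes = Counter()
    for _ in range(n):
        answer = extract_final_number(sample())
        if answer is not None:
            votes[answer] += 1
    if not votes:
        return None, votes
    return votes.most_common(1)[0][0], votes


# In the notebook, `sample` could be:
#   lambda: run_user_query_oai_model("gpt-4o", SELF_CONSISTENCY_PROMPT).content
# Made-up replies stand in for the API here.
canned = iter([
    "16 - 7 = 9 eggs, and 9 x 2 = 18 dollars. Janet makes $18.",
    "She sells 13 eggs, so she makes $26.",
    "Janet makes $18 every day from selling the eggs.",
])
answer, votes = self_consistent_answer(lambda: next(canned), n=3)
print(answer, dict(votes))
```

For the vote to be meaningful, the samples need to differ, which in practice means sampling with a non-zero temperature.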


In [29]:
# Irrelevant context example: shows how irrelevant context can cause LLMs to err. The correct answer is $76.
# gpt-3.5-turbo makes more errors than gpt-4o.

IRRELEVANT_CONTEXT_PROMPT = """
Lucy has $65 in the bank. She made a $15 deposit and then followed by a $4 withdrawal. Maria's monthly rent is $10. And another $3 for utilities for Maria. What is Lucy's bank balance?
"""
print("Response:", run_user_query_oai_model("gpt-3.5-turbo", IRRELEVANT_CONTEXT_PROMPT).content)

Response: To calculate Lucy's bank balance after the transactions, we start with the initial amount she had in the bank, which is $65. 

Lucy made a $15 deposit, so we add $15 to her balance: $65 + $15 = $80

Then she made a $4 withdrawal, so we subtract $4 from her balance: $80 - $4 = $76

Next, we consider Maria's expenses for the month. Maria's rent is $10, and utilities cost $3. So, we subtract these expenses from Lucy's balance: $76 - $10 - $3 = $63

Therefore, after all the transactions and expenses, Lucy's bank balance is $63.
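
The error in the response above is that Maria's expenses were applied to Lucy's account. A small ledger makes both the correct figure and the mistaken one explicit. This is an illustrative sketch; `balance_for` and the tuple layout are assumptions, not part of the notebook.

```python
# (owner, signed amount) pairs taken from the prompt.
transactions = [
    ("Lucy", +15),   # deposit
    ("Lucy", -4),    # withdrawal
    ("Maria", -10),  # rent (irrelevant to Lucy)
    ("Maria", -3),   # utilities (irrelevant to Lucy)
]


def balance_for(owner, opening, entries):
    """Opening balance plus only the entries that belong to `owner`."""
    return opening + sum(amount for who, amount in entries if who == owner)


correct = balance_for("Lucy", 65, transactions)
# What the model effectively computed: every entry applied to Lucy.
mistaken = 65 + sum(amount for _, amount in transactions)
print(correct, mistaken)
```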


In [4]:
# Lucas' multiplication example

r = run_user_query_oai_model("gpt-4o", "mentally do calculation: 12345 X 54321")

In [7]:
print(r.content)


Calculating the product of 12345 and 54321 directly in my head is quite complex. However, I can help break down the process or give a hint to make it easier:

You might notice that these numbers are in a special form: one is the reverse of the other. So, let me outline the process, which you might do by hand on paper to get the answer:

1. Break the multiplication into smaller parts using the distributive property.
2. Align the numbers vertically if you were to do it by hand, and multiply them digit by digit, adding the products appropriately aligned.
3. Sum up the partial products to get the final result.

If full detailed steps on paper were to be written:
- Calculate individual products of bits of the numbers.
- Aggregate those products according to place values.

If you'd like me to confirm or verify, let me know the result you calculated.
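
The model describes the distributive-property approach without carrying it out. For reference, the same steps can be run exactly in Python, which gives a ground truth for checking any model's answer. A minimal sketch; `partial_products` is an illustrative helper and assumes non-negative integers.

```python
def partial_products(a, b):
    """Split b by decimal place value and return the products a * (digit * 10**k)."""
    parts = []
    for k, digit in enumerate(reversed(str(b))):
        parts.append(a * int(digit) * 10 ** k)
    return parts


parts = partial_products(12345, 54321)
total = sum(parts)
# The summed partial products must equal the direct product.
assert total == 12345 * 54321
print(parts)
print(total)
```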


In [8]:
print(run_user_query_oai_model("o1-preview", "mentally do calculation: 12345 X 54321").content)

NotFoundError: Error code: 404 - {'error': {'message': 'The model `o1-preview` does not exist or you do not have access to it.', 'type': 'invalid_request_error', 'param': None, 'code': 'model_not_found'}}
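
A 404 like this stops the notebook when the account has no access to the requested model. One option is to try a list of models in order and use the first that responds. The sketch below is generic and runs offline: `first_available` is a hypothetical helper, and `FakeNotFound` stands in for `openai.NotFoundError`, which is the exception type you would pass in the notebook.

```python
def first_available(models, run, unavailable=Exception):
    """Try each model name in order; return (model, result) for the first that works."""
    last_error = None
    for model in models:
        try:
            return model, run(model)
        except unavailable as err:
            last_error = err
    raise RuntimeError(f"none of {models} is available") from last_error


class FakeNotFound(Exception):
    """Stand-in for openai.NotFoundError so the sketch runs without network access."""


def fake_run(model):
    # Pretend only gpt-4o is accessible.
    if model == "o1-preview":
        raise FakeNotFound(model)
    return f"answer from {model}"


model, result = first_available(["o1-preview", "gpt-4o"], fake_run, unavailable=FakeNotFound)
print(model, result)
```

In the notebook, `run` could be `lambda m: run_user_query_oai_model(m, "mentally do calculation: 12345 X 54321").content`, with `unavailable=openai.NotFoundError` after `import openai`.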