<a href="https://colab.research.google.com/github/SimranaSinha/Generative_AI/blob/main/Lab_2_Module_3_Prompting_Strategies_in_Practice_Simran.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<!-- Intro Section -->
<div style="background: linear-gradient(135deg, #001a70 0%, #0055d4 100%); color: white; padding: 30px; border-radius: 12px; text-align: center; box-shadow: 0 4px 12px rgba(0,0,0,0.1);">
    <h1 style="margin-bottom: 10px; font-size: 32px;">Introduction to Prompting Strategies</h1>
    <p style="font-size: 18px; margin: 0;">Instructor: <strong>Dr. Dehghani</strong></p>
</div>

<!-- Spacer -->
<div style="height: 30px;"></div>

<!-- Why It Matters Section -->
<div style="background: #ffffff; padding: 25px; border-radius: 10px; border-left: 6px solid #0055d4; box-shadow: 0 4px 8px rgba(0,0,0,0.05);">
    <h2 style="margin-top: 0; color: #001a70;">Why Prompting Strategies Matter</h2>
    <p style="font-size: 16px; line-height: 1.6;">
        Imagine you’re working with a junior engineer. You say:  
        <em>“Optimize the system.”</em><br>
        They’ll probably ask: <em>“Which system? Optimize for cost, speed, or energy? Any constraints?”</em> 🧐
    </p>
    <p style="font-size: 16px; line-height: 1.6;">
        Now try this instead:  
        <em>“Analyze the HVAC system and minimize energy consumption while keeping temperatures between 22-24°C. Provide a cost breakdown.”</em>  
    </p>
    <p style="font-size: 16px; line-height: 1.6;">
        That’s not just a prompt; it’s a <strong>clear strategy</strong> with defined objectives and boundaries.
        And that’s exactly what AI models need to perform at their best.
    </p>
</div>

<!-- Tip Section -->
<div style="background: #f5faff; padding: 20px; border-radius: 8px; border-left: 5px solid #0055d4; margin-top: 30px;">
    <h3 style="margin-top: 0; color: #0055d4;">💡 Pro Tip</h3>
    <p style="margin: 0; font-size: 16px; line-height: 1.6;">
        AI models appreciate well-structured instructions just like engineers appreciate complete design specs.
        Be specific, set clear goals, and watch the results improve!
    </p>
</div>

<!-- Upcoming Topics -->
<div style="margin-top: 40px; text-align: center;">
    <h3 style="color: #001a70;">What’s Ahead</h3>
    <ul style="list-style: none; padding: 0; font-size: 16px; line-height: 1.8;">
        <li>📚 Basic Prompting Types</li>
        <li>🧩 Advanced Strategies</li>
        <li>📊 Application-Specific Techniques</li>
    </ul>
    <p style="font-size: 16px; color: #333;">Let’s engineer some powerful AI conversations! 🛠️</p>
</div>


<!-- Section Header -->
<div style="background: linear-gradient(135deg, #001a70 0%, #0055d4 100%); color: white; padding: 25px; border-radius: 12px; text-align: center; box-shadow: 0 4px 12px rgba(0,0,0,0.1);">
    <h1 style="margin-bottom: 10px; font-size: 30px;">📚 Basic Prompting Types</h1>
</div>

<!-- Spacer -->
<div style="height: 25px;"></div>

<!-- Zero-Shot Prompting -->
<div style="background: #ffffff; padding: 20px; border-radius: 10px; border-left: 6px solid #0055d4; margin-bottom: 20px;">
    <h3 style="margin-top: 0; color: #001a70;">1️⃣ Zero-Shot Prompting</h3>
    <p style="font-size: 16px; line-height: 1.6;">
        Provide only the task without any examples.  
        <strong>Use When:</strong> The task is simple and well known to the model.  
        <em>Example:</em> “Translate 'Hello' to French.”
    </p>
</div>


In [None]:
# ==========================
# 📌 Set Up LLM and OpenAI API
# ==========================
# Import required libraries
from google.colab import userdata
import openai
import os

# Load the OpenAI API key securely from Colab secrets
api_key = userdata.get('openai.api_key')

# Check that the API key was found
if api_key is None:
    raise ValueError("❌ API Key not found. Please store your OpenAI API key using Colab secrets.")

# Set API key as environment variable for OpenAI
os.environ["OPENAI_API_KEY"] = api_key

# Initialize OpenAI client
client = openai.OpenAI(api_key=api_key)

print("✅ OpenAI API Key successfully loaded and environment is ready!")

# ==========================
# 📌 Set LLM Model to GPT-3.5
# ==========================
# Define which LLM model to use
model_name = "gpt-3.5-turbo"

print(f"✅ LLM model set to: {model_name}")


✅ OpenAI API Key successfully loaded and environment is ready!
✅ LLM model set to: gpt-3.5-turbo


In [None]:
# ==========================
# 📌 Zero-Shot Test: Hidden Formula Sequence
# ==========================

hard_sequence_prompt_zero = (
    "The sequence is: 3, 12, 27, 48, 75, ___. What’s next?"
)

response_zero_hard = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": hard_sequence_prompt_zero}],
    temperature=0
)

print("🔹 LLM Response (Zero-Shot - Hard Sequence):\n")
print(response_zero_hard.choices[0].message.content.strip())


🔹 LLM Response (Zero-Shot - Hard Sequence):

The pattern in the sequence is adding consecutive odd numbers to the previous number. 

3 + 9 = 12
12 + 15 = 27
27 + 21 = 48
48 + 27 = 75

Therefore, the next number in the sequence would be 75 + 33 = 108. 

So, the next number in the sequence is 108.
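
The differences 9, 15, 21, 27 are odd multiples of 3 rather than consecutive odd numbers, and one closed form consistent with the given terms is 3n². A quick check of that reading is sketched below; the helper name `term` is ours, not part of the lab.

```python
# Check the assumed rule a(n) = 3 * n**2 against the terms given in the prompt.
given = [3, 12, 27, 48, 75]

def term(n: int) -> int:
    """n-th term (1-indexed) under the assumed rule a(n) = 3 * n^2."""
    return 3 * n * n

assert [term(n) for n in range(1, 6)] == given

# First differences grow by 6 each step: 9, 15, 21, 27, so the next one is 33.
differences = [b - a for a, b in zip(given, given[1:])]
next_term = given[-1] + differences[-1] + 6

print(differences)         # [9, 15, 21, 27]
print(next_term, term(6))  # 108 108
```

Both routes agree with the model's final answer of 108, even though its stated rule is looser than the arithmetic it actually performed.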



<!-- One-Shot Prompting -->
<div style="background: #ffffff; padding: 20px; border-radius: 10px; border-left: 6px solid #0055d4; margin-bottom: 20px;">
    <h3 style="margin-top: 0; color: #001a70;">2️⃣ One-Shot Prompting</h3>
    <p style="font-size: 16px; line-height: 1.6;">
        Provide one clear example along with the instruction.  
        <strong>Use When:</strong> You want to guide the model’s behavior with a single example.  
        <em>Example:</em> “Translate 'Hello' to French: Bonjour. Now translate 'Goodbye'.”
    </p>
</div>


In [None]:
# ==========================
# 📌 Zero-Shot vs One-Shot Comparison: Alternating Pattern Sequence (Correct One-Shot)
# ==========================

model_name = "gpt-3.5-turbo"

# Zero-Shot Prompt (No Example)
zero_shot_prompt = (
    "The sequence is: 1, 4, 2, 9, 3, 16, 4, ___. What number should replace the blank?"
)

# One-Shot Prompt (One Example + New Question)
one_shot_prompt = (
    "Example:\n"
    "The sequence is: 1, 1, 2, 4, 3, 9, ___. What’s next?\n"
    "Answer: 4.\n\n"
    "Now solve this one:\n"
    "The sequence is: 1, 4, 2, 9, 3, 16, 4, ___. What number should replace the blank?"
)

# Run Zero-Shot
response_zero = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": zero_shot_prompt}],
    temperature=0
)

# Run One-Shot
response_one = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": one_shot_prompt}],
    temperature=0
)

# Display Results
print("🔹 Zero-Shot Response:\n" + "-"*40)
print(response_zero.choices[0].message.content.strip())

print("\n\n🔹 One-Shot Response:\n" + "-"*40)
print(response_one.choices[0].message.content.strip())


🔹 Zero-Shot Response:
----------------------------------------
The sequence follows the pattern of adding 1 to the previous number, then squaring that result. 

1 + 1 = 2, 2^2 = 4
4 + 1 = 5, 5^2 = 25

Therefore, the number that should replace the blank is 25.


🔹 One-Shot Response:
----------------------------------------
Answer: 25.



<!-- Few-Shot Prompting -->
<div style="background: #ffffff; padding: 20px; border-radius: 10px; border-left: 6px solid #0055d4; margin-bottom: 20px;">
    <h3 style="margin-top: 0; color: #001a70;">3️⃣ Few-Shot Prompting</h3>
    <p style="font-size: 16px; line-height: 1.6;">
        Provide multiple examples to clearly demonstrate the pattern.  
        <strong>Use When:</strong> The task is complex or requires understanding a specific format.  
        <em>Example:</em><br>
        - “Translate 'Hello' to French: Bonjour.”<br>
        - “Translate 'Goodbye' to French: Au revoir.”<br>
        - “Translate 'Thank you' to French: Merci.”<br>
        Now translate 'Good night'.
    </p>
</div>

<!-- Spacer -->
<div style="height: 30px;"></div>

<!-- Closing Tip -->
<div style="background: #f5faff; padding: 20px; border-radius: 8px; border-left: 5px solid #0055d4;">
    <h3 style="margin-top: 0; color: #0055d4;">💡 Quick Reminder</h3>
    <p style="margin: 0; font-size: 16px; line-height: 1.6;">
        The more complex the task, the more examples you should provide. But remember, too many examples can make prompts bulky and inefficient.
    </p>
</div>
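
The three basic prompt types differ only in how many worked examples precede the question, so one small helper can build all of them. This is a minimal sketch: `build_prompt` and its "Example N / Answer" layout are illustrative choices that mirror the prompts used in this notebook, not a required format.

```python
from typing import Sequence, Tuple

def build_prompt(question: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a zero-, one-, or few-shot prompt depending on how many examples are given."""
    parts = []
    for i, (example_question, example_answer) in enumerate(examples, 1):
        parts.append(f"Example {i}:\n{example_question}\nAnswer: {example_answer}\n")
    # With no examples the question is sent as-is (zero-shot).
    parts.append(f"Now solve this one:\n{question}" if examples else question)
    return "\n".join(parts)

zero_shot = build_prompt("Translate 'Good night' to French.")
few_shot = build_prompt(
    "Translate 'Good night' to French.",
    examples=[
        ("Translate 'Hello' to French.", "Bonjour"),
        ("Translate 'Goodbye' to French.", "Au revoir"),
        ("Translate 'Thank you' to French.", "Merci"),
    ],
)
print(few_shot)
```

Passing a single `(question, answer)` pair yields a one-shot prompt; the resulting string can be used as the `content` of a user message.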



In [None]:
# ==========================
# 📌 Few-Shot Prompting Example: Ultra-Hard Pattern (3 Hidden Rules)
# ==========================

model_name = "gpt-4-turbo"  # Best for complex reasoning

# Few-Shot Prompt with 2 Examples
few_shot_prompt = (
    "Example 1:\n"
    "The sequence is: 1, 1, 2, 4, 3, 9, ___. What’s next?\n"
    "Answer: 4.\n\n"
    "Example 2:\n"
    "The sequence is: 1, 1, 2, 4, 4, 9, 7, 16, ___. What’s next?\n"
    "Answer: 11.\n\n"
    "Now try this one:\n"
    "The sequence is: 1, 1, 2, 4, 4, 9, 7, 16, 11, ___, 16, 36. What number should replace the blank?"
)

# Run Few-Shot Prompt
response_few = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": few_shot_prompt}],
    temperature=0
)

# Display Result
print("🔹 Few-Shot Prompting (Two Examples Provided):")
print("-" * 40)
print(response_few.choices[0].message.content.strip())


🔹 Few-Shot Prompting (Two Examples Provided):
----------------------------------------
To solve this sequence, let's analyze the pattern based on the given numbers and the answers from the previous examples:

1, 1, 2, 4, 4, 9, 7, 16, 11, ___, 16, 36.

From the previous examples:
- Example 1: 1, 1, 2, 4, 3, 9, 4
- Example 2: 1, 1, 2, 4, 4, 9, 7, 16, 11

We can observe that the sequence seems to alternate between two sub-sequences. Let's try to identify the pattern in each sub-sequence:

1. Sub-sequence A: 1, 2, 4, 7, 11, ___
2. Sub-sequence B: 1, 4, 9, 16, 36

Sub-sequence B appears to be squares of integers:
- 1^2 = 1
- 2^2 = 4
- 3^2 = 9
- 4^2 = 16
- 6^2 = 36

Sub-sequence A seems to be increasing by a growing difference:
- 1 to 2 (difference of 1)
- 2 to 4 (difference of 2)
- 4 to 7 (difference of 3)
- 7 to 11 (difference of 4)

Following this pattern, the next difference should be 5:
- 11 + 5 = 16

Thus, the missing number in the sequence is 16:
1, 1, 2, 4, 4, 9, 7, 16, 11, 16, 16
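
For reference, the blank sits at an even position, which belongs to the squares sub-sequence, so under the rule suggested by Example 2 the expected value would be 25 rather than 16. The check below assumes that rule (odd positions follow 1, 2, 4, 7, 11, 16 with differences growing by one; even positions are consecutive squares); `expected_sequence` is a hypothetical helper.

```python
def expected_sequence(length: int) -> list:
    """Interleave the assumed sub-sequences: growing-difference terms and squares."""
    terms = []
    a, step = 1, 1  # sub-sequence A: 1, 2, 4, 7, 11, 16, ... (differences 1, 2, 3, ...)
    k = 1           # sub-sequence B: 1, 4, 9, 16, 25, 36, ... (k squared)
    while len(terms) < length:
        terms.append(a)
        a, step = a + step, step + 1
        if len(terms) < length:
            terms.append(k * k)
        k += 1
    return terms

prompt_terms = [1, 1, 2, 4, 4, 9, 7, 16, 11, None, 16, 36]  # None marks the blank
expected = expected_sequence(len(prompt_terms))

# Every given term must match the assumed rule before it is used to fill the blank.
assert all(p is None or p == e for p, e in zip(prompt_terms, expected))
blank_value = expected[prompt_terms.index(None)]
print(blank_value)  # 25
```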

## 🧠 Advanced Prompting Techniques  

Moving beyond basic prompting methods like zero-shot and few-shot, advanced strategies help enhance the reasoning and adaptability of large language models (LLMs). These techniques guide the model's thought process to handle complex tasks more effectively.

---

### 🔗 Chain-of-Thought (CoT) Prompting  

Chain-of-Thought prompting encourages models to **explain their intermediate reasoning steps**, leading to more transparent and accurate conclusions. By structuring prompts to include logical steps, CoT improves the model’s ability to solve complex reasoning tasks.

**Why is CoT Important?**  
- ✔️ Improves performance on multi-step reasoning tasks.  
- ✔️ Helps produce logically structured and coherent responses.  
- ✔️ Breaks down complex problems into manageable steps.

📖 **Reference:** [Chain-of-Thought Prompting Elicits Reasoning in Large Language Models](https://arxiv.org/abs/2201.11903)

---

*Next, explore practical examples of Chain-of-Thought prompting.*


In [None]:
# ==========================
# 📌 Chain-of-Thought Demonstration: Make 110 with Five 5's
# ==========================

model_name = "gpt-4-turbo"

# Zero-Shot Prompt (No Reasoning Encouraged)
zero_shot_prompt = (
    "Use exactly five 5’s and only four operations (+, -, *, /) and parentheses to make 110."
)

# Chain-of-Thought Prompt (Encourages Step-by-Step Reasoning)
cot_prompt = (
    "Let's solve this step by step.\n"
    "We need to use exactly five 5’s and only four operations (+, -, *, /) and parentheses to make 110.\n"
    "Step 1: Think about how we can combine the 5's to form larger numbers (e.g., 55).\n"
    "Step 2: Try to combine them logically to reach 110.\n"
    "Now, provide the final equation and the answer."
)

# Run Zero-Shot
response_zero = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": zero_shot_prompt}],
    temperature=0
)

# Run Chain-of-Thought
response_cot = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": cot_prompt}],
    temperature=0
)

# Display Results
print("🔹 Zero-Shot Response (No Reasoning Encouraged):\n" + "-" * 50)
print(response_zero.choices[0].message.content.strip())

print("\n🔹 Chain-of-Thought Response (Reasoning Encouraged):\n" + "-" * 50)
print(response_cot.choices[0].message.content.strip())


🔹 Zero-Shot Response (No Reasoning Encouraged):
--------------------------------------------------
To achieve the number 110 using exactly five 5's and the operations (+, -, *, /) along with parentheses, you can use the following expression:

\[ 5 * (5 * 5 - 5) + 5 = 110 \]

Here's the breakdown:
1. \(5 * 5 = 25\)
2. \(25 - 5 = 20\)
3. \(5 * 20 = 100\)
4. \(100 + 5 = 105\)

This expression uses five 5's and the operations as specified to reach the target number 110.

🔹 Chain-of-Thought Response (Reasoning Encouraged):
--------------------------------------------------
To solve this, let's follow the steps you outlined:

Step 1: Consider combining the 5's to form larger numbers. One way to do this is to create the number 55 by combining two 5's.

Step 2: Now, let's try to use the remaining three 5's along with the number 55 to reach 110.

Here's one way to do it:
- Use two 5's to make 55.
- Use another two 5's to make another 55.
- Now, add these two 55's together.

The equation w
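
The zero-shot expression above evaluates to 105, not 110, so candidate answers are worth checking mechanically. The sketch below enumerates every value reachable from exactly five 5's. The function name `reachable` and the `allow_concat` switch (whether digits may be joined into 55 or 555, as the CoT prompt hints) are our own illustrative choices, not part of the lab.

```python
from fractions import Fraction

def reachable(count: int, allow_concat: bool) -> dict:
    """Map every value obtainable from exactly `count` fives to one expression for it."""
    table = {}  # table[k] = {value: expression using exactly k fives}
    for k in range(1, count + 1):
        found = {}
        if k == 1 or allow_concat:
            found[Fraction(int("5" * k))] = "5" * k  # 5, 55, 555, ...
        for left in range(1, k):  # split k fives into left + (k - left)
            for a, expr_a in table[left].items():
                for b, expr_b in table[k - left].items():
                    found.setdefault(a + b, f"({expr_a} + {expr_b})")
                    found.setdefault(a - b, f"({expr_a} - {expr_b})")
                    found.setdefault(a * b, f"({expr_a} * {expr_b})")
                    if b != 0:
                        found.setdefault(a / b, f"({expr_a} / {expr_b})")
        table[k] = found
    return table[count]

target = Fraction(110)
print(5 * (5 * 5 - 5) + 5)  # 105, so the zero-shot expression misses the target
print(target in reachable(5, allow_concat=False))
print(reachable(5, allow_concat=True).get(target))
```

Under these assumptions the search finds no expression for 110 without joining digits, whereas allowing concatenation admits solutions such as (555 - 5) / 5.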

# ✋ Hands-On Experiment: Observations  

📌 **Instructions:**  
- Run your experiments by changing the model type (e.g., `gpt-3.5-turbo`, `gpt-4-turbo`, `o3`), temperature, and prompt style.  
- You can **either attach a screenshot/image of your results** or **write a brief summary of your observations (max half a page)**.

---

- **Model Used:**  
  _[Enter the model name you tried, e.g., gpt-3.5-turbo, gpt-4-turbo, or o3]_

- **Temperature Setting:**  
  _[Enter the temperature you used, e.g., 0.0, 0.5, 0.7]_

- **Zero-Shot Result:**  
  _[Did Zero-Shot solve the problem correctly? Yes/No. Add a short explanation or attach an image.]_

- **Chain-of-Thought Result:**  
  _[Did Chain-of-Thought solve the problem better? Yes/No. Add a short explanation or attach an image.]_

- **Key Takeaways (Max Half Page or Screenshot):**  
  _[Summarize what you observed. Did a specific model perform better? How did temperature affect the results? What worked best? Attach image or write here.]_

---

✍️ *Try at least two models and different temperatures. Compare the results and reflect on how prompting strategies influence performance!*


## Zero-Shot Prompt

In [None]:
from openai import OpenAI

client = OpenAI(api_key=api_key)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Explain how gradient descent works."}
    ],
    temperature=0.0
)

print(response.choices[0].message.content)

Gradient descent is an optimization algorithm used to minimize a function by iteratively moving in the direction of steepest descent of the function. 

The algorithm works by calculating the gradient of the function at a given point, which represents the direction of the steepest increase of the function. The algorithm then takes a step in the opposite direction of the gradient, towards the minimum of the function. This process is repeated iteratively until a stopping criterion is met, such as reaching a certain number of iterations or a small change in the function value.

The size of the step taken in each iteration is controlled by a parameter called the learning rate. A larger learning rate can lead to faster convergence, but may also cause the algorithm to overshoot the minimum. On the other hand, a smaller learning rate may lead to slower convergence, but can help prevent overshooting.

Gradient descent is commonly used in machine learning to optimize the parameters of a model by

Model Used: gpt-3.5-turbo

Temperature Setting: 0.0

Zero-Shot Result: Yes.

- The model correctly explained the core idea of gradient descent, including how parameters are updated using the gradient to minimize a loss function. The explanation was clear but relatively high-level and lacked deeper mathematical detail.

Key Takeaways:
- With gpt-3.5-turbo at a low temperature (0.0), zero-shot prompting was sufficient to produce a correct and stable explanation for a basic conceptual task.
- Chain-of-thought prompting improved readability and organization but did not dramatically increase accuracy for this simple problem.
- The low temperature helped keep the response deterministic and focused, reducing unnecessary variation in the output.
- Overall, for straightforward explanatory tasks, gpt-3.5-turbo performs adequately even without advanced prompting strategies.
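
To make the update rule from the explanation above concrete, here is a minimal sketch of gradient descent on a one-dimensional quadratic. The objective f(x) = (x - 3)², the starting point, and the learning rate are illustrative choices of ours, not part of the lab.

```python
def gradient_descent(grad, x0: float, learning_rate: float = 0.1, steps: int = 100) -> float:
    """Repeatedly step against the gradient: x <- x - learning_rate * grad(x)."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x

# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))  # 3.0
```

For this function each step multiplies the distance to the minimum by (1 - 2 × learning_rate), which is why a learning rate above 1.0 makes the iterates diverge instead of converge.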

## Chain-of-Thought Prompt

In [None]:
from openai import OpenAI

client = OpenAI(api_key=api_key)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "Explain how gradient descent works. Think step by step and explain your reasoning."
        }
    ],
    temperature=0.0
)

print(response.choices[0].message.content)


Gradient descent is an optimization algorithm used to minimize a function by iteratively moving in the direction of steepest descent. Here is how it works step by step:

1. Initialize the parameters: Start by initializing the parameters of the function you want to minimize. These parameters could be weights in a machine learning model, for example.

2. Calculate the gradient: Calculate the gradient of the function with respect to the parameters. The gradient is a vector that points in the direction of the steepest increase of the function.

3. Update the parameters: Update the parameters by moving in the opposite direction of the gradient. This is done by subtracting a fraction of the gradient from the current parameters. This fraction is called the learning rate.

4. Repeat: Repeat steps 2 and 3 until a stopping criterion is met. This could be a maximum number of iterations, a threshold for the change in the parameters, or reaching a certain value for the function.

5. Convergence: If

Model Used: gpt-3.5-turbo

Temperature Setting: 0.0

Chain-of-Thought Result: Yes.

- The model explained gradient descent correctly and the response was more structured (definition → process → update rule intuition → iteration). The step-by-step instruction made the explanation easier to follow and reduced vague statements.


Key Takeaways:
- Using chain-of-thought prompting with gpt-3.5-turbo at temperature 0.0 produced a correct and organized explanation.
- Compared to a typical direct answer, the step-by-step instruction tends to improve clarity and sequencing, especially for technical concepts.
- The low temperature kept the response consistent and focused, with fewer random additions.
- For conceptual ML topics, chain-of-thought prompting is useful mainly for readability and logical flow rather than dramatically increasing correctness.


In [None]:
from openai import OpenAI

client = OpenAI(api_key=api_key)

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {
            "role": "user",
            "content": "Explain how gradient descent works. Think step by step."
        }
    ],
    temperature=0.7
)

print(response.choices[0].message.content)

Gradient descent is a fundamental optimization algorithm used in machine learning and deep learning to minimize a function. It is commonly employed to find the parameters or coefficients of models that minimize a loss or cost function, which measures how well the model fits the data. The basic idea behind gradient descent is to iteratively adjust the parameters in the direction that reduces the loss function, using the gradient (or slope) of the function to determine the direction of the steepest descent.

Here's a step-by-step explanation of how gradient descent works:

### Step 1: Initialize Parameters
Start by initializing the parameters (weights) of the model. This initialization can be random or set to a specific starting value. The choice of initialization can affect the convergence of the algorithm.

### Step 2: Choose a Learning Rate
Select a learning rate, often denoted as \( \alpha \). The learning rate determines the size of the steps taken in the parameter space. A too smal

Model Used: gpt-4-turbo

Temperature Setting: 0.7

Chain-of-Thought Result: Yes.
- The model explained gradient descent correctly and in a clear sequence (objective/loss → gradient direction → parameter update → repeat until convergence). Compared to gpt-3.5-turbo, the explanation tends to be more detailed, better structured, and more precise in wording.

Key Takeaways:
- With gpt-4-turbo, chain-of-thought prompting produced a strong explanation even at a higher temperature (0.7).
- The higher temperature can make responses more expressive and detailed, but it may also add extra examples or wording that is not strictly necessary.
- In this case, creativity did not reduce correctness, and the output remained coherent.
- Overall, gpt-4-turbo handled the reasoning and structure better than gpt-3.5-turbo, and the step-by-step prompt helped keep the explanation logically organized even with temperature set higher.

## 🔁 Self-Consistency Prompting

While Chain-of-Thought (CoT) improves reasoning by encouraging step-by-step thinking, it may still produce **inconsistent or incorrect** answers, especially in complex scenarios.  
**Self-Consistency Prompting** enhances CoT by **sampling multiple reasoning paths** for the same prompt and then selecting the most common final answer (a majority vote).

### Why is Self-Consistency Useful?

- ✅ Reduces random reasoning errors.
- ✅ Boosts reliability on ambiguous or multi-path problems.
- ✅ Often improves performance on mathematical, logical, and symbolic tasks.

📖 **Reference**: [Self-Consistency Improves Chain of Thought Reasoning in Language Models](https://arxiv.org/abs/2203.11171)
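
The voting step itself can be sketched without any API calls. The sampled answers below are made-up placeholders standing in for final answers extracted from several sampled reasoning paths; `normalize` and `majority_vote` are hypothetical helper names.

```python
from collections import Counter
from typing import List, Optional, Tuple

def normalize(answer: str) -> str:
    """Reduce formatting differences so equivalent answers compare equal."""
    return answer.strip().rstrip(".").lower()

def majority_vote(final_answers: List[str]) -> Optional[Tuple[str, int]]:
    """Return (most common normalized answer, count), or None for an empty list."""
    counts = Counter(normalize(a) for a in final_answers)
    return counts.most_common(1)[0] if counts else None

# Placeholder final answers, as if parsed from five sampled responses.
sampled = ["18", "18.", "17", " 18 ", "18"]
print(majority_vote(sampled))  # ('18', 4)
```

The vote is taken over short final answers rather than whole responses, since two sampled explanations rarely match word for word even when they reach the same result.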

---

*Next, we’ll see how Self-Consistency works in action using a complex reasoning example.*


In [None]:
# ==========================
# 📌 Comparing Chain-of-Thought vs. Self-Consistency Prompting
# ==========================

model_name = "gpt-4-turbo"  # Using GPT-4 for better reasoning

# Define the problem prompt
problem_prompt = (
    "If a train travels at 60 miles per hour and leaves at 2 PM, and another train leaves "
    "the same station at 3 PM traveling at 90 miles per hour, when will the second train catch up to the first?"
)

# Chain-of-Thought Prompt (Standard)
cot_prompt = (
    "Let's solve this step by step.\n"
    + problem_prompt
)

# Self-Consistency Prompt: Ask the model to produce multiple reasoning paths
def run_self_consistency(prompt, num_attempts=5):
    answers = []
    for _ in range(num_attempts):
        response = client.chat.completions.create(
            model=model_name,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.7  # Add randomness to explore different reasoning paths
        )
        answer = response.choices[0].message.content.strip()
        answers.append(answer)
    return answers

# Run Chain-of-Thought (Single Attempt)
response_cot = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": cot_prompt}],
    temperature=0
)
cot_answer = response_cot.choices[0].message.content.strip()

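# Run Self-Consistency (Multiple Attempts)
# Ask for a fixed final line so answers can be compared across samples
sc_prompt = cot_prompt + "\nEnd with a line of the form:\nFinal answer: <time>"
sc_answers = run_self_consistency(sc_prompt, num_attempts=5)

# Simple Majority Vote to Find Most Consistent Answer
# Vote on the extracted final answers: full response texts sampled at
# temperature 0.7 almost never match verbatim, so counting them is not a vote.
import re
from collections import Counter

def extract_final_answer(text):
    match = re.search(r"Final answer:\s*(.+)", text, re.IGNORECASE)
    if not match:
        return None
    # Light normalization so "5:00 PM." and "**5 pm**" count as the same answer
    return match.group(1).strip(" .*").upper().replace(":00", "")

final_answers = [a for a in map(extract_final_answer, sc_answers) if a]
most_common_answer = (
    Counter(final_answers).most_common(1)[0] if final_answers else ("(no parsable answer)", 0)
)

# Display Results
print("🔹 Chain-of-Thought Response (Single Attempt):\n" + "-" * 50)
print(cot_answer)

print("\n🔹 Self-Consistency Responses (Multiple Attempts):\n" + "-" * 50)
for idx, ans in enumerate(sc_answers, 1):
    print(f"Attempt {idx}: {ans}")

print("\n🔹 Final Self-Consistency Selected Answer:\n" + "-" * 50)
print(f"Most Common Answer: {most_common_answer[0]}\nAppeared {most_common_answer[1]} times.")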


🔹 Chain-of-Thought Response (Single Attempt):
--------------------------------------------------
To find out when the second train will catch up to the first, we can start by calculating how far ahead the first train is when the second train starts.

1. **Calculate the head start of the first train:**
   The first train leaves at 2 PM and travels at 60 miles per hour. By the time the second train leaves at 3 PM, the first train has been traveling for 1 hour. 
   Distance traveled by the first train in 1 hour = Speed × Time = 60 miles per hour × 1 hour = 60 miles.

2. **Set up the equation to find when the second train catches up:**
   Let \( t \) be the time in hours after 3 PM when the second train catches up to the first train. In this time, the first train travels an additional \( 60t \) miles (since it continues to travel at 60 mph), and the second train travels \( 90t \) miles (since it travels at 90 mph).

   Since the second train needs to cover the initial 60 miles gap plu
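
The catch-up condition can be checked directly: t hours after 3 PM the first train has covered 60 + 60t miles and the second 90t miles, so the 60-mile gap closes at 90 - 60 = 30 miles per hour. A short sketch of that arithmetic follows; the variable names and the arbitrary calendar date are ours.

```python
from datetime import datetime, timedelta

first_speed, second_speed = 60, 90           # miles per hour
first_departure = datetime(2000, 1, 1, 14)   # 2 PM on an arbitrary date
second_departure = datetime(2000, 1, 1, 15)  # 3 PM

head_start_hours = (second_departure - first_departure).total_seconds() / 3600
gap_miles = first_speed * head_start_hours                    # 60 miles
hours_to_catch_up = gap_miles / (second_speed - first_speed)  # 60 / 30 = 2

catch_up_time = second_departure + timedelta(hours=hours_to_catch_up)
print(catch_up_time.strftime("%I:%M %p"))  # 05:00 PM
```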

<div style="background: linear-gradient(135deg, #001a70 0%, #0055d4 100%); color: white; padding: 25px; border-radius: 12px; text-align: center;">
    <h1 style="margin-bottom: 10px;">📚 Exploring More Advanced Prompting Strategies</h1>
</div>

<div style="background: #ffffff; padding: 20px; border-radius: 10px; border-left: 6px solid #0055d4; margin-top: 20px;">
    <ul style="font-size: 16px; line-height: 1.8;">
        <li><strong>üß© Tree-of-Thought (ToT) Prompting:</strong> Explores multiple reasoning paths like a decision tree, helping the model evaluate and compare various solutions before choosing the best one.</li>
        <li><strong>ü§ñ ReAct (Reasoning and Acting) Prompting:</strong> Combines reasoning steps with actions, including API calls or external tool usage. Ideal for interactive agents and dynamic decision-making tasks.</li>
        <li><strong>üîÑ Reflexion Prompting:</strong> Encourages the model to critique its own responses and iteratively improve them, simulating self-correction and learning.</li>
    </ul>
</div>
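
As one illustration, the control flow behind ReAct can be sketched with a scripted stand-in for the model. Everything here is hypothetical scaffolding: `make_scripted_model` replays canned text instead of calling an LLM, and the `Action: calculator[...]` syntax and `Final Answer:` marker are conventions chosen for this sketch, not a fixed API.

```python
import ast
import operator
import re

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculator(expression: str):
    """Tool: evaluate +, -, *, / arithmetic without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {expression}")
    return walk(ast.parse(expression, mode="eval").body)

def make_scripted_model():
    """Return a fake model that replays canned turns (not real model output)."""
    replies = iter([
        "Thought: I should compute the sum first.\nAction: calculator[17 + 25]",
        "Thought: Now multiply the result by 3.\nAction: calculator[42 * 3]",
        "Thought: I have the result.\nFinal Answer: 126",
    ])
    return lambda transcript: next(replies)

def react(question: str, model, max_turns: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_turns):
        reply = model(transcript)  # reasoning step, possibly requesting an action
        transcript += "\n" + reply
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        action = re.search(r"Action: calculator\[(.+)\]", reply)
        if action:  # run the tool and feed the result back as an observation
            transcript += f"\nObservation: {calculator(action.group(1))}"
    return "no answer within turn limit"

answer = react("What is (17 + 25) * 3?", make_scripted_model())
print(answer)  # 126
```

With a real model, `model` would wrap a chat completion call that receives the growing transcript, so each observation can inform the next reasoning step.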

<div style="margin-top: 40px; text-align: center;">
    <h2 style="color: #001a70;">✋ Hands-On Task: Compare Prompting Strategies</h2>
</div>

<div style="background: #f5faff; padding: 20px; border-radius: 8px; border-left: 5px solid #0055d4;">
    <p style="font-size: 16px;">
        📌 <strong>Task Instructions:</strong><br>
        - Experiment with <strong>Self-Consistency</strong>, <strong>Tree-of-Thought</strong>, and <strong>ReAct</strong> prompting methods.<br>
        - Try to solve the following problem using each method and compare the results.
    </p>
</div>

<div style="background: #ffffff; padding: 20px; border-radius: 10px; border-left: 6px solid #0055d4; margin-top: 20px;">
    <h3>üß† <strong>Challenge Problem:</strong></h3>
    <p style="font-size: 16px;">A farmer has chickens and rabbits in a cage. There are 35 heads and 94 legs. How many chickens and rabbits are there?</p>
</div>

<div style="margin-top: 40px;">
    <ul style="font-size: 16px; line-height: 1.8;">
        <li>Try different models (e.g., <code>gpt-3.5-turbo</code>, <code>gpt-4-turbo</code>, <code>o3</code>).</li>
        <li>Experiment with different temperatures (e.g., <code>0.0</code>, <code>0.5</code>, <code>0.7</code>).</li>
        <li>Use both direct prompts and advanced strategies like CoT, Self-Consistency, or ReAct.</li>
    </ul>
</div>

<div style="margin-top: 40px; text-align: center;">
    <h2 style="color: #001a70;">📖 Observations</h2>
</div>

<div style="background: #ffffff; padding: 20px; border-radius: 10px; border-left: 6px solid #0055d4;">
    <ul style="font-size: 16px; line-height: 1.8;">
        <li><strong>Model and Strategy Used:</strong><br>_[Enter the model and prompting strategy you tried]_</li>
        <li><strong>Was the Correct Answer Found?</strong><br>_[Yes/No. Explain briefly or attach a screenshot]_</li>
        <li><strong>Key Takeaways (Max Half Page or Screenshot):</strong><br>_[Summarize how different strategies performed. What worked best? Why?]_</li>
    </ul>
</div>

<div style="margin-top: 20px; text-align: center;">
    ✍️ <em>Hint: Try breaking down the problem into equations or ask the model to explain its steps before giving the final answer. Notice which strategies lead to faster and more accurate results!</em>
</div>


In [None]:
# ==========================
# ✋ Hands-On Code: Try Different Prompting Strategies and Models
# ==========================

# 📝 Instructions:
# - Change 'model_name' to try different models (e.g., "gpt-3.5-turbo", "gpt-4-turbo", "o3").
# - Adjust 'temperature' to test how creativity affects reasoning.
# - Try Self-Consistency by sampling multiple outputs and comparing answers.
# - Optionally, explore Tree-of-Thought and ReAct patterns by modifying prompts.
# ✅ Your Experiment Starts Here 👇


In [None]:
import re
from collections import Counter

PROBLEM = "A farmer has chickens and rabbits in a cage. There are 35 heads and 94 legs. How many chickens and rabbits are there?"

# 1) Direct Prompting

In [None]:
# Direct Prompting
model_name = "gpt-3.5-turbo"
temperature = 0.0

prompt = f"""
Solve the problem and give ONLY the final answer in this format:
Final: chickens=<number>, rabbits=<number>

Problem: {PROBLEM}
"""

resp = client.chat.completions.create(
    model=model_name,
    temperature=temperature,
    messages=[{"role": "user", "content": prompt}]
)

print(resp.choices[0].message.content)


Final: chickens=23, rabbits=12


- Model and Strategy Used: gpt-3.5-turbo + Direct (temp=0.0)
- Was the Correct Answer Found? Yes, returned chickens=23 and rabbits=12.
- Key Takeaways: Direct prompting at low temperature produced a stable answer, but it didn’t show reasoning so it’s harder to verify correctness from the output alone.
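
Because the direct answer comes without reasoning, an independent check helps. The brute-force sketch below tries every split of the 35 heads, assuming 2 legs per chicken and 4 per rabbit.

```python
HEADS, LEGS = 35, 94

# Try every split of the heads and keep those whose leg count matches.
solutions = [
    (chickens, HEADS - chickens)
    for chickens in range(HEADS + 1)
    if 2 * chickens + 4 * (HEADS - chickens) == LEGS
]
print(solutions)  # [(23, 12)]
```

Exactly one split satisfies both constraints, which confirms the model's answer of 23 chickens and 12 rabbits.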

# 2) Advanced Prompting

## i) Chain-of-Thought (CoT)

In [None]:
#Chain of Thought (CoT)

model_name = "gpt-3.5-turbo"
temperature = 0.0

prompt = f"""
Solve the problem step by step using equations, then give the final answer.

Problem: {PROBLEM}

End with:
Final: chickens=<number>, rabbits=<number>
"""

resp = client.chat.completions.create(
    model=model_name,
    temperature=temperature,
    messages=[{"role": "user", "content": prompt}]
)

print(resp.choices[0].message.content)


Let's denote the number of chickens as C and the number of rabbits as R.

From the problem, we have two equations:
1. C + R = 35 (total number of heads)
2. 2C + 4R = 94 (total number of legs)

Now, we can solve these two equations simultaneously.

From equation 1:
C = 35 - R

Substitute this into equation 2:
2(35 - R) + 4R = 94
70 - 2R + 4R = 94
2R = 24
R = 12

Now, substitute R back into C = 35 - R:
C = 35 - 12
C = 23

Therefore, there are 23 chickens and 12 rabbits.

Final: chickens=23, rabbits=12


- Model and Strategy Used: gpt-3.5-turbo + CoT (temp=0.0)
- Was the Correct Answer Found? Yes.
- Key Takeaways: CoT made the logic easy to follow (equations for heads and legs). It increased transparency compared to direct prompting, with similar accuracy at low temperature.

## ii) Self-Consistency

In [None]:
# Self-Consistency

model_name = "gpt-3.5-turbo"
temperature = 0.7
n_samples = 7

prompt = f"""
Solve the problem and output ONLY:
Final: chickens=<number>, rabbits=<number>

Problem: {PROBLEM}
"""

def extract_pair(text: str):
    m = re.search(r"chickens\s*=\s*(\d+)\s*,\s*rabbits\s*=\s*(\d+)", text, re.I)
    return (int(m.group(1)), int(m.group(2))) if m else None

answers = []
raw = []

for _ in range(n_samples):
    resp = client.chat.completions.create(
        model=model_name,
        temperature=temperature,
        messages=[{"role": "user", "content": prompt}]
    )
    out = resp.choices[0].message.content.strip()
    raw.append(out)
    pair = extract_pair(out)
    if pair:
        answers.append(pair)

print("Outputs:")
for i, out in enumerate(raw, 1):
    print(f"{i}. {out}")

if answers:
    vote = Counter(answers).most_common(1)[0]
    print("\nMajority vote:", vote[0], "count:", vote[1])
else:
    print("\nCould not parse answers.")


Outputs:
1. Final: chickens=23, rabbits=12
2. Final: chickens=23, rabbits=12
3. Final: chickens=23, rabbits=12
4. Final: chickens=23, rabbits=12
5. Final: chickens=23, rabbits=12
6. Final: chickens=23, rabbits=12
7. Final: chickens=23, rabbits=12

Majority vote: (23, 12) count: 7


- Model and Strategy Used: gpt-3.5-turbo + Self-Consistency (temp=0.7, n=7)
- Was the Correct Answer Found? Yes (majority vote gave 23 chickens, 12 rabbits).
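- Key Takeaways: Higher temperature can cause variation across runs, but in this experiment all seven samples agreed, so the vote was unanimous and did not change the result. Self-consistency helps most when samples disagree. Because this prompt asked only for the final answer, the samples also did not expose distinct reasoning paths.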
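The voting step can be exercised offline to see how it behaves when samples disagree. The sketch below uses made-up sample strings (they are not outputs from this lab), and `majority_vote` is a hypothetical helper. It also reports an agreement ratio, which gives a rough sense of how much to trust the winning answer.

```python
import re
from collections import Counter

PAIR_RE = re.compile(r"chickens\s*=\s*(\d+)\s*,\s*rabbits\s*=\s*(\d+)", re.I)

def majority_vote(outputs):
    """Return (most common parsed answer, share of all outputs that agree)."""
    parsed = []
    for out in outputs:
        m = PAIR_RE.search(out)
        if m:
            parsed.append((int(m.group(1)), int(m.group(2))))
    if not parsed:
        return None, 0.0
    answer, count = Counter(parsed).most_common(1)[0]
    # Unparseable outputs count against agreement.
    return answer, count / len(outputs)

# Hypothetical samples: one wrong answer and one that ignores the format.
samples = [
    "Final: chickens=23, rabbits=12",
    "Final: chickens=23, rabbits=12",
    "Final: chickens=24, rabbits=11",
    "I think the answer is 23 and 12.",
    "Final: chickens=23, rabbits=12",
]
answer, agreement = majority_vote(samples)
print(answer, agreement)  # (23, 12) 0.6
```

Dividing by the total number of outputs rather than the number of parsed ones is a design choice: it penalizes runs where the model often breaks the output format.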

## iii) Tree-of-Thought

In [None]:
# Tree-of-Thought

model_name = "gpt-4-turbo"
temperature = 0.5

prompt = f"""
Use a Tree-of-Thought approach:
1) Propose 3 different solution paths.
2) Briefly evaluate which is most reliable.
3) Solve using the best path.

Problem: {PROBLEM}

End with:
Final: chickens=<number>, rabbits=<number>
"""

resp = client.chat.completions.create(
    model=model_name,
    temperature=temperature,
    messages=[{"role": "user", "content": prompt}]
)

print(resp.choices[0].message.content)


### Step 1: Propose 3 Different Solution Paths

**Solution Path 1: Algebraic Equations**
- Use algebra to set up equations based on the number of heads and legs.
- Equation 1 (heads): \( C + R = 35 \) (where C is the number of chickens and R is the number of rabbits)
- Equation 2 (legs): \( 2C + 4R = 94 \)
- Solve the system of equations to find values for C and R.

**Solution Path 2: Elimination Method**
- From the algebraic approach, manipulate the equations to eliminate one variable and solve for the other.
- Multiply the heads equation by 2 and subtract it from the legs equation to eliminate chickens and solve directly for rabbits.
- Substitute back to find the number of chickens.

**Solution Path 3: Iterative Testing**
- Start with 0 rabbits and calculate if the remaining animals (all chickens) can satisfy the total leg count.
- Incrementally increase the number of rabbits and decrease chickens correspondingly, checking each time if the leg count matches.
- Continue until a valid 

- Model and Strategy Used: gpt-4-turbo + Tree-of-Thought (temp=0.5)
- Was the Correct Answer Found? Yes.
- Key Takeaways: ToT encouraged exploring multiple approaches (algebraic equations, elimination, iterative testing) before committing to one. This improved confidence and can reduce the chance of a single-path mistake.
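Solution Path 3 (iterative testing) is easy to run directly, which gives an independent check on the algebraic paths. This is a minimal sketch; `iterative_search` is a hypothetical helper, and the 35 heads and 94 legs come from the equations listed under Solution Path 1.

```python
def iterative_search(heads: int = 35, legs: int = 94):
    """Start with 0 rabbits and add one at a time until the leg count matches."""
    for rabbits in range(heads + 1):
        chickens = heads - rabbits  # the remaining animals are chickens
        if 2 * chickens + 4 * rabbits == legs:
            return chickens, rabbits
    return None  # no combination satisfies both constraints

print(iterative_search())  # (23, 12)
```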

## iv) ReAct

In [None]:
# ReAct

model_name = "gpt-4-turbo"
temperature = 0.0

prompt = f"""
Use ReAct format. Alternate:
Reason: what you will do next
Act: the equation/calculation

Keep it short.

Problem: {PROBLEM}

End with:
Final: chickens=<number>, rabbits=<number>
"""

resp = client.chat.completions.create(
    model=model_name,
    temperature=temperature,
    messages=[{"role": "user", "content": prompt}]
)

print(resp.choices[0].message.content)


Reason: Define variables for chickens and rabbits.
Act: Let \( c \) be the number of chickens and \( r \) be the number of rabbits.

Reason: Set up an equation based on the number of heads.
Act: \( c + r = 35 \)

Reason: Set up an equation based on the number of legs.
Act: \( 2c + 4r = 94 \)

Reason: Solve the system of equations by substitution or elimination.
Act: Multiply the first equation by 2: \( 2c + 2r = 70 \)

Reason: Subtract the modified first equation from the second equation to eliminate \( c \).
Act: \( (2c + 4r) - (2c + 2r) = 94 - 70 \)  
    \( 2r = 24 \)

Reason: Solve for \( r \).
Act: \( r = 24 / 2 = 12 \)

Reason: Substitute \( r = 12 \) back into the first equation to find \( c \).
Act: \( c + 12 = 35 \)  
    \( c = 35 - 12 = 23 \)

Final: chickens=23, rabbits=12


- Model and Strategy Used: gpt-4-turbo + ReAct (temp=0.0)
- Was the Correct Answer Found? Yes.
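- Key Takeaways: ReAct produced the most structured reasoning by separating planning from calculation. Here each "Act" is an inline calculation rather than a call to an external tool. At low temperature the output stayed consistent, and this run contained no arithmetic slips.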
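The Reason/Act structure also makes the output easy to process programmatically. The sketch below pairs each `Reason:` line with the `Act:` line that follows it. `parse_react` is a hypothetical helper, the transcript is a shortened plain-text version of the output above, and continuation lines of a multi-line `Act` are ignored for simplicity.

```python
def parse_react(transcript: str):
    """Pair each 'Reason:' line with the 'Act:' line that follows it."""
    steps = []
    reason = None
    for line in transcript.splitlines():
        line = line.strip()
        if line.startswith("Reason:"):
            reason = line[len("Reason:"):].strip()
        elif line.startswith("Act:") and reason is not None:
            steps.append((reason, line[len("Act:"):].strip()))
            reason = None  # wait for the next Reason before pairing again
    return steps

transcript = """
Reason: Set up an equation based on the number of heads.
Act: c + r = 35

Reason: Set up an equation based on the number of legs.
Act: 2c + 4r = 94

Final: chickens=23, rabbits=12
"""

steps = parse_react(transcript)
for reason, act in steps:
    print(f"{reason} -> {act}")
```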

<div style="background: linear-gradient(135deg, #001a70 0%, #0055d4 100%); color: white; padding: 25px; border-radius: 12px; text-align: center;">
    <h1 style="margin-bottom: 10px;">📌 Conclusion</h1>
</div>

<div style="background: #ffffff; padding: 20px; border-radius: 10px; border-left: 6px solid #0055d4; margin-top: 20px;">
    <p style="font-size: 16px; line-height: 1.8;">
        In this hands-on exploration, different advanced prompting strategies were tested to solve reasoning-based challenges.
        Through experimenting with <strong>Chain-of-Thought (CoT)</strong>, <strong>Self-Consistency</strong>, and other methods,
        the following key insights were observed:
    </p>
    <ul style="font-size: 16px; line-height: 1.8;">
        <li>Advanced prompting techniques can improve model performance, especially on complex, multi-step problems; on this simple puzzle, every strategy reached the same correct answer.</li>
        <li>Changing the <strong>model type</strong> and <strong>temperature</strong> can drastically affect reasoning quality and creativity.</li>
        <li>Some strategies, like <strong>Self-Consistency</strong>, help reduce random errors by exploring multiple reasoning paths.</li>
        <li>For ambiguous or challenging problems, combining strategies (e.g., CoT + Self-Consistency) often leads to the most reliable results.</li>
    </ul>
</div>

<div style="background: #f5faff; padding: 20px; border-radius: 8px; border-left: 5px solid #0055d4; margin-top: 20px;">
    <p style="font-size: 16px; font-style: italic;">
        📖 <em>Remember: Prompt engineering is both an art and a science. The more you experiment, the better you understand how to guide LLMs effectively!</em>
    </p>
</div>

<div style="margin-top: 40px; text-align: center;">
    <h3 style="color: #001a70;">✍️ Final Reflection</h3>
</div>

<div style="background: #ffffff; padding: 20px; border-radius: 10px; border-left: 6px solid #0055d4;">
    <p style="font-size: 16px;">
        _[Write 2-3 sentences summarizing what you personally learned about prompting strategies and how model selection or temperature influenced the results.]_
    </p>
</div>


## Final Reflection

Direct prompting returned the correct answer quickly but showed no reasoning. Chain-of-Thought exposed the equations, which made the solution easier to verify, with the same accuracy. Self-Consistency is designed for higher temperatures, where individual responses can vary; in this run all seven samples agreed, so the majority vote confirmed the answer rather than correcting it. Tree-of-Thought prompted the model to consider multiple solution paths before committing to one, which increased confidence in the result. ReAct gave the most structured solution by separating planning from calculation, and its final answer was correct.