<a href="https://colab.research.google.com/github/kissflow/prompt2finetune/blob/main/Prompting_Strategy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup

In [14]:
import os
from openai import OpenAI
from google.colab import userdata

# In Colab, click the "🔑" icon in the left panel and add your API key as a
# secret named OPENAI_API_KEY.
# Outside Colab, drop the google.colab import and read the key from the
# environment instead: run `export OPENAI_API_KEY='your-api-key'` in your
# terminal, then use os.environ['OPENAI_API_KEY'] below.
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')

# Pass the key explicitly, since it comes from Colab secrets rather than
# from the environment variable the client reads by default.
client = OpenAI(api_key=OPENAI_API_KEY)

model = "gpt-4.1-mini-2025-04-14"  # alternative: "gpt-3.5-turbo"

def generate_response(prompt, model=model, max_tokens=150):
    """
    Generates a response from the OpenAI API.

    Args:
        prompt: The input prompt for the model.
        model: The OpenAI model to use (defaults to the module-level `model`).
        max_tokens: The maximum number of tokens in the generated response.

    Returns:
        The text of the generated response.
    """
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "user", "content": prompt}
            ],
            max_tokens=max_tokens,
            temperature=0.7, # You can adjust the temperature
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        return f"Error generating response: {e}"

In [None]:
# Product Reviews for Examples
POSITIVE_REVIEW = """This vacuum cleaner is amazing! It's lightweight, powerful, and the battery lasts a long time. I was able to clean my entire house on a single charge. The attachments are also very useful for getting into tight spaces. Highly recommended!"""

NEGATIVE_REVIEW = """I'm very disappointed with this vacuum cleaner. It's heavy and difficult to maneuver around furniture. The battery dies after just 15 minutes of use, which is nowhere near enough to clean even one room. The suction power is weak and it struggles with pet hair. The attachments keep falling off during use. Would not recommend this product."""

NEUTRAL_REVIEW = """This vacuum cleaner has both good and bad points. On the positive side, it's relatively lightweight and the attachments work well for corners. However, the battery life is average - it lasts about 30 minutes, which is enough for a small apartment but not a large house. The suction power is decent for daily cleaning but struggles with deep-pile carpets. It's an okay product for the price, but there might be better options available."""

# For consistency with existing examples
PRODUCT_REVIEW = POSITIVE_REVIEW


## Zero-Shot Prompting


In [None]:
# Real-world example: Summarize product reviews with zero-shot prompting.
def test_zero_shot(review, review_type):
    prompt = f"""
Summarize the following product review:

Review: {review}
"""
    response = generate_response(prompt)
    print(f"Zero-Shot Response ({review_type}):")
    print(response)
    print("-" * 50)

# Test with all three review types
test_zero_shot(POSITIVE_REVIEW, "Positive")
test_zero_shot(NEGATIVE_REVIEW, "Negative") 
test_zero_shot(NEUTRAL_REVIEW, "Neutral")

Zero-Shot Response:
The vacuum cleaner is lightweight, powerful, and has a long-lasting battery that allows cleaning the entire house on one charge. Its attachments are effective for tight spaces. Overall, it is highly recommended.


### Zero-Shot Prompting Description

**What it is:** Zero-shot prompting involves giving the model a task without providing any examples or demonstrations. The model relies entirely on its pre-training knowledge to understand and complete the task. This is the most basic form of prompting where you simply describe what you want the model to do.

**When to use:** Zero-shot prompting works best for simple, well-defined tasks where the model has strong general knowledge from its training data. It's ideal when you need quick results without spending time on prompt engineering or when dealing with common tasks like basic summarization, translation, or simple Q&A.

**Pros:** The main advantages include quick setup with minimal token usage, straightforward implementation that doesn't require example selection, and fast execution since there's only one API call. It's also very flexible and can handle a wide variety of tasks without modification.

**Cons:** The primary limitations are less reliable performance on complex or domain-specific tasks, inconsistent output format that may require post-processing, and lower accuracy compared to more sophisticated prompting strategies. The model may also produce unexpected results if the task is ambiguous.

**Best for:** Common tasks like basic summarization, translation, simple Q&A, general knowledge questions, and straightforward text processing. It works particularly well when the task is clearly defined and the model has strong prior knowledge about the domain.

**Note:** Zero-shot prompting works best with clear, unambiguous instructions. The quality of results depends heavily on how well you can describe the task in natural language.
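
As a minimal sketch of what clear, unambiguous instructions can look like, the helper below spells out the task, a length limit, and the expected content, still without giving any examples. `build_zero_shot_prompt` and its `max_sentences` parameter are hypothetical names introduced for illustration; they are not part of the notebook's code.

```python
def build_zero_shot_prompt(review: str, max_sentences: int = 2) -> str:
    """Build a zero-shot summarization prompt with explicit constraints.

    Hypothetical helper for illustration; the constraints are assumptions.
    """
    return (
        f"Summarize the following product review in at most {max_sentences} "
        "sentences. Mention only points the reviewer actually makes, and "
        "state the overall sentiment (positive, negative, or mixed).\n\n"
        f"Review: {review.strip()}\n"
    )


prompt = build_zero_shot_prompt("Great suction, but the battery dies quickly.")
print(prompt)
```

The resulting string can be passed to `generate_response` in place of the shorter prompt used in `test_zero_shot`.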


## Few-Shot Prompting

In [None]:
# Real-world example: Summarize product reviews with few-shot examples.
def test_few_shot(review, review_type):
    few_shot_prompt = f"""
Summarize the following product reviews:

Review: This phone is terrible. The battery dies in an hour and the camera is blurry.
Summary: Poor battery life and camera quality.

Review: I love this laptop! It's fast, lightweight, and the screen is stunning.
Summary: Fast, lightweight, and excellent display.

Review: {review}
Summary:
"""
    response = generate_response(few_shot_prompt)
    print(f"Few-Shot Response ({review_type}):")
    print(response)
    print("-" * 50)

# Test with all three review types
test_few_shot(POSITIVE_REVIEW, "Positive")
test_few_shot(NEGATIVE_REVIEW, "Negative")
test_few_shot(NEUTRAL_REVIEW, "Neutral")

Few-Shot Response:
Lightweight, powerful vacuum with long battery life and useful attachments; highly recommended.


### Few-Shot Prompting Description

**What it is:** Few-shot prompting involves providing the model with 2-5 example input-output pairs before asking it to complete a similar task. These examples serve as demonstrations that guide the model's behavior and help it understand the expected format, style, and approach for the specific task.

**When to use:** Few-shot prompting is ideal when you need a specific output format or style, when you are working with domain-specific patterns, or when the model needs guidance on how to approach a particular type of task. It's particularly effective when you have representative examples that clearly show the desired input-output relationship.

**Pros:** The main advantages are better accuracy and consistency than zero-shot, concrete examples that show the model the exact requirements, and more reliable output formatting. It's also relatively simple to implement and doesn't require extensive prompt engineering.

**Cons:** The primary limitations are higher token usage due to the included examples and the need to select representative examples carefully. The quality of results depends heavily on the relevance and quality of those examples, and poor examples can actually hurt performance.

**Best for:** Formatting tasks, classification problems, domain-specific outputs, maintaining consistent tone or style, and tasks where the pattern is clear but the model needs specific guidance. It works well for tasks like sentiment analysis, text classification, and structured data extraction.

**Note:** The quality and relevance of examples directly impact results. Choose examples that are representative of the task and clearly demonstrate the desired input-output relationship. Avoid examples that are too similar to each other or that might confuse the model.
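
One way to keep examples easy to swap and compare is to store them as data and assemble the prompt from them. The sketch below assumes a hypothetical `build_few_shot_prompt` helper (not part of the notebook) and reuses the two demonstrations from the cell above.

```python
from typing import Sequence, Tuple


def build_few_shot_prompt(examples: Sequence[Tuple[str, str]], review: str) -> str:
    """Format (review, summary) pairs as demonstrations, then append the new review."""
    if not examples:
        raise ValueError("few-shot prompting needs at least one example")
    parts = ["Summarize the following product reviews:", ""]
    for example_review, example_summary in examples:
        parts.append(f"Review: {example_review}")
        parts.append(f"Summary: {example_summary}")
        parts.append("")
    # The prompt ends on an open "Summary:" so the model completes the pattern.
    parts.append(f"Review: {review}")
    parts.append("Summary:")
    return "\n".join(parts)


EXAMPLES = [
    ("This phone is terrible. The battery dies in an hour and the camera is blurry.",
     "Poor battery life and camera quality."),
    ("I love this laptop! It's fast, lightweight, and the screen is stunning.",
     "Fast, lightweight, and excellent display."),
]

print(build_few_shot_prompt(EXAMPLES, "The vacuum is light but the battery is weak."))
```

Keeping the examples in a list makes it straightforward to add a mixed-sentiment demonstration, or to test whether a different set of examples changes the output.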


## Self Reflexion

In [None]:
# Real-world example: Summarize a product review using Reflexion (using positive review for complexity).

# Initial prompt
initial_review_prompt = f"Summarize the following product review:\n\n{POSITIVE_REVIEW}"

initial_review_response = generate_response(initial_review_prompt)
print("Initial Reflexion Response:")
print(initial_review_response)

# Reflection prompt to improve the summary
reflection_review_prompt = f"""
Based on the following product review and the initial summary:

Review: {POSITIVE_REVIEW}

Initial Summary: {initial_review_response}

Critique the initial summary. Does it capture all the key positive aspects mentioned in the review (lightweight, powerful, long battery life, useful attachments, highly recommended)? Suggest improvements to make the summary more comprehensive and accurate.
"""

# The critique is longer than a summary, so allow more than the 150-token default.
reflection_on_review = generate_response(reflection_review_prompt, max_tokens=400)
print("\nReflection on Initial Response:")
print(reflection_on_review)

# Refined prompt incorporating the reflection
refined_review_prompt = f"""
Based on the following product review and the critique of the initial summary, provide a refined and comprehensive summary:

Review: {POSITIVE_REVIEW}

Critique: {reflection_on_review}

Refined Summary:
"""

refined_review_response = generate_response(refined_review_prompt)
print("\nRefined Reflexion Response:")
print(refined_review_response)

Initial Reflexion Response:
The vacuum cleaner is lightweight, powerful, and has a long-lasting battery that can clean an entire house on one charge. Its useful attachments help reach tight spaces. Highly recommended.

Reflection on Initial Response:
The initial summary does a good job capturing the main positive aspects mentioned in the review: the vacuum cleaner being lightweight, powerful, having a long-lasting battery capable of cleaning an entire house on one charge, and the usefulness of the attachments for tight spaces. It also includes the strong recommendation from the reviewer.

However, the summary could be improved by incorporating the sense of enthusiasm and specific praise about the battery life ("I was able to clean my entire house on a single charge") and explicitly mentioning that the vacuum is easy to use due to its lightweight design. Additionally, mentioning the variety or versatility of attachments could add clarity.

Suggested improved summary:

"This vacuum clean

### Self Reflexion Description

**What it is:** Self Reflexion is a multi-step iterative process where the model first generates an initial response, then critiques its own output, and finally produces a refined version based on the critique. This approach leverages the model's ability to evaluate and improve its own work through explicit self-reflection.

**When to use:** Self Reflexion is most effective when quality improvement is critical and you're dealing with complex content that benefits from multiple perspectives. It's ideal for high-stakes content where accuracy and coherence are paramount, such as technical analysis, creative writing, or editorial tasks.

**Pros:** The main advantages include often higher-quality outputs through self-correction, the ability to catch and fix errors that might be missed in a single pass, and improved coherence and consistency, since the model can identify weaknesses in its own draft and address them.

**Cons:** The primary limitations include multiple API calls making it more expensive, slower execution time due to the iterative nature, and higher token usage. It also requires more complex implementation and may not always lead to improvements if the critique criteria aren't well-defined.

**Best for:** High-stakes content like technical documentation, creative writing, editorial tasks, complex analysis, and any situation where the quality of the output is more important than speed or cost. It's particularly effective for tasks that benefit from multiple perspectives or iterative refinement.

**Note:** Self Reflexion is most effective when the critique criteria are explicitly defined and the model is given clear guidance on what aspects to evaluate. The quality of the final output depends heavily on the effectiveness of the self-critique step.
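
The draft, critique, and refine stages can be wrapped in a single function with the critique criteria passed in explicitly. This is a sketch under stated assumptions: `reflect_and_refine` and `fake_llm` are hypothetical names, the model call is injected as a plain callable so the control flow can be checked offline, and, unlike the cell above, the refine step also receives the draft.

```python
from typing import Callable, Dict, List


def reflect_and_refine(review: str, criteria: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Run draft -> critique -> refine, making one llm call per stage."""
    draft = llm(f"Summarize the following product review:\n\n{review}")
    critique = llm(
        "Critique the summary against these criteria: "
        f"{criteria}\n\nReview: {review}\n\nSummary: {draft}"
    )
    refined = llm(
        "Rewrite the summary so that it addresses the critique.\n\n"
        f"Review: {review}\n\nSummary: {draft}\n\nCritique: {critique}\n\n"
        "Refined summary:"
    )
    return {"draft": draft, "critique": critique, "refined": refined}


# Offline stand-in for a real model call; it only records prompts.
calls: List[str] = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"response {len(calls)}"


result = reflect_and_refine(
    "Light and powerful vacuum.",
    "covers every point the reviewer makes; adds nothing new",
    fake_llm,
)
print(result["refined"])  # response 3
```

Passing `generate_response` instead of `fake_llm` would send the three prompts to the API.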


## Chain of Thought Prompting

In [None]:
# Real-world example: Summarize a product review using Chain of Thought (using neutral review for balanced reasoning).

chain_of_thought_prompt = f"""
Summarize the following product review, thinking step by step:

Review: {NEUTRAL_REVIEW}

Let's think step by step to summarize this review.
1. Identify the product being reviewed.
2. List the positive and negative attributes mentioned about the product.
3. Combine these attributes into a concise summary.

Note: Write down your reasoning for each step inside <thought-process> Thoughts </thought-process> tags.

Summary:
"""

chain_of_thought_response = generate_response(chain_of_thought_prompt, max_tokens=300) # Increased max_tokens for detailed steps
print("Chain of Thought Response:")
print(chain_of_thought_response)

Chain of Thought Response:
<thought-process> 
1. The product being reviewed is a vacuum cleaner. 
2. Positive attributes mentioned are: lightweight, powerful, long-lasting battery (able to clean entire house on one charge), and useful attachments for tight spaces. There are no negative attributes mentioned. 
3. Combining these points, the summary should highlight the vacuum cleaner's lightweight design, strong performance, excellent battery life, and versatile attachments, concluding with a positive recommendation. 
</thought-process>

Summary: The vacuum cleaner is lightweight, powerful, has a long-lasting battery that can clean an entire house on one charge, and comes with useful attachments for tight spaces. It is highly recommended.


### Chain of Thought Prompting Description

**What it is:** Chain of Thought prompting requires the model to explicitly show its step-by-step reasoning process before providing the final answer. The model breaks down complex problems into smaller, manageable steps and shows its thinking process, making the reasoning transparent and traceable.

**When to use:** Chain of Thought is most effective for complex reasoning tasks, mathematical problems, multi-step logic problems, debugging tasks, and any situation where the reasoning process is as important as the final answer. It's particularly useful when you need to understand how the model arrived at its conclusion.

**Pros:** The main advantages include often better accuracy on multi-step reasoning tasks, transparent reasoning that allows for debugging and verification, easier identification of where the model might be going wrong, and the ability to follow the logical progression of the model's thinking.

**Cons:** The primary limitations include longer outputs that consume more tokens, potentially slower processing due to the detailed reasoning, and the possibility of exposing flawed reasoning processes. It may also be overkill for simple tasks.

**Best for:** Math problems, logic puzzles, code debugging, analytical reasoning, complex problem-solving, and any task where step-by-step thinking is beneficial. It's particularly effective when combined with phrases like "Let's think step by step" or "Show your work."

**Note:** Chain of Thought is particularly effective when the model is explicitly prompted to show its reasoning. The quality of the reasoning process often correlates with the accuracy of the final answer, making it a powerful tool for complex problem-solving.
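
When the reasoning is wrapped in `<thought-process>` tags, as in the prompt above, it can be separated from the final answer before the summary is shown or stored. The sketch below uses a hypothetical `split_reasoning_and_answer` helper and a hand-written sample response; it assumes the model followed the tag format.

```python
import re
from typing import Tuple

TAG_PATTERN = r"<thought-process>(.*?)</thought-process>"


def split_reasoning_and_answer(response: str) -> Tuple[str, str]:
    """Return (reasoning, answer) from a response that uses <thought-process> tags."""
    match = re.search(TAG_PATTERN, response, re.DOTALL)
    reasoning = match.group(1).strip() if match else ""
    # Everything outside the tags is treated as the answer.
    answer = re.sub(TAG_PATTERN, "", response, flags=re.DOTALL).strip()
    # Drop a leading "Summary:" label if the model included one.
    answer = re.sub(r"^Summary:\s*", "", answer)
    return reasoning, answer


sample = """<thought-process>
1. The product is a vacuum cleaner.
2. Positives: lightweight. Negatives: average battery.
</thought-process>

Summary: A lightweight vacuum with average battery life."""

reasoning, answer = split_reasoning_and_answer(sample)
print(answer)  # A lightweight vacuum with average battery life.
```

If the tags are missing, the helper returns an empty reasoning string and treats the whole response as the answer.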


## Tree of Thoughts Prompting

In [None]:
# Real-world example: Summarize a product review using a simulated Tree of Thoughts.
# Note: A true Tree of Thoughts implementation is more complex and would involve
# exploring multiple reasoning paths and evaluating them. This is a simplified simulation.

def tree_of_thoughts_prompting(initial_thought_prompt, expansion_prompt, evaluation_prompt, steps=2):
    """
    Simulates a basic Tree of Thoughts approach.

    Args:
        initial_thought_prompt: The initial prompt to generate a starting thought.
        expansion_prompt: Prompt to generate subsequent thoughts based on previous ones.
        evaluation_prompt: Prompt to evaluate the generated thoughts.
        steps: Number of expansion steps to simulate.

    Returns:
        The evaluated best thought or a summary of the process.
    """
    try:
        thoughts = [generate_response(initial_thought_prompt)]
        print("Initial Thought:", thoughts[0])

        for step in range(steps):
            new_thoughts = []
            for thought in thoughts:
                expansion_response = generate_response(f"{expansion_prompt}\n\nPrevious thought: {thought}")
                # Treat each non-empty line as a separate thought. This assumes one
                # thought per line, so headings and list markers also become "thoughts".
                expanded_thoughts = [t.strip() for t in expansion_response.split('\n') if t.strip()]
                new_thoughts.extend(expanded_thoughts)
            thoughts = new_thoughts
            print(f"\nThoughts after step {step + 1}:")
            for i, thought in enumerate(thoughts):
                print(f"  Thought {i+1}: {thought}")

        # Simple evaluation: Ask the model to pick the best thought or summarize
        evaluation_response = generate_response(f"{evaluation_prompt}\n\nThoughts to evaluate:\n" + "\n".join([f"- {t}" for t in thoughts]))

        return evaluation_response

    except Exception as e:
        return f"An error occurred: {e}"

# Example usage: summarize the positive review only, since this approach makes too many API calls to repeat for all three reviews.
initial_thought_prompt = "Generate an initial thought about summarizing a product review."
expansion_prompt = "Expand on the following thought to generate alternative approaches for summarizing a product review."
evaluation_prompt = f"Review the following thoughts for summarizing a product review and provide the best summary based on the original review: {POSITIVE_REVIEW}"


tree_of_thoughts_response = tree_of_thoughts_prompting(initial_thought_prompt, expansion_prompt, evaluation_prompt, steps=2)
print("\nTree of Thoughts Simulation Result:")
print(tree_of_thoughts_response)

Initial Thought: When summarizing a product review, it’s important to capture the overall sentiment—whether positive, negative, or mixed—while highlighting key points such as the product’s main features, performance, usability, and any common praises or complaints mentioned by the reviewer. This approach ensures the summary provides a clear and balanced snapshot for potential buyers.

Thoughts after step 1:
  Thought 1: Certainly! Expanding on the initial thought, here are several alternative approaches for summarizing a product review, each emphasizing different aspects or techniques to provide diverse perspectives and utility for potential buyers:
  Thought 2: 1. **Feature-Centric Summary**
  Thought 3: Focus primarily on the specific features of the product mentioned in the review, detailing how each feature performed or was perceived. This approach is useful when potential buyers are interested in particular functionalities rather than an overall sentiment.
  Thought 4: *Example:* 

### Tree of Thoughts Prompting Description

**What it is:** Tree of Thoughts prompting explores multiple reasoning paths simultaneously, evaluates different approaches, and selects the best solution. It's like having the model brainstorm multiple solutions and then pick the most promising one, similar to how humans might approach complex problems by considering various options.

**When to use:** Tree of Thoughts is ideal when multiple valid approaches exist for solving a problem, when creative problem-solving is needed, or when optimization is required. It's particularly effective for strategic planning, complex decision-making, and tasks where exploring the solution space thoroughly is beneficial.

**Pros:** The main advantages include thorough exploration of the solution space, finding optimal or creative solutions that might be missed with single-path approaches, better handling of ambiguous problems, and the ability to evaluate trade-offs between different approaches.

**Cons:** The primary limitations include very high cost due to multiple API calls (the number of calls grows with the branching factor and depth of the tree, so it can be many times more expensive than single-call methods), significantly slower execution, complex implementation requirements, and the need for sophisticated evaluation mechanisms to compare different approaches.

**Best for:** Strategic planning, creative brainstorming, complex optimization problems, decision-making tasks, and any situation where exploring multiple approaches is valuable. It's particularly effective for problems with no clear single solution path.

**Note:** This implementation is a simplified simulation of Tree of Thoughts. Production implementations would use more sophisticated evaluation mechanisms, parallel processing, and advanced pruning strategies to manage costs while maintaining effectiveness.
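
To illustrate the evaluation and pruning that the simulation above leaves out, here is a minimal beam-search sketch: at every level each thought is expanded, all candidates are scored, and only the best few are kept. The names `tree_of_thoughts`, `propose`, and `score` are hypothetical, and the two stand-in functions are deterministic toys so the search can run offline; in practice both would wrap model calls.

```python
from typing import Callable, List


def tree_of_thoughts(
    root: str,
    propose: Callable[[str], List[str]],
    score: Callable[[str], float],
    depth: int = 2,
    beam_width: int = 2,
) -> str:
    """Breadth-first search over thoughts, keeping the best `beam_width` per level."""
    frontier = [root]
    for _ in range(depth):
        candidates = [child for thought in frontier for child in propose(thought)]
        if not candidates:
            break
        # Pruning step: rank every candidate and keep only the top few.
        frontier = sorted(candidates, key=score, reverse=True)[:beam_width]
    return max(frontier, key=score)


def propose(thought: str) -> List[str]:
    # Toy expansion: extend the thought with one more aspect of the review.
    return [thought + " +battery", thought + " +attachments", thought + " +filler"]


def score(thought: str) -> float:
    # Toy heuristic: reward distinct review aspects, penalize filler.
    covered = {"+battery", "+attachments"} & set(thought.split())
    return len(covered) - thought.count("+filler")


best = tree_of_thoughts("summary:", propose, score, depth=2, beam_width=2)
print(best)
```

The `beam_width` parameter is what keeps the cost bounded: without it the number of thoughts, and therefore of model calls, multiplies at every level.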


## Meta Prompting

In [None]:
# Real-world example: Summarize product reviews using Meta Prompting.
def test_meta_prompting(review, review_type):
    meta_prompt = f"""
You are an expert in summarizing product reviews. Your goal is to provide a concise and informative summary of the given review, focusing only on the key positive and negative aspects mentioned.

Here is the product review:
Review: {review}

Provide the summary in the following format:
Summary: [Your concise summary here]
"""
    response = generate_response(meta_prompt)
    print(f"Meta Prompting Response ({review_type}):")
    print(response)
    print("-" * 50)

# Test with all three review types
test_meta_prompting(POSITIVE_REVIEW, "Positive")
test_meta_prompting(NEGATIVE_REVIEW, "Negative")
test_meta_prompting(NEUTRAL_REVIEW, "Neutral")

Meta Prompting Response:
Summary: The vacuum cleaner is lightweight, powerful, with a long-lasting battery. Attachments are useful for tight spaces. Highly recommended.


### Meta Prompting Description

**What it is:** Meta prompting assigns the model an explicit role with structured instructions and specific output format requirements. It's like giving the model a job description and clear guidelines on how to perform that job, ensuring consistent behavior across interactions and adherence to specific requirements.

**When to use:** Meta prompting is most effective when you need consistent behavior across multiple interactions, have complex task requirements, or are building production systems where reliability and consistency are crucial. It's ideal for API integrations, automated systems, and any situation where the model needs to maintain a specific persona or approach.

**Pros:** The main advantages include clear expectations that lead to more consistent outputs, improved task adherence through explicit role definition, better output formatting through structured instructions, and the ability to tailor behavior to specific use cases.

**Cons:** The primary limitations include the need for careful prompt engineering to define roles effectively, potential over-constraint that might limit creativity, and the requirement to test and refine role definitions for optimal performance.

**Best for:** Production systems, API integrations, specific output formats, role-based tasks, and any situation where consistency and reliability are more important than creativity. It works particularly well when combined with other strategies for production use.

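**Note:** Meta prompting works best when the role definition is clear and specific. The quality of results depends heavily on how well you can articulate the desired behavior and output format. It's often most effective when combined with other prompting strategies. Elsewhere, the term "meta prompting" is also used for having a model write or refine prompts; this notebook uses it for a role definition combined with an output-format specification.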
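
In a production setting, the role and format instructions are often placed in a system message so they can be reused across requests, and the response is checked against the requested format. The sketch below is illustrative: `SYSTEM_ROLE`, `build_messages`, and `extract_summary` are hypothetical names, and the notebook's `generate_response` would need to accept a message list (it currently sends a single user message) to use them.

```python
from typing import Dict, List

SYSTEM_ROLE = (
    "You are an expert in summarizing product reviews. Focus only on the key "
    "positive and negative aspects mentioned. Reply in the format:\n"
    "Summary: <your concise summary>"
)


def build_messages(review: str) -> List[Dict[str, str]]:
    """Put the reusable role in a system message and the review in a user message."""
    return [
        {"role": "system", "content": SYSTEM_ROLE},
        {"role": "user", "content": f"Review: {review}"},
    ]


def extract_summary(response: str) -> str:
    """Return the text after 'Summary:', or raise if the format was not followed."""
    prefix = "Summary:"
    text = response.strip()
    if not text.startswith(prefix):
        raise ValueError(f"response does not start with {prefix!r}")
    return text[len(prefix):].strip()


messages = build_messages("Lightweight, but the battery lasts 15 minutes.")
print(extract_summary("Summary: Lightweight; poor battery life."))  # Lightweight; poor battery life.
```

Raising on a malformed response, rather than passing it through, lets the calling code retry or fall back explicitly.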


## Strategy Comparison and Recommendations

### How Different Strategies Handle Various Review Types

**Positive Reviews:** All strategies tend to handle positive reviews well. Self Reflexion and Chain of Thought can produce more comprehensive summaries because they explicitly check each aspect of the product.

**Negative Reviews:** Few-Shot and Meta Prompting suit negative reviews because their examples and format instructions keep formatting and tone consistent. Without examples, Zero-Shot output may vary more in tone and length.

**Neutral Reviews:** Chain of Thought suits mixed reviews because it lists the positive and negative aspects separately before summarizing, while Tree of Thoughts can explore different ways to balance mixed feedback.

### When to Choose Each Strategy

**Choose Zero-Shot when:**
- You need quick results for simple tasks
- Token usage is a primary concern
- The task is straightforward and well-defined

**Choose Few-Shot when:**
- You need consistent output formatting
- Working with domain-specific content
- You have high-quality representative examples

**Choose Self Reflexion when:**
- Quality is more important than speed/cost
- Working with complex, high-stakes content
- You need the highest possible output quality

**Choose Chain of Thought when:**
- The reasoning process is important
- Working with complex analytical tasks
- You need to understand how conclusions are reached

**Choose Tree of Thoughts when:**
- Multiple valid approaches exist
- Creative problem-solving is needed
- Cost is not a primary concern

**Choose Meta Prompting when:**
- Building production systems
- Consistency across interactions is crucial
- You need specific output formats

### Trade-offs Summary

The ratings below are rough qualitative estimates, not measurements from this notebook.

| Strategy | Accuracy | Speed | Cost | Complexity | Best Use Case |
|----------|----------|-------|------|------------|---------------|
| Zero-Shot | Medium | Fast | Low | Low | Simple tasks |
| Few-Shot | High | Fast | Medium | Low | Consistent formatting |
| Self Reflexion | Very High | Slow | High | Medium | High-quality content |
| Chain of Thought | High | Medium | Medium | Low | Complex reasoning |
| Tree of Thoughts | Very High | Very Slow | Very High | High | Creative problems |
| Meta Prompting | High | Fast | Low | Medium | Production systems |
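
The Cost column can be made more concrete by counting API calls per request. The sketch below follows the structure of the cells in this notebook; the function names are hypothetical, and the Tree of Thoughts count assumes every expansion call yields exactly `branching` non-empty lines, which real model output will not guarantee.

```python
def calls_single_prompt() -> int:
    """Zero-shot, few-shot, chain of thought, and meta prompting: one call each."""
    return 1


def calls_self_reflexion(rounds: int = 1) -> int:
    """One draft, then a critique and a refinement per round."""
    return 1 + 2 * rounds


def calls_tree_of_thoughts(branching: int, steps: int) -> int:
    """One initial thought, one expansion per thought at each step, one evaluation."""
    expansions = sum(branching ** level for level in range(steps))
    return 1 + expansions + 1


print(calls_single_prompt())           # 1
print(calls_self_reflexion())          # 3
print(calls_tree_of_thoughts(4, 2))    # 1 + (1 + 4) + 1 = 7
print(calls_tree_of_thoughts(10, 2))   # 1 + (1 + 10) + 1 = 13
```

Call count is only part of the cost: later calls also carry longer prompts, because they include earlier thoughts or critiques.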
