# Task 3: Self-Reflection Prompt for Improving Output

## Overview
This notebook demonstrates **self-reflection** prompting, in the spirit of Reflexion and Self-Refine, where the model critiques and improves its own output. The same model acts as both creator and critic, refining its response over one or more rounds.

## Self-Reflection Pattern
1. **Generate**: Create an initial output
2. **Critique**: Analyze strengths and weaknesses
3. **Improve**: Generate an enhanced version based on the critique
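
The three steps can be sketched as a single function that takes any text-in, text-out model call. This is a minimal sketch, not the notebook's implementation: `reflect_once` and `fake_llm` are hypothetical names, and `fake_llm` is a canned stand-in so the example runs without an API key. The notebook itself merges the critique and improve steps into one prompt.

```python
from typing import Callable, Dict


def reflect_once(task: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Run one Generate -> Critique -> Improve cycle with a pluggable model call."""
    # 1. Generate: produce an initial draft
    draft = llm(f"Complete this task:\n{task}")
    # 2. Critique: ask the model to assess its own draft
    critique = llm(
        f"Critique this draft. List strengths and weaknesses.\n\nDraft:\n{draft}"
    )
    # 3. Improve: rewrite the draft using the critique
    improved = llm(
        "Rewrite the draft so that it addresses the critique.\n\n"
        f"Draft:\n{draft}\n\nCritique:\n{critique}"
    )
    return {"draft": draft, "critique": critique, "improved": improved}


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call (assumption: canned replies)."""
    if prompt.startswith("Complete"):
        return "Rate limiting caps requests."
    if prompt.startswith("Critique"):
        return "Weakness: no timeframe mentioned."
    return "Rate limiting caps requests per time window."


result = reflect_once("Summarize API rate limiting.", fake_llm)
print(result["improved"])
```

To use a real model, pass a function that wraps the client call and returns the response text in place of `fake_llm`.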

## Use Case
Generate a summary of a technical concept, then have the AI critique and improve it.

In [1]:
# Import required libraries
import os
from google import genai
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

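# Initialize the Gemini client; fail early if the API key is missing
api_key = os.getenv('GEMINI_API_KEY')
if not api_key:
    raise RuntimeError("GEMINI_API_KEY is not set; add it to your environment or .env file")
client = genai.Client(api_key=api_key)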

print("✓ Gemini client initialized successfully")

✓ Gemini client initialized successfully


## Step 1: Generate Initial Summary
First, we'll ask the AI to create a summary of a technical concept.

In [2]:
initial_prompt = """
Summarize the concept of "API Rate Limiting" in 3-4 sentences. Explain what it is, why it's important, and how it works.
"""

response_1 = client.models.generate_content(
    model='gemini-2.0-flash-exp',
    contents=initial_prompt
)

original_summary = response_1.text

print("ORIGINAL SUMMARY:")
print("=" * 80)
print(original_summary)
print("=" * 80)

ORIGINAL SUMMARY:
API rate limiting restricts the number of requests a user or application can make to an API within a specific timeframe. This is crucial for protecting the API from abuse, preventing overload, and ensuring fair usage for all users.  Rate limiting is typically implemented by tracking API requests and rejecting requests that exceed the defined limit, often returning an error code indicating the limit has been reached. This helps maintain API stability and availability.



## Step 2: Self-Critique and Improvement
Now we'll ask the AI to critique its own summary and generate an improved version.

In [3]:
reflection_prompt = f"""
Review the following summary and critique it based on these criteria:
1. Clarity - Is it easy to understand?
2. Completeness - Does it cover all key aspects?
3. Accuracy - Is the information correct?
4. Conciseness - Is it appropriately brief without losing important details?

Original Summary:
{original_summary}

Provide:
1. A critique identifying specific strengths and weaknesses
2. An improved version of the summary that addresses the weaknesses

Format your response as:
CRITIQUE:
[Your detailed critique]

IMPROVED SUMMARY:
[Your enhanced version]
"""

response_2 = client.models.generate_content(
    model='gemini-2.0-flash-exp',
    contents=reflection_prompt
)

reflection_output = response_2.text

print("\nSELF-REFLECTION OUTPUT:")
print("=" * 80)
print(reflection_output)
print("=" * 80)


SELF-REFLECTION OUTPUT:
CRITIQUE:

1.  **Clarity:** The summary is generally clear and easy to understand for someone familiar with APIs. However, it could benefit from slightly more explanation for a complete novice.

2.  **Completeness:** The summary covers the core purpose and function of rate limiting but misses some important aspects, such as the different types of rate limiting (e.g., user-based, IP-based), how the timeframe is defined, and the variety of responses an API might return upon hitting a limit. It also doesn't mention the different algorithms used to implement rate limiting (e.g., token bucket, leaky bucket).

3.  **Accuracy:** The information presented is accurate.

4.  **Conciseness:** The summary is reasonably concise and avoids unnecessary jargon. It's a good length for a brief overview.

**Strengths:**

*   Clearly explains the purpose of rate limiting.
*   Concise and easy to read.
*   Accurately describes the general mechanism.

**Weaknesses:**

*   Lacks deta

## Comparison: Before and After
Let's extract and compare the original and improved summaries side-by-side.

In [4]:
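# Extract the improved summary from the reflection output.
# Fall back to the original summary so later cells never refine a placeholder string.
marker = "IMPROVED SUMMARY:"
if marker in reflection_output:
    improved_summary = reflection_output.split(marker, 1)[1].strip()
else:
    print("⚠ Marker not found; falling back to the original summary")
    improved_summary = original_summary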

print("\n" + "=" * 80)
print("COMPARISON: ORIGINAL vs IMPROVED")
print("=" * 80)
print("\n📝 ORIGINAL SUMMARY:")
print("-" * 80)
print(original_summary)
print("\n✨ IMPROVED SUMMARY:")
print("-" * 80)
print(improved_summary)
print("=" * 80)


COMPARISON: ORIGINAL vs IMPROVED

📝 ORIGINAL SUMMARY:
--------------------------------------------------------------------------------
API rate limiting restricts the number of requests a user or application can make to an API within a specific timeframe. This is crucial for protecting the API from abuse, preventing overload, and ensuring fair usage for all users.  Rate limiting is typically implemented by tracking API requests and rejecting requests that exceed the defined limit, often returning an error code indicating the limit has been reached. This helps maintain API stability and availability.


✨ IMPROVED SUMMARY:
--------------------------------------------------------------------------------
API rate limiting controls the number of requests a user or application can make to an API within a specific timeframe, such as per minute, hour, or day. This is essential for protecting the API from abuse (e.g., denial-of-service attacks), preventing overload, ensuring fair usage among a

## Additional Example: Multi-Round Reflection
Let's demonstrate multiple rounds of self-reflection to further improve the output.

In [5]:
# Second round of reflection
second_reflection_prompt = f"""
Review this improved summary one more time and make it even better by:
1. Adding a concrete example if missing
2. Ensuring technical accuracy
3. Making it more engaging while staying concise

Current Summary:
{improved_summary}

Provide only the final improved version without critique.
"""

response_3 = client.models.generate_content(
    model='gemini-2.0-flash-exp',
    contents=second_reflection_prompt
)

final_summary = response_3.text

print("\n🎯 FINAL SUMMARY (After 2 Rounds of Reflection):")
print("=" * 80)
print(final_summary)
print("=" * 80)


🎯 FINAL SUMMARY (After 2 Rounds of Reflection):
API rate limiting is a critical mechanism for safeguarding APIs by controlling the volume of requests a user or application can make within a given timeframe (e.g., per minute or hour). Imagine a popular e-commerce API: Without rate limiting, a malicious bot could flood the API with requests, potentially crashing the service and preventing legitimate users from accessing it. Rate limiting prevents this abuse, protects against denial-of-service (DoS) attacks, and ensures fair resource allocation. These limits are typically enforced based on user ID, API key, or IP address. When a limit is exceeded, the API responds with a "429 Too Many Requests" error, informing the user to retry after a specific period. Algorithms like the token bucket or leaky bucket are commonly used to implement these controls, providing nuanced approaches to traffic shaping and API protection.
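
The final summary names the token bucket algorithm and the "429 Too Many Requests" response without showing how they fit together. For readers who want to check the summary's technical claims, here is a minimal token bucket sketch. The class name, parameters, and the injectable `clock` are illustrative assumptions; production limiters also need per-client state and thread safety.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(
        self,
        capacity: float,
        rate: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.rate = rate
        self.clock = clock  # injectable so the example can use a fake clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def status_for(bucket: TokenBucket) -> int:
    """Map the limiter decision to an HTTP status code."""
    return 200 if bucket.allow() else 429


# Fake clock (assumption for the demo) so the result does not depend on real time
fake_now = [0.0]
bucket = TokenBucket(capacity=2, rate=1.0, clock=lambda: fake_now[0])

statuses = [status_for(bucket) for _ in range(3)]  # burst of 3 at t=0
fake_now[0] = 1.0                                  # one second later
statuses.append(status_for(bucket))
print(statuses)  # [200, 200, 429, 200]
```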



## Demonstrating with a Different Topic
Let's apply self-reflection to another technical concept to show the technique's versatility.

In [6]:
# Generate initial summary for a different topic
topic_prompt = """
Explain "Docker containers" in 3-4 sentences for someone new to software development.
"""

response_docker_1 = client.models.generate_content(
    model='gemini-2.0-flash-exp',
    contents=topic_prompt
)

docker_summary_v1 = response_docker_1.text

print("Example 2: Docker Containers")
print("\n📝 Initial Summary:")
print("-" * 80)
print(docker_summary_v1)
print("-" * 80)

Example 2: Docker Containers

📝 Initial Summary:
--------------------------------------------------------------------------------
Docker containers are like lightweight, self-contained packages that hold everything an application needs to run, including code, libraries, and settings. They isolate applications from each other and the underlying operating system, ensuring consistency across different environments. This means your application will run the same way on your computer as it does on a server, regardless of the underlying infrastructure. Think of them like shipping containers for software, making deployment and scaling much easier.

--------------------------------------------------------------------------------


In [7]:
# Apply self-reflection
docker_reflection_prompt = f"""
Critique this explanation of Docker containers for a beginner:

{docker_summary_v1}

Consider:
- Is it beginner-friendly?
- Are there any technical terms that need simplification?
- Would an analogy help?
- Is anything important missing?

Provide:
CRITIQUE: [Your analysis]
IMPROVED VERSION: [Better explanation]
"""

response_docker_2 = client.models.generate_content(
    model='gemini-2.0-flash-exp',
    contents=docker_reflection_prompt
)

docker_reflection = response_docker_2.text

print("\n🔍 Self-Reflection Process:")
print("=" * 80)
print(docker_reflection)
print("=" * 80)


🔍 Self-Reflection Process:
**CRITIQUE:**

The explanation is generally good for a beginner, but it could be improved by simplifying some of the language and adding a stronger analogy.

*   **Beginner-Friendly:** Mostly yes. However, "underlying operating system" might be slightly confusing.
*   **Technical Terms:** "Underlying infrastructure" could be simplified. "Libraries" could be explained very briefly, especially for someone completely new to development.
*   **Analogy:** The "shipping containers" analogy is good, but it could be pushed further to be more impactful. It needs to highlight the advantage of *standardization* that both shipping containers and Docker containers provide.
*   **Missing:** While the explanation touches on isolation and consistency, it doesn't explicitly mention *portability* as a key benefit, which is closely tied to the container's self-contained nature. Also, it doesn't touch upon the *speed* and *efficiency* benefits compared to traditional Virtual Ma
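
The two reflection prompts in this notebook share one shape: a list of criteria, the text under review, and a fixed response format. A small builder makes that shape reusable across topics. This is a sketch; `build_reflection_prompt` and its parameters are hypothetical names introduced for illustration.

```python
from typing import Sequence


def build_reflection_prompt(
    text: str,
    criteria: Sequence[str],
    improved_label: str = "IMPROVED VERSION",
) -> str:
    """Assemble a critique-and-improve prompt from a list of criteria."""
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(criteria, start=1))
    return (
        "Critique the following text against these criteria:\n"
        f"{numbered}\n\n"
        f"Text:\n{text}\n\n"
        "Format your response as:\n"
        "CRITIQUE:\n[Your analysis]\n\n"
        f"{improved_label}:\n[Your improved text]\n"
    )


prompt = build_reflection_prompt(
    "Docker containers are like lightweight packages.",
    ["Is it beginner-friendly?", "Is anything important missing?"],
)
print(prompt)
```

Keeping the label in one parameter means the prompt and whatever parses the response can share the same string.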

## Summary: Self-Reflection Effectiveness

In [8]:
print("\n" + "=" * 80)
print("SELF-REFLECTION PROMPTING SUMMARY")
print("=" * 80)
print("\n✓ Demonstrated iterative improvement through self-critique")
print("✓ Applied to multiple topics (API Rate Limiting, Docker Containers)")
print("✓ Showed multi-round reflection for continuous refinement")
print("\nKey Insight: Each reflection cycle identified specific weaknesses and")
print("produced a revised output that addressed them.")
print("=" * 80)


SELF-REFLECTION PROMPTING SUMMARY

✓ Demonstrated iterative improvement through self-critique
✓ Applied to multiple topics (API Rate Limiting, Docker Containers)
✓ Showed multi-round reflection for continuous refinement

Key Insight: Each reflection cycle identified specific weaknesses and
produced a revised output that addressed them.


## Key Takeaways

**Self-Reflection Prompting Benefits:**
1. **Quality Improvement** - Outputs often improve from one iteration to the next, though improvement is not guaranteed
2. **Self-Correction** - The model can identify and fix some of its own mistakes
3. **No Additional Training** - Improves outputs without fine-tuning
4. **Transparency** - Shows what aspects were improved and why
5. **Versatility** - Works across different types of content and domains

**When to Use Self-Reflection:**
- Content creation that needs high quality (articles, summaries, documentation)
- Code review and optimization
- Problem-solving where initial solutions may be suboptimal
- Educational contexts, to show the improvement process
- Any task where quality matters more than speed

**Best Practices:**
1. **Specific Criteria** - Define clear evaluation dimensions (clarity, accuracy, etc.)
2. **Structured Format** - Use consistent format (CRITIQUE → IMPROVED VERSION)
3. **Multiple Rounds** - Consider 2-3 rounds for complex tasks
4. **Cost vs Quality** - Each reflection round adds at least one API call, so weigh the extra latency and cost against the quality gain
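
The "Multiple Rounds" and "Cost vs Quality" practices can be combined in a loop with a round cap and an early stop. The sketch below assumes a text-in, text-out model function; `refine`, `counting_llm`, and the canned behavior of the stand-in are hypothetical and exist only so the example runs offline.

```python
from typing import Callable, List


def refine(draft: str, llm: Callable[[str], str], max_rounds: int = 3) -> List[str]:
    """Return every version produced, stopping early when a round changes nothing."""
    versions = [draft]
    for _ in range(max_rounds):
        prompt = (
            "Improve this summary. Return only the new version."
            f"\n\n{versions[-1]}"
        )
        revised = llm(prompt).strip()
        if revised == versions[-1]:
            break  # no change: further rounds would only add cost
        versions.append(revised)
    return versions


calls = []  # records each prompt so the number of model calls is visible


def counting_llm(prompt: str) -> str:
    """Hypothetical stand-in: adds one missing detail, then returns text unchanged."""
    calls.append(prompt)
    text = prompt.split("\n\n", 1)[1]
    detail = " It returns HTTP 429 when the limit is exceeded."
    return text if detail in text else text + detail


versions = refine("Rate limiting caps requests.", counting_llm)
print(len(versions), "versions,", len(calls), "model calls")
```

With a real model, an exact-match stop rarely triggers because the text usually changes slightly each round, so the `max_rounds` cap is what bounds the cost in practice.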

**Research Findings:**
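- Self-Refine (Madaan et al., 2023) reports that even strong models such as GPT-4 improve when they iteratively critique and revise their own output
- Reflexion (Shinn et al., 2023) reports that agents using verbal self-reflection improve on decision-making, reasoning, and coding tasks
- Self-Refine also reports gains on tasks such as sentiment reversal and code optimization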

**Comparison to Regular Prompting:**
- **Regular**: Single generation → Done
- **Self-Reflection**: Generate → Critique → Improve → (Repeat) → Higher quality output