# TextGrad Tutorials: Self-Verification

![TextGrad](https://github.com/vinid/data/blob/master/logo_full.png?raw=true)

An autograd engine -- for textual gradients!

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zou-group/TextGrad/blob/main/examples/notebooks/Prompt-Optimization.ipynb)
[![GitHub license](https://img.shields.io/badge/License-MIT-blue.svg)](https://lbesson.mit-license.org/)
[![Arxiv](https://img.shields.io/badge/arXiv-2406.07496-B31B1B.svg)](https://arxiv.org/abs/2406.07496)
[![Documentation Status](https://readthedocs.org/projects/textgrad/badge/?version=latest)](https://textgrad.readthedocs.io/en/latest/?badge=latest)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/textgrad)](https://pypi.org/project/textgrad/)
[![PyPI](https://img.shields.io/pypi/v/textgrad)](https://pypi.org/project/textgrad/)

**Objectives for this tutorial:**

* This tutorial shows how to use self-verification in TextGrad to make optimization more reliable. Self-verification checks the updates suggested by the language model and helps detect and correct hallucinations or errors before they are applied.

**Requirements:**

* You need an API key for the language model you use, set as an environment variable (for OpenAI models, OPENAI_API_KEY). This tutorial uses Gemini 1.5 Pro, so set the corresponding key for that provider.

## 1. Setup and Configuration

First, let's set up our environment and import the necessary modules. We'll use Gemini 1.5 Pro as our language model.

In [25]:
import os
import sys
from dotenv import load_dotenv

# Load environment variables (API keys)
load_dotenv()

True
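
Before going further, it can help to confirm that the keys were actually loaded. The following is a minimal sketch, not part of TextGrad: `missing_env_vars` is a hypothetical helper, and `OPENAI_API_KEY` is used only because it is the variable named in the requirements; substitute the variable your provider expects.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names in `names` that are unset or empty in `env`.

    Hypothetical helper for this tutorial; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# OPENAI_API_KEY is the variable named in the requirements; add the key
# for whichever provider backs your engine.
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Passing a plain dict as `env` makes the check easy to try without touching the real environment.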

In [26]:
# If you're using a local development version of TextGrad with your custom classes,
# you might need to add your development directory to the path
dev_path = "/Users/eugeniusms/Development/SKRIPSI"
if dev_path not in sys.path:
    sys.path.insert(0, dev_path)

In [27]:
import textgrad as tg
from textgrad.optimizer import VerifiedTextualGradientDescent

In [None]:
# ONLY FOR TESTING: this cell should be replaced by a library-based import
import sys
import os

# Remove the installed version from sys.modules
if 'textgrad' in sys.modules:
    del sys.modules['textgrad']

# Remove any textgrad submodules
for key in list(sys.modules.keys()):
    if key.startswith('textgrad.'):
        del sys.modules[key]

# Add your local development directory to the Python path
# This needs to be the parent directory of textgrad
dev_path = "/Users/eugeniusms/Development/SKRIPSI"
if dev_path not in sys.path:
    sys.path.insert(0, dev_path)

# Now import your local version
import textgrad as tg
print(f"Imported TextGrad from: {tg.__file__}")

# Try importing your custom class
from textgrad.optimizer import VerifiedTextualGradientDescent

## 2. Set Up the Optimization Problem

We'll create a small optimization problem to demonstrate self-verification. The initial solution is deliberately flawed, so the optimizer has something to fix.

In [29]:
# Set up engine
tg.set_backward_engine("gemini-1.5-pro", override=True)

In [32]:
# Initialize variable with a deliberately flawed solution
problem = "If a circle has a radius of 5 cm, what is its area? Reason step by step."

initial_solution = """To find the area of a circle with radius 5 cm, I'll use the formula A = 2πr.

Given:
- Radius (r) = 5 cm

Step 1: Substitute the radius into the formula.
A = 2π x 5

Step 2: Calculate the result.
A = 2 x 3.14 x 5
A = 31.4

Therefore, the area of the circle is 31.4 square centimeters."""

solution = tg.Variable(initial_solution, 
                      requires_grad=True,
                      role_description="solution to the problem with step-by-step reasoning")
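
As a quick sanity check on why this initial solution is flawed, the arithmetic can be reproduced directly. This sketch uses only the standard library and the numbers from the problem statement; the variable names are illustrative.

```python
import math

radius_cm = 5.0

# What the flawed solution computes: 2πr is the circumference, not the area.
circumference_cm = 2 * math.pi * radius_cm

# What the problem asks for: the area, πr².
area_cm2 = math.pi * radius_cm ** 2

print(f"2πr = {circumference_cm:.2f} cm")   # 31.42
print(f"πr² = {area_cm2:.2f} cm²")          # 78.54
```

The flawed solution's 31.4 is the circumference computed with π ≈ 3.14, which is why the numbers look plausible even though the formula is wrong.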

In [33]:
# Define loss function
loss_fn = tg.TextLoss("""You are evaluating a solution to a geometry problem. 
Be logical, precise, and critical. Identify any issues with formulas, calculations, or reasoning.
Provide specific feedback on what needs to be corrected.""")

In [34]:
# Create optimizer with verification
optimizer = VerifiedTextualGradientDescent(
    parameters=[solution],
    verification_strategy="process",  # Use process-supervised verification
    verification_threshold=0.7,  # Only apply corrections with confidence >= 0.7
    verbose=1  # Show more details during optimization
)
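
To make the role of `verification_threshold` concrete, here is one possible reading of the comment "only apply corrections with confidence >= 0.7", written as plain Python. This is an assumption about the intended behavior, not the optimizer's actual implementation; `choose_update` and its parameters are hypothetical names.

```python
def choose_update(proposed, is_valid, confidence, corrections, threshold=0.7):
    """Pick the text to keep after verification (hypothetical sketch).

    Assumed rule: the verifier's correction replaces the proposed update
    only when the update was judged invalid, a correction exists, and the
    verifier's confidence reaches the threshold. Otherwise the proposed
    update is kept unchanged.
    """
    if not is_valid and corrections is not None and confidence >= threshold:
        return corrections
    return proposed


print(choose_update("proposed text", False, 0.9, "corrected text"))  # corrected text
print(choose_update("proposed text", False, 0.5, "corrected text"))  # proposed text
```

Under this reading, a higher threshold makes the optimizer more conservative about overriding the proposed update.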

## 3. Run the Optimization Loop

Now let's run three iterations of optimization with verification. We'll see how the self-verification mechanism detects and corrects errors in the optimization process.
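
The cells in this section repeat the same three calls by hand. The same pattern can be sketched as a loop; `run_iterations` is a hypothetical helper, and the `zero_grad()` call is an assumption that the optimizer follows the usual TextGrad optimizer interface (the cells in this tutorial do not call it).

```python
def run_iterations(variable, loss_fn, optimizer, n_iterations=3):
    """Run `n_iterations` optimization steps and record the value after each.

    Hypothetical helper: `variable` needs a `.value`, `loss_fn(variable)`
    must return an object with `.backward()`, and `optimizer` needs `.step()`.
    """
    history = []
    for _ in range(n_iterations):
        # Assumption: clear feedback from the previous step when supported.
        zero_grad = getattr(optimizer, "zero_grad", None)
        if callable(zero_grad):
            zero_grad()
        loss = loss_fn(variable)
        loss.backward()
        optimizer.step()
        history.append(variable.value)
    return history


# Usage with the objects defined in this tutorial (requires API access):
# history = run_iterations(solution, loss_fn, optimizer, n_iterations=3)
```

Keeping the value after every step makes it easy to compare iterations afterwards.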

In [35]:
# Print the initial solution
print("Initial solution:")
print(solution.value)
print()

# Run first iteration
print("Iteration 1...")
loss = loss_fn(solution)
loss.backward()
optimizer.step()

Initial solution:
To find the area of a circle with radius 5 cm, I'll use the formula A = 2πr.

Given:
- Radius (r) = 5 cm

Step 1: Substitute the radius into the formula.
A = 2π x 5

Step 2: Calculate the result.
A = 2 x 3.14 x 5
A = 31.4

Therefore, the area of the circle is 31.4 square centimeters.

Iteration 1...
-----------------------VerifiedTextualGradientDescent------------------------
To find the area of a circle with radius 5 cm, I'll use the formula A = πr².

Given:
- Radius (r) = 5 cm

Step 1: Substitute the radius into the formula.
A = π * (5 cm)²

Step 2: Calculate the result.
A = π * 25 cm²
A ≈ 3.14 * 25 cm² (Using 3.14 as an approximation for π)
A ≈ 78.5 cm²

Therefore, the area of the circle is approximately 78.5 square centimeters. For a more precise answer, use a more accurate value of π (e.g., 3.14159).


In [36]:
# Print solution after iteration 1
print("Solution after iteration 1:\n")
print(solution.value)
print()

# Run second iteration
print("Iteration 2...")
loss = loss_fn(solution)
loss.backward()
optimizer.step()

Solution after iteration 1:

To find the area of a circle with radius 5 cm, I'll use the formula A = πr².

Given:
- Radius (r) = 5 cm

Step 1: Substitute the radius into the formula.
A = π * (5 cm)²

Step 2: Calculate the result.
A = π * 25 cm²
A ≈ 3.14 * 25 cm² (Using 3.14 as an approximation for π)
A ≈ 78.5 cm²

Therefore, the area of the circle is approximately 78.5 square centimeters. For a more precise answer, use a more accurate value of π (e.g., 3.14159).

Iteration 2...
-----------------------VerifiedTextualGradientDescent------------------------
To find the area of a circle with radius 5 cm, I'll use the formula A = πr².

Given:
- Radius (r) = 5 cm

Step 1: Substitute the radius into the formula.
A = π * (5 cm)²

Step 2: Simplify the expression.
A = π * 25 cm²

Step 3: Calculate the result using π ≈ 3.14159 (a more precise approximation for π).
A ≈ 3.14159 * 25 cm²
A ≈ 78.53975 cm²

Step 4: Round the result to two decimal places.
A ≈ 78.54 cm²

Therefore, the area of the circl

In [10]:
# Print solution after iteration 2
print("Solution after iteration 2:\n")
print(solution.value)
print()

# Run third iteration
print("Iteration 3...")
loss = loss_fn(solution)
loss.backward()
optimizer.step()

Solution after iteration 2:

To find the area of a circle with radius 5 cm, I'll use the formula A = πr².

Given:
- Radius (r) = 5 cm

Step 1: Identify the correct formula for the area of a circle.
The area of a circle is given by A = πr², where r is the radius.

Step 2: Substitute the radius into the formula.
A = π × (5 cm)²

Step 3: Calculate the squared radius.
r² = 5² = 25 cm²

Step 4: Multiply by π.
A = π × 25 cm²
A = 78.54 cm² (using π ≈ 3.14159)

Therefore, the area of the circle with radius 5 cm is 78.54 square centimeters.

Iteration 3...


In [37]:
# Print final solution
print("Final solution after iteration 3:\n")
print(solution.value)

Final solution after iteration 3:

To find the area of a circle with radius 5 cm, I'll use the formula A = πr².

Given:
- Radius (r) = 5 cm

Step 1: Substitute the radius into the formula.
A = π * (5 cm)²

Step 2: Simplify the expression.
A = π * 25 cm²

Step 3: Calculate the result using π ≈ 3.14159 (a more precise approximation for π).
A ≈ 3.14159 * 25 cm²
A ≈ 78.53975 cm²

Step 4: Round the result to two decimal places.
A ≈ 78.54 cm²

Therefore, the area of the circle is approximately 78.54 square centimeters. Using a more precise value of π results in a more accurate calculation.  While 3.14 is a common approximation, using more decimal places of π minimizes the rounding error and provides a closer approximation to the true area.


## 4. Demonstrating the Benefits of Verification

Now let's try another example to show how verification helps catch and correct errors during optimization.

In [38]:
# Let's try another example with a different problem
problem2 = "If it takes 1 hour to dry 25 shirts under the sun, how long will it take to dry 30 shirts under the sun? Reason step by step."

initial_solution2 = """To determine how long it will take to dry 30 shirts under the sun, I'll set up a proportion.

Let's call the time to dry 30 shirts 't'.

We know that:
- 25 shirts take 1 hour
- 30 shirts take t hours

Using a direct proportion:
25/1 = 30/t

Solving for t:
25t = 30
t = 30/25
t = 1.2

Therefore, it will take 1.2 hours (or 1 hour and 12 minutes) to dry 30 shirts under the sun."""

solution2 = tg.Variable(initial_solution2, 
                       requires_grad=True,
                       role_description="solution to the problem with step-by-step reasoning")
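
The flaw in this initial solution is the modeling assumption rather than the arithmetic. The sketch below contrasts the two readings of the problem; both function names are illustrative, and the parallel model assumes there is enough sun and space for all shirts to dry at once.

```python
def proportional_time(base_shirts, base_hours, shirts):
    """Drying time if it scaled linearly with the number of shirts."""
    return base_hours * shirts / base_shirts


def parallel_time(base_hours):
    """Drying time if all shirts dry side by side at the same rate."""
    return base_hours


print(proportional_time(25, 1.0, 30))  # 1.2 hours, the flawed answer
print(parallel_time(1.0))              # 1.0 hour under the parallel assumption
```

The proportional model reproduces the 1.2 hours (72 minutes) in the initial solution, which only holds if adding shirts slows the drying of each one.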

In [39]:
# Define loss function for the second problem
loss_fn2 = tg.TextLoss("""You are evaluating a solution to a problem. Be logical and critical. 
Identify any issues with the solution's assumptions or reasoning.
Is the solution correct and is the reasoning valid? Provide specific feedback.""")

In [40]:
# Create an optimizer with verification
verified_optimizer = VerifiedTextualGradientDescent(
    parameters=[solution2],
    verification_strategy="process",
    verification_threshold=0.7,
    verbose=1
)

In [41]:
# Print the initial solution
print("Initial solution for second problem:")
print(solution2.value)
print()

print("Running optimization with verification...")
loss = loss_fn2(solution2)
loss.backward()
verified_optimizer.step()

Initial solution for second problem:
To determine how long it will take to dry 30 shirts under the sun, I'll set up a proportion.

Let's call the time to dry 30 shirts 't'.

We know that:
- 25 shirts take 1 hour
- 30 shirts take t hours

Using a direct proportion:
25/1 = 30/t

Solving for t:
25t = 30
t = 30/25
t = 1.2

Therefore, it will take 1.2 hours (or 1 hour and 12 minutes) to dry 30 shirts under the sun.

Running optimization with verification...
-----------------------VerifiedTextualGradientDescent------------------------
Estimating the drying time for 30 shirts based solely on the time it takes to dry 25 shirts is an oversimplification.  Drying time is heavily influenced by environmental factors like sunlight intensity, temperature, humidity, and airflow, not just the number of shirts.  A simple proportion doesn't capture this complexity.

While we can use a proportion as a starting point, it only provides a rough estimate *if* the additional shirts don't significantly affect t

In [42]:
# Print solution after verification
print("Solution after verification:\n")
print(solution2.value)

Solution after verification:

Estimating the drying time for 30 shirts based solely on the time it takes to dry 25 shirts is an oversimplification.  Drying time is heavily influenced by environmental factors like sunlight intensity, temperature, humidity, and airflow, not just the number of shirts.  A simple proportion doesn't capture this complexity.

While we can use a proportion as a starting point, it only provides a rough estimate *if* the additional shirts don't significantly affect the sunlight and airflow each shirt receives.  Let's calculate that proportional estimate:

25 shirts / 1 hour = 30 shirts / t hours

Solving for t:  t = 30/25 = 1.2 hours (72 minutes)

However, this assumes the drying rate remains constant. In reality, if there's enough sunlight and space for all 30 shirts, adding 5 more might not change the drying time substantially.  The environmental conditions are often the limiting factor, not the number of shirts (within reasonable limits).

Therefore, a more r

## 5. Understanding the Verification Process

Let's take a closer look at how verification works under the hood. With the "process" strategy, the verifier examines each reasoning step in the proposed update.
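
To illustrate what "each step of reasoning" could mean for the solutions in this tutorial, the sketch below splits a solution into its `Step N:` chunks. It is a hypothetical helper, not the verifier's internals, and it assumes each step ends at the first blank line, as in the solutions shown here.

```python
import re


def split_steps(solution_text):
    """Split a solution into 'Step N:' chunks (hypothetical helper)."""
    # Split just before every line that starts with "Step <number>:".
    parts = re.split(r"(?m)^(?=Step \d+:)", solution_text)
    # Keep only the step chunks and cut each one at its first blank line.
    return [p.split("\n\n", 1)[0].strip() for p in parts if p.startswith("Step")]


example = (
    "I'll use the formula A = πr².\n\n"
    "Step 1: Substitute the radius into the formula.\nA = π × 5²\n\n"
    "Step 2: Calculate the result.\nA = π × 25\n\n"
    "Therefore, the area is 78.54 square centimeters."
)
for step in split_steps(example):
    print(step)
    print("---")
```

Each chunk could then be checked on its own, which is the idea behind step-by-step (process) verification.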

In [43]:
# Let's look at a verification example
print("Verification example:\n")
print("Original solution:")
print(initial_solution)
print()
print("Proposed update:")
print("To find the area of a circle with radius 5 cm, I'll use the formula A = πr².\n\nGiven:\n- Radius (r) = 5 cm\n\nStep 1: Substitute the radius into the formula.\nA = π × 5²\n\nStep 2: Calculate the result.\nA = π × 25\nA = 3.14159 × 25\nA = 78.54 (rounded to 2 decimal places)\n\nTherefore, the area of the circle is 78.54 square centimeters.")

Verification example:

Original solution:
To find the area of a circle with radius 5 cm, I'll use the formula A = 2πr.

Given:
- Radius (r) = 5 cm

Step 1: Substitute the radius into the formula.
A = 2π x 5

Step 2: Calculate the result.
A = 2 x 3.14 x 5
A = 31.4

Therefore, the area of the circle is 31.4 square centimeters.

Proposed update:
To find the area of a circle with radius 5 cm, I'll use the formula A = πr².

Given:
- Radius (r) = 5 cm

Step 1: Substitute the radius into the formula.
A = π × 5²

Step 2: Calculate the result.
A = π × 25
A = 3.14159 × 25
A = 78.54 (rounded to 2 decimal places)

Therefore, the area of the circle is 78.54 square centimeters.


In [44]:
# Create a testing function to simulate the verification
def test_verification(original, updated, objective, strategy="process"):
    """Simulate verification process"""
    from textgrad.verification import get_verifier
    
    # Get verification engine
    engine = tg.SingletonBackwardEngine().get_engine()
    
    # Create verifier
    verifier = get_verifier(strategy, engine)
    
    # Create original variable
    original_var = tg.Variable(original, 
                             requires_grad=True,
                             role_description="solution to the problem with step-by-step reasoning")
    
    # Run verification
    is_valid, confidence, corrections = verifier.verify_update(
        original_var, updated, objective)
    
    return {
        "is_valid": is_valid,
        "confidence": confidence,
        "corrections": corrections
    }

In [45]:
# Test verification with different updates
correct_update = "To find the area of a circle with radius 5 cm, I'll use the formula A = πr².\n\nGiven:\n- Radius (r) = 5 cm\n\nStep 1: Substitute the radius into the formula.\nA = π × 5²\n\nStep 2: Calculate the result.\nA = π × 25\nA = 3.14159 × 25\nA = 78.54 (rounded to 2 decimal places)\n\nTherefore, the area of the circle is 78.54 square centimeters."

flawed_update = "To find the area of a circle with radius 5 cm, I'll use the formula A = 2πr.\n\nGiven:\n- Radius (r) = 5 cm\n\nStep 1: Substitute the radius into the formula.\nA = 2π × 5\n\nStep 2: Calculate the result.\nA = 2 × 3.14 × 5\nA = 31.4\n\nTherefore, the area of the circle is 31.4 square centimeters."

objective = "The solution uses the incorrect formula for the area of a circle. The correct formula is A = πr² (pi times radius squared), not A = 2πr (which is the formula for the circumference of a circle). This leads to an incorrect final answer. The calculation steps also need to be updated to reflect the correct formula."

# Test both updates
result1 = test_verification(initial_solution, correct_update, objective)
result2 = test_verification(initial_solution, flawed_update, objective)

print("Verification result for the correct update:")
print(result1)
print()
print("Verification result for a flawed update (with wrong formula):")
print(result2)

Verification result for the correct update:
{'is_valid': True, 'confidence': 1.0, 'corrections': None}

Verification result for a flawed update (with wrong formula):
{'is_valid': False, 'confidence': 1.0, 'corrections': "To find the area of a circle with radius 5 cm, we'll use the formula A = πr².\n\nGiven:\n- Radius (r) = 5 cm\n\nStep 1: Substitute the radius into the formula.\nA = π × 5²\n\nStep 2: Calculate the square of the radius.\nA = π × 25\n\nStep 3: Calculate the result using π ≈ 3.14\nA = 3.14 × 25\nA = 78.5\n\nTherefore, the area of the circle is 78.5 square centimeters."}
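
The raw dictionaries are hard to read once the correction text gets long. A small formatter can help; `summarize_result` is a hypothetical helper that assumes the three keys returned by `test_verification` above.

```python
def summarize_result(result, preview_chars=60):
    """Return a one-line summary of a verification result dict."""
    status = "accepted" if result["is_valid"] else "rejected"
    line = f"{status} (confidence {result['confidence']:.2f})"
    corrections = result.get("corrections")
    if corrections:
        # Show only the start of the first line of the correction.
        first_line = corrections.splitlines()[0]
        line += f"; correction starts: {first_line[:preview_chars]}"
    return line


print(summarize_result({"is_valid": True, "confidence": 1.0, "corrections": None}))
# accepted (confidence 1.00)
```

Calling it on `result1` and `result2` gives one line per update instead of the full correction text.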


## 6. Conclusion

Self-verification adds an important layer of reliability to TextGrad's optimization process. By detecting and correcting errors in the optimization suggestions, it helps ensure that the final solution is accurate and well-reasoned.

Key benefits of self-verification include:

1. **Error Detection**: Identifies issues in proposed updates before they're applied
2. **Correction**: Provides corrections for invalid updates
3. **Confidence Scoring**: Quantifies confidence in verification results
4. **Quality Improvement**: Reduces the chance that a hallucinated or incorrect suggestion ends up in the final solution

By using the `VerifiedTextualGradientDescent` optimizer with an appropriate verification strategy, you can enhance TextGrad's optimization capabilities across a wide range of applications.