# Prompt Optimization and Evaluation with OpenAI API

<div style="display:flex; align-items:center; padding: 50px;">
<p style="margin-right:10px;">
    <img height="200px" style="width:auto;" width="200px" src="https://avatars.githubusercontent.com/u/192148546?s=400&u=95d76fbb02e6c09671d87c9155f17ca1e4ef8f21&v=4"> 
</p>
</div>


## Description:

This app uses OpenAI's `gpt-4o-mini` chat model to optimize prompts and evaluate their effectiveness. The `PromptOptimizer` class refines a given prompt against a list of instructions, iterating several times (three by default) to improve clarity, conciseness, and effectiveness. After optimization, the app compares the original and optimized prompts and generates a report with reasons and scores. It is intended for tuning prompts before they are used in production LLM applications.

## Step 1: Installing Requirements  

---  

### Purpose:  

- The function **`install_requirements`** ensures that the libraries listed in the **`requirements.txt`** file are installed in the environment.  

---  

### Steps:  

- **Global Variable Setup:**  

  - **`requirements_installed`** keeps track of whether the requirements have already been installed.  

  - **`max_retries`** and **`retries`** handle the retry mechanism.  

- **Installation Logic:**  

  - If the requirements aren't installed (**`requirements_installed`** is **False**), the function calls **`os.system()`** to run **`pip install -r requirements.txt`**.  

- **Retry Mechanism:**  

  - If installation fails, the function retries up to 3 times before exiting the program with status code 1.  
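The retry logic described above can also be written as a bounded loop rather than recursion. The following is a minimal sketch, not the notebook's implementation: it swaps `os.system()` for `subprocess.run()`, and the helper name `run_with_retries` and the stand-in command are assumptions made for illustration.

```python
import subprocess
import sys


def run_with_retries(command, max_retries=3):
    """Run `command` until it exits with 0; return the number of attempts used.

    Raises RuntimeError if the first attempt and all retries fail.
    """
    attempts = 0
    # One initial attempt plus up to `max_retries` retries.
    while attempts <= max_retries:
        attempts += 1
        result = subprocess.run(command)
        if result.returncode == 0:
            return attempts
        print(f"Attempt {attempts} failed with exit code {result.returncode}.")
    raise RuntimeError(f"Command failed after {attempts} attempts: {command}")


# A harmless stand-in command (hypothetical), so the sketch runs without installing anything.
ok_command = [sys.executable, "-c", "raise SystemExit(0)"]
print(run_with_retries(ok_command))
```

For a real install, the command could be `[sys.executable, "-m", "pip", "install", "-r", "requirements.txt"]`, which targets the same interpreter that is running the notebook.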


In [40]:
import os

requirements_installed = False
max_retries = 3
retries = 0


def install_requirements():
    """Installs the requirements from requirements.txt file"""
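    # Both names are reassigned below, so both must be declared global;
    # otherwise `retries += 1` raises UnboundLocalError.
    global requirements_installed, retries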
    if requirements_installed:
        print("Requirements already installed.")
        return

    print("Installing requirements...")
    install_status = os.system("pip install -r requirements.txt")
    if install_status == 0:
        print("Requirements installed successfully.")
        requirements_installed = True
    else:
        print("Failed to install requirements.")
        if retries < max_retries:
            print("Retrying...")
            retries += 1
            return install_requirements()
        exit(1)
    return

install_requirements()

## Step 2: Setting Up Environment Variables  

---  

### Purpose:  

- This code loads environment variables, ensuring that sensitive information (such as API keys) is available for the app to function.  

---  

### Steps:  

- **load_dotenv:**  

  - The **`load_dotenv()`** method from **`python-dotenv`** loads environment variables from a **`.env`** file into the environment.  

- **Checking Required Environment Variables:**  

  - The **`check_env`** function checks whether each required environment variable (in this case, **`OPENAI_API_KEY`**) is set. If one is missing, it prints a message asking the user to set it and exits.  
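A variation on the check described above is to collect every missing variable first and report them together. This is a small sketch under stated assumptions: the helper `find_missing_env_vars` and the variable `OTHER_KEY` are hypothetical, and unlike `check_env` it also treats an empty string as missing.

```python
import os


def find_missing_env_vars(names, environ=None):
    """Return the names from `names` that are unset or empty in `environ`."""
    if environ is None:
        environ = os.environ
    return [name for name in names if not environ.get(name)]


# An explicit mapping is used here so the result does not depend on the real environment.
fake_environ = {"OPENAI_API_KEY": "sk-example"}
missing = find_missing_env_vars(["OPENAI_API_KEY", "OTHER_KEY"], fake_environ)
print(missing)
```

Returning a list instead of exiting inside the helper leaves the decision (exit, raise, or prompt the user) to the caller.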


In [42]:
from dotenv import load_dotenv
import os


def setup_env():
    """Sets up the environment variables"""
    def check_env(env_var):
        value = os.getenv(env_var)
        if value is None:
            print(f"Please set the {env_var} environment variable.")
            exit(1)
        else:
            print(f"{env_var} is set.")
    load_dotenv()

    variables_to_check = ["OPENAI_API_KEY"]

    for var in variables_to_check:
        check_env(var)

setup_env()

## Step 3: Prompt Optimization and Evaluation with OpenAI

### Purpose:

The `PromptOptimizer` class optimizes prompts using OpenAI's `gpt-4o-mini` chat model.

It takes a user-defined prompt and refines it through multiple iterations, optimizing it for clarity, conciseness, and effectiveness.

After the optimization, it evaluates and compares the original and optimized prompts.

### 1. Initial Setup

#### Imports:

- `OpenAI`: This is used to interact with OpenAI's API, specifically for creating chat completions using `GPT` models.

- `os`: This module is used to interact with the operating system, particularly for accessing environment variables like the `OpenAI API` key.

### 2. Class Definition: `PromptOptimizer`

#### Purpose:

This class is the core of the application.

It manages the prompt optimization and evaluation process.

It interacts with the `OpenAI API` to refine and evaluate prompts.

### 3. Constructor: `__init__` Method

#### Purpose:

The constructor initializes the class with a user-provided prompt and sets up the `OpenAI API` client using the API key from the environment variable.

#### Parameters:

- `prompt`: The original prompt to be optimized.

#### Attributes:

- `self.ai`: The `OpenAI API` client, initialized with the API key fetched from the environment (`OPENAI_API_KEY`).

- `self.prompt`: The original user-provided prompt.

- `self.optimized_prompt`: Placeholder for storing the optimized version of the prompt after optimization.

### 4. `optimize` Method

#### Purpose:

This method optimizes the given prompt using a set of instructions and iteratively refines it for a given number of turns.

#### Parameters:

- `optimize_instructions`: A list of instructions that guide the optimization process.
  - Example instructions: "Refine the prompt to be concise and clear," and "Make the prompt suitable for production environments."
  
- `turns`: The number of iterations to run the optimization. Each turn involves refining the prompt further.

#### 4.1. Optimization Loop

- `Optimization Instructions`: The `optimize_instructions` provided are passed along with the prompt to instruct OpenAI's model to refine the prompt.

- Loop for Multiple Turns: The loop runs for a number of turns (default 3), allowing the prompt to be optimized iteratively. In each iteration:
  - A new message is created to instruct the model to refine the prompt according to the provided instructions.
  
  - The prompt is progressively refined through the iterations.

#### 4.2. `OpenAI API` Request

- `API Request`: The `optimize_prompt` (which includes both the instructions and the current prompt) is sent to OpenAI's `gpt-4o-mini` model.

- `role: "system"` provides context to the model about its task.

- `role: "user"` provides the actual prompt that needs refinement.

- Response Handling: The optimized prompt returned by OpenAI is stored in `previous_optimized_prompt` for the next iteration.
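The message structure described above can be sketched without calling the API. The helper name `build_messages` and the wording of the user message are assumptions for illustration; only the two-role layout (`system`, then `user`) comes from the notebook.

```python
def build_messages(system_text, instructions, prompt):
    """Build the two-message list sent to the chat completions endpoint."""
    # Join the instructions up front so the template below stays simple.
    instructions_text = "\n".join(instructions)
    user_text = (
        "Refine the prompt below according to the instructions.\n"
        f"Instructions:\n{instructions_text}\n"
        f"Prompt: {prompt}"
    )
    return [
        {"role": "system", "content": system_text},  # task context for the model
        {"role": "user", "content": user_text},      # instructions plus current prompt
    ]


messages = build_messages(
    "You are an AI prompt optimizer.",
    ["Be concise.", "Be clear."],
    "Explain recursion.",
)
for message in messages:
    print(message["role"], "->", message["content"][:40])
```

Building the list in a separate function makes the request payload easy to inspect or unit test before any network call is made.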

#### 4.3. Error Handling and Final Optimization

- Error Handling: If an error occurs during a turn (e.g., a network issue or an `API` failure), the exception is caught and printed, and the loop moves on to the next turn with the most recent successful result.

- Final Optimized Prompt: After completing all iterations, the final optimized prompt is stored in `self.optimized_prompt` and returned.
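The loop and its error handling can be illustrated offline by passing the refinement step in as a function. This is a sketch, not the class's code: `refine_iteratively` and `make_flaky_refiner` are hypothetical names, and the refiner simulates an API call that fails on its second invocation.

```python
from typing import Callable


def refine_iteratively(prompt: str, refine: Callable[[str], str], turns: int = 3) -> str:
    """Apply `refine` up to `turns` times, skipping any turn that raises."""
    current = prompt
    for turn in range(1, turns + 1):
        try:
            current = refine(current)
        except Exception as error:
            # Keep the last good result and move on to the next turn.
            print(f"Turn {turn} failed: {error}")
    return current


def make_flaky_refiner():
    """Hypothetical stand-in for the API call; raises on its second call."""
    calls = {"count": 0}

    def refine(text: str) -> str:
        calls["count"] += 1
        if calls["count"] == 2:
            raise RuntimeError("simulated API failure")
        return text.strip() + " (refined)"

    return refine


result = refine_iteratively("  Explain recursion.  ", make_flaky_refiner(), turns=3)
print(result)
```

Because a failed turn leaves `current` untouched, the function always returns the latest successful refinement, or the original prompt if every turn fails.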

### 5. `get_optimized_prompt` Method

#### Purpose:

This method retrieves the optimized prompt.

If it hasn't been optimized yet, it calls the `optimize()` method to perform the optimization.

#### Logic:

- If the prompt has already been optimized, it returns the optimized version.

- If not, it triggers the optimization process and returns the result.
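The compute-once behavior described above is a lazy-caching pattern. A minimal sketch follows; the class `LazyResult` and its `calls` counter are illustrative assumptions. It checks for `None`, whereas the notebook checks truthiness, so the notebook's version would recompute if the cached value were an empty string.

```python
class LazyResult:
    """Compute a value on first access and reuse it afterwards."""

    def __init__(self, compute):
        self._compute = compute
        self._value = None
        self.calls = 0  # counts how often the expensive step actually ran

    def get(self):
        if self._value is None:
            self.calls += 1
            self._value = self._compute()
        return self._value


lazy = LazyResult(lambda: "optimized prompt text")
first = lazy.get()
second = lazy.get()
print(first == second, lazy.calls)
```

Caching matters here because each optimization run costs several API calls.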

### 6. `evaluate` Method

#### Purpose:

This method evaluates the original prompt against the optimized prompt by providing a detailed comparison.

#### Evaluation Process:

- The evaluation checks how well the optimization was done and returns a markdown report that compares the two prompts.

#### 6.1. Evaluation Prompt Creation

- Purpose: This string constructs the evaluation prompt, instructing OpenAI's model to compare both prompts (original and optimized) and generate a markdown table with the comparison.

#### 6.2. `API` Call for Evaluation

- Purpose: The `evaluation_prompt` is sent to OpenAI's API to compare the two prompts and generate the evaluation.

#### 6.3. Report Generation

- Purpose: A markdown header with the original and optimized prompts is created, and the evaluation report generated by the API is appended to it.

- Return: The full report, including the comparison and evaluation, is returned.
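The report assembly step can be sketched as a plain string operation. The function name `build_report` is an assumption, and the evaluation text below is a placeholder written for this example, not real model output.

```python
def build_report(original, optimized, evaluation):
    """Join a fixed header and the model's evaluation into one markdown document."""
    sections = [
        "# Prompt Optimization Report",
        "## Original Prompt",
        original.strip(),
        "## Optimized Prompt",
        optimized.strip(),
        evaluation.strip(),
    ]
    # Blank lines between sections keep headings from merging with body text.
    return "\n\n".join(sections)


# Placeholder evaluation text for illustration only.
placeholder_evaluation = "## Evaluation\n\nThe optimized prompt is more specific."
report = build_report(
    "Explain recursion.",
    "Explain recursion in three short steps.",
    placeholder_evaluation,
)
print(report)
```

Joining with blank lines matters in Markdown: a heading that directly follows body text on the same line is not rendered as a heading.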

### 7. Error Handling in `evaluate` Method

#### Purpose:

If any error occurs during the evaluation, it is caught and an error message is printed.

The method returns `None` if evaluation fails.

---

### Summary:

The `PromptOptimizer` class refines prompts and reports how the refined version compares with the original.

It:

- Optimizes a given prompt over multiple iterations using OpenAI’s `gpt-4o-mini` model.

- Evaluates the original and optimized prompts by generating a detailed comparison report.

- Catches API errors and prints them instead of raising, so a failed call does not interrupt the notebook.


In [44]:
from openai import OpenAI
import os


class PromptOptimizer:
    """A simple prompt optimization class that uses OpenAI's Chat API to optimize prompts."""

    def __init__(self, prompt: str):
        """Initializes the PromptOptimizer with the given prompt. """
        self.ai = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
        self.prompt = prompt
        self.optimized_prompt = None

    def optimize(self, optimize_instructions = None, turns = 3) -> str:
        """Optimizes the prompt based on optimize instructions. """
        # Avoid a mutable default argument: build the default list per call.
        if optimize_instructions is None:
            optimize_instructions = [
                "Refine the given prompt so that it is fit for use in production environments.",
                "Make the prompt concise, detailed and clear with a good balance of effectiveness and usefulness.",
            ]
        # Join outside the f-string: a backslash inside an f-string expression
        # is a syntax error before Python 3.12.
        instructions_text = "\n".join(optimize_instructions)

        prompt_to_optimize = self.prompt
        previous_optimized_prompt = prompt_to_optimize
        for i in range(turns):
            try:
                print(f"Optimizing prompt - Turn {i+1}...")
                optimize_prompt = f"""
                    You will be given a prompt, refine it as per the given instructions. 
                    In addition to the instructions, come up with improvements and apply the refinements to the prompt.
                    Respond only with the final refined prompt after all improvements are made.
                    Instructions: ```{instructions_text}```
                    Prompt: {previous_optimized_prompt}
                """
                system = "You are an AI prompt optimizer with deep understanding of LLM technology."
                ai_response = self.ai.chat.completions.create(
                    messages=[
                        {"role": "system", "content": system},
                        { "role": "user", "content": optimize_prompt }
                    ],
                    model="gpt-4o-mini"
                )
                previous_optimized_prompt = ai_response.choices[0].message.content
            except Exception as e:
                print(f"Failed to optimize prompt for turn {i+1}: {e}")
                continue
        self.optimized_prompt = previous_optimized_prompt
        return self.optimized_prompt
    
    def get_optimized_prompt(self) -> str:
        """Returns the optimized prompt."""
        if not self.optimized_prompt:
            optimized_prompt = self.optimize()
            if not optimized_prompt:
                print("Failed to optimize the prompt.")
        return self.optimized_prompt
    

    def evaluate(self):
        """Compares the original and optimized prompts; returns a markdown report, or None on failure."""
        try:
            print("Evaluating the optimized prompt...")
            optimized_prompt = self.get_optimized_prompt()

            if not optimized_prompt:
                print("Failed to get the optimized prompt, can't evaluate.")
                return None
            
            evaluation_prompt = f"""
                    Given two prompts, compare the original prompt and the optimized prompt. 
                    Provide a table with the comparison of the two prompts along with reasons and scores.
                    In the markdown response provide a title, summary, and conclusion and the table. 
                    Respond in markdown format strictly. 
                    Make sure the markdown tables are compatible with Jupyter notebooks.
                    Original Prompt: {self.prompt}
                    Optimized Prompt: {optimized_prompt}
                """
            
            response = self.ai.chat.completions.create(
                model="gpt-4o-mini",
                messages=[
                    {"role": "system", "content": "You are an expert in prompt evaluation."},
                    {"role": "user", "content": evaluation_prompt}
                ])
            
            header = f"""# Prompt Optimization Report\n## Original Prompt\n{self.prompt}\n## Optimized Prompt\n{self.optimized_prompt}"""
            report = response.choices[0].message.content
            # Separate the header from the report so the report's title starts on its own line.
            full_report = header + "\n\n" + report
            return full_report
        except Exception as e:
            print(f"Failed to evaluate the optimized prompt: {e}")
            return None

## Step 4: Run Prompt Optimization and Generate Evaluation Report

This block of code runs the prompt optimization and generates an evaluation report. Here's what each part does:

### Importing necessary modules:

- `clear_output`: Clears the current output in the Jupyter notebook to keep the display clean.

- `Markdown`: Wraps the evaluation report so that Jupyter renders it as formatted `Markdown` rather than plain text.

### Defining the function `run_prompt_optimizer`:

This function takes a `prompt` as input, which will be optimized using the `PromptOptimizer` class.

### Creating an instance of `PromptOptimizer`:

An instance of the `PromptOptimizer` class is created with the given prompt. This initializes the class and prepares it to optimize the provided prompt.

### Optimizing the prompt:

The `get_optimized_prompt()` method of the `PromptOptimizer` class is called to retrieve the optimized version of the prompt.

### Clearing the output:

Clears any previous outputs from the notebook to ensure a clean display.

### Evaluating the optimized prompt:

The `evaluate()` method of the `PromptOptimizer` class is called, which compares the original and optimized prompts and generates a detailed evaluation report.

### Returning the evaluation report in `Markdown` format:

The generated evaluation report is returned in `Markdown` format, which is rendered in Jupyter notebooks as properly formatted output.


In [45]:
from IPython.display import clear_output, Markdown

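def run_prompt_optimizer(prompt: str) -> Markdown:
    """Optimizes the prompt, evaluates it, and returns the report as renderable Markdown."""
    prompt_optimizer = PromptOptimizer(prompt)
    # Called for its side effect: runs the optimization and caches the result.
    prompt_optimizer.get_optimized_prompt()
    clear_output()
    report = prompt_optimizer.evaluate()
    # evaluate() returns None on failure; fall back to a short message.
    return Markdown(report or "Evaluation failed; see the printed messages.")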

## Step 5: Run Prompt Optimization and Evaluation

### Example 1:

The prompt asks for a detailed plan for learning Python programming in 2025.

`run_prompt_optimizer(prompt)` will optimize this prompt and generate an evaluation report comparing the original and optimized versions.

### Example 2:

The prompt asks for the Standard Operating Procedure (SOP) for a conflict resolution strategy for a team of 5 members in a Big 4 firm.

`run_prompt_optimizer(prompt)` will optimize this prompt and provide an evaluation report as well.

In both examples, the function optimizes and evaluates the prompts using the `PromptOptimizer` class.


In [None]:
# Example 1

prompt = "Give me a detailed plan for learning Python Programming in 2025."

run_prompt_optimizer(prompt)

# Example 2

prompt = "Give me the SOP of a conflict resolution strategy for a team of 5 members in a Big 4 Firm."

run_prompt_optimizer(prompt)


## Conclusion:

This app provides a simple way to refine prompts and evaluate their effectiveness. Using OpenAI's `gpt-4o-mini` model, it rewrites prompts to be concise, clear, and useful, then produces an evaluation report comparing the original and optimized prompts.

---

# Thank You for visiting The Hackers Playbook! 🌐

If you liked this research material:

- [Subscribe to our newsletter.](https://thehackersplaybook.substack.com)

- [Follow us on LinkedIn.](https://www.linkedin.com/company/the-hackers-playbook/)

- [Leave a star on our GitHub.](https://www.github.com/thehackersplaybook)
