
# 🐦 Prompt Optimization with Evidently: Tweet Generation Example

This tutorial shows how to optimize a prompt for generating engaging tweets with Evidently's `PromptOptimizer` API.
We iteratively revise the tweet generation prompt so that more of the LLM-generated tweets are labeled engaging by an LLM judge.

---

## ✅ What you'll learn:
- How to define a tweet generation function with OpenAI
- How to set up an LLM judge to classify tweet engagement
- How to optimize a tweet generation prompt based on feedback
- How to inspect the best optimized prompt


In [None]:

# Install packages if needed
# !pip install evidently openai pandas
# The OpenAI client reads the API key from the OPENAI_API_KEY environment variable.


In [None]:

import pandas as pd
import openai

from evidently.descriptors import LLMEval
from evidently.llm.templates import BinaryClassificationPromptTemplate
from evidently.llm.optimization import PromptOptimizer, PromptExecutionLog, Params


## 📄 Define a Tweet Generation Function

In [None]:

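def basic_tweet_generation(topic, model="gpt-3.5-turbo", instructions=""):
    """Generate text about `topic`; `instructions` is the system prompt being optimized."""
    response = openai.chat.completions.create(
        model=model,
        messages=[
            # System message: the prompt that the optimizer rewrites between runs
            {"role": "system", "content": instructions},
            # User message: fixed task, only the topic varies
            {"role": "user", "content": f"Write a short paragraph about {topic}"}
        ]
    )
    # Return the text of the first completion
    return response.choices[0].message.content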
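
To see which part of the request the optimizer can influence, here is a minimal offline sketch of the message list built inside `basic_tweet_generation`. The helper `build_messages` and the sample instruction string are hypothetical and used only for illustration; no API call is made.

```python
from typing import Dict, List


def build_messages(topic: str, instructions: str = "") -> List[Dict[str, str]]:
    """Hypothetical helper: mirrors the messages list used in basic_tweet_generation."""
    return [
        # The system message carries the prompt under optimization
        {"role": "system", "content": instructions},
        # The user message is fixed apart from the topic
        {"role": "user", "content": f"Write a short paragraph about {topic}"},
    ]


# Sample instruction string is an assumption, not a prompt from the tutorial
messages = build_messages("CI/CD is applicable in AI", instructions="Write like a friendly engineer")
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

Only the system message changes between optimization rounds, so any improvement in engagement has to come from the instructions rather than from the user request.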


## 🎛️ Define a Tweet Quality Judge

In [None]:

tweet_quality = BinaryClassificationPromptTemplate(
    pre_messages=[("system", "You are evaluating the quality of tweets")],
    criteria="""
Text is ENGAGING if it meets at least one of the following:
  • Strong hook (question, surprise, bold statement)
  • Uses emotion, humor, or opinion
  • Encourages interaction
  • Shows personality or distinct tone
  • Includes vivid language or emojis
  • Sparks curiosity or insight

Text is NEUTRAL if it lacks these qualities.
""",
    target_category="ENGAGING",
    non_target_category="NEUTRAL",
    uncertainty="non_target",
    include_reasoning=True,
)

judge = LLMEval("basic_tweet_generation.result", template=tweet_quality,
                provider="openai", model="gpt-4o-mini", alias="Tweet quality")
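
The criteria above describe an any-of rule: one matching quality is enough for the ENGAGING label. The toy function below restates that rule in plain Python. It is not Evidently code, the criterion names are hypothetical, and it assumes that `uncertainty="non_target"` means an unsure judge falls back to the non-target category.

```python
def resolve_label(criteria_met, uncertain=False):
    """Toy decision rule mirroring the judge instructions (not Evidently code)."""
    if uncertain:
        # Assumed meaning of uncertainty="non_target": fall back to the non-target label
        return "NEUTRAL"
    # "At least one of the following" is an any-of rule
    return "ENGAGING" if any(criteria_met.values()) else "NEUTRAL"


# Hypothetical criterion flags for two example texts
hook_only = {"strong_hook": True, "emotion_or_humor": False, "encourages_interaction": False}
flat_text = {"strong_hook": False, "emotion_or_humor": False, "encourages_interaction": False}

print(resolve_label(hook_only))
print(resolve_label(flat_text))
print(resolve_label(hook_only, uncertain=True))
```

In the tutorial itself the LLM judge makes this decision from the text; the sketch only makes the labeling logic explicit.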


## 🚀 Define a Prompt Execution Function

In [None]:

def run_prompt(generation_prompt: str, context) -> PromptExecutionLog:
    """Generate tweets for a fixed set of topics using the candidate prompt."""
    my_topics = [
        "testing in AI engineering is as important as in development",
        "CI/CD is applicable in AI",
        "Collaboration of subject matter experts and AI engineers improves product",
        "Start LLM apps development from test cases generation",
        "evidently is a great tool for LLM testing"
    ]
    # Generate three tweets per topic (15 in total) with the candidate prompt
    tweets = [
        basic_tweet_generation(topic, model="gpt-3.5-turbo", instructions=generation_prompt)
        for topic in my_topics * 3
    ]
    return PromptExecutionLog(generation_prompt, prediction=pd.Series(tweets))
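
The expression `my_topics * 3` in `run_prompt` repeats the topic list, so each candidate prompt is scored on 15 generations, presumably to smooth out the randomness of individual LLM outputs. The sketch below shows the size of that evaluation set without calling the API; `fake_generation` and the sample instruction string are hypothetical stand-ins.

```python
my_topics = [
    "testing in AI engineering is as important as in development",
    "CI/CD is applicable in AI",
    "Collaboration of subject matter experts and AI engineers improves product",
    "Start LLM apps development from test cases generation",
    "evidently is a great tool for LLM testing",
]


def fake_generation(topic, instructions=""):
    """Hypothetical stand-in for basic_tweet_generation; makes no API call."""
    return f"[{instructions}] {topic}"


# List repetition: the five topics appear three times, in the same order
repeated_topics = my_topics * 3
tweets = [fake_generation(topic, instructions="Write like a friendly engineer") for topic in repeated_topics]

print(len(tweets))  # 15
```

Every optimization round therefore costs 15 generation calls plus the judge calls that score them, which is worth keeping in mind when extending the topic list.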


## 🔍 Run the Prompt Optimizer

In [None]:

optimizer = PromptOptimizer("tweet_gen_example", strategy="feedback")
optimizer.set_param(Params.BasePrompt, "You are a tweet generator")
# Top-level `await` works in Jupyter; in a plain script, use the sync version instead
await optimizer.arun(run_prompt, scorer=judge)
# Sync version
# optimizer.run(run_prompt, scorer=judge)
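
Outside a notebook there is no running event loop, so a bare `await` at module level is a syntax error. Besides the sync version, a coroutine can be driven with the standard library's `asyncio.run`. The sketch below shows the pattern with a hypothetical stand-in coroutine, `fake_arun`, in place of the real optimizer call.

```python
import asyncio


async def fake_arun(steps: int) -> str:
    """Hypothetical stand-in for an async optimizer run; it only yields to the loop."""
    for _ in range(steps):
        await asyncio.sleep(0)
    return "done"


def main() -> str:
    # asyncio.run creates an event loop, runs the coroutine to completion, and closes the loop
    return asyncio.run(fake_arun(steps=3))


result = main()
print(result)
```

Do not call `asyncio.run` from inside a notebook cell: a loop is already running there, and `await` is the right tool.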


## 🥇 View the Best Optimized Prompt

In [None]:
print(optimizer.best_prompt())