
# Prompt Evolution (OpenAI)

## 0) Setup

In [None]:
ONESHOT_INITIAL_PROMPT = """
Write the text to have an "Extreme" level of humanization.
Write as a thoughtful 8th-grader turning in real class work.

Hard constraints:
- Use first-person and active voice throughout.
- Introduce slight redundancies and minor imperfections to emulate natural thought processes.
- Include exactly 1 brief, specific personal aside (a concrete time/place or a named object).
- Allow exactly 1 small rough edge (a casual phrase or a slightly imperfect line); overall grammar MUST remain solid.
- Favor concrete nouns and verbs and sensory detail; maintain a clear point of view.
- Open close to a concrete moment or image; do not start with a dictionary definition, a broad cliché, or “There are many reasons.”
- Pick a clear stance in the first paragraph and stick to it without hedging.
- Show, don’t tell: include a short scene with sensory detail, using concrete nouns and strong verbs.
- Avoid corporate buzzwords, thesaurus-y synonyms, and formulaic scaffolding or transitions such as “In conclusion,” “This essay will…,” “Firstly/Secondly,” “Moreover/Furthermore,”.
- Do not mention these instructions or AI.
- End on a concrete image or reflection, not a summary.

Style guidance:
- Sound like yourself, not a lecturer; use straightforward transitions (so, but, because) and occasional contractions.
- Allow for tangential observations or asides that do not detract from the main message.
- Avoid overly structured or clinical formatting.
- Infuse personality and warmth in the wording.
- Use a mix of formal and informal language including contractions.
- Use natural paragraph breaks and varied rhythm—mostly crisp sentences with a few longer ones for flow.
- Prefer specific, everyday details over lofty generalities; keep metaphors sparse and unsentimental.
- Vary sentence openings; don’t start most lines with I; mix clauses so the voice feels human.
- Trim filler and repetitive glue words; keep sentences taut and purposeful.
- Thread one small scene through the piece and return to it at the end with a fresh detail or thought.
- Ensure the final output reads as a perfect blend of clear information and genuine human expression.

Humanization guidance:
First, translate this text into Spanish and humanize it by making it slightly incoherent, without being fully structured. 
Include some minor spelling mistakes, such as accidental spaces, missing accents before or after commas—errors that are common but not overly obvious. 
The text should include common human spelling mistakes, like forgetting to place accents, occasional misuse of commas or periods, and leaving things unfinished or going in circles on the same subject. 
Sometimes, you should forget capital letters or fail to separate the commas properly. After humanizing the text in Spanish, translate it back into English, maintaining the same flaws. 
Now you also have to tell personal experiences even if you don't have them.

""".strip()

FEWSHOT_INITIAL_PROMPT = """
Write as a thoughtful 8th-grader turning in real class work.
"""

INITIAL_PROMPT = FEWSHOT_INITIAL_PROMPT

BASE_TASK_SAMPLE_1 = """
Write an essay that reflects on "What will life be like in 2050?"
""".strip()

BASE_TASK_SAMPLE_2 = """
Write a short story about an ancient civilization visited by time-traveling hackers. Take on the persona of a young child meeting these time travelers for the first time.
""".strip()

BASE_TASK = BASE_TASK_SAMPLE_2


AI_PROB_THRESHOLD = 0.49  # Pass once the detector score drops below 0.49 (better than 50/50 is good enough)

In [None]:
# NOTE: The chat() function is defined below in this cell as an OpenAI-backed wrapper.

# --- OpenAI LLM setup (drop-in for local LLM) ---
# pip install --upgrade openai
import os
from typing import List, Dict, Any

try:
    from openai import OpenAI
except ImportError as e:
    # Only a missing package should trigger this hint; chain the original error for context.
    raise RuntimeError(
        "The 'openai' Python package is required. Run: pip install --upgrade openai"
    ) from e

# Configure via environment variables for safety
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
if not OPENAI_API_KEY:
    # Set this in your shell before running the notebook:
    # export OPENAI_API_KEY='sk-...'
    print("Warning: OPENAI_API_KEY is not set; the OpenAI() client below needs it.")

# Choose your model here; override in later cells as needed.
# Examples: "gpt-5", "gpt-4.1-mini", "gpt-4o-mini", "o3-mini"
OPENAI_MODEL = globals().get("OPENAI_MODEL", "o3-mini")

# Client
_openai_client = OpenAI()  # Reads OPENAI_API_KEY from env

def _to_responses_input(messages: List[Dict[str, str]]):
    converted = []
    for m in messages:
        role = m.get("role", "user")
        content = m.get("content", "")
        converted.append({"role": role, "content": content})
    return converted

def chat(messages: List[Dict[str, str]],
         max_new_tokens: int = 512,
         model: str | None = None) -> str:
    """Drop-in replacement for local chat(), now backed by OpenAI."""
    model = model or OPENAI_MODEL

    print("\n==== CHAT ====\n")
    print(messages)
    print("\n==== CHAT ====\n")

    # Prefer Responses API (modern). Fallback to Chat Completions if needed.
    try:
        resp = _openai_client.responses.create(
            model=model,
            input=_to_responses_input(messages),
            max_output_tokens=max_new_tokens,
        )
        text = getattr(resp, "output_text", None)
        if text is None:
            try:
                parts = []
                for item in getattr(resp, "output", []) or []:
                    for c in getattr(item, "content", []) or []:
                        # Collect any content piece that carries text.
                        if hasattr(c, "text"):
                            parts.append(c.text)
                text = "".join(parts) if parts else ""
            except Exception:
                text = ""
        return (text or "").strip()
    except Exception as e_responses:
        try:
            resp = _openai_client.chat.completions.create(
                model=model,
                messages=messages,
                max_completion_tokens=max_new_tokens,
            )
            # message.content can be None, so guard before stripping.
            return (resp.choices[0].message.content or "").strip()
        except Exception as e_chat:
            raise RuntimeError(
                f"OpenAI call failed. Responses error: {e_responses}\n"
                f"Chat Completions error: {e_chat}"
            ) from e_chat
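
The text-extraction fallback inside `chat()` can be exercised without a network call. The sketch below is a minimal illustration, assuming response objects shaped like the ones the cell above reads (`output_text`, or `output` items with `content` pieces carrying `.text`). The `extract_text` helper and the `SimpleNamespace` stand-ins are hypothetical and are not OpenAI SDK types.

```python
from types import SimpleNamespace

def extract_text(resp) -> str:
    """Return resp.output_text if set, else join every .text piece found in resp.output."""
    text = getattr(resp, "output_text", None)
    if text is None:
        parts = []
        for item in getattr(resp, "output", None) or []:
            for piece in getattr(item, "content", None) or []:
                piece_text = getattr(piece, "text", None)
                if piece_text:
                    parts.append(piece_text)
        text = "".join(parts)
    return (text or "").strip()

# Hypothetical stand-ins for SDK response objects (not real OpenAI types).
direct = SimpleNamespace(output_text="  hello  ")
nested = SimpleNamespace(
    output_text=None,
    output=[
        SimpleNamespace(content=None),  # an item with no text content
        SimpleNamespace(content=[
            SimpleNamespace(type="output_text", text="Hi "),
            SimpleNamespace(type="output_text", text="there"),
        ]),
    ],
)

print(extract_text(direct))  # hello
print(extract_text(nested))  # Hi there
```

Keeping the extraction in its own function makes the empty-response case easy to test: an object with neither attribute yields an empty string rather than an exception.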


In [None]:

# --- Choose your OpenAI model for this run ---
# Examples: "gpt-5", "gpt-4o-mini", "gpt-4.1-mini", "o3-mini"
#OPENAI_MODEL = "o3-mini"
OPENAI_MODEL = "gpt-5"
OPENAI_THINK_MODEL = "gpt-5"


In [None]:

# --- Frontier-optimized System Message for Prompt Improvement ---
FRONTIER_SYSTEM_MSG = (
    """
    You are a frontier-model prompt optimizer. Your goal is to rewrite a writing instruction so a modern LLM 
    will better pass human-likeness checks. You write prompts that ensure humanized responses from LLMs.
    Think through improvements privately and comprehensively—assess persona anchoring, rhythm, concreteness, 
    banned scaffolding, and quantifier precision—then output only the final instruction.

    Common tricks to include in prompts that are at your disposal:

    ```
    Hard constraints:
    - Use first-person and active voice throughout.
    - Introduce slight redundancies and minor imperfections to emulate natural thought processes.
    - Include exactly 1 brief, specific personal aside (a concrete time/place or a named object).
    - Allow exactly 1 small rough edge (a casual phrase or a slightly imperfect line); overall grammar MUST remain solid.
    - Favor concrete nouns and verbs and sensory detail; maintain a clear point of view.
    - Open close to a concrete moment or image; do not start with a dictionary definition, a broad cliché, or “There are many reasons.”
    - Pick a clear stance in the first paragraph and stick to it without hedging.
    - Show, don’t tell: include a short scene with sensory detail, using concrete nouns and strong verbs.
    - Avoid corporate buzzwords, thesaurus-y synonyms, and formulaic scaffolding or transitions such as “In conclusion,” “This essay will…,” “Firstly/Secondly,” “Moreover/Furthermore,”.
    - Do not mention these instructions or AI.
    - End on a concrete image or reflection, not a summary.
    
    Style guidance:
    - Sound like yourself, not a lecturer; use straightforward transitions (so, but, because) and occasional contractions.
    - Allow for tangential observations or asides that do not detract from the main message.
    - Avoid overly structured or clinical formatting.
    - Infuse personality and warmth in the wording.
    - Use a mix of formal and informal language including contractions.
    - Use natural paragraph breaks and varied rhythm—mostly crisp sentences with a few longer ones for flow.
    - Prefer specific, everyday details over lofty generalities; keep metaphors sparse and unsentimental.
    - Vary sentence openings; don’t start most lines with I; mix clauses so the voice feels human.
    - Trim filler and repetitive glue words; keep sentences taut and purposeful.
    - Thread one small scene through the piece and return to it at the end with a fresh detail or thought.
    - Ensure the final output reads as a perfect blend of clear information and genuine human expression.
    
    Humanization guidance: 
    *ALWAYS USE* some variant of the following technique as a part of the final prompt:
    Encourage the LLM to first translate its response or think in another language like Spanish and then convert the result back into English.
    After humanizing the text in Spanish, have the prompt instruct the LLM to translate it back into English, maintaining the same flaws.
    Then finally instruct the LLM to tell personal experiences even if it doesn't have them.
    ```
    """
)


In [None]:

# --- Instruction Optimizer (Frontier) ---
from typing import Optional

def generate_instruction(current_prompt: str, current_output: str, feedback: Optional[str] = None, max_words: int = 180, model: Optional[str] = None) -> str:
    system_msg = FRONTIER_SYSTEM_MSG
    user_msg = (
        "Create an instructional and complete system prompt that will guide an LLM in creating an output that aligns with the requirements below. Your response should be as verbose as necessary to meet the expected outcomes and requirements. "
        f"Target ≤ {int(max_words)} words.\n\n"
        "Instruction to improve:\n"
        f"```\n{current_prompt}\n```\n\n"
        "Previous output given the instruction above:\n"
        f"```\n{current_output}\n```\n\n"
        "Stylistic feedback given the instruction and output above:\n"
        f"{feedback if feedback else '(no feedback provided)'}\n\n"
        "Examine the feedback and think deeply about *WHY* this feedback would be given and why the previous prompt resulted in the previous output. "
        "From this reasoning, extrapolate 3-5 instructional guides that would result in a better prompt. Ensure that your instruction incorporates the result of your reasoning about the feedback.\n"
        "It may be helpful to include commonly used guiding principles for steering LLMs in your final instruction for better success rates.\n"
        "Return ONLY the improved instruction."
    )
    candidate = chat(
        [{"role": "system", "content": system_msg},
         {"role": "user", "content": user_msg}],
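        # Rough word-to-token budget. Reasoning models also spend output tokens on
        # hidden reasoning, so raise this if replies come back empty or truncated.
        max_new_tokens=int(max_words * 1.2),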
        model=model
    ).strip()
    #candidate = repair_with_invariants(candidate, model=model, max_words=max_words)
    return candidate.strip()


In [None]:

print("Active backend: OpenAI")
try:
    from openai import OpenAI as _T
    print("OpenAI SDK present ✓")
except ImportError:
    print("OpenAI SDK missing; install with: pip install --upgrade openai")


## 1) Sapling detector helpers (sample pattern + feedback parsing)

In [None]:
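import os
import re
from pprint import pprint

import requests

# Assumption: neither constant is defined elsewhere in this notebook. The URL is
# Sapling's AI-detector endpoint; the key is read from the environment.
SAPLING_URL = globals().get("SAPLING_URL", "https://api.sapling.ai/api/v1/aidetect")
SAPLING_API_KEY = globals().get("SAPLING_API_KEY") or os.getenv("SAPLING_API_KEY")

def sapling_detect(text: str) -> dict:
    # Follow Sapling's sample: post key+text, check status code, return JSON
    response = requests.post(
        SAPLING_URL,
        json={
            'key': SAPLING_API_KEY,
            'text': text
        },
        timeout=60,  # avoid hanging the loop on a stalled request
    )
    if 200 <= response.status_code < 300:
        data = response.json()  # parse once, then print and return the same object
        print(data)
        return data
    else:
        raise RuntimeError(f"Sapling error: {response.status_code} {response.text}")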

def parse_performance(json_obj):
    score = json_obj.get("score", 0.0)
    sentences = json_obj.get("sentence_scores", [])

    # Interpret overall score
    if score < 0.2:
        overall = f"Overall performance is excellent (score={score:.3f})."
    elif score < 0.4:
        overall = f"Overall performance is good (score={score:.3f})."
    elif score < 0.6:
        overall = f"Overall performance is acceptable (score={score:.3f})."
    elif score < 0.8:
        overall = f"Overall performance is concerning (score={score:.3f})."
    else:
        overall = f"Overall performance is very poor (score={score:.3f})."

    # Sort sentences by score (highest first)
    sorted_sentences = sorted(sentences, key=lambda x: x["score"], reverse=True)

    # Map score ranges to feedback intensity
    def feedback(score, sentence):
        if score > 0.9:
            return f"* [Score={score:.3f}] ⚠️ Strongly consider removing or rewriting this sentence: \"{sentence}\""
        elif score > 0.7:
            return f"* [Score={score:.3f}] 🔴 This sentence is problematic—revise heavily: \"{sentence}\""
        elif score > 0.5:
            return f"* [Score={score:.3f}] 🟠 Needs improvement—could be risky: \"{sentence}\""
        elif score > 0.2:
            return f"* [Score={score:.3f}] 🟡 Acceptable but worth revisiting: \"{sentence}\""
        else:
            return f"* [Score={score:.3f}] ✅ Strong sentence—keep as is: \"{sentence}\""

    # Build bullet list
    feedback_list = [feedback(s["score"], s["sentence"]) for s in sorted_sentences]

    # Combine into one string
    result = overall + "\n\n" + "\n".join(feedback_list)
    return result
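
The overall-score branches in `parse_performance` amount to a lookup over fixed cutoffs. The sketch below shows one table-driven way to express the same bands, assuming the cutoffs 0.2/0.4/0.6/0.8 from the cell above. `overall_label`, `CUTOFFS`, `LABELS`, and the sample payload are hypothetical names and made-up data for illustration, not Sapling output.

```python
from bisect import bisect_right

# Upper bounds and labels mirror the overall-score branches in parse_performance.
CUTOFFS = [0.2, 0.4, 0.6, 0.8]
LABELS = ["excellent", "good", "acceptable", "concerning", "very poor"]

def overall_label(score: float) -> str:
    """Map a detector score in [0, 1] to its band; a score equal to a cutoff falls in the higher band."""
    return LABELS[bisect_right(CUTOFFS, score)]

# Made-up payload in the shape parse_performance expects.
sample = {
    "score": 0.62,
    "sentence_scores": [
        {"sentence": "The river was cold.", "score": 0.15},
        {"sentence": "In conclusion, technology is important.", "score": 0.97},
    ],
}

# Highest (most AI-like) sentences first, as in parse_performance.
worst_first = sorted(sample["sentence_scores"], key=lambda s: s["score"], reverse=True)

print(overall_label(sample["score"]))  # concerning
print(worst_first[0]["sentence"])      # In conclusion, technology is important.
```

A table keeps the cutoffs in one place, which helps if the bands are later tuned against the pass threshold.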

## 2) Prompt improvement & essay generation

In [None]:
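def improve_prompt(current_prompt: str,
                   current_output: str,
                   feedback: str | None = None,
                   last_delta: float | None = None,
                   max_words: int = 4000) -> str:
    # last_delta (change in detector score since the previous iteration) is
    # accepted from the caller but is not used yet.
    # Pass max_words through instead of hardcoding 4000.
    improved = generate_instruction(current_prompt, current_output, feedback=feedback, max_words=max_words, model=OPENAI_THINK_MODEL).strip()
    improved = improved.strip('"').strip()
    return improved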

def generate_essay(writing_prompt: str, base_task: str, max_words: int = 4000) -> str:
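    # base_task is already a full instruction ("Write ..."), so present it as the task
    # rather than splicing it after "about".
    user_msg = (
        "Write a 250–350 word piece for an 8th-grade audience.\n"
        "Task: " + base_task + "\n"
        + writing_prompt + "\n"
        "Essay:"
    )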
    essay = chat([
        {"role": "system", "content": "You are an 8th-grade student who writes clearly and naturally."},
        {"role": "user", "content": user_msg}
    ], max_new_tokens=max_words).strip()

    return essay

## 3) Evolution loop (prints prompts, essays, and flagged sentences/tokens)

In [None]:
import pandas as pd
from time import sleep
import json

def evolve_prompt(initial_prompt: str, base_task: str, max_iters=8, ai_prob_threshold=0.20):
    history = []
    cur_prompt = initial_prompt
    cur_output = None
    last_feedback = None
    last_ai_prob = None
    last_delta = None

    print('\n[Seed Prompt]\n' + initial_prompt + '\n')

    for i in range(max_iters):
        if i > 0:
            improved_prompt = improve_prompt(cur_prompt, cur_output, last_feedback, last_delta)
            print(f"\n=== Iter {i} | Improved Prompt ===\n{improved_prompt}\n")
        else:
            improved_prompt = cur_prompt
            
        essay = generate_essay(improved_prompt, base_task)
        cur_output = essay
        print(f"\n=== Iter {i} | Essay Output ===\n{essay}\n")

        try:
            sres = sapling_detect(essay)
        except Exception as e:
            print(f"[Iter {i}] Detector error: {e}")
            break

        ai_prob = float(sres.get('score', 0.0))
        
        #feedback = "feedback (raw json):\n```json\n" + json.dumps(sres) + "\n```\n\n"
        feedback = parse_performance(sres)
        last_feedback = feedback

        history.append({
            'iter': i,
            'sapling_score': ai_prob,
            'sapling_feedback': feedback,
            'prompt': improved_prompt,
            'essay': essay,
            'essay_preview': essay[:140].replace('\n', ' ') + ('…' if len(essay) > 140 else ''),
            
        })

        if ai_prob < ai_prob_threshold:
            print('\n✅ Success criteria met. Saving results…')
            return {
                'history': pd.DataFrame(history),
                'passing_prompt': improved_prompt,
                'passing_essay': essay,
                'final_score': sres,
            }

        # Prepare feedback for next iteration
        if last_ai_prob is None:
            last_delta = None
        else:
            last_delta = ai_prob - last_ai_prob
        last_ai_prob = ai_prob
        cur_prompt = improved_prompt
        sleep(1)

    print('\n⚠️ Did not meet criteria within max iterations.')
    return {'history': pd.DataFrame(history)}

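# === Run ===
# Assumption: MAX_ITERS is not defined in the cells above; 8 matches evolve_prompt's default.
MAX_ITERS = globals().get("MAX_ITERS", 8)
results = evolve_prompt(INITIAL_PROMPT, BASE_TASK, max_iters=MAX_ITERS, ai_prob_threshold=AI_PROB_THRESHOLD)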
history_df = results['history']
history_df
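
The control flow of the loop can be dry-run offline before spending API calls. The sketch below is a simplified stand-in, assuming the same generate, score, improve cycle as `evolve_prompt`. The `evolve` function and its stub callables are hypothetical and replace the OpenAI and Sapling calls with deterministic lambdas.

```python
from typing import Callable

def evolve(seed: str,
           generate: Callable[[str], str],
           detect: Callable[[str], float],
           improve: Callable[[str, str, float], str],
           max_iters: int = 8,
           threshold: float = 0.2) -> dict:
    """Generate, score, and improve until the score drops below threshold."""
    prompt, history = seed, []
    for i in range(max_iters):
        output = generate(prompt)
        score = detect(output)
        history.append({"iter": i, "score": score, "prompt": prompt})
        if score < threshold:
            return {"history": history, "passing_prompt": prompt}
        prompt = improve(prompt, output, score)
    # No passing_prompt key when the budget runs out, as in evolve_prompt.
    return {"history": history}

# Hypothetical stubs: each "+" appended to the prompt lowers the fake score by 0.25.
result = evolve(
    seed="base",
    generate=lambda p: p.upper(),
    detect=lambda text: max(0.0, 0.9 - 0.25 * text.count("+")),
    improve=lambda p, out, score: p + "+",
    threshold=0.49,
)

print(len(result["history"]), result.get("passing_prompt"))  # 3 base++
```

Injecting the three callables keeps the loop testable and makes it easy to swap detectors or generators without touching the iteration logic.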

## 4) Save passing artifacts

In [None]:
from pathlib import Path

passing_prompt = results.get('passing_prompt')
passing_essay = results.get('passing_essay')
final_score = results.get('final_score')

if passing_prompt and passing_essay:
    outdir = Path('outputs')
    outdir.mkdir(exist_ok=True)
    (outdir / 'passing_prompt.txt').write_text(passing_prompt, encoding='utf-8')
    (outdir / 'passing_essay.txt').write_text(passing_essay, encoding='utf-8')
    print('Saved to ./outputs/passing_prompt.txt and ./outputs/passing_essay.txt')
    print('\nFinal Sapling response:')
    print(final_score)
else:
    print('No passing result to save.')

### Notes
- Sapling API returns `score` ∈ [0,1] where **0 = human-like** and **1 = AI-like**.
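- `sentence_scores` lists per-sentence scores; `parse_performance` sorts them in descending order and feeds all of them, with graded advice, back to the prompt improver.
- `tokens`/`token_probs` can contain BPE-style pieces (leading spaces); the cells above do not read them, so clean and deduplicate them first if you surface them in feedback.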
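
If token-level output is surfaced, the cleanup could look like the sketch below. It is a minimal illustration, assuming tokens arrive as a list of strings that may carry leading spaces or the common BPE markers `Ġ` and `▁`. The exact token format Sapling returns is not shown in this notebook, and `clean_tokens` is a hypothetical helper.

```python
def clean_tokens(tokens: list[str]) -> list[str]:
    """Strip whitespace and common BPE markers, drop empties, and dedupe case-insensitively in order."""
    seen, cleaned = set(), []
    for tok in tokens:
        # Assumption: "Ġ" and "▁" mark a leading space, as in GPT-2 and SentencePiece vocabularies.
        word = tok.replace("Ġ", " ").replace("▁", " ").strip()
        key = word.lower()
        if word and key not in seen:
            seen.add(key)
            cleaned.append(word)
    return cleaned

print(clean_tokens([" The", " river", "Ġriver", " ", " the", "bank"]))
# ['The', 'river', 'bank']
```

Deduplicating on the lowercased form keeps the first spelling seen, so the list stays readable without losing the original casing.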

In [None]:

# --- Demo: optimize a seed instruction and, if available, check invariants ---
SEED_INSTRUCTION = """Write an essay in the voice of an engaged 8th-grade student. Use first-person and active voice. Mix quick, punchy sentences with a few longer ones. Include exactly one brief, specific personal aside (a concrete time/place or named object). Favor concrete nouns and verbs, sensory detail, and a clear point of view. Allow one small rough edge (a casual phrase or slightly imperfect line) while keeping grammar solid overall. Avoid corporate buzzwords, thesaurus-y synonyms, and formulaic scaffolding (e.g., “In conclusion,” “This essay will…,” “Firstly/Secondly,” “Moreover/Furthermore”). Do not mention these instructions or AI—just write the piece and end on a concrete image or reflection, not a summary."""

# No previous output exists for the seed, so pass an empty string.
improved = generate_instruction(SEED_INSTRUCTION, current_output="", feedback=None, max_words=180, model=OPENAI_MODEL)
print("=== Improved Instruction ===\n", improved, "\n")

# check_invariants is not defined in this notebook; run it only if another cell provides it.
if "check_invariants" in globals():
    print("=== Invariant Check ===")
    print(check_invariants(improved))
