# Marketing Team - Week 03: Prompt Pattern Comparison

This notebook evaluates how prompting strategies and orchestration patterns affect content quality and brand alignment. We test three prompting strategies (zero-shot, few-shot, and chain-of-thought, or CoT) and three orchestration patterns (single-pass, reflection, and evaluator→optimizer: evaluate → optimize → re-evaluate).

---

## What We're Testing

- Prompting strategies: zero-shot, few-shot, and chain-of-thought (CoT), which differ in how many examples the prompt includes and how much reasoning scaffolding it provides
- Orchestration patterns: single-pass, reflection, and evaluator→optimizer loops (automated evaluation + targeted re-generation)
- Formats: LinkedIn short posts, LinkedIn long posts, Blog posts, Facebook posts
- Conditions: With and without RAG context; different template styles and CoT depth
- Metrics: Brand alignment, factual accuracy, tone consistency, reasoning transparency, and simple engagement proxies

**Architecture** (RAG context is optional): DocumentLoader → RAGHelper → VectorStore → PromptBuilder → LLM (with orchestration loop)
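
The three orchestration patterns can be sketched as plain control flow. This is a minimal illustration, not the project's `ContentGenerator` implementation (which is not shown in this notebook): `generate`, `critique`, and `score` are hypothetical callables standing in for LLM calls, and `threshold` is an assumed stopping criterion.

```python
from typing import Callable

# Hypothetical stand-ins for LLM calls; none of these are the project's real API.
Generate = Callable[[str], str]   # prompt -> draft
Critique = Callable[[str], str]   # draft -> feedback text
Score = Callable[[str], float]    # draft -> numeric quality score


def _revision_prompt(prompt: str, draft: str, feedback: str) -> str:
    """Combine the original prompt, the last draft, and feedback into a rewrite request."""
    return f"{prompt}\n\nPrevious draft:\n{draft}\n\nFeedback:\n{feedback}\n\nRewrite the draft."


def single_pass(prompt: str, generate: Generate) -> str:
    """One generation call, no revision."""
    return generate(prompt)


def reflection(prompt: str, generate: Generate, critique: Critique,
               max_iterations: int = 2) -> str:
    """Critique the draft and rewrite it a fixed number of times."""
    draft = generate(prompt)
    for _ in range(max_iterations):
        draft = generate(_revision_prompt(prompt, draft, critique(draft)))
    return draft


def evaluator_optimizer(prompt: str, generate: Generate, critique: Critique,
                        score: Score, threshold: float = 8.0,
                        max_iterations: int = 2) -> str:
    """Evaluate, rewrite only if the score is below the threshold, then re-evaluate."""
    draft = generate(prompt)
    for _ in range(max_iterations):
        if score(draft) >= threshold:
            break  # good enough, stop spending tokens
        draft = generate(_revision_prompt(prompt, draft, critique(draft)))
    return draft
```

The practical difference in this sketch is that reflection always runs every revision round, while the evaluator→optimizer loop can exit early once a draft clears the threshold.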

---

## Environment Setup

### Prerequisites
- Python 3.10+
- `.env` file with API keys (see `.env.example`)
- Virtual environment in workspace root

### Required Environment Variables
```
# OpenRouter
OPENROUTER_API_KEY=sk-...           

# Azure
AZURE_OPENAI_ENDPOINT=https://...   
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_API_VERSION=...

# Web search
TAVILY_API_KEY=...                  

# LangSmith Configuration
LANGSMITH_TRACING_V2=true
LANGSMITH_API_KEY=...
LANGSMITH_PROJECT=levelup360-marketing-team
LANGSMITH_ENDPOINT=https://api.smith.langchain.com

# Pricing Configuration
GPT4O_INPUT_PRICE_PER_1K=...
GPT4O_OUTPUT_PRICE_PER_1K=...
EMBEDDING_PRICE_PER_1K=...
TAVILY_PRICE_PER_CALL=...
```
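
A quick check for missing variables can save a confusing failure later in the notebook. The sketch below is an illustration only: `missing_env_vars` is a hypothetical helper, and treating just the provider keys as mandatory is an assumption (the LangSmith and pricing variables could be added to the list the same way).

```python
import os
from typing import Mapping, Optional, Sequence

# Assumed minimum set; extend with the LangSmith and pricing variables as needed.
REQUIRED_VARS = (
    "OPENROUTER_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
    "TAVILY_API_KEY",
)


def missing_env_vars(required: Sequence[str] = REQUIRED_VARS,
                     env: Optional[Mapping[str, str]] = None) -> list:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Run it after loading the `.env` file so that the values are already present in `os.environ`.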

### One-Time Setup (PowerShell)
```powershell
# From workspace root (this repo):
python -m venv .venv
.\.venv\Scripts\Activate.ps1

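# If activation is blocked, allow locally created scripts for the current user
# (this scope normally needs no admin rights), then re-run the activation line above: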
Set-ExecutionPolicy -Scope CurrentUser RemoteSigned

# Upgrade packaging tooling
python -m pip install --upgrade pip setuptools wheel

# Install project dependencies. Prefer requirements.txt; the fallback installs unpinned versions.
if (Test-Path ./requirements.txt) {
  pip install -r requirements.txt
} else {
  pip install openai python-dotenv pydantic pandas pyyaml rich langsmith chromadb tavily-python tiktoken ipykernel logger
}
```

## Notebook Flow

1. **Setup**
   - Import modules, initialize LLM/embedding clients and basic utilities.
2. **RAG**
   - Initialize RAG components (DocumentLoader, RAGHelper, VectorStore). Optional retrieval context for some runs.
3. **Prompt Builder Initialization**
   - Initialize external search client and `PromptBuilder`, and load brand configuration. (No prompts constructed here.)
4. **Content Generation & Pattern Comparison**
   - Initialize `ContentGenerator`, load topic seeds and run the pattern comparison experiments; save outputs and compute quick evaluation metrics.

**Data directories**:
- `configs/` - Brand YAML file
- `data/chroma_db/` - Persistent vector store
- `data/past_posts/` - Past posts used for RAG
- `data/test_topics/` - Test topics for generation and pattern comparison


## 1. Setup

Import modules and initialize clients (LLM, embeddings, cost tracking).
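
Cost tracking presumably combines token counts with the per-1K prices from the `.env` file. A minimal sketch of that arithmetic, assuming cost is linear in tokens: `estimate_cost` is a hypothetical helper, not part of `LLMClient`, and the prices in the usage line are placeholders rather than real rates.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Estimate the cost of one LLM call from token counts and per-1K-token prices."""
    input_cost = (input_tokens / 1000) * input_price_per_1k
    output_cost = (output_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


# Placeholder prices for illustration only; read the real ones from the environment.
example_cost = estimate_cost(1200, 300, input_price_per_1k=0.01, output_price_per_1k=0.03)
print(f"Estimated cost: {example_cost:.6f}")
```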

In [None]:
import os
import sys
import warnings
from pathlib import Path
from statistics import mean

import yaml
from rich import print as rprint

# Add marketing_team/src to path
current_dir = Path.cwd()
src_path = current_dir.parent / "src"
if src_path.exists() and str(src_path) not in sys.path:
    sys.path.insert(0, str(src_path))

# Import all modules
from utils.llm_client import LLMClient
from rag.vector_store import VectorStore
from rag.rag_helper import RAGHelper
from search.tavily_client import TavilySearchClient
from prompts.templates import (
    LINKEDIN_POST_ZERO_SHOT,
    LINKEDIN_POST_FEW_SHOT,
    LINKEDIN_LONG_POST_ZERO_SHOT,
    LINKEDIN_LONG_POST_FEW_SHOT,
    BLOG_POST,
    NEWSLETTER,
    FACEBOOK_POST_ZERO_SHOT,
    FACEBOOK_POST_FEW_SHOT
)
from prompts.prompt_builder import PromptBuilder
from generation.generator import ContentGenerator

rprint("[green]✓ All modules imported[/green]")

In [None]:
# Initialize LLM clients
# Suppress LangSmith warnings if tracing fails

warnings.filterwarnings('ignore', category=UserWarning, module='langsmith')

completion_client = LLMClient()
completion_client.get_client("openrouter")

embedding_client = LLMClient()
embedding_client.get_client("azure")

rprint("[green]✓ LLM clients initialized[/green]")
rprint("[dim]Note: LangSmith tracing errors can be safely ignored[/dim]")

## 2. RAG

Optionally include RAG context. This section initializes RAG components used for runs that include contextual retrieval.
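
The `chunk_size` and `chunk_overlap` settings used below describe a sliding window over the source text. The sketch that follows illustrates the idea under two assumptions: that sizes are counted in whitespace-separated words (the notebook does not state the unit), and that `chunk_words` is a hypothetical helper rather than `RAGHelper`'s actual method. The meaning of `chunk_threshold` is not stated either, so the sketch ignores it.

```python
def chunk_words(text: str, chunk_size: int = 150, chunk_overlap: int = 30) -> list:
    """Split text into overlapping chunks of at most chunk_size words."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks
```

With the defaults, each chunk shares its last 30 words with the start of the next one, so a sentence that straddles a boundary still appears whole in at least one chunk.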

In [None]:
# Initialize RAG components

rag_helper = RAGHelper(
    embedding_client=embedding_client,
    embedding_model="text-embedding-3-small",
    chunk_size=150,
    chunk_overlap=30,
    chunk_threshold=150
)

persist_dir = current_dir.parent / "data" / "chroma_db"
if not persist_dir.exists():
    raise FileNotFoundError(f"Vector DB not found at {persist_dir} – refusing to create a new one")

vector_store = VectorStore(str(persist_dir))

collection_name = "marketing_content"

rprint("[green]✓ RAG components initialized[/green]")
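
Conceptually, retrieval ranks stored chunks by how similar their embeddings are to the query embedding. The brute-force sketch below shows that idea with cosine similarity; it is an illustration only, since the vector store handles indexing and search internally and the distance metric it uses is not specified in this notebook. `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, docs, k: int = 3) -> list:
    """Return the texts of the k documents most similar to the query.

    docs is a list of (text, embedding) pairs.
    """
    ranked = sorted(docs, key=lambda d: cosine_similarity(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```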

## 3. Prompt Builder Initialization

This cell initializes the external search client and the `PromptBuilder`, and loads the brand configuration used by later prompt construction and generation steps. It does not build or run any prompts for the different patterns — prompt construction happens later when we prepare pattern-specific templates and few-shot examples.
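
For orientation, the difference between a zero-shot and a few-shot prompt comes down to whether example posts are included. The sketch below is a simplified illustration, not the `PromptBuilder` API or the project's templates: `build_prompt` and its parameters are hypothetical.

```python
from typing import Optional, Sequence


def build_prompt(topic: str, brand_name: str,
                 examples: Optional[Sequence[str]] = None,
                 context: Optional[str] = None) -> str:
    """Assemble a prompt; passing examples turns a zero-shot prompt into a few-shot one."""
    parts = [f"You write LinkedIn posts for {brand_name}."]
    if context:
        # Optional retrieved (RAG) or search context
        parts.append(f"Context:\n{context}")
    if examples:
        shots = "\n\n".join(f"Example {i}:\n{ex}" for i, ex in enumerate(examples, start=1))
        parts.append(shots)
    parts.append(f"Topic: {topic}\nPost:")
    return "\n\n".join(parts)


zero_shot = build_prompt("cloud cost control", "ExampleBrand")
few_shot = build_prompt("cloud cost control", "ExampleBrand",
                        examples=["First past post...", "Second past post..."])
```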

In [None]:
# Initialize prompt builder

tavily_client = TavilySearchClient()
prompt_builder = PromptBuilder(vector_store, rag_helper, tavily_client)

# Load brand config

brand_config_path = current_dir.parent / "configs" / "itconsulting.yaml"
with open(brand_config_path, 'r', encoding='utf-8') as f:
    brand_config = yaml.safe_load(f)

topic = "What are the most effective strategies for scaling enterprise cloud migrations while ensuring security, compliance, and cost control?"
brand = "itconsulting"

rprint(f"[green]✓ Prompt builder ready for brand: {brand_config['name']}[/green]")

## 4. Content Generation & Pattern Comparison

Initialize content generator, load topic seeds and run the pattern comparison experiments; save outputs and compute quick evaluation metrics.

In [None]:
# Initialize generator 

generator = ContentGenerator(
    llm_client=completion_client,
    vector_store=vector_store,
    rag_helper=rag_helper,
    brand_config=brand_config,
    collection_name=collection_name,
    search_client=tavily_client
)

In [None]:
# Load topics

# Load itconsulting topics
itconsulting_topics_path = current_dir.parent / "data" / "test_topics" / "itconsulting.yaml"
with open(itconsulting_topics_path, 'r', encoding='utf-8') as f:
    topics_data = yaml.safe_load(f)

# LinkedIn posts
linkedin_post_topics = [item['topic'] for item in topics_data['linkedin_posts']]

# LinkedIn long posts
linkedin_long_post_topics = [item['topic'] for item in topics_data['linkedin_long_posts']]

# Blog posts
blog_post_topics = [item['topic'] for item in topics_data['blog_posts']]


In [None]:
# Define model and temperature

model = "anthropic/claude-sonnet-4"
temperature = 0.5

In [None]:
# Generate content with Pattern 1: Single-pass

results_posts = generator.generate_batch(
    topics=linkedin_post_topics,
    pattern="single_pass",
    max_iterations=0,  # match the other single-pass calls below
    include_rag=False,
    include_search=False,
    content_type="linkedin_post",
    model=model,
    temperature=temperature
)

results_long_posts = generator.generate_batch(
    topics=linkedin_long_post_topics,
    pattern="single_pass",
    max_iterations=0,
    include_rag=False,
    include_search=False,
    content_type="linkedin_long_post",
    model=model,
    temperature=temperature
)

results_blog_posts = generator.generate_batch(
    topics=blog_post_topics,
    pattern="single_pass",
    max_iterations=0,
    include_rag=False,
    include_search=False,
    content_type="blog_post",
    model=model,
    temperature=temperature
)

results_single_pass = [{**item, 'content_type': 'linkedin_post'} for item in results_posts]
results_single_pass += [{**item, 'content_type': 'linkedin_long_post'} for item in results_long_posts]
results_single_pass += [{**item, 'content_type': 'blog_post'} for item in results_blog_posts]
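
The three near-identical `generate_batch` calls per pattern could be collapsed into one loop. A possible sketch, assuming `generate_batch` accepts the keyword arguments used in this notebook and returns a list of dicts; `run_pattern` is a hypothetical helper, not part of `ContentGenerator`.

```python
def run_pattern(generator, topics_by_type: dict, pattern: str, **kwargs) -> list:
    """Call generate_batch once per content type and tag each result with its type."""
    results = []
    for content_type, topics in topics_by_type.items():
        batch = generator.generate_batch(
            topics=topics,
            pattern=pattern,
            content_type=content_type,
            **kwargs,
        )
        results += [{**item, "content_type": content_type} for item in batch]
    return results


# Usage with the notebook's variables (not executed here):
# topics_by_type = {
#     "linkedin_post": linkedin_post_topics,
#     "linkedin_long_post": linkedin_long_post_topics,
#     "blog_post": blog_post_topics,
# }
# results_single_pass = run_pattern(
#     generator, topics_by_type, "single_pass",
#     include_rag=False, include_search=False,
#     model=model, temperature=temperature, max_iterations=0,
# )
```

Keeping the shared arguments in one place also makes it harder for the three calls to drift apart between patterns.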

In [None]:
# Print generated content with Pattern 1: Single-pass

for i, result in enumerate(results_single_pass):
    rprint("="*60)
    rprint(f"GENERATED POST {i+1}:")
    rprint("="*60)
    rprint(result['content'])
    rprint("="*60)

    rprint(f"\nCost: €{result['metadata']['cost']:.6f}")
    rprint(f"Tokens: {result['metadata']['input_tokens']} in / {result['metadata']['output_tokens']} out")

In [None]:
# Score content generated with single pass and print scoring

scored_single_pass = []

for result in results_single_pass:
    score = generator.score_content(content=result['content'], content_type=result['content_type'], model=model)
    scored_single_pass.append(result | score)
    scores = f"\nContent: {result['content']}\n\n"
    scores += f"  Average score: {score['average_score']}\n"
    scores += f"  Brand voice score: {score['brand_voice_score']}\n"
    scores += f"  Structure score: {score['structure_score']}\n"
    scores += f"  Accuracy score: {score['accuracy_score']}\n"
    scores += f"  Violations: {score['violations']}\n"
    scores += f"  Reasoning: {score['reasoning']}"
    rprint(scores)

In [None]:
# Calculate and print averages for single pass

single_pass_avg_score = {"avg_score": mean([result['average_score'] for result in scored_single_pass])}
single_pass_avg_cost = {"avg_cost": mean([result['metadata']['cost'] for result in scored_single_pass])}
single_pass_avg_latency = {"avg_latency": mean([result['metadata']['latency'] for result in scored_single_pass])}

rprint("="*60)
rprint("SINGLE PASS PATTERN AVERAGES:")
rprint("="*60)
rprint(f"\nAverage score: {single_pass_avg_score['avg_score']:.6f}")
rprint(f"\nCost: €{single_pass_avg_cost['avg_cost']:.6f}")
rprint(f"Latency: {single_pass_avg_latency['avg_latency']:.6f}")
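
A single average across all posts can hide differences between formats, since short posts and blog posts are scored together. One way to break the number down, assuming each scored item carries the `content_type` and `average_score` keys used above; `average_by_content_type` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean


def average_by_content_type(scored: list, key: str = "average_score") -> dict:
    """Group scored results by content type and average the chosen field per group."""
    groups = defaultdict(list)
    for item in scored:
        groups[item["content_type"]].append(item[key])
    return {content_type: mean(values) for content_type, values in groups.items()}


# Usage with the notebook's variables (not executed here):
# average_by_content_type(scored_single_pass)
```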

In [None]:
# Generate content with Pattern 2: Reflection

results_posts = generator.generate_batch(
    topics=linkedin_post_topics,
    pattern="reflection",
    include_rag=False,
    include_search=False,
    content_type="linkedin_post",
    model=model,
    temperature=temperature,
    max_iterations=2
)

results_long_posts = generator.generate_batch(
    topics=linkedin_long_post_topics,
    pattern="reflection",
    include_rag=False,
    include_search=False,
    content_type="linkedin_long_post",
    model=model,
    temperature=temperature,
    max_iterations=2
)

results_blog_posts = generator.generate_batch(
    topics=blog_post_topics,
    pattern="reflection",
    include_rag=False,
    include_search=False,
    content_type="blog_post",
    model=model,
    temperature=temperature,
    max_iterations=2
)

results_reflection = [{**item, 'content_type': 'linkedin_post'} for item in results_posts]
results_reflection += [{**item, 'content_type': 'linkedin_long_post'} for item in results_long_posts]
results_reflection += [{**item, 'content_type': 'blog_post'} for item in results_blog_posts]

In [None]:
# Print generated content with Pattern 2: Reflection

for i, result in enumerate(results_reflection):
    rprint("="*60)
    rprint(f"GENERATED POST {i+1}:")
    rprint("="*60)
    rprint(result['content'])
    rprint("="*60)

    rprint(f"\nCost: €{result['metadata']['cost']:.6f}")
    rprint(f"Tokens: {result['metadata']['input_tokens']} in / {result['metadata']['output_tokens']} out")

In [None]:
# Score content generated with reflection

scored_reflection = []

for result in results_reflection:
    score = generator.score_content(content=result['content'], content_type=result['content_type'], model=model)
    scored_reflection.append(result | score)
    scores = f"\nContent: {result['content']}\n\n"
    scores += f"  Average score: {score['average_score']}\n"
    scores += f"  Brand voice score: {score['brand_voice_score']}\n"
    scores += f"  Structure score: {score['structure_score']}\n"
    scores += f"  Accuracy score: {score['accuracy_score']}\n"
    scores += f"  Violations: {score['violations']}\n"
    scores += f"  Reasoning: {score['reasoning']}"
    rprint(scores)

In [None]:
# Calculate and print averages for reflection

reflection_avg_score = {"avg_score": mean([result['average_score'] for result in scored_reflection])}
reflection_avg_cost = {"avg_cost": mean([result['metadata']['cost'] for result in scored_reflection])}
reflection_avg_latency = {"avg_latency": mean([result['metadata']['latency'] for result in scored_reflection])}

rprint("="*60)
rprint("REFLECTION PATTERN AVERAGES:")
rprint("="*60)
rprint(f"\nAverage score: {reflection_avg_score['avg_score']:.6f}")
rprint(f"\nCost: €{reflection_avg_cost['avg_cost']:.6f}")
rprint(f"Latency: {reflection_avg_latency['avg_latency']:.6f}")

In [None]:
# Generate content with Pattern 3: Evaluator-optimizer

results_posts = generator.generate_batch(
    topics=linkedin_post_topics,
    pattern="evaluator_optimizer",
    include_rag=False,
    include_search=False,
    content_type="linkedin_post",
    model=model,
    temperature=temperature,
    max_iterations=2
)

results_long_posts = generator.generate_batch(
    topics=linkedin_long_post_topics,
    pattern="evaluator_optimizer",
    include_rag=False,
    include_search=False,
    content_type="linkedin_long_post",
    model=model,
    temperature=temperature,
    max_iterations=2
)

results_blog_posts = generator.generate_batch(
    topics=blog_post_topics,
    pattern="evaluator_optimizer",
    include_rag=False,
    include_search=False,
    content_type="blog_post",
    model=model,
    temperature=temperature,
    max_iterations=2
)

results_eval_optimizer = [{**item, 'content_type': 'linkedin_post'} for item in results_posts]
results_eval_optimizer += [{**item, 'content_type': 'linkedin_long_post'} for item in results_long_posts]
results_eval_optimizer += [{**item, 'content_type': 'blog_post'} for item in results_blog_posts]

In [None]:
# Print generated content with Pattern 3: Evaluator-optimizer

for i, result in enumerate(results_eval_optimizer):
    rprint("="*60)
    rprint(f"GENERATED POST {i+1}:")
    rprint("="*60)
    rprint(result['content'])
    rprint("="*60)

    rprint(f"\nCost: €{result['metadata']['cost']:.6f}")
    rprint(f"Tokens: {result['metadata']['input_tokens']} in / {result['metadata']['output_tokens']} out")

In [None]:
# Score content generated with evaluator-optimizer

scored_eval_optimizer = []

for result in results_eval_optimizer:
    score = generator.score_content(content=result['content'], content_type=result['content_type'], model=model)
    scored_eval_optimizer.append(result | score)
    scores = f"\nContent: {result['content']}\n\n"
    scores += f"  Average score: {score['average_score']}\n"
    scores += f"  Brand voice score: {score['brand_voice_score']}\n"
    scores += f"  Structure score: {score['structure_score']}\n"
    scores += f"  Accuracy score: {score['accuracy_score']}\n"
    scores += f"  Violations: {score['violations']}\n"
    scores += f"  Reasoning: {score['reasoning']}"
    rprint(scores)

In [None]:
# Calculate and print averages for evaluator-optimizer

eval_optimizer_avg_score = {"avg_score": mean([result['average_score'] for result in scored_eval_optimizer])}
eval_optimizer_avg_cost = {"avg_cost": mean([result['metadata']['cost'] for result in scored_eval_optimizer])}
eval_optimizer_avg_latency = {"avg_latency": mean([result['metadata']['latency'] for result in scored_eval_optimizer])}

rprint("="*60)
rprint("EVAL OPTIMIZER PATTERN AVERAGES:")
rprint("="*60)
rprint(f"\nAverage score: {eval_optimizer_avg_score['avg_score']:.6f}")
rprint(f"\nCost: €{eval_optimizer_avg_cost['avg_cost']:.6f}")
rprint(f"Latency: {eval_optimizer_avg_latency['avg_latency']:.6f}")
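
With all three patterns scored, the averages are easier to compare side by side than in three separate printouts. A small sketch of such a summary, assuming the scored items have the `average_score` and `metadata` fields used above; `summarize_pattern` and `comparison_table` are hypothetical helpers, and the latency unit is whatever the generator records.

```python
from statistics import mean


def summarize_pattern(name: str, scored: list) -> dict:
    """Average score, cost, and latency for one pattern's scored results."""
    return {
        "pattern": name,
        "avg_score": mean(r["average_score"] for r in scored),
        "avg_cost": mean(r["metadata"]["cost"] for r in scored),
        "avg_latency": mean(r["metadata"]["latency"] for r in scored),
    }


def comparison_table(summaries: list) -> str:
    """Render the summaries as a fixed-width text table."""
    header = f"{'pattern':<22}{'score':>8}{'cost':>12}{'latency':>10}"
    rows = [
        f"{s['pattern']:<22}{s['avg_score']:>8.2f}{s['avg_cost']:>12.6f}{s['avg_latency']:>10.2f}"
        for s in summaries
    ]
    return "\n".join([header, *rows])


# Usage with the notebook's variables (not executed here):
# print(comparison_table([
#     summarize_pattern("single_pass", scored_single_pass),
#     summarize_pattern("reflection", scored_reflection),
#     summarize_pattern("evaluator_optimizer", scored_eval_optimizer),
# ]))
```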

In [None]:
# Print all generated content grouped by pattern

def format_results(title, results):
    """Render one pattern's results as a delimited text block."""
    lines = ["=" * 60, title]
    for i, item in enumerate(results, start=1):
        lines += [
            "=" * 60,
            f"GENERATED POST {i}:",
            f"Content type: {item.get('content_type', '<unknown>')}",
            "-" * 60,
            item.get("content", ""),
            "=" * 60,
        ]
    return "\n".join(lines) + "\n"


content = "".join(
    format_results(title, results)
    for title, results in [
        ("SINGLE PASS GENERATED POSTS", results_single_pass),
        ("REFLECTION GENERATED POSTS", results_reflection),
        ("EVAL OPTIMIZER GENERATED POSTS", results_eval_optimizer),
    ]
)

rprint(content)