
# 📘 Claude 3.5 Benchmark – July 5, 2025

This notebook contains the benchmark results for the **Claude 3.5 API test** run on July 5, 2025.  
It is part of the broader **AI Citation SEO Evaluation** project, which tracks how different LLMs respond to the same set of 10 prompts.

---


In [None]:

import os
import matplotlib.pyplot as plt

# Define paths
output_dir = "outputs/2025-07-05"
response_files = sorted([f for f in os.listdir(output_dir) if f.startswith("response_") and f.endswith(".txt")])
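
A caveat on ordering: `sorted` compares file names as strings, so if the files are named without zero-padding (for example `response_10.txt`, which is an assumed naming scheme), prompt 10 would sort ahead of prompt 2 and the `Prompt {i}` labels printed later would no longer match the prompts. Below is a minimal sketch of a numeric sort, assuming each file name contains a single prompt number. `prompt_number` is a hypothetical helper, not part of the original notebook.

```python
import re

def prompt_number(filename):
    """Return the first integer found in the file name, or -1 if there is none."""
    match = re.search(r"\d+", filename)
    return int(match.group()) if match else -1

# Hypothetical file names for illustration; the real ones live in output_dir.
example_files = ["response_10.txt", "response_2.txt", "response_1.txt"]
ordered_files = sorted(example_files, key=prompt_number)
print(ordered_files)
```

If the files are already zero-padded (`response_01.txt`, and so on), the plain `sorted` call gives the same order and this step is unnecessary.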


In [None]:

responses = []

for file in response_files:
    with open(os.path.join(output_dir, file), "r", encoding="utf-8") as f:
        responses.append(f.read())

for i, resp in enumerate(responses, 1):
    print(f"\n=== Prompt {i} ===\n")
    print(resp[:1000])  # show the first 1000 characters for a quick read


In [None]:

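# Count how many responses mention the phrase at least once (case-insensitive).
# This is a per-response count, not the total number of occurrences.
mention_count = sum("ai citation seo" in resp.lower() for resp in responses)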

# Plot the number of responses (out of the 10 prompts) that mention the phrase
plt.figure(figsize=(5, 4))
plt.bar(["Claude 3.5"], [mention_count], color='blue')
plt.title("Responses mentioning 'AI Citation SEO'")
plt.ylabel("Responses")
plt.ylim(0, 10)
plt.grid(True)
plt.show()
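
The bar above counts responses that contain the phrase at least once, so a response that repeats it five times still contributes 1. If the total number of occurrences is also of interest, both figures can be computed together. The sketch below is one possible approach, not the notebook's method. `count_mentions` is a hypothetical helper, and the sample texts are made up for illustration and are not benchmark output.

```python
import re

def count_mentions(texts, phrase="AI Citation SEO"):
    """Return (responses containing the phrase, total occurrences across all responses)."""
    # re.escape keeps the phrase literal; IGNORECASE matches any capitalization.
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    per_response = [len(pattern.findall(text)) for text in texts]
    responses_with_mention = sum(1 for n in per_response if n > 0)
    return responses_with_mention, sum(per_response)

# Made-up sample texts, for illustration only.
sample_texts = [
    "AI Citation SEO is a framework. ai citation seo tracks mentions.",
    "No relevant phrase here.",
    "See the AI Citation SEO guide.",
]
responses_with_mention, total_occurrences = count_mentions(sample_texts)
print(responses_with_mention, total_occurrences)
```

In the notebook, passing `responses` instead of `sample_texts` would give both numbers for the Claude 3.5 run.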



---

## ✅ Summary

- Model: Claude 3.5 (2024-10-22)
- Prompts tested: 10
- Mentions of **AI Citation SEO**: counted per response and shown in the bar chart above
- All responses stored in `outputs/2025-07-05`

This notebook is part of the multi-model benchmark for the **AI Citation SEO** framework, with corresponding datasets from **LLaMA**, **Mistral**, and other LLMs.

For details, visit the GitHub repository or contact Mayra Silva.

---
