# 🧠 Claude 3.7 Sonnet – Human vs AI Text Classifier
This notebook uses Anthropic's Claude 3.7 Sonnet model to classify whether each text input was written by a human or generated by AI.

### 📦 Install the required package (optional)

In [ ]:
# !pip install anthropic

### 📚 Import necessary libraries

In [ ]:
import os
import csv
import anthropic
import pandas as pd

### 🔐 Insert your Claude API key

In [ ]:
api_key = "sk-ant-..."  # Replace with your actual Claude API key

### ✅ Validate the format of your API key

In [ ]:
if not api_key or not api_key.startswith("sk-ant-"):
    raise ValueError("❌ Invalid API key. The key must start with 'sk-ant-'.")
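
Hardcoding the key in a notebook cell makes it easy to leak when the file is shared. A minimal sketch of an alternative, assuming the key is stored in an environment variable named `ANTHROPIC_API_KEY` (the variable the `anthropic` SDK reads by default). The helper `load_api_key` is hypothetical and not part of the original notebook:

```python
import os


def load_api_key(env_var: str = "ANTHROPIC_API_KEY") -> str:
    """Read the API key from an environment variable and check its prefix."""
    key = os.environ.get(env_var, "").strip()
    if not key.startswith("sk-ant-"):
        raise ValueError(
            f"Environment variable {env_var} must hold a key starting with 'sk-ant-'."
        )
    return key
```

With this in place, `api_key = load_api_key()` could replace the hardcoded assignment, and the same prefix check still applies.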

### 🤖 Initialize Claude client

In [ ]:
client = anthropic.Anthropic(api_key=api_key)

### 📝 Build the prompt with task instructions and a few-shot example

In [ ]:
prompt = """
You are an advanced AI content detection system, designed to distinguish between texts written by humans and those generated by artificial intelligence.  
You will act as an automated evaluator similar to tools like GPTZero, analyzing the linguistic patterns, structure, and writing style of each passage to determine its most likely origin: Human or AI.

Instructions:
- Human: if the text is written by a human.
- AI: if the text is generated by an AI.
- Ignore the ID when analyzing the text.
- Output strictly in CSV format: ID;Label
- Use exactly "Human" or "AI" as labels.
- No explanations. No headers. No extra formatting.

Example Input:
ID;Text
E0-1;The use of statistical tools in climate modeling has evolved significantly over time.
E0-2;Unlock the power of the universe with our AI-driven magic story generator.

Example Output:
E0-1;Human
E0-2;AI
"""

### 📂 Load the input dataset and append to the prompt

In [ ]:
input_lines = []
with open("data/submission3_inputs.csv", mode='r', encoding='utf-8', newline='') as file:
    reader = csv.reader(file, delimiter='\t')  # columns are read as tab-separated
    next(reader)  # skip header
    for row in reader:
        input_lines.append(f"{row[0]};{row[1]}\n")  # one "ID;Text" line per row

prompt += "\n### Input Dataset:\n" + "".join(input_lines)
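
The cell above places the whole dataset in a single prompt, so a large file may produce a reply longer than the `max_tokens` budget. A minimal sketch of splitting the rows into batches, one prompt per batch; `build_prompts` and the default `batch_size` of 50 are hypothetical choices, not values from the original notebook:

```python
from typing import Iterable, List, Tuple


def build_prompts(
    instructions: str,
    rows: Iterable[Tuple[str, str]],
    batch_size: int = 50,  # assumed value; tune to the dataset and token budget
) -> List[str]:
    """Return one prompt per batch of (id, text) rows."""
    lines = [f"{row_id};{text}" for row_id, text in rows]
    prompts = []
    for start in range(0, len(lines), batch_size):
        chunk = "\n".join(lines[start:start + batch_size])
        prompts.append(f"{instructions}\n### Input Dataset:\n{chunk}\n")
    return prompts
```

Each returned prompt can then be sent in its own request and the replies concatenated before parsing.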

### 📡 Send the prompt to Claude 3.7 Sonnet

In [ ]:
message = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=4000,
    temperature=0.0,
    top_p=1,
    messages=[{"role": "user", "content": prompt}]
)
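
If the reply reaches the `max_tokens` limit, the API reports `stop_reason == "max_tokens"` and the last rows of the output are silently missing. A minimal sketch of a guard, assuming the response exposes `stop_reason` and `content[0].text` as in the cell above; `response_text` is a hypothetical helper, and the `SimpleNamespace` object is only a stand-in for a real response:

```python
from types import SimpleNamespace


def response_text(message) -> str:
    """Return the text of the first content block, or raise if the reply was cut off."""
    if message.stop_reason == "max_tokens":
        raise RuntimeError("Response hit max_tokens; some rows may be missing.")
    return message.content[0].text


# Stand-in object with the same attributes as an API response, for illustration only
fake = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(text="E0-1;Human\nE0-2;AI")],
)
print(response_text(fake))
```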

### 🧾 Parse the model response into a structured list

In [ ]:
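results = message.content[0].text.strip().splitlines()
# Keep only "ID;Label" rows and trim stray whitespace around each field
parsed = [[field.strip() for field in row.split(';')] for row in results if ';' in row]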
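
The parsing step accepts any line containing a semicolon, so an unexpected label or a stray sentence from the model would pass through unnoticed. A minimal sketch of a stricter parser that keeps only the two labels named in the prompt and collects everything else for inspection; `parse_labels` and `VALID_LABELS` are hypothetical names introduced for illustration:

```python
from typing import Dict, List, Tuple

VALID_LABELS = {"Human", "AI"}  # the two labels the prompt asks for


def parse_labels(raw: str) -> Tuple[Dict[str, str], List[str]]:
    """Split 'ID;Label' lines into a dict of labels and a list of rejected lines."""
    labels: Dict[str, str] = {}
    rejected: List[str] = []
    for line in raw.strip().splitlines():
        row_id, sep, label = line.strip().partition(";")
        label = label.strip()
        if sep and row_id and label in VALID_LABELS:
            labels[row_id.strip()] = label
        elif line.strip():
            rejected.append(line)  # keep the raw line for later review
    return labels, rejected
```

Calling `parse_labels(message.content[0].text)` would return the accepted labels together with any lines that need manual review.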

### 📊 Construct a DataFrame with the results

In [ ]:
ids = [row[0] for row in parsed]
labels = [row[1] for row in parsed]
output_df = pd.DataFrame({"ID": ids, "Label": labels})
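
Because the model returns free-form text, some input IDs may be absent from the results. A minimal sketch of a completeness check in plain Python; `find_missing_ids` is a hypothetical helper, and the ID lists below are made-up sample values:

```python
from typing import Iterable, List


def find_missing_ids(expected_ids: Iterable[str], labeled_ids: Iterable[str]) -> List[str]:
    """Return the expected IDs that have no label, preserving input order."""
    seen = set(labeled_ids)
    return [row_id for row_id in expected_ids if row_id not in seen]


# Made-up sample values for illustration
print(find_missing_ids(["E0-1", "E0-2", "E0-3"], ["E0-1", "E0-3"]))  # prints ['E0-2']
```

In the notebook, the expected IDs would come from the first column of the input file and the labeled IDs from the `ids` list.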

### 💾 Save the classification results to a file

In [ ]:
output_df.to_csv("submissao3-grupo5-s1.csv", sep="\t", index=False)
print("✅ Results saved successfully to 'submissao3-grupo5-s1.csv'")