# Claude SDK - Learning Notebook

This tutorial introduces you to working with the Claude API using the Anthropic SDK, progressing from simple API calls to building a CV scoring and retrieval system.

## Learning Objectives

By the end of this tutorial, you will be able to:
- Use the Claude API to generate content
- Build a CV scoring system that evaluates resumes against job descriptions
- Implement RAG-based CV retrieval using vector similarity search
- Combine fast retrieval (FAISS) with intelligent re-ranking (Claude)

## Prerequisites

Before starting, make sure you have:
- Obtained an Anthropic API key from the [Anthropic Console](https://console.anthropic.com/)
- Installed the required dependencies listed in `pyproject.toml` via `uv sync`
- Set up your `.env` file with `ANTHROPIC_API_KEY=your_key_here`

## Part 1: Getting Started with Claude SDK

### Import the SDK and Setup

In [None]:
from anthropic import Anthropic
import os
from dotenv import load_dotenv
from IPython.display import Markdown, display

# Load environment variables from .env file
load_dotenv()

### Initialize the Claude Client

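The Anthropic SDK provides an `Anthropic` client class for making requests. The client authenticates with your API key; if you don't pass `api_key` explicitly, it reads the `ANTHROPIC_API_KEY` environment variable.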

In [None]:
# Initialize the Claude client
api_key = os.getenv("ANTHROPIC_API_KEY")
if not api_key:
    raise ValueError("ANTHROPIC_API_KEY not found in environment. Please set it in .env file.")

client = Anthropic(api_key=api_key)
print("Claude client initialized successfully!")

### Run Your First Prompt

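Let's start with a simple text generation request. The examples use `claude-3-5-sonnet-20241022`, a dated snapshot of Claude 3.5 Sonnet. If that model ID is no longer available, substitute a current one from Anthropic's model documentation.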

In [None]:
# Simple text generation
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain AI to me like I'm a kid."}
    ]
)

print(message.content[0].text)

The response can be rendered directly as markdown in notebooks:

In [None]:
# Display as markdown
display(Markdown(message.content[0].text))

### Start a Chat Conversation

Claude supports multi-turn conversations, but the Messages API is stateless: it does not store history between requests. To maintain context, send the full list of previous user and assistant messages with every call.

In [None]:
# Start a conversation
messages = [
    {"role": "user", "content": "Hello! My name is Alex."}
]

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=messages
)

print(response.content[0].text)

In [None]:
# Continue the conversation - add previous messages to maintain context
messages.append({"role": "assistant", "content": response.content[0].text})
messages.append({"role": "user", "content": "What is my name?"})

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=messages
)

print(response.content[0].text)
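Because the history has to be resent on every request, it can help to wrap the bookkeeping in a small helper. The following is a minimal sketch, not part of the SDK: `Conversation` and `send_fn` are hypothetical names, and `fake_send` is an offline stand-in so the example runs without an API key.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Conversation:
    """Keeps the running message list that must be resent on every request."""

    def __init__(self, send_fn: Callable[[List[Message]], str]):
        # send_fn is a hypothetical hook: it receives the full history and
        # returns the assistant's reply text.
        self._send_fn = send_fn
        self.messages: List[Message] = []

    def ask(self, user_text: str) -> str:
        # Record the user turn, send the whole history, then record the reply
        self.messages.append({"role": "user", "content": user_text})
        reply = self._send_fn(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(history: List[Message]) -> str:
    # Offline stand-in for a real API call (assumption for this sketch)
    return f"(reply to turn {len(history)})"


chat = Conversation(fake_send)
chat.ask("Hello! My name is Alex.")
chat.ask("What is my name?")
print([m["role"] for m in chat.messages])
```

With the real client, `send_fn` could be a function that calls `client.messages.create(model=..., max_tokens=1024, messages=history)` and returns `.content[0].text`, as in the cells above.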

### Control Generation Parameters

You can control the model's behavior using parameters like `temperature` and `max_tokens`.

In [None]:
# Example with different parameters
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=500,  # Limit response length
    temperature=0.7,  # Control randomness (0.0 to 1.0; lower = more focused, higher = more varied)
    messages=[
        {"role": "user", "content": "Write a short haiku about programming."}
    ]
)

print(response.content[0].text)

## Part 2: CV Scoring Project

In this section, we'll build a system that scores a CV against a job description using Claude's reasoning capabilities.

### Load a Sample CV

Let's load a CV from the CVs folder to use for scoring.

In [None]:
# Load a sample CV
cv_path = "../CVs/Topic_1/01_en.md"

with open(cv_path, 'r', encoding='utf-8') as f:
    cv_content = f.read()

print(f"CV loaded: {len(cv_content)} characters")
print("\nFirst 500 characters:")
print(cv_content[:500])

### Define a Job Description

Create a sample job description to score the CV against.

In [None]:
job_description = """
AI Engineer Position

We are looking for an experienced AI Engineer to join our team.

Requirements:
- Strong experience with Python and machine learning frameworks (PyTorch, TensorFlow)
- Experience deploying ML models to production (Docker, FastAPI, AWS)
- Background in NLP or Computer Vision
- Experience with Hugging Face transformers
- Strong problem-solving skills and ability to work in a team

Nice to have:
- Experience with Kubernetes
- Knowledge of model optimization techniques (quantization, pruning)
- Experience with recommendation systems
"""

print(job_description)

### Create Scoring Prompt

Build a prompt that asks Claude to score the CV against the job description and provide detailed reasoning.

In [None]:
scoring_prompt = f"""You are an expert recruiter evaluating a CV against a job description.

Job Description:
{job_description}

CV:
{cv_content}

Please evaluate this CV and provide:
1. An overall match score from 0-100
2. Detailed reasoning for the score
3. Key strengths that match the job requirements
4. Areas where the candidate falls short
5. Specific examples from the CV that support your evaluation

Format your response as a structured evaluation."""

### Call Claude API to Score the CV

Now let's send the prompt to Claude and get the scoring results.

In [None]:
# Get scoring from Claude
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=2048,
    messages=[
        {"role": "user", "content": scoring_prompt}
    ]
)

scoring_result = response.content[0].text
print(scoring_result)

### Display Results as Markdown

Let's render the scoring results nicely:

In [None]:
# Display formatted results
display(Markdown(scoring_result))

### Extract Structured Output (Optional)

If you want to extract specific values such as the score, you can parse the response text with a regular expression, or ask Claude to return structured data (for example, JSON, or tool use with an input schema).

In [None]:
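# Simple extraction of the score (only works if Claude states it in a recognizable format)
import re

# Match patterns such as "85/100", "85 out of 100", or "85%".
# re.search returns the first match, so an unrelated percentage earlier in the text could be picked up.
score_match = re.search(r'(\d+)\s*(?:out of 100|/100|%)', scoring_result, re.IGNORECASE)
if score_match:
    extracted_score = int(score_match.group(1))  # convert to int for later comparisons
    print(f"Extracted Score: {extracted_score}/100")
else:
    print("Score not found in expected format. Check the full response above.")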
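A regex over free text is fragile. One alternative is to ask Claude to include a small JSON object in its reply and parse that first. The sketch below assumes the scoring prompt has been changed to request something like `{"score": 78}`; `parse_score` is a hypothetical helper, and it falls back to a stricter free-text pattern when no JSON is found.

```python
import json
import re
from typing import Optional


def parse_score(text: str) -> Optional[int]:
    """Return a 0-100 score found in a reply, or None if none is found."""
    # Preferred: a flat JSON object such as {"score": 78} somewhere in the reply
    for candidate in re.findall(r"\{[^{}]*\}", text):
        try:
            data = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        score = data.get("score")
        is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
        if is_number and 0 <= score <= 100:
            return int(score)
    # Fallback: free-text patterns like "78/100" or "78 out of 100"
    match = re.search(r"\b(\d{1,3})\s*(?:/\s*100|out of 100)", text, re.IGNORECASE)
    if match and 0 <= int(match.group(1)) <= 100:
        return int(match.group(1))
    return None


print(parse_score('Strong match overall.\n{"score": 78}'))
print(parse_score("Overall: 65 out of 100"))
```

In the notebook this could be called as `parse_score(scoring_result)`. Values outside 0-100 are rejected rather than clamped, so a malformed reply shows up as `None`.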

## Part 3: RAG-Based CV Retrieval

In this section, we'll build a RAG (Retrieval-Augmented Generation) system that:
1. Uses FAISS for fast similarity search across multiple CVs
2. Uses Claude for intelligent re-ranking and scoring

**Hybrid Approach:** We combine the speed of vector similarity search (FAISS) with Claude's reasoning capabilities for the best of both worlds.

### Load All CV Files

First, let's load all CV markdown files from the CVs folder.

In [None]:
from pathlib import Path

# Find all markdown CV files
cv_base_path = Path("../CVs")
cv_files = list(cv_base_path.rglob("*.md"))

print(f"Found {len(cv_files)} CV files")

# Load all CVs
cv_data = []
for cv_file in cv_files:
    with open(cv_file, 'r', encoding='utf-8') as f:
        content = f.read()
        cv_data.append({
            'path': str(cv_file),
            'name': cv_file.stem,
            'content': content
        })

print(f"\nLoaded {len(cv_data)} CVs")
print(f"Sample CV names: {[cv['name'] for cv in cv_data[:5]]}")

### Initialize Sentence Transformer Model

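**Important:** Despite the name, `sentence-transformers` accepts inputs longer than a single sentence, and each input produces one fixed-size embedding vector (384 dimensions for `all-MiniLM-L6-v2`). Each model has a maximum sequence length, though: `all-MiniLM-L6-v2` truncates input beyond 256 word pieces by default, so for a long CV only the opening portion contributes to the embedding. That is often acceptable for a first-pass retrieval step, but very long documents may need to be split into chunks.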

In [None]:
from sentence_transformers import SentenceTransformer

# Initialize the model; input longer than model.max_seq_length is truncated
# This model produces 384-dimensional embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')

print("Sentence transformer model loaded!")
print(f"Model will produce embeddings of dimension: {model.get_sentence_embedding_dimension()}")
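If truncation is a concern for long CVs, one common workaround is to split each document into overlapping windows, embed each window, and combine the vectors (for example, by averaging). The sketch below shows only the splitting step. `chunk_words` and its default sizes are assumptions for illustration, and word counts only approximate the model's word-piece tokens.

```python
from typing import List


def chunk_words(text: str, max_words: int = 150, overlap: int = 30) -> List[str]:
    """Split text into overlapping word windows of at most max_words words."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
print(chunk_words(sample, max_words=4, overlap=1))
```

Each chunk could then be passed to `model.encode`, with the resulting chunk vectors averaged into a single vector per CV. Whether averaging or keeping one vector per chunk works better depends on the data.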

### Generate Embeddings for All CVs

Each CV becomes one embedding vector; text beyond the model's maximum sequence length is truncated before encoding.

In [None]:
# Extract CV texts
cv_texts = [cv['content'] for cv in cv_data]

# Generate embeddings - each CV becomes one vector
# Shape will be (num_cvs, 384) - one 384-dimensional vector per CV
print("Generating embeddings for all CVs...")
cv_embeddings = model.encode(cv_texts, show_progress_bar=True)

print(f"\nEmbeddings shape: {cv_embeddings.shape}")
print(f"Number of CVs: {len(cv_data)}")
print(f"Embedding dimension: {cv_embeddings.shape[1]}")

### Generate Embedding for Job Description

Now let's create an embedding for the job description so we can find similar CVs.

In [None]:
# Generate embedding for job description
job_embedding = model.encode([job_description])

print(f"Job description embedding shape: {job_embedding.shape}")
print("Job description embedded successfully!")

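### Build the FAISS Index

Next, let's index the CV embeddings with FAISS so they can be searched by distance to the job description embedding.

In [None]: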
import faiss
import numpy as np

# Get the dimension of embeddings
dimension = cv_embeddings.shape[1]

# Create FAISS index using L2 (Euclidean) distance
index = faiss.IndexFlatL2(dimension)

# Convert to float32 (required by FAISS) and add embeddings
index.add(cv_embeddings.astype('float32'))

print(f"FAISS index created with {index.ntotal} CVs")
print(f"Index dimension: {dimension}")
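`IndexFlatL2` ranks by squared Euclidean distance rather than cosine similarity. For unit-length vectors the two are directly related: squared L2 distance equals `2 - 2 * cosine`, so both give the same ordering. The pure-Python sketch below checks that identity on a toy pair; the helper names are hypothetical.

```python
import math
from typing import List, Sequence


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_unit(a: Sequence[float], b: Sequence[float]) -> float:
    # Dot product; equals cosine similarity only when both inputs are unit vectors
    return sum(x * y for x, y in zip(a, b))


a = normalize([3.0, 4.0])
b = normalize([1.0, 0.0])
print(squared_l2(a, b), 2 - 2 * cosine_unit(a, b))
```

Both printed values are approximately 0.8. If the embeddings are not already unit length, `model.encode(..., normalize_embeddings=True)` normalizes them so that L2 ranking matches cosine ranking.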

### Search for Top-K Most Relevant CVs

Now let's search for the CVs most similar to the job description using FAISS.
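The search cell is not included here. With a FAISS flat L2 index, the query is typically `distances, indices = index.search(job_embedding.astype('float32'), k)`, which returns squared L2 distances in ascending order together with the row positions of the matching vectors. As an illustration of what that call computes, here is a brute-force pure-Python equivalent for small data; `top_k_l2` and the toy vectors are hypothetical.

```python
from typing import List, Sequence, Tuple


def top_k_l2(query: Sequence[float],
             vectors: Sequence[Sequence[float]],
             k: int) -> List[Tuple[int, float]]:
    """Return (row_index, squared_l2_distance) for the k nearest vectors."""
    scored = [
        (i, sum((q - x) ** 2 for q, x in zip(query, vec)))
        for i, vec in enumerate(vectors)
    ]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = more similar
    return scored[:k]


toy_vectors = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]  # stand-ins for CV embeddings
toy_query = [1.0, 0.0]                               # stand-in for the job embedding
print(top_k_l2(toy_query, toy_vectors, k=2))  # → [(1, 0.0), (2, 0.5)]
```

The row indices map back to entries in `cv_data`, since the embeddings were added to the index in the same order as the CVs were loaded.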

### Display Retrieved CVs with Similarity Scores

Let's see which CVs were retrieved and their similarity scores.
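The display cell is not included here either. One possible way to present the results is a small markdown table built from `(row_index, distance)` pairs and `cv_data`-style dictionaries. `format_results` and the sample data are hypothetical; note that with L2 distance, lower values mean closer matches.

```python
from typing import Dict, List, Tuple


def format_results(hits: List[Tuple[int, float]], cvs: List[Dict[str, str]]) -> str:
    """Build a markdown table of retrieved CVs, best match first."""
    lines = ["| Rank | CV | L2 distance |", "| --- | --- | --- |"]
    for rank, (idx, dist) in enumerate(hits, start=1):
        lines.append(f"| {rank} | {cvs[idx]['name']} | {dist:.4f} |")
    return "\n".join(lines)


sample_cvs = [{"name": "01_en"}, {"name": "02_en"}, {"name": "03_en"}]  # hypothetical
sample_hits = [(1, 0.0), (2, 0.5)]                                      # hypothetical
print(format_results(sample_hits, sample_cvs))
```

In the notebook, the returned string could be rendered with `display(Markdown(...))`, passing `cv_data` as the second argument.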

### Use Claude to Re-rank and Score Top Candidates

Now let's use Claude's reasoning capabilities to intelligently re-rank and score the top candidates. This combines fast retrieval (FAISS) with intelligent evaluation (Claude).
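The cell that builds `rerank_prompt` is not shown, so the following is one possible construction rather than the original. It places the job description and each retrieved CV in tagged sections and caps the length of each CV. `build_rerank_prompt`, the tag names, and `max_chars` are assumptions for this sketch.

```python
from typing import Dict, List


def build_rerank_prompt(job_description: str,
                        candidates: List[Dict[str, str]],
                        max_chars: int = 4000) -> str:
    """Assemble one prompt containing the job description and each candidate CV."""
    sections = []
    for i, cv in enumerate(candidates, start=1):
        name = cv["name"]
        body = cv["content"][:max_chars]  # cap each CV to keep the prompt bounded
        sections.append(f"<cv id='{i}' name='{name}'>\n{body}\n</cv>")
    joined = "\n\n".join(sections)
    return (
        "You are an expert recruiter. Rank the candidates below against the "
        "job description, best match first.\n\n"
        f"<job_description>\n{job_description.strip()}\n</job_description>\n\n"
        f"{joined}\n\n"
        "For each candidate give a 0-100 match score and a short justification."
    )


demo_candidates = [  # hypothetical sample data
    {"name": "01_en", "content": "Python, PyTorch, Docker"},
    {"name": "02_en", "content": "Java, Spring"},
]
demo_prompt = build_rerank_prompt("AI Engineer", demo_candidates, max_chars=5)
print(demo_prompt)
```

In the notebook, `rerank_prompt` could be assigned from this function by passing `job_description` and the `cv_data` entries selected by the FAISS search.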

In [None]:
# Get Claude's re-ranking
rerank_response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=2048,
    messages=[
        {"role": "user", "content": rerank_prompt}
    ]
)

rerank_result = rerank_response.content[0].text
print(rerank_result)

In [None]:
# Display formatted results
display(Markdown(rerank_result))

## Summary

In this notebook, we've learned:

1. **Basic Claude SDK Usage**: How to initialize the client and make simple API calls
2. **CV Scoring**: How to use Claude to evaluate a single CV against a job description
3. **RAG-Based Retrieval**: How to combine:
   - **FAISS + sentence-transformers**: Fast vector similarity search across multiple CVs
   - **Claude API**: Intelligent re-ranking and detailed scoring of top candidates

This hybrid approach gives you the best of both worlds: speed from vector search and intelligence from Claude's reasoning capabilities.