# Multi-Turn PDF Understanding with Claude

This notebook demonstrates how to have extended conversations with Claude about PDF documents. Unlike single-shot interactions, this approach maintains context across multiple exchanges, allowing you to ask follow-up questions, dive deeper into specific sections, and build upon previous answers.


## Setup

We'll start by installing the Anthropic client and setting up the necessary configuration for PDF support.


In [1]:

%pip install anthropic
%pip install python-dotenv


Collecting anthropic
  Downloading anthropic-0.52.2-py3-none-any.whl.metadata (25 kB)
Collecting distro<2,>=1.7.0 (from anthropic)
  Downloading distro-1.9.0-py3-none-any.whl.metadata (6.8 kB)
Collecting jiter<1,>=0.4.0 (from anthropic)
  Downloading jiter-0.10.0-cp311-cp311-macosx_11_0_arm64.whl.metadata (5.2 kB)
Downloading anthropic-0.52.2-py3-none-any.whl (286 kB)
Downloading distro-1.9.0-py3-none-any.whl (20 kB)
Downloading jiter-0.10.0-cp311-cp311-macosx_11_0_arm64.whl (321 kB)
Installing collected packages: jiter, distro, anthropic
Successfully installed anthropic-0.52.2 distro-1.9.0 jiter-0.10.0
Note: you may need to restart the kernel to use updated packages.


In [9]:

from anthropic import Anthropic
import base64
import os
from dotenv import load_dotenv

# Load the environment variables
load_dotenv()

# Get the API key from the environment variables
api_key = os.getenv("ANTHROPIC_API_KEY")

# This header enabled PDF support during its beta; recent API versions may no longer need it
client = Anthropic(
  api_key=api_key,
  default_headers={
    "anthropic-beta": "pdfs-2024-09-25"
  }
)
# This notebook was written against claude-3-5-sonnet-20241022; check the current model list for other PDF-capable models
MODEL_NAME = "claude-3-5-sonnet-20241022"
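
`os.getenv` returns `None` when a variable is unset, so a missing key would only surface later as an authentication error. A minimal fail-fast sketch is shown below; `require_env` is a hypothetical helper introduced here for illustration and is not part of the Anthropic client.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or shell environment."
        )
    return value


# Usage, after load_dotenv() has run:
# api_key = require_env("ANTHROPIC_API_KEY")
```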


## Loading and Encoding the PDF

Next, we'll load a PDF document and convert it to the base64 format required by the Anthropic API. The PDF will be loaded once and reused throughout our multi-turn conversation.


In [10]:
# Load and encode the PDF document
file_name = "../multimodal/documents/constitutional-ai-paper.pdf"

def load_pdf_as_base64(file_path):
    """Load a PDF file and return its base64-encoded string representation."""
    try:
        with open(file_path, "rb") as pdf_file:
            binary_data = pdf_file.read()
            base64_encoded_data = base64.standard_b64encode(binary_data)
            return base64_encoded_data.decode("utf-8")
    except FileNotFoundError:
        print(f"Error: Could not find PDF file at {file_path}")
        return None
    except Exception as e:
        print(f"Error loading PDF: {e}")
        return None

# Load the PDF once for the entire conversation
pdf_base64 = load_pdf_as_base64(file_name)
if pdf_base64:
    print(f"✓ Successfully loaded PDF: {file_name}")
    print(f"✓ PDF size: {len(pdf_base64)} characters (base64 encoded)")
else:
    print("✗ Failed to load PDF. Please check the file path.")


✓ Successfully loaded PDF: ../multimodal/documents/constitutional-ai-paper.pdf
✓ PDF size: 2784148 characters (base64 encoded)
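
The size printed above counts base64 characters, not bytes on disk. Base64 maps every 3 input bytes to 4 output characters, so the encoded payload is roughly a third larger than the file. The sketch below works through that arithmetic; both function names are hypothetical helpers, not part of the notebook.

```python
import base64
import math


def base64_encoded_length(num_bytes):
    """Length of the padded base64 encoding of num_bytes bytes."""
    return 4 * math.ceil(num_bytes / 3)


def max_decoded_length(num_chars):
    """Upper bound on the byte count behind num_chars base64 characters.

    Padding can make the true size up to 2 bytes smaller.
    """
    return (num_chars // 4) * 3


# Cross-check the formula against the standard library on a small payload
sample = b"x" * 10
assert len(base64.standard_b64encode(sample)) == base64_encoded_length(10)

# The PDF above encoded to 2,784,148 characters
print(max_decoded_length(2784148))  # 2088111
```

Comparing the encoded length with the request size limits in the API documentation before sending can catch oversized documents early.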


## Multi-Turn Conversation Functions

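Now we'll create the core functions that enable multi-turn conversations with the PDF. The key insight is that the PDF document only needs to appear in the first message of the conversation. The Messages API is stateless, so we resend the full message history (including that first message) on every turn; follow-up messages can therefore be plain text, and Claude can still refer to the document.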


In [11]:
def get_completion(client, messages, max_tokens=2048):
    """Get a completion from Claude using the provided messages."""
    try:
        response = client.messages.create(
            model=MODEL_NAME,
            max_tokens=max_tokens,
            messages=messages
        )
        return response.content[0].text
    except Exception as e:
        return f"Error: {e}"

def create_initial_message(pdf_base64, user_prompt):
    """Create the first message that includes the PDF document."""
    return {
        "role": "user",
        "content": [
            {
                "type": "document", 
                "source": {
                    "type": "base64", 
                    "media_type": "application/pdf", 
                    "data": pdf_base64
                }
            },
            {
                "type": "text", 
                "text": user_prompt
            }
        ]
    }

def create_followup_message(user_prompt):
    """Create a follow-up message (text only; the PDF stays in the first message of the history)."""
    return {
        "role": "user",
        "content": user_prompt
    }

def print_conversation_separator():
    """Print a visual separator for the conversation."""
    print("\n" + "="*80 + "\n")
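
To make the message layout concrete, the sketch below builds the history as it would look at the start of a second turn and counts the document blocks in it. The base64 string is a short placeholder rather than a real PDF, and `count_document_blocks` is a hypothetical helper added for illustration.

```python
def count_document_blocks(messages):
    """Count content blocks of type "document" across a message list."""
    count = 0
    for message in messages:
        content = message["content"]
        # Follow-up and assistant messages use plain strings; only list
        # content can hold document blocks
        if isinstance(content, list):
            count += sum(1 for block in content if block.get("type") == "document")
    return count


history = [
    {
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {
                    "type": "base64",
                    "media_type": "application/pdf",
                    "data": "JVBERi0=",  # placeholder, not a complete PDF
                },
            },
            {"type": "text", "text": "Summarize this paper."},
        ],
    },
    {"role": "assistant", "content": "(summary text)"},
    {"role": "user", "content": "What methods does it use?"},
]

print(count_document_blocks(history))  # 1
print([m["role"] for m in history])    # ['user', 'assistant', 'user']
```

The whole `history` list is what gets passed as `messages` on the second call, so the single document block travels with every request.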


## Interactive Multi-Turn PDF Chat

This is the main conversation loop that allows you to have an extended dialogue with Claude about the PDF document. The conversation maintains context across all exchanges.


In [None]:
def start_pdf_conversation(pdf_base64):
    """Start an interactive multi-turn conversation about the PDF."""
    if not pdf_base64:
        print("No PDF loaded. Please load a PDF first.")
        return
    
    conversation_history = []
    turn_count = 0
    
    print("Multi-Turn PDF Chat Started!")
    print("PDF document loaded and ready for questions.")
    print("Type 'quit' to end the conversation")
    print("Type 'history' to see conversation summary")
    print("Type 'clear' to start fresh (keeping the PDF)")
    print_conversation_separator()
    
    while True:
        # Get user input
        try:
            user_input = input("User: ").strip()
        except (KeyboardInterrupt, EOFError):
            print("\n\nConversation ended by user.")
            break
        
        # Handle special commands
        if user_input.lower() == "quit":
            print("Conversation ended.")
            break
        elif user_input.lower() == "history":
            print(f"Conversation Summary: {len(conversation_history)} messages, {turn_count} turns")
            if conversation_history:
                print("Recent topics discussed:")
                for i, msg in enumerate(conversation_history[-6:], 1):  # Show last 3 exchanges
                    role = "User" if msg["role"] == "user" else "Assistant"
                    content = msg["content"]
                    if isinstance(content, list):
                        content = content[1]["text"]  # Extract text from complex content
                    preview = content[:100] + "..." if len(content) > 100 else content
                    print(f"  {role}: {preview}")
            continue
        elif user_input.lower() == "clear":
            conversation_history = []
            turn_count = 0
            print("Conversation history cleared. PDF still loaded.")
            continue
        elif not user_input:
            print("Please enter a question or command.")
            continue
        
        # Create the appropriate message
        if turn_count == 0:
            # First turn: include PDF document
            user_message = create_initial_message(pdf_base64, user_input)
        else:
            # Subsequent turns: text only
            user_message = create_followup_message(user_input)
        
        # Add user message to history
        conversation_history.append(user_message)
        
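        # Get response from Claude
        print("Claude is thinking...")
        assistant_response = get_completion(client, conversation_history)

        # get_completion reports failures as an "Error: ..." string; drop the
        # unanswered user message so the error text never enters the history
        if assistant_response.startswith("Error:"):
            print(assistant_response)
            conversation_history.pop()
            print_conversation_separator()
            continue

        # Display response
        print(f"Assistant: {assistant_response}")

        # Add assistant response to history
        conversation_history.append({
            "role": "assistant", 
            "content": assistant_response
        })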
        
        turn_count += 1
        print_conversation_separator()

# Start the conversation if PDF is loaded
if pdf_base64:
    start_pdf_conversation(pdf_base64)
else:
    print("Cannot start conversation: PDF not loaded")
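
Because the history grows with every exchange and is sent in full each time, a long session will eventually get large. One possible mitigation, sketched below under the assumption that older middle turns can be dropped, keeps the first exchange (which carries the PDF) plus the most recent exchanges. `trim_history` and its `keep_last_exchanges` parameter are hypothetical and not part of the notebook.

```python
def trim_history(history, keep_last_exchanges=3):
    """Keep the first exchange (with the PDF) and the most recent exchanges.

    Expects complete user/assistant pairs, so call it after the assistant
    reply has been appended and before the next user message is added.
    """
    if len(history) % 2 != 0:
        raise ValueError("history must contain complete user/assistant exchanges")

    head, tail = history[:2], history[2:]
    keep = 2 * keep_last_exchanges
    if len(tail) <= keep:
        return list(history)
    # Slice from an explicit start index so keep == 0 yields an empty tail
    return head + tail[len(tail) - keep:]


# Example with placeholder messages: five exchanges trimmed to first + last two
messages = []
for n in range(5):
    messages.append({"role": "user", "content": f"question {n}"})
    messages.append({"role": "assistant", "content": f"answer {n}"})

trimmed = trim_history(messages, keep_last_exchanges=2)
print([m["content"] for m in trimmed])
```

Dropping turns means Claude no longer sees them, so this trades recall of earlier discussion for a smaller request.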


## Example: Programmatic Multi-Turn Conversation

Below is an example of how you might structure a programmatic multi-turn conversation without user input, useful for automated analysis or testing.


In [13]:
def demonstrate_programmatic_conversation(pdf_base64):
    """Demonstrate a programmatic multi-turn conversation about the PDF."""
    if not pdf_base64:
        print("No PDF loaded for demonstration.")
        return
    
    # Define a series of questions that build upon each other
    questions = [
        "What is the main topic of this paper? Please provide a brief summary.",
        "What are the key challenges or problems that this work addresses?",
        "Can you explain the main methodology or approach used in more detail?",
        "What were the most significant results or findings?",
        "Based on our discussion, what do you think are the most important implications of this work?"
    ]
    
    conversation_history = []
    
    print("Programmatic Multi-Turn PDF Analysis")
    print("=" * 50)
    
    for i, question in enumerate(questions, 1):
        print(f"\nQuestion {i}: {question}")
        print("-" * 40)
        
        # Create appropriate message (first includes PDF, others are text-only)
        if i == 1:
            user_message = create_initial_message(pdf_base64, question)
        else:
            user_message = create_followup_message(question)
        
        conversation_history.append(user_message)
        
        # Get response
        response = get_completion(client, conversation_history, max_tokens=1024)
        print(f"Claude's Response:\n{response}\n")
        
        # Add response to history
        conversation_history.append({
            "role": "assistant",
            "content": response
        })
    
    print("Programmatic conversation completed!")
    print(f"Total exchanges: {len(questions)}")
    return conversation_history

# Uncomment the line below to run the programmatic demonstration
# programmatic_history = demonstrate_programmatic_conversation(pdf_base64)
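
The returned history still contains the base64 document in its first message, which makes it awkward to print or log. A small sketch for turning it into a readable transcript follows; `format_transcript` is a hypothetical helper that keeps only text blocks.

```python
def format_transcript(history):
    """Render a conversation history as plain text, skipping document blocks."""
    lines = []
    for message in history:
        content = message["content"]
        if isinstance(content, list):
            # Keep the text blocks and leave out the base64 PDF payload
            content = " ".join(
                block["text"] for block in content if block.get("type") == "text"
            )
        speaker = "User" if message["role"] == "user" else "Claude"
        lines.append(f"{speaker}: {content}")
    return "\n\n".join(lines)


# Example with a placeholder document block
example_history = [
    {
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {
                    "type": "base64",
                    "media_type": "application/pdf",
                    "data": "JVBERi0=",  # placeholder, not a complete PDF
                },
            },
            {"type": "text", "text": "What is the main topic?"},
        ],
    },
    {"role": "assistant", "content": "(answer text)"},
]

print(format_transcript(example_history))
```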


## Advanced Features and Tips

### Key Benefits of Multi-Turn PDF Conversations:

1. **Context Preservation**: Each question builds upon previous answers, allowing for deeper exploration
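2. **Efficiency**: The PDF is read and encoded once and lives in a single message of the history, so follow-up turns add only text; note that the full history, PDF included, is still sent with each request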
3. **Natural Flow**: Conversations feel more natural and allow for clarification and follow-up questions
4. **Memory**: Because every request carries the full history, Claude can refer back to anything discussed earlier in the conversation
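
Since each request resends the document, one option for reducing repeated processing is prompt caching, which the Messages API exposes through a `cache_control` field on content blocks. The sketch below is a variant of `create_initial_message`; the name `create_cached_initial_message` is hypothetical, and whether a cache hit actually occurs depends on the model and on the document meeting the minimum cacheable length, so treat this as an assumption to check against the current documentation.

```python
def create_cached_initial_message(pdf_base64, user_prompt):
    """Build the first user message with the PDF block marked as cacheable."""
    return {
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {
                    "type": "base64",
                    "media_type": "application/pdf",
                    "data": pdf_base64,
                },
                # Marks the prompt prefix up to and including this block
                # as a cache breakpoint
                "cache_control": {"type": "ephemeral"},
            },
            {"type": "text", "text": user_prompt},
        ],
    }


# Placeholder data; in the notebook this would be the real pdf_base64 string
message = create_cached_initial_message("JVBERi0=", "Summarize this paper.")
print(message["content"][0]["cache_control"])
```

The rest of the conversation loop would stay the same, since only the shape of the first message changes.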

### Best Practices:

- **Start Broad, Then Narrow**: Begin with general questions, then dive into specific details
- **Reference Previous Answers**: Use phrases like "Based on what you just explained..." or "Following up on your earlier point..."
- **Ask for Clarification**: Don't hesitate to ask Claude to explain concepts in different ways
- **Use the History**: Type `history` in the interactive chat to review what's been covered so far

### Potential Use Cases:

- **Research Paper Analysis**: Deep dive into academic papers with follow-up questions
- **Document Review**: Systematic review of contracts, reports, or policy documents  
- **Educational Content**: Interactive learning sessions with educational materials
- **Technical Documentation**: Step-by-step exploration of complex technical documents
