# Anthropic Extended Thinking with Claude

This notebook demonstrates Claude's extended thinking capabilities - a powerful feature that gives Claude enhanced reasoning for complex tasks.

## What is Extended Thinking?

Extended thinking allows Claude to "think" through complex problems step-by-step before providing a final answer. This is particularly useful for:

- **Mathematical problems** requiring multi-step calculations
- **Complex analysis** of documents or data
- **Architecture decisions** in software development
- **Strategic planning** and decision-making

### How It Works

When extended thinking is enabled, Claude:
1. Creates internal "thinking" content blocks
2. Works through the problem systematically
3. Incorporates insights from this reasoning
4. Delivers a more thoughtful final response

## Table of Contents

1. [Setup](#setup)
2. [Supported Models](#models)
3. [Basic Usage](#basic)
4. [Budget Tokens Parameter](#budget)
5. [Streaming Responses](#streaming)
6. [Tool Use with Extended Thinking](#tools)
7. [Interleaved Thinking (Beta)](#interleaved)
8. [Summarized vs Full Thinking](#summarized)
9. [Best Practices](#best-practices)
10. [Practical Examples](#examples)
11. [Summary](#summary)

<a id='setup'></a>
## 1. Setup

First, let's install the necessary packages and set up our API key.

We'll load the Anthropic API key, create the client, and verify initialization.
Expected output: a single line confirming the client is ready.


In [1]:
import os
import getpass
import anthropic
from IPython.display import Markdown, display
import json
import time
from dotenv import load_dotenv

# Load environment variables from .env file (if it exists)
load_dotenv()

# Set up your API key
api_key = os.getenv('ANTHROPIC_API_KEY')
if not api_key:
    api_key = getpass.getpass("Enter your Anthropic API key: ")

client = anthropic.Anthropic(api_key=api_key)
print("Anthropic client initialized (ready for API calls)")

Anthropic client initialized (ready for API calls)


<a id='models'></a>
## 2. Supported Models

Extended thinking is supported in these Claude models:

| Model | Model ID | Notes |
|-------|----------|-------|
| **Claude Opus 4.5** | `claude-opus-4-5-20251101` | Latest flagship model |
| **Claude Sonnet 4.5** | `claude-sonnet-4-5-20250929` | Enhanced reasoning |
| **Claude Haiku 4.5** | `claude-haiku-4-5-20251001` | Fast model with thinking |
| **Claude Opus 4.1** | `claude-opus-4-1-20250805` | Advanced reasoning |
| **Claude Opus 4** | `claude-opus-4-20250514` | Powerful reasoning |
| **Claude Sonnet 4** | `claude-sonnet-4-20250514` | Balanced performance |

**Important**: Claude 4+ models return **summarized thinking** (you're charged for full internal tokens), while Claude 3.7 returns full thinking output (deprecated).

This table shows which Claude models support extended thinking and the IDs to use.
Pick one model ID and reuse it consistently while you learn.


<a id='basic'></a>
## 3. Basic Usage

Let's start with a simple example to see extended thinking in action.

This first example shows the structure of a response with thinking and final text.
You will see a short thinking summary and then the final answer.


In [2]:
def basic_thinking_example():
    """A simple example demonstrating extended thinking"""
    
    print("Basic extended thinking example")
    print("-" * 60)
    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=10000,
        thinking={
            "type": "enabled",
            "budget_tokens": 5000  # How many tokens Claude can use for thinking
        },
        messages=[{
            "role": "user",
            "content": "What is 27 * 453? Show me how you calculate this step by step."
        }]
    )
    
    # Process the response
    for block in response.content:
        if block.type == "thinking":
            print("Claude's Thinking Process (Summary):")
            print("-" * 50)
            print(block.thinking)
            print("-" * 50)
            print()
        elif block.type == "text":
            print("Final Answer:")
            print(block.text)

# Run the example
basic_thinking_example()

Basic extended thinking example
------------------------------------------------------------
Claude's Thinking Process (Summary):
--------------------------------------------------
I need to calculate 27 * 453 step by step.

I'll use the standard multiplication method, breaking down 27 into 20 + 7.

Method 1: Breaking it down
27 * 453 = (20 + 7) * 453
        = 20 * 453 + 7 * 453
        = 9,060 + 3,171

Let me calculate each part:
- 20 * 453 = 9,060
- 7 * 453 = 3,171

Let me verify 7 * 453:
7 * 400 = 2,800
7 * 50 = 350
7 * 3 = 21
Total: 2,800 + 350 + 21 = 3,171 ✓

Now: 9,060 + 3,171 = 12,231

Let me also show the traditional column method to be thorough.
--------------------------------------------------

Final Answer:
# 27 × 453

I'll solve this using the standard multiplication method:

```
      453
    ×  27
    -----
     3171  (453 × 7)
    9060   (453 × 20)
    -----
   12231
```

## Step-by-step:

**Step 1: Multiply 453 × 7**
- 7 × 3 = 21 (write 1, carry 2)
- 7 × 5 = 35, plus 

<a id='budget'></a>
## 4. Budget Tokens Parameter

The `budget_tokens` parameter determines how many tokens Claude can use for reasoning.

### Key Rules

- **Minimum**: 1,024 tokens
- **`max_tokens` must exceed `budget_tokens`** to allow room for the final response
- Claude may not use the entire budget - it only uses what's needed

### Budget Guidelines

| Budget Range | Use Case |
|--------------|----------|
| 1,024 - 5,000 | Basic reasoning tasks, simple calculations |
| 5,000 - 15,000 | Standard complex problems, code generation |
| 15,000 - 32,000 | Deep analysis, research, architecture planning |
| 32,000+ | Extensive multi-faceted problems (use batch API) |

We'll run the same prompt with different `budget_tokens` values and compare:
- runtime
- output length
- token usage


In [3]:
def budget_comparison():
    """Compare different thinking budgets"""
    
    problem = """Analyze this business scenario:
    A coffee shop has 3 locations. Location A makes $2,500/day, 
    Location B makes $1,800/day, and Location C makes $3,200/day.
    Operating costs are 65% of revenue. They want to open a 4th location.
    What factors should they consider and what's the minimum daily revenue 
    the new location needs to be profitable?"""
    
    budgets = [1024, 5000, 15000]
    
    print("Thinking Budget Comparison (compare time, length, tokens)")
    print("=" * 60)
    
    for budget in budgets:
        print(f"\nBudget: {budget:,} tokens")
        print("-" * 40)
        
        start_time = time.time()
        
        # max_tokens must be greater than budget_tokens
        max_tokens = budget + 4000
        
        response = client.messages.create(
            model="claude-sonnet-4-5-20250929",
            max_tokens=max_tokens,
            thinking={"type": "enabled", "budget_tokens": budget},
            messages=[{"role": "user", "content": problem}]
        )
        
        elapsed_time = time.time() - start_time
        
        # Get response length and usage
        response_text = ""
        for block in response.content:
            if block.type == "text":
                response_text = block.text
        
        print(f"Time: {elapsed_time:.2f} seconds")
        print(f"Response length: {len(response_text)} characters")
        print(f"Tokens used - Input: {response.usage.input_tokens}, Output: {response.usage.output_tokens}")
        print(f"Response preview: {response_text[:200]}...")

budget_comparison()

Thinking Budget Comparison (compare time, length, tokens)

Budget: 1,024 tokens
----------------------------------------
Time: 18.99 seconds
Response length: 1732 characters
Tokens used - Input: 133, Output: 902
Response preview: # Coffee Shop Expansion Analysis

## Current Performance Snapshot

**Daily Revenue & Profit by Location:**
- Location A: $2,500/day → $875 profit (35% margin)
- Location B: $1,800/day → $630 profit (3...

Budget: 5,000 tokens
----------------------------------------
Time: 22.90 seconds
Response length: 1785 characters
Tokens used - Input: 133, Output: 1182
Response preview: # Business Analysis: 4th Coffee Shop Location

## Current Performance Analysis

**Daily Metrics by Location:**
| Location | Revenue | Operating Costs (65%) | Profit (35%) |
|----------|---------|-----...

Budget: 15,000 tokens
----------------------------------------
Time: 19.11 seconds
Response length: 1475 characters
Tokens used - Input: 133, Output: 915
Response preview: # Coffee Shop Ex

<a id='streaming'></a>
## 5. Streaming Responses

For better user experience, especially with longer thinking times, you can stream responses.

**Important**: Streaming is **required** when `max_tokens` exceeds 21,333.

Streaming shows progress as thinking happens. Watch for dots during thinking and
then a full final response when the model switches to text output.


In [4]:
def stream_thinking_example():
    """Demonstrate streaming with extended thinking"""
    
    print("Streaming Extended Thinking Example (dots = thinking)")
    print("=" * 60)
    
    with client.messages.stream(
        model="claude-sonnet-4-5-20250929",
        max_tokens=12000,
        thinking={"type": "enabled", "budget_tokens": 10000},
        messages=[{
            "role": "user",
            "content": """Design a simple REST API for a todo list application. 
            Include endpoints for CRUD operations and consider:
            - Authentication
            - Error handling
            - Data validation
            - Response formats"""
        }],
    ) as stream:
        current_block_type = None
        
        for event in stream:
            if event.type == "content_block_start":
                current_block_type = event.content_block.type
                if current_block_type == "thinking":
                    print("\nClaude is thinking...", end="", flush=True)
                elif current_block_type == "text":
                    print("\n\nFinal Response:\n", end="", flush=True)
            
            elif event.type == "content_block_delta":
                if event.delta.type == "thinking_delta":
                    # Show progress dots for thinking
                    print(".", end="", flush=True)
                elif event.delta.type == "text_delta":
                    print(event.delta.text, end="", flush=True)
            
            elif event.type == "content_block_stop":
                if current_block_type == "thinking":
                    print(" Done thinking!")

stream_thinking_example()

Streaming Extended Thinking Example (dots = thinking)

Claude is thinking.................................... Done thinking!


Final Response:
# Todo List REST API Design

## Base URL
```
https://api.todoapp.com/v1
```

## Authentication

### Register User
```http
POST /auth/register
Content-Type: application/json

{
  "email": "user@example.com",
  "password": "securePassword123",
  "name": "John Doe"
}

Response: 201 Created
{
  "data": {
    "id": "usr_123",
    "email": "user@example.com",
    "name": "John Doe",
    "createdAt": "2024-01-15T10:30:00Z"
  },
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}
```

### Login
```http
POST /auth/login
Content-Type: application/json

{
  "email": "user@example.com",
  "password": "securePassword123"
}

Response: 200 OK
{
  "data": {
    "id": "usr_123",
    "email": "user@example.com",
    "name": "John Doe"
  },
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}
```

### Refresh Token
```http
POST /auth/refresh
Authorization: Be

<a id='tools'></a>
## 6. Tool Use with Extended Thinking

Extended thinking can be combined with tool use for more powerful applications.

### Important Constraints

- Extended thinking only supports `tool_choice: "auto"` or `tool_choice: "none"`
- **Forced tool use is NOT supported** with extended thinking
- When continuing conversations, thinking blocks must be preserved unchanged

This example includes a tool call. Look for the tool input and then the final plan.


In [5]:
def thinking_with_tools_example():
    """Demonstrate extended thinking with tool use"""
    
    # Define a simple calculator tool
    tools = [{
        "name": "calculator",
        "description": "Perform mathematical calculations",
        "input_schema": {
            "type": "object",
            "properties": {
                "expression": {
                    "type": "string",
                    "description": "Mathematical expression to evaluate"
                }
            },
            "required": ["expression"]
        }
    }]
    
    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=20000,
        thinking={
            "type": "enabled",
            "budget_tokens": 8000
        },
        tools=tools,
        tool_choice={"type": "auto"},  # Only auto or none supported
        messages=[{
            "role": "user",
            "content": """I'm planning a party for 25 people. Each person will eat:
            - 3 slices of pizza (8 slices per pizza)
            - 2 sodas ($1.50 each)
            - 1 dessert ($3.00 each)
            
            Pizzas cost $12 each. Calculate the total cost and quantities needed."""
        }]
    )
    
    print("Party Planning with Extended Thinking (tool calls + final plan)")
    print("=" * 60)
    
    for block in response.content:
        if block.type == "thinking":
            print("\nPlanning Process:")
            print(block.thinking[:500] + "...\n")
        elif block.type == "tool_use":
            print(f"\nUsing tool: {block.name}")
            print(f"   Input: {block.input}")
        elif block.type == "text":
            print("\nFinal Plan:")
            display(Markdown(block.text))

thinking_with_tools_example()

Party Planning with Extended Thinking (tool calls + final plan)

Planning Process:
Let me break down this party planning problem:

1. Pizza calculation:
   - 25 people × 3 slices per person = 75 slices needed
   - 8 slices per pizza
   - Number of pizzas needed = 75 ÷ 8 = 9.375, so we need 10 pizzas (round up)
   - Cost: 10 pizzas × $12 = $120

2. Sodas calculation:
   - 25 people × 2 sodas per person = 50 sodas
   - Cost: 50 sodas × $1.50 = $75

3. Desserts calculation:
   - 25 people × 1 dessert per person = 25 desserts
   - Cost: 25 desserts × $3.00 = $75

4. Total cost:
  ...


Final Plan:


I'll help you calculate the quantities and total cost for your party!


Using tool: calculator
   Input: {'expression': '25 * 3'}

Using tool: calculator
   Input: {'expression': '75 / 8'}

Using tool: calculator
   Input: {'expression': '10 * 12'}

Using tool: calculator
   Input: {'expression': '25 * 2'}

Using tool: calculator
   Input: {'expression': '50 * 1.50'}

Using tool: calculator
   Input: {'expression': '25 * 3.00'}

Using tool: calculator
   Input: {'expression': '120 + 75 + 75'}


<a id='interleaved'></a>
## 7. Interleaved Thinking (Beta)

A powerful new feature for **Claude 4 models only** that enables reasoning between tool calls.

### How to Enable

Use the beta header: `anthropic-beta: interleaved-thinking-2025-05-14`

### Benefits

- Claude can think about tool results before deciding next steps
- Enables more sophisticated multi-step workflows
- Better tool orchestration for complex tasks

Interleaved thinking lets Claude reason between multiple tool calls.
This cell is safe to read without running; uncomment only if you have access.


In [6]:
def interleaved_thinking_example():
    """Example of interleaved thinking with multiple tool calls"""
    
    # Client with beta header for interleaved thinking
    client_with_beta = anthropic.Anthropic(
        api_key=api_key,
        default_headers={
            "anthropic-beta": "interleaved-thinking-2025-05-14"
        }
    )
    
    tools = [
        {
            "name": "search_database",
            "description": "Search for information in a database",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "filters": {"type": "object"}
                },
                "required": ["query"]
            }
        },
        {
            "name": "analyze_data",
            "description": "Analyze data and return insights",
            "input_schema": {
                "type": "object",
                "properties": {
                    "data": {"type": "array"},
                    "analysis_type": {"type": "string"}
                },
                "required": ["data", "analysis_type"]
            }
        }
    ]
    
    # max_tokens must be greater than budget_tokens
    budget = 16384
    
    response = client_with_beta.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=budget + 4000,
        tools=tools,
        messages=[{
            "role": "user",
            "content": "Find all customers who made purchases last month and analyze their buying patterns."
        }],
        thinking={
            "type": "enabled",
            "budget_tokens": budget  # Larger budget for complex multi-step reasoning
        }
    )
    
    print("Interleaved Thinking Example (tool reasoning between calls)")
    print("=" * 60)
    print("\nWith interleaved thinking, Claude can:")
    print("1. Think about what data to search for")
    print("2. Call search_database tool")
    print("3. Think about the results")
    print("4. Decide to call analyze_data tool")
    print("5. Think about the analysis")
    print("6. Provide final insights")
    
    for block in response.content:
        if block.type == "thinking":
            print(f"\n[THINKING]: {block.thinking[:200]}...")
        elif block.type == "tool_use":
            print(f"\n[TOOL CALL]: {block.name}")
        elif block.type == "text":
            print(f"\n[RESPONSE]: {block.text[:200]}...")
    
    return response

# Note: Uncomment to run (requires Claude 4 model access)
# interleaved_thinking_example()

<a id='summarized'></a>
## 8. Summarized vs Full Thinking

### Claude 4+ Models (Current)

- Returns **summarized thinking** in the `thinking` block
- You are **billed for full internal thinking tokens**, not just the summary
- Summary provides insight into reasoning without exposing full internal process

### Claude 3.7 (Deprecated)

- Returned **full thinking output**
- What you saw is what you were charged for

### Key Implications

- Thinking blocks from previous turns are **stripped** and don't count toward context window
- Each thinking block includes a **cryptographic signature** for verification
- Don't parse or modify the signature field

This section clarifies billing and what you see in the response. Keep this in mind
when you estimate cost or compare output quality.


In [7]:
def analyze_thinking_blocks():
    """Demonstrate the structure of thinking blocks"""
    
    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=15000,
        thinking={
            "type": "enabled",
            "budget_tokens": 8000
        },
        messages=[{
            "role": "user",
            "content": """I have a list of numbers: [15, 23, 8, 42, 16, 4, 30, 12].
            
            Please:
            1. Find the median
            2. Calculate the mean
            3. Identify any outliers using the IQR method"""
        }]
    )
    
    print("Response Structure Analysis (block types and sizes)")
    print("=" * 60)
    
    for i, block in enumerate(response.content):
        print(f"\nBlock {i + 1}:")
        print(f"  Type: {block.type}")
        
        if block.type == "thinking":
            print(f"  Thinking Summary Length: {len(block.thinking)} characters")
            print(f"  Has Signature: {'Yes' if hasattr(block, 'signature') else 'No'}")
            print("\n  Thinking Content (first 500 chars):")
            print(f"  {block.thinking[:500]}...")
            print("\n  Note: This is a summary. Full thinking tokens were used internally.")
        elif block.type == "text":
            print(f"  Text Length: {len(block.text)} characters")
            print("\n  Final Response:")
            display(Markdown(block.text))

analyze_thinking_blocks()

Response Structure Analysis (block types and sizes)

Block 1:
  Type: thinking
  Thinking Summary Length: 1069 characters
  Has Signature: Yes

  Thinking Content (first 500 chars):
  I need to find the median, mean, and identify outliers using the IQR method for the list: [15, 23, 8, 42, 16, 4, 30, 12]

Let me start by organizing the data.

**Step 1: Sort the data**
[4, 8, 12, 15, 16, 23, 30, 42]

**Step 2: Find the median**
There are 8 numbers (even count), so the median is the average of the 4th and 5th values.
4th value: 15
5th value: 16
Median = (15 + 16) / 2 = 31 / 2 = 15.5

**Step 3: Calculate the mean**
Sum = 4 + 8 + 12 + 15 + 16 + 23 + 30 + 42 = 150
Mean = 150 / 8 = ...

  Note: This is a summary. Full thinking tokens were used internally.

Block 2:
  Type: text
  Text Length: 792 characters

  Final Response:


# Analysis of [15, 23, 8, 42, 16, 4, 30, 12]

## 1. **Median: 15.5**

First, I'll sort the data: [4, 8, 12, 15, 16, 23, 30, 42]

With 8 values (even number), the median is the average of the 4th and 5th values:
- Median = (15 + 16) ÷ 2 = **15.5**

## 2. **Mean: 18.75**

Sum: 4 + 8 + 12 + 15 + 16 + 23 + 30 + 42 = 150

Mean = 150 ÷ 8 = **18.75**

## 3. **Outliers (IQR Method): None**

**Calculating quartiles:**
- Q1 (1st quartile) = median of [4, 8, 12, 15] = 10
- Q3 (3rd quartile) = median of [16, 23, 30, 42] = 26.5
- IQR = Q3 - Q1 = 26.5 - 10 = 16.5

**Outlier boundaries:**
- Lower bound = Q1 - 1.5(IQR) = 10 - 24.75 = **-14.75**
- Upper bound = Q3 + 1.5(IQR) = 26.5 + 24.75 = **51.25**

**Result:** All values fall within [-14.75, 51.25], so there are **no outliers** in this dataset.

<a id='best-practices'></a>
## 9. Best Practices

### Start with Minimal Budget

Begin with 1,024 tokens and increase only as needed for the task.

### Don't Say "Think Step by Step"

Extended thinking handles this automatically - explicit instructions can be counterproductive.

### Use High-Level Instructions

Claude performs better with general guidance rather than step-by-step directives.

### Verify Work with Test Cases

Ask Claude to check its reasoning for better consistency.

### Incompatible Settings

Extended thinking is **NOT compatible** with:
- `temperature` parameter
- `top_k` parameter
- Forced tool use (`tool_choice: {type: "tool", name: "..."}`)

A practical prompting example. Read the prompt structure first, then run the cell
and examine the model's response format.


In [8]:
def prompting_best_practices():
    """Demonstrate effective prompting strategies"""
    
    # Good prompt - clear, specific, structured
    good_prompt = """Analyze the following investment options and recommend the best choice:

Option A: Stock Portfolio
- Expected annual return: 8%
- Risk level: High
- Minimum investment: $10,000
- Liquidity: High (can sell anytime)

Option B: Real Estate
- Expected annual return: 6%
- Risk level: Medium
- Minimum investment: $50,000
- Liquidity: Low (takes months to sell)

Option C: Bonds
- Expected annual return: 4%
- Risk level: Low
- Minimum investment: $5,000
- Liquidity: Medium

Investor Profile:
- Age: 35
- Investment horizon: 15 years
- Risk tolerance: Medium
- Available capital: $75,000
- Goal: Retirement savings

Please provide:
1. Analysis of each option
2. Recommended allocation
3. Justification for your recommendation"""
    
    # max_tokens must be greater than budget_tokens
    budget = 10000
    
    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=budget + 4000,
        thinking={"type": "enabled", "budget_tokens": budget},
        messages=[{"role": "user", "content": good_prompt}]
    )
    
    print("Best Practices Example: Structured Investment Analysis")
    print("=" * 60)
    
    for block in response.content:
        if block.type == "text":
            display(Markdown(block.text))

prompting_best_practices()

Best Practices Example: Structured Investment Analysis


# Investment Analysis & Recommendation

## 1. Analysis of Each Option

### Option A: Stock Portfolio
**Strengths:**
- Highest expected return (8%) - important for long-term retirement growth
- Excellent liquidity provides flexibility for emergencies
- Low minimum investment allows for easy diversification
- 15-year horizon provides time to weather market volatility

**Weaknesses:**
- High risk exceeds your stated medium risk tolerance if used exclusively
- Subject to market volatility and potential significant short-term losses

### Option B: Real Estate
**Strengths:**
- Risk level matches your medium tolerance perfectly
- Solid returns (6%) with inflation hedge potential
- Diversification from traditional securities
- Tangible asset

**Weaknesses:**
- Poor liquidity is problematic for retirement savings (may need access)
- High minimum investment ($50,000) limits diversification opportunities
- Hidden costs: property taxes, maintenance, management fees not mentioned
- Concentration risk - single property vulnerability

### Option C: Bonds
**Strengths:**
- Low risk provides portfolio stability
- Decent liquidity for rebalancing
- Low minimum allows flexibility
- Predictable income stream

**Weaknesses:**
- 4% return may barely outpace inflation (historically ~3%)
- Insufficient growth for 15-year retirement goal
- Interest rate risk not mentioned

## 2. Recommended Allocation

### **Diversified Portfolio: $75,000**

- **Stock Portfolio (Option A): $37,500 (50%)**
- **Bonds (Option C): $22,500 (30%)**
- **Real Estate (Option B): $15,000 (20%)** - OR - **Additional Stocks/Bonds if real estate minimum can't be negotiated**

### **Alternative if Real Estate minimum is firm:**
- **Stock Portfolio: $45,000 (60%)**
- **Bonds: $30,000 (40%)**

**Expected Portfolio Return:** 
- Diversified: ~6.4% annually
- Alternative: ~6.4% annually

## 3. Justification

### Why Diversification?

**Matches Your Risk Profile:**
- Blending high-risk stocks with low-risk bonds creates a medium-risk portfolio
- This "balanced" approach aligns with your stated medium risk tolerance

**Optimizes for 15-Year Horizon:**
- At 35, you can afford some equity exposure for growth
- Bonds provide stability as you approach retirement
- Gradual shift to conservative allocation should occur after year 10

**Addresses Retirement Goals:**
- 6.4% expected return significantly outpaces inflation
- $75,000 at 6.4% for 15 years = ~$189,000
- Pure bonds at 4% = only ~$135,000 (insufficient growth)
- Pure stocks = maximum growth but excessive stress/risk

**Maintains Liquidity:**
- 80% of portfolio (stocks + bonds) remains relatively liquid
- Important for emergencies or opportunities
- Real estate portion (if included) should be limited due to liquidity concerns

### **Key Recommendation:**

**I advise AGAINST putting $50,000+ into real estate** given:
- Over-concentration in single, illiquid asset
- Limits ability to rebalance
- Emergency access concerns

**Instead, prioritize the 60/40 stocks-to-bonds split**, which is a classic moderate allocation for someone your age.

### Action Steps:
1. Start with 60% stocks / 40% bonds
2. Rebalance annually
3. Gradually shift to 50/50 by age 45
4. Consider real estate only if you exceed $150,000 in investments

Would you like me to elaborate on specific stock/bond types or discuss tax considerations?

In [9]:
def cost_calculator():
    """Calculate costs for extended thinking usage"""
    
    # Current pricing (prices per million tokens)
    pricing = {
        "claude-sonnet-4-5": {"input": 3, "output": 15},
        "claude-haiku-4-5": {"input": 1, "output": 5},
        "claude-opus-4-1": {"input": 15, "output": 75},
        "claude-sonnet-4": {"input": 3, "output": 15},
    }
    
    print("Extended Thinking Cost Calculator (approximate costs)")
    print("=" * 60)
    
    # Example scenario
    scenarios = [
        {"name": "Simple Analysis", "input": 500, "thinking": 5000, "output": 1000},
        {"name": "Complex Problem", "input": 2000, "thinking": 20000, "output": 3000},
        {"name": "Deep Research", "input": 5000, "thinking": 50000, "output": 8000}
    ]
    
    # Show example with Sonnet 4.5
    model = "claude-sonnet-4-5"
    prices = pricing[model]
    
    print(f"\nCost Analysis for {model}")
    print("-" * 40)
    
    for scenario in scenarios:
        # Thinking tokens are billed as output tokens
        input_cost = (scenario["input"] / 1_000_000) * prices["input"]
        thinking_cost = (scenario["thinking"] / 1_000_000) * prices["output"]
        output_cost = (scenario["output"] / 1_000_000) * prices["output"]
        total_cost = input_cost + thinking_cost + output_cost
        
        print(f"\n  {scenario['name']}:")
        print(f"    Input tokens: {scenario['input']:,}")
        print(f"    Thinking tokens: {scenario['thinking']:,} (billed as output)")
        print(f"    Output tokens: {scenario['output']:,}")
        print(f"    Total cost: ${total_cost:.4f}")
    
    print("\n\nImportant Pricing Notes:")
    print("-" * 40)
    print("- Thinking tokens are charged at OUTPUT token rates")
    print("- Claude 4+ models: Charged for full internal thinking, not summary")
    print("- Thinking blocks from previous turns don't count toward context window")
    print("- Use prompt caching to reduce costs on repeated patterns")

cost_calculator()

Extended Thinking Cost Calculator (approximate costs)

Cost Analysis for claude-sonnet-4-5
----------------------------------------

  Simple Analysis:
    Input tokens: 500
    Thinking tokens: 5,000 (billed as output)
    Output tokens: 1,000
    Total cost: $0.0915

  Complex Problem:
    Input tokens: 2,000
    Thinking tokens: 20,000 (billed as output)
    Output tokens: 3,000
    Total cost: $0.3510

  Deep Research:
    Input tokens: 5,000
    Thinking tokens: 50,000 (billed as output)
    Output tokens: 8,000
    Total cost: $0.8850


Important Pricing Notes:
----------------------------------------
- Thinking tokens are charged at OUTPUT token rates
- Claude 4+ models: Charged for full internal thinking, not summary
- Thinking blocks from previous turns don't count toward context window
- Use prompt caching to reduce costs on repeated patterns


<a id='examples'></a>
## 10. Practical Examples

Three longer examples (document analysis, architecture planning, and code review).
Use them as reference patterns for real-world tasks.


In [10]:
def document_analysis_example():
    """Analyze a complex document with extended thinking"""
    
    # Simulated legal document excerpt
    document = """PURCHASE AGREEMENT - EXECUTIVE SUMMARY
    
    This Agreement is entered into as of January 15, 2024, between TechCorp Inc. 
    ("Buyer") and DataSystems LLC ("Seller").
    
    TERMS:
    1. Purchase Price: $45,000,000 (Forty-five million dollars)
    2. Payment Structure:
       - Initial Payment: $20,000,000 upon closing
       - Deferred Payment: $15,000,000 payable over 3 years
       - Performance Earnout: Up to $10,000,000 based on revenue targets
    
    3. Conditions Precedent:
       - Regulatory approval from FTC
       - No material adverse change in Seller's business
       - Retention of key employees (minimum 80% for 12 months)
    
    4. Representations and Warranties:
       - Seller warrants all intellectual property is free of encumbrances
       - Financial statements are accurate per GAAP
       - No pending litigation exceeding $500,000
    
    5. Termination Clauses:
       - Either party may terminate if closing doesn't occur by March 31, 2024
       - Buyer may terminate if due diligence reveals material issues
       - Break-up fee: $2,000,000 if Buyer terminates without cause
    """
    
    # max_tokens must be greater than budget_tokens
    budget = 10000
    
    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=budget + 4000,
        thinking={"type": "enabled", "budget_tokens": budget},
        messages=[{
            "role": "user",
            "content": f"""Analyze this purchase agreement and identify:
            
            1. Key risks for the buyer
            2. Key risks for the seller
            3. Potential deal breakers
            4. Areas that need clarification
            5. Recommendations for both parties
            
            Document:
            {document}"""
        }]
    )
    
    print("Complex Document Analysis with Extended Thinking")
    print("=" * 60)
    
    for block in response.content:
        if block.type == "thinking":
            print("\nAnalysis Process (Summary):")
            print(block.thinking[:800] + "...\n")
        elif block.type == "text":
            print("Detailed Analysis:")
            display(Markdown(block.text))

document_analysis_example()

Complex Document Analysis with Extended Thinking

Analysis Process (Summary):
This is a purchase agreement analysis request. I need to carefully review the document and identify risks, deal breakers, unclear areas, and provide recommendations for both parties. Let me analyze this systematically.

**Key Risks for the Buyer:**
1. Material Adverse Change (MAC) clause is vague - what constitutes "material"?
2. Performance earnout of $10M is not clearly defined - what are the specific revenue targets?
3. Key employee retention requirement (80% for 12 months) - this could be difficult to control
4. Deferred payment of $15M over 3 years - no mention of interest rate or security
5. Due diligence period not specified
6. "Material issues" for termination is vague
7. $2M break-up fee even if regulatory approval fails?
8. Litigation threshold of $500K seems low for exclusions
9...

Detailed Analysis:


# PURCHASE AGREEMENT ANALYSIS

## 1. KEY RISKS FOR THE BUYER

### Financial Risks
- **Deferred Payment Exposure**: $15M over 3 years with no mention of security, collateral, or interest rate terms
- **Employee Retention Risk**: 80% key employee retention requirement may be difficult to achieve and measure; "key employees" undefined
- **Earnout Uncertainty**: $10M earnout lacks specific, measurable revenue targets and calculation methodology

### Legal/Structural Risks
- **Vague Termination Rights**: "Material issues" in due diligence not defined - creates litigation risk
- **Break-up Fee Exposure**: $2M penalty even if regulatory approval fails (appears to apply "without cause" only)
- **Limited Litigation Threshold**: $500K threshold for pending litigation warranty seems low for a $45M transaction
- **No Indemnification Framework**: No mention of caps, baskets, escrows, or survival periods for warranty breaches

### Operational Risks
- **MAC Clause Ambiguity**: "Material adverse change" undefined - could lead to disputes
- **Tight Timeline**: 2.5 months to closing with regulatory approval requirement is aggressive

## 2. KEY RISKS FOR THE SELLER

### Financial Risks
- **Payment Concentration Risk**: Only 44% ($20M) received at closing; 56% subject to future conditions
- **Buyer Creditworthiness Unknown**: No financial covenants, guarantees, or credit support for deferred payments
- **Earnout Vulnerability**: $10M (22% of deal value) tied to undefined metrics potentially controlled by buyer post-closing

### Structural Risks
- **Post-Closing Control Loss**: Employee retention and earnout targets dependent on buyer's management decisions
- **Broad IP Warranty**: "All intellectual property is free of encumbrances" could expose unknown third-party claims
- **One-Sided Termination**: Buyer has broad termination rights; seller has limited protection

### Timing Risks
- **Aggressive Closing Deadline**: March 31, 2024 timeline with FTC approval may be unrealistic
- **No Reverse Break-up Fee**: If buyer terminates "with cause" (broadly defined), seller gets nothing

## 3. POTENTIAL DEAL BREAKERS

### Critical Issues
1. **FTC Regulatory Approval** - No certainty; could block deal entirely
2. **Employee Retention Condition** - 80% threshold as closing condition is highly unusual and risky
3. **Undefined Earnout Terms** - $10M at stake with no clear metrics = dispute guaranteed
4. **MAC Clause** - Vague definition could allow buyer walkaway based on normal business fluctuations
5. **Payment Security** - $25M in future payments with no security provisions
6. **March 31 Deadline** - Likely impossible given FTC review timelines (typically 4-6 months)

## 4. AREAS REQUIRING CLARIFICATION

### Critical Definitions Needed
- **"Material adverse change"** - Quantitative thresholds, exclusions (market conditions, regulatory changes)
- **"Material issues"** - Specific examples or dollar thresholds
- **"Key employees"** - Names, titles, and roles; what constitutes "retention"?
- **Revenue targets** - Specific earnout metrics, measurement period, calculation methodology

### Missing Terms
1. **Due Diligence**: Timeline, scope, access rights, and standards
2. **Deferred Payment Terms**: 
   - Payment schedule (monthly/annual?)
   - Interest rate (market rate? imputed interest?)
   - Security/collateral arrangements
   - Acceleration upon default
3. **Earnout Provisions**:
   - Revenue definition (GAAP? Cash basis?)
   - Measurement periods
   - Audit rights
   - Operating covenants during earnout period
4. **Indemnification**:
   - Survival periods
   - Caps and baskets
   - Escrow amount and release schedule
   - Exclusive remedy provisions
5. **Working Capital**: Adjustment mechanisms and settlement
6. **Post-Closing Covenants**: 
   - Non-compete duration and scope
   - Non-solicitation of employees/customers
   - Transition services
7. **Break-up Fee Triggers**: When exactly does $2M apply?
8. **Closing Conditions**: Complete list with specificity
9. **Litigation Definition**: "Pending" vs. threatened? Include regulatory investigations?

## 5. RECOMMENDATIONS

### FOR THE BUYER (TechCorp Inc.)

#### Immediate Actions
1. **Secure Deferred Payments**
   - Require security interest in Seller's assets
   - Demand personal guarantees from LLC members
   - Consider placing funds in escrow with conditional release

2. **Establish Escrow Account**
   - Recommend 15-20% ($3-4M) holdback for 18-24 months
   - Cover representation/warranty breaches and earnout disputes

3. **Define Earnout Precisely**
   ```
   Example: "Revenue targets measured by GAAP revenue 
   for fiscal years 2024-2026:
   - $10M if cumulative revenue ≥ $75M
   - Pro-rated down to $5M minimum if ≥ $60M
   - $0 if < $60M"
   ```

4. **Strengthen Due Diligence Rights**
   - Specify 45-60 day due diligence period
   - Define materiality: issues representing >5% of purchase price or >10% EBITDA impact
   - Full access to data room, employees, contracts, and systems

5. **Restructure Employee Retention**
   - Remove as closing condition (makes deal too risky)
   - Convert to post-closing bonus pool funded by buyer
   - Obtain signed retention agreements pre-closing

6. **Clarify MAC Clause**
   - Add quantitative threshold (e.g., >20% revenue decline)
   - Exclude market-wide conditions, regulatory changes, and buyer-caused changes

#### Negotiation Priorities
- **Reduce break-up fee** to $1M or eliminate if regulatory approval fails
- **Extend closing deadline** to June 30, 2024 (realistic for FTC review)
- **Add specific termination rights** for litigation >$2M, loss of major customers (>15% revenue), or key employee departures
- **Obtain representation insurance** for $5-10M to backstop warranty breaches

### FOR THE SELLER (DataSystems LLC)

#### Immediate Actions
1. **Restructure Payment Terms**
   - Negotiate higher upfront payment ($30M vs. $20M)
   - Reduce earnout exposure (max $5M vs. $10M)
   - Demand market-rate interest on deferred payments (8-10%)

2. **Demand Payment Security**
   - Parent company guarantee from TechCorp
   - Financial statements showing buyer creditworthiness
   - Cross-default provisions
   - Acceleration rights if buyer breaches covenants

3. **Add Reverse Break-up Fee**
   - $2M if buyer terminates for regulatory failure
   - $4M if buyer terminates for non-regulatory reasons
   - Protects 2.5 months of exclusivity and opportunity cost

4. **Cap and Limit Warranties**
   ```
   Recommend:
   - General indemnification cap: 25% of purchase price ($11M)
   - Basket: 1% of purchase price ($450K)
   - Survival: 18 months (24 months for tax/IP)
   - Escrow: $3M released after 18 months
   ```

5. **Protect Earnout Potential**
   - Require buyer to operate business consistent with past practices
   - Seller approval rights for major changes affecting revenue
   - Independent accounting firm to calculate earnout
   - Specify that earnout survives even if employee retention fails

6. **Clarify Employee Retention**
   - Make it covenant, not closing condition
   - Define as "employed and not on notice" as of 12-month anniversary
   - Limit to named list of 5-10 truly key employees
   - Exclude terminations for cause or voluntary resignations

#### Negotiation Priorities
- **Extend closing deadline** to allow realistic timeline without penalty
- **Narrow representations and warranties** - add knowledge qualifiers, limit litigation to ">$500K in potential liability"
- **Demand earnout floor** - minimum $5M guaranteed if revenue ≥ 80% of target
- **Obtain buyer financial commitments** - no debt incurrence >$10M, maintain working capital
- **Include key employee incentives** in transaction documents to ensure retention

---

## OVERALL ASSESSMENT

**Deal Viability**: ⚠️ **MODERATE-TO-HIGH RISK**

This agreement contains significant ambiguities that could lead to litigation. The deal structure places disproportionate risk on the seller (only 44% at closing) while giving the buyer broad termination rights and undefined earnout terms.

### Priority Actions Before Proceeding:
1. ✅ Engage experienced M&A counsel immediately
2. ✅ Define all earnout metrics with specific numbers and calculation examples
3. ✅ Establish escrow and security provisions for all deferred payments
4. ✅ Remove employee retention as closing condition or obtain signed agreements
5. ✅ Extend closing deadline to realistic timeframe (Q2 2024)
6. ✅ Add comprehensive indemnification framework
7. ✅ Both parties should obtain representation & warranty insurance

**Without these clarifications, the likelihood of post-closing disputes is >70%.**

In [11]:
def architecture_planning_example():
    """Use extended thinking for software architecture decisions"""
    
    requirements = """Design a microservices architecture for an e-commerce platform with:
    
    Functional Requirements:
    - User authentication and profiles
    - Product catalog with search
    - Shopping cart and checkout
    - Order management and tracking
    - Payment processing
    
    Non-Functional Requirements:
    - Handle 100,000 concurrent users
    - 99.9% uptime
    - Response time < 200ms for catalog
    - Scalable to 10x current load
    
    Tech Stack Preferences:
    - Cloud-native (AWS/GCP/Azure)
    - Container-based deployment
    - Modern languages (Python/Go/Node.js)
    """
    
    # max_tokens must be greater than budget_tokens
    budget = 15000
    
    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=budget + 5000,
        thinking={"type": "enabled", "budget_tokens": budget},
        messages=[{
            "role": "user",
            "content": f"""Create a detailed microservices architecture plan.
            
            Include:
            1. Service breakdown and responsibilities
            2. Communication patterns (sync/async)
            3. Data storage strategy
            4. Security considerations
            5. Deployment architecture
            6. Scaling strategy
            
            Requirements:
            {requirements}"""
        }]
    )
    
    print("Microservices Architecture Planning (long-form response)")
    print("=" * 60)
    
    for block in response.content:
        if block.type == "text":
            display(Markdown(block.text))

architecture_planning_example()

Microservices Architecture Planning (long-form response)


# E-Commerce Microservices Architecture Plan

## Executive Summary

This architecture supports a scalable e-commerce platform designed for 100K concurrent users with 10x growth capacity, using cloud-native technologies and container orchestration.

---

## 1. Service Breakdown and Responsibilities

### Core Services

```
┌─────────────────────────────────────────────────────────────┐
│                     API Gateway Layer                        │
│                    (Kong/AWS API Gateway)                    │
└─────────────────────────────────────────────────────────────┘
                              │
          ┌───────────────────┼───────────────────┐
          │                   │                   │
          ▼                   ▼                   ▼
┌──────────────────┐  ┌──────────────┐  ┌──────────────────┐
│  Auth Service    │  │   Product    │  │   Cart Service   │
│   (Node.js)      │  │   Service    │  │    (Node.js)     │
│                  │  │   (Go)       │  │                  │
└──────────────────┘  └──────────────┘  └──────────────────┘
          │                   │                   │
          ▼                   ▼                   ▼
┌──────────────────┐  ┌──────────────┐  ┌──────────────────┐
│  Order Service   │  │   Payment    │  │  Notification    │
│   (Python)       │  │   Service    │  │    Service       │
│                  │  │   (Go)       │  │   (Node.js)      │
└──────────────────┘  └──────────────┘  └──────────────────┘
          │                   │                   │
          ▼                   ▼                   ▼
┌──────────────────┐  ┌──────────────┐  ┌──────────────────┐
│  Inventory       │  │   Search     │  │   Analytics      │
│  Service (Go)    │  │   Service    │  │   Service        │
│                  │  │   (Python)   │  │   (Python)       │
└──────────────────┘  └──────────────┘  └──────────────────┘
```

### Service Specifications

#### **1. Authentication Service (Node.js)**
```yaml
Responsibilities:
  - User registration and login
  - JWT token generation/validation
  - OAuth2 integration (Google, Facebook)
  - Session management
  - Password reset/recovery
  - MFA support

Technology:
  - Runtime: Node.js 18 LTS
  - Framework: Express.js
  - Auth: Passport.js, jsonwebtoken
  
Database:
  - Primary: PostgreSQL (user credentials)
  - Cache: Redis (sessions, tokens)

API Endpoints:
  - POST /auth/register
  - POST /auth/login
  - POST /auth/refresh
  - POST /auth/logout
  - GET /auth/validate
  - POST /auth/reset-password

Performance Targets:
  - Login: < 100ms
  - Token validation: < 10ms (cached)
  - Availability: 99.99%
```

#### **2. Product Catalog Service (Go)**
```yaml
Responsibilities:
  - Product CRUD operations
  - Category management
  - Product attributes/variants
  - Pricing management
  - Image management integration
  - Product recommendations

Technology:
  - Runtime: Go 1.21
  - Framework: Gin
  - ORM: GORM

Database:
  - Primary: PostgreSQL (product data)
  - Cache: Redis (frequently accessed products)
  - CDN: CloudFront/CloudFlare (images)
  - Search: Elasticsearch

API Endpoints:
  - GET /products
  - GET /products/{id}
  - POST /products (admin)
  - PUT /products/{id} (admin)
  - GET /categories
  - GET /products/{id}/recommendations

Performance Targets:
  - List products: < 200ms
  - Get single product: < 50ms
  - Cache hit ratio: > 80%
```

#### **3. Search Service (Python)**
```yaml
Responsibilities:
  - Full-text product search
  - Faceted filtering
  - Auto-suggestions
  - Search analytics
  - Indexing pipeline

Technology:
  - Runtime: Python 3.11
  - Framework: FastAPI
  - Search Engine: Elasticsearch 8.x
  - ML: scikit-learn (ranking)

Database:
  - Primary: Elasticsearch
  - Cache: Redis (popular searches)

API Endpoints:
  - GET /search?q={query}
  - GET /search/suggestions?q={query}
  - GET /search/facets

Performance Targets:
  - Search response: < 100ms
  - Auto-suggest: < 50ms
  - Index latency: < 5 seconds
```

#### **4. Shopping Cart Service (Node.js)**
```yaml
Responsibilities:
  - Cart management (add/remove/update)
  - Cart persistence
  - Cart abandonment tracking
  - Price calculation
  - Coupon/discount application

Technology:
  - Runtime: Node.js 18 LTS
  - Framework: Express.js
  
Database:
  - Primary: Redis (active carts)
  - Backup: PostgreSQL (cart history)

API Endpoints:
  - GET /cart/{userId}
  - POST /cart/{userId}/items
  - PUT /cart/{userId}/items/{itemId}
  - DELETE /cart/{userId}/items/{itemId}
  - POST /cart/{userId}/apply-coupon

Performance Targets:
  - Cart operations: < 50ms
  - Cart TTL: 7 days
```

#### **5. Order Service (Python)**
```yaml
Responsibilities:
  - Order creation and management
  - Order state machine
  - Order history
  - Returns/refunds processing
  - Order tracking

Technology:
  - Runtime: Python 3.11
  - Framework: FastAPI
  - Workflow: Temporal.io

Database:
  - Primary: PostgreSQL (orders)
  - Event Store: PostgreSQL (order events)
  - Cache: Redis

API Endpoints:
  - POST /orders
  - GET /orders/{orderId}
  - GET /users/{userId}/orders
  - PUT /orders/{orderId}/status
  - POST /orders/{orderId}/cancel

State Machine:
  PENDING → CONFIRMED → PROCESSING → 
  SHIPPED → DELIVERED → COMPLETED
  
Performance Targets:
  - Order creation: < 500ms
  - Order retrieval: < 100ms
```

#### **6. Payment Service (Go)**
```yaml
Responsibilities:
  - Payment processing
  - Payment gateway integration
  - Transaction management
  - Refund processing
  - PCI compliance
  - Fraud detection

Technology:
  - Runtime: Go 1.21
  - Framework: Gin
  - Payment Gateways: Stripe, PayPal

Database:
  - Primary: PostgreSQL (transactions)
  - Audit: Write-ahead log

API Endpoints:
  - POST /payments/initiate
  - POST /payments/confirm
  - POST /payments/refund
  - GET /payments/{paymentId}

Security:
  - PCI-DSS Level 1 compliance
  - No card storage (tokenization)
  - End-to-end encryption

Performance Targets:
  - Payment processing: < 3 seconds
  - Idempotency: Required
```

#### **7. Inventory Service (Go)**
```yaml
Responsibilities:
  - Stock management
  - Reservation system
  - Stock updates
  - Warehouse management
  - Low stock alerts

Technology:
  - Runtime: Go 1.21
  - Framework: Gin

Database:
  - Primary: PostgreSQL
  - Cache: Redis (stock levels)
  - Queue: RabbitMQ (stock updates)

API Endpoints:
  - GET /inventory/{productId}
  - POST /inventory/reserve
  - POST /inventory/release
  - PUT /inventory/{productId}

Performance Targets:
  - Stock check: < 50ms
  - Reservation: < 100ms
  - Eventual consistency: < 5 seconds
```

#### **8. Notification Service (Node.js)**
```yaml
Responsibilities:
  - Email notifications
  - SMS notifications
  - Push notifications
  - Notification templates
  - Delivery tracking

Technology:
  - Runtime: Node.js 18 LTS
  - Email: SendGrid/AWS SES
  - SMS: Twilio
  - Push: Firebase Cloud Messaging

Database:
  - Primary: MongoDB (notification logs)
  - Queue: AWS SQS/RabbitMQ

Events Handled:
  - Order confirmation
  - Shipping updates
  - Payment receipts
  - Password resets
  - Marketing campaigns

Performance Targets:
  - Processing: < 1 second
  - Delivery rate: > 99%
```

#### **9. Analytics Service (Python)**
```yaml
Responsibilities:
  - Event tracking
  - Business metrics
  - User behavior analysis
  - Reporting
  - A/B testing support

Technology:
  - Runtime: Python 3.11
  - Framework: FastAPI
  - Analytics: Apache Kafka → Flink → ClickHouse
  - Visualization: Grafana

Database:
  - Primary: ClickHouse (time-series)
  - Stream: Kafka

Metrics Tracked:
  - User sessions
  - Product views
  - Cart conversions
  - Search queries
  - Revenue metrics
```

---

## 2. Communication Patterns

### Synchronous Communication (REST/gRPC)

```yaml
User-Facing APIs (REST):
  - Auth Service ←→ API Gateway
  - Product Service ←→ API Gateway
  - Cart Service ←→ API Gateway
  - Order Service ←→ API Gateway
  - Search Service ←→ API Gateway

Internal Service Communication (gRPC):
  - Order Service → Payment Service (payment processing)
  - Order Service → Inventory Service (stock reservation)
  - Cart Service → Product Service (price validation)
  - Product Service → Inventory Service (stock check)

Rationale:
  - Low latency requirements
  - Request-response pattern
  - Strong consistency needed
  - Direct user interaction
```

### Asynchronous Communication (Event-Driven)

```yaml
Message Broker: Apache Kafka + RabbitMQ

Event Flow Architecture:
┌──────────────────────────────────────────────────────────┐
│                     Event Bus (Kafka)                     │
└──────────────────────────────────────────────────────────┘
     ▲              ▲              ▲              ▲
     │              │              │              │
 Publishers:    Orders      Payments    Inventory   Users
     │              │              │              │
     ▼              ▼              ▼              ▼
 Subscribers: Notification Analytics  Search    Warehouse

Event Types:

1. Order Events (Kafka):
   Topics:
     - order.created
     - order.confirmed
     - order.cancelled
     - order.shipped
     - order.delivered
   
   Subscribers:
     - Notification Service (emails)
     - Analytics Service (metrics)
     - Inventory Service (stock updates)

2. Payment Events (Kafka):
   Topics:
     - payment.initiated
     - payment.completed
     - payment.failed
     - payment.refunded
   
   Subscribers:
     - Order Service (order confirmation)
     - Notification Service (receipts)
     - Analytics Service (revenue tracking)

3. Inventory Events (RabbitMQ):
   Topics:
     - inventory.reserved
     - inventory.released
     - inventory.low_stock
   
   Subscribers:
     - Product Service (availability updates)
     - Notification Service (alerts)

4. Product Events (Kafka):
   Topics:
     - product.created
     - product.updated
     - product.deleted
   
   Subscribers:
     - Search Service (index updates)
     - Cache Service (invalidation)
     - Analytics Service (catalog metrics)

Message Format (CloudEvents Standard):
{
  "specversion": "1.0",
  "type": "order.created",
  "source": "order-service",
  "id": "A234-1234-1234",
  "time": "2024-01-15T10:00:00Z",
  "datacontenttype": "application/json",
  "data": {
    "orderId": "ORD-123456",
    "userId": "USR-789",
    "amount": 299.99,
    "items": [...]
  }
}

Reliability Patterns:
  - At-least-once delivery
  - Idempotent consumers
  - Dead letter queues
  - Retry with exponential backoff
  - Event versioning
```

### Communication Decision Matrix

```
┌─────────────────────┬──────────────┬───────────────┐
│ Use Case            │ Pattern      │ Technology    │
├─────────────────────┼──────────────┼───────────────┤
│ User checkout       │ Sync (gRPC)  │ Direct call   │
│ Order notification  │ Async (Pub)  │ Kafka         │
│ Stock check         │ Sync (gRPC)  │ Direct call   │
│ Search indexing     │ Async (Pub)  │ Kafka         │
│ Payment processing  │ Sync (gRPC)  │ Direct call   │
│ Analytics events    │ Async (Fire) │ Kafka         │
│ Email sending       │ Async (Queue)│ RabbitMQ      │
│ Cache invalidation  │ Async (Pub)  │ Redis Pub/Sub │
└─────────────────────┴──────────────┴───────────────┘
```

### Circuit Breaker Pattern

```yaml
Implementation: Istio Service Mesh

Configuration:
  consecutiveErrors: 5
  interval: 30s
  baseEjectionTime: 30s
  maxEjectionPercent: 50%

Example (Order → Payment):
  - If 5 consecutive payment failures
  - Eject payment service for 30s
  - Return cached failure response
  - Fallback: Queue order for retry
```

---

## 3. Data Storage Strategy

### Database Selection Per Service

```yaml
┌─────────────────────┬──────────────────┬─────────────────┐
│ Service             │ Primary DB       │ Cache/Secondary │
├─────────────────────┼──────────────────┼─────────────────┤
│ Auth Service        │ PostgreSQL       │ Redis           │
│ Product Service     │ PostgreSQL       │ Redis + ES      │
│ Cart Service        │ Redis            │ PostgreSQL      │
│ Order Service       │ PostgreSQL       │ Redis           │
│ Payment Service     │ PostgreSQL       │ None            │
│ Inventory Service   │ PostgreSQL       │ Redis           │
│ Notification Service│ MongoDB          │ None            │
│ Search Service      │ Elasticsearch    │ Redis           │
│ Analytics Service   │ ClickHouse       │ None            │
└─────────────────────┴──────────────────┴─────────────────┘
```

### Detailed Storage Architecture

#### **PostgreSQL Clusters**

```yaml
Configuration:
  Version: PostgreSQL 15
  Deployment: Amazon RDS/Cloud SQL
  
Cluster Setup:
  - Master-Replica architecture
  - 1 Primary + 2 Read Replicas per service
  - Automatic failover
  - Connection pooling (PgBouncer)

Services Using PostgreSQL:
  1. Auth Service:
     Tables:
       - users (id, email, password_hash, created_at)
       - user_profiles (user_id, name, address, phone)
       - refresh_tokens (token, user_id, expires_at)
     
     Partitioning: By user_id range
     Indexes: email (unique), created_at
     Backup: Daily + PITR

  2. Product Service:
     Tables:
       - products (id, name, description, price, category_id)
       - categories (id, name, parent_id)
       - product_images (id, product_id, url)
       - product_variants (id, product_id, sku, attributes)
     
     Partitioning: products by category_id
     Indexes: category_id, price, created_at
     Sharding Strategy: Horizontal by category

  3. Order Service:
     Tables:
       - orders (id, user_id, status, total, created_at)
       - order_items (id, order_id, product_id, quantity, price)
       - order_events (id, order_id, event_type, timestamp)
     
     Partitioning: orders by created_at (monthly)
     Indexes: user_id, status, created_at
     Archival: Move orders > 2 years to cold storage

  4. Payment Service:
     Tables:
       - transactions (id, order_id, amount, status, gateway)
       - payment_methods (id, user_id, type, token)
       - refunds (id, transaction_id, amount, reason)
     
     Encryption: Column-level for sensitive data
     Audit: All changes logged
     Compliance: PCI-DSS requirements

Scaling Strategy:
  Read Scaling:
    - Read replicas (up to 5 per cluster)
    - Read-only queries routed to replicas
    - Replica lag monitoring
  
  Write Scaling:
    - Vertical scaling (up to 64 vCPU, 256GB RAM)
    - Horizontal sharding for high-volume tables
    - Write-ahead logging optimization

Performance Tuning:
  - shared_buffers: 25% of RAM
  - effective_cache_size: 75% of RAM
  - max_connections: 500
  - Connection pooling: 100 connections per service
```

#### **Redis Clusters**

```yaml
Configuration:
  Version: Redis 7.x
  Deployment: Amazon ElastiCache/Redis Cloud
  Mode: Cluster mode enabled

Use Cases:
  1. Session Store (Auth Service):
     Data Structure: Hash
     TTL: 24 hours
     Keys: session:{userId}
     Eviction: LRU
     
  2. Product Cache (Product Service):
     Data Structure: Hash + Sorted Sets
     TTL: 1 hour
     Keys: product:{id}, trending:products
     Cache Strategy: Cache-aside
     
  3. Shopping Cart (Cart Service):
     Data Structure: Hash
     TTL: 7 days
     Keys: cart:{userId}
     Persistence: RDB every 5 minutes
     
  4. Rate Limiting:
     Data Structure: String (counter)
     TTL: 1 minute
     Keys: ratelimit:{userId}:{endpoint}
     Algorithm: Token bucket
     
  5. Inventory Cache:
     Data Structure: Hash
     TTL: 30 seconds
     Keys: inventory:{productId}
     Update: Write-through

Cluster Configuration:
  Nodes: 6 (3 master + 3 replica)
  Memory: 32GB per node
  Persistence: AOF + RDB
  Replication: Async
  
Scaling:
  - Add shards for horizontal scaling
  - Increase node size for vertical scaling
  - Target: < 1ms latency for 95th percentile

Monitoring:
  - Memory usage (alert at 80%)
  - Eviction rate
  - Hit rate (target > 90%)
  - Latency percentiles
```

#### **Elasticsearch**

```yaml
Configuration:
  Version: Elasticsearch 8.x
  Deployment: Elastic Cloud/Self-managed on K8s

Use Cases:
  1. Product Search:
     Index: products
     Mapping:
       - name (text, analyzed)
       - description (text, analyzed)
       - category (keyword)
       - price (float)
       - rating (float)
       - tags (keyword array)
     
     Analyzers:
       - Standard analyzer
       - Edge n-gram for autocomplete
       - Synonym filter
     
  2. Search Analytics:
     Index: search_logs
     Retention: 90 days

Cluster Setup:
  Nodes: 
    - 3 Master nodes (4 vCPU, 16GB RAM)
    - 6 Data nodes (8 vCPU, 64GB RAM)
    - 2 Coordinating nodes (4 vCPU, 16GB RAM)
  
  Shards: 6 primary + 1 replica per index
  Heap: 31GB per data node
  
Indexing Pipeline:
  1. Product created/updated → Kafka
  2. Indexer service consumes events
  3. Bulk index to Elasticsearch
  4. Refresh interval: 30s

Performance:
  - Query time: < 100ms (95th percentile)
  - Indexing rate: 10,000 docs/sec
  - Query cache enabled
  - Request cache enabled
```

#### **MongoDB**

```yaml
Configuration:
  Version: MongoDB 6.x
  Deployment: MongoDB Atlas/Self-managed

Use Case: Notification Service
  Collection: notifications
  Schema:
    {
      _id: ObjectId,
      userId: String,
      type: String, // email, sms, push
      channel: String,
      subject: String,
      body: String,
      status: String, // sent, failed, pending
      sentAt: Date,
      metadata: Object
    }

Indexes:
  - userId (for user notification history)
  - status + createdAt (for retry logic)
  - sentAt (for analytics)

Cluster:
  - Replica set: 3 nodes
  - Sharding: By userId (if needed)
  - Oplog size: 10GB

Performance:
  - Write concern: majority
  - Read preference: primaryPreferred
  - TTL index: Auto-delete after 90 days
```

#### **ClickHouse**

```yaml
Configuration:
  Version: ClickHouse 23.x
  Deployment: Self-managed on K8s

Use Case: Analytics Service
  Tables:
    1. events:
       Columns:
         - event_time (DateTime)
         - user_id (UInt64)
         - event_type (String)
         - product_id (Nullable(UInt64))
         - session_id (String)
         - metadata (JSON)
       
       Engine: MergeTree
       Order By: (event_type, event_time)
       Partition By: toYYYYMM(event_time)
       
    2. order_metrics:
       Aggregated materialized view
       Refresh: Real-time
       
Cluster:
  - 4 nodes
  - Replication factor: 2
  - Zookeeper for coordination

Performance:
  - Ingestion: 1M events/sec
  - Query latency: < 1 second
  - Compression: LZ4
  - TTL: 1 year
```

### Data Consistency Strategy

```yaml
Consistency Models:

1. Strong Consistency:
   Services: Payment, Order, Inventory
   Pattern: ACID transactions
   Implementation: PostgreSQL transactions
   
2. Eventual Consistency:
   Services: Product catalog, Search, Analytics
   Pattern: Event sourcing + CQRS
   Implementation: Kafka + eventual sync
   Max delay: 5 seconds
   
3. Session Consistency:
   Services: Cart, User sessions
   Pattern: Sticky sessions + replication
   Implementation: Redis with AOF

Distributed Transaction Pattern (Saga):
  Order Checkout Flow:
    1. Reserve inventory (compensatable)
    2. Process payment (pivot - no compensation)
    3. Create order (retriable)
    4. Release cart (retriable)
    
  Compensation Logic:
    If step 2 fails → Release inventory
    If step 3 fails → Refund payment
```

### Backup and Disaster Recovery

```yaml
PostgreSQL:
  - Automated daily backups
  - Point-in-time recovery (35 days)
  - Cross-region replication
  - RTO: 1 hour
  - RPO: 5 minutes

Redis:
  - RDB snapshots every 5 minutes
  - AOF persistence
  - Backup to S3 daily
  - RTO: 15 minutes
  - RPO: 5 minutes

Elasticsearch:
  - Snapshot to S3 every 6 hours
  - RTO: 2 hours
  - RPO: 6 hours

MongoDB:
  - Continuous backup
  - Point-in-time recovery
  - RTO: 1 hour
  - RPO: 1 minute
```

---

## 4. Security Considerations

### Security Architecture

```
┌───────────────────────────────────────────────────────┐
│                    Perimeter Layer                     │
│  ┌──────────┐  ┌──────────┐  ┌──────────────────┐   │
│  │   WAF    │→ │   DDoS   │→ │  Rate Limiting   │   │
│  │(CloudFlare)│  │Protection│  │  (Kong/Nginx)   │   │
│  └──────────┘  └──────────┘  └──────────────────┘   │
└───────────────────────────────────────────────────────┘
                         ↓
┌───────────────────────────────────────────────────────┐
│                 Authentication Layer                   │
│  ┌──────────────────────────────────────────────┐    │
│  │        API Gateway (OAuth 2.0 / JWT)         │    │
│  └──────────────────────────────────────────────┘    │
└───────────────────────────────────────────────────────┘
                         ↓
┌───────────────────────────────────────────────────────┐
│                 Service Mesh Layer (mTLS)             │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌────────┐ │
│  │Service A│  │Service B│  │Service C│  │Service D│ │
│  └─────────┘  └─────────┘  └─────────┘  └────────┘ │
└───────────────────────────────────────────────────────┘
                         ↓
┌───────────────────────────────────────────────────────┐
│                   Data Layer (Encryption)             │
│  ┌──────────┐  ┌──────────┐  ┌─────────────────┐   │
│  │PostgreSQL│  │  Redis   │  │  Secrets Vault  │   │
│  │(TDE/SSL) │  │ (SSL/TLS)│  │ (HashiCorp Vault)│   │
│  └──────────┘  └──────────┘  └─────────────────┘   │
└───────────────────────────────────────────────────────┘
```

### Authentication & Authorization

```yaml
Authentication Strategy:

1. User Authentication (JWT):
   Flow:
     - User submits credentials → Auth Service
     - Auth Service validates → PostgreSQL
     - Generate JWT (15 min) + Refresh Token (7 days)
     - Store refresh token → Redis
     - Return tokens to client
   
   JWT Payload:
     {
       "sub": "user-id",
       "email": "user@example.com",
       "roles": ["customer"],
       "iat": 1234567890,
       "exp": 1234568790
     }
   
   Token Storage:
     - Access token: Memory (frontend)
     - Refresh token: HttpOnly secure cookie
   
   Validation:
     - API Gateway validates JWT
     - Public key verification (RS256)
     - Token blacklist check (Redis)

2. Service-to-Service (mTLS):
   Implementation: Istio Service Mesh
   
   Certificate Management:
     - Auto-rotation every 24 hours
     - cert-manager for lifecycle
     - Private CA (HashiCorp Vault)
   
   Policy:
     - All internal traffic requires mTLS
     - Service identity verification
     - Encrypted in-transit

3. Admin Access:
   - MFA required (TOTP)
   - Separate admin tokens
   - Additional claims: permissions[]
   - Audit logging mandatory

Authorization Strategy (RBAC):

Roles:
  - customer: Browse, cart, checkout
  - premium_customer: Above + early access, discounts
  - support: Read orders, update status
  - admin: Full access
  
Implementation:
  - Role stored in JWT
  - Policy enforcement at API Gateway
  - Fine-grained ACL at service level
  
Example Policy (OPA - Open Policy Agent):
  package authz
  
  default allow = false
  
  allow {
    input.method == "GET"
    input.path[0] == "products"
  }
  
  allow {
    input.method == "POST"
    input.path[0] == "orders"
    input.token.roles[_] == "customer"
  }
  
  allow {
    input.path[0] == "admin"
    input.token.roles[_] == "admin"
  }
```

### API Security

```yaml
1. Rate Limiting:
   Implementation: Redis + Kong
   
   Tiers:
     Anonymous: 100 req/min
     Authenticated: 1000 req/min
     Premium: 5000 req/min
   
   Response:
     HTTP 429 Too Many Requests
     Headers:
       X-RateLimit-Limit: 1000
       X-RateLimit-Remaining: 500
       X-RateLimit-Reset: 1640000000

2. Input Validation:
   - Schema validation (JSON Schema)
   - Request size limits (10MB)
   - Sanitization (XSS prevention)
   - SQL injection protection (parameterized queries)
   
   Example (FastAPI):
     from pydantic import BaseModel, validator
     
     class OrderCreate(BaseModel):
       items: List[OrderItem]
       
       @validator('items')
       def validate_items(cls, v):
         if len(v) > 100:
           raise ValueError('Max 100 items')
         return v

3. CORS Policy:
   Allowed Origins:
     - https://www.yourshop.com
     - https://admin.yourshop.com
   
   Allowed Methods: GET, POST, PUT, DELETE
   Allowed Headers: Authorization, Content-Type
   Max Age: 3600

4. HTTPS Enforcement:
   - TLS 1.3 only
   - HSTS headers
   - Certificate pinning (mobile apps)

5. API Versioning:
   Strategy: URL path versioning
   Example: /v1/products, /v2/products
   Deprecation: 6 months notice
```

### Data Security

```yaml
1. Encryption at Rest:
   PostgreSQL:
     - Transparent Data Encryption (TDE)
     - AWS RDS encryption enabled
     - KMS managed keys
   
   Redis:
     - Encrypted EBS volumes
   
   S3 (Product Images):
     - Server-side encryption (SSE-S3)
     - Bucket policies for access control

2. Encryption in Transit:
   - TLS 1.3 for all external traffic
   - mTLS for service-to-service
   - VPN for admin access

3. Sensitive Data Handling:
   Payment Card Data:
     - Never stored (use tokens)
     - Stripe/PayPal handles actual data
     - PCI-DSS compliance
   
   Personal Information (PII):
     - Column-level encryption
     - Access logging
     - GDPR compliance (right to deletion)
   
   Passwords:
     - bcrypt hashing (cost factor: 12)
     - Never logged
     - Password policies enforced

4. Secrets Management:
   Tool: HashiCorp Vault
   
   Stored Secrets:
     - Database credentials
     - API keys (payment gateways)
     - Encryption keys
     - Service certificates
   
   Access:
     - Dynamic secrets (short-lived)
     - Audit trail
     - Auto-rotation

5. Data Masking:
   Environments:
     Production: Full data
     Staging: Masked PII
     Development: Synthetic data
   
   Masked Fields:
     - Email: u***@example.com
     - Phone: ***-***-1234
     - Credit Card: ****-****-****-1234
```

### Network Security

```yaml
1. Network Segmentation:
   ┌─────────────────────────────────────────┐
   │         Public Subnet (DMZ)             │
   │  ┌────────────┐  ┌─────────────────┐  │
   │  │Load Balancer│  │  API Gateway   │  │
   │  └────────────┘  └─────────────────┘  │
   └─────────────────────────────────────────┘
                      ↓
   ┌─────────────────────────────────────────┐
   │       Private Subnet (Services)         │
   │  ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐  │
   │  │ Svc1 │ │ Svc2 │ │ Svc3 │ │ Svc4 │  │
   │  └──────┘ └──────┘ └──────┘ └──────┘  │
   └─────────────────────────────────────────┘
                      ↓
   ┌─────────────────────────────────────────┐
   │        Data Subnet (Databases)          │
   │  ┌──────┐ ┌──────┐ ┌──────────────┐   │
   │  │ PG   │ │Redis │ │ Elasticsearch│   │
   │  └──────┘ └──────┘ └──────────────┘   │
   └─────────────────────────────────────────┘

2. Firewall Rules:
   - Ingress: Only HTTPS (443) allowed
   - Egress: Restricted to required services
   - Inter-service: Only defined ports
   
   Example (AWS Security Groups):
     Load Balancer SG:
       Inbound: 0.0.0.0/0:443
       Outbound: Service SG:8080
     
     Service SG:
       Inbound: LB SG:8080
       Outbound: Database SG:5432
     
     Database SG:
       Inbound: Service SG:5432
       Outbound: None

3. DDoS Protection:
   - CloudFlare/AWS Shield
   - Rate limiting
   - IP reputation filtering

4. VPN Access:
   - Admin access via VPN only
   - MFA required
   - Session recording
```

### Security Monitoring

```yaml
1. Logging:
   What to Log:
     - All authentication attempts
     - API calls (request/response)
     - Database queries (slow queries)
     - Security events (failed auth, rate limits)
     - Configuration changes
   
   Log Format (JSON):
     {
       "timestamp": "2024-01-15T10:00:00Z",
       "service": "auth-service",
       "level": "INFO",
       "event": "login_success",
       "user_id": "USR-123",
       "ip": "1.2.3.4",
       "trace_id": "abc-123"
     }
   
   Storage:
     - Centralized logging (ELK Stack)
     - Retention: 90 days
     - Immutable logs

2. Monitoring & Alerting:
   Tools: Prometheus + Grafana + AlertManager
   
   Security Metrics:
     - Failed login attempts (alert > 100/min)
     - Rate limit hits (alert > 1000/min)
     - Certificate expiry (alert < 30 days)
     - Unauthorized access attempts
     - Anomalous traffic patterns
   
   SIEM Integration:
     - Forward logs to Splunk/Elastic Security
     - Correlation rules for attack detection
     - Automated incident response

3. Vulnerability Management:
   - Container scanning (Trivy/Snyk)
   - Dependency scanning (Dependabot)
   - SAST: SonarQube
   - DAST: OWASP ZAP
   - Penetration testing (quarterly)

4. Compliance:
   - PCI-DSS (payment processing)
   - GDPR (data privacy)
   - SOC 2 Type II (security controls)
   - Regular audits
```

---

## 5. Deployment Architecture

### Cloud Infrastructure (AWS Example)

```yaml
Region Strategy:
  Primary Region: us-east-1
  Secondary Region: eu-west-1 (DR)
  CDN: CloudFront (global)

Multi-AZ Deployment:
┌─────────────────────────────────────────────────────────┐
│                        Region                            │
│  ┌──────────────────┐           ┌──────────────────┐   │
│  │   AZ-1 (Primary) │           │   AZ-2 (Standby) │   │
│  │                  │           │                  │   │
│  │  ┌────────────┐  │           │  ┌────────────┐  │   │
│  │  │ EKS Nodes  │  │◄─────────►│  │ EKS Nodes  │  │   │
│  │  │ (3 nodes)  │  │           │  │ (3 nodes)  │  │   │
│  │  └────────────┘  │           │  └────────────┘  │   │
│  │                  │           │                  │   │
│  │  ┌────────────┐  │           │  ┌────────────┐  │   │
│  │  │ RDS Primary│  │──Repl───► │  │RDS Replica │  │   │
│  │  └────────────┘  │           │  └────────────┘  │   │
│  │                  │           │                  │   │
│  │  ┌────────────┐  │           │  ┌────────────┐  │   │
│  │  │  Redis     │  │◄─────────►│  │  Redis     │  │   │
│  │  │  Primary   │  │           │  │  Replica   │  │   │
│  │  └────────────┘  │           │  └────────────┘  │   │
│  └──────────────────┘           └──────────────────┘   │
│                                                         │
│  ┌─────────────────────────────────────────────────┐   │
│  │  Shared Services (Multi-AZ)                     │   │
│  │  - ALB (Load Balancer)                          │   │
│  │  - NAT Gateway                                  │   │
│  │  - ElastiCache                                  │   │
│  └─────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────┘
```

### Kubernetes Architecture

```yaml
Cluster Configuration:
  Platform: Amazon EKS / Google GKE
  Version: Kubernetes 1.28
  
  Node Pools:
    1. System Pool:
       - 3 nodes (t3.large)
       - Taints: CriticalAddonsOnly
       - Workloads: CoreDNS, Metrics Server
    
    2. Application Pool:
       - 6-50 nodes (auto-scaling)
       - Instance type: c6i.2xlarge (8 vCPU, 16GB)
       - Workloads: Microservices
    
    3. Data Pool:
       - 3 nodes (r6i.2xlarge - memory optimized)
       - Workloads: Redis, Kafka
    
    4. Compute Pool (GPU):
       - 0-5 nodes (g4dn.xlarge)
       - Workloads: ML-based recommendations

  Namespaces:
    - production
    - staging
    - monitoring
    - ingress

Service Deployment Example (Product Service):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: product-service
  namespace: production
  labels:
    app: product-service
    version: v1
spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2
      maxUnavailable: 1
  selector:
    matchLabels:
      app: product-service
  template:
    metadata:
      labels:
        app: product-service
        version: v1
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9090"
    spec:
      serviceAccountName: product-service
      
      # Pod anti-affinity for HA
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - product-service
            topologyKey: kubernetes.io/hostname
      
      containers:
      - name: product-service
        image: ecr.io/product-service:v1.2.3
        imagePullPolicy: IfNotPresent
        
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        - containerPort: 9090
          name: metrics
          protocol: TCP
        
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: product-db-secret
              key: connection-string
        - name: REDIS_URL
          valueFrom:
            configMapKeyRef:
              name: product-config
              key: redis-url
        - name: LOG_LEVEL
          value: "info"
        
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            cpu: 2000m
            memory: 2Gi
        
        livenessProbe:
          httpGet:
            path: /health/live
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 2
        
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 15"]
        
        securityContext:
          runAsNonRoot: true
          runAsUser: 1000
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL

---
apiVersion: v1
kind: Service
metadata:
  name: product-service
  namespace: production
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: ClusterIP
  selector:
    app: product-service
  ports:
  - port: 8080
    targetPort: 8080
    protocol: TCP
    name: http
  sessionAffinity: None

---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: product-service-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: product-service
  minReplicas: 6
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "1000"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 30

---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: product-service-pdb
  namespace: production
spec:
  minAvailable: 4
  selector:
    matchLabels:
      app: product-service
```

### Istio Service Mesh

```yaml
Installation:
  Version: Istio 1.20
  Profile: production
  
Components:
  - Istiod (control plane)
  - Ingress Gateway
  - Egress Gateway

Configuration:

# Virtual Service for traffic routing
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: product-service
  namespace: production
spec:
  hosts:
  - product-service
  http:
  - match:
    - headers:
        version:
          exact: canary
    route:
    - destination:
        host: product-service
        subset: v2
      weight: 10
    - destination:
        host: product-service
        subset: v1
      weight: 90
  - route:
    - destination:
        host: product-service
        subset: v1

---
# Destination Rule for load balancing
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: product-service
  namespace: production
spec:
  host: product-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 50
        http2MaxRequests: 100
        maxRequestsPerConnection: 2
    loadBalancer:
      simple: LEAST_REQUEST
    outlierDetection:
      consecutiveErrors: 5
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2

---
# Peer Authentication (mTLS)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT

---
# Authorization Policy
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: product-service-authz
  namespace: production
spec:
  selector:
    matchLabels:
      app: product-service
  rules:
  - from:
    - source:
        principals:
        - cluster.local/ns/production/sa/api-gateway
        - cluster.local/ns/production/sa/order-service
    to:
    - operation:
        methods: ["GET", "POST"]
```

### CI/CD Pipeline

```yaml
Tool: GitHub Actions + ArgoCD

Pipeline Stages:

1. Build Stage:
   name: Build
   
   steps:
   - name: Checkout code
     uses: actions/checkout@v3
   
   - name: Run tests
     run: |
       npm test
       npm run test:integration
   
   - name: Security scan
     run: |
       npm audit
       trivy image --severity HIGH,CRITICAL
   
   - name: Build Docker image
     run: |
       docker build -t product-service:${{ github.sha }} .
   
   - name: Push to ECR
     run: |
       docker push ecr.io/product-service:${{ github.sha }}

2. Deploy Stage (GitOps):
   - ArgoCD watches Git repo
   - Detects Kubernetes manifest changes
   - Auto-syncs to cluster
   - Health checks before marking success
   
   ArgoCD Application:
   apiVersion: argoproj.io/v1alpha1
   kind: Application
   metadata:
     name: product-service
     namespace: argocd
   spec:
     project: production
     source:
       repoURL: https://github.com/company/k8s-manifests
       targetRevision: main
       path: product-service
     destination:
       server: https://kubernetes.default.svc
       namespace: production
     syncPolicy:
       automated:
         prune: true
         selfHeal: true
       syncOptions:
       - CreateNamespace=true
     healthCheck:
       interval: 10s
       timeout: 5m

3. Deployment Strategy:
   Blue-Green Deployment:
     - Deploy new version (green)
     - Run smoke tests
     - Switch traffic gradually
     - Keep blue for quick rollback
   
   Canary Deployment:
     - Deploy to 10% of pods
     - Monitor error rates
     - Gradually increase to 100%
     - Auto-rollback on errors

Rollback Strategy:
  - Automated: Error rate > 5%
  - Manual: kubectl rollout undo
  - ArgoCD: Revert Git commit
```

### Observability

```yaml
Monitoring Stack:

1. Metrics: Prometheus + Grafana
   
   Prometheus Scrape Config:
   - job_name: 'kubernetes-pods'
     kubernetes_sd_configs:
     - role: pod
     relabel_configs:
     - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
       action: keep
       regex: true
   
   Key Metrics:
     - http_request_duration_seconds (histogram)
     - http_requests_total (counter)
     - database_query_duration_seconds (histogram)
     - cache_hit_rate (gauge)
     - active_connections (gauge)
   
   Grafana Dashboards:
     - Service overview
     - Database performance
     - Cache performance
     - Business metrics (orders, revenue)

2. Logging: ELK Stack (Elasticsearch, Logstash, Kibana)
   
   Log Collection:
     - Fluent Bit (DaemonSet) collects logs
     - Forwards to Elasticsearch
     - Structured JSON logging
   
   Log Aggregation:
     - Centralized in Elasticsearch
     - Indexed by service, level, timestamp
     - Retention: 30 days hot, 90 days warm

3. Tracing: Jaeger
   
   Instrumentation: OpenTelemetry
   
   Trace Flow:
     API Gateway → Product Service → Database
     ↓
     Spans collected → Jaeger Collector → Elasticsearch
   
   Use Cases:
     - Request flow visualization
     - Performance bottleneck identification
     - Error debugging

4. Alerting: AlertManager + PagerDuty
   
   Alert Rules:
     - High error rate (> 1%)
     - High latency (p95 > 500ms)
     - Pod restarts (> 3 in 5 min)
     - Database connection errors
     - Cache miss rate (> 30%)
   
   Notification Channels:
     - PagerDuty (critical)
     - Slack (warning)
     - Email (info)
```

---

## 6. Scaling Strategy

### Horizontal Scaling

```yaml
Auto-Scaling Configuration:

1. Application Layer (HPA):
   Metrics:
     - CPU utilization (target: 70%)
     - Memory utilization (target: 80%)
     - Custom metrics (requests/sec: 1000)
   
   Scaling Behavior:
     Scale Up:
       - Trigger: Metrics exceed target for 1 min
       - Rate: Add 100% of current replicas (max)
       - Cooldown: 1 minute
     
     Scale Down:
       - Trigger: Metrics below target for 5 min
       - Rate: Remove 50% of excess replicas
       - Cooldown: 5 minutes
   
   Per-Service Configuration:
     Product Service:
       Min: 6 replicas
       Max: 50 replicas
       Target CPU: 70%
     
     Order Service:
       Min: 4 replicas
       Max: 30 replicas
       Target CPU: 60%
     
     Cart Service:
       Min: 4 replicas
       Max: 40 replicas
       Target: 1000 req/sec per pod

2. Cluster Auto-Scaling (CA):
   
   AWS Cluster Autoscaler:
     Min Nodes: 6
     Max Nodes: 50
     
     Scaling Triggers:
       - Pending pods (can't be scheduled)
       - Node utilization < 50% for 10 min (scale down)
     
     Node Selection:
       - Spot instances for non-critical workloads (70%)
       - On-demand for critical services (30%)
   
   Over-Provisioning:
     - Maintain 10% extra capacity
     - Placeholder pods for faster scaling

3. Database Scaling:
   
   PostgreSQL:
     Vertical Scaling:
       - Current: db.r6i.2xlarge (8 vCPU, 64GB)
       - Max: db.r6i.16xlarge (64 vCPU, 512GB)
       - Trigger: CPU > 80% or connections > 80%
     
     Horizontal Scaling (Read):
       - Add read replicas (max 5)
       - Connection pooling (PgBouncer)
       - Read-write split in application
     
     Sharding (if needed):
       - Shard key: user_id hash
       - Partition: By user_id % 4
       - Tool: Citus Data
   
   Redis:
     Cluster Mode:
       - Current: 6 shards (3 master + 3 replica)
       - Max: 90 shards
       - Auto-rebalancing enabled
     
     Scaling:
       - Add shards when memory > 80%
       - Vertical scaling: Up to 317GB per node

4. Message Queue Scaling:
   
   Kafka:
     Partitioning:
       - order.created: 20 partitions
       - payment.completed: 10 partitions
       - product.updated: 15 partitions
     
     Brokers:
       - Current: 6 brokers
       - Max: 30 brokers
       - Replication factor: 3
     
     Consumer Groups:
       - Scale consumers = partition count
       - Auto-scaling based on lag
```

### Load Distribution

```yaml
Load Balancing Strategy:

1. External Load Balancer:
   Type: Application Load Balancer (ALB)
   
   Configuration:
     - Algorithm: Least outstanding requests
     - Health checks: /health endpoint
     - SSL termination
     - Connection draining: 30s
     - Idle timeout: 60s
   
   Rules:
     - /api/products → Product Service
     - /api/cart → Cart Service
     - /api/orders → Order Service
     - /api/auth → Auth Service

2. Service Mesh (Istio):
   Load Balancing:
     - Algorithm: LEAST_REQUEST
     - Locality-aware routing
     - Zone-aware failover
   
   Traffic Split:
     - Geographic routing
     - A/B testing support
     - Canary deployments

3. Database Load Balancing:
   PgBouncer Configuration:
     - Pool mode: Transaction
     - Max connections: 500
     - Default pool size: 100
     - Reserve pool: 10
   
   Read-Write Split:
     Application logic:
       - Writes → Primary
       - Reads → Random replica
       - Consistency requirements → Primary

4. CDN (CloudFront):
   Cached Content:
     - Product images
     - Static assets (JS, CSS)
     - API responses (with short TTL)
   
   Configuration:
     - Origin: S3 + ALB
     - Edge locations: Global
     - Cache TTL: 1 hour (images), 5 min (API)
     - Compression: Gzip, Brotli
```

### Caching Strategy

```yaml
Multi-Layer Caching:

Layer 1 - CDN (CloudFront):
  Content:
    - Product images
    - Static assets
  TTL: 24 hours
  Hit rate target: > 95%

Layer 2 - API Gateway Cache:
  Content:
    - Product list responses
    - Category data
  TTL: 5 minutes
  Hit rate target: > 70%

Layer 3 - Application Cache (Redis):
  Cache Patterns:
    
    1. Cache-Aside (Product Service):
       def get_product(product_id):
         # Check cache
         cached = redis.get(f"product:{product_id}")
         if cached:
           return cached
         
         # Cache miss - fetch from DB
         product = db.query(Product).get(product_id)
         
         # Store in cache
         redis.setex(
           f"product:{product_id}",
           3600,  # 1 hour TTL
           json.dumps(product)
         )
         return product
    
    2. Write-Through (Inventory Service):
       def update_inventory(product_id, quantity):
         # Update database
         db.execute(
           "UPDATE inventory SET quantity = ? WHERE product_id = ?",
           quantity, product_id
         )
         
         # Update cache
         redis.set(
           f"inventory:{product_id}",
           quantity,
           ex=1800  # 30 min TTL
         )
    
    3. Write-Behind (Analytics):
       # Buffer events in Redis
       redis.lpush("events", event_data)
       
       # Background worker flushes to ClickHouse
       # Every 10 seconds or 1000 events

  Cache Invalidation:
    Strategies:
      - TTL-based (default)
      - Event-based (pub/sub)
      - Manual purge (admin API)
    
    Example (Product Update):
      1. Product updated in DB
      2. Publish event: product.updated
      3. Cache service subscribes
      4. Delete keys: product:{id}, product:list:*
      5. Next request repopulates cache

  Cache Warming:
    - Pre-populate popular products on deploy
    - Scheduled jobs for trending items
    - Predictive caching based on patterns

Layer 4 - Database Cache:
  PostgreSQL:
    - Query cache (enabled)
    - Buffer pool: 25% of RAM
    - Prepared statement cache
  
  Elasticsearch:
    - Node query cache
    - Shard request cache
    - Field data cache
```

### Performance Optimization

```yaml
1. Database Optimization:
   
   Indexing Strategy:
     Products Table:
       - Primary: id (clustered)
       - Secondary: category_id, price, created_at
       - Composite: (category_id, price)
       - Full-text: name, description
     
     Orders Table:
       - Primary: id
       - Secondary: user_id, status, created_at
       - Composite: (user_id, created_at DESC)
   
   Query Optimization:
     - Use EXPLAIN ANALYZE
     - Avoid N+1 queries (use joins/eager loading)
     - Pagination: Cursor-based (not offset)
     - Materialized views for complex aggregations
   
   Connection Pooling:
     - PgBouncer in transaction mode
     - Pool size: 100 per service
     - Max client connections: 500

2. API Optimization:
   
   Response Compression:
     - Gzip for responses > 1KB
     - Brotli for static assets
   
   Pagination:
     - Default page size: 20
     - Max page size: 100
     - Cursor-based for large datasets
   
   Field Selection:
     - GraphQL for flexible queries
     - REST with field parameter
     - Example: /products?fields=id,name,price
   
   Response Caching:
     - ETag support
     - Cache-Control headers
     - Conditional requests (If-None-Match)

3. Async Processing:
   
   Background Jobs:
     - Order confirmation emails
     - Analytics processing
     - Report generation
     - Image processing
   
   Tool: Celery (Python) / Bull (Node.js)
   
   Configuration:
     Workers: 10 per service
     Queue: Redis
     Retry: 3 attempts with exponential backoff

4. Code-Level Optimization:
   
   - Use compiled languages for CPU-intensive tasks (Go)
   - Async I/O for I/O-bound operations (Node.js)
   - Connection reuse (keep-alive)
   - Batch operations where possible
   - Lazy loading of resources
```

### Capacity Planning

```yaml
Current Capacity (100K concurrent users):

Service Instances:
  - Product Service: 6 pods (can handle 50K RPS)
  - Cart Service: 4 pods (can handle 30K RPS)
  - Order Service: 4 pods (can handle 10K orders/min)
  - Auth Service: 4 pods (can handle 20K auth/min)

Database:
  - PostgreSQL: db.r6i.2xlarge (max 500 connections)
  - Redis: 32GB cluster (6 shards)
  - Elasticsearch: 6 data nodes (500GB total)

Growth Plan (10x to 1M concurrent users):

Phase 1 (100K → 300K):
  Timeline: Months 1-6
  
  Services:
    - Product: 6 → 15 pods
    - Cart: 4 → 12 pods
    - Order: 4 → 10 pods
  
  Database:
    - PostgreSQL: Add 3 more read replicas
    - Redis: Add 6 more shards (12 total)
    - Elasticsearch: Add 6 data nodes (12 total)
  
  Infrastructure:
    - K8s nodes: 12 → 30
    - Cost increase: +200%

Phase 2 (300K → 600K):
  Timeline: Months 7-12
  
  Services:
    - Product: 15 → 30 pods
    - Implement service sharding if needed
  
  Database:
    - PostgreSQL: Upgrade to db.r6i.4xlarge
    - Implement read-write split at app level
    - Consider Citus for horizontal sharding
  
  Infrastructure:
    - K8s nodes: 30 → 50
    - Multi-region deployment (active-passive)

Phase 3 (600K → 1M):
  Timeline: Months 13-18
  
  Services:
    - Full auto-scaling up to max limits
    - Geographic distribution (multi-region active)
  
  Database:
    - PostgreSQL: Full sharding implementation
    - Redis: 30+ shards
    - Elasticsearch: 20+ data nodes
  
  Infrastructure:
    - Multi-region active-active
    - Global load balancing
    - K8s nodes: 100+ across regions

Cost Estimation:
  Current (100K users): $15K/month
  300K users: $45K/month
  600K users: $90K/month
  1M users: $150K/month
```

---

## Summary & Key Metrics

### Architecture Highlights

```yaml
Service Count: 9 core microservices
Communication: REST + gRPC + Event-driven (Kafka)
Database Strategy: Polyglot persistence
Deployment: Kubernetes on AWS/GCP
Service Mesh: Istio for mTLS and traffic management
Observability: Prometheus + Grafana + Jaeger + ELK
```

### Performance Targets

```yaml
Availability: 99.9% (43 minutes downtime/month)
Latency:
  - Product catalog: < 200ms (p95)
  - Search: < 100ms (p95)
  - Checkout: < 500ms (p95)
  - Authentication: < 100ms (p95)

Throughput:
  - Concurrent users: 100,000
  - Requests/sec: 50,000
  - Orders/day: 500,000

Scalability:
  - Horizontal: 10x current capacity
  - Vertical: Per-service optimization
  - Geographic: Multi-region support
```

### Security Posture

```yaml
Authentication: JWT + OAuth2
Authorization: RBAC with OPA
Encryption: TLS 1.3 (transit), AES-256 (rest)
Compliance: PCI-DSS, GDPR, SOC 2
Secrets: HashiCorp Vault
Network: Zero-trust with mTLS
```

### Cost Optimization

```yaml
Strategies:
  - Spot instances for non-critical workloads (60% savings)
  - Auto-scaling to match demand
  - Reserved instances for baseline (40% savings)
  - S3 lifecycle policies for old data
  - CDN for reduced origin requests
  - Database query optimization

Monthly Cost Breakdown (100K users):
  - Compute (EKS): $6,000
  - Database (RDS, Redis): $4,500
  - Data transfer: $1,500
  - Monitoring/Logging: $1,000
  - Other services: $2,000
  Total: ~$15,000/month
```

This architecture provides a production-ready, scalable, and secure foundation for an e-commerce platform that can grow from 100K to 1M concurrent users while maintaining high performance and reliability.

In [12]:
def code_review_example():
    """Code review with extended thinking"""
    
    code = '''
def process_transactions(transactions, user_id):
    results = []
    total = 0
    
    for t in transactions:
        if t["user"] == user_id:
            amount = t["amount"]
            if t["type"] == "credit":
                total += amount
            else:
                total -= amount
            results.append({"id": t["id"], "running_total": total})
    
    return results, total

def get_user_balance(user_id):
    conn = database.connect()
    cursor = conn.cursor()
    cursor.execute(f"SELECT * FROM transactions WHERE user_id = {user_id}")
    rows = cursor.fetchall()
    transactions = [{"id": r[0], "user": r[1], "amount": r[2], "type": r[3]} for r in rows]
    return process_transactions(transactions, user_id)
'''
    
    # max_tokens must be greater than budget_tokens
    budget = 10000
    
    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=budget + 4000,
        thinking={"type": "enabled", "budget_tokens": budget},
        messages=[{
            "role": "user",
            "content": f"""Review this Python code for:
            1. Security vulnerabilities
            2. Performance issues
            3. Best practice violations
            4. Error handling gaps
            
            Provide specific fixes for each issue found.
            
            Code:
            ```python
            {code}
            ```"""
        }]
    )
    
    print("Code Review with Extended Thinking (issues + fixes)")
    print("=" * 60)
    
    for block in response.content:
        if block.type == "text":
            display(Markdown(block.text))

code_review_example()

Code Review with Extended Thinking (issues + fixes)


# Code Review: Security, Performance, Best Practices & Error Handling

## 🚨 Critical Issues Found

### 1. **SECURITY VULNERABILITIES**

#### **SQL Injection (CRITICAL)**
```python
# ❌ VULNERABLE
cursor.execute(f"SELECT * FROM transactions WHERE user_id = {user_id}")
```

**Risk:** Allows arbitrary SQL execution if `user_id` is malicious input.

**Fix:**
```python
# ✅ SECURE - Use parameterized queries
cursor.execute("SELECT id, user_id, amount, type FROM transactions WHERE user_id = ?", (user_id,))
```

---

### 2. **PERFORMANCE ISSUES**

#### **Resource Leak**
```python
# ❌ Connection never closed
conn = database.connect()
cursor = conn.cursor()
```

#### **Inefficient Data Loading**
```python
# ❌ Loads entire dataset into memory
rows = cursor.fetchall()
```

---

### 3. **BEST PRACTICE VIOLATIONS**

- No type hints
- Missing docstrings
- Magic array indexes (`r[0], r[1]`)
- Poor variable names (`t`, `r`)
- No input validation
- Hard-coded string literals

---

### 4. **ERROR HANDLING GAPS**

- No exception handling for database operations
- Missing dictionary key validation
- No null/empty checks

---

## ✅ FIXED CODE

```python
from typing import List, Dict, Tuple, Optional
from decimal import Decimal
import logging

# Constants
TRANSACTION_TYPE_CREDIT = "credit"
TRANSACTION_TYPE_DEBIT = "debit"

logger = logging.getLogger(__name__)


class TransactionError(Exception):
    """Custom exception for transaction processing errors."""
    pass


def process_transactions(
    transactions: List[Dict], 
    user_id: int
) -> Tuple[List[Dict], Decimal]:
    """
    Process transactions for a specific user and calculate running totals.
    
    Args:
        transactions: List of transaction dictionaries
        user_id: ID of the user to filter transactions
        
    Returns:
        Tuple of (processed_results, final_total)
        
    Raises:
        TransactionError: If transaction data is invalid
    """
    if not transactions:
        return [], Decimal("0")
    
    if not isinstance(user_id, int) or user_id <= 0:
        raise ValueError(f"Invalid user_id: {user_id}")
    
    results = []
    total = Decimal("0")
    
    for transaction in transactions:
        try:
            # Validate required keys
            if transaction.get("user") == user_id:
                amount = Decimal(str(transaction["amount"]))
                transaction_type = transaction["type"]
                transaction_id = transaction["id"]
                
                # Update total based on transaction type
                if transaction_type == TRANSACTION_TYPE_CREDIT:
                    total += amount
                elif transaction_type == TRANSACTION_TYPE_DEBIT:
                    total -= amount
                else:
                    logger.warning(
                        f"Unknown transaction type '{transaction_type}' "
                        f"for transaction {transaction_id}"
                    )
                    continue
                
                results.append({
                    "id": transaction_id,
                    "running_total": float(total)
                })
                
        except KeyError as e:
            raise TransactionError(f"Missing required field: {e}")
        except (ValueError, TypeError) as e:
            raise TransactionError(f"Invalid transaction data: {e}")
    
    return results, total


def get_user_balance(user_id: int) -> Optional[Tuple[List[Dict], Decimal]]:
    """
    Retrieve user balance and transaction history from database.
    
    Args:
        user_id: ID of the user
        
    Returns:
        Tuple of (transaction_results, final_balance) or None on error
        
    Raises:
        ValueError: If user_id is invalid
        TransactionError: If transaction processing fails
    """
    # Input validation
    if not isinstance(user_id, (int, str)):
        raise ValueError(f"user_id must be int or string, got {type(user_id)}")
    
    try:
        user_id = int(user_id)
        if user_id <= 0:
            raise ValueError("user_id must be positive")
    except (ValueError, TypeError) as e:
        raise ValueError(f"Invalid user_id: {e}")
    
    # Use context manager for automatic resource cleanup
    try:
        with database.connect() as conn:
            with conn.cursor() as cursor:
                # Parameterized query prevents SQL injection
                query = """
                    SELECT id, user_id, amount, type 
                    FROM transactions 
                    WHERE user_id = ?
                    ORDER BY created_at ASC
                """
                cursor.execute(query, (user_id,))
                
                # Use column names for clarity
                columns = ["id", "user", "amount", "type"]
                transactions = [
                    dict(zip(columns, row)) 
                    for row in cursor.fetchall()
                ]
                
                if not transactions:
                    logger.info(f"No transactions found for user {user_id}")
                    return [], Decimal("0")
                
                return process_transactions(transactions, user_id)
                
    except database.DatabaseError as e:
        logger.error(f"Database error retrieving balance for user {user_id}: {e}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error in get_user_balance: {e}")
        raise


# Enhanced version with pagination for large datasets
def get_user_balance_paginated(
    user_id: int, 
    batch_size: int = 1000
) -> Optional[Tuple[List[Dict], Decimal]]:
    """
    Memory-efficient version using cursor iteration.
    Suitable for users with many transactions.
    """
    if not isinstance(user_id, int) or user_id <= 0:
        raise ValueError(f"Invalid user_id: {user_id}")
    
    results = []
    total = Decimal("0")
    
    try:
        with database.connect() as conn:
            with conn.cursor() as cursor:
                query = """
                    SELECT id, amount, type 
                    FROM transactions 
                    WHERE user_id = ?
                    ORDER BY created_at ASC
                """
                cursor.execute(query, (user_id,))
                
                # Process in batches to avoid memory issues
                while True:
                    rows = cursor.fetchmany(batch_size)
                    if not rows:
                        break
                    
                    for row in rows:
                        transaction_id, amount, trans_type = row
                        amount = Decimal(str(amount))
                        
                        if trans_type == TRANSACTION_TYPE_CREDIT:
                            total += amount
                        elif trans_type == TRANSACTION_TYPE_DEBIT:
                            total -= amount
                        
                        results.append({
                            "id": transaction_id,
                            "running_total": float(total)
                        })
                
                return results, total
                
    except Exception as e:
        logger.error(f"Error in paginated balance retrieval: {e}")
        raise

```

---

## 📋 SUMMARY OF FIXES

| Issue | Severity | Fix |
|-------|----------|-----|
| SQL Injection | 🔴 Critical | Parameterized queries |
| Resource Leak | 🟠 High | Context managers |
| No Error Handling | 🟠 High | Try-except blocks |
| Missing Validation | 🟡 Medium | Input validation added |
| Magic Indexes | 🟡 Medium | Named columns with `zip()` |
| No Type Hints | 🟢 Low | Full type annotations |
| Poor Naming | 🟢 Low | Descriptive variable names |
| Missing Docs | 🟢 Low | Comprehensive docstrings |

---

## 🔒 ADDITIONAL SECURITY RECOMMENDATIONS

1. **Add rate limiting** to prevent abuse
2. **Implement authentication** to verify user permissions
3. **Use prepared statements** consistently across codebase
4. **Add audit logging** for financial transactions
5. **Encrypt sensitive data** in transit and at rest

<a id='summary'></a>
## 11. Summary

### What We've Learned

1. **Extended Thinking Basics**: How to enable and use Claude's reasoning capabilities
2. **Model Support**: All Claude 4+ models support extended thinking
3. **Budget Tokens**: Start minimal (1,024) and increase based on task complexity
4. **Streaming**: Required for large max_tokens, improves UX
5. **Tool Use**: Only `tool_choice: "auto"` or `"none"` supported
6. **Interleaved Thinking**: Beta feature for reasoning between tool calls (Claude 4 only)
7. **Summarized Thinking**: Claude 4+ returns summaries, billed for full tokens

### Key Takeaways

**Use Extended Thinking for:**
- Complex multi-step problems requiring sequential reasoning
- Deep document analysis and structured evaluation
- Strategic planning with multiple constraints
- STEM problems and optimization tasks
- Quality-critical tasks where accuracy matters more than speed

**Avoid Extended Thinking for:**
- Simple queries or lookups
- Real-time chat applications
- Tasks where latency is critical
- High-volume, low-complexity requests

### Common Pitfalls to Avoid

- Don't toggle thinking mid-conversation (complete the assistant turn first)
- Don't use with forced tool use or modified temperature settings
- Don't manually edit or parse signature fields
- Don't over-specify instructions (let Claude's creativity work)
- Don't forget to pass thinking blocks back unchanged when using tools

### Resources

- [Extended Thinking Documentation](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking)
- [Extended Thinking Tips](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/extended-thinking-tips)
- [Anthropic API Reference](https://docs.anthropic.com/api/)