## **Converting Anthropic Extended Thinking Examples to DeepSeek-R1 via Ollama**

## **Anthropic Examples**

This notebook converts the following Anthropic extended thinking examples to run on DeepSeek-R1 through Ollama:

 1. Stream Thinking Example
 2. Tools Example 
 3. Budget Comparison 
 4. Prompting Best Practices 
 5. Cost Analysis 
   
### Feature Mapping
| **Anthropic Feature** | **DeepSeek-R1 Equivalent** |
|----------------------|----------------------------|
| `thinking={"type": "enabled"}` | Automatic `<think>` tags |
| `budget_tokens=5000` | No budget needed - automatic |
| `client.messages.stream()` | `client.chat(stream=True)` |
| `block.type == "thinking"` | Parse `<think>...</think>` |
| Token cost management | No per-token cost (local inference) |
| Tool calling | No tool calls; calculations are requested manually in the prompt |
| Response streaming events | Chunk-based content streaming |
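
The mapping in the table above can be sketched as a small translation function. This is a minimal sketch, not part of either SDK: `anthropic_to_ollama_kwargs` is a hypothetical helper, it assumes message contents are plain strings, and it maps `max_tokens` to Ollama's `num_predict` option, which caps all generated tokens including the `<think>` section.

```python
def anthropic_to_ollama_kwargs(params: dict, ollama_model: str = "deepseek-r1:14b") -> dict:
    """Translate Anthropic-style request kwargs into kwargs for ollama.Client.chat().

    Hypothetical helper for illustration only.
    """
    # Copy the list so the caller's request is not modified.
    messages = list(params.get("messages", []))

    # Anthropic takes the system prompt as a top-level argument;
    # Ollama expects it as a message with role "system".
    if params.get("system"):
        messages.insert(0, {"role": "system", "content": params["system"]})

    kwargs = {"model": ollama_model, "messages": messages}

    # max_tokens corresponds to Ollama's num_predict generation option.
    if "max_tokens" in params:
        kwargs["options"] = {"num_predict": params["max_tokens"]}

    # The "thinking" setting (type / budget_tokens) is dropped:
    # per the table above, DeepSeek-R1 reasons without a budget.
    return kwargs


claude_request = {
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 8000,
    "thinking": {"type": "enabled", "budget_tokens": 5000},
    "system": "You are a careful analyst.",
    "messages": [{"role": "user", "content": "What is 27 * 453?"}],
}
ollama_request = anthropic_to_ollama_kwargs(claude_request)
print(ollama_request)
```

The result can then be passed along as `client.chat(**ollama_request)`.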

### Key Advantages of DeepSeek-R1:

1. **💰 Cost**: No per-token charges for local inference, vs. an estimated $0.05-0.50 per analysis with Claude
2. **🧠 Thinking**: Always visible, no budget management needed  
3. **🔒 Privacy**: Runs locally, no cloud dependencies
4. **⚡ Performance**: No network round-trips or rate limits; throughput depends on local hardware



<a id='setup'></a>
## 2. Setting Up Your Environment using Astral uv

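Let's start by importing the packages and connecting to the local Ollama server. Local inference needs no API key. The imports below assume `ollama`, `python-dotenv`, and `ipython` are installed in the environment, for example with `uv pip install ollama python-dotenv ipython`.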

In [None]:
import os
import ollama
from IPython.display import Markdown, display
import json
import time
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Use the working URL directly
ollama_url = 'http://localhost:11434'  # Your confirmed working URL
model_name = 'deepseek-r1:14b'

# Create Ollama client
client = ollama.Client(host=ollama_url)

print(f"✅ Connected to Ollama at: {ollama_url}")
print(f"🤖 Using model: {model_name}")

# Verify the model is available
try:
    models = client.list()
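    # Newer ollama-python releases expose the tag under 'model'; older ones used 'name'
    available_models = [m.get('model') or m.get('name') for m in models.get('models', [])]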
    print(f"✅ Found {len(available_models)} models")
    
    if model_name in available_models:
        print(f"✅ {model_name} is available and ready!")
    else:
        print(f"❌ {model_name} not found. Available models:")
        for model in available_models[:5]:  # Show first 5
            print(f"  - {model}")
            
except Exception as e:
    print(f"❌ Error verifying models: {e}")

# Test a simple chat
def test_basic_chat():
    print(f"\n🧪 Testing basic chat...")
    try:
        response = client.chat(
            model=model_name,
            messages=[{"role": "user", "content": "Hello! Just say 'Hi' back."}],
            stream=False
        )
        
        answer = response.get('message', {}).get('content', '')
        print(f"✅ DeepSeek says: {answer}")
        return True
    except Exception as e:
        print(f"❌ Chat test failed: {e}")
        return False

# Run the test
if test_basic_chat():
    print(f"\n🎉 Everything is working perfectly!")
    print(f"You can now use the converted examples.")
else:
    print(f"\n❌ Something went wrong with the chat test.")

### All of my Ollama.ai models are supported.

| Model Name               | ID              | Size    | Last Modified |
|--------------------------|-----------------|---------|---------------|
| phi4-mini:3.8b           | 78fad5d182a7    | 2.5 GB  | 6 days ago    |
| deepseek-r1:32b          | edba8017331d    | 19 GB   | 6 days ago    |
| deepseek-r1:14b          | c333b7232bdb    | 9.0 GB  | 6 days ago    |
| deepseek-r1:8b           | 6995872bfe4c    | 5.2 GB  | 6 days ago    |
| deepseek-r1:1.5b         | e0979632db5a    | 1.1 GB  | 6 days ago    |
| gemma3n:e4b              | 15cb39fd9394    | 7.5 GB  | 6 days ago    |
| qwen2.5vl:7b             | 5ced39dfa4ba    | 6.0 GB  | 3 weeks ago   |
| qwen2.5vl:32b            | 3edc3a52fe98    | 21 GB   | 3 weeks ago   |
| mistral-small3.1:24b     | b9aaf0c2586a    | 15 GB   | 3 weeks ago   |
| gemma3:27b               | a418f5838eaf    | 17 GB   | 4 weeks ago   |
| gemma3:12b               | f4031aab637d    | 8.1 GB  | 4 weeks ago   |
| gemma3:4b                | a2af6cc3eb7f    | 3.3 GB  | 4 weeks ago   |
| phi4-reasoning:latest    | 47e2630ccbcd    | 11 GB   | 4 weeks ago   |
| qwen2.5-coder:14b        | 9ec8897f747e    | 9.0 GB  | 4 weeks ago   |
| qwen2.5-coder:7b         | dae161e27b0e    | 4.7 GB  | 4 weeks ago   |
| qwen3:14b                | bdbd181c33f2    | 9.3 GB  | 4 weeks ago   |
| qwen3:8b                 | 500a1f067a9f    | 5.2 GB  | 4 weeks ago   |
| phi4:latest              | ac896e5b8b34    | 9.1 GB  | 6 months ago  |


<a id='basic-usage'></a>
## 3. Basic Usage

Let's start with a simple example to see extended thinking in action.

In [None]:
# Your basic thinking example (converted)
def basic_thinking_example():
    response = client.chat(
        model='deepseek-r1:14b',
        messages=[{
            "role": "user", 
            "content": "What is 27 * 453? Show me how you calculate this step by step."
        }],
        stream=False
    )
    
    full_response = response.get('message', {}).get('content', '')
    
    # Parse thinking and answer (require both tags so find() never returns -1)
    if '<think>' in full_response and '</think>' in full_response:
        thinking_start = full_response.find('<think>') + 7
        thinking_end = full_response.find('</think>')
        thinking = full_response[thinking_start:thinking_end].strip()
        answer = full_response[thinking_end + 8:].strip()
        
        print("🤔 DeepSeek's Thinking:")
        print("-" * 50)
        print(thinking)
        print("-" * 50)
        print("\n✅ Final Answer:")
        print(answer)
    else:
        print("✅ Response:")
        print(full_response)

# Run it
basic_thinking_example()
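
The same `<think>` parsing recurs in most cells of this notebook, so it can be factored into one helper. The following is a minimal sketch; `split_thinking` is a hypothetical name, and it assumes at most one `<think>...</think>` block per response.

```python
def split_thinking(text: str) -> tuple[str, str]:
    """Return (thinking, answer) from a response that may contain <think>...</think>."""
    open_tag, close_tag = "<think>", "</think>"
    start = text.find(open_tag)
    end = text.find(close_tag)

    # No complete, well-ordered block: treat the whole text as the answer.
    if start == -1 or end == -1 or end < start:
        return "", text.strip()

    thinking = text[start + len(open_tag):end].strip()
    # Keep any text before the opening tag together with the text after the closing tag.
    answer = (text[:start] + text[end + len(close_tag):]).strip()
    return thinking, answer


thinking, answer = split_thinking("<think>Simple addition: 2+2=4</think>The answer is 4.")
print("Thinking:", thinking)
print("Answer:", answer)
```

Falling back to the full text when a tag is missing keeps the helper usable with models that do not emit a thinking block.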

**Claude's `thinking` request parameter is not valid for DeepSeek-R1.** Here's the key difference:

## Anthropic Claude vs DeepSeek-R1 Thinking

### ❌ **Anthropic Claude (Manual Configuration)**
```python
thinking={
    "type": "enabled",        # Must explicitly enable
    "budget_tokens": 5000     # Must set token budget
}
```

### ✅ **DeepSeek-R1 (Automatic)**
```python
# No thinking parameters needed!
client.chat(
    model='deepseek-r1:14b',
    messages=[{"role": "user", "content": "Your question"}]
)
# Thinking happens automatically in <think>...</think> tags
```

## Key Differences:

| **Feature** | **Anthropic Claude** | **DeepSeek-R1** |
|-------------|---------------------|-----------------|
| **Thinking Control** | Manual (`thinking={"type": "enabled"}`) | ✅ **Automatic** |
| **Token Budget** | Required (`budget_tokens`, minimum 1024) | ✅ **No budget needed** |
| **Thinking Output** | Separate block/summary | ✅ **Full process in `<think>` tags** |
| **Cost** | Thinking tokens billed as output | ✅ **FREE (local)** |
| **Configuration** | Complex parameter management | ✅ **Zero configuration** |

## DeepSeek-R1 Thinking Behavior:

```python
# Simple question = shorter thinking (outputs below are illustrative, not captured)
response = client.chat(model='deepseek-r1:14b', messages=[{"role": "user", "content": "What is 2+2?"}])
# e.g. <think>Simple addition: 2+2=4</think>The answer is 4.

# Complex question = extensive thinking
response = client.chat(model='deepseek-r1:14b', messages=[{"role": "user", "content": "Design a REST API..."}])
# e.g. <think>[5000+ characters of detailed reasoning]</think>[Final answer]
```

## Why DeepSeek-R1 is Simpler:

1. **🎯 No Parameter Management**: Just ask your question
2. **🧠 Intelligent Scaling**: Thinking length tends to grow with the complexity of the question
3. **💰 No Token Costs**: Think as much as needed without cost concerns
4. **🔍 Full Transparency**: See the complete thought process, not just summaries

**Bottom Line**: DeepSeek-R1's thinking is **automatic, free, and transparent** - no configuration needed like Claude's manual thinking parameters!

<a id='thinking-blocks'></a>
## 4. Understanding Thinking Blocks

Let's explore how thinking blocks work and what information they contain.

In [None]:
def analyze_thinking_blocks():
    """Demonstrate the structure of DeepSeek-R1 thinking blocks"""
    
    response = client.chat(
        model='deepseek-r1:14b',
        messages=[{
            "role": "user",
            "content": """I have a list of numbers: [15, 23, 8, 42, 16, 4, 30, 12].
            
            Please:
            1. Find the median
            2. Calculate the mean
            3. Identify any outliers using the IQR method
            4. Suggest what this data might represent"""
        }],
        stream=False
    )
    
    # Get the full response
    full_response = response.get('message', {}).get('content', '')
    
    # Analyze the response structure
    print("📊 DeepSeek-R1 Response Structure Analysis")
    print("=" * 60)
    
    # Parse thinking vs final answer
    if '<think>' in full_response and '</think>' in full_response:
        thinking_start = full_response.find('<think>') + 7
        thinking_end = full_response.find('</think>')
        thinking_content = full_response[thinking_start:thinking_end].strip()
        final_answer = full_response[thinking_end + 8:].strip()
        
        print(f"\nResponse Structure:")
        print(f"  Total Length: {len(full_response)} characters")
        print(f"  Has Thinking Block: Yes")
        print(f"  Thinking Length: {len(thinking_content)} characters")
        print(f"  Final Answer Length: {len(final_answer)} characters")
        print(f"  Thinking Ratio: {len(thinking_content)/(len(full_response))*100:.1f}%")
        
        print(f"\n🤔 THINKING PROCESS:")
        print("-" * 40)
        print(thinking_content)
        
        print(f"\n✅ FINAL ANSWER:")
        print("-" * 40)
        display(Markdown(final_answer))
        
        # Additional analysis
        print(f"\n📈 THINKING ANALYSIS:")
        print("-" * 40)
        thinking_lines = thinking_content.split('\n')
        print(f"  Lines in thinking: {len(thinking_lines)}")
        print(f"  Words in thinking: {len(thinking_content.split())}")
        
        # Look for mathematical reasoning patterns
        math_keywords = ['calculate', 'median', 'mean', 'average', 'sort', 'IQR', 'outlier']
        found_keywords = [kw for kw in math_keywords if kw.lower() in thinking_content.lower()]
        print(f"  Mathematical concepts mentioned: {found_keywords}")
        
    else:
        print(f"\nResponse Structure:")
        print(f"  Total Length: {len(full_response)} characters")
        print(f"  Has Thinking Block: No")
        print(f"  Response Type: Direct answer")
        
        print(f"\n✅ RESPONSE:")
        print("-" * 40)
        display(Markdown(full_response))
    
    return full_response

# Run the analysis
result = analyze_thinking_blocks()
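
To judge whether the model's answer is right, it helps to have reference values for the same list. The sketch below computes them with the standard library. It assumes the median-of-halves convention for quartiles (other conventions give slightly different Q1 and Q3), and `reference_stats` is a hypothetical helper name.

```python
import statistics


def reference_stats(values):
    """Compute median, mean, quartiles and IQR outliers for a list of numbers."""
    data = sorted(values)
    half = len(data) // 2

    # Median-of-halves quartiles: split the sorted data around the median.
    lower = data[:half]
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    q1 = statistics.median(lower)
    q3 = statistics.median(upper)
    iqr = q3 - q1

    # Standard 1.5 * IQR fences for outlier detection.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr

    return {
        "median": statistics.median(data),
        "mean": statistics.mean(data),
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }


stats = reference_stats([15, 23, 8, 42, 16, 4, 30, 12])
print(stats)
```

Under this convention the list has a median of 15.5, a mean of 18.75, and no IQR outliers, which gives a baseline to compare against the final answer and the thinking block.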

**Summarized thinking, thinking-token billing, and signature verification do not apply to DeepSeek-R1.** These features are specific to Anthropic's Claude 4 implementation. Here's the comparison:

## Anthropic Claude 4 vs DeepSeek-R1 Thinking

| **Feature** | **Anthropic Claude 4** | **DeepSeek-R1** |
|-------------|------------------------|------------------|
| **Summarization** | ✅ Provides summarized thinking | ❌ Shows FULL thinking process |
| **Billing Model** | 💰 Charged for full thinking tokens | 💰 FREE (local inference) |
| **Signature Verification** | ✅ Cryptographic signatures | ❌ No signatures needed |
| **Privacy Controls** | ✅ Controlled thinking exposure | ✅ Full local privacy |

## DeepSeek-R1 Thinking Characteristics:

### 1. **Full Thinking Display** (Not Summarized)
```python
# DeepSeek-R1 shows EVERYTHING in <think> tags
response = """<think>
Let me work through this step by step...
First I need to calculate 27 * 453...
27 * 453 = 27 * (450 + 3) = 27 * 450 + 27 * 3
27 * 450 = 27 * 45 * 10 = 1215 * 10 = 12,150
27 * 3 = 81
So 27 * 453 = 12,150 + 81 = 12,231
</think>

The answer is 27 × 453 = 12,231"""
```

### 2. **No Token Billing** (Free Local)
- No "thinking budget" needed
- No charge per token
- Unlimited thinking depth

### 3. **No Cryptographic Signatures**
- No verification system
- Direct model output
- Trust based on model reliability

### 4. **Complete Transparency**
- You see the actual thinking process
- No hidden reasoning steps
- Full visibility into model reasoning

## What This Means for You: The Key Difference

**Anthropic Claude 4** gives you a "polished summary" of thinking with enterprise features.

**DeepSeek-R1** gives you the "raw, unfiltered thinking process" with complete transparency.

Think of it like:
- **Claude 4**: A professional report with executive summary
- **DeepSeek-R1**: The researcher's complete notebook with all work shown

Both are valuable, but for **learning and understanding AI reasoning**, DeepSeek-R1 actually gives you MORE insight into how the model thinks!

In [None]:
def deepseek_thinking_reality():
    """Demonstrate actual DeepSeek-R1 thinking characteristics"""
    
    print("🔍 DeepSeek-R1 Thinking Reality Check")
    print("=" * 60)
    
    # Test to show actual thinking behavior
    response = client.chat(
        model='deepseek-r1:14b',
        messages=[{
            "role": "user", 
            "content": "Calculate 27 * 453 and show your work clearly."
        }],
        stream=False
    )
    
    full_response = response.get('message', {}).get('content', '')
    
    print("🧠 WHAT YOU ACTUALLY GET:")
    print("-" * 40)
    
    if '<think>' in full_response and '</think>' in full_response:
        thinking_start = full_response.find('<think>') + 7
        thinking_end = full_response.find('</think>')
        thinking_content = full_response[thinking_start:thinking_end].strip()
        final_answer = full_response[thinking_end + 8:].strip()
        
        print("✅ Full thinking process visible:")
        print(f"   Length: {len(thinking_content)} characters")
        print(f"   Words: {len(thinking_content.split())} words")
        print(f"   Complete reasoning: YES")
        print(f"   Summarized: NO")
        print(f"   Cryptographic signature: NO")
        print(f"   Cost: $0.00 (FREE)")
        
        print(f"\n🎯 THINKING SAMPLE (first 500 chars):")
        print(f"   {thinking_content[:500]}...")
        
        print(f"\n💡 FINAL ANSWER:")
        print(f"   {final_answer}")
        
    else:
        print("❌ No thinking block found in this response")
    
    return full_response

def anthropic_vs_deepseek_summary():
    """Summary of key differences"""
    
    print(f"\n📊 ANTHROPIC vs DEEPSEEK-R1 SUMMARY")
    print("=" * 60)
    
    differences = [
        {
            "aspect": "Thinking Visibility",
            "anthropic": "Summarized snippets",
            "deepseek": "Complete full process",
            "winner": "DeepSeek-R1"
        },
        {
            "aspect": "Cost Model", 
            "anthropic": "Pay per thinking token",
            "deepseek": "Free local inference",
            "winner": "DeepSeek-R1"
        },
        {
            "aspect": "Token Management",
            "anthropic": "Budget required",
            "deepseek": "No limits needed",
            "winner": "DeepSeek-R1"
        },
        {
            "aspect": "Privacy",
            "anthropic": "Cloud-based processing",
            "deepseek": "Local-only processing", 
            "winner": "DeepSeek-R1"
        },
        {
            "aspect": "Setup Complexity",
            "anthropic": "API key only",
            "deepseek": "Local installation",
            "winner": "Anthropic"
        },
        {
            "aspect": "Verification",
            "anthropic": "Cryptographic signatures",
            "deepseek": "Direct model output",
            "winner": "Anthropic"
        }
    ]
    
    print(f"{'Aspect':<20} | {'Anthropic':<20} | {'DeepSeek-R1':<20} | {'Better'}")
    print("-" * 85)
    
    for diff in differences:
        winner_symbol = "🏆" if diff['winner'] == "DeepSeek-R1" else "⭐"
        print(f"{diff['aspect']:<20} | {diff['anthropic']:<20} | {diff['deepseek']:<20} | {winner_symbol} {diff['winner']}")
    
    print(f"\n🎯 BOTTOM LINE:")
    print(f"   • DeepSeek-R1: Better for transparency, cost, privacy")
    print(f"   • Anthropic: Better for enterprise verification, ease of setup")
    print(f"   • Both: Excellent reasoning capabilities")

def practical_implications():
    """What this means for your usage"""
    
    print(f"\n💡 PRACTICAL IMPLICATIONS")
    print("=" * 50)
    
    implications = [
        "✅ You see MORE thinking detail with DeepSeek-R1",
        "✅ No token budgets to manage or optimize", 
        "✅ No surprise costs from long thinking processes",
        "❌ No verification signatures (trust the model)",
        "❌ More complex initial setup required",
        "✅ Complete privacy and data control"
    ]
    
    for implication in implications:
        print(f"   {implication}")
    
    print(f"\n🚀 RECOMMENDATION:")
    print(f"   Use DeepSeek-R1 for: Learning, experimentation, cost-sensitive projects")
    print(f"   Use Claude 4 for: Enterprise verification, quick API setup")

# Run the reality check
result = deepseek_thinking_reality()
anthropic_vs_deepseek_summary()
practical_implications()
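
Because the thinking is unverified model output, the arithmetic in the final answer is worth checking in code. The sketch below looks for the expected product in the answer text; `contains_product` is a hypothetical helper, and it assumes the answer writes the number in digits, optionally with thousands commas.

```python
import re


def contains_product(answer: str, a: int, b: int) -> bool:
    """Return True if the integer a * b appears in the answer text."""
    expected = a * b
    # Match digit runs that may include thousands separators, e.g. "12,231".
    numbers = re.findall(r"\d[\d,]*", answer)
    return any(int(n.replace(",", "")) == expected for n in numbers)


print(contains_product("The answer is 27 × 453 = 12,231", 27, 453))  # True
print(contains_product("The answer is 12,321", 27, 453))             # False
```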

The **streaming API structure** from Anthropic is not directly valid for DeepSeek-R1 via Ollama. Here's the comparison:

## ❌ Anthropic Streaming (Not Valid for DeepSeek)

```python
# This is Anthropic-specific and won't work with DeepSeek-R1
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=8000,  # required by the Messages API; must exceed budget_tokens
    thinking={"type": "enabled", "budget_tokens": 5000},
    messages=[{"role": "user", "content": "..."}]
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            pass  # Anthropic-specific event structure
        elif event.type == "content_block_delta":
            pass  # Anthropic-specific delta events
```

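## ✅ DeepSeek-R1 Streaming (What Actually Works)

With Ollama, streaming is a plain loop over the chunks returned by `client.chat(..., stream=True)`; the cells further down show it in full.

## Summary: What's Valid vs Invalid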

### ❌ **NOT Valid for DeepSeek-R1:**
- `client.messages.stream()` - Different API
- `thinking={"type": "enabled", "budget_tokens": 5000}` - No budget system
- `event.type == "content_block_start"` - Different event structure
- `event.delta.type == "thinking_delta"` - No delta events
- Context managers (`with stream as s:`) - Different pattern

### ✅ **Valid Concepts (but different implementation):**
- **Streaming responses** - Yes, but with `client.chat(stream=True)`
- **Thinking visibility** - Yes, but via `<think>` tag parsing
- **Progressive display** - Yes, but with chunk iteration
- **Real-time feedback** - Yes, but simpler implementation

### 🎯 **Key Difference:**
**Anthropic**: Complex event-driven streaming with budget management  
**DeepSeek-R1**: Simple chunk-based streaming with automatic thinking

**The concepts are similar, but the implementation is completely different.** DeepSeek-R1 is actually simpler - no complex event handling needed, just parse the `<think>` tags from the streaming content!
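
Parsing `<think>` tags from streamed content has one subtlety: a tag could in principle arrive split across two chunks, which a per-chunk `'<think>' in content` test would miss. The sketch below buffers just enough text to handle that case. `split_stream` is a hypothetical helper that works on any iterable of strings, so it can be tried without a running model.

```python
from typing import Iterable, Iterator, Tuple

OPEN_TAG, CLOSE_TAG = "<think>", "</think>"


def split_stream(chunks: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield ("thinking", text) or ("answer", text) pieces from streamed content."""
    buffer = ""
    in_thinking = False
    for chunk in chunks:
        buffer += chunk
        while True:
            tag = CLOSE_TAG if in_thinking else OPEN_TAG
            kind = "thinking" if in_thinking else "answer"
            idx = buffer.find(tag)
            if idx != -1:
                # Full tag found: emit the text before it and flip the state.
                if idx > 0:
                    yield kind, buffer[:idx]
                buffer = buffer[idx + len(tag):]
                in_thinking = not in_thinking
                continue
            # No full tag: hold back a tail that could be the start of a tag.
            keep = 0
            for n in range(min(len(tag) - 1, len(buffer)), 0, -1):
                if tag.startswith(buffer[-n:]):
                    keep = n
                    break
            emit = buffer[:len(buffer) - keep]
            if emit:
                yield kind, emit
            buffer = buffer[len(buffer) - keep:]
            break
    # Flush whatever is left when the stream ends.
    if buffer:
        yield ("thinking" if in_thinking else "answer"), buffer


# Simulated chunks with both tags split across chunk boundaries.
fake_chunks = ["<thi", "nk>Let me ", "add.</th", "ink>The answer", " is 4."]
pieces = list(split_stream(fake_chunks))
thinking_text = "".join(text for kind, text in pieces if kind == "thinking")
answer_text = "".join(text for kind, text in pieces if kind == "answer")
print("Thinking:", thinking_text)
print("Answer:", answer_text)
```

With a live stream, the input would be the content of each chunk, for example `split_stream(chunk['message'].get('content', '') for chunk in stream)`.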

In [None]:
# ============================================================================
# ANTHROPIC STREAMING (DOESN'T WORK WITH DEEPSEEK)
# ============================================================================

def anthropic_style_streaming():
    """This is how Anthropic streaming works - NOT valid for DeepSeek"""
    
    # ❌ This structure doesn't exist in Ollama
    """
    with client.messages.stream(
        model="claude-sonnet-4-20250514",
        thinking={"type": "enabled", "budget_tokens": 5000},
        messages=[{"role": "user", "content": "..."}]
    ) as stream:
        for event in stream:
            if event.type == "content_block_start":
                # Anthropic-specific events
            elif event.type == "content_block_delta":
                # Anthropic-specific deltas
    """
    print("❌ This Anthropic pattern doesn't work with DeepSeek-R1/Ollama")

# ============================================================================
# DEEPSEEK-R1 STREAMING (WHAT ACTUALLY WORKS)
# ============================================================================

def deepseek_streaming_correct():
    """This is how DeepSeek-R1 streaming actually works"""
    
    print("✅ DeepSeek-R1 Streaming (Correct Method)")
    print("=" * 50)
    
    # ✅ This is the correct way for DeepSeek-R1
    stream = client.chat(
        model='deepseek-r1:14b',
        messages=[{
            "role": "user", 
            "content": "Explain quantum computing in simple terms."
        }],
        stream=True  # Simple boolean flag
    )
    
    # Track streaming state
    full_response = ""
    in_thinking = False
    
    print("🤖 DeepSeek-R1: ", end="", flush=True)
    
    # ✅ Simple iteration over chunks
    for chunk in stream:
        if 'message' in chunk:
            content = chunk['message'].get('content', '')
            full_response += content
            
            # Handle thinking transitions
            if '<think>' in content and not in_thinking:
                in_thinking = True
                print("\n🧠 [Thinking...] ", end="", flush=True)
                content = content.replace('<think>', '')
            
            if '</think>' in content and in_thinking:
                in_thinking = False
                print(" [Done]\n💡 Answer: ", end="", flush=True)
                content = content.replace('</think>', '')
            
            # Show appropriate content
            if in_thinking:
                print(".", end="", flush=True)  # Progress dots
            else:
                print(content, end="", flush=True)  # Actual content
    
    print(f"\n\n✅ Complete! ({len(full_response)} chars)")
    return full_response

# ============================================================================
# COMPARISON TABLE
# ============================================================================

def show_streaming_comparison():
    """Show the differences between Anthropic and DeepSeek streaming"""
    
    print("\n📊 STREAMING API COMPARISON")
    print("=" * 60)
    
    comparison = [
        {
            "Feature": "Stream initiation",
            "Anthropic": "client.messages.stream()",
            "DeepSeek": "client.chat(stream=True)"
        },
        {
            "Feature": "Context manager", 
            "Anthropic": "with stream as s:",
            "DeepSeek": "for chunk in stream:"
        },
        {
            "Feature": "Event types",
            "Anthropic": "event.type, event.delta",
            "DeepSeek": "chunk['message']['content']"
        },
        {
            "Feature": "Thinking detection",
            "Anthropic": "thinking_delta events",
            "DeepSeek": "Parse <think> tags"
        },
        {
            "Feature": "Content access",
            "Anthropic": "event.delta.text",
            "DeepSeek": "chunk['message']['content']"
        },
        {
            "Feature": "Thinking budget",
            "Anthropic": "budget_tokens=5000",
            "DeepSeek": "Automatic (no config needed)"
        }
    ]
    
    print(f"{'Feature':<20} | {'Anthropic':<25} | {'DeepSeek-R1':<25}")
    print("-" * 75)
    
    for comp in comparison:
        print(f"{comp['Feature']:<20} | {comp['Anthropic']:<25} | {comp['DeepSeek']:<25}")

# ============================================================================
# PRACTICAL DEEPSEEK STREAMING EXAMPLES
# ============================================================================

def practical_streaming_examples():
    """Show practical streaming patterns for DeepSeek-R1"""
    
    print(f"\n🚀 PRACTICAL DEEPSEEK STREAMING PATTERNS")
    print("=" * 60)
    
    # Pattern 1: Simple streaming
    print(f"\n1️⃣ Simple Streaming:")
    print("-" * 30)
    print("""
stream = client.chat(model='deepseek-r1:14b', messages=messages, stream=True)
for chunk in stream:
    if 'message' in chunk:
        content = chunk['message'].get('content', '')
        print(content, end='', flush=True)
""")
    
    # Pattern 2: Thinking-aware streaming
    print(f"\n2️⃣ Thinking-Aware Streaming:")
    print("-" * 30)
    print("""
in_thinking = False
for chunk in stream:
    content = chunk['message'].get('content', '')
    if '<think>' in content: in_thinking = True
    if '</think>' in content: in_thinking = False
    
    if in_thinking:
        print('.', end='', flush=True)  # Progress
    else:
        print(content, end='', flush=True)  # Response
""")
    
    # Pattern 3: Advanced streaming with analysis
    print(f"\n3️⃣ Advanced Streaming with Analysis:")
    print("-" * 30)
    print("""
thinking_buffer = ""
response_buffer = ""

for chunk in stream:
    content = chunk['message'].get('content', '')
    
    if in_thinking_mode:
        thinking_buffer += content
        show_thinking_progress()
    else:
        response_buffer += content
        display_response(content)

analyze_thinking_patterns(thinking_buffer)
""")

# ============================================================================
# RUN EXAMPLES
# ============================================================================

def run_all_examples():
    """Run all streaming examples"""
    
    print("🎯 DEEPSEEK-R1 STREAMING GUIDE")
    print("=" * 60)
    
    # Show what doesn't work
    anthropic_style_streaming()
    
    # Show what does work
    result = deepseek_streaming_correct()
    
    # Show comparisons
    show_streaming_comparison()
    
    # Show practical patterns
    practical_streaming_examples()
    
    print(f"\n✅ Key Takeaway: DeepSeek-R1 streaming is simpler!")
    print(f"   No complex event handling - just parse <think> tags")
    print(f"   No budget management - thinking is automatic")
    print(f"   No special context managers - simple iteration")

# Run the complete guide
run_all_examples()

In [None]:
def stream_thinking_example():
    """Demonstrate streaming with DeepSeek-R1 thinking"""
    
    print("🌊 Streaming DeepSeek-R1 Thinking Example")
    print("=" * 60)
    
    question = """Design a simple REST API for a todo list application. 
    Include endpoints for CRUD operations and consider:
    - Authentication
    - Error handling
    - Data validation
    - Response formats"""
    
    print(f"❓ Question: {question}\n")
    
    # Stream the response
    stream = client.chat(
        model='deepseek-r1:14b',
        messages=[{
            "role": "user",
            "content": question
        }],
        stream=True
    )
    
    # Track the response as it comes in
    full_response = ""
    in_thinking = False
    thinking_content = ""
    final_answer = ""
    
    print("🤖 DeepSeek-R1 Response:")
    print("-" * 30)
    
    for chunk in stream:
        if 'message' in chunk:
            content = chunk['message'].get('content', '')
            full_response += content
            
            # Check if we're entering thinking mode
            if '<think>' in content and not in_thinking:
                in_thinking = True
                print("\n🧠 [THINKING...] ", end="", flush=True)
                continue
            
            # Check if we're exiting thinking mode
            if '</think>' in content and in_thinking:
                in_thinking = False
                print(" [THINKING COMPLETE]\n")
                print("💡 Final Answer:")
                print("-" * 20)
                continue
            
            # Show progress dots during thinking
            if in_thinking:
                print(".", end="", flush=True)
            else:
                # Show the actual final answer
                print(content, end="", flush=True)
    
    print(f"\n\n📊 Stream Summary:")
    print(f"  Total response length: {len(full_response)} characters")
    
    # Parse the complete response for detailed analysis
    if '<think>' in full_response and '</think>' in full_response:
        thinking_start = full_response.find('<think>') + 7
        thinking_end = full_response.find('</think>')
        thinking_content = full_response[thinking_start:thinking_end].strip()
        final_answer = full_response[thinking_end + 8:].strip()
        
        print(f"  Thinking length: {len(thinking_content)} characters")
        print(f"  Final answer length: {len(final_answer)} characters")
        print(f"  Thinking word count: {len(thinking_content.split())} words")
    
    return full_response

# Run the streaming example
result = stream_thinking_example()

### 5.2 Extended Thinking with Tool Use

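Extended thinking can be combined with tool use for even more powerful applications. The converted cell below does not call any tools; it streams the thinking process in real time for a more complex scenario.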

In [None]:
import time

def advanced_streaming_with_thinking():
    """Advanced streaming that shows thinking process in real-time"""
    
    print("🚀 Advanced Streaming with Real-time Thinking Display")
    print("=" * 60)
    
    question = """Analyze this scenario: A tech startup wants to implement AI in their 
    customer service. They have 10,000 daily support tickets, 60% are simple FAQ-type 
    questions, 30% require human judgment, and 10% are complex technical issues. 
    Design a solution considering costs, customer satisfaction, and implementation timeline."""
    
    print(f"❓ Complex Question: {question}\n")
    
    # Stream the response
    stream = client.chat(
        model='deepseek-r1:14b',
        messages=[{
            "role": "user",
            "content": question
        }],
        stream=True
    )
    
    # Advanced tracking
    full_response = ""
    thinking_buffer = ""
    final_buffer = ""
    in_thinking = False
    thinking_started = False
    
    print("🤖 DeepSeek-R1 Processing:")
    print("=" * 40)
    
    for chunk in stream:
        if 'message' in chunk:
            content = chunk['message'].get('content', '')
            full_response += content
            
            # Handle thinking block start
            if '<think>' in content:
                in_thinking = True
                thinking_started = True
                content = content.replace('<think>', '')
                print("\n🧠 THINKING PROCESS:")
                print("-" * 30)
                time.sleep(0.1)  # Small delay for readability
            
            # Handle thinking block end
            if '</think>' in content:
                in_thinking = False
                content = content.replace('</think>', '')
                thinking_buffer += content
                print(f"\n{'.'*30}")
                print("💡 FINAL SOLUTION:")
                print("-" * 30)
                time.sleep(0.2)
                continue
            
            # Process content based on current state
            if in_thinking:
                thinking_buffer += content
                # Show thinking in real-time with slight delay
                for char in content:
                    print(char, end='', flush=True)
                    time.sleep(0.005)  # Very small delay for dramatic effect
            else:
                final_buffer += content
                # Show final answer immediately
                print(content, end='', flush=True)
    
    # Final analysis
    print(f"\n\n📊 ADVANCED ANALYSIS:")
    print("=" * 40)
    
    if thinking_buffer:
        print(f"✅ Thinking captured: {len(thinking_buffer)} characters")
        print(f"✅ Final answer: {len(final_buffer)} characters")
        # Guard against an empty answer to avoid ZeroDivisionError
        if final_buffer:
            print(f"✅ Thinking-to-answer ratio: {len(thinking_buffer)/len(final_buffer):.1f}:1")
        
        # Analyze thinking patterns
        thinking_sentences = thinking_buffer.split('.')
        print(f"✅ Thinking sentences: {len(thinking_sentences)}")
        
        # Look for solution patterns
        solution_keywords = ['consider', 'implement', 'solution', 'approach', 'strategy', 'recommend']
        found_patterns = [kw for kw in solution_keywords if kw.lower() in thinking_buffer.lower()]
        print(f"✅ Solution patterns found: {found_patterns}")
        
        # Show thinking summary
        print(f"\n🎯 THINKING SUMMARY (first 300 chars):")
        print(f"   {thinking_buffer[:300]}...")
    else:
        print("⚠️  No thinking block detected")
    
    return {
        'full_response': full_response,
        'thinking': thinking_buffer,
        'final_answer': final_buffer,
        'has_thinking': bool(thinking_buffer)
    }

# Run the advanced streaming example
result = advanced_streaming_with_thinking()
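
The streaming loops in this notebook check each chunk for a complete `<think>` or `</think>` tag. If a tag ever arrives split across two chunks, that check would miss it. Below is a minimal sketch of a splitter that holds back a possible partial tag until the next chunk arrives. The names `split_think_stream` and `_held_back` are hypothetical helpers, not part of the `ollama` library, and the sketch assumes at most one thinking block per response.

```python
from typing import Iterable, Iterator, Tuple

OPEN_TAG, CLOSE_TAG = "<think>", "</think>"


def _held_back(text: str, tag: str) -> int:
    """Length of the longest suffix of `text` that is a proper prefix of `tag`."""
    for size in range(min(len(tag) - 1, len(text)), 0, -1):
        if text.endswith(tag[:size]):
            return size
    return 0


def split_think_stream(chunks: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield ("thinking", text) or ("answer", text) pieces from streamed chunks."""
    in_thinking = False
    pending = ""  # text not emitted yet because it may end with a partial tag
    for chunk in chunks:
        pending += chunk
        while pending:
            tag = CLOSE_TAG if in_thinking else OPEN_TAG
            kind = "thinking" if in_thinking else "answer"
            before, found, after = pending.partition(tag)
            if found:
                if before:
                    yield kind, before
                in_thinking = not in_thinking
                pending = after
                continue
            # No complete tag yet: emit everything except a possible partial tag
            keep = _held_back(pending, tag)
            emit = pending[:len(pending) - keep]
            if emit:
                yield kind, emit
            pending = pending[len(pending) - keep:]
            break
    if pending:
        # Stream ended while holding text back; flush it as-is
        yield ("thinking" if in_thinking else "answer"), pending


# Simulated chunks with both tags split across chunk boundaries
demo = ["<thi", "nk>step 1", " step 2</th", "ink>Answer", " here"]
for kind, text in split_think_stream(demo):
    print(f"{kind}: {text!r}")
```

With a live stream, the same generator could be fed `(chunk['message'].get('content', '') for chunk in stream)`, which mirrors how the cells above read each chunk.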

In [None]:
def stream_thinking_example():
    """Demonstrate streaming with DeepSeek-R1 extended thinking"""
    
    print("🌊 Streaming Extended Thinking Example")
    print("=" * 60)
    
    # DeepSeek-R1 automatically provides thinking - no budget needed!
    stream = client.chat(
        model=model_name,  # defined in the setup cell
        messages=[{
            "role": "user",
            "content": """Design a simple REST API for a todo list application. 
            Include endpoints for CRUD operations and consider:
            - Authentication
            - Error handling
            - Data validation
            - Response formats"""
        }],
        stream=True
    )
    
    # Track streaming state
    in_thinking = False
    full_response = ""
    
    for chunk in stream:
        if 'message' in chunk:
            content = chunk['message'].get('content', '')
            full_response += content
            
            # Detect thinking block start
            if '<think>' in content and not in_thinking:
                in_thinking = True
                print("\n🤔 DeepSeek is thinking...", end="", flush=True)
                content = content.replace('<think>', '')
            
            # Detect thinking block end
            if '</think>' in content and in_thinking:
                in_thinking = False
                # Keep only the text after the tag; anything before it is still thinking
                content = content.split('</think>', 1)[1]
                print(" Done thinking!")
                print("\n\n✅ Final Response:\n", end="", flush=True)
            
            # Show appropriate output
            if in_thinking:
                # Show progress dots during thinking
                print(".", end="", flush=True)
            else:
                # Show the actual response content
                print(content, end="", flush=True)
    
    print(f"\n\nResponse complete! Total length: {len(full_response)} characters")
    return full_response

# Run the example
result = stream_thinking_example()

In [None]:
def thinking_with_tools_example():
    """Demonstrate extended thinking with simulated tool use"""
    
    # Note: this example does not use native tool calling as the Claude version does.
    # Instead, it asks the model to reason through the calculations itself.
    
    response = client.chat(
        model=model_name,  # defined in the setup cell
        messages=[{
            "role": "user",
            "content": """I'm planning a party for 25 people. Each person will eat:
            - 3 slices of pizza (8 slices per pizza)
            - 2 sodas ($1.50 each)
            - 1 dessert ($3.00 each)
            
            Pizzas cost $12 each. Calculate the total cost and quantities needed.
            Please show your mathematical calculations step by step."""
        }],
        stream=False
    )
    
    print("🎉 Party Planning with Extended Thinking")
    print("=" * 60)
    
    full_response = response.get('message', {}).get('content', '')
    
    # Parse thinking and final answer
    if '<think>' in full_response and '</think>' in full_response:
        thinking_start = full_response.find('<think>') + 7
        thinking_end = full_response.find('</think>')
        thinking_content = full_response[thinking_start:thinking_end].strip()
        final_answer = full_response[thinking_end + 8:].strip()
        
        print("\n🤔 Planning Process:")
        print("-" * 40)
        # Show first 1000 characters of thinking
        print(thinking_content[:1000])
        if len(thinking_content) > 1000:
            print("...\n")
        
        print("\n📋 Final Plan:")
        print("-" * 40)
        display(Markdown(final_answer))
        
        # Analyze the thinking for mathematical patterns
        calc_keywords = ['calculate', 'multiply', 'add', 'total', 'cost', 'pizza', 'soda']
        found_calcs = [kw for kw in calc_keywords if kw.lower() in thinking_content.lower()]
        print(f"\n🔧 Mathematical concepts used: {found_calcs}")
        
    else:
        print("\n📋 Response:")
        display(Markdown(full_response))
    
    return full_response

# DeepSeek-R1 Token Guidelines (No explicit budget needed)
print("🎯 DeepSeek-R1 Guidelines")
print("=" * 50)
print("✅ No token budgets needed - thinking is automatic")
print("✅ Thinking depth scales with problem complexity")
print("✅ Average thinking: 2K-10K characters")
print("✅ Complex problems: 10K-50K+ characters")
print("✅ No additional cost for thinking tokens")

thinking_with_tools_example()
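
Because the model does the arithmetic itself in this example, it helps to have an independent check to compare its answer against. The sketch below recomputes the party budget from the numbers in the prompt. The function name `party_budget` is a hypothetical helper, and it assumes pizzas can only be bought whole.

```python
import math


def party_budget(guests=25, slices_each=3, slices_per_pizza=8, pizza_price=12.00,
                 sodas_each=2, soda_price=1.50, dessert_price=3.00):
    """Return quantities and costs for the party described in the prompt."""
    # 25 guests x 3 slices = 75 slices; round up to whole pizzas
    pizzas = math.ceil(guests * slices_each / slices_per_pizza)
    sodas = guests * sodas_each
    desserts = guests  # one dessert per guest
    costs = {
        "pizza": pizzas * pizza_price,
        "soda": sodas * soda_price,
        "dessert": desserts * dessert_price,
    }
    return {"pizzas": pizzas, "sodas": sodas, "desserts": desserts,
            "costs": costs, "total": sum(costs.values())}


plan = party_budget()
print(f"Pizzas: {plan['pizzas']}, sodas: {plan['sodas']}, desserts: {plan['desserts']}")
print(f"Total: ${plan['total']:.2f}")  # Total: $270.00
```

The expected result is 10 pizzas ($120), 50 sodas ($75), and 25 desserts ($75), for $270 in total. If the model's final plan differs, the thinking block usually shows where the calculation went wrong.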

<a id='best-practices'></a>
## 6. Best Practices

### 6.1 Writing Effective Prompts

In [None]:
def prompting_best_practices():
    """Demonstrate effective prompting strategies for DeepSeek-R1"""
    
    # Good prompt - clear, specific, structured
    good_prompt = """Analyze the following investment options and recommend the best choice:

Option A: Stock Portfolio
- Expected annual return: 8%
- Risk level: High
- Minimum investment: $10,000
- Liquidity: High (can sell anytime)

Option B: Real Estate
- Expected annual return: 6%
- Risk level: Medium
- Minimum investment: $50,000
- Liquidity: Low (takes months to sell)

Option C: Bonds
- Expected annual return: 4%
- Risk level: Low
- Minimum investment: $5,000
- Liquidity: Medium

Investor Profile:
- Age: 35
- Investment horizon: 15 years
- Risk tolerance: Medium
- Available capital: $75,000
- Goal: Retirement savings

Please provide:
1. Analysis of each option
2. Recommended allocation
3. Justification for your recommendation

Think through this systematically, considering risk-return tradeoffs, 
diversification principles, and the investor's specific situation."""
    
    response = client.chat(
        model=model_name,  # defined in the setup cell
        messages=[{"role": "user", "content": good_prompt}],
        stream=False
    )
    
    print("✅ Best Practices Example: Structured Investment Analysis")
    print("=" * 60)
    
    full_response = response.get('message', {}).get('content', '')
    
    # Parse and display thinking + answer
    if '<think>' in full_response and '</think>' in full_response:
        thinking_start = full_response.find('<think>') + 7
        thinking_end = full_response.find('</think>')
        thinking_content = full_response[thinking_start:thinking_end].strip()
        final_answer = full_response[thinking_end + 8:].strip()
        
        print("\n🧠 THINKING PROCESS:")
        print("=" * 40)
        # Show key parts of thinking
        preview = thinking_content[:800] + ("..." if len(thinking_content) > 800 else "")
        print(preview)
        
        print(f"\n💡 FINAL RECOMMENDATION:")
        print("=" * 40)
        display(Markdown(final_answer))
        
        # Analyze thinking quality
        analysis_keywords = ['risk', 'return', 'diversification', 'allocation', 'horizon', 'liquidity']
        found_concepts = [kw for kw in analysis_keywords if kw.lower() in thinking_content.lower()]
        print(f"\n📊 Investment concepts analyzed: {found_concepts}")
        
    else:
        display(Markdown(full_response))
    
    return full_response

def show_prompting_tips():
    """Show DeepSeek-R1 specific prompting tips"""
    
    print("\n🎯 DeepSeek-R1 Prompting Tips")
    print("=" * 50)
    
    tips = [
        {
            "tip": "Be Explicit About Thinking",
            "description": "Ask to 'think through systematically' or 'analyze step by step'",
            "example": "'Think through this problem step by step before giving your answer'"
        },
        {
            "tip": "Structure Your Requests", 
            "description": "Use numbered lists and clear sections",
            "example": "'Please provide: 1. Analysis 2. Recommendation 3. Justification'"
        },
        {
            "tip": "Provide Context",
            "description": "Give background information and constraints",
            "example": "'Consider the investor profile and 15-year time horizon'"
        },
        {
            "tip": "Ask for Reasoning",
            "description": "Request explanations of the thought process",
            "example": "'Explain your reasoning and show your calculations'"
        },
        {
            "tip": "Specify Output Format",
            "description": "Request specific formats or structures",
            "example": "'Provide a summary table and detailed analysis'"
        }
    ]
    
    for i, tip in enumerate(tips, 1):
        print(f"\n{i}. {tip['tip']}")
        print(f"   📝 {tip['description']}")
        print(f"   💡 Example: {tip['example']}")
    
    print(f"\n✨ Pro Tip: DeepSeek-R1 excels at mathematical reasoning,")
    print(f"   financial analysis, and systematic problem-solving!")

# Run the examples
result = prompting_best_practices()
show_prompting_tips()
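
The non-streaming examples above locate the tags with hard-coded offsets (`+ 7` and `+ 8`). A small helper based on `str.partition` could replace that repeated parsing. This is a minimal sketch: `split_thinking` is a hypothetical name, and it assumes at most one `<think>...</think>` block per response.

```python
def split_thinking(text, open_tag="<think>", close_tag="</think>"):
    """Return (thinking, answer); thinking is "" when no complete block exists."""
    _, found_open, rest = text.partition(open_tag)
    if not found_open:
        return "", text.strip()
    thinking, found_close, answer = rest.partition(close_tag)
    if not found_close:
        # Unterminated block: treat the whole response as the answer
        return "", text.strip()
    return thinking.strip(), answer.strip()


sample = "<think>\nCompare risk and return first.\n</think>\n\nRecommend a mixed allocation."
thinking, answer = split_thinking(sample)
print("Thinking:", thinking)
print("Answer:", answer)
```

Inside the functions above, this would be called as `thinking_content, final_answer = split_thinking(full_response)`, with an empty `thinking_content` signalling that no thinking block was found.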

In [None]:
def deepseek_cost_benefits():
    """Analyze DeepSeek-R1 cost benefits vs Claude thinking"""
    
    print("💰 DeepSeek-R1 vs Claude Thinking Cost Analysis")
    print("=" * 60)
    
    # Example token counts per scenario (local Ollama inference has no per-token fee)
    scenarios = [
        {"name": "Simple Analysis", "input": 500, "thinking": 5000, "output": 1000},
        {"name": "Complex Problem", "input": 2000, "thinking": 20000, "output": 3000},
        {"name": "Deep Research", "input": 5000, "thinking": 50000, "output": 8000}
    ]
    
    # Claude pricing (example - per million tokens)
    claude_pricing = {"input": 3, "output": 15}
    
    print(f"\n📊 Cost Comparison:")
    print("-" * 40)
    
    total_claude_cost = 0
    
    for scenario in scenarios:
        # Claude costs (thinking billed as output)
        claude_input_cost = (scenario["input"] / 1_000_000) * claude_pricing["input"]
        claude_thinking_cost = (scenario["thinking"] / 1_000_000) * claude_pricing["output"]
        claude_output_cost = (scenario["output"] / 1_000_000) * claude_pricing["output"]
        claude_total = claude_input_cost + claude_thinking_cost + claude_output_cost
        total_claude_cost += claude_total
        
        # DeepSeek-R1 costs (FREE with local Ollama!)
        deepseek_cost = 0.00
        
        print(f"\n  {scenario['name']}:")
        print(f"    Input: {scenario['input']:,} | Thinking: {scenario['thinking']:,} | Output: {scenario['output']:,}")
        print(f"    Claude cost: ${claude_total:.4f}")
        print(f"    DeepSeek-R1: ${deepseek_cost:.2f} (FREE!)")
        print(f"    Savings: ${claude_total:.4f}")
    
    print(f"\n💡 TOTAL SAVINGS: ${total_claude_cost:.4f} per cycle (one run of each scenario)")
    print(f"🎯 Monthly savings (100 cycles): ${total_claude_cost * 100:.2f}")
    print(f"🚀 Annual savings (1,200 cycles): ${total_claude_cost * 1200:.2f}")

def deepseek_advantages():
    """Show DeepSeek-R1 advantages"""
    
    print(f"\n⭐ DeepSeek-R1 Advantages")
    print("=" * 50)
    
    advantages = [
        {
            "category": "💰 Cost",
            "points": [
                "Completely FREE when run locally",
                "No token counting or budget management",
                "No API costs or rate limits",
                "One-time setup, unlimited usage"
            ]
        },
        {
            "category": "🧠 Thinking Quality", 
            "points": [
                "Deep reasoning automatically included",
                "Visible thought process in <think> tags",
                "Excellent at mathematical reasoning",
                "Strong logical problem-solving"
            ]
        },
        {
            "category": "🔒 Privacy",
            "points": [
                "Runs entirely on your hardware",
                "No data sent to external APIs",
                "Complete control over your data",
                "Enterprise-safe deployment"
            ]
        },
        {
            "category": "⚡ Performance",
            "points": [
                "No network latency (local inference)",
                "Consistent availability",
                "Customizable parameters",
                "GPU acceleration support"
            ]
        }
    ]
    
    for advantage in advantages:
        print(f"\n{advantage['category']}")
        for point in advantage['points']:
            print(f"  ✅ {point}")

def performance_comparison():
    """Compare performance characteristics"""
    
    print(f"\n📈 Performance Characteristics")
    print("=" * 50)
    
    comparison = [
        {"metric": "Cost per analysis", "claude": "$0.05-0.50", "deepseek": "$0.00"},
        {"metric": "Thinking transparency", "claude": "Summary only", "deepseek": "Full process"},
        {"metric": "Setup complexity", "claude": "API key only", "deepseek": "Local install"},
        {"metric": "Data privacy", "claude": "Cloud-based", "deepseek": "Local only"},
        {"metric": "Thinking budget", "claude": "Budget required", "deepseek": "No budget needed"},
        {"metric": "Availability", "claude": "Internet required", "deepseek": "Always local"}
    ]
    
    # Column widths fit the longest entry in each column so the rows line up
    print(f"{'Metric':<22} | {'Claude':<18} | {'DeepSeek-R1':<15}")
    print("-" * 61)
    
    for comp in comparison:
        print(f"{comp['metric']:<22} | {comp['claude']:<18} | {comp['deepseek']:<15}")

# Run all analyses
deepseek_cost_benefits()
deepseek_advantages() 
performance_comparison()
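
The cost arithmetic in `deepseek_cost_benefits()` can also be written as a small function that returns a value instead of printing it, which makes it easier to reuse or test. This is a sketch only: `claude_cost_usd` is a hypothetical name, and the default prices are the example per-million-token figures used in the cell above, not confirmed current pricing.

```python
def claude_cost_usd(input_tokens, thinking_tokens, output_tokens,
                    input_price=3.0, output_price=15.0):
    """Estimated cost in USD; prices are per million tokens (example values)."""
    billed_output = thinking_tokens + output_tokens  # thinking is billed as output
    return (input_tokens * input_price + billed_output * output_price) / 1_000_000


# Same token counts as the scenarios above: (input, thinking, output)
example_scenarios = {
    "Simple Analysis": (500, 5_000, 1_000),
    "Complex Problem": (2_000, 20_000, 3_000),
    "Deep Research": (5_000, 50_000, 8_000),
}

for name, tokens in example_scenarios.items():
    print(f"{name}: ${claude_cost_usd(*tokens):.4f}")
# Simple Analysis: $0.0915
# Complex Problem: $0.3510
# Deep Research: $0.8850
```

Thinking tokens dominate the total in every scenario. For the "Simple Analysis" case, $0.075 of the $0.0915 comes from the 5,000 thinking tokens.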