# Mistral Models on Amazon Bedrock

This notebook provides a comprehensive guide to using Mistral AI models on Amazon Bedrock.

## Available Models

| Model | Context | Input Price | Output Price | Best For |
|-------|---------|-------------|--------------|----------|
| Mistral 7B | 32K | $0.00015/1K | $0.0002/1K | Simple tasks |
| Mixtral 8x7B | 32K | $0.00045/1K | $0.0007/1K | General purpose |
| Mistral Large 2 | 128K | $0.003/1K | $0.009/1K | Complex tasks |
| Pixtral Large | 128K | $0.003/1K | $0.009/1K | Vision + text |
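
To compare models on the same workload, the per-1K prices in the table can be turned into a cost estimate. The following is a minimal sketch: `PRICES_PER_1K` and `estimate_cost` are illustrative names, and the prices are copied from the table rather than fetched from AWS, so check the Bedrock pricing page before relying on them.

```python
# Prices per 1K tokens (input, output), copied from the table above.
# These are not fetched from AWS and may be out of date.
PRICES_PER_1K = {
    "Mistral 7B": (0.00015, 0.0002),
    "Mixtral 8x7B": (0.00045, 0.0007),
    "Mistral Large 2": (0.003, 0.009),
    "Pixtral Large": (0.003, 0.009),
}


def estimate_cost(model_name, input_tokens, output_tokens):
    """Return the estimated USD cost of one workload on one model."""
    input_price, output_price = PRICES_PER_1K[model_name]
    return (input_tokens / 1000) * input_price + (output_tokens / 1000) * output_price


# Reference workload used throughout this notebook: 1M input + 500K output tokens.
for name in PRICES_PER_1K:
    print(f"{name:<16} ${estimate_cost(name, 1_000_000, 500_000):.2f}")
```

The reference workload of 1M input plus 500K output tokens is the one behind the per-model cost figures quoted in the sections below.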

## Setup

In [None]:
import boto3
import json

bedrock_runtime = boto3.client('bedrock-runtime', region_name='us-east-1')
print("✅ Bedrock client initialized")

## Check Available Models

**⚠️ Run this cell first** to see which Mistral models are available in your account.

In [None]:
bedrock = boto3.client('bedrock', region_name='us-east-1')

print("Checking available Mistral models...\n")
print("=" * 80)

try:
    response = bedrock.list_foundation_models(byProvider='Mistral AI')
    
    print(f"\n{'Model ID':<50} {'Status'}")
    print("-" * 80)
    
    for model in response['modelSummaries']:
        model_id = model['modelId']
        status = model.get('modelLifecycle', {}).get('status', 'ACTIVE')
        print(f"{model_id:<50} {status}")
    
    print("\n" + "=" * 80)
    print("\n✅ Copy the exact model IDs above to use in the cells below.")

except Exception as e:
    print(f"❌ Error: {e}")
    print("\nMake sure you have:")
    print("1. Enabled model access in Bedrock console")
    print("2. Proper IAM permissions")
    print("3. Selected the correct region (us-east-1)")
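
If only currently active models are of interest, the response can be filtered on the same `modelLifecycle` field the cell above prints. This is a small sketch; `active_model_ids` is an illustrative helper and the sample response is hand-written, with one made-up model ID to show the filtering.

```python
def active_model_ids(response):
    """Return model IDs whose lifecycle status is ACTIVE (a missing status counts as ACTIVE)."""
    return [
        model["modelId"]
        for model in response["modelSummaries"]
        if model.get("modelLifecycle", {}).get("status", "ACTIVE") == "ACTIVE"
    ]


# Hand-written sample shaped like a list_foundation_models response.
# The second model ID is invented purely to illustrate a non-active entry.
sample = {
    "modelSummaries": [
        {"modelId": "mistral.mistral-7b-instruct-v0:2", "modelLifecycle": {"status": "ACTIVE"}},
        {"modelId": "mistral.example-retired-model", "modelLifecycle": {"status": "LEGACY"}},
    ]
}
print(active_model_ids(sample))
```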

## Helper Functions

In [None]:
def invoke_model(model_id, prompt, max_tokens=1000, temperature=0.7):
    """Invoke a Mistral model on Bedrock."""
    response = bedrock_runtime.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": max_tokens, "temperature": temperature}
    )
    return response['output']['message']['content'][0]['text']

print("✅ Helper functions defined")
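
Because cost depends on token counts, it can help to read the `usage` block that the Converse API returns next to the generated text. The sketch below assumes the standard Converse response shape; `summarize_response` is an illustrative name and the sample response is hand-written with made-up values.

```python
def summarize_response(response):
    """Pull the reply text, token counts, and stop reason out of a Converse response dict."""
    text = response["output"]["message"]["content"][0]["text"]
    usage = response.get("usage", {})
    return {
        "text": text,
        "input_tokens": usage.get("inputTokens", 0),
        "output_tokens": usage.get("outputTokens", 0),
        "stop_reason": response.get("stopReason"),
    }


# Hand-written sample shaped like a Converse response (values are made up).
sample_response = {
    "output": {"message": {"role": "assistant", "content": [{"text": "TECHNICAL"}]}},
    "stopReason": "end_turn",
    "usage": {"inputTokens": 120, "outputTokens": 3, "totalTokens": 123},
}
print(summarize_response(sample_response))
```

A `stop_reason` of `max_tokens` indicates the reply was cut off by the `maxTokens` limit, which is worth checking when that limit is set very low.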

## Part 1: Mistral 7B Instruct

### Overview
- **Size**: 7 billion parameters
- **Context**: 32K tokens
- **Model ID**: `mistral.mistral-7b-instruct-v0:2`
- **Best for**: Simple classification, high-throughput tasks
- **Cost**: $0.25 for 1M input + 500K output tokens ($0.15 input + $0.10 output)

### Why Choose Mistral 7B?

**Speed & Cost**: At the list prices above, Mistral 7B is about 30x cheaper than Mistral Large 2 ($0.25 vs. $7.50 for the same workload) and responds quickly enough for real-time use. It is a good fit for:
- **High-volume applications**: Processing thousands of requests per minute
- **Real-time systems**: Chatbots, live classification, instant responses
- **Budget-conscious projects**: When cost per request matters

**When NOT to use**: Complex reasoning, long documents, multilingual beyond 5 languages, advanced code generation

### Example 1: Customer Support Ticket Classification

**Scenario**: You receive 10,000 support tickets daily and need to route them to the right team instantly.

**Why 7B?**
- Fast enough for real-time routing (<500ms)
- Simple classification task doesn't need larger models
- Cost: $2.50/day vs $75/day with Mistral Large 2 (assuming about 1,000 input and 500 output tokens per ticket)
- Accuracy: 95%+ for well-defined categories

In [None]:
MODEL_7B = "mistral.mistral-7b-instruct-v0:2"

# Sample support ticket
ticket = """
Subject: Can't log into my account
Message: I've tried resetting my password 3 times but still can't access my account. 
The reset email arrives but the link says 'expired' even though I click it immediately.
"""

prompt = f"""
Classify this support ticket into ONE category:
- BILLING: Payment, invoices, refunds
- TECHNICAL: Login, bugs, errors
- ACCOUNT: Profile, settings, access
- PRODUCT: Features, how-to questions

Ticket: {ticket}

Category:
"""

result = invoke_model(MODEL_7B, prompt, max_tokens=10, temperature=0.1)
print(f"Classification: {result}")
print(f"\n💡 Why 7B works here: Simple, fast classification with clear categories.")
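
Even at low temperature, the model may return extra whitespace or a short explanation around the label, so a routing system usually normalizes the reply before acting on it. The following is one possible sketch; `parse_category` and the `UNKNOWN` fallback are illustrative choices, not part of the Bedrock API.

```python
CATEGORIES = ("BILLING", "TECHNICAL", "ACCOUNT", "PRODUCT")


def parse_category(model_output, default="UNKNOWN"):
    """Return the known category that appears earliest in the model output, or the default."""
    upper = model_output.upper()
    hits = [(upper.find(label), label) for label in CATEGORIES if label in upper]
    return min(hits)[1] if hits else default


print(parse_category(" TECHNICAL: the reset link is expiring"))
print(parse_category("I am not sure"))
```

Tickets that come back as `UNKNOWN` can be sent to a human queue or retried with a larger model.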

### Example 2: Product Description Generation at Scale

**Scenario**: E-commerce platform needs to generate 50,000 product descriptions from specifications.

**Why 7B?**
- Generates consistent, template-based content quickly
- Cost: $12.50 for 50K descriptions vs $375 with Mistral Large 2 (assuming about 1,000 input and 500 output tokens per description)
- Quality: Good enough for standard product descriptions
- Speed: Can process entire catalog in hours, not days

In [None]:
# Product specifications
product_specs = {
    "name": "UltraGrip Wireless Mouse",
    "features": ["Ergonomic design", "2400 DPI", "6 programmable buttons", "30-hour battery"],
    "price": "$29.99",
    "target": "Gamers and professionals"
}

prompt = f"""
Write a compelling 50-word product description:

Product: {product_specs['name']}
Features: {', '.join(product_specs['features'])}
Price: {product_specs['price']}
Target: {product_specs['target']}

Description:
"""

result = invoke_model(MODEL_7B, prompt, max_tokens=100, temperature=0.7)
print(f"Generated Description:\n{result}")
print(f"\n💡 Why 7B works here: Template-based generation, high volume, cost-effective.")

## Part 2: Mixtral 8x7B Instruct

### Overview
- **Architecture**: Mixture-of-Experts (8 experts × 7B)
- **Context**: 32K tokens
- **Model ID**: `mistral.mixtral-8x7b-instruct-v0:1`
- **Best for**: Multilingual, general-purpose tasks
- **Cost**: $0.80 for 1M input + 500K output tokens ($0.45 input + $0.35 output)

### Why Choose Mixtral 8x7B?

The "Swiss Army Knife" of models - handles 80% of use cases well:
- **Multilingual**: Excellent across 11+ languages (French, Spanish, German, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, Korean)
- **Balanced**: Near-large-model quality at 1/10th the cost
- **Versatile**: Code, translation, extraction, reasoning all work well
- **MoE Architecture**: Only activates 2 of 8 experts per token = efficient
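
The "2 of 8 experts per token" idea can be illustrated with a toy router. This is a simplified sketch of top-2 gating in plain Python, not Mixtral's actual implementation: the router scores are made up, and real models apply this per layer to learned logits.

```python
import math


def top2_route(router_logits):
    """Pick the two highest-scoring experts and softmax-normalize their weights."""
    ranked = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:2]
    # Softmax over the two selected logits only, so the two weights sum to 1.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(expert, weight / total) for expert, weight in zip(chosen, exps)]


# Made-up router scores for one token across 8 experts.
logits = [0.1, 2.0, -1.0, 0.5, 2.0, 0.0, -0.3, 1.2]
for expert, weight in top2_route(logits):
    print(f"expert {expert}: weight {weight:.2f}")
```

Only the two selected experts run for that token, which is why the compute per token is much lower than the total parameter count suggests.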

**When to use over 7B**: Multilingual needs, moderate complexity, code generation, better reasoning

**When to use Large instead**: Very complex reasoning, 100K+ token documents, mission-critical accuracy

### Example 1: Multilingual Customer Communication

**Scenario**: Global e-commerce platform needs to respond to customers in their native language across Europe and Asia.

**Why Mixtral?**
- Native support for 11+ languages (7B only does 5 well)
- Maintains context across languages
- Cost: $0.80 vs $7.50 with Mistral Large 2 (for 1M input + 500K output tokens)
- Quality: Near-native fluency in supported languages

In [None]:
MODEL_MIXTRAL = "mistral.mixtral-8x7b-instruct-v0:1"

customer_message = """
Customer (French): Bonjour, je voudrais retourner ma commande car le produit ne correspond pas à la description.
Order ID: #FR-2024-5891
"""

prompt = f"""
You are a customer service agent. Respond to this customer in their language (French).

{customer_message}

Tasks:
1. Acknowledge their concern
2. Explain the return process
3. Provide next steps
4. Be empathetic and professional

Response:
"""

result = invoke_model(MODEL_MIXTRAL, prompt, max_tokens=300, temperature=0.5)
print(f"Response:\n{result}")
print(f"\n💡 Why Mixtral: Native French support, maintains professional tone, understands context.")

### Example 2: Code Generation with Explanation

**Scenario**: Developer needs a Python function with documentation and error handling.

**Why Mixtral?**
- Better code quality than 7B
- Can explain code logic
- Handles multiple programming languages
- 10x cheaper than Large for routine code tasks

In [None]:
prompt = """
Create a Python function that:
1. Connects to a PostgreSQL database
2. Executes a parameterized query safely (prevent SQL injection)
3. Returns results as a list of dictionaries
4. Includes proper error handling and logging
5. Has comprehensive docstring

Include example usage.
"""

result = invoke_model(MODEL_MIXTRAL, prompt, max_tokens=800, temperature=0.2)
print(f"Generated Code:\n{result}")
print(f"\n💡 Why Mixtral: Balances code quality with cost. Good enough for most functions.")

## Part 3: Mistral Large 2 (24.07)

### Overview
- **Size**: 123 billion parameters
- **Context**: 128K tokens
- **Model ID**: `mistral.mistral-large-2407-v1:0` or `us.mistral.mistral-large-2407-v1:0`
- **Best for**: Complex reasoning, long documents
- **Cost**: $7.50 for 1M input + 500K output tokens ($3.00 input + $4.50 output)

### Why Choose Mistral Large 2?

The "Expert" model - when accuracy and capability matter most:
- **128K context**: Analyze entire codebases, long documents, books
- **Advanced reasoning**: Multi-step logic, complex problem-solving
- **Best-in-class**: Competes with GPT-4 and Claude on benchmarks
- **Function calling**: Native tool use for agent systems
- **JSON mode**: Guaranteed valid structured output
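
For function calling through the Converse API, tools are described in a `toolConfig` argument and the model's requests come back as `toolUse` content blocks. The sketch below shows the general shape under stated assumptions: the `get_exchange_rate` tool is hypothetical, and `build_tool_config` and `find_tool_requests` are illustrative helper names.

```python
def build_tool_config():
    """Build a Converse API toolConfig describing one hypothetical tool."""
    return {
        "tools": [
            {
                "toolSpec": {
                    "name": "get_exchange_rate",  # hypothetical tool, for illustration only
                    "description": "Look up the exchange rate between two currencies.",
                    "inputSchema": {
                        "json": {
                            "type": "object",
                            "properties": {
                                "base": {"type": "string"},
                                "quote": {"type": "string"},
                            },
                            "required": ["base", "quote"],
                        }
                    },
                }
            }
        ]
    }


def find_tool_requests(response):
    """Return the toolUse blocks (if any) from a Converse response dict."""
    content = response["output"]["message"]["content"]
    return [block["toolUse"] for block in content if "toolUse" in block]
```

The config would be passed as `toolConfig=build_tool_config()` in the `converse` call. When the response's `stopReason` is `tool_use`, each returned block carries a `toolUseId`, the tool `name`, and the model-generated `input`.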

**When to use**: Complex analysis, long documents, mission-critical tasks, advanced coding, research

**Cost justification**: 30x more expensive than 7B, but saves hours of human time on complex tasks

### Example: Complex Multi-Step Problem Solving

**Scenario**: Financial analyst needs to calculate ROI across multiple scenarios with dependencies.

**Why Large 2?**
- Multi-step reasoning with intermediate calculations
- Can handle complex business logic
- Explains reasoning process
- Accuracy critical for financial decisions

In [None]:
# Try Mistral Large 2 (24.07) model IDs first, then fall back to Mistral Large (24.02) IDs
prompt = """
A company is evaluating 3 investment options:

Option A: Cloud Migration
- Upfront cost: $500,000
- Annual savings: $150,000
- Implementation time: 6 months
- Risk: Medium (20% chance of 3-month delay)

Option B: AI Automation
- Upfront cost: $300,000
- Annual savings: $100,000
- Additional revenue: $50,000/year
- Implementation time: 9 months
- Risk: High (30% chance of 50% cost overrun)

Option C: Process Optimization
- Upfront cost: $150,000
- Annual savings: $80,000
- Implementation time: 3 months
- Risk: Low (5% chance of minor delays)

Calculate for each option:
1. Break-even point
2. 5-year NPV (discount rate: 8%)
3. Risk-adjusted ROI
4. Recommend best option with reasoning

Show all calculations step-by-step.
"""

model_ids_to_try = [
    "mistral.mistral-large-2407-v1:0",
    "us.mistral.mistral-large-2407-v1:0",
    "mistral.mistral-large-2402-v1:0",
    "us.mistral.mistral-large-2402-v1:0"
]

success = False
for model_id in model_ids_to_try:
    try:
        print(f"Trying: {model_id}...")
        result = invoke_model(model_id, prompt, max_tokens=1500, temperature=0.2)
        print(f"\n✅ Success with: {model_id}\n")
        print(f"Financial Analysis:\n{result}")
        print(f"\n💡 Why Large 2: Complex calculations, multi-step reasoning, business-critical decision.")
        success = True
        break
    except Exception as e:
        print(f"❌ Failed: {str(e)[:80]}...\n")

if not success:
    print("⚠️  None of the model IDs worked. Please run the 'Check Available Models' cell above.")
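
Since accuracy matters for financial output, it is worth checking the model's figures against a direct calculation. The sketch below computes break-even and 5-year NPV for the three options in the prompt under simplifying assumptions: benefits arrive as equal amounts at the end of each year, and implementation delays and risk adjustments are ignored. The function and dictionary names are illustrative.

```python
# Figures taken from the prompt above; option B's benefit is savings plus added revenue.
OPTIONS = {
    "A: Cloud Migration": {"upfront": 500_000, "annual_benefit": 150_000},
    "B: AI Automation": {"upfront": 300_000, "annual_benefit": 100_000 + 50_000},
    "C: Process Optimization": {"upfront": 150_000, "annual_benefit": 80_000},
}


def break_even_years(upfront, annual_benefit):
    """Years of benefit needed to recover the upfront cost (no discounting)."""
    return upfront / annual_benefit


def npv(upfront, annual_benefit, rate=0.08, years=5):
    """NPV with the upfront cost at year 0 and equal benefits at the end of years 1..N."""
    discounted = sum(annual_benefit / (1 + rate) ** t for t in range(1, years + 1))
    return discounted - upfront


for name, option in OPTIONS.items():
    print(f"{name:<24} break-even {break_even_years(**option):.2f} yr, "
          f"5-year NPV ${npv(**option):,.0f}")
```

These baseline numbers do not replace the model's risk-adjusted analysis, but a large disagreement with them is a sign that the generated calculation needs review.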

## Part 4: Pixtral Large (25.02)

### Overview
- **Modality**: Text + Vision
- **Context**: 128K tokens
- **Inference Profile**: `us.mistral.pixtral-large-2502-v1:0` (required)
- **Best for**: Document understanding, image analysis
- **Cost**: $7.50 for 1M input + 500K output tokens, plus image tokens

### Why Choose Pixtral Large?

The "Vision Expert" - when you need to understand images AND text:
- **Multimodal**: Processes images, charts, diagrams, documents
- **OCR + Understanding**: Not just text extraction, but comprehension
- **Document AI**: Invoices, receipts, forms, contracts with layout
- **Visual reasoning**: Analyze charts, compare images, UI/UX review
- **Same price as Large 2** for text, images counted as tokens

**When to use**: Any task involving images, document processing, visual Q&A, chart analysis

**vs OCR-only**: Pixtral understands context and relationships, not just text extraction

### Example: Invoice Processing with Validation

**Scenario**: Accounts payable team processes 1000 invoices/day. Need to extract data AND validate for errors.

**Why Pixtral?**
- Understands invoice layout and structure
- Extracts data from tables accurately
- Can spot anomalies (duplicate line items, calculation errors)
- Handles poor quality scans
- One model for extraction + validation
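
The example cell in this section sends the prompt as text only. To have Pixtral actually look at an invoice, the image must be attached as an `image` content block in the Converse message. The following is a minimal sketch; `build_image_message` and `invoke_with_image` are illustrative helpers, and the file path in the usage note is hypothetical.

```python
def build_image_message(prompt, image_bytes, image_format="png"):
    """Build a Converse API user message containing one image plus a text prompt."""
    return {
        "role": "user",
        "content": [
            {"image": {"format": image_format, "source": {"bytes": image_bytes}}},
            {"text": prompt},
        ],
    }


def invoke_with_image(client, model_id, prompt, image_bytes, image_format="png",
                      max_tokens=800, temperature=0.1):
    """Send an image and a prompt through the given bedrock-runtime client."""
    response = client.converse(
        modelId=model_id,
        messages=[build_image_message(prompt, image_bytes, image_format)],
        inferenceConfig={"maxTokens": max_tokens, "temperature": temperature},
    )
    return response["output"]["message"]["content"][0]["text"]
```

Usage would look like reading the file with `open("invoice.png", "rb").read()` and passing the bytes, the Pixtral inference profile ID, and the prompt to `invoke_with_image(bedrock_runtime, ...)`. The Converse API expects raw bytes here, not a base64 string.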

In [None]:
# Pixtral Large - requires inference profile
# Note: this cell sends the prompt as text only; no invoice image is attached.
prompt = """
Analyze this invoice and:

1. Extract structured data:
   - Invoice number, date, due date
   - Vendor details
   - Line items (description, quantity, unit price, total)
   - Subtotal, tax, total

2. Validate:
   - Do line item calculations match?
   - Does subtotal + tax = total?
   - Any duplicate line items?
   - Any unusual amounts or patterns?

3. Flag issues:
   - Missing required fields
   - Calculation errors
   - Potential duplicates
   - Amounts over $10,000 (require approval)

Return as JSON with validation_status and issues array.
"""

model_ids_to_try = [
    "us.mistral.pixtral-large-2502-v1:0",
    "mistral.pixtral-large-2502-v1:0"
]

success = False
for model_id in model_ids_to_try:
    try:
        print(f"Trying: {model_id}...")
        result = invoke_model(model_id, prompt, max_tokens=800, temperature=0.1)
        print(f"\n✅ Success with: {model_id}\n")
        print(f"Invoice Analysis:\n{result}")
        print(f"\n💡 Why Pixtral: Understands layout, validates logic, spots errors - not just OCR.")
        success = True
        break
    except Exception as e:
        print(f"❌ Failed: {str(e)[:80]}...\n")

if not success:
    print("⚠️  Pixtral Large is not available.")
    print("Note: Pixtral requires model access to be enabled in Bedrock console.")
    print("Go to: AWS Console → Bedrock → Model access → Request access")
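
The arithmetic checks the prompt asks for can also be repeated in code once the model returns its JSON, which guards against a confident but wrong validation result. This sketch assumes a hypothetical schema (`line_items`, `quantity`, `unit_price`, `total`, `subtotal`, `tax`); the real field names depend on what the model returns for your prompt.

```python
import json


def extract_json(model_output):
    """Parse the text between the first '{' and the last '}' as JSON."""
    start = model_output.find("{")
    end = model_output.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    return json.loads(model_output[start:end + 1])


def check_invoice_totals(invoice, tolerance=0.01):
    """Recompute line totals and the grand total; return a list of issue strings."""
    issues = []
    for item in invoice["line_items"]:
        expected = item["quantity"] * item["unit_price"]
        if abs(expected - item["total"]) > tolerance:
            issues.append(f"line total mismatch: {item['description']}")
    if abs(invoice["subtotal"] + invoice["tax"] - invoice["total"]) > tolerance:
        issues.append("subtotal + tax does not equal total")
    return issues
```

`extract_json` tolerates replies that wrap the JSON in a Markdown code fence or add a sentence before it, which is common with chat-tuned models.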

## Model Selection Guide

### Decision Tree
```
Need images? → Pixtral Large
Need long context (>32K)? → Mistral Large 2
Complex reasoning? → Mistral Large 2
Multilingual? → Mixtral 8x7B
Speed/cost critical? → Mistral 7B
Default → Mixtral 8x7B
```
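
The decision tree reads top to bottom, with the first matching rule winning. Expressed as code it might look like the sketch below; `choose_model` and its parameter names are illustrative, and the 32K threshold is the one stated in the tree.

```python
def choose_model(needs_images=False, context_tokens=0, complex_reasoning=False,
                 multilingual=False, speed_or_cost_critical=False):
    """Walk the decision tree top to bottom and return the first matching model."""
    if needs_images:
        return "Pixtral Large"
    if context_tokens > 32_000 or complex_reasoning:
        return "Mistral Large 2"
    if multilingual:
        return "Mixtral 8x7B"
    if speed_or_cost_critical:
        return "Mistral 7B"
    return "Mixtral 8x7B"  # default


print(choose_model(speed_or_cost_critical=True))
print(choose_model(context_tokens=90_000, multilingual=True))
```

Because rule order matters, a long multilingual document goes to Mistral Large 2 rather than Mixtral, as in the second call.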

### Cost Comparison
Processing 1M input + 500K output tokens:
- Mistral 7B: $0.25
- Mixtral 8x7B: $0.80
- Mistral Large 2: $7.50
- Pixtral Large: $7.50

## Parameter Tuning

### Temperature
- **0.0-0.3**: Deterministic (classification, extraction)
- **0.4-0.7**: Balanced (general tasks)
- **0.8-1.0**: Creative (content generation)

### Top-P
- **0.9**: Recommended default

### Max Tokens
- Classification: 10-50
- Summaries: 200-500
- Long-form: 1000-2000
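
These settings map directly onto the `inferenceConfig` argument of the Converse API, which accepts `temperature`, `topP`, and `maxTokens`. The presets below are one possible reading of the ranges in this section; the task names and the exact values chosen within each range are illustrative.

```python
# Presets drawn from the ranges listed above; the specific values are illustrative picks.
PRESETS = {
    "classification": {"temperature": 0.1, "maxTokens": 50},
    "summary": {"temperature": 0.5, "maxTokens": 500},
    "long_form": {"temperature": 0.9, "maxTokens": 2000},
}


def inference_config(task, top_p=0.9):
    """Return a Converse API inferenceConfig dict for the given task type."""
    return {**PRESETS[task], "topP": top_p}


print(inference_config("classification"))
```

The returned dictionary can be passed as `inferenceConfig=inference_config("summary")` in a `converse` call.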

## Model Selection Decision Framework

### Quick Decision Tree

```
START: What's your task?

├─ Processing images/documents with layout?
│  └─ YES → Pixtral Large ($7.50*)
│
├─ Document over 32K tokens (>24K words)?
│  └─ YES → Mistral Large 2 ($7.50*)
│
├─ Complex multi-step reasoning needed?
│  └─ YES → Mistral Large 2 ($7.50*)
│
├─ Need multilingual support (>5 languages)?
│  └─ YES → Mixtral 8x7B ($0.80*)
│
├─ Code generation (moderate complexity)?
│  └─ YES → Mixtral 8x7B ($0.80*)
│
├─ High volume (>10K requests/day)?
│  └─ YES → Mistral 7B ($0.25*)
│
└─ DEFAULT → Mixtral 8x7B (Best balance)

* Cost for 1M input + 500K output tokens
```

### Real-World Cost Comparison

**Scenario: Customer Support System (10,000 tickets/day)**

| Model | Daily Cost | Speed | Accuracy | Best For |
|-------|------------|-------|----------|----------|
| Mistral 7B | $2.50 | <500ms | 95% | Simple routing |
| Mixtral 8x7B | $8.00 | ~1s | 98% | Complex categorization |
| Mistral Large 2 | $75.00 | ~2s | 99%+ | High-value customers |

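**Recommendation**: Use 7B for initial routing and escalate the 10% most complex cases to Mixtral. At the daily costs above, that comes to about $3.30/day versus $75/day, saving roughly $2,150/month compared with using Large for everything.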
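
The savings from tiered routing follow from the daily costs in the table. The sketch below works through the arithmetic, assuming every ticket goes through 7B, the escalated share is processed a second time by Mixtral, and a month has 30 days; the function names are illustrative.

```python
# Daily costs for 10,000 tickets/day, copied from the table above.
DAILY_COST = {"Mistral 7B": 2.50, "Mixtral 8x7B": 8.00, "Mistral Large 2": 75.00}


def tiered_daily_cost(escalation_rate=0.10):
    """7B handles every ticket; the escalated fraction is processed again by Mixtral."""
    return DAILY_COST["Mistral 7B"] + escalation_rate * DAILY_COST["Mixtral 8x7B"]


def monthly_savings(escalation_rate=0.10, days=30):
    """Savings versus sending every ticket to Mistral Large 2."""
    return (DAILY_COST["Mistral Large 2"] - tiered_daily_cost(escalation_rate)) * days


print(f"Tiered cost per day: ${tiered_daily_cost():.2f}")
print(f"Monthly savings vs Large 2: ${monthly_savings():,.2f}")
```

Raising `escalation_rate` shows how sensitive the savings are to routing accuracy: even escalating half of all tickets keeps the tiered cost well below the Large-only cost.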

## Resources

- [Mistral AI Documentation](https://docs.mistral.ai/)
- [Mistral on AWS](https://docs.mistral.ai/deployment/cloud/aws)
- [AWS Bedrock Documentation](https://docs.aws.amazon.com/bedrock/)
- [Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/)