# Custom Models with Strands Agents

Strands Agents supports multiple model providers, giving you flexibility in choosing the right model for your use case. This notebook covers different model configurations and their trade-offs.

## What you'll learn
- Working with Amazon Bedrock models
- Configuring local models with Ollama
- Model selection criteria and performance considerations
- Cost trade-offs between cloud and local models

## Prerequisites
- Completed notebooks 01-02
- AWS credentials configured (for Bedrock)
- Ollama installed and running, with the `llama3` model pulled (optional, for local models)
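
Before running the local-model cells, it can help to confirm that an Ollama server is actually listening. The sketch below uses only the standard library and assumes the default address `http://localhost:11434` used later in this notebook; the helper name `ollama_is_running` is made up for illustration and is not part of Strands or Ollama.

```python
import urllib.error
import urllib.request


def ollama_is_running(host: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers at `host` (hypothetical helper)."""
    try:
        # /api/tags lists locally pulled models; a 200 response means the server is up.
        with urllib.request.urlopen(f"{host}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or malformed URL: treat all as "not available".
        return False
```

Calling `ollama_is_running()` at the top of the notebook lets you skip the Ollama cells early instead of waiting for an agent call to fail.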

## Setup and Configuration

In [None]:
!pip install -r requirements.txt

In [1]:
import logging
import os
import time

from strands import Agent
from strands.models import BedrockModel
from strands.models.ollama import OllamaModel

# Keep Strands log output quiet: show only warnings and errors
logging.getLogger("strands").setLevel(logging.WARNING)

# Optional: memory features read MEM0_API_KEY from the environment
# (get your free key from https://mem0.ai). The cells in this notebook do not use it.
MEM0_API_KEY = os.getenv("MEM0_API_KEY")

print("Custom models setup complete!")

Custom models setup complete!


## Model Configuration

Let's configure different models for comparison:

In [2]:
# Configure Bedrock model
bedrock_model = BedrockModel(
    model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
    region_name="us-west-2",
    temperature=0.3,  # Sampling randomness: higher values give more varied output
)

# Configure Ollama model (requires Ollama running locally with llama3 pulled)
ollama_model = OllamaModel(
    host="http://localhost:11434",
    model_id="llama3",
    temperature=0.3,  # Same temperature as Bedrock for a like-for-like comparison
)

print("Models configured successfully!")

Models configured successfully!
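
Hard-coding the model ID, region, and temperature works for a tutorial, but you may prefer to read them from the environment so the same notebook runs against different setups. The following is a minimal sketch of that idea. The variable names `MODEL_ID`, `MODEL_REGION`, and `MODEL_TEMPERATURE` and the helper `model_settings_from_env` are hypothetical, and the defaults simply mirror the values used above.

```python
import os


def model_settings_from_env(env=None) -> dict:
    """Build keyword arguments for a model constructor from environment variables.

    The variable names are illustrative assumptions; rename them to suit your setup.
    """
    env = os.environ if env is None else env
    return {
        "model_id": env.get("MODEL_ID", "us.anthropic.claude-sonnet-4-20250514-v1:0"),
        "region_name": env.get("MODEL_REGION", "us-west-2"),
        "temperature": float(env.get("MODEL_TEMPERATURE", "0.3")),
    }


# Pass a plain dict here so the example does not depend on your real environment.
settings = model_settings_from_env({"MODEL_TEMPERATURE": "0.7"})
print(settings)
```

In the notebook, the resulting dictionary could be unpacked into the constructor, for example `BedrockModel(**settings)`.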


## Creating Agents with Custom Models

Now let's create agents using our configured models:

In [3]:
# Create agents with different models
bedrock_agent = Agent(model=bedrock_model)
ollama_agent = Agent(model=ollama_model)

print("Agents created successfully!")

Agents created successfully!


## Performance Comparison

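Let's compare response latency for the two models on the same question. A single request is only a rough indicator, since network conditions and model warm-up both affect the timing: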

In [4]:
question = "Explain quantum computing in one sentence."
print(f"💬 Question: {question}")

# Test Bedrock model
print("\n⚡ Bedrock Model (Sonnet 4):")
start = time.time()
bedrock_response = bedrock_agent(question)
bedrock_time = time.time() - start
print(f"\nTime: {bedrock_time:.2f}s")  # Leading newline: the streamed answer does not end with one

💬 Question: Explain quantum computing in one sentence.

⚡ Bedrock Model (Sonnet 4):
Quantum computing harnesses the strange properties of quantum mechanics—like superposition (being in multiple states simultaneously) and entanglement—to process information in ways that could solve certain complex problems exponentially faster than classical computers.
Time: 2.63s


In [5]:
# Test Ollama model (only if Ollama is running)
try:
    print("\n🧠 Ollama Model (Llama 3):")
    start = time.time()
    ollama_response = ollama_agent(question)
    ollama_time = time.time() - start
    print(f"\nTime: {ollama_time:.2f}s")
except Exception as e:
    print(f"\n⚠️ Ollama not available: {e}")
    print("💡 To use Ollama: install it, start the server, and run 'ollama pull llama3'")
else:
    # Ratio above 1 means Ollama took longer than Bedrock for this question
    print(f"\n📊 Ollama took {ollama_time / bedrock_time:.1f}x as long as Bedrock")


🧠 Ollama Model (Llama 3):
Quantum computing is a type of computing that uses the principles of quantum mechanics, such as superposition and entanglement, to perform calculations on data that exists in multiple states simultaneously, allowing for exponentially faster processing times and new possibilities for solving complex problems.
Time: 6.41s

📊 Ollama took 2.4x as long as Bedrock
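
A single timed call is noisy, so one way to get a steadier number is to repeat the call and take the median. The sketch below shows this with a stand-in function instead of a real agent, so it runs without any model backend. `time_call` and `fake_agent` are illustrative names, not part of Strands.

```python
import statistics
import time
from typing import Any, Callable


def time_call(fn: Callable[..., Any], *args: Any, repeats: int = 3) -> tuple[Any, float]:
    """Call fn(*args) `repeats` times; return the last result and the median latency in seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()  # Monotonic clock, better suited to intervals than time.time()
        result = fn(*args)
        durations.append(time.perf_counter() - start)
    return result, statistics.median(durations)


def fake_agent(prompt: str) -> str:
    """Stand-in for an agent call (assumption: any callable taking a prompt works here)."""
    time.sleep(0.01)
    return f"echo: {prompt}"


answer, median_s = time_call(fake_agent, "hello", repeats=5)
print(f"{answer!r} in {median_s:.3f}s (median of 5)")
```

With the agents from this notebook you could try `time_call(bedrock_agent, question)`. Keep in mind that an agent may retain conversation history between calls, so repeated runs on the same agent are not strictly identical; creating a fresh agent per trial is one way around that.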


## Model Selection Guidelines

### When to use Bedrock (Cloud):
- **Production applications** requiring high reliability
- **Complex reasoning tasks** needing advanced capabilities
- **Scalable solutions** with automatic infrastructure management
- **Enterprise compliance** requirements

### When to use Ollama (Local):
- **Privacy-sensitive** applications
- **Offline environments** without internet access
- **Development and testing** to reduce costs
- **Custom model fine-tuning** requirements
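
The two lists above can be turned into a small decision helper. This is only a heuristic sketch: the `Requirements` fields and the `suggest_provider` function are invented for illustration, and real decisions usually weigh more factors than these four flags.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical flags summarizing what an application needs."""
    must_run_offline: bool = False
    data_must_stay_local: bool = False
    needs_complex_reasoning: bool = False
    many_concurrent_users: bool = False


def suggest_provider(req: Requirements) -> str:
    """Return 'ollama' or 'bedrock' following the guidelines above (a heuristic, not a rule)."""
    # Hard constraints first: no network access or local-only data rules out a cloud service.
    if req.must_run_offline or req.data_must_stay_local:
        return "ollama"
    # Otherwise, capability and scale favor the managed service.
    if req.needs_complex_reasoning or req.many_concurrent_users:
        return "bedrock"
    # No strong requirement: default to local to keep development costs down.
    return "ollama"


print(suggest_provider(Requirements(needs_complex_reasoning=True)))
print(suggest_provider(Requirements(data_must_stay_local=True, needs_complex_reasoning=True)))
```

Privacy and offline constraints are checked first because they are hard requirements, while reasoning quality and scale are preferences that only matter once a cloud service is an option at all.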

### Performance Considerations:
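- **Latency**: Local models avoid network round trips, but their speed depends on your hardware; in the run above, the local Llama 3 model took about 2.4x as long as Bedrock
- **Quality**: Large cloud-hosted models are generally more capable than models small enough to run locally
- **Cost**: Local models have no per-request charge but need hardware and power; cloud models are billed per use
- **Scalability**: Cloud models handle concurrent users better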
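
To make the pay-per-use point concrete, here is a back-of-the-envelope estimate of monthly cloud cost from token counts. The prices in the example call are placeholders chosen for illustration, not actual Bedrock rates; check the current pricing page for the model you use. The function name `estimate_cloud_cost` is likewise hypothetical.

```python
def estimate_cloud_cost(
    requests_per_day: int,
    input_tokens: int,
    output_tokens: int,
    price_in_per_1k: float,
    price_out_per_1k: float,
    days: int = 30,
) -> float:
    """Estimate the cost of a token-priced model over `days` days.

    Input and output tokens are priced separately, per 1,000 tokens.
    """
    per_request = (
        (input_tokens / 1000) * price_in_per_1k
        + (output_tokens / 1000) * price_out_per_1k
    )
    return per_request * requests_per_day * days


# Placeholder prices (assumptions, not real rates): 0.003 per 1k input, 0.015 per 1k output.
monthly = estimate_cloud_cost(
    requests_per_day=1000,
    input_tokens=200,
    output_tokens=100,
    price_in_per_1k=0.003,
    price_out_per_1k=0.015,
)
print(f"Estimated monthly cost: ${monthly:.2f}")
```

Comparing this figure with the cost of hardware able to serve the same traffic locally gives a rough break-even point between the two options.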

## Advanced Model Configuration

You can adjust model behavior with additional sampling parameters (this tunes generation settings, not the model weights):

In [None]:
# Advanced Bedrock configuration
advanced_bedrock = BedrockModel(
    model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
    region_name="us-west-2",
    temperature=0.7,  # Higher than before: more varied, creative output
    max_tokens=1000,  # Upper limit on the length of the generated response
    top_p=0.9,        # Nucleus sampling: sample from the top 90% of probability mass
)

# Create agent with advanced configuration
creative_agent = Agent(
    model=advanced_bedrock,
    system_prompt="You are a creative assistant. Be imaginative and detailed."
)

# Test creative response
creative_response = creative_agent("Write a creative story about AI in 3 sentences.")
print("🎨 Creative Agent Response:")
print(creative_response)
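
To build intuition for what `temperature` and `top_p` do, the toy example below applies both to a made-up set of token scores. It is a textbook illustration in plain Python, not the sampler that Bedrock or Ollama actually runs, and the logits are invented for the demonstration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # Subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose total mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


logits = [2.0, 1.0, 0.5, -1.0]  # Toy scores for four candidate tokens
for t in (0.3, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")

nucleus = top_p_filter(softmax_with_temperature(logits, 0.7), 0.9)
print(f"after top_p=0.9: {[round(p, 3) for p in nucleus]}")
```

At low temperature almost all probability lands on the top-scoring token, while higher values spread it out. The `top_p` step then drops the unlikely tail entirely, which limits how far a high temperature can wander.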

## Summary

You've learned how to:

✅ **Configure custom models** with BedrockModel and OllamaModel  
✅ **Compare performance** between cloud and local models  
✅ **Optimize model parameters** for different use cases  
✅ **Choose the right model** based on requirements  

### Next Steps:
- Experiment with different temperature settings
- Try other Ollama models (llama3.2:1b, llama3.2:3b)
- Test with your specific use cases
- Consider cost vs. performance trade-offs

**Continue to notebook 04 to learn about memory-enabled agents! 🧠**