# Module 11.2: Tenant-Specific Customization

**Database-backed multi-tenant configuration management for RAG pipelines**

This notebook demonstrates configuration-driven customization where each tenant can:
- Select preferred LLM models (GPT-4, GPT-3.5, Claude variants)
- Configure retrieval parameters (top_k, alpha, reranking)
- Define custom prompt templates with safe variable injection
- Set resource limits and branding preferences

**Key Pattern**: Database-backed configuration with Redis caching eliminates hardcoded tenant-specific if-statements that don't scale beyond 5-10 tenants.
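
To make the contrast concrete, here is a minimal sketch of hardcoded branching versus a configuration lookup. The dictionary stands in for the database table, and the names `TENANT_CONFIGS`, `pick_model_hardcoded`, and `pick_model_from_config` are hypothetical, not part of the course module.

```python
# Hypothetical in-memory stand-in for the tenant configuration table.
TENANT_CONFIGS = {
    "tenant_a": {"model": "gpt-4", "top_k": 10},
    "tenant_b": {"model": "gpt-3.5-turbo", "top_k": 5},
}
DEFAULTS = {"model": "gpt-3.5-turbo", "top_k": 5}


def pick_model_hardcoded(tenant_id: str) -> str:
    """Anti-pattern: every new tenant needs a code change and a deploy."""
    if tenant_id == "tenant_a":
        return "gpt-4"
    elif tenant_id == "tenant_b":
        return "gpt-3.5-turbo"
    return "gpt-3.5-turbo"


def pick_model_from_config(tenant_id: str) -> str:
    """Config-driven: onboarding a tenant is a data change, not a code change."""
    return TENANT_CONFIGS.get(tenant_id, DEFAULTS)["model"]


# Onboarding a new tenant touches only the data.
TENANT_CONFIGS["tenant_c"] = {"model": "claude-3-opus", "top_k": 15}
print(pick_model_from_config("tenant_c"))  # claude-3-opus
print(pick_model_hardcoded("tenant_c"))    # gpt-3.5-turbo (falls through to the default)
```

The hardcoded version silently gives the new tenant the default until someone edits and redeploys the function.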

## Setup and Imports

In [None]:
# Import core modules
import json
import logging
from l2_m11_tenant_specific_customization import (
    TenantConfig,
    TenantConfigRepository,
    BrandingConfig,
    get_default_config,
    apply_config_to_pipeline,
    simulate_rag_query,
)
from config import Config

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

print("✓ Imports complete")
# Expected: Confirmation message

## 1. Default Configuration

Every new tenant starts with a default configuration that includes:
- **Model**: gpt-3.5-turbo (cost-effective default)
- **Temperature**: 0.7 (balanced creativity/determinism)
- **Top-K**: 5 documents retrieved
- **Alpha**: 0.5 (hybrid search weight)
- **Max Tokens**: 500 (basic responses)
- **Branding**: Gray color scheme

This ensures tenants are immediately functional without manual setup.
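
One way such defaults could be provided is sketched below. It is not the module's `get_default_config`; the name `default_config_sketch` and the gray hex codes are assumptions. It shows why returning a fresh copy matters when the defaults contain nested values such as branding.

```python
import copy

# Values taken from the list above; the exact gray hex codes are assumptions.
_DEFAULTS = {
    "model": "gpt-3.5-turbo",
    "temperature": 0.7,
    "top_k": 5,
    "alpha": 0.5,
    "max_tokens": 500,
    "branding": {"primary_color": "#6B7280", "secondary_color": "#9CA3AF"},
}


def default_config_sketch() -> dict:
    """Return a fresh deep copy so callers cannot mutate the shared defaults."""
    return copy.deepcopy(_DEFAULTS)


first = default_config_sketch()
first["branding"]["primary_color"] = "#000000"  # tenant-local change
second = default_config_sketch()
print(second["branding"]["primary_color"])      # still #6B7280
```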

In [None]:
# Load and display default configuration
default_config = get_default_config()

print("Default Tenant Configuration:")
print(f"  Model: {default_config['model']}")
print(f"  Temperature: {default_config['temperature']}")
print(f"  Top-K: {default_config['top_k']}")
print(f"  Alpha: {default_config['alpha']}")
print(f"  Max Tokens: {default_config['max_tokens']}")
print(f"  Branding: {default_config['branding']}")

# Expected: Default settings displayed (gpt-3.5-turbo, temp=0.7, top_k=5)

## 2. Initialize Repository

The `TenantConfigRepository` manages tenant configurations with:
- **Database-backed storage** (or in-memory fallback)
- **Redis caching** with TTL-based invalidation
- **Automatic fallback** to defaults on errors

Without an external database or Redis, the repository falls back to in-memory storage, which is sufficient for development.

In [None]:
# Initialize repository (will use in-memory storage without external services)
repository = TenantConfigRepository()

print("✓ Repository initialized")
print(f"  Database: {'memory fallback' if not repository.db_engine else 'connected'}")
print(f"  Cache: {'disabled' if not repository.redis_client else 'enabled'}")

# Expected: Repository initialized with in-memory storage

## 3. Pydantic Validation Models

The `TenantConfig` Pydantic model provides type-safe validation with:
- **Bounded numeric fields**: Temperature (0-2), Top-K (1-20), Alpha (0-1)
- **Model whitelist**: Only approved models (GPT-4, GPT-3.5, Claude variants)
- **Prompt injection prevention**: Blocks malicious patterns
- **Hex color validation**: Ensures valid branding colors

This prevents invalid configurations from entering the system.
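
The checks listed above can be sketched with the standard library alone. This is a stand-in that uses a plain dataclass instead of Pydantic so it runs without extra dependencies; the class name `TenantConfigSketch`, the whitelist contents, and the blocked patterns are assumptions, not the module's actual rules.

```python
import re
from dataclasses import dataclass

ALLOWED_MODELS = {"gpt-4", "gpt-4-turbo", "gpt-3.5-turbo", "claude-3-opus"}  # assumed whitelist
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")
# Assumed, deliberately small denylist; real injection defenses need more than this.
BLOCKED_PATTERNS = [re.compile(p, re.IGNORECASE) for p in (
    r"ignore (all )?previous instructions",
    r"system\s*:",
)]


@dataclass
class TenantConfigSketch:
    model: str = "gpt-3.5-turbo"
    temperature: float = 0.7
    top_k: int = 5
    alpha: float = 0.5
    primary_color: str = "#6B7280"
    prompt_template: str = "Answer: {query}"

    def __post_init__(self) -> None:
        if self.model not in ALLOWED_MODELS:
            raise ValueError(f"model {self.model!r} is not whitelisted")
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0 and 2")
        if not 1 <= self.top_k <= 20:
            raise ValueError("top_k must be between 1 and 20")
        if not 0.0 <= self.alpha <= 1.0:
            raise ValueError("alpha must be between 0 and 1")
        if not HEX_COLOR.match(self.primary_color):
            raise ValueError("primary_color must look like #RRGGBB")
        if any(p.search(self.prompt_template) for p in BLOCKED_PATTERNS):
            raise ValueError("prompt_template contains a blocked pattern")


def is_valid(**fields) -> bool:
    """Return True when the fields pass every check, False otherwise."""
    try:
        TenantConfigSketch(**fields)
        return True
    except ValueError:
        return False


print(is_valid(model="gpt-4", temperature=0.5))                            # True
print(is_valid(temperature=3.0))                                           # False
print(is_valid(primary_color="blue"))                                      # False
print(is_valid(prompt_template="Ignore previous instructions. {query}"))   # False
```

Validating at construction time means an invalid configuration never exists as an object, which is the same guarantee the Pydantic model gives.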

In [None]:
# Example 1: Valid configuration
valid_config = TenantConfig(
    model="gpt-4",
    temperature=0.5,
    top_k=10,
    alpha=0.7
)
print(f"✓ Valid config: {valid_config.model} at temp={valid_config.temperature}")

# Example 2: Test validation - invalid model
try:
    invalid = TenantConfig(model="invalid-model")
    print("✗ Should have failed!")
except ValueError as e:
    print(f"✓ Validation blocked invalid model: {str(e)[:50]}...")

# Example 3: Test validation - temperature out of bounds
try:
    invalid = TenantConfig(temperature=3.0)
    print("✗ Should have failed!")
except ValueError as e:
    print(f"✓ Validation blocked high temperature: {str(e)[:50]}...")

# Expected: Valid config succeeds, invalid configs rejected

## 4. Create Tenant Configurations

Let's create configurations for different tenant types:
- **Startup**: Budget-conscious, uses GPT-3.5
- **Enterprise**: Premium features, uses GPT-4 Turbo with reranking
- **Research Lab**: High-quality responses, uses Claude Opus

Each tenant gets customized model selection, retrieval parameters, and prompt templates.
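
To show what the `top_k` and `alpha` retrieval parameters might control, here is a small sketch of hybrid scoring. It assumes `alpha` is the weight on the dense (vector) score and `1 - alpha` the weight on the sparse (keyword) score, a common convention that this notebook does not confirm. The function names and scores are illustrative.

```python
def hybrid_score(dense: float, sparse: float, alpha: float) -> float:
    """Blend two relevance scores; alpha=1.0 means dense only (assumed convention)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * dense + (1.0 - alpha) * sparse


def top_k_hybrid(candidates, alpha, top_k):
    """candidates: list of (doc_id, dense_score, sparse_score), scores in [0, 1]."""
    ranked = sorted(candidates, key=lambda c: hybrid_score(c[1], c[2], alpha), reverse=True)
    return [doc_id for doc_id, _, _ in ranked[:top_k]]


docs = [("doc_a", 0.9, 0.1), ("doc_b", 0.4, 0.8), ("doc_c", 0.6, 0.5)]
print(top_k_hybrid(docs, alpha=0.7, top_k=2))  # ['doc_a', 'doc_c']
print(top_k_hybrid(docs, alpha=0.2, top_k=2))  # ['doc_b', 'doc_c']
```

The same candidate set yields different results per tenant purely from configuration, with no tenant-specific code.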

In [None]:
# Startup tenant - budget-conscious
startup_config = {
    "model": "gpt-3.5-turbo",
    "temperature": 0.7,
    "top_k": 5,
    "alpha": 0.5,
    "max_tokens": 500,
    "prompt_template": "You are a helpful assistant for {company_name}. Answer: {query}",
    "prompt_variables": {"company_name": "Startup Co"},
    "branding": {"primary_color": "#3B82F6", "secondary_color": "#10B981"},
    "enable_reranking": False
}
repository.update_config("tenant_startup_co", startup_config, merge=False)
print("✓ Created config for tenant_startup_co")

# Enterprise tenant - premium features
enterprise_config = {
    "model": "gpt-4-turbo",
    "temperature": 0.5,
    "top_k": 10,
    "alpha": 0.7,
    "max_tokens": 1000,
    "prompt_template": "Expert AI for {company_name}: {query}",
    "prompt_variables": {"company_name": "Enterprise Corp"},
    "branding": {"primary_color": "#1E40AF", "secondary_color": "#DC2626"},
    "enable_reranking": True
}
repository.update_config("tenant_enterprise_corp", enterprise_config, merge=False)
print("✓ Created config for tenant_enterprise_corp")

# Research lab - high quality
research_config = {
    "model": "claude-3-opus",
    "temperature": 0.9,
    "top_k": 15,
    "max_tokens": 2000,
    "enable_reranking": True
}
repository.update_config("tenant_research_lab", research_config, merge=True)
print("✓ Created config for tenant_research_lab")

# Expected: 3 tenant configs created

## 5. Retrieve and Cache Configurations

The repository implements a caching strategy:
1. **Check Redis cache** (if available) for fast lookups
2. **Query database** on cache miss
3. **Cache result** with TTL (300 seconds default)
4. **Fallback to defaults** on errors

This reduces database load and improves response times for high-traffic tenants.
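
The four steps above can be sketched as a cache-aside read path. This is an illustration under stated assumptions, not the repository's implementation: `TTLCacheSketch` is an in-process stand-in for Redis, `load_from_db` is a hypothetical database lookup, and the clock is injected so expiry can be shown without sleeping.

```python
import time


class TTLCacheSketch:
    """Tiny in-process stand-in for a Redis cache with expiry; not the course repository."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None or entry[0] <= self.clock():
            self._store.pop(key, None)  # drop expired entries lazily
            return None
        return entry[1]

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)


DEFAULTS = {"model": "gpt-3.5-turbo", "top_k": 5}
db_reads = 0


def load_from_db(tenant_id):
    """Hypothetical database lookup; counts attempts so the cache effect is visible."""
    global db_reads
    db_reads += 1
    rows = {"tenant_startup_co": {"model": "gpt-3.5-turbo", "top_k": 5}}
    return rows[tenant_id]  # raises KeyError for unknown tenants


def get_config(tenant_id, cache):
    cached = cache.get(tenant_id)          # 1. check the cache
    if cached is not None:
        return cached
    try:
        config = load_from_db(tenant_id)   # 2. query the database on a miss
    except Exception:
        return dict(DEFAULTS)              # 4. fall back to defaults on errors
    cache.set(tenant_id, config)           # 3. cache the result with a TTL
    return config


now = [0.0]
cache = TTLCacheSketch(ttl_seconds=300, clock=lambda: now[0])
get_config("tenant_startup_co", cache)   # miss: reads the database
get_config("tenant_startup_co", cache)   # hit: served from the cache
now[0] = 301.0                           # advance the fake clock past the TTL
get_config("tenant_startup_co", cache)   # expired: reads the database again
print(db_reads)                          # 2
print(get_config("unknown_tenant", cache))  # {'model': 'gpt-3.5-turbo', 'top_k': 5}
```

Note that the fallback result is not cached here, so a transient database error does not pin a tenant to the defaults for a full TTL.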

In [None]:
# Retrieve configurations
startup = repository.get_config("tenant_startup_co")
enterprise = repository.get_config("tenant_enterprise_corp")
research = repository.get_config("tenant_research_lab")

print("Tenant Configurations:")
print(f"  Startup: {startup.model} @ temp={startup.temperature}, top_k={startup.top_k}")
print(f"  Enterprise: {enterprise.model} @ temp={enterprise.temperature}, reranking={enterprise.enable_reranking}")
print(f"  Research: {research.model} @ temp={research.temperature}, top_k={research.top_k}")

# Expected: Shows different configs per tenant

## 6. Apply Configuration to RAG Pipeline

Tenant configurations control the entire RAG pipeline:
- **Model selection**: Route requests to appropriate LLM
- **Temperature**: Control response creativity
- **Retrieval parameters**: Configure top_k documents and hybrid search weight (alpha)
- **Prompt rendering**: Inject tenant-specific variables safely
- **Reranking**: Optional semantic reranking for better relevance

This demonstrates how a single pipeline adapts to diverse tenant needs.
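
Safe variable injection can be sketched with the standard library's `string.Formatter`. The approach below is one possible design, not necessarily what `apply_config_to_pipeline` does: it allows only plain placeholders from an assumed allowlist and rejects attribute or index access inside the template.

```python
import string

ALLOWED_VARIABLES = {"company_name", "query"}  # assumed allowlist


def render_prompt(template: str, variables: dict) -> str:
    """Render {name} placeholders, rejecting anything beyond plain allowed names."""
    for _, field, spec, conversion in string.Formatter().parse(template):
        if field is None:
            continue  # literal text, no placeholder
        if field not in ALLOWED_VARIABLES or spec or conversion:
            # Blocks attribute/index access such as {query.__class__} or {0[x]}
            raise ValueError(f"disallowed placeholder: {field!r}")
    # Values are inserted as plain strings; braces inside values are not re-parsed
    return template.format(**{k: str(v) for k, v in variables.items()})


template = "You are a helpful assistant for {company_name}. Answer: {query}"
print(render_prompt(template, {"company_name": "Startup Co", "query": "What is RAG?"}))

try:
    render_prompt("Leak: {query.__class__}", {"query": "hi"})
except ValueError as exc:
    print(f"Rejected: {exc}")
```

Checking the template before formatting matters because `str.format` would otherwise evaluate attribute lookups written by whoever controls the template.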

In [None]:
# Apply configuration to pipeline for different tenants
query = "What are the latest AI trends?"

# Startup tenant pipeline
startup_params = apply_config_to_pipeline("tenant_startup_co", query, repository)
print("Startup Pipeline:")
print(f"  Model: {startup_params['model']}")
print(f"  Prompt: {startup_params['prompt'][:60]}...")
print(f"  Retrieval: top_k={startup_params['top_k']}, alpha={startup_params['alpha']}")

# Enterprise tenant pipeline
enterprise_params = apply_config_to_pipeline("tenant_enterprise_corp", query, repository)
print("\nEnterprise Pipeline:")
print(f"  Model: {enterprise_params['model']}")
print(f"  Reranking: {enterprise_params['enable_reranking']}")
print(f"  Tokens: {enterprise_params['max_tokens']}")

# Expected: Different pipeline params per tenant

## 7. Simulate RAG Queries

Let's execute simulated RAG queries for different tenants to see how configurations affect responses.

**Note**: This uses mock responses since API keys are not configured. In production, this would make actual LLM API calls with the configured models.

In [None]:
# Simulate queries for different tenants
query = "What is machine learning?"

# Startup query (budget-friendly)
startup_result = simulate_rag_query("tenant_startup_co", query, repository)
print("Startup Query Result:")
print(f"  Answer: {startup_result['answer'][:50]}...")
print(f"  Docs: {startup_result['documents_retrieved']}, Reranking: {startup_result['reranking_applied']}")

# Enterprise query (premium)
enterprise_result = simulate_rag_query("tenant_enterprise_corp", query, repository)
print("\nEnterprise Query Result:")
print(f"  Answer: {enterprise_result['answer'][:50]}...")
print(f"  Model: {enterprise_result['config_used']['model']}")
print(f"  Reranking: {enterprise_result['reranking_applied']}")

# Expected: Simulated responses with different configs applied

## 8. Configuration Updates: Merge vs Replace

Two update modes are supported:
- **Merge mode** (`merge=True`): Update specific fields while preserving others
- **Replace mode** (`merge=False`): Full replacement, other fields reset to defaults

This flexibility allows both incremental adjustments and complete reconfiguration.
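
The difference between the two modes comes down to which base the updates are layered onto. A minimal sketch, assuming a flat configuration dictionary and a hypothetical `apply_update` helper:

```python
DEFAULTS = {"model": "gpt-3.5-turbo", "temperature": 0.7, "top_k": 5}


def apply_update(existing: dict, updates: dict, merge: bool) -> dict:
    """Return the new stored config without mutating the inputs."""
    base = existing if merge else DEFAULTS
    return {**base, **updates}


stored = {"model": "gpt-4", "temperature": 0.5, "top_k": 10}

merged = apply_update(stored, {"temperature": 0.9}, merge=True)
print(merged)    # {'model': 'gpt-4', 'temperature': 0.9, 'top_k': 10}

replaced = apply_update(stored, {"temperature": 0.3}, merge=False)
print(replaced)  # {'model': 'gpt-3.5-turbo', 'temperature': 0.3, 'top_k': 5}
```

This merge is shallow: a nested value such as `branding` would be replaced as a whole rather than merged key by key. Whether the real repository merges nested fields is not shown in this notebook.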

In [None]:
# Merge mode - update only temperature
print("Before merge update:")
config = repository.get_config("tenant_startup_co")
print(f"  Model: {config.model}, Temp: {config.temperature}, Top-K: {config.top_k}")

repository.update_config("tenant_startup_co", {"temperature": 0.9}, merge=True)

print("\nAfter merge update:")
config = repository.get_config("tenant_startup_co")
print(f"  Model: {config.model}, Temp: {config.temperature}, Top-K: {config.top_k}")
print("  ✓ Temperature updated, other fields preserved")

# Replace mode - full reset
repository.update_config("tenant_startup_co", {"temperature": 0.3}, merge=False)

print("\nAfter replace update:")
config = repository.get_config("tenant_startup_co")
print(f"  Model: {config.model}, Temp: {config.temperature}, Top-K: {config.top_k}")
print("  ✓ Other fields reset to defaults")

# Expected: Merge preserves fields, replace resets to defaults

## 9. Common Failure Scenarios

### Failure 1: Configuration Conflicts
**Symptom**: Tenant A's settings accidentally applied to Tenant B

**Prevention**: Always validate tenant_id in request pipeline
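
What validating the tenant ID could look like is sketched below. The ID format, the `tenant_config:` key prefix, and both function names are assumptions for illustration; the point is to fail loudly on a missing or malformed ID rather than fall through to another tenant's settings.

```python
import re

TENANT_ID_PATTERN = re.compile(r"^tenant_[a-z0-9_]{1,48}$")  # assumed ID format


def validated_tenant_id(raw):
    """Reject missing or malformed IDs instead of silently using another tenant."""
    if not isinstance(raw, str) or not TENANT_ID_PATTERN.fullmatch(raw):
        raise ValueError(f"invalid tenant_id: {raw!r}")
    return raw


def cache_key(tenant_id):
    """Namespace cache keys per tenant so entries cannot collide."""
    return f"tenant_config:{validated_tenant_id(tenant_id)}"


print(cache_key("tenant_startup_co"))  # tenant_config:tenant_startup_co
for bad in (None, "", "tenant_a:tenant_b", "../tenant_b"):
    try:
        cache_key(bad)
    except ValueError as exc:
        print(f"Rejected: {exc}")
```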

In [None]:
# Demonstrate proper tenant isolation
tenant_a_config = repository.get_config("tenant_startup_co")
tenant_b_config = repository.get_config("tenant_enterprise_corp")

print("Tenant Isolation:")
print(f"  Tenant A uses: {tenant_a_config.model}")
print(f"  Tenant B uses: {tenant_b_config.model}")
print("  ✓ Each tenant has independent configuration")

# Expected: Different models per tenant, no cross-contamination

### Failure 2: Default Override Failures
**Symptom**: Partial updates don't merge correctly with defaults

**Fix**: Use `merge=True` for partial updates, `merge=False` for full replacement

In [None]:
# WRONG: Using merge=False for partial update loses other fields
repository.update_config("tenant_test", {"model": "gpt-4"}, merge=False)
wrong_config = repository.get_config("tenant_test")
print("Wrong approach (merge=False with partial data):")
print(f"  Model: {wrong_config.model} (updated)")
print(f"  Temperature: {wrong_config.temperature} (reset to default!)")

# RIGHT: Using merge=True preserves existing fields
repository.update_config("tenant_test", {"temperature": 0.8, "top_k": 10}, merge=False)  # Setup
repository.update_config("tenant_test", {"model": "gpt-4"}, merge=True)
right_config = repository.get_config("tenant_test")
print("\nRight approach (merge=True):")
print(f"  Model: {right_config.model} (updated)")
print(f"  Temperature: {right_config.temperature} (preserved)")

# Expected: Merge mode preserves other fields

### Failure 3: Cache Staleness
**Symptom**: Updates don't propagate to running processes

**Fix**: Repository automatically invalidates cache on updates. Monitor cache hit rates to ensure TTL is appropriate.
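
Invalidate-on-write and hit-rate tracking can be sketched together. `CachedConfigStore` is a hypothetical class that uses plain dictionaries for both the database and the cache; it is not the repository's code.

```python
class CachedConfigStore:
    """Hypothetical store: dict 'database' plus dict 'cache' with hit/miss counters."""

    def __init__(self):
        self.db = {}
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, tenant_id):
        if tenant_id in self.cache:
            self.hits += 1
            return self.cache[tenant_id]
        self.misses += 1
        value = dict(self.db[tenant_id])
        self.cache[tenant_id] = value
        return value

    def update(self, tenant_id, updates):
        self.db[tenant_id] = {**self.db.get(tenant_id, {}), **updates}
        # Invalidate after the write so the next read reloads fresh data
        self.cache.pop(tenant_id, None)

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


store = CachedConfigStore()
store.update("tenant_enterprise_corp", {"temperature": 0.5})
store.get("tenant_enterprise_corp")                         # miss, fills the cache
store.get("tenant_enterprise_corp")                         # hit
store.update("tenant_enterprise_corp", {"temperature": 0.6})
print(store.get("tenant_enterprise_corp")["temperature"])   # 0.6, not the stale 0.5
print(f"hit rate: {store.hit_rate():.2f}")                  # hit rate: 0.33
```

With a shared cache such as Redis, one invalidation is visible to every process. A per-process cache like the one in this sketch would stay stale in other processes until its entries expire, which is where the TTL setting matters.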

In [None]:
# Cache invalidation happens automatically on updates
print("Cache behavior:")

# Update config (cache gets invalidated automatically)
repository.update_config("tenant_enterprise_corp", {"temperature": 0.6}, merge=True)
print("✓ Updated temperature - cache automatically invalidated")

# Next read will be a cache miss (in production with Redis)
config = repository.get_config("tenant_enterprise_corp")
print(f"✓ Retrieved updated config: temperature={config.temperature}")

# Without Redis the cache layer is skipped; with Redis, the read above repopulates the cache
if not repository.redis_client:
    print("\n⚠️ Skipping cache demo (no Redis service)")

# Expected: Cache invalidation prevents stale data

### Failure 4: No Rollback Mechanism
**Symptom**: Bad configurations break tenants until manual intervention

**Workaround**: Store backup before updates for manual rollback
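
The backup-and-restore workaround can be wrapped in a context manager so the restore step is never forgotten. This sketch operates on a plain dictionary rather than the repository, and `rollback_on_error` and `health_check` are hypothetical names.

```python
import copy
from contextlib import contextmanager

configs = {"tenant_research_lab": {"model": "claude-3-opus", "temperature": 0.9}}


@contextmanager
def rollback_on_error(store, tenant_id):
    """Snapshot a tenant's config and restore it if the block raises."""
    backup = copy.deepcopy(store[tenant_id])
    try:
        yield store[tenant_id]
    except Exception:
        store[tenant_id] = backup  # restore the snapshot, then re-raise
        raise


def health_check(config):
    """Hypothetical post-update check; here it only bounds the temperature."""
    if not 0.0 <= config["temperature"] <= 2.0:
        raise ValueError("temperature out of range")


try:
    with rollback_on_error(configs, "tenant_research_lab") as cfg:
        cfg["temperature"] = 5.0   # risky change
        health_check(cfg)          # fails, triggering the rollback
except ValueError as exc:
    print(f"Rolled back: {exc}")

print(configs["tenant_research_lab"]["temperature"])  # 0.9
```

This only covers failures detected inside the block. A configuration that passes validation but degrades answers later still needs stored versions to roll back to.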

In [None]:
# Rollback workaround: store backup before risky updates
tenant_id = "tenant_research_lab"

# Create backup
backup = repository.get_config(tenant_id).model_dump()
print(f"Backup created: {backup['model']} @ temp={backup['temperature']}")

# Make risky update
try:
    repository.update_config(tenant_id, {"temperature": 1.5}, merge=True)
    print("✓ Update applied")
except Exception as e:
    print(f"Update failed: {e}")
    # Restore from backup
    repository.update_config(tenant_id, backup, merge=False)
    print("✓ Rolled back to backup")

# Verify current state
current = repository.get_config(tenant_id)
print(f"Current: {current.model} @ temp={current.temperature}")

# Expected: Backup allows manual rollback on failures

## 10. Decision Card (TVH v2.0)

### Use This Pattern When:
✅ Managing **10-100+ tenants** with varied needs  
✅ Revenue models are tied to **feature differentiation**  
✅ Tenants need **self-service configuration** capabilities  
✅ The deployment cycle is too slow for frequent configuration changes  

### Use Alternatives When:
❌ **Standardization** simplifies the product (for example, startup MVPs)  
❌ Configuration changes are rare (**less than monthly**)  
❌ **Cost control** is more important than customization  
❌ Serving **<10 tenants** with similar needs  

### Trade-offs Accepted:
- Database queries for every request (mitigated by caching)
- Eventually consistent configuration updates
- Storage costs scale with tenant count

### When It Breaks:
🔴 **>1000 concurrent tenants** with frequent config changes  
🔴 **Microsecond-level latency** requirements  
🔴 Configurations requiring **real-time synchronization** across regions

## 11. Summary and Key Takeaways

### What We Learned:
1. **Database-backed configuration** eliminates hardcoded tenant logic
2. **Pydantic validation** prevents invalid configurations
3. **Redis caching** reduces database load for high-traffic tenants
4. **Merge vs Replace** modes provide flexibility in updates
5. **Failure handling** requires planning for conflicts, cache staleness, and rollbacks

### Production Considerations:
- **Scaling**: Handles 100+ tenants; caching critical above 500
- **Cost**: ~$45/month infrastructure (DB + Redis + monitoring)
- **Monitoring**: Track config load times, cache hit rates, validation errors
- **Security**: Prompt injection prevention, bounded parameters, model whitelist

### Next Steps:
- Implement configuration versioning (Practathon challenge)
- Add cost tracking per tenant
- Build admin UI for self-service configuration
- Monitor and optimize cache hit rates

**Key Takeaway**: *"Database-driven configuration separates infrastructure from customization logic, enabling SaaS products to serve diverse customer needs without code deployments."*

In [None]:
# Final summary: List all configured tenants
tenants = repository.list_tenants()

print("=" * 60)
print("CONFIGURED TENANTS SUMMARY")
print("=" * 60)

for tenant_id in tenants[:5]:  # Show first 5
    config = repository.get_config(tenant_id)
    print(f"\n{tenant_id}:")
    print(f"  Model: {config.model}")
    print(f"  Temperature: {config.temperature}")
    print(f"  Top-K: {config.top_k}")
    print(f"  Reranking: {config.enable_reranking}")

print(f"\n{'=' * 60}")
print(f"Total tenants configured: {len(tenants)}")
print("=" * 60)
print("\n✓ Module 11.2 complete!")

# Expected: Summary of all configured tenants