# Module 12: Infrastructure as Code

**Difficulty**: ⭐⭐ Intermediate
**Estimated Time**: 60 minutes
**Prerequisites**:
- Module 05: Containerization with Docker

## Learning Objectives
By the end of this notebook, you will be able to:
1. Understand Infrastructure as Code principles
2. Use Docker Compose for multi-service apps
3. Manage configuration with environment variables
4. Implement secrets management
5. Design reproducible infrastructure
6. Version control infrastructure configs

## 1. Introduction

Infrastructure as Code (IaC) treats infrastructure configuration like source code. This makes deployments reproducible, versionable, and automatable.

This module provides hands-on experience with practical examples and real-world scenarios.
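
To make the "configuration as source code" idea concrete, here is a minimal sketch that renders a Docker Compose file from a Python dictionary. Keys are sorted, so the same input always produces byte-identical text that can be committed and diffed. The helper `render_yaml`, the service names (`web`, `db`), and the image tags are assumptions made for illustration; a real project would more likely write the YAML by hand or use a YAML library.

```python
def render_yaml(node, indent=0):
    """Render nested dicts and lists of scalars as block-style YAML lines.

    Hypothetical helper for illustration only: it covers just the small
    YAML subset used below and does not escape quotes inside values.
    """
    pad = "  " * indent
    lines = []
    for key in sorted(node):  # sorted keys give deterministic output
        value = node[key]
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(render_yaml(value, indent + 1))
        elif isinstance(value, list):
            lines.append(f"{pad}{key}:")
            lines.extend(f'{pad}  - "{item}"' for item in value)
        else:
            lines.append(f'{pad}{key}: "{value}"')
    return lines


# Assumed example stack: a web server that depends on a database.
stack = {
    "services": {
        "web": {
            "image": "nginx:1.27",
            "ports": ["8080:80"],
            "depends_on": ["db"],
        },
        "db": {
            "image": "postgres:16",
            "environment": {"POSTGRES_DB": "app"},
        },
    }
}

compose_text = "\n".join(render_yaml(stack)) + "\n"
print(compose_text)
```

Because the output depends only on the dictionary, regenerating the file after no change yields no diff in version control.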

In [None]:
# Setup: Import all required libraries
import warnings
warnings.filterwarnings('ignore')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Set random seed for reproducibility
np.random.seed(42)

# Configure plotting
plt.style.use('default')
sns.set_palette("husl")
%matplotlib inline

print("✓ Setup complete!")

## 2. Core Concepts

This section covers the fundamental concepts of infrastructure as code.
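
One concept commonly associated with IaC is declarative configuration: you describe the desired state, and a tool works out which changes are needed. The sketch below is a simplified illustration of that idea, not how any particular tool is implemented. The function `plan_changes` and all service names and image tags are hypothetical.

```python
def plan_changes(desired, actual):
    """Return the actions that would turn `actual` into `desired`.

    Both arguments map a service name to an image tag.
    """
    actions = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            actions.append(("create", name))
        elif name not in desired:
            actions.append(("remove", name))
        elif desired[name] != actual[name]:
            actions.append(("update", name))
    return actions


desired = {"web": "nginx:1.27", "db": "postgres:16"}
actual = {"web": "nginx:1.25", "cache": "redis:7"}

plan = plan_changes(desired, actual)
print(plan)

# When the actual state already matches the desired state the plan is
# empty, so applying the same configuration twice changes nothing.
print(plan_changes(desired, desired))
```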

In [None]:
# Example data used by the remaining cells in this notebook
# (relies on np and pd imported in the setup cell)
example_data = np.random.randn(100, 5)
df_example = pd.DataFrame(example_data, columns=[f'feature_{i}' for i in range(5)])

print(f"Example data shape: {df_example.shape}")
print("\nFirst few rows:")
print(df_example.head())

## 3. Practical Implementation

Let's implement the concepts with a real example.
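
As one concrete instance of the learning objectives, managing configuration with environment variables might look like the sketch below. The variable names (`APP_PORT`, `APP_DEBUG`, `DATABASE_URL`), the `AppConfig` class, and `load_config` are assumptions chosen for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    port: int
    debug: bool
    database_url: str


def load_config(env=None):
    """Build an AppConfig from a mapping of environment variables."""
    env = os.environ if env is None else env

    # Required setting: fail early with a clear message if it is missing.
    if "DATABASE_URL" not in env:
        raise KeyError("DATABASE_URL must be set")

    return AppConfig(
        port=int(env.get("APP_PORT", "8000")),
        debug=env.get("APP_DEBUG", "false").lower() in {"1", "true", "yes"},
        database_url=env["DATABASE_URL"],
    )


# Passing a plain dict keeps the example independent of the real environment.
config = load_config({"DATABASE_URL": "postgresql://db:5432/app", "APP_DEBUG": "true"})
print(config)
```

Accepting the environment as a parameter makes the loader easy to test, while the frozen dataclass keeps the loaded settings from being changed by accident at runtime.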

In [None]:
# Practical implementation example
# This demonstrates best practices and real-world usage

def example_function(data):
    """
    Example function showing proper documentation.
    
    Why this approach: It's simple, efficient, and maintainable.
    """
    result = data.describe()
    return result

# Apply the function
stats = example_function(df_example)
print("Statistical Summary:")
print(stats)

## 4. Visualization and Analysis

Visualizing the results helps reveal patterns in the data.

In [None]:
# Create visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Distribution
axes[0].hist(df_example['feature_0'], bins=30, edgecolor='black', alpha=0.7)
axes[0].set_title('Feature Distribution', fontweight='bold')
axes[0].set_xlabel('Value')
axes[0].set_ylabel('Frequency')
axes[0].grid(alpha=0.3)

# Plot 2: Correlation heatmap
corr = df_example.corr()
sns.heatmap(corr, annot=True, fmt='.2f', cmap='coolwarm', ax=axes[1])
axes[1].set_title('Feature Correlations', fontweight='bold')

plt.tight_layout()
plt.show()

print("✓ Visualizations created")

## 5. Exercises

Practice what you've learned with these exercises.

### Exercise 1: Basic Implementation

Apply the concepts from this module to a new dataset.

**Requirements**:
1. Create a sample dataset
2. Apply the techniques learned
3. Visualize the results

In [None]:
# Exercise 1: Your code here

# YOUR CODE HERE

In [None]:
# Exercise 1 Solution

# Create dataset
exercise_data = np.random.randn(50, 3)
df_exercise = pd.DataFrame(exercise_data, columns=['A', 'B', 'C'])

# Apply techniques
result = df_exercise.describe()
print("✓ Exercise 1 completed")
print(result)

### Exercise 2: Advanced Application

Extend the concepts to a more complex scenario.

**Requirements**:
1. Implement advanced features
2. Handle edge cases
3. Validate results

In [None]:
# Exercise 2 Solution

# Advanced implementation
advanced_data = {
    'metric_1': np.random.rand(100),
    'metric_2': np.random.rand(100) * 10,
    'category': np.random.choice(['A', 'B', 'C'], 100)
}
df_advanced = pd.DataFrame(advanced_data)

# Group analysis
grouped = df_advanced.groupby('category').agg({
    'metric_1': ['mean', 'std'],
    'metric_2': ['mean', 'std']
})

print("✓ Exercise 2 completed")
print(grouped)

### Exercise 3: Real-World Scenario

Apply everything in a production-like setting.

**Requirements**:
1. Simulate real-world conditions
2. Implement error handling
3. Create comprehensive output

In [None]:
# Exercise 3 Solution

def production_ready_function(data, threshold=0.5):
    """
    Production-ready implementation with error handling.
    
    Why: Real systems need robust error handling and validation.
    """
    try:
        # Validate input
        if data is None or len(data) == 0:
            raise ValueError("Data cannot be empty")
        
        # Process
        filtered = data[data['metric_1'] > threshold]
        
        # Return results
        return {
            'total_records': len(data),
            'filtered_records': len(filtered),
            'percentage': len(filtered) / len(data) * 100
        }
    except (ValueError, KeyError, TypeError) as e:
        # Report expected input problems; unexpected errors still propagate
        print(f"Error: {e}")
        return None

# Test
result = production_ready_function(df_advanced)
print("✓ Exercise 3 completed")
print(f"Results: {result}")

## 6. Summary

### Key Takeaways

1. IaC makes infrastructure reproducible
2. Docker Compose defines multi-service stacks
3. Environment variables manage configuration
4. Secrets management protects sensitive data
5. Version control tracks infrastructure changes
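
The secrets takeaway can be sketched as follows. Docker mounts secrets granted to a container as files under `/run/secrets/<name>`, so this sketch looks there first and falls back to an environment variable. The function names `read_secret` and `mask`, the secret names, and the fallback convention (upper-cased secret name) are assumptions for illustration, not a standard API.

```python
import os
import tempfile
from pathlib import Path


def read_secret(name, secrets_dir="/run/secrets", env=None):
    """Return a secret from a mounted file, else from the environment."""
    env = os.environ if env is None else env

    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        return secret_file.read_text().strip()

    env_name = name.upper()  # assumed fallback convention
    if env_name in env:
        return env[env_name]

    raise LookupError(f"secret {name!r} not found")


def mask(secret):
    """Return a fixed placeholder so neither value nor length is logged."""
    return "********"


# A temporary directory stands in for /run/secrets in this demonstration.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "db_password").write_text("s3cr3t\n")
    from_file = read_secret("db_password", secrets_dir=tmp, env={})
    from_env = read_secret("api_key", secrets_dir=tmp, env={"API_KEY": "abc123"})

print("db_password:", mask(from_file))
print("api_key:", mask(from_env))
```

Keeping secret values out of the configuration files themselves means those files can be committed to version control without exposing credentials.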

### Best Practices

- Always validate inputs before processing
- Use descriptive variable names
- Include error handling in production code
- Document WHY, not just WHAT
- Test thoroughly with edge cases
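
Applied to infrastructure configuration, the practices above (validating inputs and testing edge cases) could be sketched as a check that flags sensitive-looking environment variables whose values are written literally instead of being interpolated as `${VAR}`. The function `find_hardcoded_secrets`, the name patterns, and the sample services are hypothetical, and the sketch handles only the mapping form of `environment`.

```python
import re

# Variable names treated as sensitive (an assumed, incomplete list).
SENSITIVE = re.compile(r"PASSWORD|SECRET|TOKEN|API_KEY", re.IGNORECASE)
# Values of the form ${NAME} are resolved outside the config file.
INTERPOLATED = re.compile(r"^\$\{[A-Za-z_][A-Za-z0-9_]*\}$")


def find_hardcoded_secrets(services):
    """Return (service, variable) pairs whose sensitive value is a literal."""
    findings = []
    for service_name, service in sorted(services.items()):
        environment = service.get("environment") or {}
        for var, value in sorted(environment.items()):
            if SENSITIVE.search(var) and not INTERPOLATED.match(str(value)):
                findings.append((service_name, var))
    return findings


services = {
    "db": {"environment": {"POSTGRES_PASSWORD": "hunter2", "POSTGRES_DB": "app"}},
    "web": {"environment": {"API_TOKEN": "${API_TOKEN}"}},
    "cache": {},  # edge case: no environment section at all
}

print(find_hardcoded_secrets(services))
```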

### What's Next?

In **Module 13**, we'll explore **MLOps Best Practices** for production excellence.

## 7. Additional Resources

### Documentation
- Official Python Documentation: https://docs.python.org/
- NumPy Documentation: https://numpy.org/doc/
- Pandas Documentation: https://pandas.pydata.org/docs/
- Docker Compose Documentation: https://docs.docker.com/compose/

### Tutorials
- Hands-on tutorials and examples
- Community discussions and forums
- Video tutorials and courses

### Advanced Topics
- Production deployment considerations
- Performance optimization
- Scaling strategies
- Integration with other tools