# Lab 4.1: Deep Network Architecture Implementation

## Duration: 45 minutes

## Learning Objectives
By the end of this lab, you will be able to:
- Design and implement deep neural network architectures
- Understand the structure of multi-layer neural networks
- Initialize weights and biases for deep networks
- Implement a flexible neural network class for various architectures

## Prerequisites
- Basic understanding of neural networks
- Knowledge of Python and NumPy
- Understanding of matrix operations

## Key Concepts
- **Deep Neural Networks**: Networks with multiple hidden layers
- **Weight Initialization**: Proper initialization strategies for deep networks
- **Network Architecture**: Layer sizes and connectivity patterns
- **Parameter Management**: Organizing weights and biases across layers

## Setup and Imports

First, let's import all necessary libraries and set up our environment.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# Set random seed for reproducibility
np.random.seed(42)

# Configure matplotlib for better plots
plt.style.use('default')
plt.rcParams['figure.figsize'] = (10, 6)

print("Environment setup complete!")
print(f"NumPy version: {np.__version__}")

## Step 1: Understanding Deep Network Architecture

A deep neural network consists of:
- **Input layer**: Receives the input data
- **Hidden layers**: Multiple layers that process information
- **Output layer**: Produces the final predictions

Let's visualize different network architectures:

In [None]:
def visualize_network_architecture(layer_sizes, title="Neural Network Architecture"):
    """
    Visualize neural network architecture
    
    Parameters:
    layer_sizes: list of integers representing number of neurons in each layer
    title: title for the plot
    """
    fig, ax = plt.subplots(figsize=(12, 8))
    
    # Calculate positions
    max_neurons = max(layer_sizes)
    layer_positions = np.linspace(0, len(layer_sizes)-1, len(layer_sizes))
    
    # Draw neurons
    for layer_idx, num_neurons in enumerate(layer_sizes):
        # Center neurons vertically
        neuron_positions = np.linspace(
            (max_neurons - num_neurons) / 2, 
            (max_neurons - num_neurons) / 2 + num_neurons - 1, 
            num_neurons
        )
        
        # Draw neurons as circles
        for neuron_pos in neuron_positions:
            circle = plt.Circle((layer_positions[layer_idx], neuron_pos), 0.1, 
                              color='lightblue', ec='darkblue', linewidth=2)
            ax.add_patch(circle)
        
        # Draw connections to next layer
        if layer_idx < len(layer_sizes) - 1:
            next_neurons = layer_sizes[layer_idx + 1]
            next_positions = np.linspace(
                (max_neurons - next_neurons) / 2, 
                (max_neurons - next_neurons) / 2 + next_neurons - 1, 
                next_neurons
            )
            
            # Draw connections
            for curr_pos in neuron_positions:
                for next_pos in next_positions:
                    ax.plot([layer_positions[layer_idx], layer_positions[layer_idx + 1]], 
                           [curr_pos, next_pos], 'gray', alpha=0.3, linewidth=0.5)
    
    # Add layer labels
    layer_names = ['Input'] + [f'Hidden {i}' for i in range(1, len(layer_sizes)-1)] + ['Output']
    for i, (pos, name, size) in enumerate(zip(layer_positions, layer_names, layer_sizes)):
        ax.text(pos, max_neurons + 0.5, f'{name}\n({size} units)', 
               ha='center', va='bottom', fontsize=10, fontweight='bold')
    
    ax.set_xlim(-0.5, len(layer_sizes) - 0.5)
    ax.set_ylim(-1, max_neurons + 1.5)
    ax.set_aspect('equal')
    ax.axis('off')
    ax.set_title(title, fontsize=14, fontweight='bold', pad=20)
    
    plt.tight_layout()
    plt.show()

# Visualize different architectures
print("Example Network Architectures:")
print("\n1. Shallow Network:")
visualize_network_architecture([4, 3, 1], "Shallow Network (4-3-1)")

print("\n2. Deep Network:")
visualize_network_architecture([4, 7, 5, 3, 1], "Deep Network (4-7-5-3-1)")
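
To connect the diagrams to the parameters they imply, here is a minimal sketch that lists the weight and bias shapes for the two example architectures. The helper name `describe_layers` is hypothetical, and the `(current_layer_size, previous_layer_size)` weight shape is the convention this lab uses in Step 3.

```python
def describe_layers(layer_sizes):
    """Return (weight_shape, bias_shape, param_count) for each layer after the input."""
    rows = []
    for prev_size, size in zip(layer_sizes[:-1], layer_sizes[1:]):
        weight_shape = (size, prev_size)  # one row per neuron in this layer
        bias_shape = (size, 1)            # one bias per neuron
        rows.append((weight_shape, bias_shape, size * prev_size + size))
    return rows

for name, sizes in [("Shallow", [4, 3, 1]), ("Deep", [4, 7, 5, 3, 1])]:
    rows = describe_layers(sizes)
    total = sum(count for _, _, count in rows)
    print(f"{name} {sizes}: {total} parameters")
    for layer, (w_shape, b_shape, count) in enumerate(rows, start=1):
        print(f"  Layer {layer}: W{w_shape}, b{b_shape}, {count} parameters")
```

Every connection drawn between two layers corresponds to one weight, so the number of gray lines in each diagram equals the number of weight entries.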

## Step 2: Weight Initialization Strategies

Proper weight initialization is crucial for deep networks to train effectively. Let's implement different initialization methods:

In [None]:
class WeightInitializer:
    """
    Collection of weight initialization methods for deep networks
    """
    
    @staticmethod
    def zeros(shape):
        """Initialize weights to zeros (not recommended for hidden layers)"""
        return np.zeros(shape)
    
    @staticmethod
    def random_normal(shape, mean=0, std=0.01):
        """Initialize weights from normal distribution"""
        return np.random.normal(mean, std, shape)
    
    @staticmethod
    def random_uniform(shape, low=-0.1, high=0.1):
        """Initialize weights from uniform distribution"""
        return np.random.uniform(low, high, shape)
    
    @staticmethod
    def xavier_uniform(shape):
        """Xavier/Glorot uniform initialization"""
        fan_in, fan_out = shape[1], shape[0]
        limit = np.sqrt(6 / (fan_in + fan_out))
        return np.random.uniform(-limit, limit, shape)
    
    @staticmethod
    def xavier_normal(shape):
        """Xavier/Glorot normal initialization"""
        fan_in, fan_out = shape[1], shape[0]
        std = np.sqrt(2 / (fan_in + fan_out))
        return np.random.normal(0, std, shape)
    
    @staticmethod
    def he_uniform(shape):
        """He uniform initialization (good for ReLU)"""
        fan_in = shape[1]
        limit = np.sqrt(6 / fan_in)
        return np.random.uniform(-limit, limit, shape)
    
    @staticmethod
    def he_normal(shape):
        """He normal initialization (good for ReLU)"""
        fan_in = shape[1]
        std = np.sqrt(2 / fan_in)
        return np.random.normal(0, std, shape)

# Test different initialization methods
print("Weight Initialization Methods:")
test_shape = (10, 5)  # 10 output neurons, 5 input neurons

initializers = {
    'Random Normal': WeightInitializer.random_normal,
    'Xavier Uniform': WeightInitializer.xavier_uniform,
    'Xavier Normal': WeightInitializer.xavier_normal,
    'He Uniform': WeightInitializer.he_uniform,
    'He Normal': WeightInitializer.he_normal
}

fig, axes = plt.subplots(2, 3, figsize=(15, 10))
axes = axes.ravel()

for idx, (name, init_func) in enumerate(initializers.items()):
    weights = init_func(test_shape)
    
    axes[idx].hist(weights.ravel(), bins=30, alpha=0.7, density=True)
    axes[idx].set_title(f'{name}\nMean: {weights.mean():.4f}, Std: {weights.std():.4f}')
    axes[idx].set_xlabel('Weight Value')
    axes[idx].set_ylabel('Density')
    axes[idx].grid(True, alpha=0.3)

# Remove empty subplot
axes[-1].remove()

plt.tight_layout()
plt.suptitle('Weight Initialization Distributions', y=1.02, fontsize=16)
plt.show()

print("\nKey Points:")
print("- Xavier initialization works well with tanh and sigmoid activations")
print("- He initialization works well with ReLU and its variants")
print("- Proper initialization helps mitigate vanishing/exploding gradients")
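
As a quick check on the formulas, the sketch below works out the Xavier and He scales by hand for `test_shape = (10, 5)`. It relies on one standard fact: a uniform distribution on `[-limit, limit]` has standard deviation `limit / sqrt(3)`. The variable names are chosen for illustration only.

```python
import math

fan_out, fan_in = 10, 5  # matches test_shape = (10, 5): (outputs, inputs)

xavier_limit = math.sqrt(6 / (fan_in + fan_out))
xavier_std = math.sqrt(2 / (fan_in + fan_out))
he_limit = math.sqrt(6 / fan_in)
he_std = math.sqrt(2 / fan_in)

# Standard deviation implied by each uniform range
xavier_uniform_std = xavier_limit / math.sqrt(3)
he_uniform_std = he_limit / math.sqrt(3)

print(f"Xavier uniform: limit = {xavier_limit:.4f}, implied std = {xavier_uniform_std:.4f}")
print(f"Xavier normal:  std   = {xavier_std:.4f}")
print(f"He uniform:     limit = {he_limit:.4f}, implied std = {he_uniform_std:.4f}")
print(f"He normal:      std   = {he_std:.4f}")
```

The uniform and normal variants of each scheme target the same variance, so the histograms above should differ in shape but not in spread (up to sampling noise from only 50 weights).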

## Step 3: Deep Neural Network Class Implementation

Now let's implement a flexible deep neural network class:

In [None]:
class DeepNeuralNetwork:
    """
    Deep Neural Network implementation with configurable architecture
    """
    
    def __init__(self, layer_sizes, initialization='he_normal', random_seed=None):
        """
        Initialize the deep neural network
        
        Parameters:
        layer_sizes: list of integers, number of units in each layer
        initialization: string, weight initialization method
        random_seed: int, random seed for reproducibility
        """
        if random_seed is not None:
            np.random.seed(random_seed)
            
        self.layer_sizes = layer_sizes
        self.num_layers = len(layer_sizes)
        self.initialization = initialization
        
        # Initialize parameters
        self.parameters = self._initialize_parameters()
        
        # Store gradients
        self.gradients = {}
        
        # Store cache for backpropagation
        self.cache = {}
        
        print(f"Deep Neural Network initialized with architecture: {layer_sizes}")
        print(f"Total parameters: {self._count_parameters()}")
    
    def _initialize_parameters(self):
        """
        Initialize weights and biases for all layers
        
        Returns:
        parameters: dictionary containing weights and biases
        """
        parameters = {}
        
        # Get initialization function
        init_functions = {
            'random_normal': WeightInitializer.random_normal,
            'xavier_uniform': WeightInitializer.xavier_uniform,
            'xavier_normal': WeightInitializer.xavier_normal,
            'he_uniform': WeightInitializer.he_uniform,
            'he_normal': WeightInitializer.he_normal
        }
        
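        # Fail loudly on an unknown method instead of silently falling back
        if self.initialization not in init_functions:
            raise ValueError(
                f"Unknown initialization '{self.initialization}'. "
                f"Choose one of: {sorted(init_functions)}"
            )
        init_func = init_functions[self.initialization]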
        
        for layer in range(1, self.num_layers):
            # Weight matrix shape: (current_layer_size, previous_layer_size)
            weight_shape = (self.layer_sizes[layer], self.layer_sizes[layer-1])
            
            # Initialize weights
            parameters[f'W{layer}'] = init_func(weight_shape)
            
            # Initialize biases to zero
            parameters[f'b{layer}'] = np.zeros((self.layer_sizes[layer], 1))
            
            print(f"Layer {layer}: W{layer} shape = {weight_shape}, b{layer} shape = {parameters[f'b{layer}'].shape}")
        
        return parameters
    
    def _count_parameters(self):
        """
        Count total number of parameters in the network
        
        Returns:
        total_params: int, total number of parameters
        """
        total_params = 0
        for key, param in self.parameters.items():
            total_params += param.size
        return total_params
    
    def get_parameter_summary(self):
        """
        Get summary of network parameters
        
        Returns:
        summary: dictionary with parameter information
        """
        summary = {
            'architecture': self.layer_sizes,
            'num_layers': self.num_layers,
            'total_parameters': self._count_parameters(),
            'initialization': self.initialization,
            'parameter_details': {}
        }
        
        for layer in range(1, self.num_layers):
            W_key, b_key = f'W{layer}', f'b{layer}'
            summary['parameter_details'][f'Layer_{layer}'] = {
                'weights_shape': self.parameters[W_key].shape,
                'biases_shape': self.parameters[b_key].shape,
                'weights_params': self.parameters[W_key].size,
                'biases_params': self.parameters[b_key].size
            }
        
        return summary
    
    def print_architecture(self):
        """
        Print detailed architecture information
        """
        print("\n" + "="*60)
        print("DEEP NEURAL NETWORK ARCHITECTURE")
        print("="*60)
        print(f"Architecture: {self.layer_sizes}")
        print(f"Number of layers: {self.num_layers} (including the input layer)")
        print(f"Initialization method: {self.initialization}")
        print(f"Total parameters: {self._count_parameters():,}")
        print()
        
        layer_names = ['Input'] + [f'Hidden {i}' for i in range(1, self.num_layers-1)] + ['Output']
        
        for i, (name, size) in enumerate(zip(layer_names, self.layer_sizes)):
            if i == 0:
                print(f"Layer {i} ({name:>8}): {size:>4} units")
            else:
                W_key, b_key = f'W{i}', f'b{i}'
                W_params = self.parameters[W_key].size
                b_params = self.parameters[b_key].size
                total_params = W_params + b_params
                print(f"Layer {i} ({name:>8}): {size:>4} units, {total_params:>6} parameters")
        
        print("="*60)

# Test the Deep Neural Network class
print("Testing Deep Neural Network Implementation:")
print()

# Example 1: Small deep network
print("Example 1: Small Deep Network")
small_network = DeepNeuralNetwork([4, 8, 6, 3, 1], initialization='he_normal', random_seed=42)
small_network.print_architecture()

print("\n" + "-"*60 + "\n")

# Example 2: Larger deep network
print("Example 2: Larger Deep Network")
large_network = DeepNeuralNetwork([784, 256, 128, 64, 32, 10], initialization='xavier_normal', random_seed=42)
large_network.print_architecture()
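
The lab leaves forward propagation for a later step, but a shape check makes the parameter layout easier to trust. The sketch below rebuilds a `parameters` dictionary with the same key names and shapes as the class above and pushes a small batch through it. The helper `forward_shapes`, the ReLU hidden activation, and the linear output layer are assumptions made for illustration, not part of the lab's class.

```python
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [4, 8, 6, 3, 1]

# Same key names and shapes as DeepNeuralNetwork.parameters (He-normal weights, zero biases)
parameters = {}
for layer in range(1, len(layer_sizes)):
    fan_in = layer_sizes[layer - 1]
    parameters[f"W{layer}"] = rng.standard_normal((layer_sizes[layer], fan_in)) * np.sqrt(2 / fan_in)
    parameters[f"b{layer}"] = np.zeros((layer_sizes[layer], 1))

def forward_shapes(X, parameters, num_layers):
    """Apply Z = W @ A + b layer by layer and record the shape of each activation."""
    A = X
    shapes = [A.shape]
    for layer in range(1, num_layers):
        # The (units, 1) bias broadcasts across the example columns
        Z = parameters[f"W{layer}"] @ A + parameters[f"b{layer}"]
        # ReLU on hidden layers, linear output (assumed here)
        A = np.maximum(0.0, Z) if layer < num_layers - 1 else Z
        shapes.append(A.shape)
    return shapes

X = rng.standard_normal((4, 5))  # 4 features, 5 examples stored as columns
print(forward_shapes(X, parameters, len(layer_sizes)))
```

With weights shaped `(current_layer_size, previous_layer_size)`, inputs have to be stored with one example per column; a batch shaped `(examples, features)` would fail at the first matrix product.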

## Step 4: Parameter Inspection and Analysis

Let's analyze the initialized parameters and understand their properties:

In [None]:
def analyze_network_parameters(network, layer_to_analyze=1):
    """
    Analyze and visualize network parameters
    
    Parameters:
    network: DeepNeuralNetwork instance
    layer_to_analyze: which layer to analyze in detail
    """
    print(f"Analyzing parameters for layer {layer_to_analyze}:")
    
    W_key = f'W{layer_to_analyze}'
    b_key = f'b{layer_to_analyze}'
    
    weights = network.parameters[W_key]
    biases = network.parameters[b_key]
    
    print(f"\nWeight matrix {W_key}:")
    print(f"  Shape: {weights.shape}")
    print(f"  Mean: {weights.mean():.6f}")
    print(f"  Std: {weights.std():.6f}")
    print(f"  Min: {weights.min():.6f}")
    print(f"  Max: {weights.max():.6f}")
    
    print(f"\nBias vector {b_key}:")
    print(f"  Shape: {biases.shape}")
    print(f"  Mean: {biases.mean():.6f}")
    print(f"  Std: {biases.std():.6f}")
    
    # Visualize weight distributions
    fig, axes = plt.subplots(1, 2, figsize=(12, 5))
    
    # Weight histogram
    axes[0].hist(weights.ravel(), bins=50, alpha=0.7, color='blue', density=True)
    axes[0].set_title(f'Weight Distribution - Layer {layer_to_analyze}')
    axes[0].set_xlabel('Weight Value')
    axes[0].set_ylabel('Density')
    axes[0].grid(True, alpha=0.3)
    axes[0].axvline(weights.mean(), color='red', linestyle='--', label=f'Mean: {weights.mean():.4f}')
    axes[0].legend()
    
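    # Weight matrix heatmap (color scale centered on zero so the sign is readable)
    limit = np.abs(weights).max()
    im = axes[1].imshow(weights, cmap='RdBu', aspect='auto', vmin=-limit, vmax=limit)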
    axes[1].set_title(f'Weight Matrix Heatmap - Layer {layer_to_analyze}')
    axes[1].set_xlabel('Input Neuron')
    axes[1].set_ylabel('Output Neuron')
    plt.colorbar(im, ax=axes[1])
    
    plt.tight_layout()
    plt.show()

# Analyze parameters for the small network
print("Parameter Analysis for Small Deep Network:")
analyze_network_parameters(small_network, layer_to_analyze=1)

print("\n" + "-"*60 + "\n")

# Compare different initialization methods
print("Comparing Initialization Methods:")

initialization_methods = ['random_normal', 'xavier_uniform', 'xavier_normal', 'he_uniform', 'he_normal']
test_architecture = [100, 50, 25, 1]

fig, axes = plt.subplots(2, 3, figsize=(15, 10))
axes = axes.ravel()

for idx, init_method in enumerate(initialization_methods):
    test_network = DeepNeuralNetwork(test_architecture, initialization=init_method, random_seed=42)
    
    # Get first layer weights
    weights = test_network.parameters['W1']
    
    axes[idx].hist(weights.ravel(), bins=40, alpha=0.7, density=True)
    axes[idx].set_title(f'{init_method.replace("_", " ").title()}\nMean: {weights.mean():.4f}, Std: {weights.std():.4f}')
    axes[idx].set_xlabel('Weight Value')
    axes[idx].set_ylabel('Density')
    axes[idx].grid(True, alpha=0.3)

# Remove empty subplot
axes[-1].remove()

plt.tight_layout()
plt.suptitle('Comparison of Weight Initialization Methods', y=1.02, fontsize=16)
plt.show()
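
The histograms compare the weights themselves; what matters for training is how those scales compound across layers. The sketch below is a rough experiment rather than a proof: it pushes random inputs through ten ReLU layers and reports the spread of the final activations for a fixed `std=0.01` versus the He scale. The function name, width, depth, and batch size are arbitrary choices for illustration.

```python
import numpy as np

def final_activation_std(weight_std, width=256, depth=10, batch=512, seed=0):
    """Push random inputs through `depth` ReLU layers; return the std of the last activations."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((width, batch))
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * weight_std
        A = np.maximum(0.0, W @ A)  # ReLU activation
    return float(A.std())

width = 256
small_init_std = final_activation_std(weight_std=0.01, width=width)
he_init_std = final_activation_std(weight_std=np.sqrt(2.0 / width), width=width)

print(f"std=0.01 weights: final activation std = {small_init_std:.3e}")
print(f"He weights:       final activation std = {he_init_std:.3e}")
```

With the fixed small scale, the signal shrinks at every layer and is effectively zero after ten layers, while the He scale keeps it at roughly the same order of magnitude as the input.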

## Step 5: Architecture Design Best Practices

Let's explore best practices for designing deep network architectures:

In [None]:
class ArchitectureDesigner:
    """
    Helper class for designing neural network architectures
    """
    
    @staticmethod
    def pyramid_architecture(input_size, output_size, num_hidden_layers, reduction_factor=2):
        """
        Create a pyramid-style architecture that gradually reduces layer size
        
        Parameters:
        input_size: number of input features
        output_size: number of output units
        num_hidden_layers: number of hidden layers
        reduction_factor: factor by which each layer is reduced
        
        Returns:
        architecture: list of layer sizes
        """
        if num_hidden_layers == 0:
            return [input_size, output_size]
        
        architecture = [input_size]
        
        # Calculate intermediate sizes
        current_size = input_size
        for i in range(num_hidden_layers):
            current_size = max(output_size, int(current_size / reduction_factor))
            architecture.append(current_size)
        
        # Always add the output layer, even when the last hidden layer
        # has already been clamped to output_size
        architecture.append(output_size)
        
        return architecture
    
    @staticmethod
    def diamond_architecture(input_size, output_size, num_hidden_layers, expansion_factor=2):
        """
        Create a diamond-style architecture that expands then contracts
        
        Parameters:
        input_size: number of input features
        output_size: number of output units
        num_hidden_layers: number of hidden layers
        expansion_factor: factor by which middle layers expand
        
        Returns:
        architecture: list of layer sizes
        """
        if num_hidden_layers == 0:
            return [input_size, output_size]
        
        architecture = [input_size]
        
        # Expansion phase
        mid_point = num_hidden_layers // 2
        max_size = input_size * expansion_factor
        
        for i in range(mid_point):
            size = int(input_size + (max_size - input_size) * (i + 1) / mid_point)
            architecture.append(size)
        
        # Contraction phase: step toward output_size without reaching it,
        # so every requested hidden layer is kept
        remaining_layers = num_hidden_layers - mid_point
        current_size = architecture[-1] if mid_point > 0 else input_size
        
        for i in range(remaining_layers):
            size = int(current_size - (current_size - output_size) * (i + 1) / (remaining_layers + 1))
            size = max(output_size, size)
            architecture.append(size)
        
        # Output layer
        architecture.append(output_size)
        
        return architecture
    
    @staticmethod
    def uniform_architecture(input_size, output_size, num_hidden_layers, hidden_size=None):
        """
        Create a uniform architecture with same-sized hidden layers
        
        Parameters:
        input_size: number of input features
        output_size: number of output units
        num_hidden_layers: number of hidden layers
        hidden_size: size of hidden layers (if None, use input_size)
        
        Returns:
        architecture: list of layer sizes
        """
        if hidden_size is None:
            hidden_size = input_size
        
        architecture = [input_size]
        architecture.extend([hidden_size] * num_hidden_layers)
        architecture.append(output_size)
        
        return architecture

# Test different architecture designs
print("Architecture Design Examples:")
print()

input_size, output_size = 784, 10  # MNIST-like problem
num_hidden = 4

architectures = {
    'Pyramid (2x reduction)': ArchitectureDesigner.pyramid_architecture(input_size, output_size, num_hidden, 2),
    'Pyramid (3x reduction)': ArchitectureDesigner.pyramid_architecture(input_size, output_size, num_hidden, 3),
    'Diamond (2x expansion)': ArchitectureDesigner.diamond_architecture(input_size, output_size, num_hidden, 2),
    'Uniform (512 units)': ArchitectureDesigner.uniform_architecture(input_size, output_size, num_hidden, 512),
    'Uniform (256 units)': ArchitectureDesigner.uniform_architecture(input_size, output_size, num_hidden, 256)
}

for name, arch in architectures.items():
    print(f"{name:24}: {arch}")
    network = DeepNeuralNetwork(arch, initialization='he_normal')
    print(f"{'':24}  Total parameters: {network._count_parameters():,}")
    print()

# Visualize different architectures
fig, axes = plt.subplots(2, 3, figsize=(18, 12))
axes = axes.ravel()

for idx, (name, arch) in enumerate(architectures.items()):
    # Simplified visualization: layer heights are proportional to layer size,
    # rescaled so the largest layer spans the full plot height
    max_neurons = max(arch)
    plot_height = len(arch) - 1
    layer_positions = np.linspace(0, len(arch)-1, len(arch))
    
    for layer_idx, num_neurons in enumerate(arch):
        # Center each layer vertically within the rescaled height
        layer_height = plot_height * num_neurons / max_neurons
        neuron_positions = np.linspace(
            (plot_height - layer_height) / 2, 
            (plot_height + layer_height) / 2, 
            min(num_neurons, 10)  # Limit visualization to 10 neurons max
        )
        
        # Draw neurons
        for neuron_pos in neuron_positions:
            circle = plt.Circle((layer_positions[layer_idx], neuron_pos), 0.05, 
                              color='lightblue', ec='darkblue', linewidth=1)
            axes[idx].add_patch(circle)
    
    axes[idx].set_xlim(-0.5, len(arch) - 0.5)
    axes[idx].set_ylim(-0.5, plot_height + 0.5)
    axes[idx].set_aspect('equal')
    axes[idx].axis('off')
    axes[idx].set_title(f'{name}\n{arch}', fontsize=10)

# Remove empty subplot
axes[-1].remove()

plt.tight_layout()
plt.suptitle('Different Neural Network Architectures', y=1.02, fontsize=16)
plt.show()
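
To see how the pyramid sizes come about, the sketch below repeats the size rule from `pyramid_architecture` on its own and returns only the hidden layer sizes. The helper name `pyramid_hidden_sizes` is hypothetical.

```python
def pyramid_hidden_sizes(input_size, output_size, num_hidden_layers, reduction_factor=2):
    """Hidden layer sizes obtained by repeated division, never dropping below output_size."""
    sizes = []
    current = input_size
    for _ in range(num_hidden_layers):
        # int() truncates the quotient; max() clamps at the output size
        current = max(output_size, int(current / reduction_factor))
        sizes.append(current)
    return sizes

print(pyramid_hidden_sizes(784, 10, 4, 2))  # [392, 196, 98, 49]
print(pyramid_hidden_sizes(784, 10, 4, 3))  # [261, 87, 29, 10]
```

With a reduction factor of 3, the fourth division would give 9, so the last hidden layer is clamped to the output size of 10. A large reduction factor combined with many hidden layers therefore produces several layers stuck at the minimum width.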

## Step 6: Architecture Validation and Testing

Let's create a validation framework to test our architectures:

In [None]:
def validate_architecture(architecture, max_params=1000000, min_layers=2):
    """
    Validate neural network architecture
    
    Parameters:
    architecture: list of layer sizes
    max_params: maximum allowed parameters
    min_layers: minimum number of layers
    
    Returns:
    validation_result: dictionary with validation results
    """
    result = {
        'valid': True,
        'warnings': [],
        'errors': [],
        'recommendations': []
    }
    
    # Check minimum layers
    if len(architecture) < min_layers:
        result['errors'].append(f"Architecture must have at least {min_layers} layers")
        result['valid'] = False
    
    # Check for zero or negative layer sizes
    if any(size <= 0 for size in architecture):
        result['errors'].append("All layer sizes must be positive")
        result['valid'] = False
    
    # Estimate parameter count
    total_params = 0
    for i in range(1, len(architecture)):
        # Weights: current_layer_size × previous_layer_size
        # Biases: current_layer_size
        total_params += architecture[i] * architecture[i-1] + architecture[i]
    
    # Check parameter count
    if total_params > max_params:
        result['warnings'].append(f"High parameter count: {total_params:,} (max recommended: {max_params:,})")
    
    # Check for very large layers
    max_layer_size = max(architecture)
    if max_layer_size > 2048:
        result['warnings'].append(f"Very large layer detected: {max_layer_size} units")
    
    # Check for dramatic size changes
    for i in range(1, len(architecture)):
        ratio = architecture[i-1] / architecture[i] if architecture[i] > 0 else float('inf')
        if ratio > 10:
            result['warnings'].append(f"Large reduction from layer {i-1} to {i}: {architecture[i-1]} → {architecture[i]}")
    
    # Recommendations
    if len(architecture) > 6:
        result['recommendations'].append("Consider using techniques like batch normalization for very deep networks")
    
    if architecture[0] > 1000:
        result['recommendations'].append("Consider dimensionality reduction for high-dimensional input")
    
    result['total_parameters'] = total_params
    
    return result

def print_validation_results(architecture, validation_result):
    """
    Print validation results in a formatted way
    """
    print(f"\nValidation Results for Architecture: {architecture}")
    print("-" * 60)
    
    if validation_result['valid']:
        print("✅ Architecture is VALID")
    else:
        print("❌ Architecture is INVALID")
    
    print(f"Total Parameters: {validation_result['total_parameters']:,}")
    
    if validation_result['errors']:
        print("\n🚨 ERRORS:")
        for error in validation_result['errors']:
            print(f"  - {error}")
    
    if validation_result['warnings']:
        print("\n⚠️  WARNINGS:")
        for warning in validation_result['warnings']:
            print(f"  - {warning}")
    
    if validation_result['recommendations']:
        print("\n💡 RECOMMENDATIONS:")
        for rec in validation_result['recommendations']:
            print(f"  - {rec}")

# Test validation on different architectures
print("Architecture Validation Tests:")

test_architectures = [
    [784, 256, 128, 64, 10],  # Good architecture
    [784, 2048, 1024, 512, 10],  # Large architecture
    [1000, 10],  # No hidden layers (still passes min_layers=2)
    [100, 1000, 5],  # Large jump down
    [784, 512, 256, 128, 64, 32, 16, 8, 4, 1],  # Very deep
    [784, 0, 10],  # Invalid (zero neurons)
]

for arch in test_architectures:
    validation = validate_architecture(arch)
    print_validation_results(arch, validation)
    print()
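
One detail of the size-change check is easy to miss: it divides the previous layer size by the current one, so it only flags reductions. The sketch below isolates that ratio for two of the test architectures. It assumes all layer sizes are positive, and the helper name `size_ratios` is hypothetical.

```python
def size_ratios(architecture):
    """Ratio previous_size / current_size for each pair of consecutive layers."""
    return [prev / curr for prev, curr in zip(architecture[:-1], architecture[1:])]

for arch in ([784, 256, 128, 64, 10], [100, 1000, 5]):
    ratios = size_ratios(arch)
    # Same threshold as validate_architecture: a ratio above 10 triggers a warning
    flagged = [layer for layer, ratio in enumerate(ratios, start=1) if ratio > 10]
    print(f"{arch}: ratios = {[round(r, 2) for r in ratios]}, flagged layers = {flagged}")
```

In `[100, 1000, 5]` the tenfold expansion from 100 to 1000 has a ratio of 0.1 and passes silently; only the drop from 1000 to 5 is reported.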

## Step 7: Progress Tracking and Key Concepts

Let's summarize what we've learned and check our progress:

In [None]:
# Progress Tracking Checklist
progress_checklist = {
    "Understanding deep network architecture concepts": True,
    "Implementing weight initialization strategies": True,
    "Creating a flexible DeepNeuralNetwork class": True,
    "Analyzing network parameters and properties": True,
    "Exploring architecture design patterns": True,
    "Validating architectures for best practices": True,
    "Understanding parameter count implications": True
}

print("Progress Tracking Checklist:")
print("=" * 50)
for item, completed in progress_checklist.items():
    status = "✅" if completed else "❌"
    print(f"{status} {item}")

completed_items = sum(progress_checklist.values())
total_items = len(progress_checklist)
print(f"\nProgress: {completed_items}/{total_items} ({completed_items/total_items*100:.1f}%) Complete")

print("\n" + "=" * 60)
print("KEY CONCEPTS SUMMARY")
print("=" * 60)

key_concepts = {
    "Deep Neural Networks": "Networks with multiple hidden layers for complex pattern recognition",
    "Weight Initialization": "Critical for training success; He for ReLU, Xavier for tanh/sigmoid",
    "Architecture Design": "Layer sizes and connectivity patterns affect model capacity",
    "Parameter Management": "Organizing weights and biases across all network layers",
    "Pyramid Architecture": "Gradually reducing layer sizes from input to output",
    "Diamond Architecture": "Expanding then contracting layer sizes",
    "Uniform Architecture": "Same-sized hidden layers throughout the network",
    "Architecture Validation": "Checking for best practices and potential issues"
}

for concept, description in key_concepts.items():
    print(f"\n{concept}:")
    print(f"  {description}")

print("\n" + "=" * 60)
print("NEXT STEPS")
print("=" * 60)
print("1. Implement forward propagation for deep networks")
print("2. Implement backward propagation with chain rule")
print("3. Add regularization techniques")
print("4. Implement training loops and optimization")
print("5. Test on real datasets")

## Lab Cleanup Instructions

### Windows Users:
1. Close all Jupyter notebook tabs
2. Press `Ctrl+C` in the command prompt to stop the Jupyter server
3. Type `conda deactivate` or `deactivate` to exit the virtual environment
4. Close command prompt

### Mac Users:
1. Close all Jupyter notebook tabs
2. Press `Ctrl+C` in the terminal to stop the Jupyter server
3. Type `conda deactivate` or `deactivate` to exit the virtual environment
4. Close terminal

### Save Your Work:
- Jupyter autosaves the notebook periodically; press `Ctrl+S` (`Cmd+S` on Mac) to save immediately
- Consider saving a copy with your name: `lab_4_1_[your_name].ipynb`
- Export as HTML for offline viewing: File → Download as → HTML

## Troubleshooting Guide

### Common Issues and Solutions:

**Issue 1: Import errors (NumPy, Matplotlib)**
- **Solution**: Install missing packages: `pip install numpy matplotlib`
- **Windows**: Use `conda install numpy matplotlib` if using Anaconda
- **Mac**: Same as Windows, or use `pip3` instead of `pip`

**Issue 2: Memory errors with large networks**
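- **Solution**: Reduce the number of layers or the layer widths; the parameter count grows with the product of adjacent layer sizes
- **Alternative**: Store weights as `float32` instead of NumPy's default `float64` to halve parameter memory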
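
Parameter memory can be estimated from the layer sizes before building anything. The sketch below assumes NumPy's default `float64` storage (8 bytes per value) and counts parameters only, not gradients or plots; the helper name `parameter_bytes` is hypothetical.

```python
def parameter_bytes(layer_sizes, bytes_per_value=8):
    """Return (parameter_count, bytes) for a fully connected network."""
    params = sum(
        n_out * n_in + n_out  # weights plus biases for one layer
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    )
    return params, params * bytes_per_value

for arch in ([784, 256, 128, 64, 32, 10], [784, 2048, 1024, 512, 10]):
    params, nbytes = parameter_bytes(arch)
    print(f"{arch}: {params:,} parameters, about {nbytes / 1024**2:.1f} MiB as float64")
```

Even the larger example needs only tens of megabytes for its parameters, so a memory error in this lab is more likely to come from an accidental very wide layer than from the example networks.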

**Issue 3: Slow execution**
- **Solution**: Reduce network complexity or visualization details
- **Check**: Available RAM and close other applications

**Issue 4: Visualization not showing**
- **Solution**: Run `%matplotlib inline` in a cell
- **Alternative**: Try `plt.show()` after each plot

**Issue 5: Random seed not working**
- **Solution**: Run the seed-setting cell before creating a network, or pass `random_seed` to `DeepNeuralNetwork`
- **Check**: Use the same seed value in every run you want to compare
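
A quick way to confirm that seeding behaves as expected is to draw the same weights twice. This is a minimal sketch using the same legacy `np.random.seed` API as the lab; the helper name `draw_weights` is hypothetical.

```python
import numpy as np

def draw_weights(seed, shape=(3, 2)):
    """Reset the global NumPy seed, then draw a small weight matrix."""
    np.random.seed(seed)
    return np.random.normal(0, 0.01, shape)

first = draw_weights(42)
second = draw_weights(42)
other = draw_weights(7)

print("Same seed, same weights:", np.array_equal(first, second))
print("Different seed, same weights:", np.array_equal(first, other))
```

If the first line prints `False` in your notebook, some other cell is drawing random numbers between the seed call and the weight creation.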

### Getting Help:
- Check the error message carefully
- Try restarting the kernel: Kernel → Restart
- Ask instructor or teaching assistant
- Refer to NumPy documentation: https://numpy.org/doc/