# 5. Hands-on Exercises

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/maleehahassan/NNBuildingBlocksTeachingPt1/blob/main/content/05_hands_on_exercises.ipynb)

## Workshop Summary & Practice

Congratulations! You've learned about the fundamental building blocks of neural networks:

1. **Supervised Learning** - Learning from labeled examples
2. **Perceptron** - The first artificial neuron and its limitations
3. **Activation Functions** - Adding non-linearity to enable complex patterns
4. **Loss Functions** - Measuring and minimizing prediction errors

Now it's time to **put it all together** with hands-on exercises!

## Exercise Overview

In the next 10 minutes, you'll:
- Build a simple neural network from scratch
- Experiment with different activation functions
- Compare different loss functions
- See how all the concepts work together

### 🎯 Learning Objectives
- Apply the concepts you've learned
- Build intuition through experimentation
- Understand how components interact
- Gain confidence in neural network basics

In [1]:
# Setup - Import required libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification, make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import warnings
warnings.filterwarnings('ignore')

# Set random seed for reproducibility
np.random.seed(42)

print("🚀 Welcome to the Neural Networks Building Blocks Workshop!")
print("Let's put everything we've learned into practice!")
print("\n" + "="*60)

## Exercise 1: Build Your First Neural Network (3 minutes)

Let's build a simple neural network class that incorporates all the concepts we've learned.

### 🎯 Your Task:
Read through the neural network implementation below and make sure you can explain what each method does. The forward pass is already filled in as a reference solution.
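
For reference while reading the class, the forward pass and the output-layer weight gradients can be written out as follows. This is a sketch under stated assumptions: $g$ stands for the chosen hidden activation, $\sigma$ for the sigmoid, $\hat{y}$ for the network output, $m$ for the number of samples, and there is a single output neuron. These symbols are introduced here for illustration only.

$$
\begin{aligned}
z_1 &= X W_1 + b_1, \qquad a_1 = g(z_1), \qquad z_2 = a_1 W_2 + b_2, \\
\hat{y} &= \sigma(z_2) \ \text{(classification)} \quad \text{or} \quad \hat{y} = z_2 \ \text{(regression)}, \\[4pt]
L_{\text{BCE}} &= -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log \hat{y}_i + (1 - y_i) \log (1 - \hat{y}_i) \right]
  \;\Rightarrow\; \frac{\partial L_{\text{BCE}}}{\partial W_2} = \frac{1}{m}\, a_1^{\top} (\hat{y} - y), \\
L_{\text{MSE}} &= \frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i)^2
  \;\Rightarrow\; \frac{\partial L_{\text{MSE}}}{\partial W_2} = \frac{2}{m}\, a_1^{\top} (\hat{y} - y).
\end{aligned}
$$

In both cases the averaging factor $1/m$ appears exactly once in the gradient, which is a useful thing to check when reading a backward pass.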

In [None]:
class SimpleNeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size, activation='relu', loss='mse'):
        """
        Initialize a simple 2-layer neural network
        
        Parameters:
        - input_size: number of input features
        - hidden_size: number of neurons in hidden layer
        - output_size: number of output neurons
        - activation: activation function ('relu', 'sigmoid', 'tanh')
        - loss: loss function ('mse', 'binary_crossentropy')
        """
        # Initialize weights randomly
        self.W1 = np.random.randn(input_size, hidden_size) * 0.5
        self.b1 = np.zeros((1, hidden_size))
        self.W2 = np.random.randn(hidden_size, output_size) * 0.5
        self.b2 = np.zeros((1, output_size))
        
        self.activation = activation
        self.loss_function = loss
        self.losses = []
    
    def activate(self, z, activation=None):
        """Apply activation function"""
        if activation is None:
            activation = self.activation
            
        if activation == 'relu':
            return np.maximum(0, z)
        elif activation == 'sigmoid':
            return 1 / (1 + np.exp(-np.clip(z, -250, 250)))
        elif activation == 'tanh':
            return np.tanh(z)
        else:
            return z
    
    def activate_derivative(self, z, activation=None):
        """Compute derivative of activation function"""
        if activation is None:
            activation = self.activation
            
        if activation == 'relu':
            return (z > 0).astype(float)
        elif activation == 'sigmoid':
            s = self.activate(z, 'sigmoid')
            return s * (1 - s)
        elif activation == 'tanh':
            return 1 - np.tanh(z)**2
        else:
            return np.ones_like(z)
    
    def forward(self, X):
        """Forward propagation"""
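        # TODO: Implement forward pass (a reference solution is filled in below)
        # Hint: z1 = X @ W1 + b1, a1 = activate(z1), z2 = a1 @ W2 + b2;
        # the output is sigmoid(z2) for classification and z2 itself for regression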
        
        # Layer 1
        self.z1 = X @ self.W1 + self.b1
        self.a1 = self.activate(self.z1)
        
        # Layer 2 (output)
        self.z2 = self.a1 @ self.W2 + self.b2
        if self.loss_function == 'binary_crossentropy':
            self.a2 = self.activate(self.z2, 'sigmoid')  # Sigmoid for classification
        else:
            self.a2 = self.z2  # Linear for regression
            
        return self.a2
    
    def compute_loss(self, y_true, y_pred):
        """Compute loss"""
        if self.loss_function == 'mse':
            return np.mean((y_true - y_pred)**2)
        elif self.loss_function == 'binary_crossentropy':
            y_pred_clipped = np.clip(y_pred, 1e-15, 1 - 1e-15)
            return -np.mean(y_true * np.log(y_pred_clipped) + 
                          (1 - y_true) * np.log(1 - y_pred_clipped))
    
    def backward(self, X, y):
        """Backward propagation"""
        m = X.shape[0]
        
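        # Output layer gradients. dz2 holds per-sample gradients; the 1/m
        # averaging is applied once, when forming dW2 and db2.
        if self.loss_function == 'binary_crossentropy':
            dz2 = self.a2 - y  # sigmoid + cross-entropy simplifies to (a2 - y)
        else:
            dz2 = 2 * (self.a2 - y)  # derivative of the squared error
        
        dW2 = self.a1.T @ dz2 / m
        db2 = np.sum(dz2, axis=0, keepdims=True) / m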
        
        # Hidden layer gradients
        da1 = dz2 @ self.W2.T
        dz1 = da1 * self.activate_derivative(self.z1)
        dW1 = X.T @ dz1 / m
        db1 = np.sum(dz1, axis=0, keepdims=True) / m
        
        return dW1, db1, dW2, db2
    
    def train(self, X, y, epochs=1000, learning_rate=0.01, verbose=False):
        """Train the neural network"""
        for epoch in range(epochs):
            # Forward pass
            predictions = self.forward(X)
            
            # Compute loss
            loss = self.compute_loss(y, predictions)
            self.losses.append(loss)
            
            # Backward pass
            dW1, db1, dW2, db2 = self.backward(X, y)
            
            # Update weights
            self.W1 -= learning_rate * dW1
            self.b1 -= learning_rate * db1
            self.W2 -= learning_rate * dW2
            self.b2 -= learning_rate * db2
            
            if verbose and (epoch + 1) % 100 == 0:
                print(f"Epoch {epoch + 1}, Loss: {loss:.4f}")
    
    def predict(self, X):
        """Make predictions"""
        return self.forward(X)

print("✅ Neural Network class implemented!")
print("Now let's test it on some data...")

## Exercise 2: Activation Function Comparison (3 minutes)

Let's see how different activation functions perform on the same problem.

### 🎯 Your Task:
Run the code below and observe how different activation functions affect learning.
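
One reason activations can train at different speeds is the size of their derivatives, since backpropagation multiplies by them. The sketch below evaluates the same derivative formulas used in `activate_derivative` at a few pre-activation values; the values in `z` and the helper name `sigmoid` are chosen here for illustration and are not part of the workshop data.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, used only for this illustration."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical pre-activation values, from strongly negative to strongly positive.
z = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])

# Derivative of each activation with respect to z.
grads = {
    "relu": (z > 0).astype(float),
    "sigmoid": sigmoid(z) * (1.0 - sigmoid(z)),
    "tanh": 1.0 - np.tanh(z) ** 2,
}

for name, g in grads.items():
    print(f"{name:8s} largest derivative={g.max():.3f}, derivative at z=4: {g[-1]:.4f}")
```

The sigmoid derivative never exceeds 0.25, and both sigmoid and tanh derivatives shrink toward zero for large |z|, while ReLU passes a derivative of 1 for any positive input. Keep this in mind when comparing the learning curves.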

In [None]:
# Generate a classification dataset
X, y = make_classification(n_samples=1000, n_features=2, n_redundant=0, 
                          n_informative=2, n_clusters_per_class=1, 
                          random_state=42)
y = y.reshape(-1, 1)  # Reshape for our network

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the data
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(f"Training set: {X_train_scaled.shape[0]} samples")
print(f"Test set: {X_test_scaled.shape[0]} samples")
print(f"Features: {X_train_scaled.shape[1]}")

# Test different activation functions
activations = ['relu', 'sigmoid', 'tanh']
networks = {}
results = {}

print("\n🧪 Training networks with different activation functions...")

for activation in activations:
    print(f"\nTraining with {activation.upper()} activation:")
    
    # Create and train network
    network = SimpleNeuralNetwork(input_size=2, hidden_size=10, output_size=1, 
                                 activation=activation, loss='binary_crossentropy')
    
    network.train(X_train_scaled, y_train, epochs=500, learning_rate=0.1, verbose=True)
    
    # Make predictions
    train_pred = network.predict(X_train_scaled)
    test_pred = network.predict(X_test_scaled)
    
    # Calculate accuracy
    train_acc = np.mean((train_pred > 0.5) == y_train)
    test_acc = np.mean((test_pred > 0.5) == y_test)
    
    networks[activation] = network
    results[activation] = {'train_acc': train_acc, 'test_acc': test_acc}
    
    print(f"Final - Train Accuracy: {train_acc:.3f}, Test Accuracy: {test_acc:.3f}")

print("\n" + "="*50)
print("📊 ACTIVATION FUNCTION COMPARISON RESULTS:")
print("="*50)
for activation in activations:
    train_acc = results[activation]['train_acc']
    test_acc = results[activation]['test_acc']
    print(f"{activation.upper():8s}: Train={train_acc:.3f}, Test={test_acc:.3f}")

In [None]:
# Visualize the results
fig, axes = plt.subplots(2, 2, figsize=(15, 12))

# Plot 1: Learning curves
ax1 = axes[0, 0]
for activation in activations:
    network = networks[activation]
    ax1.plot(network.losses, label=f'{activation.upper()}', linewidth=2)

ax1.set_xlabel('Epoch')
ax1.set_ylabel('Loss')
ax1.set_title('Learning Curves: Different Activation Functions', fontweight='bold')
ax1.legend()
ax1.grid(True, alpha=0.3)
ax1.set_yscale('log')

# Plot 2: Accuracy comparison
ax2 = axes[0, 1]
train_accs = [results[act]['train_acc'] for act in activations]
test_accs = [results[act]['test_acc'] for act in activations]

x = np.arange(len(activations))
width = 0.35

ax2.bar(x - width/2, train_accs, width, label='Train Accuracy', alpha=0.8)
ax2.bar(x + width/2, test_accs, width, label='Test Accuracy', alpha=0.8)

ax2.set_ylabel('Accuracy')
ax2.set_title('Final Accuracy Comparison', fontweight='bold')
ax2.set_xticks(x)
ax2.set_xticklabels([act.upper() for act in activations])
ax2.legend()
ax2.grid(True, alpha=0.3)
ax2.set_ylim(0, 1)

# Plot 3 & 4: Decision boundaries for best and worst performing models
best_activation = max(activations, key=lambda x: results[x]['test_acc'])
worst_activation = min(activations, key=lambda x: results[x]['test_acc'])

for i, (activation, title) in enumerate([(best_activation, 'Best Performer'), 
                                        (worst_activation, 'Worst Performer')]):
    ax = axes[1, i]
    network = networks[activation]
    
    # Create decision boundary
    h = 0.02
    x_min, x_max = X_test_scaled[:, 0].min() - 1, X_test_scaled[:, 0].max() + 1
    y_min, y_max = X_test_scaled[:, 1].min() - 1, X_test_scaled[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                        np.arange(y_min, y_max, h))
    
    mesh_points = np.c_[xx.ravel(), yy.ravel()]
    Z = network.predict(mesh_points)
    Z = Z.reshape(xx.shape)
    
    ax.contourf(xx, yy, Z, levels=50, alpha=0.8, cmap='RdYlBu')
    scatter = ax.scatter(X_test_scaled[:, 0], X_test_scaled[:, 1], 
                        c=y_test.ravel(), cmap='RdYlBu', edgecolors='black')
    
    test_acc = results[activation]['test_acc']
    ax.set_title(f'{title}: {activation.upper()}\nTest Accuracy: {test_acc:.3f}', 
                fontweight='bold')
    ax.set_xlabel('Feature 1')
    ax.set_ylabel('Feature 2')

plt.tight_layout()
plt.show()

print(f"\n🏆 Best performing activation: {best_activation.upper()}")
print("🔧 Try experimenting with different network architectures!")

## Exercise 3: Loss Function Impact (2 minutes)

Let's see how different loss functions affect regression performance.

### 🎯 Your Task:
Observe how MSE vs MAE loss functions handle outliers differently.
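
Before training full networks, it can help to see the outlier effect in the simplest possible setting: predicting one constant for every sample. The sketch below uses a small made-up target vector (an assumption for illustration, not the exercise data) and searches a grid of candidate constants for the one each loss prefers.

```python
import numpy as np

# Hypothetical targets: four ordinary values and one outlier.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Candidate constant predictions on a grid with step 0.01.
candidates = np.linspace(0.0, 100.0, 10001)

# Row i holds the errors made by predicting candidates[i] for every target.
errors = y[None, :] - candidates[:, None]
mse = np.mean(errors ** 2, axis=1)
mae = np.mean(np.abs(errors), axis=1)

best_mse = candidates[np.argmin(mse)]
best_mae = candidates[np.argmin(mae)]

print(f"Best constant under MSE: {best_mse:.2f} (mean of y   = {y.mean():.2f})")
print(f"Best constant under MAE: {best_mae:.2f} (median of y = {np.median(y):.2f})")
```

The MSE-optimal constant lands on the mean, which the single outlier drags far from the bulk of the data, while the MAE-optimal constant lands on the median and stays with the ordinary values.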

In [None]:
# Generate regression data with outliers
np.random.seed(42)
X_reg, y_reg = make_regression(n_samples=200, n_features=1, noise=10, random_state=42)

# Add some outliers
outlier_indices = np.random.choice(len(y_reg), size=20, replace=False)
y_reg[outlier_indices] += np.random.normal(0, 50, 20)  # Add large noise to create outliers

y_reg = y_reg.reshape(-1, 1)

# Split data
X_train_reg, X_test_reg, y_train_reg, y_test_reg = train_test_split(
    X_reg, y_reg, test_size=0.2, random_state=42)

# Standardize
scaler_X = StandardScaler()
scaler_y = StandardScaler()
X_train_reg_scaled = scaler_X.fit_transform(X_train_reg)
X_test_reg_scaled = scaler_X.transform(X_test_reg)
y_train_reg_scaled = scaler_y.fit_transform(y_train_reg)
y_test_reg_scaled = scaler_y.transform(y_test_reg)

print("🎯 Training regression models with different loss functions...")

# We'll implement MAE by subclassing the network and overriding the loss and its gradient
class MAENeuralNetwork(SimpleNeuralNetwork):
    def compute_loss(self, y_true, y_pred):
        return np.mean(np.abs(y_true - y_pred))
    
    def backward(self, X, y):
        m = X.shape[0]
        
        # MAE gradient: sign of the error per sample; the 1/m averaging
        # is applied once, when forming dW2 and db2 below
        dz2 = np.sign(self.a2 - y)
        dW2 = self.a1.T @ dz2 / m
        db2 = np.sum(dz2, axis=0, keepdims=True) / m
        
        da1 = dz2 @ self.W2.T
        dz1 = da1 * self.activate_derivative(self.z1)
        dW1 = X.T @ dz1 / m
        db1 = np.sum(dz1, axis=0, keepdims=True) / m
        
        return dW1, db1, dW2, db2

# Train with MSE
print("\nTraining with MSE loss:")
mse_network = SimpleNeuralNetwork(input_size=1, hidden_size=20, output_size=1, 
                                 activation='relu', loss='mse')
mse_network.train(X_train_reg_scaled, y_train_reg_scaled, epochs=1000, 
                 learning_rate=0.01, verbose=True)

# Train with MAE
print("\nTraining with MAE loss:")
mae_network = MAENeuralNetwork(input_size=1, hidden_size=20, output_size=1, 
                              activation='relu', loss='mse')  # We override the loss anyway
mae_network.train(X_train_reg_scaled, y_train_reg_scaled, epochs=1000, 
                 learning_rate=0.01, verbose=True)

# Make predictions
mse_pred_train = mse_network.predict(X_train_reg_scaled)
mse_pred_test = mse_network.predict(X_test_reg_scaled)
mae_pred_train = mae_network.predict(X_train_reg_scaled)
mae_pred_test = mae_network.predict(X_test_reg_scaled)

# Convert back to original scale
mse_pred_train_orig = scaler_y.inverse_transform(mse_pred_train)
mse_pred_test_orig = scaler_y.inverse_transform(mse_pred_test)
mae_pred_train_orig = scaler_y.inverse_transform(mae_pred_train)
mae_pred_test_orig = scaler_y.inverse_transform(mae_pred_test)

y_train_orig = scaler_y.inverse_transform(y_train_reg_scaled)
y_test_orig = scaler_y.inverse_transform(y_test_reg_scaled)

print("\n📊 Regression Results:")
print(f"MSE Network - Train MSE: {np.mean((y_train_orig - mse_pred_train_orig)**2):.2f}")
print(f"MSE Network - Test MSE: {np.mean((y_test_orig - mse_pred_test_orig)**2):.2f}")
print(f"MAE Network - Train MSE: {np.mean((y_train_orig - mae_pred_train_orig)**2):.2f}")
print(f"MAE Network - Test MSE: {np.mean((y_test_orig - mae_pred_test_orig)**2):.2f}")

In [None]:
# Visualize regression results
fig, axes = plt.subplots(2, 2, figsize=(15, 12))

# Plot 1: Learning curves
ax1 = axes[0, 0]
ax1.plot(mse_network.losses, 'b-', label='MSE Loss', linewidth=2)
ax1.plot(mae_network.losses, 'r-', label='MAE Loss', linewidth=2)
ax1.set_xlabel('Epoch')
ax1.set_ylabel('Loss')
ax1.set_title('Learning Curves: MSE vs MAE', fontweight='bold')
ax1.legend()
ax1.grid(True, alpha=0.3)
ax1.set_yscale('log')

# Plot 2: Training data predictions
ax2 = axes[0, 1]
sort_idx = np.argsort(X_train_reg.ravel())
ax2.scatter(X_train_reg, y_train_orig, alpha=0.6, label='True Data', color='gray')
ax2.plot(X_train_reg[sort_idx], mse_pred_train_orig[sort_idx], 'b-', 
         label='MSE Prediction', linewidth=2)
ax2.plot(X_train_reg[sort_idx], mae_pred_train_orig[sort_idx], 'r-', 
         label='MAE Prediction', linewidth=2)
ax2.set_xlabel('Input Feature')
ax2.set_ylabel('Target Value')
ax2.set_title('Training Data Predictions', fontweight='bold')
ax2.legend()
ax2.grid(True, alpha=0.3)

# Plot 3: Test data predictions
ax3 = axes[1, 0]
sort_idx_test = np.argsort(X_test_reg.ravel())
ax3.scatter(X_test_reg, y_test_orig, alpha=0.6, label='True Data', color='gray')
ax3.plot(X_test_reg[sort_idx_test], mse_pred_test_orig[sort_idx_test], 'b-', 
         label='MSE Prediction', linewidth=2)
ax3.plot(X_test_reg[sort_idx_test], mae_pred_test_orig[sort_idx_test], 'r-', 
         label='MAE Prediction', linewidth=2)
ax3.set_xlabel('Input Feature')
ax3.set_ylabel('Target Value')
ax3.set_title('Test Data Predictions', fontweight='bold')
ax3.legend()
ax3.grid(True, alpha=0.3)

# Plot 4: Error analysis
ax4 = axes[1, 1]
mse_errors = np.abs(y_test_orig.ravel() - mse_pred_test_orig.ravel())
mae_errors = np.abs(y_test_orig.ravel() - mae_pred_test_orig.ravel())

ax4.scatter(range(len(mse_errors)), mse_errors, alpha=0.7, label='MSE Model Errors')
ax4.scatter(range(len(mae_errors)), mae_errors, alpha=0.7, label='MAE Model Errors')
ax4.axhline(y=np.mean(mse_errors), color='blue', linestyle='--', 
           label=f'MSE Mean Error: {np.mean(mse_errors):.1f}')
ax4.axhline(y=np.mean(mae_errors), color='red', linestyle='--', 
           label=f'MAE Mean Error: {np.mean(mae_errors):.1f}')
ax4.set_xlabel('Test Sample')
ax4.set_ylabel('Absolute Error')
ax4.set_title('Error Analysis', fontweight='bold')
ax4.legend()
ax4.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\n💡 Key Observations:")
print("• MSE loss is more sensitive to outliers (larger errors get squared)")
print("• MAE loss penalizes errors in proportion to their size, so outliers pull the fit less")
print("• Choice of loss function affects how the model handles outliers")
print("• This is why understanding your data is crucial for loss selection!")

## Exercise 4: Build Your Own Experiment (2 minutes)

Now it's your turn to experiment! Try modifying the code to explore different scenarios.

### 🎯 Your Challenges:
Pick one or more of these experiments to try:

In [None]:
print("🧪 EXPERIMENT CHALLENGES:")
print("=" * 40)
print("1. 🏗️  ARCHITECTURE: Try different hidden layer sizes (5, 20, 50)")
print("2. 📚  LEARNING: Experiment with different learning rates (0.001, 0.01, 0.1)")
print("3. 🎯  DATA: Create a more complex dataset (more features, non-linear patterns)")
print("4. 🔧  OPTIMIZATION: Try training for different numbers of epochs")
print("5. 🎨  VISUALIZATION: Plot how decision boundaries change during training")
print()
print("💡 BONUS CHALLENGE: Can you solve the XOR problem we discussed earlier?")
print("   Hint: XOR data looks like: [[0,0]→0, [0,1]→1, [1,0]→1, [1,1]→0]")
print()
print("Choose one experiment and implement it below! 👇")
print("="*60)

# YOUR EXPERIMENT CODE HERE!
# Example: Solving XOR problem

# XOR dataset
X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([[0], [1], [1], [0]])

print("\n🔥 BONUS: Solving the XOR Problem!")
print("Remember: The perceptron couldn't solve this, but our neural network can!")

# Train neural network on XOR
xor_network = SimpleNeuralNetwork(input_size=2, hidden_size=10, output_size=1, 
                                 activation='relu', loss='binary_crossentropy')

print("\nTraining on XOR data...")
xor_network.train(X_xor, y_xor, epochs=1000, learning_rate=0.1, verbose=True)

# Test predictions
xor_predictions = xor_network.predict(X_xor)
xor_binary_pred = (xor_predictions > 0.5).astype(int)

print("\n🎯 XOR Results:")
print("Input | True | Predicted | Probability | Correct?")
print("-" * 50)
for i in range(len(X_xor)):
    input_str = f"{X_xor[i]}"
    true_val = y_xor[i, 0]
    pred_val = xor_binary_pred[i, 0]
    prob_val = xor_predictions[i, 0]
    correct = "✅" if pred_val == true_val else "❌"
    print(f"{input_str:10s} | {true_val:4d} | {pred_val:9d} | {prob_val:11.3f} | {correct}")

accuracy = np.mean(xor_binary_pred == y_xor)
print(f"\n🏆 XOR Accuracy: {accuracy:.1%}")

if accuracy == 1.0:
    print("🎉 CONGRATULATIONS! You've solved the XOR problem!")
    print("🧠 This is what the perceptron couldn't do - you've built something more powerful!")
else:
    print("🔧 Try adjusting the network architecture or training parameters!")

In [None]:
# Visualize XOR solution
if 'xor_network' in locals():
    plt.figure(figsize=(12, 5))
    
    # Plot 1: XOR data and decision boundary
    plt.subplot(1, 2, 1)
    
    # Create decision boundary
    h = 0.01
    x_min, x_max = -0.5, 1.5
    y_min, y_max = -0.5, 1.5
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                        np.arange(y_min, y_max, h))
    
    mesh_points = np.c_[xx.ravel(), yy.ravel()]
    Z = xor_network.predict(mesh_points)
    Z = Z.reshape(xx.shape)
    
    plt.contourf(xx, yy, Z, levels=50, alpha=0.8, cmap='RdYlBu')
    
    # Plot XOR points
    colors = ['red', 'blue', 'blue', 'red']
    for i, (point, color, label) in enumerate(zip(X_xor, colors, y_xor.ravel())):
        plt.scatter(point[0], point[1], c=color, s=200, edgecolors='black', linewidth=3)
        plt.annotate(f'({point[0]},{point[1]})→{label}', 
                    (point[0], point[1]), 
                    xytext=(10, 10), textcoords='offset points', 
                    fontsize=12, fontweight='bold')
    
    plt.xlim(-0.3, 1.3)
    plt.ylim(-0.3, 1.3)
    plt.xlabel('Input 1', fontsize=12)
    plt.ylabel('Input 2', fontsize=12)
    plt.title('XOR Problem Solved!\nNon-linear Decision Boundary', fontweight='bold')
    plt.colorbar(label='Output Probability')
    
    # Plot 2: Learning curve
    plt.subplot(1, 2, 2)
    plt.plot(xor_network.losses, 'b-', linewidth=2)
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.title('XOR Learning Curve', fontweight='bold')
    plt.grid(True, alpha=0.3)
    plt.yscale('log')
    
    plt.tight_layout()
    plt.show()
    
    print("\n🎓 Notice how the neural network creates a non-linear decision boundary!")
    print("🔥 This is the power of hidden layers + non-linear activation functions!")

## 🎉 Workshop Wrap-up & Next Steps

### Congratulations! You've Successfully:

✅ **Understood Supervised Learning** - The foundation of neural networks  
✅ **Implemented a Perceptron** - The first artificial neuron  
✅ **Explored Activation Functions** - The key to non-linearity  
✅ **Mastered Loss Functions** - How networks learn from mistakes  
✅ **Built a Complete Neural Network** - Putting it all together!  

### 🧠 Key Insights You've Gained:

1. **Neural networks are just mathematical functions** that learn to map inputs to outputs
2. **Each component serves a purpose**: perceptrons process information, activations add non-linearity, loss functions guide learning
3. **The magic happens when components work together** - no single part makes a neural network powerful
4. **Understanding fundamentals helps you debug and improve** real-world models

### 🚀 Where to Go From Here:

#### Immediate Next Steps:
- **Experiment more** with the code you've written
- **Try different datasets** from sklearn.datasets
- **Modify network architectures** (more layers, different sizes)
- **Explore hyperparameter tuning** (learning rates, epochs)
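
As a starting point for hyperparameter tuning, a learning-rate sweep might look like the sketch below. To stay self-contained it fits a one-weight linear model by gradient descent on made-up data rather than using the workshop network; the data, the helper name `final_mse`, and the epoch count are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: y = 3x + 1 plus a little noise.
X = rng.normal(size=(100, 1))
y = 3.0 * X + 1.0 + 0.1 * rng.normal(size=(100, 1))

def final_mse(learning_rate, epochs=200):
    """Fit y ~ w*x + b by gradient descent and return the final MSE."""
    w, b = 0.0, 0.0
    m = X.shape[0]
    for _ in range(epochs):
        error = (w * X + b) - y
        # Gradients of the mean squared error with respect to w and b.
        w -= learning_rate * 2.0 * np.sum(error * X) / m
        b -= learning_rate * 2.0 * np.sum(error) / m
    return float(np.mean(((w * X + b) - y) ** 2))

# Same model, same number of epochs, three learning rates.
sweep = {lr: final_mse(lr) for lr in (0.001, 0.01, 0.1)}
for lr, loss in sweep.items():
    print(f"learning_rate={lr:<6} final MSE={loss:.4f}")
```

With a fixed epoch budget, the smallest learning rate has not converged yet, so its final loss is the highest of the three. On harder problems a rate that is too large can diverge instead, which is why sweeping several values is worthwhile.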

#### Advanced Topics to Explore:
- **Convolutional Neural Networks (CNNs)** for image data
- **Recurrent Neural Networks (RNNs)** for sequential data
- **Deep Learning Frameworks** like TensorFlow, PyTorch
- **Regularization Techniques** to prevent overfitting
- **Modern Architectures** like Transformers, ResNets

#### Resources for Continued Learning:
- **Online Courses**: Coursera Deep Learning Specialization, Fast.ai
- **Books**: "Deep Learning" by Goodfellow, Bengio, and Courville; "Neural Networks and Deep Learning" by Nielsen
- **Practice Platforms**: Kaggle, Google Colab
- **Communities**: Reddit r/MachineLearning, AI/ML Discord servers

### 💡 Remember:

- **Start simple, then add complexity** - master the basics first
- **Understand your data** before choosing architectures
- **Experiment and iterate** - ML is empirical
- **Focus on problems you care about** - motivation drives learning

### 🎯 Final Challenge:

Take a problem from your domain and try to formulate it as a supervised learning task:
- What are your inputs and outputs?
- Is it classification or regression?
- What activation and loss functions would you choose?
- How would you evaluate success?
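
One way to organize your answers is to write the pairing down explicitly. The helper below is a hypothetical sketch that only encodes the combinations used in this workshop (sigmoid with binary cross-entropy for binary classification, a linear output with MSE or MAE for regression); the function name and the `has_outliers` flag are assumptions for illustration, not a general rule.

```python
def suggest_output_setup(task, has_outliers=False):
    """Return a hypothetical (output_activation, loss) pairing for a task."""
    if task == "binary_classification":
        return ("sigmoid", "binary_crossentropy")
    if task == "regression":
        # MAE is less sensitive to outliers than MSE.
        return ("linear", "mae" if has_outliers else "mse")
    raise ValueError(f"Unknown task: {task!r}")

print(suggest_output_setup("binary_classification"))
print(suggest_output_setup("regression", has_outliers=True))
```

Problems outside these two cases, such as multi-class classification, need pairings that this workshop did not cover.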

---

## 📚 Workshop Resources

### Code Repository:
- All notebooks are available on GitHub
- Each notebook can be opened directly in Google Colab
- Feel free to fork, modify, and share!

### Contact & Support:
- Workshop materials: [GitHub Repository](https://github.com/maleehahassan/NNBuildingBlocksTeachingPt1)
- Questions? Open an issue in the repository
- Want to contribute? Pull requests welcome!

---

**Thank you for participating in the Neural Networks Building Blocks Workshop!** 🎓

*Remember: Every expert was once a beginner. Keep learning, keep experimenting, and most importantly, keep building!* 🚀