# Neural Networks Fundamentals

In this notebook, we'll explore neural networks, a powerful class of machine learning models loosely inspired by the human brain. We'll work with a simple neural network written from scratch (the `NeuralNetwork` class imported from `models.neural_network`) to understand its core components and how they work together.

We'll cover:
1. Understanding neural network architecture and components
2. Implementing forward and backward propagation
3. Training a neural network
4. Visualizing learning curves and decision boundaries
5. Comparing with scikit-learn's implementation
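
To make the forward and backward propagation item concrete, here is a minimal NumPy sketch of one pass through a 2-4-1 network. It is not the `NeuralNetwork` class used in this notebook, whose source is not shown here. The sigmoid activations, the binary cross-entropy cost, and the function names `forward`, `cost`, and `backward` are all assumptions made for illustration. The last lines compare one analytic gradient against a finite-difference estimate, which is a common way to check a backpropagation implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, params):
    """Forward pass through one hidden layer (assumed sigmoid activations)."""
    W1, b1, W2, b2 = params
    A1 = sigmoid(X @ W1 + b1)    # hidden activations, shape (n, hidden)
    A2 = sigmoid(A1 @ W2 + b2)   # output probabilities, shape (n, 1)
    return A1, A2

def cost(X, y, params):
    """Mean binary cross-entropy (assumed cost function)."""
    _, A2 = forward(X, params)
    eps = 1e-12                  # guards against log(0)
    return float(-np.mean(y * np.log(A2 + eps) + (1 - y) * np.log(1 - A2 + eps)))

def backward(X, y, params):
    """Gradients of the cost with respect to W1, b1, W2, b2."""
    W1, b1, W2, b2 = params
    n = X.shape[0]
    A1, A2 = forward(X, params)
    dZ2 = (A2 - y) / n                      # sigmoid output + cross-entropy simplifies to this
    dW2 = A1.T @ dZ2
    db2 = dZ2.sum(axis=0)
    dZ1 = (dZ2 @ W2.T) * A1 * (1.0 - A1)    # chain rule through the hidden sigmoid
    dW1 = X.T @ dZ1
    db1 = dZ1.sum(axis=0)
    return dW1, db1, dW2, db2

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5, 2))
y_demo = rng.integers(0, 2, size=(5, 1)).astype(float)
params = [rng.normal(size=(2, 4)), np.zeros(4), rng.normal(size=(4, 1)), np.zeros(1)]

grads = backward(X_demo, y_demo, params)

# Finite-difference check of a single weight, W1[0, 0]
h = 1e-5
W1_plus, W1_minus = params[0].copy(), params[0].copy()
W1_plus[0, 0] += h
W1_minus[0, 0] -= h
numeric = (cost(X_demo, y_demo, [W1_plus, *params[1:]])
           - cost(X_demo, y_demo, [W1_minus, *params[1:]])) / (2 * h)
print(f"analytic: {grads[0][0, 0]:.8f}  numeric: {numeric:.8f}")
```

If the two printed values agree to several decimal places, the backward pass is consistent with the cost function.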

In [19]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
import sys
import os

# Add the parent directory to sys.path to import custom modules
sys.path.append(os.path.join(os.getcwd(), '..'))

# Import custom modules
from models.neural_network import NeuralNetwork
from utils.data_generator import generate_nonlinear_data
from utils.plotting import plot_decision_boundary, plot_learning_curve

# Set random seed for reproducibility
np.random.seed(42)

## 1. Generate Synthetic Data

We'll create a synthetic dataset with a non-linear decision boundary to demonstrate the capabilities of neural networks.
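
The source of `generate_nonlinear_data` is not shown in this notebook, so the sketch below is only a guess at the kind of data it might produce: two interleaving half-circles ("two moons") with Gaussian noise, which no straight line can separate. The function name `make_two_moons` is hypothetical, and the parameters simply mirror the call in the next cell.

```python
import numpy as np

def make_two_moons(n_samples=1000, noise=0.1, random_state=42):
    """Hypothetical stand-in generator: two interleaving half-circles with Gaussian noise."""
    rng = np.random.default_rng(random_state)
    n_upper = n_samples // 2
    n_lower = n_samples - n_upper

    # Angles along each half-circle
    t_upper = rng.uniform(0.0, np.pi, n_upper)
    t_lower = rng.uniform(0.0, np.pi, n_lower)

    upper = np.column_stack([np.cos(t_upper), np.sin(t_upper)])              # class 0
    lower = np.column_stack([1.0 - np.cos(t_lower), 0.5 - np.sin(t_lower)])  # class 1

    X = np.vstack([upper, lower]) + rng.normal(0.0, noise, size=(n_samples, 2))
    y = np.concatenate([np.zeros(n_upper, dtype=int), np.ones(n_lower, dtype=int)])

    # Shuffle so the classes are not stored in contiguous blocks
    order = rng.permutation(n_samples)
    return X[order], y[order]

X_moons, y_moons = make_two_moons()
print(X_moons.shape, y_moons.shape)
```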

In [None]:
# Generate synthetic data
X, y = generate_nonlinear_data(n_samples=1000, noise=0.1, random_state=42)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Plot the data
plt.figure(figsize=(10, 8))
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='viridis')
plt.colorbar()
plt.title('Synthetic Classification Dataset')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.show()

## 2. Train Our Custom Neural Network

Now we'll train our custom neural network implementation on the synthetic data.
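
When scoring thresholded predictions against labels, array shapes matter. If `predict` returns a column vector of shape `(n, 1)` while the labels have shape `(n,)`, NumPy broadcasts the comparison to an `(n, n)` matrix and the mean is no longer an accuracy. Whether the custom `NeuralNetwork.predict` returns a column vector is an assumption, since its source is not shown. The toy arrays below only illustrate the pitfall.

```python
import numpy as np

y_true = np.array([0, 1, 1, 0])
probs = np.array([[0.2], [0.9], [0.4], [0.1]])   # column vector, shape (4, 1)

# Comparing (4, 1) with (4,) broadcasts to a (4, 4) matrix
broadcast = (probs > 0.5) == y_true
print(broadcast.shape)   # (4, 4)

# Flattening first gives the intended element-wise comparison
labels = (probs.ravel() > 0.5).astype(int)
accuracy = np.mean(labels == y_true)
print(accuracy)          # 0.75
```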

In [None]:
# Create and train our custom neural network
nn = NeuralNetwork(layer_sizes=[2, 4, 1], learning_rate=0.01, epochs=1000)
nn.fit(X_train, y_train)

# Make predictions
y_pred = nn.predict(X_test)

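# Calculate accuracy (flatten both arrays so a column-vector output cannot broadcast against y_test)
accuracy = np.mean((np.ravel(y_pred) > 0.5) == np.ravel(y_test))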
print(f'Test accuracy: {accuracy:.4f}')

# Plot learning curve
plt.figure(figsize=(10, 6))
epochs = range(1, len(nn.cost_history) + 1)
plt.plot(epochs, nn.cost_history, 'b-', label='Training Cost')
plt.title('Neural Network Learning Curve')
plt.xlabel('Epochs')
plt.ylabel('Cost')
plt.legend()
plt.grid(True)
plt.show()

## 3. Visualize Decision Boundary

Let's visualize the decision boundary learned by our custom neural network.
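
The `plot_decision_boundary` helper comes from `utils.plotting`, which is not shown here. A common way to draw such a plot is to predict on a dense grid covering the data and shade the grid by predicted class. The sketch below follows that idea. `predict_on_grid`, `draw_decision_boundary`, and the stand-in `ThresholdRule` model are hypothetical names, and the 0.5 threshold assumes the model returns either probabilities or 0/1 labels.

```python
import numpy as np

def predict_on_grid(model, X, step=0.02, margin=0.5):
    """Evaluate model.predict on a regular grid covering the data, thresholded at 0.5."""
    x_min, x_max = X[:, 0].min() - margin, X[:, 0].max() + margin
    y_min, y_max = X[:, 1].min() - margin, X[:, 1].max() + margin
    xx, yy = np.meshgrid(np.arange(x_min, x_max, step),
                         np.arange(y_min, y_max, step))
    grid = np.c_[xx.ravel(), yy.ravel()]
    Z = np.asarray(model.predict(grid)).ravel()
    Z = (Z > 0.5).astype(int).reshape(xx.shape)
    return xx, yy, Z

def draw_decision_boundary(ax, model, X, y, title=""):
    """Shade predicted class regions on a Matplotlib Axes and overlay the data points."""
    xx, yy, Z = predict_on_grid(model, X)
    ax.contourf(xx, yy, Z, alpha=0.4)
    ax.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
    ax.set_title(title)

class ThresholdRule:
    """Stand-in model for the demo: predicts class 1 when feature 1 is positive."""
    def predict(self, X):
        return (X[:, 0] > 0).astype(float)

X_demo = np.array([[-1.0, -1.0], [1.0, 1.0]])
xx, yy, Z = predict_on_grid(ThresholdRule(), X_demo, step=0.5)
print(Z.shape)
```

With Matplotlib imported as in the first cell, `fig, ax = plt.subplots()` followed by `draw_decision_boundary(ax, nn, X_test, y_test)` would produce a plot along these lines.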

In [None]:
# Plot decision boundary for our custom implementation
plot_decision_boundary(X_test, y_test, nn, title='Custom Neural Network Decision Boundary')

## 4. Compare with Scikit-learn

Let's compare our implementation with scikit-learn's neural network classifier.
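
One caveat for this comparison: by default `MLPClassifier` uses ReLU hidden units and the Adam optimizer, so matching only the hidden-layer size and learning rate does not make the two models equivalent. The sketch below fits the default configuration next to one that is closer to a textbook from-scratch network. It assumes the custom network uses sigmoid activations and full-batch gradient descent without momentum, which this notebook does not confirm. It uses scikit-learn's `make_moons` as stand-in data rather than `generate_nonlinear_data`.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Stand-in data (assumption: not the notebook's own generator)
X_demo, y_demo = make_moons(n_samples=400, noise=0.1, random_state=42)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42)

# Library defaults: ReLU activation, Adam solver
default_mlp = MLPClassifier(hidden_layer_sizes=(4,), max_iter=1000,
                            learning_rate_init=0.01, random_state=42)

# Closer to plain gradient descent with sigmoid units (assumed custom setup)
matched_mlp = MLPClassifier(hidden_layer_sizes=(4,), activation="logistic",
                            solver="sgd", momentum=0.0,
                            batch_size=len(Xd_train), max_iter=1000,
                            learning_rate_init=0.01, random_state=42)

scores = {}
for name, clf in [("default (relu + adam)", default_mlp),
                  ("matched (logistic + sgd)", matched_mlp)]:
    clf.fit(Xd_train, yd_train)
    scores[name] = clf.score(Xd_test, yd_test)
    print(f"{name}: test accuracy = {scores[name]:.4f}")
```

A large gap between the two rows would suggest that optimizer and activation choices, rather than the implementation itself, explain differences from the custom network.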

In [None]:
# Create and train scikit-learn's neural network (library defaults: ReLU hidden activation, Adam solver)
sklearn_nn = MLPClassifier(hidden_layer_sizes=(4,), max_iter=1000, learning_rate_init=0.01, random_state=42)
sklearn_nn.fit(X_train, y_train)

# Make predictions
sklearn_y_pred = sklearn_nn.predict(X_test)

# Calculate accuracy
sklearn_accuracy = np.mean(sklearn_y_pred == y_test)
print(f'Scikit-learn Neural Network accuracy: {sklearn_accuracy:.4f}')

# Plot decision boundary
plot_decision_boundary(X_test, y_test, sklearn_nn, title='Scikit-learn Neural Network Decision Boundary')

## 5. Effect of Network Architecture

Let's explore how different network architectures affect the model's performance and decision boundary.
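
A quick way to compare the capacity of the architectures tried below is to count their trainable parameters. The sketch assumes fully connected layers with one bias per unit. The helper name `count_parameters` is hypothetical and is not part of the custom `NeuralNetwork` class.

```python
def count_parameters(layer_sizes):
    """Weights plus biases for a fully connected network with the given layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

candidate_architectures = [[2, 2, 1], [2, 4, 1], [2, 8, 1], [2, 4, 4, 1]]
param_counts = {tuple(arch): count_parameters(arch) for arch in candidate_architectures}
for arch, n_params in param_counts.items():
    print(list(arch), n_params)
# [2, 2, 1] 9
# [2, 4, 1] 17
# [2, 8, 1] 33
# [2, 4, 4, 1] 37
```

Note that the two-hidden-layer network `[2, 4, 4, 1]` has only slightly more parameters than `[2, 8, 1]`, so any difference between their decision boundaries comes mostly from depth rather than parameter count.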

In [None]:
architectures = [[2, 2, 1], [2, 4, 1], [2, 8, 1], [2, 4, 4, 1]]
fig, axes = plt.subplots(2, 2, figsize=(15, 10))
axes = axes.ravel()  # Flatten the 2x2 array of axes

for i, arch in enumerate(architectures):
    # Train a separate network per architecture (new name, so the model from section 2 is not overwritten)
    arch_nn = NeuralNetwork(layer_sizes=arch, learning_rate=0.01, epochs=1000)
    arch_nn.fit(X_train, y_train)

    # Create mesh grid for decision boundary
    x_min, x_max = X_test[:, 0].min() - 0.5, X_test[:, 0].max() + 0.5
    y_min, y_max = X_test[:, 1].min() - 0.5, X_test[:, 1].max() + 0.5
    xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02),
                         np.arange(y_min, y_max, 0.02))

    # Predict on the grid and threshold at 0.5 so the contour shows class regions
    Z = np.ravel(arch_nn.predict(np.c_[xx.ravel(), yy.ravel()])) > 0.5
    Z = Z.astype(int).reshape(xx.shape)
    
    # Plot
    axes[i].contourf(xx, yy, Z, alpha=0.4)
    axes[i].scatter(X_test[:, 0], X_test[:, 1], c=y_test, alpha=0.8)
    axes[i].set_title(f'Decision Boundary (Architecture: {arch})')
    axes[i].set_xlabel('Feature 1')
    axes[i].set_ylabel('Feature 2')

plt.tight_layout()
plt.show()

## Conclusion

In this notebook, we've explored neural networks by:
1. Implementing a neural network from scratch
2. Training it on synthetic data
3. Visualizing its decision boundaries
4. Comparing it with scikit-learn's implementation
5. Analyzing the effect of network architecture on model performance

Key takeaways:
- Neural networks can learn complex, non-linear decision boundaries
- Network architecture (number and size of layers) affects model capacity
- Learning rate and number of epochs are important hyperparameters
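- With the same hidden-layer size and learning rate, our implementation is expected to perform similarly to scikit-learn's `MLPClassifier`; compare the two test accuracies printed above to confirm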