# **BOOTCAMP @ GIKI (Content designed by Usama Arshad) WEEK 3**

---



# Week 3: Fundamentals of Deep Neural Networks

Week 3: Day 11 - Introduction to Deep Learning

# Introduction to Deep Learning

## Basics of Neural Networks
A neural network is a model built from layers of interconnected units (neurons) that learns underlying relationships in a set of data, loosely inspired by the way the human brain operates. Because its weights are learned from data, a neural network can adapt to changing input without the output criteria having to be redesigned.

**When to Use**: Neural networks are used for a wide range of tasks, including image and speech recognition, medical diagnosis, and financial forecasting.
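
To make the idea of a layer of neurons concrete, here is a minimal sketch of one dense layer processing a small batch. The sizes (4 input features, 3 neurons, 5 samples) and the random data are illustrative assumptions, not values taken from a real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration only
n_features, n_hidden, n_samples = 4, 3, 5

X = rng.normal(size=(n_samples, n_features))       # one row per sample
W = rng.normal(size=(n_hidden, n_features)) * 0.1  # one row of weights per neuron
b = np.zeros((n_hidden, 1))                        # one bias per neuron

Z = W @ X.T + b        # weighted sum plus bias, shape (n_hidden, n_samples)
A = np.maximum(0, Z)   # ReLU activation applied element-wise

print("Layer output shape:", A.shape)
```

Each column of `A` holds the outputs of the three neurons for one sample. Stacking several such layers gives a deep network.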

## Activation Functions
Each neuron computes a weighted sum of its inputs and adds a bias. The activation function then transforms that value to determine how strongly the neuron fires. The purpose of the activation function is to introduce non-linearity into the output of a neuron.

**Common Activation Functions**:
- **ReLU (Rectified Linear Unit)**: Outputs the input when it is positive and 0 otherwise. It is cheap to compute and helps mitigate the vanishing gradient problem.
- **Sigmoid**: Maps input values to a range between 0 and 1.
- **Tanh**: Maps input values to a range between -1 and 1.

**When to Use**: Activation functions are used in the hidden layers of a neural network to introduce non-linearity, which allows the network to model complex relationships.
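
The three functions listed above can be sketched in a few lines of NumPy. The helper names `relu` and `sigmoid` and the sample inputs are assumptions made for illustration. The second half of the sketch shows why non-linearity matters: without an activation, two stacked linear layers behave exactly like a single linear layer.

```python
import numpy as np

def relu(z):
    # Negative inputs become 0, positive inputs pass through unchanged
    return np.maximum(0, z)

def sigmoid(z):
    # Squashes any real value into the range (0, 1)
    return 1 / (1 + np.exp(-z))

z = np.array([-2.0, 0.0, 2.0])
relu_out = relu(z)
sigmoid_out = sigmoid(z)  # sigmoid(0) is exactly 0.5
tanh_out = np.tanh(z)     # squashes into (-1, 1); tanh(0) is 0

print("relu:   ", relu_out)
print("sigmoid:", sigmoid_out)
print("tanh:   ", tanh_out)

# Without an activation, two linear layers collapse into one linear map
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(2, 3))
x = rng.normal(size=(4, 1))

stacked = W2 @ (W1 @ x)      # two layers applied in sequence
collapsed = (W2 @ W1) @ x    # one equivalent layer
print("Same result:", np.allclose(stacked, collapsed))
```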

## Backpropagation Algorithm
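Backpropagation is the heart of neural network training. It applies the chain rule to compute the gradient of the loss with respect to every weight, working backward from the output layer, so that the weights can be adjusted according to the error measured in the most recent forward pass.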

**When to Use**: Backpropagation is used during the training phase of the neural network. It helps minimize the error by adjusting the weights in the network.
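
As a rough illustration, the sketch below runs backpropagation by hand for a single sigmoid neuron with a squared-error loss, then checks the result against a numerical gradient. The starting values of `w`, `b`, `x`, and `y` and the learning rate are arbitrary assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss(w, b, x, y):
    # Squared error of a single sigmoid neuron
    a = sigmoid(w * x + b)
    return 0.5 * (a - y) ** 2

# Arbitrary illustrative values
w, b, x, y = 0.5, -0.2, 1.5, 1.0

# Forward pass
z = w * x + b
a = sigmoid(z)

# Backward pass: apply the chain rule one step at a time
dL_da = a - y            # derivative of the loss w.r.t. the activation
da_dz = a * (1 - a)      # derivative of the sigmoid
dL_dz = dL_da * da_dz
dL_dw = dL_dz * x        # gradient for the weight
dL_db = dL_dz            # gradient for the bias

# Numerical check with central differences
eps = 1e-6
num_dw = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)
num_db = (loss(w, b + eps, x, y) - loss(w, b - eps, x, y)) / (2 * eps)
print("dL/dw analytic vs numeric:", dL_dw, num_dw)
print("dL/db analytic vs numeric:", dL_db, num_db)

# One gradient descent step using the backpropagated gradients
learning_rate = 0.1
w_new = w - learning_rate * dL_dw
b_new = b - learning_rate * dL_db
print("Loss before:", loss(w, b, x, y), "after:", loss(w_new, b_new, x, y))
```

In a multi-layer network the same chain-rule step is repeated layer by layer, reusing the values stored during the forward pass.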

## Optimization Techniques
Optimization algorithms update the parameters of the neural network (its weights and biases) to reduce the loss. Some of them also adapt the learning rate during training.

**Common Optimization Techniques**:
- **Gradient Descent**: The simplest optimization algorithm. It repeatedly moves the parameters a small step in the direction of the negative gradient of the cost function.
- **Adam (Adaptive Moment Estimation)**: Combines the advantages of two other extensions of gradient descent: AdaGrad and RMSprop.
- **RMSprop**: Divides the learning rate for a weight by the square root of a running average of the squared recent gradients for that weight.

**When to Use**: Optimization techniques are used during the training process to efficiently converge to the minimum of the cost function.
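
The update rules of the three optimizers can be compared on a toy problem. This is a minimal sketch that minimizes the one-dimensional function f(w) = (w - 3)^2, whose minimum is at w = 3. The function, the starting point, and the hyperparameter values are assumptions picked for illustration, not recommended settings.

```python
import numpy as np

def grad(w):
    # Gradient of the toy cost f(w) = (w - 3)^2
    return 2 * (w - 3.0)

def gradient_descent(w, steps=500, lr=0.1):
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def rmsprop(w, steps=500, lr=0.05, decay=0.9, eps=1e-8):
    avg_sq = 0.0  # running average of squared gradients
    for _ in range(steps):
        g = grad(w)
        avg_sq = decay * avg_sq + (1 - decay) * g ** 2
        w -= lr * g / (np.sqrt(avg_sq) + eps)
    return w

def adam(w, steps=500, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    m, v = 0.0, 0.0  # running averages of the gradient and squared gradient
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)  # bias correction for the early steps
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

w_gd = gradient_descent(0.0)
w_rms = rmsprop(0.0)
w_adam = adam(0.0)
print("Gradient descent:", w_gd)
print("RMSprop:         ", w_rms)
print("Adam:            ", w_adam)
```

All three should end close to w = 3. Because RMSprop normalizes the step by the recent gradient magnitude, with a fixed learning rate it may hover near the minimum rather than settle on it exactly.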



In [None]:
# Install necessary libraries
!pip install ipywidgets scikit-learn matplotlib pyvis

Collecting pyvis
  Downloading pyvis-0.3.2-py3-none-any.whl (756 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 756.0/756.0 kB 9.1 MB/s eta 0:00:00
Installing collected packages: pyvis
Successfully installed pyvis-0.3.2


In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import ipywidgets as widgets
from IPython.display import display, clear_output, HTML
from pyvis.network import Network
import tempfile
import os

# Load and display the Iris dataset
iris = load_iris()
X = iris.data  # Features
y = iris.target.reshape(-1, 1)  # Target variable reshaped

# Display dataset information
def display_data_info():
    print("Dataset Information:")
    print(f"Number of samples: {X.shape[0]}")
    print(f"Number of features: {X.shape[1]}")
    print(f"Classes: {np.unique(y.flatten())}")
    print(f"Class distribution: {np.bincount(y.flatten())}")

# Display original data
def display_original_data():
    plt.figure(figsize=(14, 6))
    plt.subplot(1, 2, 1)
    plt.scatter(X[:, 0], X[:, 1], c=y.flatten(), cmap='viridis')
    plt.title('Original Data (Features 1 and 2)')
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')
    plt.colorbar()

    plt.subplot(1, 2, 2)
    plt.scatter(X[:, 2], X[:, 3], c=y.flatten(), cmap='viridis')
    plt.title('Original Data (Features 3 and 4)')
    plt.xlabel('Feature 3')
    plt.ylabel('Feature 4')
    plt.colorbar()
    plt.show()

# Standardize the features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # Standardized features

# One-hot encode the target variable
# scikit-learn >= 1.2 uses sparse_output; on older versions pass sparse=False instead
encoder = OneHotEncoder(sparse_output=False)
y_encoded = encoder.fit_transform(y)  # One-hot encoded targets

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_encoded, test_size=0.2, random_state=42)

# Display preprocessed data
def display_preprocessed_data():
    plt.figure(figsize=(14, 6))
    plt.subplot(1, 2, 1)
    plt.scatter(X_scaled[:, 0], X_scaled[:, 1], c=y.flatten(), cmap='viridis')
    plt.title('Preprocessed Data (Features 1 and 2)')
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')
    plt.colorbar()

    plt.subplot(1, 2, 2)
    plt.scatter(X_scaled[:, 2], X_scaled[:, 3], c=y.flatten(), cmap='viridis')
    plt.title('Preprocessed Data (Features 3 and 4)')
    plt.xlabel('Feature 3')
    plt.ylabel('Feature 4')
    plt.colorbar()
    plt.show()

# Basic Neural Network Class
class SimpleNeuralNetwork:
    def __init__(self, layers, activation='relu'):
        self.layers = layers  # List of layer sizes
        self.activation = activation  # Activation function
        self.params = self.initialize_params()  # Initialize weights and biases

    def initialize_params(self):
        params = {}
        for i in range(1, len(self.layers)):
            params['W' + str(i)] = np.random.randn(self.layers[i], self.layers[i-1]) * 0.1  # Weights
            params['b' + str(i)] = np.zeros((self.layers[i], 1))  # Biases
        return params

    def activation_function(self, Z, derivative=False):
        # Define the activation functions and their derivatives
        if self.activation == 'relu':
            if derivative:
                return (Z > 0).astype(float)
            return np.maximum(0, Z)  # ReLU activation
        elif self.activation == 'sigmoid':
            A = 1 / (1 + np.exp(-Z))
            if derivative:
                return A * (1 - A)
            return A  # Sigmoid activation
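        elif self.activation == 'tanh':
            A = np.tanh(Z)
            if derivative:
                return 1 - A ** 2
            return A  # Tanh activation
        else:
            # Fail loudly instead of silently returning None for an unknown name
            raise ValueError(f"Unsupported activation: {self.activation}")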

    def forward(self, X):
        # Forward propagation
        cache = {'A0': X.T}  # Store input data
        A = X.T
        for i in range(1, len(self.layers)):
            Z = self.params['W' + str(i)].dot(A) + self.params['b' + str(i)]  # Linear step
            A = self.activation_function(Z)  # Activation step
            cache['Z' + str(i)] = Z  # Store linear output
            cache['A' + str(i)] = A  # Store activation output
        return A, cache

    def backward(self, cache, X, y, learning_rate):
        # Backward propagation
        m = X.shape[0]  # Number of samples
        grads = {}
        A_last = cache['A' + str(len(self.layers) - 1)]  # Output of the last layer
        dA = A_last - y.T  # Gradient of the squared-error cost w.r.t. the output activation (constant factor dropped)
        for i in reversed(range(1, len(self.layers))):
            dZ = dA * self.activation_function(cache['Z' + str(i)], derivative=True)  # Chain rule through the activation
            grads['W' + str(i)] = 1/m * dZ.dot(cache['A' + str(i-1)].T)  # Gradient for weights
            grads['b' + str(i)] = 1/m * np.sum(dZ, axis=1, keepdims=True)  # Gradient for biases
            dA = self.params['W' + str(i)].T.dot(dZ)  # Gradient passed back to the previous layer

        # Update parameters using gradient descent
        for i in range(1, len(self.layers)):
            self.params['W' + str(i)] -= learning_rate * grads['W' + str(i)]
            self.params['b' + str(i)] -= learning_rate * grads['b' + str(i)]

    def train(self, X, y, epochs, learning_rate):
        # Training the neural network
        history = []
        for epoch in range(epochs):
            A, cache = self.forward(X)  # Forward pass
            cost = np.mean((A - y.T) ** 2)  # Compute cost (mean squared error)
            history.append(cost)  # Store cost history
            self.backward(cache, X, y, learning_rate)  # Backward pass and parameter update
        return history

# Interactive Widgets
layer_count = widgets.IntSlider(value=3, min=2, max=10, step=1, description='Number of Layers:')
layer_sizes_text = widgets.Text(value='4, 10, 3', description='Layer Sizes:')
activation_function = widgets.Dropdown(options=['relu', 'sigmoid', 'tanh'], value='relu', description='Activation:')
learning_rate = widgets.FloatSlider(value=0.01, min=0.001, max=0.1, step=0.001, description='Learning Rate:')
epochs = widgets.IntSlider(value=1000, min=100, max=5000, step=100, description='Epochs:')
display_data_button = widgets.Button(description='Display Original Data')
display_preprocessed_button = widgets.Button(description='Display Preprocessed Data')
visualize_nn_button = widgets.Button(description='Visualize Neural Network')
run_button = widgets.Button(description='Run')

output = widgets.Output()

def on_display_data_button_clicked(b):
    with output:
        output.clear_output()
        display_data_info()
        display_original_data()

def on_display_preprocessed_button_clicked(b):
    with output:
        output.clear_output()
        display_preprocessed_data()

def visualize_neural_network(layers):
    net = Network(height='500px', width='1000px', notebook=True, cdn_resources='in_line')
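    # 'layers' already lists every layer, input and output included, exactly as
    # it is passed to SimpleNeuralNetwork, so no extra layers are added here
    layer_sizes = layers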

    # Add nodes for each layer
    for i, layer_size in enumerate(layer_sizes):
        for j in range(layer_size):
            net.add_node(f"L{i}_N{j}", label=f"L{i}_N{j}", level=i)

    # Add edges between layers
    for i in range(len(layer_sizes) - 1):
        for j in range(layer_sizes[i]):
            for k in range(layer_sizes[i+1]):
                net.add_edge(f"L{i}_N{j}", f"L{i+1}_N{k}")

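    # Reserve a temporary file name, then close the handle before pyvis writes to it
    with tempfile.NamedTemporaryFile(delete=False, suffix='.html') as tmpfile:
        html_path = tmpfile.name
    net.show(html_path)
    with open(html_path, 'r') as f:
        html_content = f.read()
    os.remove(html_path)  # Clean up the temporary file once its content is read
    display(HTML(html_content))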

def on_visualize_nn_button_clicked(b):
    with output:
        output.clear_output()

        # Parse layer sizes
        layers = [int(x.strip()) for x in layer_sizes_text.value.split(',')]

        # Visualize neural network structure
        visualize_neural_network(layers)

def on_run_button_clicked(b):
    with output:
        output.clear_output()

        # Parse layer sizes
        layers = [int(x.strip()) for x in layer_sizes_text.value.split(',')]

        # Visualize neural network structure
        visualize_neural_network(layers)

        # Initialize and train the neural network
        nn = SimpleNeuralNetwork(layers, activation=activation_function.value)
        history = nn.train(X_train, y_train, epochs=epochs.value, learning_rate=learning_rate.value)

        # Plotting
        plt.figure(figsize=(14, 6))

        # Plot training loss
        plt.subplot(1, 2, 1)
        plt.plot(history)
        plt.title('Training Loss')
        plt.xlabel('Epoch')
        plt.ylabel('Loss')

        # Evaluate on test data
        predictions, _ = nn.forward(X_test)
        predictions = np.argmax(predictions, axis=0)
        actuals = np.argmax(y_test, axis=1)
        accuracy = np.mean(predictions == actuals)

        # Plot accuracy
        plt.subplot(1, 2, 2)
        plt.bar(['Accuracy'], [accuracy])
        plt.ylim(0, 1)
        plt.title('Test Accuracy')

        plt.show()

        # Display classification report
        print("Classification Report:")
        print(classification_report(actuals, predictions, target_names=iris.target_names))

# Connect buttons to functions
display_data_button.on_click(on_display_data_button_clicked)
display_preprocessed_button.on_click(on_display_preprocessed_button_clicked)
visualize_nn_button.on_click(on_visualize_nn_button_clicked)
run_button.on_click(on_run_button_clicked)

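def update_layer_sizes_text(change):
    # Keep the first and last sizes matched to the data (input features and
    # output classes) and fill any layers in between with 10 units each
    layer_count_value = change['new']
    sizes = [X_train.shape[1]] + [10] * (layer_count_value - 2) + [y_train.shape[1]]
    layer_sizes_text.value = ', '.join(str(s) for s in sizes)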

layer_count.observe(update_layer_sizes_text, names='value')

# Display widgets
display(widgets.VBox([display_data_button, display_preprocessed_button, layer_count, layer_sizes_text, activation_function, learning_rate, epochs, visualize_nn_button, run_button, output]))



