GPT: "What is a neural network?"

A neural network is a computational model inspired by the way biological neural networks in the human brain operate. It is composed of interconnected nodes, also known as neurons or artificial neurons, organized in layers. Neural networks are a fundamental component of deep learning, a subfield of machine learning.

Key Components of a Neural Network:

Input Layer:

The layer that receives input features or data.

Hidden Layers:

Intermediate layers between the input and output layers where computations are performed. Deep neural networks have multiple hidden layers.

Weights and Biases:

Parameters that the neural network learns from training data. They are adjusted during the training process to optimize the network's performance.

Activation Function:

Non-linear functions applied to the output of each neuron in a layer. They introduce non-linearity to the model, enabling it to learn complex relationships in the data.

Output Layer:

The layer that produces the final output or prediction.
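
As a rough illustration of how these components fit together, the sketch below pushes a small batch through one hidden layer and one output layer using NumPy. The layer sizes, the random values, and the names (`W_hidden`, `b_hidden`, `W_output`, `b_output`) are made up for this example and are not taken from the notebook's later code.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Input layer: 2 samples with 4 features each (sizes are arbitrary assumptions)
X = rng.normal(size=(2, 4))

# Weights and biases: the parameters a network would learn during training
W_hidden = rng.normal(size=(4, 3))   # input (4) -> hidden (3)
b_hidden = np.zeros((1, 3))
W_output = rng.normal(size=(3, 2))   # hidden (3) -> output (2)
b_output = np.zeros((1, 2))

def sigmoid(z):
    # Activation function: squashes each value into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hidden layer: linear step followed by the non-linear activation
hidden = sigmoid(X @ W_hidden + b_hidden)

# Output layer: here just a linear step producing one score per output unit
output = hidden @ W_output + b_output

print(hidden.shape, output.shape)
```

Nothing is learned here; the parameters stay at their random starting values, so the output scores are meaningless until training adjusts them.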

How Neural Networks Work:

Forward Propagation:

During forward propagation, input data is passed through the network layer by layer, and computations are performed using weights, biases, and activation functions to generate predictions.

Loss Function:

A loss function measures the difference between the predicted output and the actual target. The goal during training is to minimize this loss.

Backward Propagation (Training):

Backward propagation computes the gradients of the loss function with respect to the weights and biases by applying the chain rule layer by layer, from the output back toward the input. An optimization algorithm such as gradient descent then uses these gradients to update the parameters.

Training Epochs:

One epoch is one full pass over the training data. The entire process of forward and backward propagation is repeated for multiple epochs until the model achieves satisfactory performance on the training data.
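
The four steps above can be sketched on the smallest possible "network": a single sigmoid neuron trained with gradient descent. This is only a minimal illustration under stated assumptions. The toy dataset, the learning rate of 0.1, and the 200 epochs are arbitrary choices for the example, not values used elsewhere in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (an assumption for this sketch): the label is 1 when x1 + x2 > 0
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)

w = np.zeros((2, 1))   # weights
b = 0.0                # bias
learning_rate = 0.1
losses = []

for epoch in range(200):
    # Forward propagation: linear step, then sigmoid activation
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))

    # Loss function: mean binary cross-entropy (eps guards against log(0))
    eps = 1e-12
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    losses.append(loss)

    # Backward propagation: gradients of the loss with respect to w and b
    grad_z = (p - y) / len(X)
    grad_w = X.T @ grad_z
    grad_b = grad_z.sum()

    # Gradient descent update: step against the gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"first loss: {losses[0]:.3f}, last loss: {losses[-1]:.3f}")
```

The loss recorded at each epoch should shrink as the weights move toward values that separate the two classes, which is the behavior to look for when monitoring any training run.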

There are four initial types of neural nets to learn about. The first is the Feedforward Neural Network (FNN). In this notebook, I will not use packages such as TensorFlow and Keras. These require specific environment setups, and they make it much more difficult to see the underlying algorithm. Be sure to comment these algorithms line by line if anything is confusing.

GPT: "Write a FNN from scratch and apply it to the Iris dataset with thorough comments." I also added some more information to each step.

In [41]:
# Feedforward Neural Networks (FNNs) are a type of artificial neural network where information moves
# in one direction, from the input layer through hidden layers to the output layer, without cycles or loops.
# FNNs suit structured (tabular) problems with engineered features, simple-to-complex input-output mappings,
# supervised learning on labeled data, and problems that need to scale.

# FNNs are not ideal for image recognition, unsupervised learning, sequential data such as time series,
# transfer learning, factorization machines, or collaborative filtering.



import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.datasets import load_iris

# Load Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Standardize features to have zero mean and unit variance. Note: accuracy was 70% without standardization.
scaler = StandardScaler()
X = scaler.fit_transform(X)

# One-hot encode the labels for classification.
# .toarray() returns a dense matrix on any scikit-learn version; the `sparse=False`
# argument was renamed to `sparse_output` in scikit-learn 1.2 and later removed.
encoder = OneHotEncoder(categories='auto')
y_onehot = encoder.fit_transform(y.reshape(-1, 1)).toarray()
# One-hot encoding converts categorical labels into a binary matrix.
# Each unique label becomes a binary vector of zeros with a single one at the index for that label.
# Here it transforms the 1-D label array 'y' into the binary matrix 'y_onehot'.
# This is similar to dummy coding, except that every category gets its own column.

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y_onehot, test_size=0.2, random_state=42)

# Neural Network class
class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        # Initialize weights and biases
        self.weights_input_hidden = np.random.rand(input_size, hidden_size)
        self.bias_hidden = np.zeros((1, hidden_size))
        self.weights_hidden_output = np.random.rand(hidden_size, output_size)
        self.bias_output = np.zeros((1, output_size))
        # The weights represent the strength of connections between neurons in different layers.
        # The biases provide flexibility and allow the network to learn offset values for each neuron.
        # Random initialization helps break symmetry and allows the network to start learning unique features.

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))
        # The sigmoid function squashes input values into the range (0, 1), which suits binary classification problems.
        # It introduces non-linearity to the network, allowing it to learn complex patterns.

    def sigmoid_derivative(self, x):
        # x is expected to be a sigmoid output s = sigmoid(z), not the raw input z; then ds/dz = s * (1 - s)
        return x * (1 - x)
        # The derivative of the sigmoid function is used in the backward pass (backpropagation) to compute gradients.
        # It helps determine the contribution of each neuron to the error, facilitating weight updates.

    def softmax(self, x):
        # Softmax activation function for multi-class classification
        exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
        return exp_x / np.sum(exp_x, axis=1, keepdims=True)
        # The softmax function converts the raw output scores into probabilities for multi-class classification.
        # It ensures that the probabilities for each data point sum to 1, making them a valid probability distribution.
    
    def forward(self, X):
        #  Forward pass through the network. This is the predicted output.
        self.hidden_output = self.sigmoid(np.dot(X, self.weights_input_hidden) + self.bias_hidden)
        self.final_output = self.softmax(np.dot(self.hidden_output, self.weights_hidden_output) + self.bias_output)
        return self.final_output
        # The forward pass computes the output of the network given the input.
        # It involves matrix multiplications with weights and adding biases, followed by activation functions.

    def backward(self, X, y, learning_rate):
        # Backward pass: compute the gradients of the loss with respect to the parameters (weights and biases).
        # This shows how each parameter contributed to the error so the parameters can be updated accordingly.
        # Error at the output layer (target minus prediction)
        error = y - self.final_output
        # Note: the output layer uses softmax, and for softmax with cross-entropy the exact descent direction
        # with respect to the pre-softmax scores is just `error`; the sigmoid-derivative factor below rescales it.
        output_delta = error * self.sigmoid_derivative(self.final_output)
        # Backpropagate the error to the hidden layer through the hidden-to-output weights
        error_hidden = output_delta.dot(self.weights_hidden_output.T)
        # Scale by the sigmoid derivative at the hidden layer to get the hidden-layer delta
        hidden_delta = error_hidden * self.sigmoid_derivative(self.hidden_output)

        # Update weights and biases using gradient descent
        self.weights_hidden_output += self.hidden_output.T.dot(output_delta) * learning_rate
        self.bias_output += np.sum(output_delta, axis=0, keepdims=True) * learning_rate
        self.weights_input_hidden += X.T.dot(hidden_delta) * learning_rate
        self.bias_hidden += np.sum(hidden_delta, axis=0, keepdims=True) * learning_rate
        # Updating weights and biases involves adjusting them in the direction that reduces the loss.
        # Learning rate controls the size of these adjustments.

    def train(self, X, y, epochs, learning_rate):
        # Train the neural network for a specified number of epochs
        for epoch in range(epochs):
            # Perform a forward pass and a backward pass for each epoch
            output = self.forward(X)
            self.backward(X, y, learning_rate)
            
            # Print the loss every 1000 epochs for monitoring
            if epoch % 1000 == 0:
                loss = -np.sum(y * np.log(output))
                print(f'Epoch {epoch}, Loss: {loss}')

    def predict(self, X):
        # Make predictions using the trained neural network
        return np.argmax(self.forward(X), axis=1)


# Set random seed for reproducibility
np.random.seed(42)

# Initialize neural network with input size, hidden size, and output size
input_size = X_train.shape[1]
hidden_size = 5
output_size = y_train.shape[1]
nn = NeuralNetwork(input_size, hidden_size, output_size)

# Train the neural network with the training data
nn.train(X_train, y_train, epochs=10000, learning_rate=0.01)

# Test the neural network on the testing data
predictions = nn.predict(X_test)

# Calculate accuracy of the model
accuracy = np.mean(predictions == np.argmax(y_test, axis=1))
print(f'Accuracy: {accuracy * 100}%')




Epoch 0, Loss: 154.7865425456444
Epoch 1000, Loss: 56.510517792337055
Epoch 2000, Loss: 36.71469653214517
Epoch 3000, Loss: 28.151420208129704
Epoch 4000, Loss: 32.66388458843728
Epoch 5000, Loss: 30.732255348136125
Epoch 6000, Loss: 27.343314460242798
Epoch 7000, Loss: 21.953026188244053
Epoch 8000, Loss: 31.204677792416714
Epoch 9000, Loss: 17.575426356629414
Accuracy: 70.0%
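
The output layer above uses softmax, and the loss printed during training is cross-entropy. For that pairing, the gradient of the loss with respect to the pre-softmax scores works out to the predicted probabilities minus the one-hot target, with no extra activation-derivative factor. The sketch below checks this numerically with finite differences. The helper names (`softmax`, `cross_entropy`) and the random test values are assumptions made for this example and are not part of the cell above.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax, shifted by the row max for numerical stability
    exp_z = np.exp(z - np.max(z, axis=1, keepdims=True))
    return exp_z / np.sum(exp_z, axis=1, keepdims=True)

def cross_entropy(z, y):
    # Summed cross-entropy between one-hot targets y and softmax(z)
    return -np.sum(y * np.log(softmax(z)))

rng = np.random.default_rng(0)
z = rng.normal(size=(5, 3))            # made-up pre-softmax scores
y = np.eye(3)[rng.integers(0, 3, 5)]   # made-up one-hot targets

# Analytic gradient of the loss with respect to z
analytic = softmax(z) - y

# Numerical gradient by central finite differences
numeric = np.zeros_like(z)
h = 1e-6
for i in range(z.shape[0]):
    for j in range(z.shape[1]):
        step = np.zeros_like(z)
        step[i, j] = h
        numeric[i, j] = (cross_entropy(z + step, y) - cross_entropy(z - step, y)) / (2 * h)

print(np.max(np.abs(analytic - numeric)))  # should be very small
```

In terms of the `backward` method, this would correspond to using `error` directly as `output_delta`. Whether that change improves the loss curve or the accuracy reported above has not been checked here.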


<b> Convolutional Neural Network (CNN) </b>

In [42]:
# I've spent several hours on this and cannot get it to work. Maybe try this with GPT-4 when I can.

CNN, RNN, LSTM, Factorization and Collaborative