#### About

> Neural Networks (MLPs, CNNs, and RNNs)

Neural networks are a class of machine learning models, and the foundation of deep learning, loosely inspired by the structure and function of the human brain. They consist of interconnected nodes (neurons) organized in layers. Through a process called training, in which the model adjusts its parameters based on labeled training data, they learn to perform tasks such as classification, regression, and sequence prediction.

> Multi-Layer Perceptron (MLP)

The Multi-Layer Perceptron (MLP) is a type of feedforward neural network, where information flows in one direction, from input to output, without any loops. It consists of an input layer, one or more hidden layers, and an output layer. Each layer contains multiple neurons, and adjacent layers are fully connected: every neuron receives input from all neurons in the previous layer and passes its output to all neurons in the next layer.
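
As a rough illustration of what "fully connected" implies, the sketch below counts the trainable parameters of an MLP from its layer sizes. The helper name `count_parameters` is hypothetical, and the sizes 4, 64, 32, 3 are an assumption chosen to match the Iris example later in this section (4 input features, hidden layers of 64 and 32 neurons, 3 classes).

```python
def count_parameters(layer_sizes):
    """Count the weights and biases of a fully connected network.

    layer_sizes lists the number of neurons per layer, input layer first.
    (Hypothetical helper, for illustration only.)
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection between the two layers
        total += n_out         # one bias per neuron in the receiving layer
    return total


# 4*64 + 64 = 320, 64*32 + 32 = 2080, 32*3 + 3 = 99
print(count_parameters([4, 64, 32, 3]))  # 2499
```

Because every pair of neurons in adjacent layers is connected, the parameter count grows with the product of the layer widths, which is why wide layers dominate the total.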

> Mathematics

The following terms are important for understanding the mathematics of an MLP:

1. Input Layer: The input layer receives the input features, denoted as X, which are usually represented as a vector or a matrix.

2. Hidden Layers: Each hidden layer consists of multiple neurons. The output of hidden layer i, denoted as H_i, is calculated by applying an activation function f_i to the weighted sum of that layer's inputs, denoted as Z_i. The weighted sum is the product of the weight matrix W_i and the layer's input, plus a bias vector b_i. With n hidden layers indexed from 0 to n-1, the first hidden layer (i = 0) takes the input X, and every later hidden layer takes the output H_{i-1} of the layer before it:

Z_0 = W_0 * X + b_0
Z_i = W_i * H_{i-1} + b_i    (for i >= 1)
H_i = f_i(Z_i)

Common activation functions for hidden layers include sigmoid, tanh, ReLU, and GELU. Softmax is typically used in the output layer for multi-class classification. The choice depends on the problem type and requirements.

3. Output Layer: The output layer consists of neurons that produce the final predictions, denoted as Y. The output of each neuron in the output layer is calculated in the same way as in the hidden layers, but usually with an activation function chosen for the task (for example, softmax for multi-class classification or the identity for regression), denoted as f_o:

Z_o = W_o * H_{n-1} + b_o
Y = f_o(Z_o)

where n is the total number of hidden layers, H_{n-1} is the output of the last hidden layer, and W_o and b_o are the weight matrix and bias vector for the output layer, respectively.

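4. Training: During training, the model updates its weights and biases using an optimization algorithm, such as gradient descent, to minimize a loss function that measures the error between its predictions and the ground truth labels from the training data. The gradients of the loss with respect to the weights and biases are computed by backpropagation.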
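
The four items above can be sketched end to end in NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the toy XOR data, the single hidden layer of 8 neurons, the tanh and sigmoid activations, the cross-entropy loss, the learning rate, and the helper names `forward`, `cross_entropy`, and `train` are all choices made here for illustration. Samples are stored as rows, so the code computes `X @ W + b` rather than `W * X + b`.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(params, X):
    """Forward pass: one tanh hidden layer followed by a sigmoid output layer."""
    W_0, b_0, W_o, b_o = params
    H_0 = np.tanh(X @ W_0 + b_0)   # H_0 = f_0(Z_0)
    Y = sigmoid(H_0 @ W_o + b_o)   # Y = f_o(Z_o)
    return H_0, Y


def cross_entropy(Y, y):
    """Mean binary cross-entropy between predictions Y and labels y."""
    Y = np.clip(Y, 1e-12, 1.0 - 1e-12)  # avoid log(0)
    return float(-np.mean(y * np.log(Y) + (1.0 - y) * np.log(1.0 - Y)))


def train(params, X, y, learning_rate=0.1, steps=5000):
    """Full-batch gradient descent; returns updated parameters and loss history."""
    W_0, b_0, W_o, b_o = (p.copy() for p in params)
    losses = []
    for _ in range(steps):
        H_0, Y = forward((W_0, b_0, W_o, b_o), X)
        losses.append(cross_entropy(Y, y))
        # Backpropagation: chain rule from the output layer back to the input.
        dZ_o = (Y - y) / len(X)                    # sigmoid + cross-entropy
        dW_o, db_o = H_0.T @ dZ_o, dZ_o.sum(axis=0)
        dZ_0 = (dZ_o @ W_o.T) * (1.0 - H_0 ** 2)   # tanh'(z) = 1 - tanh(z)^2
        dW_0, db_0 = X.T @ dZ_0, dZ_0.sum(axis=0)
        # Gradient descent step: move each parameter against its gradient.
        W_0 -= learning_rate * dW_0
        b_0 -= learning_rate * db_0
        W_o -= learning_rate * dW_o
        b_o -= learning_rate * db_o
    trained = (W_0, b_0, W_o, b_o)
    losses.append(cross_entropy(forward(trained, X)[1], y))
    return trained, losses


# Toy data (assumption for illustration): the four points of the XOR problem.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])

rng = np.random.default_rng(0)
params = (
    rng.normal(scale=0.5, size=(2, 8)), np.zeros(8),  # W_0, b_0
    rng.normal(scale=0.5, size=(8, 1)), np.zeros(1),  # W_o, b_o
)

trained, losses = train(params, X, y)
print(f"loss before training: {losses[0]:.3f}, after: {losses[-1]:.3f}")
```

Libraries such as scikit-learn automate these steps and typically use more refined optimizers than plain gradient descent.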






In [1]:
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

In [2]:
# Load the Iris dataset: 150 samples, 4 features, 3 classes
iris = load_iris()
X = iris.data    # feature matrix
y = iris.target  # class labels

In [3]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


In [4]:
# Initialize and train the MLP model (two hidden layers with 64 and 32 neurons)
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), activation='relu', solver='adam', random_state=42)
mlp.fit(X_train, y_train)



In [5]:
# Make predictions on the test data
y_pred = mlp.predict(X_test)


In [6]:
# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

Accuracy: 0.9666666666666667
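
To connect the fitted model back to the equations in the Mathematics part, the sketch below reads the learned weight matrices and bias vectors from the `coefs_` and `intercepts_` attributes and repeats the forward pass by hand. It retrains the same model so that it runs on its own. The helper name `manual_predict_proba` is hypothetical, and the sketch assumes that `MLPClassifier` applies softmax in the output layer for multi-class problems (the fitted attribute `out_activation_` reports which output activation was used). scikit-learn stores samples as rows, so each layer computes `H @ W + b`.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Same data split and model settings as in the cells above
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), activation='relu', solver='adam', random_state=42)
mlp.fit(X_train, y_train)

# One weight matrix and one bias vector per layer transition
print([W.shape for W in mlp.coefs_])       # [(4, 64), (64, 32), (32, 3)]
print([b.shape for b in mlp.intercepts_])  # [(64,), (32,), (3,)]


def manual_predict_proba(model, X):
    """Repeat the forward pass by hand (hypothetical helper, illustration only)."""
    H = X
    # Hidden layers: weighted sum followed by ReLU
    for W, b in zip(model.coefs_[:-1], model.intercepts_[:-1]):
        H = np.maximum(H @ W + b, 0.0)
    # Output layer: weighted sum followed by softmax (assumed, see above)
    Z_o = H @ model.coefs_[-1] + model.intercepts_[-1]
    Z_o = Z_o - Z_o.max(axis=1, keepdims=True)  # for numerical stability
    exp_Z = np.exp(Z_o)
    return exp_Z / exp_Z.sum(axis=1, keepdims=True)


manual = manual_predict_proba(mlp, X_test)
# Compare the hand-computed probabilities with scikit-learn's own
print(np.allclose(manual, mlp.predict_proba(X_test)))
```

If the assumption about the output activation holds, the hand-computed probabilities should agree with `predict_proba` up to floating-point rounding.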


CNNs and RNNs, the other two architectures named above, will be discussed in the deep learning section.