# Simple Perceptron or a single-layer neural network

### What is a Perceptron?
A perceptron is a single-layer neural network used for binary classification. It was introduced by Frank Rosenblatt in the late 1950s.

### Components of a Perceptron:

- **Inputs:** A set of real numbers, which could represent anything from pixel values in an image to any other kind of feature. These inputs are often denoted as $x_1, x_2, \dots, x_n$.

- **Weights:** Each input has an associated weight, represented as $w_1, w_2, \dots, w_n$. These weights are learned during the training process to improve the accuracy of predictions.

- **Bias:** A constant value that allows the perceptron to shift the decision boundary away from the origin. Not all implementations of a perceptron include bias, but it often helps in making the perceptron more flexible.

- **Activation Function:** A function that transforms the weighted sum of the inputs to produce the final output. For a simple perceptron, this is typically a step function, which outputs a 1 or 0 depending on whether the input is above or below a certain threshold.


### Working:

The perceptron works by multiplying each input by its corresponding weight, summing these products together with the bias, and then passing this sum through an activation function.

Mathematically, the output $y$ of a perceptron is given by:

$$ y = f\left(\sum_{i=1}^{n} w_i x_i + b\right) $$

Where:

- $w_i$ is the weight associated with the $i^{th}$ input.
- $x_i$ is the $i^{th}$ input.
- $b$ is the bias.
- $f$ is the activation function, which in its simplest form is a step function that outputs 1 if its input is positive, and 0 otherwise.
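
The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not the notebook's implementation: the function names `step` and `perceptron_output` and the weight and bias values are assumptions chosen so that the perceptron behaves like a logical AND gate.

```python
def step(z):
    """Step activation: 1 if the weighted sum is positive, 0 otherwise."""
    return 1 if z > 0 else 0


def perceptron_output(weights, bias, inputs):
    """Compute y = f(sum_i w_i * x_i + b) for a single input vector."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(weighted_sum)


# Illustrative values only: with these weights the perceptron acts as logical AND.
weights = [1.0, 1.0]
bias = -1.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", perceptron_output(weights, bias, x))
```

Only the input `(1, 1)` gives a positive weighted sum (0.5), so it is the only one mapped to 1.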

### Learning:

The perceptron learns by adjusting its weights based on the errors it makes. The basic idea is:

1. For every input in the training set, compute the perceptron's output.
2. Compare this output to the desired output.
3. Adjust the weights in the direction that reduces the error.

The most common rule to update the weights is the **Perceptron Learning Rule**, which is:

$$ w_i = w_i + \eta (d - y) x_i $$

Where:

- $w_i$ is the weight associated with the $i^{th}$ input.
- $d$ is the desired output.
- $y$ is the perceptron's output.
- $x_i$ is the $i^{th}$ input.
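- $\eta$ is the learning rate, a small positive constant.

The bias is updated in the same way, as if it were a weight on a constant input of 1: $b = b + \eta (d - y)$. When the prediction is correct ($d = y$), nothing changes.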
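
As a rough sketch of the learning rule in action, the snippet below trains a perceptron on the logical AND function. The helper names (`predict`, `train_perceptron`), the zero initialization, the learning rate, and the number of epochs are all assumptions made for illustration. The bias is updated like a weight on a constant input of 1.

```python
def predict(weights, bias, inputs):
    """Step-activated perceptron output for one input vector."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if z > 0 else 0


def train_perceptron(samples, eta=0.1, epochs=20):
    """Apply w_i = w_i + eta * (d - y) * x_i to every sample, for several epochs."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, desired in samples:
            error = desired - predict(weights, bias, inputs)  # d - y
            weights = [w + eta * error * x for w, x in zip(weights, inputs)]
            bias += eta * error  # bias acts as a weight on a constant input of 1
    return weights, bias


# Logical AND: linearly separable, so the rule is guaranteed to converge.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
predictions = [predict(weights, bias, x) for x, _ in and_gate]
print("weights:", weights, "bias:", bias)
print("predictions:", predictions)
```

Weights change only on misclassified samples, because `error` is zero whenever the output already matches the desired value.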



### Limitations:
The perceptron can only classify linearly separable data, which means there exists a straight line (in 2D), plane (in 3D), or hyperplane (in higher dimensions) that separates the two classes. If the data is not linearly separable, as with the XOR function, no choice of weights classifies every example correctly and the learning rule never converges. This limitation led to the development of multi-layer neural networks, which can learn non-linear decision boundaries.
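
The limitation can be shown with a small experiment, sketched here under assumed settings (zero initialization, learning rate 0.1, 100 epochs, and a hypothetical helper `train_and_score`). The same training loop is run on OR, which is linearly separable, and on XOR, which is not.

```python
def predict(weights, bias, inputs):
    """Step-activated perceptron output for one input vector."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if z > 0 else 0


def train_and_score(samples, eta=0.1, epochs=100):
    """Train with the perceptron learning rule, then return training accuracy."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, desired in samples:
            error = desired - predict(weights, bias, inputs)
            weights = [w + eta * error * x for w, x in zip(weights, inputs)]
            bias += eta * error
    correct = sum(predict(weights, bias, x) == d for x, d in samples)
    return correct / len(samples)


or_gate = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
xor_gate = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print("OR accuracy: ", train_and_score(or_gate))
print("XOR accuracy:", train_and_score(xor_gate))
```

However many epochs are used, the XOR accuracy stays below 1.0, because no single straight line separates the two XOR classes.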

In summary, a perceptron is a fundamental building block in neural networks and serves as an introduction to the concepts of weights, biases, activations, and learning in more complex neural network architectures.






### Simple perceptron based on [perceptron_n00b.py](./1_simple_perceptron.ipynb)


This model is a simple perceptron rather than a multilayer perceptron (MLP): the 10 output neurons, one per digit, are connected directly to the input features, with no hidden layer in between.


In [8]:
from src.dataset_service import read_mnist_data
from src.perceptron_n00b import NeuralNetwork


#
# Initialize and train a simple perceptron for handwritten digit recognition
#
nn = NeuralNetwork()
train_path = "./data/mnist_train_60k.csv.zip"
train_inputs, train_targets = read_mnist_data(output_nodes_amount=10, samples=10, csv_path=train_path)


# Input layer: one node per feature (pixel)
nn.add_layer(neurons_in_layer=len(train_inputs[0]))
# Output layer: 10 neurons, one per digit
nn.add_layer(neurons_in_layer=10)

# Train the model
nn.train(train_inputs, train_targets, learning_rate=0.1, epoch=1)


The header of the CSV ->  ['label', '1x1', '1x2', '1x3', '1x4', '1x5', '1x6', '1x7', '1x8', '1x9', '1x10', '1x11', '1x12', '1x13', '1x14', '1x15', '1x16', '1x17', '1x18', '1x19', '1x20', '1x21', '1x22', '1x23', '1x24', '1x25', '1x26', '1x27', '1x28', '2x1', '2x2', '2x3', '2x4', '2x5', '2x6', '2x7', '2x8', '2x9', '2x10', '2x11', '2x12', '2x13', '2x14', '2x15', '2x16', '2x17', '2x18', '2x19', '2x20', '2x21', '2x22', '2x23', '2x24', '2x25', '2x26', '2x27', '2x28', '3x1', '3x2', '3x3', '3x4', '3x5', '3x6', '3x7', '3x8', '3x9', '3x10', '3x11', '3x12', '3x13', '3x14', '3x15', '3x16', '3x17', '3x18', '3x19', '3x20', '3x21', '3x22', '3x23', '3x24', '3x25', '3x26', '3x27', '3x28', '4x1', '4x2', '4x3', '4x4', '4x5', '4x6', '4x7', '4x8', '4x9', '4x10', '4x11', '4x12', '4x13', '4x14', '4x15', '4x16', '4x17', '4x18', '4x19', '4x20', '4x21', '4x22', '4x23', '4x24', '4x25', '4x26', '4x27', '4x28', '5x1', '5x2', '5x3', '5x4', '5x5', '5x6', '5x7', '5x8', '5x9', '5x10', '5x11', '5x12', '5x13', '5x14', '

In [65]:
# Test and check the results of digit recognition on unseen data
test_path = "./data/mnist_test_10k.csv.zip"
test_inputs, test_targets = read_mnist_data(output_nodes_amount=10, samples=10, csv_path=test_path)

predicted_outputs = []
for i in range(len(test_inputs)):
    predicted_vector = nn.pred(test_inputs[i])

    formatted_predicted_vector = [float(f'{num:.2f}') for num in predicted_vector]
    print("predicted vector   -> ", formatted_predicted_vector)
    print("real target vector -> ", test_targets[i])

    predicted_outputs.append(predicted_vector)

res = nn.calculate_accuracy(predicted_outputs, test_targets)
print("Accuracy = ", res)
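
The source of `calculate_accuracy` is not shown here, so the following is only a guess at the kind of computation such a method might perform: pick the index of the largest value in each predicted vector and compare it with the index of the largest value in the target vector. The helper names `argmax` and `accuracy_from_vectors` and the sample data are hypothetical.

```python
def argmax(vector):
    """Index of the largest value in a list."""
    return max(range(len(vector)), key=lambda i: vector[i])


def accuracy_from_vectors(predicted_vectors, target_vectors):
    """Fraction of samples whose highest-scoring output matches the target class."""
    hits = sum(
        argmax(p) == argmax(t) for p, t in zip(predicted_vectors, target_vectors)
    )
    return hits / len(target_vectors)


# Made-up 3-class example, not real MNIST output.
predicted = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.2, 0.2, 0.6]]
targets = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]
print(accuracy_from_vectors(predicted, targets))
```

In this made-up example the first and third predictions match their targets, so two of the three samples count as correct.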


### Simple perceptron based on [PyTorch](https://pytorch.org), an "optimized tensor library for deep learning using GPUs and CPUs"


As in the previous section, this model is a simple perceptron rather than a multilayer perceptron (MLP): it consists of a single linear layer (`nn.Linear`) that maps the 784 input pixels directly to 10 output neurons, with no hidden layer.


In [None]:
from src.dataset_service import read_mnist_data
import torch
import torch.nn as nn
import torch.optim as optim

train_path = "./data/mnist_train_60k.csv.zip"
samples = 50000
train_inputs, train_targets = read_mnist_data(output_nodes_amount=10, samples=samples, csv_path=train_path)

# Convert the data to PyTorch tensors
train_inputs = torch.tensor(train_inputs, dtype=torch.float32)
train_targets = torch.tensor(train_targets, dtype=torch.float32)

print("Train: \n", train_inputs[0])
print("Target: \n", train_targets[0])



In [None]:

import torch.nn.functional as F

class SimplePerceptron(nn.Module):
    """   
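    This perceptron consists of a single linear layer that returns raw scores
    (logits) for each of the 10 classes (digits 0-9). No softmax is applied in
    forward(): nn.CrossEntropyLoss applies log-softmax internally during
    training, and softmax is applied explicitly at evaluation time.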
    """
    def __init__(self, input_size, num_classes):
        super(SimplePerceptron, self).__init__()
        self.fc = nn.Linear(input_size, num_classes)

    def forward(self, x): 
        return self.fc(x)

input_size = 28 * 28  # MNIST images are 28x28
num_classes = 10
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = SimplePerceptron(input_size, num_classes).to(device)
print(model)


SimplePerceptron(
  (fc): Linear(in_features=784, out_features=10, bias=True)
)


In [None]:
# Training Loop: Define the loss function, the optimizer, and the training loop.

# Hyperparameters
learning_rate = 0.001
num_epochs = 20
batch_size = 64

# Loss and optimizer
criterion = nn.CrossEntropyLoss().to(device)
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# Training loop
for epoch in range(num_epochs):
    for i in range(0, len(train_inputs), batch_size):
        # Move each batch to the same device as the model
        inputs = train_inputs[i:i+batch_size].to(device)
        labels = train_targets[i:i+batch_size].to(device)

        # Forward pass
        outputs = model(inputs)
        loss = criterion(outputs, labels)

        # Backward pass and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")



Epoch [1/20], Loss: 1.2161
Epoch [2/20], Loss: 1.0839
Epoch [3/20], Loss: 1.0222
Epoch [4/20], Loss: 0.9880
Epoch [5/20], Loss: 0.9665
Epoch [6/20], Loss: 0.9517
Epoch [7/20], Loss: 0.9408
Epoch [8/20], Loss: 0.9324
Epoch [9/20], Loss: 0.9257
Epoch [10/20], Loss: 0.9202
Epoch [11/20], Loss: 0.9156
Epoch [12/20], Loss: 0.9117
Epoch [13/20], Loss: 0.9083
Epoch [14/20], Loss: 0.9053
Epoch [15/20], Loss: 0.9026
Epoch [16/20], Loss: 0.9002
Epoch [17/20], Loss: 0.8980
Epoch [18/20], Loss: 0.8960
Epoch [19/20], Loss: 0.8942
Epoch [20/20], Loss: 0.8924
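
The loss values above come from `nn.CrossEntropyLoss`, which works on the raw scores (logits) of the linear layer and applies log-softmax internally. For a one-hot target, the cross-entropy reduces to the negative log of the softmax probability assigned to the correct class. The plain-Python sketch below illustrates that computation for a single sample; the function names and the logit values are made up for illustration, and it does not reproduce PyTorch's batched, averaged implementation.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, one_hot_target):
    """Cross-entropy between softmax(logits) and a one-hot target vector."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs))


logits = [2.0, 0.5, -1.0]   # made-up scores for a 3-class problem
target = [1.0, 0.0, 0.0]    # one-hot: the correct class is index 0

probs = softmax(logits)
print("probabilities:", [round(p, 3) for p in probs])
print("loss:", round(cross_entropy(logits, target), 4))
```

Softmax preserves the ordering of the scores, so the predicted class (the argmax) is the same whether it is taken from the logits or from the probabilities.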


In [None]:
from src.dataset_service import read_mnist_data

samples = 1000
test_path = "./data/mnist_test_10k.csv.zip"
test_inputs, test_targets = read_mnist_data(output_nodes_amount=10, samples=samples, csv_path=test_path)

# Convert the data to PyTorch tensors
test_inputs = torch.tensor(test_inputs, dtype=torch.float32)
test_targets = torch.tensor(test_targets, dtype=torch.float32)

with torch.no_grad():
    correct = 0
    total = 0
    for i in range(0, len(test_inputs), batch_size):
        inputs = test_inputs[i:i+batch_size].to(device)
        labels = test_targets[i:i+batch_size].to(device)
        logits = model(inputs)
        probabilities = torch.nn.functional.softmax(logits, dim=1)

        # Find the index with the maximum probability
        predicted = torch.argmax(probabilities, dim=1)
        true_classes = torch.argmax(labels, dim=1)

        # Compare predicted index with the target index
        correct += (predicted == true_classes).sum().item()
        total += labels.size(0)

# Report accuracy as a percentage of the samples actually evaluated
print(f"Accuracy: {100 * correct / total:.2f}%")


