## **Resources**

### **Training Neural Network with C++**

[Linkedin_Learning](https://www.linkedin.com/learning/training-neural-networks-in-c-plus-plus-22661958/the-many-applications-of-machine-learning?autoSkip=true&resume=false&u=42288921)

### **Understanding Neural Network in Depth**

[Essential_Idea_Of_Neural_Network](https://www.youtube.com/watch?v=CqOfi41LfDw)

[How_CNN_Works_in_Depth](https://www.youtube.com/watch?v=JB8T_zN7ZC0)

### **The Mathematics Behind Neural Network**

[Maths_Behind_Neural_Network](https://www.youtube.com/watch?v=Ixl3nykKG9M)


# **Tomorrow**

[OR_Gate](https://www.linkedin.com/learning/training-neural-networks-in-c-plus-plus-22661958/solution-logic-gates-with-perceptrons?resume=false&u=42288921)


<hr>
<hr>
<hr>
<hr>


## **Neural Network Implementation Note**

- All values must be real numbers, not integers. We will use double-precision floating point, i.e. the `double` type (e.g., 0.1, 0.2).

- The weights and inputs may be implemented as `1-D` vectors. We will use the `std::vector<double>` type from the C++ Standard Library i.e. `vec(w)` and `vec(x)`.

- This way, the sum may be calculated in one operation: `z = vec(w) * vec(x)`.

- We will feed the weighted sum to the sigmoid activation function.


## **Files and Their Meaning**

**`.h` files**: These are header files in C++ that typically contain function declarations, class definitions, and macros. They are included in `.cpp` files to provide the necessary declarations for the functions and classes used in the implementation.

**`.cpp` files**: These are source files in C++ that contain the actual implementation of the functions and classes declared in the corresponding header files. They are compiled to create the final executable program.


<hr>
<hr>
<hr>
<hr>


## **Built in Functions That We Will Use**

`std::vector`: A dynamic array that can resize itself automatically when elements are added or removed.

Syntax: `std::vector<Type> vec;`

<hr>

`std::inner_product`: Computes the inner product of two ranges.

Syntax: `std::inner_product(first1, last1, first2, init);`
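
As a small illustration of how `std::inner_product` produces a weighted sum, the sketch below multiplies two vectors element by element and adds the products to the initial value. The helper name `weighted_sum` and the sample numbers in the usage note are assumptions made for this example.

```C++
#include <numeric>
#include <vector>

// Hypothetical helper: dot product of a weight vector and an input vector.
// Assumes both vectors have the same length.
double weighted_sum(const std::vector<double>& w, const std::vector<double>& x)
{
  // The initial value 0.0 is a double, so the accumulation happens in double.
  return std::inner_product(x.begin(), x.end(), w.begin(), 0.0);
}
```

For instance, `weighted_sum({10.0, 10.0, -15.0}, {1.0, 1.0, 1.0})` evaluates `10*1 + 10*1 + (-15)*1`.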

<hr>

`std::generate`: Fills a range with values generated by a function.

Syntax: `std::generate(first, last, generator);`
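
A minimal sketch of `std::generate` in use, assuming we want a vector of random values in the range -1 to 1. The function name `random_weights` and the `seed` parameter are hypothetical, and the generator here is built on the `<random>` header rather than on `rand()`.

```C++
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: returns n values drawn uniformly from [-1, 1).
std::vector<double> random_weights(std::size_t n, unsigned seed)
{
  std::mt19937 engine(seed);                              // seeded engine, repeatable
  std::uniform_real_distribution<double> dist(-1.0, 1.0); // half-open range [-1, 1)

  std::vector<double> w(n);
  // std::generate calls the lambda once per element and stores each result.
  std::generate(w.begin(), w.end(), [&]() { return dist(engine); });
  return w;
}
```

Because the engine is seeded explicitly, two calls with the same `seed` return the same values, which is convenient when debugging.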

<hr>

`std::vector::push_back`: A member function of `std::vector` that adds an element to the end of the vector.

Syntax: `vec.push_back(value);`

<hr>

`std::vector::resize`: A member function of `std::vector` that changes the size of the vector.

Syntax: `vec.resize(new_size);`

<hr>

`std::exp`: Computes the exponential function.

Syntax: `std::exp(x);`


<hr>
<hr>
<hr>
<hr>


## **Neural Network into Action**

We will write all the declarations in the header files and all the implementations in the source files. This will help us keep our code organized and modular.

Our first task is to implement a basic `Multi Layer Perceptron` class in C++.

For that we are creating `mlp.h` and `mlp.cpp` files.

### **`mlp.h`**

```C++

#pragma once

#include <cstddef> // size_t
#include <vector>  // std::vector

// Perceptron class

class Perceptron
{
public:
  std::vector<double> weights;
  double bias;

  // Constructor
  Perceptron(size_t inputs, double bias = 1.0);

  // Run the perceptron
  double run(std::vector<double> x);

  // Set Custom Weights if needed
  void set_weights(std::vector<double> w_init);

  // Sigmoid Activation Function
  double sigmoid(double x);
};
```

Here, `size_t` represents the number of inputs to the perceptron. It is an unsigned integer type, so the value can never be negative. Its size is platform-dependent: typically `8 Bytes` on 64-bit systems and `4 Bytes` on 32-bit systems.

<hr>

Now we'll implement the `Perceptron` class in `mlp.cpp`.

### **`mlp.cpp`**

This file contains the definitions of the `Perceptron` member functions declared in the header.

```C++

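#include "mlp.h"
#include <algorithm> // std::generate
#include <cmath>     // std::exp
#include <cstdlib>   // rand, RAND_MAX
#include <iostream>
#include <numeric>   // std::inner_product
using namespace std;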

// Random Number Generator Function

double frand()
{
  return (2.0 * (double)rand() / RAND_MAX) - 1.0;
}

// Return a new Perceptron Object with the Specified number of Inputs (+1 for the bias)

Perceptron::Perceptron(size_t inputs, double bias)
{
  this->bias = bias;

  // Initialize the Weights as Random numbers of Double between -1 and 1

  weights.resize(inputs + 1); // Resize the Vector for Weights + Bias

  // Generate Random Numbers and Fill in the Vectors. Pass the frand function to generate the number

  generate(weights.begin(), weights.end(), frand);
}

// Run Function
// Feeds an Input Vector X into the perceptron to return the activation function output.

double Perceptron::run(std::vector<double> x)
{

  // Add the bias at the end
  x.push_back(bias);

  // Weighted Sum
  double sum = inner_product(x.begin(), x.end(), weights.begin(), (double)0.0);

  return sigmoid(sum); // Pass into the sigmoid function
}

// Set the weights. w_init is a vector with the Weights

void Perceptron::set_weights(std::vector<double> w_init)
{
  weights = w_init; // Copies the vector
}

// Evaluate the Sigmoid Function for the floating point of input

double Perceptron::sigmoid(double x)
{
  return 1.0 / (1.0 + exp(-x));
}
```

**Below is the step-by-step explanation of each implementation line.**

`weights.resize(inputs + 1);`

This line resizes the weights vector to hold the specified number of inputs plus one additional element for the bias. This ensures that the weights vector has the correct size to accommodate all input weights and the bias weight.

`generate(weights.begin(), weights.end(), frand);`

This line fills the weights vector with random values generated by the `frand` function. The `generate` function takes a range (from the beginning to the end of the weights vector) and applies the `frand` function to each element in that range, effectively initializing the weights to small random values.

`x.push_back(bias);`

This line adds the bias term to the end of the input vector `x`. This is necessary because the bias is treated as an additional input to the perceptron, and it needs to be included in the weighted sum calculation.

`inner_product(x.begin(), x.end(), weights.begin(), (double)0.0);`

This line computes the weighted sum of the inputs by taking the inner product of the input vector `x` (which now includes the bias) and the weights vector. The `inner_product` function multiplies each element of the input vector by the corresponding element of the weights vector and sums the results. The last argument `(double)0.0` specifies the initial value for the sum.

`return sigmoid(sum);`

This line passes the computed weighted sum into the sigmoid function and returns the result. The sigmoid function applies the logistic activation function to the weighted sum, squashing the output to a range between 0 and 1. This is a crucial step in the perceptron's operation, as it determines the final output of the neuron.

`weights = w_init;`

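This line sets the weights of the perceptron to the provided initialization vector `w_init`. This allows the user to specify custom weights for the perceptron, which can be useful for tasks like transfer learning or fine-tuning a pre-trained model. Since `run` appends the bias to the input, `w_init` should contain one weight per input plus one final weight for the bias.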

`return 1.0 / (1.0 + exp(-x));`

Calculates the sigmoid activation value for the given input `x`.


<hr>
<hr>
<hr>
<hr>


## **AND Gate**

Both the inputs need to be `True` for `True` output.

Now how do we create a Perceptron that can classify inputs like an AND gate?

Let's visualize the inputs and outputs of the `AND` gate in a Graph:

<img src='./Notes_Images/and_gate.png'>

Now, to successfully classify we need to draw a line that separates the two classes (0 and 1). This line is called the decision boundary.

<img src='./Notes_Images/boundary.png'>

**The Decision Boundary Is Defined by the Sigmoid Function**

<img src='./Notes_Images/sigmoid.png'>

In this image, the boundary is the line where sigmoid is `0.5`.

<hr>

`Before Moving Forward`,

Let's try to implement a function that exactly mimics an `AND` gate, with the restriction that the function must be linear, i.e. `f(x1, x2) = w1*x1 + w2*x2 + b`

**Is that Possible?**

<img src='./Notes_Images/linear_and.png'>

This proof shows that it is not possible to create a linear function that mimics the behavior of an `AND` gate.

The only solution is that the function should be non-linear, which means the function can be `exponential`, `quadratic`, or any other non-linear form.

<img src='./Notes_Images/non_linear_and.png'>

**TL;DR: AND is `linearly separable` (a perceptron can classify it), but it is not a linear function of the `inputs`.**

### **A Perceptron as an AND Gate**

Let's say there are two inputs `x1` and `x2`. The perceptron will compute a weighted sum of the inputs and pass it through an activation function to produce the output.

The weighted sum can be represented as:

```
z = w1*x1 + w2*x2 + b
```

Where:

- `w1` and `w2` are the weights for the inputs
- `b` is the bias term

<hr>

The earlier problem was that no linear function could reproduce the outputs of the `AND` gate exactly.

But if we pass the output of the `linear function` i.e. `z = w1*x1 + w2*x2 + b` through a `non-linear activation` function i.e. `sigmoid`, we can achieve the desired results.

**Sigmoid**

The sigmoid function is defined as:

```
σ(z) = 1 / (1 + e^(-z))
```

Where `e` is the base of the natural logarithm.

When the value of `z` is `0`, the sigmoid function outputs `0.5`.

When `z` is positive, the sigmoid function outputs a value between `0.5` and `1`. When `z` is negative, the sigmoid function outputs a value between `0` and `0.5`.

For positive values, the output converges to `1` as `z` increases. For negative values, the output converges to `0` as `z` decreases.
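
The properties listed above can be checked numerically with a short sketch. The free function `sigmoid` mirrors the formula given here, while `distance_to_limit` is a hypothetical helper added only to show the convergence toward `0` and `1`.

```C++
#include <cmath>

// Logistic sigmoid: 1 / (1 + e^(-z)).
double sigmoid(double z)
{
  return 1.0 / (1.0 + std::exp(-z));
}

// Hypothetical helper: distance between sigmoid(z) and the limit it approaches
// (1 for non-negative z, 0 for negative z). It shrinks as |z| grows.
double distance_to_limit(double z)
{
  return z >= 0.0 ? 1.0 - sigmoid(z) : sigmoid(z);
}
```

With these definitions, `sigmoid(0.0)` is `0.5`, `sigmoid(5.0)` is roughly `0.9933`, and `sigmoid(-5.0)` is roughly `0.0067`, so the two values are mirror images around `0.5`.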

<hr>

Thresholding the sigmoid output at `0.5` behaves like a step function on `z`: the output is treated as `1` if `z` is greater than or equal to `0`, and `0` otherwise.

To implement the AND gate, we need to find appropriate values for `w1`, `w2`, and `b` such that the perceptron produces the correct output for all possible combinations of inputs.

The truth table for the AND gate is as follows:

| x1  | x2  | AND |
| --- | --- | --- |
| 0   | 0   | 0   |
| 0   | 1   | 0   |
| 1   | 0   | 0   |
| 1   | 1   | 1   |

From the truth table, we can see that the perceptron should output `1` only when both `x1` and `x2` are `1`. This means we need to set the weights and bias as follows:

- `w1 = 10`
- `w2 = 10`
- `b = -15`

With these values, the perceptron will compute the following:

```text
For (0, 0): z = 10*0 + 10*0 - 15 = -15 (output 0) i.e. 0.0000003 near to 0
For (0, 1): z = 10*0 + 10*1 - 15 = -5 (output 0) i.e. 0.0066929 near to 0
For (1, 0): z = 10*1 + 10*0 - 15 = -5 (output 0) i.e. 0.0066929 near to 0
For (1, 1): z = 10*1 + 10*1 - 15 = 5 (output 1) i.e. 0.9933071 near to 1
```

As we can see, the perceptron correctly mimics the behavior of the AND gate.

<hr>

To conclude,

we can see that a `Single Perceptron` with a non-linear activation function (sigmoid) is able to model the `AND` gate. Here, the weighted sum `z` is the value the `Perceptron` computes before applying the sigmoid function.

### **The Equation of Boundary Line That Separates the Classes**

The decision boundary for the AND gate can be represented by the equation:

```text
z = 10*x1 + 10*x2 - 15

The sigmoid function outputs 0.5 when z = 0, so the boundary is:

10*x1 + 10*x2 - 15 = 0

So,

x1 + x2 = 1.5

or

x2 = -x1 + 1.5   (the form y = mx + c with m = -1 and c = 1.5)
```

This equation defines a line in the 2D space (x1, x2) that separates the two classes (0 and 1).
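
To see that the line and the sigmoid threshold describe the same boundary, here is a small sketch that classifies a point in two ways: by thresholding the sigmoid output at `0.5`, and by testing which side of `x1 + x2 = 1.5` the point lies on. The function names are hypothetical; the weights and bias are the ones used in these notes.

```C++
#include <cmath>

// Weights and bias from the AND example in these notes.
const double W1 = 10.0, W2 = 10.0, B = -15.0;

double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// Class predicted by thresholding the sigmoid output at 0.5.
int classify_by_sigmoid(double x1, double x2)
{
  return sigmoid(W1 * x1 + W2 * x2 + B) >= 0.5 ? 1 : 0;
}

// Class predicted from the boundary line x1 + x2 = 1.5 alone.
int classify_by_line(double x1, double x2)
{
  return (x1 + x2 >= 1.5) ? 1 : 0;
}
```

Both functions agree on the four corners of the truth table and on non-binary points such as `(0.9, 0.9)` or `(0.7, 0.7)`, which fall on opposite sides of the line.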

**Image**

<img src='./Notes_Images/boundary_line.png'>

### **Note**

We just witnessed how a `Simple Single Perceptron` can model the behavior of an `AND` gate using a non-linear activation function.

Now, imagine what `1000s` or `even millions` of these simple perceptrons can achieve when combined in a multi-layer architecture.

Also, note that for the `Non-Linear Activation Function` to give the correct output, the combination of `Weights` and `Bias` must be carefully chosen.

Therefore, the design of neural networks involves not just the architecture (how many layers, how many neurons per layer) but also the careful tuning of these parameters to achieve the desired performance.

<hr>

The general rule for the `AND` gate becomes:

The weights `w1` and `w2` should be positive and the bias `b` should be negative.

But,

The magnitude of the `Bias` must be larger than each individual weight and smaller than the sum of both weights, i.e. `max(w1, w2) < -b < w1 + w2`. This ensures that the perceptron will only activate (output 1) when both inputs are 1.
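
One way to make the rule concrete is to write the four rows of the truth table as inequalities on `z = w1*x1 + w2*x2 + b`: `z` must be negative for `(0, 0)`, `(0, 1)` and `(1, 0)`, and positive for `(1, 1)`. The helper below is a hypothetical sketch of that check, not part of the perceptron class.

```C++
// Hypothetical helper: true if (w1, w2, b) put the decision boundary so that
// only the input (1, 1) lands on the positive side.
bool realizes_and(double w1, double w2, double b)
{
  return b < 0.0               // (0, 0): z = b
      && w2 + b < 0.0          // (0, 1): z = w2 + b
      && w1 + b < 0.0          // (1, 0): z = w1 + b
      && w1 + w2 + b > 0.0;    // (1, 1): z = w1 + w2 + b
}
```

With the values used in these notes, `realizes_and(10, 10, -15)` holds, while a bias of `-5` fails the single-input rows and a bias of `-25` fails the `(1, 1)` row.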

**Would `Sigmoid` still classify correctly if the Weights and Bias are not carefully chosen?**

No. The `Sigmoid` function only squashes `z` into the range between 0 and 1; where the output crosses `0.5` is decided entirely by the weights and bias. If they place the decision boundary in the wrong position, some inputs fall on the wrong side and are misclassified.

For example, with `Weights` `{10,10}` and `Bias` `{-5}`, the output of the `sigmoid` would be as below (which is actually the behavior of an `OR` gate):

```bash

Gate: AND
0 AND 0 = 0.00669285 i.e. 0 which is correct
0 AND 1 = 0.993307 i.e. 1 which is not correct should be 0
1 AND 0 = 0.993307 i.e. 1 which is not correct should be 0
1 AND 1 = 1 i.e. 1 which is correct

```

Therefore, for the `Non-Linear` Activation function to work effectively, the weights and bias must be chosen carefully to create a suitable decision boundary.

For that, Gradient Descent is often used to optimize the weights and bias during the training process.
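
As a rough sketch of what such training can look like for a single sigmoid perceptron, the code below learns the `AND` gate from its truth table. It is an assumption-laden illustration rather than the training code of these notes: it uses per-sample gradient descent on the cross-entropy loss (update `w += eta * (t - y) * x`), starts from zero weights, and the names `TrainedGate`, `train_and` and `predict` are hypothetical.

```C++
#include <cmath>
#include <cstddef>

// Learned parameters of a two-input perceptron.
struct TrainedGate { double w1, w2, b; };

double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

double predict(const TrainedGate& g, double x1, double x2)
{
  return sigmoid(g.w1 * x1 + g.w2 * x2 + g.b);
}

// Hypothetical trainer: repeatedly nudges the weights and bias in the
// direction that reduces the error on each row of the AND truth table.
TrainedGate train_and(double eta, std::size_t epochs)
{
  const double X[4][2] = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
  const double T[4] = {0, 0, 0, 1};

  TrainedGate g{0.0, 0.0, 0.0}; // start from zero weights (an assumption)
  for (std::size_t e = 0; e < epochs; ++e)
  {
    for (int i = 0; i < 4; ++i)
    {
      double err = T[i] - predict(g, X[i][0], X[i][1]); // target minus output
      g.w1 += eta * err * X[i][0];
      g.w2 += eta * err * X[i][1];
      g.b += eta * err; // the bias input is treated as a constant 1
    }
  }
  return g;
}
```

Calling `train_and(0.5, 5000)` should end with positive weights and a negative bias, in line with the hand-picked values above, although the exact numbers depend on the learning rate and the number of epochs.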

### **Follow Up Questions**

**What if our inputs are not binary (0 or 1) but continuous values? How would that affect the design of the perceptron?**

**What if we want to implement a different logical operation, such as OR or XOR? How would the design of the perceptron change in those cases?**

**What if we want to implement a multi-class classification problem? How would the design of the perceptron change in that case?**


<hr>
<hr>
<hr>
<hr>


## **Our Perceptron as an AND Gate**

Now let's try to implement our Perceptron as an AND gate. The AND gate outputs 1 only if both inputs are 1, otherwise, it outputs 0.

```C++
#include "mlp.h"
#include <iostream>
#include <vector>
using namespace std;

int main()
{
  Perceptron p(2); // Object with 2 inputs on the Stack, No need to delete

  p.set_weights({10, 10, -15}); // Two input weights, plus one weight for the bias

  cout << "Gate: AND" << endl;

  cout << "0 AND 0 = " << p.run({0, 0}) << endl;
  cout << "0 AND 1 = " << p.run({0, 1}) << endl;
  cout << "1 AND 0 = " << p.run({1, 0}) << endl;
  cout << "1 AND 1 = " << p.run({1, 1}) << endl;

  return 0;
}

// Output

// Gate: AND
// 0 AND 0 = 3.05902e-07
// 0 AND 1 = 0.00669285
// 1 AND 0 = 0.00669285
// 1 AND 1 = 0.993307
```


<hr>
<hr>
<hr>
<hr>


## **OR Gate**

<img src='./Notes_Images/or_gate.png'>

The OR gate outputs 1 if at least one of the inputs is 1, otherwise, it outputs 0.

<hr>

One suitable choice is weights `{15,15}` and bias `-10`.

The decision boundary for the OR gate can be expressed as:

```bash
15x + 15y - 10 = 0

then

y = -x + 2/3

```

Below are the outputs of Sigmoid function for the OR gate:

```C++

// Output

// Gate: OR
// 0 OR 0 = 4.53979e-05
// 0 OR 1 = 0.993307
// 1 OR 0 = 0.993307
// 1 OR 1 = 1

```

**OR Gate Boundary Line Equation**

<img src='./Notes_Images/or_gate_boundary.png'>


<hr>
<hr>
<hr>
<hr>


## **Linear Separability**

`Linear separability` is a property of a dataset that allows it to be separated into different classes using a `linear boundary`. In the context of neural networks, this means that a single layer perceptron can be used to classify the data points.

For example, the `AND` gate is linearly separable because we can draw a `straight line` (or `hyperplane` in higher dimensions) that separates the positive examples (1s) from the negative examples (0s).

Similarly, the `OR` gate is also linearly separable for the same reason.

**Note**

- Both the `Straight Line` i.e. `y = mx + b` and the `Hyperplane` i.e. `W*x + b = 0` can be used to separate linearly separable data.

- For 2 dimensional data i.e. 2 input features we need the equation of line, for 3 dimensional data, we need the equation of plane i.e. `Ax + By + Cz + D = 0` and for data whose dimension is greater than 3, we need the equation of hyperplane i.e. `W*x + b = 0`, where `W` is the weight vector and `b` is the bias.

<hr>

On the other hand, the `XOR` gate is not linearly separable because there is no single straight line that can separate the positive examples from the negative examples.

Below is the Graphical representation of the `XOR` gate:

<img src='./Notes_Images/xor_gate.png'>

Here, we cannot separate the positive and negative examples with a single straight line. We would need two lines to separate the classes.

If we use an `OR` gate only, it will get all but one of the `XOR` gate inputs correct. The `OR` gate will output `1` for the `(1, 1)` input, which is incorrect for the `XOR` operation.

<img src='./Notes_Images/or_for_xor.png'>

If we use a `NAND` gate, it will also give exactly one incorrect output: it outputs `1` for the `(0, 0)` input, where the `XOR` gate outputs `0`.

<img src='./Notes_Images/nand_for_xor.png'>

**But**

If we combine the outputs of the `NAND` gate and the `OR` gate with an `AND` gate, we can create a circuit that correctly implements the `XOR` function.

<img src='./Notes_Images/nand_or_xor.png'>

### **Creating XOR with NAND, AND, and OR Gates**

To create an `XOR` gate using `NAND`, `AND`, and `OR` gates, we can use the following configuration:

1. **Inputs**: A and B

2. **NAND Gate**: Connect A and B to a `NAND` gate. This will give us the output `NAND(A, B)`.

3. **OR Gate**: Connect A and B to an `OR` gate. This will give us the output `OR(A, B)`.

4. **AND Gate**: Connect the outputs of the `NAND` gate and the `OR` gate to an `AND` gate. This will give us the final output `XOR(A, B)`.

The logical expression for the `XOR` gate can be represented as:

```bash
XOR(A, B) = AND(NAND(A, B), OR(A, B))
```

This configuration allows us to implement the `XOR` function using only `NAND`, `AND`, and `OR` gates.
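
The configuration above can be sketched with plain Boolean functions to confirm that it reproduces the `XOR` truth table. The function names are hypothetical and chosen to mirror the gate names.

```C++
// Basic gates on Boolean values.
bool nand_gate(bool a, bool b) { return !(a && b); }
bool or_gate(bool a, bool b) { return a || b; }
bool and_gate(bool a, bool b) { return a && b; }

// XOR(A, B) = AND(NAND(A, B), OR(A, B))
bool xor_gate(bool a, bool b)
{
  return and_gate(nand_gate(a, b), or_gate(a, b));
}
```

For every combination of inputs, `xor_gate(a, b)` is true exactly when `a` and `b` differ.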

**XOR Diagram**

<img src='./Notes_Images/xor_diagram.png'>

### **Neural Network for XOR Gate**

We know a single perceptron can solve a linearly separable problem, but the `XOR` function is not linearly separable. Therefore, we need a neural network with `3 perceptrons` i.e. `2 in the hidden layer and 1 in the output layer`.

Also, we know that a single perceptron can represent all the three basic logic gates: `AND`, `OR`, and `NAND` each having different `weight` and `bias` configurations.

**Linear Equations for Basic Logic Gates**

`OR Gate` : `y = -x + 2/3` with Weights `{15,15}` and Bias `{-10}`

`NAND Gate` : `y = -x + 1.5` with Weights `{-10,-10}` and Bias `{15}`

Then we feed the outputs of the `OR Gate` and the `NAND Gate` as inputs to the `AND Gate`

`AND Gate` : `y = -x + 1.5` with Weights `{10,10}` and Bias `{-15}`
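
A minimal sketch of this three-perceptron arrangement, using the weights and biases listed above and plain functions instead of the `Perceptron` class. The names `neuron` and `xor_network` are hypothetical.

```C++
#include <cmath>

double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// One sigmoid neuron with two inputs; the bias b is added directly to the sum.
double neuron(double w1, double w2, double b, double x1, double x2)
{
  return sigmoid(w1 * x1 + w2 * x2 + b);
}

// Hidden layer: NAND and OR neurons. Output layer: AND neuron over their outputs.
double xor_network(double x1, double x2)
{
  double h_nand = neuron(-10.0, -10.0, 15.0, x1, x2);
  double h_or = neuron(15.0, 15.0, -10.0, x1, x2);
  return neuron(10.0, 10.0, -15.0, h_nand, h_or);
}
```

For the four binary inputs the result stays within about `0.01` of the `XOR` target: close to `0` for `(0, 0)` and `(1, 1)`, close to `1` for `(0, 1)` and `(1, 0)`.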

<hr>


## **Multi Layer Perceptron**

A Multi-Layer Perceptron (MLP) is a type of neural network that consists of multiple layers of neurons, including an input layer, one or more hidden layers, and an output layer. MLPs are capable of learning complex patterns in data and can be used for a variety of tasks, including classification and regression.

**Image of MLP for XOR Gate**

<img src='./Notes_Images/mlp.png'/>

### **Architecture of MLP for XOR Gate**

1. **Input Layer**: The input layer consists of two inputs, each representing one of the input features (X1 and X2) of the XOR gate.

2. **Hidden Layer**: The hidden layer contains two neurons. One of the neurons represents the `NAND Gate` operation, while the other represents the `OR` operation.

3. **Output Layer**: The output layer consists of a single neuron that produces the final output of the network. This neuron receives inputs from both hidden layer neurons, applies a weighted sum and a non-linear activation function, and produces the final output (Y) of the `XOR` gate.


## **XOR Gate Implementation**

<hr>

In the `mlp.h` file, we previously wrote the declaration for the single `Perceptron` class. Now we will add a `MultiLayerPerceptron` class, built from `Perceptron` objects, that can handle the XOR problem.

### **`mlp.h`**

```C++
#pragma once

#include <algorithm>
#include <vector>
#include <iostream>
#include <random>
#include <numeric>
#include <cmath>
#include <time.h>

class Perceptron
{
public:
  std::vector<double> weights;
  double bias;

  // Constructor
  Perceptron(size_t inputs, double bias = 1.0);

  // Run the Perceptron
  double run(std::vector<double> x);

  // Set Custom Weights if Needed
  void set_weights(std::vector<double> w_init);

  // Sigmoid Activation Function
  double sigmoid(double x);
};

class MultiLayerPerceptron
{
public:
  // Constructor for initializing layers
  MultiLayerPerceptron(std::vector<size_t> layers, double bias = 1.0, double eta = 0.5);

  // Set custom weights, w_init for weights of 3 perceptron
  void set_weights(std::vector<std::vector<std::vector<double>>> w_init);

  // Display the weights
  void print_weights();

  // Run the MLP
  std::vector<double> run(std::vector<double> x);

  // For Backpropagation
  double bp(std::vector<double> x, std::vector<double> y);

  // Attributes

  std::vector<size_t> layers; // Unsigned Integers, Number of Neurons Per Layer, e.g. {2, 2, 1} for XOR: 2 Inputs, 2 Hidden, 1 Output

  double bias; // Bias
  double eta;  // Learning Rate

  std::vector<std::vector<Perceptron>> network; // Neural Network
  std::vector<std::vector<double>> values;      // Holds the Output Values of the Network
  std::vector<std::vector<double>> d;           // Error Terms for the Neurons
};
```

In the above code,

The `MultiLayerPerceptron` class is designed to handle the XOR problem by utilizing multiple layers of neurons.

**`MultiLayerPerceptron(std::vector<size_t> layers, double bias = 1.0, double eta = 0.5)`** : This constructor initializes the MLP with the specified layer structure, bias, and learning rate. It creates the necessary layers and populates the network with `Perceptron` objects.

**`run(std::vector<double> x)`** : This method takes an input vector `x` and passes it through the network, returning the output of the MLP.

**`bp(std::vector<double> x, std::vector<double> y)`** : This method performs backpropagation to update the weights of the network based on the error between the predicted output and the true output `y`.

**`std::vector<std::vector<Perceptron>> network;`** : This attribute holds the layers of the MLP, where each layer is a vector of `Perceptron` objects.

**`std::vector<std::vector<double>> values;`** : This attribute stores the output values of each neuron in the network for a given input.

**`std::vector<std::vector<double>> d;`** : This attribute holds the error terms for the neurons, which are used during backpropagation to update the weights.

<hr>

### **`mlp.cpp`**

```C++

#include "mlp.h"
#include <iostream>
using namespace std;

// Random Number Generator Function
double frand()
{
  return (2.0 * (double)rand() / RAND_MAX) - 1.0;
}

/*
Single Layer Perceptron Implementation
*/

// Return a new Perceptron Object with the Specified number of Inputs (+1 for the bias)

Perceptron::Perceptron(size_t inputs, double bias)
{
  this->bias = bias;

  // Initialize the Weights as Random numbers of Double between -1 and 1

  weights.resize(inputs + 1); // Resize the Vector for Weights + Bias

  // Generate Random Numbers and Fill in the Vectors. Pass the frand function to generate the number

  generate(weights.begin(), weights.end(), frand);
}

// Run Function
// Feeds an Input Vector X into the perceptron to return the activation function output.

double Perceptron::run(std::vector<double> x)
{

  // Add the bias at the end
  x.push_back(bias);

  // Weighted Sum
  double sum = inner_product(x.begin(), x.end(), weights.begin(), (double)0.0);

  return sigmoid(sum); // Pass into the sigmoid function
}

// Set the weights. w_init is a vector with the Weights

void Perceptron::set_weights(std::vector<double> w_init)
{
  weights = w_init; // Copies the vector
}

// Evaluate the Sigmoid Function for the floating point of input

double Perceptron::sigmoid(double x)
{
  return 1.0 / (1.0 + exp(-x));
}

/*
Multi Layer Perceptron Implementation
*/

// Return a new MultiLayerPerceptron Object with the Specified Layer Sizes, Bias and Learning Rate

MultiLayerPerceptron::MultiLayerPerceptron(std::vector<size_t> layers, double bias, double eta) : layers(layers), bias(bias), eta(eta)
{
  // Create Neurons Layer By Layer

  for (size_t i = 0; i < layers.size(); i++)
  {
    // Add Vector of Values Filled with Zeros
    values.push_back(vector<double>(layers[i], 0.0)); // Output of Each Neuron Value set to Zero based on the number of Neurons in Each layer

    // Add Vector of Neurons
    network.push_back(vector<Perceptron>());
  }
}
```

Here, in the above code for `MLP`,

**`MultiLayerPerceptron::MultiLayerPerceptron(std::vector<size_t> layers, double bias, double eta)`** : This constructor stores the layer structure, bias, and learning rate through its member initializer list. In the code shown so far, it then creates one vector of output values and one (still empty) vector of `Perceptron` objects per layer.

```C++

// Create Neurons Layer By Layer

for (size_t i = 0; i < layers.size(); i++)
{
  // Add Vector of Values Filled with Zeros
  values.push_back(vector<double>(layers[i], 0.0));
}

```

The above code creates the necessary layers for the MLP by adding vectors of zeros for each layer. This sets up the structure for the neuron outputs, which will be filled during the forward pass of the network.

`vector<double>(layers[i], 0.0)` : Uses the `std::vector` constructor to create a vector of the specified size (`layers[i]`) initialized with zeros (`0.0`).
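
To illustrate the fill constructor in isolation, the sketch below builds the same kind of zero-initialized structure for a given list of layer sizes. The free function `make_values` is a hypothetical stand-in for the loop inside the constructor.

```C++
#include <cstddef>
#include <vector>

// Hypothetical helper: one zero-filled vector per layer, sized by neuron count.
std::vector<std::vector<double>> make_values(const std::vector<std::size_t>& layers)
{
  std::vector<std::vector<double>> values;
  for (std::size_t n : layers)
  {
    values.push_back(std::vector<double>(n, 0.0)); // n elements, each set to 0.0
  }
  return values;
}
```

For the XOR layout `{2, 2, 1}`, this yields three inner vectors holding two, two, and one zero respectively.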

```C++
// Add Vector of Neurons
network.push_back(vector<Perceptron>()); // Perceptron Constructor, Empty for Now
```

The above code adds an empty vector of `Perceptron` objects for each layer in the network. This sets up the per-layer structure; the neurons themselves are not created yet. When a `Perceptron` is eventually constructed, its constructor fills its weights with random values between -1 and 1.
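
The notes stop before the neurons are actually created, so the following is only one plausible way the empty layers could be filled, stated as an assumption: every neuron in layer `i` receives one weight per neuron in layer `i - 1`, plus one for the bias, and the input layer holds no neurons. The `Neuron` struct is a simplified stand-in for `Perceptron` with zeroed weights, and `build_network` is a hypothetical name.

```C++
#include <cstddef>
#include <vector>

// Simplified stand-in for the Perceptron class: only the weight count matters
// here, so the weights are left at zero instead of being randomized.
struct Neuron
{
  std::vector<double> weights;
  explicit Neuron(std::size_t inputs) : weights(inputs + 1, 0.0) {} // +1 for the bias
};

// Hypothetical helper: creates the neurons of every layer after the input layer.
std::vector<std::vector<Neuron>> build_network(const std::vector<std::size_t>& layers)
{
  std::vector<std::vector<Neuron>> network(layers.size());
  for (std::size_t i = 1; i < layers.size(); ++i) // layer 0 only carries the inputs
  {
    for (std::size_t j = 0; j < layers[i]; ++j)
    {
      network[i].push_back(Neuron(layers[i - 1])); // one weight per previous-layer output
    }
  }
  return network;
}
```

For the layout `{2, 2, 1}`, this gives an empty input layer, two hidden neurons with three weights each, and one output neuron with three weights.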
