<a href="https://colab.research.google.com/github/mohammadreza-mohammadi94/Deep_Learning_Projects/blob/main/PyTorch_Faradars/13_FashionMNISTClassification_NeuralNetwork.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [6]:
import os
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Data Retrieval

In [2]:
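# Training split: 60,000 images; ToTensor() converts each to a float tensor in [0, 1].
train_data = datasets.FashionMNIST(
    root='./train',
    train=True,
    download=True,
    transform=transforms.ToTensor()
)

# Test split: 10,000 images.
# Note: FashionMNIST fetches all four archives into `root` regardless of `train`,
# so using two different roots downloads the data twice; a shared root would avoid that.
test_data = datasets.FashionMNIST(
    root='./test',
    train=False,
    download=True,
    transform=transforms.ToTensor()
)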

# Getting Training Device

In [3]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f'Using {device} device')

Using cpu device
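
The device string only has an effect once modules and tensors are moved to it. The following is a minimal sketch of that step, assuming a stand-in `nn.Linear` layer and a random batch (both hypothetical, not part of the notebook):

```python
import torch
import torch.nn as nn

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Stand-in layer and batch, used only to illustrate device placement.
layer = nn.Linear(784, 10).to(device)    # moves the module's parameters to the device
batch = torch.rand(8, 784).to(device)    # returns a tensor on the target device

out = layer(batch)
print(out.device.type == device)
print(out.shape)  # torch.Size([8, 10])
```

Inputs and parameters must live on the same device, otherwise the forward pass raises a runtime error.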


# Defining the Model

**Model's architecture:**

Three layers:
* `Fully-connected Layer_1 --> (784, 64)`
* `Fully-connected Layer_2 --> (64, 64)`
* `Fully-connected Layer_3 --> (64, 10)`
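
As a quick check on the size of this architecture, the number of trainable parameters can be worked out by hand. The sketch below uses plain Python; the helper name `linear_param_count` is hypothetical and introduced only for illustration:

```python
# Layer sizes taken from the architecture list above: (in_features, out_features).
layer_shapes = [(784, 64), (64, 64), (64, 10)]

def linear_param_count(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

per_layer = [linear_param_count(n_in, n_out) for n_in, n_out in layer_shapes]
total_params = sum(per_layer)

print(per_layer)      # [50240, 4160, 650]
print(total_params)   # 55050
```

Most of the parameters sit in the first layer, because it is the one that sees the full 784-pixel input.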

In [9]:
class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(in_features = 784,
                             out_features = 64)
        self.fc2 = nn.Linear(in_features = 64,
                             out_features = 64)
        self.fc3 = nn.Linear(in_features = 64,
                             out_features = 10)


    def forward(self, x):
        x = x.view(-1, 784)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.softmax(self.fc3(x), dim=1)
        return x

In [10]:
model = SimpleNN().to(device)  # place the parameters on the device selected above
model

SimpleNN(
  (fc1): Linear(in_features=784, out_features=64, bias=True)
  (fc2): Linear(in_features=64, out_features=64, bias=True)
  (fc3): Linear(in_features=64, out_features=10, bias=True)
)

This code defines a simple feedforward neural network (often called a **fully connected network** or **Multi-Layer Perceptron (MLP)**) using PyTorch. Below is a step-by-step explanation of each component, with particular focus on the `__init__` and `forward` methods, which are critical for defining and applying the architecture.

### 1. **Class Definition:**
```python
class SimpleNN(nn.Module):
```
- **Explanation**: The `SimpleNN` class inherits from `torch.nn.Module`, the base class for all neural network modules in PyTorch. This base class helps organize the layers and forward-pass operations of the neural network.
- **Purpose**: By subclassing `nn.Module`, you define the layers and logic of your custom network while inheriting PyTorch utilities such as `parameters()` (consumed by optimizers), `.to(device)`, `state_dict()`, and the `train()`/`eval()` modes.
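
To make the bookkeeping done by `nn.Module` concrete, here is a small sketch. `TinyNet` is a hypothetical class used only for illustration; it is not part of the notebook:

```python
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical example class
    def __init__(self):
        super().__init__()
        # Assigning an nn.Module as an attribute registers it as a submodule.
        self.fc = nn.Linear(in_features=4, out_features=2)

    def forward(self, x):
        return self.fc(x)

tiny = TinyNet()

# Registered parameters are discoverable by name without any extra code.
param_shapes = {name: tuple(p.shape) for name, p in tiny.named_parameters()}
print(param_shapes)             # {'fc.weight': (2, 4), 'fc.bias': (2,)}
print(list(tiny.state_dict()))  # ['fc.weight', 'fc.bias']
```

The same mechanism is what lets an optimizer receive every weight of `SimpleNN` through a single `model.parameters()` call.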

---

### 2. **The `__init__` Method (Initialization)**
```python
def __init__(self):
    super().__init__()
    self.fc1 = nn.Linear(in_features=784, out_features=64)
    self.fc2 = nn.Linear(in_features=64, out_features=64)
    self.fc3 = nn.Linear(in_features=64, out_features=10)
```

#### **Step-by-step Explanation**:
- **`__init__`**: This is the constructor of the class. It is called when you create an instance of the `SimpleNN` class.
  
- **`super().__init__()`**: This calls the parent class `nn.Module`'s constructor. This is necessary because `nn.Module` contains some essential setup steps (such as tracking parameters, layers, etc.) that need to be initialized.

- **Defining Layers**:
    - `self.fc1 = nn.Linear(in_features=784, out_features=64)`:
        - **`nn.Linear`** defines a fully connected (dense) layer with 784 input features and 64 output features. Every input feature is connected to every output unit: the layer computes `y = x @ W.T + b`, where the weight `W` (shape `[64, 784]`) and the bias `b` (shape `[64]`) are learned during training.
        - **Why 784?**: Fashion-MNIST images are 28x28 grayscale pixels (the same format as MNIST), so flattening one image yields a 784-dimensional vector (28 x 28 = 784). This is why the input size is 784.
        - **Why 64?**: This is a design choice. You can experiment with different numbers of neurons in this hidden layer, but 64 is a typical small hidden layer size.

    - `self.fc2 = nn.Linear(in_features=64, out_features=64)`:
        - The second layer also has 64 input features and 64 output features. It is another fully connected layer, meaning each neuron in this layer is connected to each neuron in the previous layer.

    - `self.fc3 = nn.Linear(in_features=64, out_features=10)`:
        - The final layer has 64 input features and 10 output features, one per class. Fashion-MNIST has 10 clothing categories (T-shirt/top, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag, ankle boot).
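
What a single `nn.Linear` layer computes can be checked directly. The sketch below recreates a layer with the same sizes as `fc1` and compares it with the explicit matrix form; the random batch `x` and the seed are arbitrary stand-ins:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

fc1 = nn.Linear(in_features=784, out_features=64)
x = torch.randn(5, 784)  # stand-in batch of 5 flattened 28x28 images

# nn.Linear stores its weight as (out_features, in_features).
print(fc1.weight.shape)  # torch.Size([64, 784])
print(fc1.bias.shape)    # torch.Size([64])

# The layer output should match x @ W^T + b up to floating-point rounding.
manual = x @ fc1.weight.T + fc1.bias
layer_out = fc1(x)
print(layer_out.shape)   # torch.Size([5, 64])
print(torch.allclose(layer_out, manual, atol=1e-5))
```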
  
#### **Purpose of `__init__`**:
- The `__init__` method is used to **define the architecture** of the neural network. It creates the layers (here, 3 fully connected layers, each with randomly initialized weights and biases) and specifies the number of input and output features for each one. No input data is processed in this method; it simply sets up the building blocks of the network.

---

### 3. **The `forward` Method (Forward Pass)**
```python
def forward(self, x):
    x = x.view(-1, 784)
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = F.softmax(self.fc3(x), dim=1)
    return x
```

#### **Step-by-step Explanation**:
- **`forward`**: This method defines how the input `x` will pass through the network layers and what computations will be applied at each layer. It is automatically called when you pass data through the model (i.e., `model(input)`).

- **Input Reshaping**:
  - `x = x.view(-1, 784)`:
    - This line reshapes the input tensor `x` to have 784 features.
    - **Why `view(-1, 784)`?**: The `-1` means that PyTorch will automatically infer the appropriate batch size from the data. For instance, if your input is a batch of images of shape `[batch_size, 1, 28, 28]`, the `.view(-1, 784)` flattens each image from 28x28 into a vector of size 784 while keeping the batch size the same.
  
- **First Layer (fc1)**:
  - `x = F.relu(self.fc1(x))`:
    - The input `x` is passed through the first fully connected layer `fc1`. This layer takes the 784-dimensional input and reduces it to 64 dimensions.
    - The output is then passed through the **ReLU activation function** (`F.relu`). ReLU (Rectified Linear Unit) introduces non-linearity into the model, allowing it to learn more complex representations.
  
- **Second Layer (fc2)**:
  - `x = F.relu(self.fc2(x))`:
    - Similarly, the output from the first layer (which now has 64 features) is passed through the second fully connected layer `fc2`, which maintains the 64-dimensional size.
    - Again, ReLU is applied to introduce non-linearity.
  
- **Third Layer (fc3)**:
  - `x = F.softmax(self.fc3(x), dim=1)`:
    - The output from the second layer is passed to the final fully connected layer `fc3`, which outputs a 10-dimensional vector (since there are 10 output classes).
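    - The **Softmax activation function** is applied to this output. Softmax converts the 10 raw output values (logits) into probabilities that sum to 1 across the classes (`dim=1`), so the predicted class is the one with the highest probability. Note that `nn.CrossEntropyLoss` applies log-softmax internally and expects raw logits; if the model is trained with that loss, `forward` should return `self.fc3(x)` directly, and Softmax should be applied only when probabilities are needed.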

- **Return**:
  - The output `x` is returned. This is typically a tensor containing the class probabilities for each input sample in the batch.
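
The steps above can be traced on a dummy batch to see how the tensor shape changes at each stage. This is a sketch with freshly created layers of the same sizes and a random stand-in batch of 32 images, not the trained model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

fc1, fc2, fc3 = nn.Linear(784, 64), nn.Linear(64, 64), nn.Linear(64, 10)

x = torch.rand(32, 1, 28, 28)   # stand-in batch shaped like ToTensor() output
shapes = [tuple(x.shape)]

x = x.view(-1, 784)             # flatten each image; -1 infers the batch size
shapes.append(tuple(x.shape))

x = F.relu(fc1(x))
shapes.append(tuple(x.shape))

x = F.relu(fc2(x))
shapes.append(tuple(x.shape))

probs = F.softmax(fc3(x), dim=1)
shapes.append(tuple(probs.shape))

print(shapes)
# [(32, 1, 28, 28), (32, 784), (32, 64), (32, 64), (32, 10)]

row_sums = probs.sum(dim=1)     # each row should sum to 1 (up to rounding)
print(torch.allclose(row_sums, torch.ones(32), atol=1e-5))
```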

#### **Purpose of `forward`**:
- The `forward` method is where the **actual computations** happen. It defines how the input data flows through the network and what transformations (linear layers, activations) are applied. This method is essential for training and inference, as it specifies how inputs map to outputs.
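
In practice `forward` is rarely called by name. The sketch below repeats the `SimpleNN` class so that it runs on its own, then calls the model on a random stand-in batch and reads off the predicted class indices. The weights are untrained, so the predictions themselves are meaningless:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 64)
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = x.view(-1, 784)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return F.softmax(self.fc3(x), dim=1)

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
model = SimpleNN()
model.eval()

batch = torch.rand(4, 1, 28, 28)  # stand-in for four Fashion-MNIST images

with torch.no_grad():               # no gradients needed for inference
    probs = model(batch)            # model(...) dispatches to forward (and runs any hooks)
    direct = model.forward(batch)   # same computation, bypassing hooks

predicted = probs.argmax(dim=1)     # index of the most probable class per image
print(torch.allclose(probs, direct))
print(predicted.shape)  # torch.Size([4])
```

Calling `model(batch)` rather than `model.forward(batch)` is the usual convention, since it also triggers any registered hooks.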

---

### Summary of Purpose:
- **`__init__`**: Initializes the network's layers, setting up the architecture (no input data is processed here). The layers are stored as attributes of the model, so they can be reused on every forward pass.
- **`forward`**: Defines the forward pass (i.e., how data moves through the network). It performs all the computations, including applying activation functions like ReLU and Softmax.
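
One practical consequence of ending `forward` with Softmax shows up when a loss is chosen. The sketch below assumes training would use `nn.CrossEntropyLoss` (the loss is not shown in this section, so this is an assumption) and uses random stand-in logits and labels:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

logits = torch.randn(6, 10)            # stand-in raw outputs of fc3
targets = torch.randint(0, 10, (6,))   # stand-in class labels

ce = nn.CrossEntropyLoss()

# CrossEntropyLoss on logits equals NLL loss on log-softmax of those logits.
loss_from_logits = ce(logits, targets)
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

# Feeding probabilities instead means softmax is effectively applied twice.
loss_from_probs = ce(F.softmax(logits, dim=1), targets)

print(torch.allclose(loss_from_logits, manual))
print(loss_from_logits.item(), loss_from_probs.item())
```

The two printed losses differ, which is why a model trained with `nn.CrossEntropyLoss` usually returns raw logits and applies Softmax only when probabilities are needed.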
