## Introduction
Suppose you have a set of 28x28 grayscale images, and your task is to 
assign each image to one of 10 classes. You can try a neural network for 
this task, so you devise the following architecture:  

![image1.png](../images/image1.png)  

Each input node corresponds to one pixel of the image. The hidden layers use 
an activation function (ReLU in this case), and the output layer ends in a 
final decision layer (LogSoftmax) over the ten classes.
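
To make the two functions concrete, here is a minimal pure-Python sketch of what ReLU and LogSoftmax compute. The helper names `relu` and `log_softmax` and the sample scores are illustrative assumptions, not part of the notebook; in PyTorch the corresponding layers are `nn.ReLU` and `nn.LogSoftmax`.

```python
import math

def relu(values):
    """Replace every negative value with zero and keep the rest."""
    return [max(0.0, v) for v in values]

def log_softmax(scores):
    """Return the logarithm of the softmax probabilities of the scores."""
    # Subtracting the maximum keeps the exponentials numerically stable.
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]

# Hypothetical hidden-layer values before the activation.
hidden = relu([-1.0, 0.0, 2.5])

# Hypothetical raw output scores, one per class.
scores = [2.0, -1.0, 0.5, 0.0, 3.0, -2.0, 1.0, 0.0, -0.5, 1.5]
log_probs = log_softmax(scores)

# The predicted class is the one with the largest log-probability.
predicted_class = max(range(len(log_probs)), key=log_probs.__getitem__)
print(hidden, predicted_class)
```

The exponentials of the log-probabilities sum to one, so the output can be read as a probability distribution over the ten classes.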

We could try another architecture in which each hidden layer has 784 nodes, 
that is, the same number as the input layer. Each node is connected to every 
node in the next layer, and each connection carries a weight. What we want is 
to find the weights that minimize the prediction error, and that is what 
training does. Before training, we can calculate how many parameters (the 
weights) the network will need. For this case:

![image2.png](../images/image2.png)  
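
The count can be reproduced with a few lines of Python. This sketch assumes two hidden layers of 784 nodes each and one bias per node in every layer after the input; the figure may use a different depth or omit biases, and the helper `count_dense_parameters` is a hypothetical name introduced here. For comparison, it also counts the 784-512-512-10 layer sizes used in the model defined later in this notebook.

```python
def count_dense_parameters(layer_sizes):
    """Count the weights and biases of a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection between the layers
        total += n_out         # one bias per node in the receiving layer
    return total

# Assumed architecture: 784 inputs, two hidden layers of 784 nodes, 10 outputs.
wide = count_dense_parameters([784, 784, 784, 10])

# Layer sizes of the model trained later in this notebook.
narrow = count_dense_parameters([784, 512, 512, 10])

print(wide)    # 1238730
print(narrow)  # 669706
```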

In [1]:
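import torch

# Use Apple's Metal (MPS) backend when it is available. Otherwise fall back
# to the CPU so that later cells that reference mps_device still run.
if torch.backends.mps.is_available():
    mps_device = torch.device("mps")
    x = torch.ones(1, device=mps_device)
    print(x)
else:
    mps_device = torch.device("cpu")
    print("MPS device not found.")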

tensor([1.], device='mps:0')


In [7]:
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

In [8]:
training_data = datasets.FashionMNIST(
    root='data',
    train=True,
    download=True,
    transform=ToTensor()
)

test_data = datasets.FashionMNIST(
    root='data',
    train=False,
    download=True,
    transform=ToTensor(),
)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz


100.0%


Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz


100.0%


Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz


100.0%


Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz


100.0%

Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw

In [13]:
# DataLoader wraps an iterable over our dataset, and supports automatic 
# batching, sampling, shuffling and multiprocess data loading
batch_size = 64
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    # N: Batch size, C: Channels, H: Height, W: Width
    print(f'Size of X[N,C,H,W]: {X.shape}, type: {X.dtype}')
    print(f'Size of y: {y.shape}, type: {y.dtype}')
    break

Size of X[N,C,H,W]: torch.Size([64, 1, 28, 28]), type: torch.float32
Size of y: torch.Size([64]), type: torch.int64
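
With a batch size of 64, neither dataset divides evenly, so the last batch of each epoch is smaller than the rest (a `DataLoader` keeps it by default, since `drop_last` defaults to `False`). The sketch below works out the numbers for the 60,000 training images and 10,000 test images of FashionMNIST; the helper names `batches_per_epoch` and `last_batch_size` are hypothetical.

```python
import math

def batches_per_epoch(num_samples, batch_size):
    """Number of batches per epoch when the final partial batch is kept."""
    return math.ceil(num_samples / batch_size)

def last_batch_size(num_samples, batch_size):
    """Size of the final batch, which may be smaller than batch_size."""
    remainder = num_samples % batch_size
    return remainder if remainder else batch_size

train_batches = batches_per_epoch(60_000, 64)  # 938
test_batches = batches_per_epoch(10_000, 64)   # 157
train_tail = last_batch_size(60_000, 64)       # 32
test_tail = last_batch_size(10_000, 64)        # 16
print(train_batches, test_batches, train_tail, test_tail)
```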


In [14]:
print(f'{mps_device}')

mps


In [33]:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Dropout(p=0.3),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10))
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(mps_device)
print(model)

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Dropout(p=0.3, inplace=False)
    (3): Linear(in_features=512, out_features=512, bias=True)
    (4): ReLU()
    (5): Linear(in_features=512, out_features=10, bias=True)
  )
)
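
The `Dropout(p=0.3)` layer behaves differently in training and evaluation, which is why the mode set by `model.train()` or `model.eval()` matters. The following pure-Python sketch illustrates the idea of inverted dropout, where surviving values are scaled by `1 / (1 - p)` during training and nothing changes during evaluation. The function `dropout` and its `rng` argument are illustrative assumptions, not PyTorch's implementation.

```python
import random

def dropout(values, p, training, rng):
    """Inverted dropout: zero each value with probability p while training."""
    if not training or p == 0.0:
        # In evaluation mode the layer passes its input through unchanged.
        return list(values)
    scale = 1.0 / (1.0 - p)  # keeps the expected value of each entry the same
    return [0.0 if rng.random() < p else v * scale for v in values]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0, 2.0, 3.0, 4.0, 5.0]

train_out = dropout(activations, p=0.3, training=True, rng=rng)
eval_out = dropout(activations, p=0.3, training=False, rng=rng)
print(train_out)
print(eval_out)
```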


In [34]:
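# CrossEntropyLoss applies log-softmax internally, so the model outputs raw logits
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()  # enable training-only behavior such as Dropout
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(mps_device), y.to(mps_device)
        # Forward pass: compute the predictions and the loss
        pred = model(X)
        loss = loss_fn(pred, y)
        # Backward pass: compute gradients, update the weights, reset gradients
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        # Every 100 batches, report the loss and the fraction of the dataset seen
        if batch % 100 == 0:
            loss, current = loss.item(), (batch+1) * len(X)
            print(f'Loss: {loss:>.8f}, [{(current / size)}]')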
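
`nn.CrossEntropyLoss` takes raw logits and, with its default settings, returns the mean negative log-probability of the correct class, which is why the model ends in a plain `nn.Linear` layer rather than a LogSoftmax layer. The pure-Python sketch below illustrates that computation; the function `cross_entropy` and the sample logits are assumptions made for illustration.

```python
import math

def cross_entropy(logits_batch, targets):
    """Mean negative log-probability of the target class over a batch."""
    total = 0.0
    for logits, target in zip(logits_batch, targets):
        # log(sum(exp(logits))), computed stably by factoring out the maximum
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-softmax of the target class
        total += log_sum - logits[target]
    return total / len(targets)

# Equal logits over 10 classes mean a uniform guess, so the loss is log(10).
uniform_loss = cross_entropy([[0.0] * 10], [3])

# A confident, correct prediction gives a loss close to zero.
confident_loss = cross_entropy([[10.0, 0.0, 0.0]], [0])
print(uniform_loss, confident_loss)
```

The uniform case evaluates to about 2.3026, which is close to the first loss an untrained 10-class network typically reports.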

In [35]:
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()  # disable training-only behavior such as Dropout
    test_loss, correct = 0, 0
    # Gradients are not needed for evaluation
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(mps_device), y.to(mps_device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            # The predicted class is the index of the largest logit
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches  # average loss per batch
    correct /= size           # fraction of correctly classified samples
    print(f'Test error: \n Accuracy {correct*100:>.1f}% ,avg loss: {test_loss:>.8f}')
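
The accuracy in `test` comes from comparing the index of the largest logit with the label for each sample. A minimal pure-Python sketch of the same idea follows; the function `accuracy` and the three sample rows are hypothetical.

```python
def accuracy(logits_batch, targets):
    """Fraction of samples whose largest logit is at the target index."""
    correct = 0
    for logits, target in zip(logits_batch, targets):
        # argmax: the index of the largest logit is the predicted class
        predicted = max(range(len(logits)), key=logits.__getitem__)
        correct += int(predicted == target)
    return correct / len(targets)

# Three hypothetical samples with three classes each; the last one is wrong.
logits_batch = [
    [2.0, 0.1, -1.0],  # predicts class 0
    [0.0, 0.5, 3.0],   # predicts class 2
    [1.0, 4.0, 0.0],   # predicts class 1
]
targets = [0, 2, 0]
acc = accuracy(logits_batch, targets)
print(f'Accuracy {acc * 100:.1f}%')
```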
    



In [36]:
epochs = 10
for t in range(epochs):
    print(f'Epoch {t+1}\n')
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print('Done!')

Epoch 1

Loss: 2.30007744, [0.0010666666666666667]
Loss: 0.62993050, [0.10773333333333333]
Loss: 0.36848697, [0.2144]
Loss: 0.54830742, [0.32106666666666667]
Loss: 0.57817626, [0.42773333333333335]
Loss: 0.46647772, [0.5344]
Loss: 0.42265308, [0.6410666666666667]
Loss: 0.53752494, [0.7477333333333334]
Loss: 0.53509820, [0.8544]
Loss: 0.53787357, [0.9610666666666666]
Test error: 
 Accuracy 84.4% ,avg loss: 0.42449598
Epoch 2

Loss: 0.28726852, [0.0010666666666666667]
Loss: 0.37831783, [0.10773333333333333]
Loss: 0.29176784, [0.2144]
Loss: 0.47445440, [0.32106666666666667]
Loss: 0.41815668, [0.42773333333333335]
Loss: 0.44904909, [0.5344]
Loss: 0.36283976, [0.6410666666666667]
Loss: 0.47039002, [0.7477333333333334]
Loss: 0.40545613, [0.8544]
Loss: 0.51748407, [0.9610666666666666]
Test error: 
 Accuracy 86.0% ,avg loss: 0.39232450
Epoch 3

Loss: 0.28840831, [0.0010666666666666667]
Loss: 0.29506528, [0.10773333333333333]
Loss: 0.25237128, [0.2144]
Loss: 0.35961080, [0.32106666666666667]
Lo

In [None]:
class Dog():
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def bark(self):
        print(self.name + ' is barking')
    
    def __repr__(self) -> str:
        # __repr__ must return a string rather than print it
        return f'{self.name} is {self.age} years old'