<a href="https://colab.research.google.com/github/neuron283/deep-learning-from-scratch/blob/main/digit_classifier.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
!pip install -Uqq fastai

In [3]:
from fastai.vision.all import *

In [None]:
path = untar_data(URLs.MNIST_SAMPLE)
Path.BASE_PATH = path

In [None]:
path.ls()

In [None]:
(path/'valid').ls()

In [None]:
threes = (path/'train'/'3').ls()
sevens = (path/'train'/'7').ls()

In [None]:
im3_path = threes[0]
im3 = Image.open(im3_path)
array(im3)

**Baseline: Pixel Similarity**

In [None]:
# A tensor is an N-dimensional array: a scalar is a 0-dimensional tensor, a vector is a 1-dimensional tensor, and a matrix is a 2-dimensional tensor.

im3_t = tensor(im3)
df = pd.DataFrame(im3_t)
df.style.set_properties(**{'font-size':'6pt'}).background_gradient('Greys')

In [None]:
seven_tensors = [tensor(Image.open(o)) for o in sevens]
three_tensors = [tensor(Image.open(o)) for o in threes]

In [None]:
len(seven_tensors), len(three_tensors)

In [None]:
stacked_sevens = torch.stack(seven_tensors).float()/255
stacked_threes = torch.stack(three_tensors).float()/255
stacked_threes.shape # shape is the size of each axis of a tensor.

In [None]:
len(stacked_threes.shape) # the length of a tensor's shape is its rank

In [None]:
stacked_threes.ndim # ndim returns the rank (the number of axes) directly
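
To make the relationship between shape, rank, and `ndim` concrete, here is a minimal sketch on a few small tensors. The tensors below are made up for illustration and are not part of the MNIST data.

```python
import torch

scalar = torch.tensor(3.0)          # rank 0: a single number
vector = torch.tensor([1.0, 2.0])   # rank 1: one axis
matrix = torch.zeros(28, 28)        # rank 2: same shape as one MNIST image
batch = torch.zeros(5, 28, 28)      # rank 3: a stack of 5 image-sized tensors

# ndim and len(shape) are two ways of asking for the rank
ranks = [t.ndim for t in (scalar, vector, matrix, batch)]
print(ranks)
print(batch.shape, len(batch.shape) == batch.ndim)
```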

In [None]:
# for every pixel position, this will compute the average of that pixel over all images
mean3 = stacked_threes.mean(0) # "ideal" 3
show_image(mean3)

In [None]:
mean7 = stacked_sevens.mean(0)
show_image(mean7);
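
As a small illustration of what `mean(0)` does, the sketch below averages three made-up 2x2 "images" over the first axis. The values in `toy_stack` are assumptions chosen only to keep the arithmetic easy to follow.

```python
import torch

# Three hypothetical 2x2 "images" stacked into a rank-3 tensor of shape [3, 2, 2]
toy_stack = torch.tensor([
    [[0., 1.], [1., 0.]],
    [[0., 1.], [0., 0.]],
    [[0., 1.], [1., 1.]],
])

# Averaging over axis 0 collapses the image axis and keeps one value per pixel position
toy_mean = toy_stack.mean(0)
print(toy_mean.shape)
print(toy_mean)
```

The top-right pixel is lit in every image, so its mean is 1; the bottom-left pixel is lit in two of the three images, so its mean is 2/3.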


In [None]:
a3 = stacked_threes[69]
show_image(a3)

In [None]:
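# Two ways to measure the distance of a3 from mean3: root mean squared error (L2 norm) and mean absolute error (L1 norm).
# Note: the *_mse variables hold the RMSE, because we take the square root of the mean squared error.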
dist3_mse = ((a3 - mean3)**2).mean().sqrt()
dist3_mae = ((a3 - mean3)).abs().mean()
dist3_mae, dist3_mse

In [None]:
dist7_mse = ((a3 - mean7)**2).mean().sqrt()
dist7_mae = (a3 - mean7).abs().mean()
dist7_mae, dist7_mse

In [None]:
# mean squared error and mean absolute error are implemented in torch.nn.functional as mse_loss and l1_loss
import torch.nn.functional as F

In [None]:
F.l1_loss(a3.float(), mean7), F.mse_loss(a3, mean7).sqrt()
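
As a quick sanity check, the hand-written formulas and the `torch.nn.functional` versions should agree. The sketch below compares them on two made-up 2x2 tensors (`a` and `b` are illustrative stand-ins, not real images).

```python
import torch
import torch.nn.functional as F

a = torch.tensor([[0.0, 0.5], [1.0, 0.0]])  # hypothetical image
b = torch.tensor([[0.0, 1.0], [0.5, 0.5]])  # hypothetical "ideal" image

# Hand-written versions, as in the cells above
mae_manual = (a - b).abs().mean()
rmse_manual = ((a - b) ** 2).mean().sqrt()

# Library versions: l1_loss is the MAE, mse_loss needs a sqrt to become the RMSE
mae_torch = F.l1_loss(a, b)
rmse_torch = F.mse_loss(a, b).sqrt()

print(mae_manual, mae_torch)
print(rmse_manual, rmse_torch)
```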

**Metrics and Broadcasting**

In [None]:
#valid_stacked_threes_tensor = torch.stack([tensor(Image.open(o)) for o in (path/'valid'/'3').ls()]).float()/255

valid_threes = (path/'valid'/'3').ls()
valid_sevens = (path/'valid'/'7').ls()

valid_threes, valid_sevens

In [None]:
valid_threes_tensor = [tensor(Image.open(o)) for o in valid_threes]
valid_sevens_tensor = [tensor(Image.open(o)) for o in valid_sevens]

stacked_valid_threes_tensor = torch.stack(valid_threes_tensor).float()/255
stacked_valid_sevens_tensor = torch.stack(valid_sevens_tensor).float()/255

stacked_valid_threes_tensor.shape, stacked_valid_sevens_tensor.shape

In [None]:
def mnist_distance(a, b):
  return (a-b).abs().mean((-1, -2))
  # return F.l1_loss(a, b, reduction='none').mean([-1,-2])

mnist_distance(a3, mean3)

stacked_valid_threes_tensor is a rank-3 tensor (one axis for the images plus two for the pixels), whereas mean3 (the ideal 3) is rank-2: a single 28x28 image.

**Using a loop** to calculate mnist_distance between each image in stacked_valid_threes_tensor and mean3 is **SLOW**.

Instead, we pass the whole stacked_valid_threes_tensor to mnist_distance.

When PyTorch performs an operation such as subtraction between two tensors of different ranks, it uses **broadcasting**: the lower-rank tensor is automatically expanded to match the shape of the higher-rank one.

*   PyTorch doesn't actually copy mean3 1,010 times. It behaves as if mean3 were a tensor of that shape, without allocating any additional memory.

*   It does the whole calculation in C (or, if you're using a GPU, in CUDA, the equivalent of C on the GPU), which is tens of thousands of times faster than pure Python (and up to millions of times faster on a GPU).
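
The points above can be checked on random data. This is a minimal sketch in which `batch` and `ideal` are random stand-ins for stacked_valid_threes_tensor and mean3; `expand` is used only to make the "virtual copy" visible, since broadcasting does this implicitly.

```python
import torch

batch = torch.rand(1010, 28, 28)  # stand-in for the stacked validation images
ideal = torch.rand(28, 28)        # stand-in for mean3

# Broadcasting: ideal is treated as if it had shape [1010, 28, 28]
diff = batch - ideal
print(diff.shape)

# expand makes the virtual copy explicit; a stride of 0 on the first axis
# means every "copy" points at the same memory
virtual = ideal.expand(1010, 28, 28)
print(virtual.stride())
print(virtual.data_ptr() == ideal.data_ptr())

# A Python loop over the images gives the same distances as the broadcast version
looped = torch.stack([(img - ideal).abs().mean() for img in batch])
vectorized = (batch - ideal).abs().mean((-1, -2))
print(torch.allclose(looped, vectorized, atol=1e-5))
```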



In [None]:
mnist_distance(stacked_valid_threes_tensor, mean3)

Now, to predict whether an image is a 3 or a 7:


*   if its distance from mean3 is less than its distance from mean7 => the image is a 3


*   otherwise => the image is NOT a 3 (we treat it as a 7)



In [None]:
def is_3(x):
  return mnist_distance(x, mean3) < mnist_distance(x, mean7)

a7 = tensor(Image.open((path/'valid'/'7').ls()[2])).float()/255 # a 7 from the validation set, scaled to [0, 1] like mean3 and mean7
is_3(a3), is_3(a7)

In [None]:
accuracy3 = is_3(stacked_valid_threes_tensor).float().mean()
accuracy7 = 1-is_3(stacked_valid_sevens_tensor).float().mean()

(accuracy3+accuracy7)/2

We got 90+% accuracy, which is good, but we can do better.

Now we'll build a system that can automatically modify itself to improve its performance. In other words, it's time to talk about the training process, and SGD.

**Stochastic Gradient Descent**

In this approach, we come up with a weight for each pixel, such that the highest weights are associated with the pixels most likely to be black for a particular category.

For example, pixels toward the bottom right are not very likely to be activated for a 7, so they should have a low weight for a 7; they are likely to be activated for an 8, so they should have a high weight for an 8.

This can be represented as a function and set of weight values for each possible category:

```
def pr_eight(x,w): return (x*w).sum()
```


Where x is the image in tensor form and w is the weights tensor.
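
To see how such a weighted sum behaves, here is a minimal sketch that calls `pr_eight` on a toy 2x2 image. The pixel values and weights are made up for illustration; real weights would be learned, not written by hand.

```python
import torch

def pr_eight(x, w):
    # Multiply each pixel by its weight and add everything up
    return (x * w).sum()

x = torch.tensor([[0., 1.], [1., 1.]])      # toy 2x2 image (hypothetical)
w = torch.tensor([[-1., 0.5], [0.5, 2.0]])  # made-up weights, one per pixel

# 0*(-1) + 1*0.5 + 1*0.5 + 1*2.0 = 3.0
score = pr_eight(x, w)
print(score)
```

Pixels that are switched on contribute their weight to the score; the top-left pixel is off, so its negative weight has no effect.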



---



1. Initialize the weights.
2. For each image, use these weights to predict whether it appears to be a 3 or a 7.
3. Based on these predictions, calculate how good the model is (its loss).
4. Calculate the gradient, which measures, for each weight, how changing that weight would change the loss.
5. Step (that is, change) all the weights based on that calculation.
6. Go back to step 2 and repeat the process.
7. Iterate until you decide to stop the training process (for instance, because the model is good enough or you don't want to wait any longer).
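
The seven steps can be sketched on a problem much smaller than MNIST. The example below is an illustration under stated assumptions, not the digit classifier itself: the data (`y = 3x + 2`), the learning rate `lr`, and the fixed count of 200 iterations are all made up for the demonstration.

```python
import torch

torch.manual_seed(0)

# Toy data (assumed for illustration): points on the line y = 3x + 2
x = torch.linspace(-1, 1, 20)
y = 3 * x + 2

# Step 1: initialize the weights (a slope and an intercept) randomly
params = torch.randn(2).requires_grad_()
lr = 0.1  # hypothetical learning rate

losses = []
for step in range(200):                  # Steps 6-7: repeat, then stop after 200 rounds
    preds = params[0] * x + params[1]    # Step 2: predict with the current weights
    loss = ((preds - y) ** 2).mean()     # Step 3: measure the loss (mean squared error)
    loss.backward()                      # Step 4: compute the gradient of the loss
    with torch.no_grad():
        params -= lr * params.grad       # Step 5: step the weights against the gradient
        params.grad.zero_()              # reset gradients so they don't accumulate
    losses.append(loss.item())

print(losses[0], losses[-1])
print(params)
```

The loss should shrink as training proceeds, and the two parameters should move toward the slope and intercept used to generate the data.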



In [5]:
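import matplotlib.pyplot as plt

def f(x): return x**2 # was `f**2`, which squares the function object instead of its input

# plot_function is not defined by fastai (hence the NameError), so plot with matplotlib directly
xs = torch.linspace(-2, 2, 100)
plt.plot(xs, f(xs)); plt.xlabel('x'); plt.ylabel('x^2');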