# Programming PyTorch for Deep Learning
### Ian Pointer (O'Reilly)
### Notes and tests

This book assumes a working CUDA installation. Let's hope for the best...

## Chapter 1. Getting started with PyTorch
### Tensors
A tensor is both a container for numbers (like a vector or a matrix) and a set of rules defining transformations
between tensors that produce new tensors. A tensor's rank is its number of dimensions: a vector has rank 1, a matrix
rank 2. PyTorch tensors support fetching and changing elements using standard Python indexing.

In [1]:
import torch, torchvision, numpy, pandas
from torchvision import transforms
from torch.utils import data
import torch.nn as nn
import torch.nn.functional as F

x = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(x)
print(x[0][-1])  # fetch the last element of the first row
x[0][0] = 10  # change the value of the first element of the first row

# There are multiple functions that can create tensors, like torch.zeros(), torch.ones(), torch.rand()

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
tensor(3)
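
As a small sketch of the creation functions mentioned in the comment above (the variable names are only illustrative), each call takes the desired shape and returns a new tensor; `.dim()` reports the rank:

```python
import torch

zeros = torch.zeros(2, 3)    # 2x3 tensor filled with 0.0
ones = torch.ones(2, 3)      # 2x3 tensor filled with 1.0
uniform = torch.rand(2, 3)   # 2x3 tensor of samples drawn uniformly from [0, 1)

# The rank is the number of dimensions; the shape lists the size of each one.
print(zeros.dim(), zeros.shape)   # 2 torch.Size([2, 3])
print(ones.sum().item())          # 6.0
print(uniform)                    # values differ on every run
```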


#### Tensor operations
PyTorch offers a large number of tensor operations.
For instance, we can find the maximum value in a tensor, and its (flattened) position, like this:

In [2]:
x = torch.rand(2, 2)
print(x)
print(x.max(), x.argmax(), x.max().item())

tensor([[0.4608, 0.0096],
        [0.7883, 0.7824]])
tensor(0.7883) tensor(2) 0.7883459329605103


Tensors come in multiple types, for instance ```LongTensor``` (64-bit integers) or ```FloatTensor``` (32-bit floats).
We can convert between them using the ```.to()``` method.

In [3]:
long_tensor = torch.tensor([[0, 0, 1], [1, 1, 1], [0, 0, 0]])
print(long_tensor.type())
float_tensor = long_tensor.to(dtype=torch.float32)
print(float_tensor.type())

torch.LongTensor
torch.FloatTensor


Sometimes it is useful to use **in-place** operations, which save memory by avoiding a copy of the tensor.
In-place functions are suffixed with a "_" symbol.

In [4]:
x = torch.rand(2, 2)
print(x)
x.log2()  # Returns a new tensor (discarded here); x itself is unchanged
print(x)
x.log2_()  # Only this in-place operation will change the original tensor (x)
print(x)

tensor([[0.2643, 0.8364],
        [0.1725, 0.3988]])
tensor([[0.2643, 0.8364],
        [0.1725, 0.3988]])
tensor([[-1.9199, -0.2577],
        [-2.5355, -1.3263]])


#### Reshaping
An important task is to reshape tensors. We can use the ```.view()``` method or ```torch.reshape()``` (also available as
the ```.reshape()``` method). A view shares memory with the original tensor, so if the underlying data is changed, the
view also changes. ```.view()``` requires the tensor's memory layout to be compatible with the requested shape, which in
practice usually means the tensor must **be contiguous**, that is, its elements must be laid out in memory in the same
order that a freshly created tensor of that shape would use. ```.reshape()``` is more lenient: it returns a view when it
can and falls back to a copy when it cannot, so whether its result shares data with the original is not guaranteed. If
```.view()``` fails on a non-contiguous tensor (for instance after a transpose or permute), call ```.contiguous()``` first.
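
A minimal sketch of the behaviour described above, using a tiny made-up tensor: a view tracks changes to the original, a transposed tensor is no longer contiguous, and `.reshape()` or `.contiguous().view()` still work where a plain `.view()` raises.

```python
import torch

x = torch.arange(6)              # tensor([0, 1, 2, 3, 4, 5])
v = x.view(2, 3)                 # a view: shares storage with x
x[0] = 100
print(v[0, 0].item())            # 100, the view sees the change

r = x.reshape(3, 2)              # x is contiguous, so reshape also returns a view
print(r.data_ptr() == x.data_ptr())   # True

t = v.t()                        # transposing gives a non-contiguous view
print(t.is_contiguous())         # False
try:
    t.view(6)                    # layout is incompatible with a flat view
except RuntimeError as err:
    print("view failed:", type(err).__name__)

flat_copy = t.reshape(6)             # reshape falls back to copying here
flat_view = t.contiguous().view(6)   # explicit copy, then a view of the copy
print(flat_copy.tolist())            # [100, 3, 1, 4, 2, 5]
```
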
#### Rearranging tensor dimensions
Another important operation is rearranging the dimensions of a tensor. For instance, RGB image data is usually stored
as ```[height, width, channel]```, but PyTorch generally expects ```[channel, height, width]```. We can use the
```.permute()``` method, supplying the new order of the dimensions.

In [5]:
x = torch.rand(640, 480, 3)
x_rearranged = x.permute(2, 0, 1)  # put the last dimension (RGB) as the first one.
print(x_rearranged.size())

torch.Size([3, 640, 480])


#### Tensor broadcasting
Tensor broadcasting allows operations between a tensor and a smaller tensor. Two tensors can be broadcast together if,
comparing their dimensions pairwise starting from the trailing one, every pair satisfies one of the following:
* The two dimensions are equal
* One of the dimensions is 1
* One of the dimensions does not exist (the tensor with fewer dimensions is treated as having leading dimensions of size 1)

The book doesn't really expand much on this, but my understanding is that, as long as the rules above are respected,
PyTorch virtually stretches the size-1 (or missing) dimensions of the smaller tensor to match the larger one, without
actually copying the data, and then operates element-wise.
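
The broadcasting rules can be checked with a short sketch (shapes chosen arbitrarily for illustration). Dimensions are compared from the right, and `.expand()` shows that the stretching is done with zero strides rather than by copying:

```python
import torch

a = torch.ones(2, 3, 4)
b = torch.arange(4.0)                          # shape (4,)
print((a + b).shape)                           # torch.Size([2, 3, 4])

col = torch.tensor([[10.0], [20.0], [30.0]])   # shape (3, 1)
print((a + col).shape)                         # torch.Size([2, 3, 4])

try:
    a + torch.ones(3)                          # trailing dimensions 4 and 3 clash
except RuntimeError:
    print("shapes (2, 3, 4) and (3,) do not broadcast")

# expand() makes the virtual stretching explicit: stride 0 means the same
# memory is re-read along that dimension, so nothing is copied.
expanded = b.expand(2, 3, 4)
print(expanded.stride())                       # (0, 0, 1)
```
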
## Chapter 2. Image Classification with PyTorch
Now we are going to incrementally build a simple neural network to classify images as either fish or cats. First of
all, we need data. The ```download.py``` script, included in the book's GitHub repository, supposedly downloads a subset
of ImageNet data, already split into 3 datasets (**training**, **validation** and **test**), and within each of these
datasets the images are already divided into fish and cat categories. Unfortunately, the script seems to have failed
for several images, and several others were not downloaded properly, so let's see how it goes. I guess it's
a really real-world example...
### PyTorch and Data Loaders
Formally, a PyTorch ```dataset``` is a Python class that gives access to the data we're supplying to the neural network.
A ```data loader``` is what actually feeds data from the dataset into the network, in batches.
A dataset is a class that defines at least a ```.__getitem__(self, index)``` method and a ```.__len__(self)``` method.
These two methods provide, respectively, a way of retrieving elements from the data, in ```(label, tensor)``` pairs
according to the book (note that ```ImageFolder```, used below, returns them in ```(tensor, label)``` order), and a way
of obtaining the size of the dataset.
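
To make the two required methods concrete, here is a minimal sketch of a custom dataset. `ToyDataset` and its random in-memory data are hypothetical, made up purely for illustration, and items are returned in `(tensor, label)` order, as `ImageFolder` does:

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDataset(Dataset):
    """Hypothetical in-memory dataset: random feature vectors with 0/1 labels."""

    def __init__(self, n_samples=10, n_features=4):
        self.features = torch.rand(n_samples, n_features)
        self.labels = torch.arange(n_samples) % 2   # alternating 0, 1, 0, 1, ...

    def __len__(self):
        # Size of the dataset, used by the loader to know when an epoch ends.
        return len(self.labels)

    def __getitem__(self, index):
        # One sample, as a (tensor, label) pair.
        return self.features[index], self.labels[index]


toy_data = ToyDataset()
toy_loader = DataLoader(toy_data, batch_size=4)
for batch_features, batch_labels in toy_loader:
    print(batch_features.shape, batch_labels.shape)
```

With 10 samples and a batch size of 4, the loader yields two full batches and a final batch of 2.
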
### Building a training dataset
The ```torchvision``` module provides several convenience classes for dealing with image datasets, such as the
```ImageFolder``` class, which greatly simplifies handling image data as long as the images are stored in a
directory structure where each subdirectory name is a label (e.g. ```./train/cat/```, ```./train/fish/```, etc.).

For our purposes, this will be enough:

In [6]:
train_data_path = "./book_examples/train/"

my_transforms = transforms.Compose([
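    # Resize((64, 64)) forces every image to 64x64; Resize(64) would only scale the shorter side to 64
    transforms.Resize((64, 64)),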
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])
train_data = torchvision.datasets.ImageFolder(root=train_data_path, transform=my_transforms)

val_data_path = "./book_examples/val/"
val_data = torchvision.datasets.ImageFolder(root=val_data_path, transform=my_transforms)

test_data_path = "./book_examples/test/"
test_data = torchvision.datasets.ImageFolder(root=test_data_path, transform=my_transforms)

my_batch_size = 64
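# ImageFolder lists samples class by class, so shuffle the training set to mix both classes in each batch
train_data_loader = data.DataLoader(train_data, batch_size=my_batch_size, shuffle=True)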
val_data_loader = data.DataLoader(val_data, batch_size=my_batch_size)
test_data_loader = data.DataLoader(test_data, batch_size=my_batch_size)

Now that the ```datasets``` (with their transforms) and ```dataloaders``` are set up, it's time to create the actual
neural network!
### Creating a network

In [7]:
# Had to readjust several variables/imports that are not mentioned in the book examples. Cross-referencing with the official
# 60-min tutorial helps a lot.
class SimpleNet(nn.Module):
    def __init__(self):
        super(SimpleNet, self).__init__()
        self.fc1 = nn.Linear(12288, 84)
        self.fc2 = nn.Linear(84, 50)
        self.fc3 = nn.Linear(50, 2)
    
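    def forward(self, x):
        x = x.view(-1, 12288)  # flatten each 3 x 64 x 64 image into a 12288-element vector
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.softmax(self.fc3(x), dim=1)  # dim=1: normalize over the two classes of each sample
        return x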

simplenet = SimpleNet()

print(torch.cuda.is_available())

True
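
A quick sanity check, sketched under the assumption that inputs are 64x64 RGB images: pushing a random dummy batch through the network confirms that 12288 = 3 * 64 * 64 and that the output has one probability per class. The class is repeated here only so the snippet runs on its own, and the `device` fallback to CPU is my addition, not something from the book.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleNet(nn.Module):
    # Same layers as the cell above, repeated so this snippet is self-contained.
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(12288, 84)
        self.fc2 = nn.Linear(84, 50)
        self.fc3 = nn.Linear(50, 2)

    def forward(self, x):
        x = x.view(-1, 12288)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return F.softmax(self.fc3(x), dim=1)


# Use the GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = SimpleNet().to(device)

# Dummy batch of 8 random "images": 3 channels x 64 x 64 pixels = 12288 values each.
dummy_batch = torch.rand(8, 3, 64, 64, device=device)
with torch.no_grad():
    probs = net(dummy_batch)

print(probs.shape)        # torch.Size([8, 2])
print(probs.sum(dim=1))   # each row sums to (approximately) 1
```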
