Predict labels for images

Use all pixel information as input

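Convolution patch (filter / kernel): a small window of weights that slides over the image

Stride 1 x 1 (how far the patch moves at each step: 1 pixel across, 1 pixel down)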

Example (1 x 1 stride)

3 x 3 x 1 image
2 x 2 x 1 filter

Output is 2 x 2 (the filter fits in 3 - 2 + 1 = 2 positions along each axis)

Input
```
[
    1, 2, 3,
    4, 5, 6,
    7, 8, 9
]
```
Filter
```
[
    0.1, 0.5,
    0.3, 0.4
]
```
Output
```
[
       4.3, ...
]
```
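
The worked example above can be checked with a few lines of plain Python. This is a minimal sketch rather than a library routine: `conv2d_valid` is a hypothetical helper name, it assumes stride 1 and no padding, and, like most deep learning libraries, it does not flip the filter (strictly speaking it computes a cross-correlation).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the filter and the patch under it.
            total = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(k_h)
                for b in range(k_w)
            )
            row.append(total)
        output.append(row)
    return output


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [0.1, 0.5],
    [0.3, 0.4],
]

result = conv2d_valid(image, kernel)
for row in result:
    print([round(v, 1) for v in row])
```

The top-left value is 0.1·1 + 0.5·2 + 0.3·4 + 0.4·5 = 4.3, matching the first entry of the output above; the remaining three entries come out as 5.6, 8.2 and 9.5.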

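Each output value is $ w^Tx + b $, where $ x $ is the flattened image patch, $ w $ is the flattened filter, and $ b $ is a bias (bias is not always used)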

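Don't forget about zero padding! Adding a border of zeros lets the filter cover the edge pixels and keeps the output from shrinking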
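
To make the padding note concrete, here is a small sketch in plain Python. `zero_pad` is a hypothetical helper, and the 3 x 3 filter size used in the size check is an assumption chosen for illustration (the example above uses a 2 x 2 filter).

```python
def zero_pad(image, pad):
    """Surround a 2D list with `pad` rows and columns of zeros."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded.extend([0] * width for _ in range(pad))
    return padded


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
padded = zero_pad(image, 1)
for row in padded:
    print(row)

# At stride 1 a 3 x 3 filter fits in (side - 3 + 1) positions per axis.
unpadded_side = len(image) - 3 + 1   # 1: the output shrinks to 1 x 1
padded_side = len(padded) - 3 + 1    # 3: padding 1 keeps the 3 x 3 size
print(unpadded_side, padded_side)
```

With one ring of zeros the 3 x 3 image becomes 5 x 5, so a 3 x 3 filter produces an output the same size as the original image.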

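A filter acts as a feature detector: it responds strongly wherever the image patch matches its pattern

The result of sliding a filter over the image is an activation map (also called a feature map)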

Can build layers of activation maps by running multiple filters and stacking the results on top of each other: each filter gives a 2 x 2 matrix, and stacking them creates a 3D matrix

Apply the filter, then threshold each number in the matrix with ReLU
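
The two steps above (stacking activation maps, then thresholding with ReLU) can be sketched as follows. The activation-map values are made up for illustration and `relu` is a hypothetical helper; nothing here comes from a trained network.

```python
def relu(value):
    """ReLU keeps positive values and replaces negatives with 0."""
    return max(0.0, value)


# Two 2 x 2 activation maps, one per filter (illustrative values).
map_a = [[4.3, -1.2], [0.0, 9.5]]
map_b = [[-0.7, 2.0], [3.1, -4.4]]

# Stacking the maps gives a 3D volume of shape (filters, height, width).
volume = [map_a, map_b]
shape = (len(volume), len(volume[0]), len(volume[0][0]))
print(shape)  # (2, 2, 2)

# Apply ReLU to every number in the volume.
activated = [[[relu(v) for v in row] for row in fmap] for fmap in volume]
print(activated)
```

Every negative entry becomes 0 while positive entries pass through unchanged, so the shape of the volume is preserved.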

### Max Pooling

Slide a window over the activation map using the window size as the stride, and take the max of each window (or the average, for average pooling) as the resulting value
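
A minimal sketch of pooling in plain Python, assuming non-overlapping windows (stride equal to the window size). `pool2d` and `mean` are hypothetical helpers and the 4 x 4 feature map is made up for illustration.

```python
def pool2d(image, size, reduce_fn):
    """Pool non-overlapping size x size windows (stride equals size)."""
    pooled = []
    for i in range(0, len(image) - size + 1, size):
        row = []
        for j in range(0, len(image[0]) - size + 1, size):
            # Collect the values under the current window.
            window = [
                image[i + a][j + b]
                for a in range(size)
                for b in range(size)
            ]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled


def mean(values):
    return sum(values) / len(values)


feature_map = [
    [1, 3, 2, 0],
    [5, 6, 1, 2],
    [7, 2, 9, 4],
    [0, 8, 3, 5],
]

max_pooled = pool2d(feature_map, 2, max)
avg_pooled = pool2d(feature_map, 2, mean)
print(max_pooled)  # [[6, 2], [8, 9]]
print(avg_pooled)  # [[3.75, 1.25], [4.25, 5.25]]
```

Either way a 4 x 4 map shrinks to 2 x 2; max pooling keeps the strongest response in each window, while average pooling smooths over it.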

### Locally Connected Features

#### Fully connected neural network

Every pixel is connected to every neuron!

#### Locally connected neural network

Only part of the image goes to each neuron (the pixels in a patch get combined into a single value that summarizes the most valuable information in that patch)

Fewer weights / more flexible
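
To see how much smaller the locally connected case is, the sketch below counts weights (ignoring biases) for one layer. The sizes are assumptions picked for illustration: a 28 x 28 input, a 24 x 24 grid of output neurons, and 5 x 5 patches. The three function names are hypothetical.

```python
def fully_connected_weights(n_inputs, n_outputs):
    """Every input pixel has its own weight to every output neuron."""
    return n_inputs * n_outputs


def locally_connected_weights(n_outputs, patch_side):
    """Each output neuron has its own weights, but only for one patch."""
    return n_outputs * patch_side * patch_side


def shared_filter_weights(patch_side):
    """A convolution reuses one filter at every position."""
    return patch_side * patch_side


n_inputs = 28 * 28    # 784 pixels
n_outputs = 24 * 24   # 576 output neurons
patch_side = 5

fc = fully_connected_weights(n_inputs, n_outputs)
local = locally_connected_weights(n_outputs, patch_side)
shared = shared_filter_weights(patch_side)
print(fc, local, shared)  # 451584 14400 25
```

A convolutional layer goes one step further than plain local connectivity by sharing the same filter across all positions, which is why the last count is so small.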

In [96]:
import torch as t
input = t.randn(3, 16, 50)
input

In [97]:
m = t.nn.Conv2d(3, 1, 3)  # in_channels=3, out_channels=1, kernel_size=3; the filter weights start out random
# Conv2d expects input shaped (batch, channels, height, width)
m
# https://discuss.pytorch.org/t/explaination-of-conv2d/8082

Conv2d(3, 1, kernel_size=(3, 3), stride=(1, 1))

In [99]:
# add a batch dimension of 1; the output has shape (1, 1, 14, 48)
output = m(input.view(1, 3, 16, 50))

In [72]:
from torchvision import datasets, transforms

# Training settings
batch_size = 64

# MNIST Dataset
train_dataset = datasets.MNIST(root='./mnist_data/',
                               train=True,
                               transform=transforms.ToTensor(),
                               download=True)

test_dataset = datasets.MNIST(root='./mnist_data/',
                              train=False,
                              transform=transforms.ToTensor())

# Data Loader (Input Pipeline)
train_loader = t.utils.data.DataLoader(dataset=train_dataset,
                                       batch_size=batch_size,
                                       shuffle=True)

test_loader = t.utils.data.DataLoader(dataset=test_dataset,
                                      batch_size=batch_size,
                                      shuffle=False)

In [92]:
#kk = t.nn.Conv2d(1, 10, kernel_size=5)
#kk(train_dataset[0][0].view(1,1,28,28))
#kk(train_dataset[0][0].view(1,1,28,28))[0][1]

In [107]:
import torch.nn.functional as F

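class Net(t.nn.Module):
    def __init__(self):
        super().__init__()
        # 1 input channel (grayscale) -> 10 feature maps, 5 x 5 filters
        self.conv_1 = t.nn.Conv2d(1, 10, kernel_size=5)
        # 10 feature maps -> 20 feature maps, 5 x 5 filters
        self.conv_2 = t.nn.Conv2d(10, 20, kernel_size=5)
        # 2 x 2 max pooling halves height and width
        self.mp = t.nn.MaxPool2d(2)
        # 20 maps of 4 x 4 = 320 features -> 10 digit classes
        self.fc = t.nn.Linear(320, 10)

    def forward(self, x):
        in_size = x.size(0)  # batch size
        x = F.relu(self.mp(self.conv_1(x)))
        x = F.relu(self.mp(self.conv_2(x)))
        x = x.view(in_size, -1)  # flatten each sample to a 1D vector
        x = self.fc(x)
        # dim=1 normalizes over the class scores of each sample
        return F.log_softmax(x, dim=1)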
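
The `320` passed to `t.nn.Linear` follows from the layer sizes. As a rough check, the sketch below traces the spatial size of a 28 x 28 MNIST image through the network. `conv_out` and `pool_out` are hypothetical helpers that assume no padding, stride 1 for the convolutions, and a pooling stride equal to the window size (the `MaxPool2d` default).

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window):
    """Output side length of pooling with stride equal to the window."""
    return (size - window) // window + 1


side = 28                  # MNIST images are 28 x 28
side = conv_out(side, 5)   # conv_1: 28 -> 24
side = pool_out(side, 2)   # max pool: 24 -> 12
side = conv_out(side, 5)   # conv_2: 12 -> 8
side = pool_out(side, 2)   # max pool: 8 -> 4

channels = 20              # output channels of conv_2
flat_features = channels * side * side
print(flat_features)  # 320
```

If the kernel size, pooling window, or input resolution changes, the same trace gives the new input size for the fully connected layer.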

In [108]:
model = Net()
# CrossEntropyLoss applies log_softmax itself; on the log-probabilities
# returned by Net this is redundant but gives the same loss.
criterion = t.nn.CrossEntropyLoss()
optimizer = t.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)

def train(epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        output = model(data)
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if batch_idx % 10 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))

def test():
    model.eval()  # eval mode (matters for layers such as dropout and batch norm)
    test_loss = 0
    correct = 0
    with t.no_grad():  # disable gradient tracking during evaluation
        for data, target in test_loader:
            output = model(data)
            # sum up the per-batch mean losses
            test_loss += criterion(output, target).item()
            # get the index of the max log-probability
            pred = output.max(1, keepdim=True)[1]
            correct += pred.eq(target.view_as(pred)).sum().item()

    # criterion returns a mean per batch, so dividing the sum by the number
    # of samples reports roughly (mean loss / batch size)
    test_loss /= len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))


for epoch in range(1, 10):
    train(epoch)
    test()

Test set: Average loss: 0.0033, Accuracy: 9336/10000 (93%)

Test set: Average loss: 0.0018, Accuracy: 9651/10000 (96%)

Test set: Average loss: 0.0014, Accuracy: 9734/10000 (97%)

Test set: Average loss: 0.0012, Accuracy: 9765/10000 (97%)

Test set: Average loss: 0.0010, Accuracy: 9802/10000 (98%)

Test set: Average loss: 0.0009, Accuracy: 9825/10000 (98%)

Test set: Average loss: 0.0009, Accuracy: 9810/10000 (98%)

Test set: Average loss: 0.0008, Accuracy: 9826/10000 (98%)

Test set: Average loss: 0.0009, Accuracy: 9821/10000 (98%)