# Working with PyTorch 

-> https://pythonprogramming.net/introduction-deep-learning-neural-network-pytorch/ 

What's a tensor?!

You can think of a tensor as a multi-dimensional array. Most of what a neural network does is multiply arrays together. The fancy part is the optimization algorithm that adjusts all those weights. Neural networks themselves are actually quite simple. Optimizing them is a little more challenging, but most deep learning libraries handle much of that math for you. If you want to learn how to do everything yourself by hand, stay tuned later in the series. I just don't think it would be wise to lead with that.

## Playing around with PyTorch 

In [2]:
# standard imports 

import torch 

# tensors are like arrays so...

x = torch.Tensor([5,3]) # creates a 1-D tensor holding the two values 5 and 3 
y = torch.Tensor([2,1]) # creates a 1-D tensor holding the two values 2 and 1 

print(x*y) # multiplies the corresponding values 

tensor([10.,  3.])
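The `*` above multiplies element by element. The "multiplying arrays" that a network layer does is usually a matrix multiplication instead, so here is a minimal sketch of the difference. The `weights` matrix is made up purely for illustration and is not from the tutorial.

```python
import torch

x = torch.tensor([5.0, 3.0])   # shape [2]
y = torch.tensor([2.0, 1.0])   # shape [2]

elementwise = x * y            # multiplies matching positions: 5*2 and 3*1
dot = x @ y                    # dot product: 5*2 + 3*1

# Hypothetical weights for a layer with 2 inputs and 3 outputs.
weights = torch.tensor([[1.0, 0.0],
                        [0.0, 1.0],
                        [1.0, 1.0]])   # shape [3, 2]
layer_out = weights @ x        # matrix-vector product, shape [3]

print(elementwise)   # values 10 and 3
print(dot)           # value 13
print(layer_out)     # values 5, 3 and 8
```

Both operators work on the same tensors; which one you want depends on whether the shapes should stay the same (`*`) or be combined (`@`).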


In [3]:
x = torch.zeros([2,5]) # creates a 2 (row) by 5 (column) tensor filled with zeros 
x_shape = x.shape # the shape as torch.Size([rows, columns]) 
y = torch.rand([2,5]) # creates a 2 by 5 tensor of random values drawn uniformly from [0, 1) 

print("Printing Array Zeros: " + str(x))
print("Rows, Columns Size: " + str(x_shape))
print("Random Tensor: " + str(y))

Printing Array Zeros: tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
Rows, Columns Size: torch.Size([2, 5])
Random Tensor: tensor([[0.8039, 0.0427, 0.8986, 0.7554, 0.6506],
        [0.5862, 0.1356, 0.5430, 0.3869, 0.6304]])


In [4]:
visible_y = y.view([1,10]) # reshapes the 2 by 5 tensor into 1 row of all ten values 
visible_y

tensor([[0.8039, 0.0427, 0.8986, 0.7554, 0.6506, 0.5862, 0.1356, 0.5430, 0.3869,
         0.6304]])
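Two things about `view` are easy to miss, so here is a small sketch using a stand-in tensor of the numbers 0 to 9 instead of the random one above: passing `-1` lets PyTorch work out one dimension for you, and the result shares its data with the original rather than copying it.

```python
import torch

# Stand-in for the random 2 by 5 tensor: the values 0..9 are easier to follow.
y = torch.arange(10, dtype=torch.float32).view(2, 5)

flat = y.view(1, 10)   # same ten values, now in a single row
auto = y.view(-1, 2)   # -1 means "infer this dimension": 10 / 2 = 5 rows

# A view is a new tensor object over the same storage, so writes show through.
flat[0, 0] = 99.0

print(flat.shape)      # torch.Size([1, 10])
print(auto.shape)      # torch.Size([5, 2])
print(y[0, 0])         # the original tensor sees the value written through the view
```

If you need an independent copy, calling `.clone()` on the result is one way to get it.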

## Working with Neural Networks 

In [5]:
# standard imports

import torch 
import torchvision # datasets, models, and transforms for vision tasks 
import matplotlib.pyplot as plt

from torchvision import transforms, datasets



First, we need a dataset. Next, we need to decide how we're going to iterate over that dataset. 

Training and Testing data split
To train any machine learning model, we first want separate training and validation datasets. That way we can "test" the machine on data it has never seen before.
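MNIST ships with its own train/test split, so this notebook never has to split anything by hand. For a dataset that comes as one piece, a minimal sketch of holding out some samples could look like this. The data here is synthetic and the 80/20 ratio is just an assumption for the example.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Synthetic stand-in data: 100 samples with 4 features each, labels 0-9.
features = torch.rand(100, 4)
labels = torch.randint(0, 10, (100,))
dataset = TensorDataset(features, labels)

# Hold out 20 of the 100 samples; the fixed seed makes the split repeatable.
generator = torch.Generator().manual_seed(0)
train_part, held_out = random_split(dataset, [80, 20], generator=generator)

print(len(train_part), len(held_out))   # 80 20

# No sample index ends up in both subsets.
overlap = set(train_part.indices) & set(held_out.indices)
print(len(overlap))                     # 0
```

The held-out part is only useful as a test if the model never trains on it, which is why the overlap check is worth a line.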

Shuffling
Then, within our training dataset, we generally want to shuffle the input data randomly so that the order of the samples doesn't contain patterns that might throw the machine off.

For example, if you fed the machine a bunch of images of zeros, the machine would learn to classify everything as zero. Then you'd start feeding it ones, and the machine would figure out pretty quick to classify everything as ones...and so on. Whenever you stop, the machine would probably just classify everything as the last thing you trained on. If you shuffle the data, your machine is much more likely to figure out what's what.
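To see what that looks like in practice, here is a rough sketch with a tiny synthetic dataset that is sorted by label, loaded once in order and once shuffled. The dataset and the three classes are invented for the example.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in: 30 samples sorted by label (ten 0s, ten 1s, ten 2s).
labels = torch.arange(3).repeat_interleave(10)
features = torch.zeros(30, 1)
dataset = TensorDataset(features, labels)

torch.manual_seed(0)  # only so the shuffled order is repeatable
ordered = DataLoader(dataset, batch_size=10, shuffle=False)
shuffled = DataLoader(dataset, batch_size=10, shuffle=True)

# Collect the labels of every batch from each loader.
ordered_batches = [batch_labels.tolist() for _, batch_labels in ordered]
shuffled_batches = [batch_labels.tolist() for _, batch_labels in shuffled]

for batch in ordered_batches:
    print("ordered :", batch)    # each batch contains a single class

for batch in shuffled_batches:
    print("shuffled:", batch)    # classes should be mixed within a batch
```

Without shuffling, the machine would see ten 0s, then ten 1s, then ten 2s, which is exactly the situation described above.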

Scaling and normalization
Another consideration at some point in the pipeline is usually scaling/normalization of the dataset. In general, we want all input data to be between zero and one. Many datasets contain values outside this range, so we generally need a way to scale the data into it.
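A minimal sketch of two common ways to get values into that range, using made-up numbers. As far as I know, `transforms.ToTensor()` already does the first kind of scaling for 8-bit images, so this is only here to show the arithmetic.

```python
import torch

# Synthetic stand-in for 8-bit grayscale pixels, which range from 0 to 255.
pixels = torch.tensor([[0, 64, 128], [191, 255, 32]], dtype=torch.uint8)

# Known range: divide by the largest possible value.
scaled = pixels.float() / 255.0

# Unknown range: min-max scaling maps the observed minimum to 0 and maximum to 1.
raw = torch.tensor([-3.0, 0.0, 5.0, 12.0])
min_max = (raw - raw.min()) / (raw.max() - raw.min())

print(scaled.min().item(), scaled.max().item())   # 0.0 1.0
print(min_max)                                    # first value 0, last value 1
```

Min-max scaling assumes the maximum and minimum differ; a constant column would divide by zero and needs special handling.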

In [None]:
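# MNIST dataset from torchvision 

import os

# workaround for the duplicate OpenMP runtime error (OMP: Error #15) seen on some installs
os.environ['KMP_DUPLICATE_LIB_OK']='True'

# root '' stores the download in the current working directory
train = datasets.MNIST('', train=True, download=True,
            transform=transforms.Compose([
                transforms.ToTensor()  # image -> float tensor
            ]))

test = datasets.MNIST('', train=False, download=True,
            transform=transforms.Compose([
                transforms.ToTensor()
            ]))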



In [7]:
# batches of 10 samples, reshuffled on every pass over the data
trainset = torch.utils.data.DataLoader(train, batch_size=10, shuffle=True)
testset = torch.utils.data.DataLoader(test, batch_size=10, shuffle=True)

In [8]:
for data in trainset: 
    print(data) 
    break 

[tensor([[[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]]],


        [[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]]],


        [[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]]],


        ...,


        [[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0

In [9]:
x, y = data[0][0], data[1][0] # first image and first label in the batch
print(y)

tensor(8)
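The double indexing is easier to read once you know how a batch is laid out. Here is a sketch with synthetic data shaped like MNIST (random pixels, random labels, no download needed), so the variable names and sizes are stand-ins rather than the real dataset.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in with MNIST-like shapes: 50 one-channel 28x28 images.
images = torch.rand(50, 1, 28, 28)
labels = torch.randint(0, 10, (50,))
loader = DataLoader(TensorDataset(images, labels), batch_size=10, shuffle=True)

batch = next(iter(loader))            # one batch, like `data` after the loop's break
batch_images, batch_labels = batch    # element 0: inputs, element 1: targets

print(batch_images.shape)             # torch.Size([10, 1, 28, 28])
print(batch_labels.shape)             # torch.Size([10])

# Indexing again picks a single sample out of the batch.
first_image, first_label = batch_images[0], batch_labels[0]
print(first_image.shape)              # torch.Size([1, 28, 28])
```

So `data[0][0]` is "inputs, then first sample" and `data[1][0]` is "targets, then first sample".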


In [10]:
print(data[0][0].shape)

torch.Size([1, 28, 28])



In [None]:
plt.imshow(data[0][0].view(28,28)) # drop the channel dimension: [1, 28, 28] -> [28, 28]
plt.show()
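The `.view(28,28)` is there because the image tensor carries a leading channel dimension of size 1. A small sketch with a random stand-in image shows that `squeeze(0)` gives the same result without hard-coding the size; this is an alternative, not what the tutorial uses.

```python
import torch

# Stand-in for data[0][0]: [channels, height, width] with a single channel.
image = torch.rand(1, 28, 28)

as_view = image.view(28, 28)      # explicit reshape, as in the cell above
as_squeeze = image.squeeze(0)     # removes dimension 0 because its size is 1

print(as_view.shape)                      # torch.Size([28, 28])
print(torch.equal(as_view, as_squeeze))   # True
```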