# The Hello World of Deep Learning with Neural Networks

Summary:

- A simple script showing the overall scaffolding of PyTorch code
- Use a neural network to learn the relationship between two numbers
- Feed a set of Xs and Ys to the neural network so it can determine the relationship (rule)

In this example, the function is y = 2x - 1. Using data, we train a neural network to figure out the relationship between X and Y.

## Imports

Import the libraries needed to run the simple neural network, including torch and numpy.

In [None]:
import torch 
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

## Providing the Data

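Next we'll use numpy to generate some data with 6 xs and 6 ys. The relationship between x and y is y = 2x - 1. Each array is reshaped to (6, 1) because nn.Linear expects inputs of shape (number of samples, number of features).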

In [None]:
xs = np.array([-1.0,  0.0, 1.0, 2.0, 3.0, 4.0], dtype=np.float32).reshape(-1,1)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=np.float32).reshape(-1,1)
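As a quick sanity check (a minimal sketch that simply repeats the two arrays so it runs on its own), we can confirm the shapes and that every y really does equal 2x - 1:

```python
import numpy as np

# Same data as in the cell above, repeated so this snippet is self-contained.
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=np.float32).reshape(-1, 1)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=np.float32).reshape(-1, 1)

# Each array holds 6 samples with 1 feature each.
print(xs.shape, ys.shape)

# Check element by element that the labels follow the rule y = 2x - 1.
print(np.array_equal(ys, 2 * xs - 1))
```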

## Define the Neural Network

Next we will create the simplest possible neural network, which has 1 layer (in this case nn.Linear). The input shape to the layer is just 1 value and the layer has 1 neuron.  

In PyTorch, neural networks are constructed using the torch.nn package. Here, we create a subclass of nn.Module named Net. We define the `__init__` method and the `forward` method (where the network makes a guess); the backward pass (where gradients are computed) is derived automatically by autograd.

In [None]:
class Net(nn.Module):
    def __init__(self):
        super(Net,self).__init__()
        self.fc1 = nn.Linear(1,1)     # initializing a linear network with 1 input and 1 neuron, respectively
        
    def forward(self,x):
        out = self.fc1(x)
        return out
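To make it concrete what a single nn.Linear(1, 1) layer computes, here is a small sketch. It uses a standalone layer (named `layer` here purely for illustration, separate from the Net class) and overwrites its random initial parameters by hand with the target rule, so the output is just w * x + b:

```python
import torch
import torch.nn as nn

# Illustrative standalone layer; not the model that is trained in this notebook.
layer = nn.Linear(1, 1)

# Overwrite the random initial parameters with the rule y = 2x - 1.
with torch.no_grad():
    layer.weight.fill_(2.0)   # slope w
    layer.bias.fill_(-1.0)    # intercept b

x = torch.tensor([[10.0]])    # shape (1, 1): one sample, one feature
y = layer(x)                  # computes w * x + b
print(y.item())
```

Training is the process of finding such values for the weight and bias from the data, rather than setting them by hand.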

Now we instantiate our neural network. If a GPU is available, we also move the model to the GPU device.

In [None]:
model = Net()  # instantiate the neural network
if torch.cuda.is_available():
    model.cuda()
print(model)

We also define a loss function and an optimizer for our neural network.

We know that our function is y = 2x - 1. When the computer is trying to 'learn', it makes a guess (maybe y = 10x + 10). The LOSS function measures how close the guessed answers are to the known correct answers.

It then uses the OPTIMIZER to make another guess that tries to reduce the loss (e.g., y = 5x + 5, which, while still pretty bad, is closer to the correct result and has a lower loss).

In [None]:
# using mean squared error for the loss function
criterion = nn.MSELoss()   
# using stochastic gradient descent for the optimizer
learning_rate = 0.001
optimizer = torch.optim.SGD(model.parameters(),lr=learning_rate)
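To see the loss idea with actual numbers, the sketch below computes the mean squared error of the two example guesses on the 6 data points, using plain numpy. The helper name `mse` is made up for this illustration; nn.MSELoss computes the same quantity (with its default mean reduction) on tensors.

```python
import numpy as np

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])

def mse(slope, intercept):
    """Mean squared error of the guess y = slope * x + intercept on the data."""
    guesses = slope * xs + intercept
    return float(np.mean((guesses - ys) ** 2))

loss_first = mse(10.0, 10.0)   # the first guess, y = 10x + 10
loss_second = mse(5.0, 5.0)    # the second guess, y = 5x + 5
loss_true = mse(2.0, -1.0)     # the true rule, y = 2x - 1

print(loss_first, loss_second, loss_true)
```

The second guess has a much lower loss than the first, and the true rule has a loss of zero, which is the value the optimizer is working toward.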


## Training the Neural Network

Training the neural network, where it 'learns' the relationship between the Xs and Ys, happens in the loop over the number of epochs. On each pass through the loop, the network makes a guess (outputs), measures how good or bad it is (the loss), calculates the gradients (loss.backward), and uses the optimizer to update its parameters before the next guess. When you run this code, you'll see the loss printed at the end of each line.

In [None]:
# Build the input and label tensors once, on the same device as the model.
# (Wrapping tensors in Variable has not been needed since PyTorch 0.4.)
device = next(model.parameters()).device
inputs = torch.from_numpy(xs).to(device)
labels = torch.from_numpy(ys).to(device)

for epoch in range(2000):
    optimizer.zero_grad()                # reset the gradients from the previous step
    outputs = model(inputs)              # forward pass: the current guess
    loss = criterion(outputs, labels)    # measure how far the guess is from the labels
    loss.backward()                      # compute the gradients of the loss
    optimizer.step()                     # update the weight and bias
    print('epoch {}, loss {}'.format(epoch, loss.item()))
print('Finished Training')
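To show what the loop is doing under the hood, here is a rough numpy-only sketch of the same procedure with the gradients written out by hand. It assumes a starting guess of w = 0 and b = 0 (PyTorch starts from small random values instead), so the exact numbers will not match a real run; the variable names are illustrative.

```python
import numpy as np

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])

w, b = 0.0, 0.0          # assumed starting guess for slope and intercept
learning_rate = 0.001    # same learning rate as the optimizer above
losses = []

for epoch in range(2000):
    outputs = w * xs + b                  # forward pass: the current guess
    error = outputs - ys
    losses.append(np.mean(error ** 2))    # mean squared error
    grad_w = np.mean(2 * error * xs)      # d(loss)/dw, the role of loss.backward()
    grad_b = np.mean(2 * error)           # d(loss)/db
    w -= learning_rate * grad_w           # plain gradient step, the role of optimizer.step()
    b -= learning_rate * grad_b

print(losses[0], losses[-1])
print(w, b)
```

Because all 6 points are used in every step, this is full-batch gradient descent; torch.optim.SGD performs the same update when given the gradients of the whole batch.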

As the training progresses, the loss decreases. After training, we plot the predicted values (dots) against the actual values (dashed line) for the xs used in the training set.

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

with torch.no_grad():  # gradients are not needed when only making predictions
    device = next(model.parameters()).device                           # the device the model lives on (CPU or GPU)
    predicted = model(torch.from_numpy(xs).to(device)).cpu().numpy()   # move the result back to the CPU for numpy
    print(predicted)

plt.clf()
plt.plot(xs, predicted, 'go', label='Predictions', alpha=0.5)
plt.plot(xs, ys, '--', label='True Data', alpha=0.5)
plt.legend(loc='best')
plt.show()

## Use model to predict unseen value of x

Now that you have a model trained to learn the X-Y relationship, you can use it to figure out the Y for a previously unseen X (e.g., 10).

In [None]:
test = np.array([10.0], dtype=np.float32).reshape(-1, 1)    # reshape to (1, 1), the shape the network expects
device = next(model.parameters()).device                   # works on CPU or GPU, wherever the model lives
with torch.no_grad():
    prediction = model(torch.from_numpy(test).to(device)).cpu().numpy()
print(prediction)

The prediction ended up slightly under 19. Why?

Remember that the network is never told the rule. It starts from random values for its weight and bias and adjusts them in small steps to reduce the loss on the 6 data points we fed it. After 2000 such steps with a small learning rate, the learned values are close to, but not exactly, 2 and -1. As a result, the result for 10 is very close to 19, but not necessarily 19.

To determine how the predictions are made, we can look at the internal variables (the weight and bias) of the nn.Linear layer.

In [None]:
print("These are the weights used: {}".format(list(model.parameters())))

Note that the learned weight (about 1.92) and bias (about -0.78) are close to the slope 2 and intercept -1 of y = 2x - 1. Increasing the number of epochs should improve the predictions.
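As a rough arithmetic check, the sketch below takes the rounded values 1.92 and -0.78 quoted above at face value (an actual run will differ slightly) and plugs in x = 10. For comparison, it also fits a straight line to the same 6 points with numpy's closed-form least-squares routine, np.polyfit, which is not part of the original notebook.

```python
import numpy as np

# Rounded values quoted in the text; treat them as approximate.
w, b = 1.92, -0.78
x_new = 10.0

approx_prediction = w * x_new + b   # what the trained layer computes for x = 10
exact_value = 2 * x_new - 1         # what the true rule y = 2x - 1 gives
print(approx_prediction, exact_value)

# Closed-form least-squares line through the same 6 points, for comparison.
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])
slope, intercept = np.polyfit(xs, ys, 1)
print(slope, intercept)
```

With these rounded parameters the prediction comes out at roughly 18.42 rather than 19. The least-squares fit recovers a slope and intercept of essentially 2 and -1, which suggests the remaining gap comes from stopping gradient descent early rather than from the data.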