# PyTorch: nn

A fully connected ReLU network with one hidden layer, trained to predict y from x by minimizing the squared Euclidean distance.

Note:
PyTorch autograd makes it easy to define computational graphs and take gradients, but raw autograd can be too low-level for defining complex neural networks; this is where the nn package helps. The nn package defines a set of Modules, which are roughly equivalent to neural network layers: a Module produces output from input and may hold trainable weights.
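
To make the idea of a Module concrete, here is a minimal sketch using a single `torch.nn.Linear` layer. The sizes (4 input features, 2 outputs, a batch of 3) are arbitrary values chosen for illustration and do not come from the example below.

```python
import torch

# A single Module: a linear layer mapping 4 input features to 2 outputs.
# (The sizes are illustrative assumptions.)
layer = torch.nn.Linear(4, 2)

# The trainable weights are exposed as named parameters.
for name, param in layer.named_parameters():
    print(name, tuple(param.shape), param.requires_grad)

# Calling the Module on an input Tensor produces an output Tensor.
inputs = torch.randn(3, 4)  # a batch of 3 examples
outputs = layer(inputs)
print(tuple(outputs.shape))
```

The weight Tensor has shape (2, 4), that is (out_features, in_features), and both the weight and the bias are created with `requires_grad=True`, which is what lets autograd compute gradients for them later.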


In [4]:
import torch

batch_size = 64
input_dimension = 1000
hidden_dimension = 100
output_dimension = 10

x = torch.randn(batch_size, input_dimension)
y = torch.randn(batch_size, output_dimension)

# Use the nn package to define our model as a sequence of layers.
# nn.Sequential is a Module which contains other Modules, and applies them in sequence
# to produce its output. Each Linear Module computes output from input using a linear function,
# and holds internal Tensors for its weight and bias.
model = torch.nn.Sequential(
    torch.nn.Linear(input_dimension, hidden_dimension),
    torch.nn.ReLU(), 
    torch.nn.Linear(hidden_dimension, output_dimension),
)

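# The nn package also contains definitions of popular loss functions; in this case
# we will use Mean Squared Error (MSE) as our loss function. reduction='sum' sums the
# squared errors instead of averaging them (it replaces the deprecated size_average=False).
loss_fn = torch.nn.MSELoss(reduction='sum')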

learning_rate = 1e-4

for n in range(500):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so, you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    y_pred = model(x)
    
    # Compute and print loss. We pass Tensors containing the predicted and true values of y,
    # and the loss function returns a Tensor containing the loss.
    loss = loss_fn(y_pred, y)
    print(n, loss.item())
    
    # Zero the gradients before running the backward pass.
    model.zero_grad()
    
    # Backward pass: compute gradient of the loss with respect to all the learnable
    # parameters of the model. Internally, the parameters of each Module are stored in Tensors
    # with requires_grad=True, so this call will compute gradients for all learnable parameters
    # in the model and print 
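
The listing above stops partway through the backward-pass comment. The following is a minimal sketch of how such a training step is commonly completed. It assumes the loop goes on to call `loss.backward()` and then applies a plain gradient-descent update to each parameter. The update rule, the shorter run of 50 iterations, the fixed random seed, and the `losses` list are assumptions of this sketch and are not taken from the original.

```python
import torch

torch.manual_seed(0)  # fixed seed for repeatability (an addition for this sketch)

x = torch.randn(64, 1000)
y = torch.randn(64, 10)

model = torch.nn.Sequential(
    torch.nn.Linear(1000, 100),
    torch.nn.ReLU(),
    torch.nn.Linear(100, 10),
)
loss_fn = torch.nn.MSELoss(reduction='sum')
learning_rate = 1e-4

losses = []  # hypothetical helper list, used only to inspect progress
for n in range(50):
    y_pred = model(x)
    loss = loss_fn(y_pred, y)
    losses.append(loss.item())

    # Clear gradients left over from the previous iteration.
    model.zero_grad()

    # Backward pass: populates .grad on every parameter of the model.
    loss.backward()

    # Assumed update rule: plain gradient descent. torch.no_grad() keeps
    # the in-place update out of the autograd graph.
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad

print(losses[0], losses[-1])
```

In practice the manual update is often replaced by an optimizer from `torch.optim`, but writing it by hand shows where the gradients computed by `loss.backward()` end up and how they are used.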