<a href="https://colab.research.google.com/github/Maupin1991/ML_pytorch_tutorial/blob/master/2_NeuralNetworksWithPytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Neural Networks with PyTorch

We could build a neural network by defining the layers as tensors and connecting them with linear operations and activation functions. In practice, the easiest way to build one is the **nn** module, which provides higher-level interfaces for this task.
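
To make the first option concrete, here is a minimal sketch of a one-hidden-layer network written with plain tensors only. The layer sizes (784, 256, 10) mirror the examples in this notebook; the names `W1`, `b1`, `W2`, `b2` and the helper `manual_forward` are illustrative assumptions, not part of any PyTorch API.

```python
import torch

torch.manual_seed(0)

# Parameters created by hand as plain tensors (illustrative names)
W1 = torch.randn(784, 256) * 0.01
b1 = torch.zeros(256)
W2 = torch.randn(256, 10) * 0.01
b2 = torch.zeros(10)

def manual_forward(x):
    # Linear transformation followed by a sigmoid activation
    h = torch.sigmoid(x @ W1 + b1)
    # Output layer followed by a softmax over the 10 classes
    logits = h @ W2 + b2
    return torch.softmax(logits, dim=1)

x = torch.randn(4, 784)       # a batch of 4 flattened 28x28 inputs
probs = manual_forward(x)
print(probs.shape)            # one row of 10 probabilities per input
print(probs.sum(dim=1))       # each row sums to 1 (up to rounding)
```

This works, but every parameter has to be created, tracked and updated by hand, which is the bookkeeping that **nn** takes over.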

In [0]:
import torch
from torch import nn

In [0]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        
        # Inputs to hidden layer linear transformation
        self.hidden = nn.Linear(784, 256)
        # Output layer, 10 units - one for each digit
        self.output = nn.Linear(256, 10)
        
        # Define sigmoid activation and softmax output 
        self.sigmoid = nn.Sigmoid()
        self.softmax = nn.Softmax(dim=1)  # normalizes over the class dimension, so each row sums to 1
        
    def forward(self, x):
        # Pass the input tensor through each of our operations
        x = self.hidden(x)
        x = self.sigmoid(x)
        x = self.output(x)
        x = self.softmax(x)
        
        return x

Here we're inheriting from **nn.Module**. Combined with **super().\__init\__()**, this creates a class that tracks the architecture and provides many useful methods and attributes. Inheriting from nn.Module is mandatory when you create a class for your network; the name of the class itself can be anything.

The weights and biases are automatically initialized with random values.
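
As a quick check of what gets created, the sketch below builds a single `nn.Linear` layer with the same sizes as the hidden layer above and lists its parameters. The variable names `layer` and `n_params` are only for illustration.

```python
from torch import nn

layer = nn.Linear(784, 256)

# Each parameter is registered under a name and already holds values
for name, p in layer.named_parameters():
    print(name, tuple(p.shape), p.requires_grad)

# The weight is stored as (out_features, in_features)
n_params = sum(p.numel() for p in layer.parameters())
print(n_params)  # 784 * 256 weights + 256 biases = 200960
```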

The **forward** method defines the sequence of operations that are computed to get the output of the network.

In [0]:
net = Network()
print(net)
# Don't call forward directly, e.g. print(net.forward(torch.randn(1, 784))).
# Calling the module itself goes through __call__, which also runs any registered hooks.
print(net(torch.randn(1, 784)))
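
One observable difference between the two call styles is hook handling. The sketch below registers a forward hook on a small layer and shows that it fires only when the module is called, not when `forward` is invoked directly. The hook function `record_shape` and the list `calls` are hypothetical names used for illustration.

```python
import torch
from torch import nn

layer = nn.Linear(784, 10)
calls = []

def record_shape(module, inputs, output):
    # Forward hooks receive the module, its inputs and its output
    calls.append(tuple(output.shape))

handle = layer.register_forward_hook(record_shape)

x = torch.randn(1, 784)
layer(x)          # goes through __call__, so the hook runs
layer.forward(x)  # bypasses __call__, so the hook is skipped
handle.remove()

print(calls)      # [(1, 10)]
```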

In [0]:
# access hidden layer weights and bias
# (note that they are already initialized)
print(net.hidden.weight)
print(net.hidden.bias)

# sample from a normal distribution with mean = 3 and standard deviation = 0.01
# (note the in-place operation, marked by the trailing underscore)
nn.init.normal_(net.hidden.weight, mean=3, std=0.01)
print(net.hidden.weight)

Programmers often prefer the functional module, which allows the network to be defined more compactly. By convention, **nn.functional** is imported as **F**.

In [0]:
import torch.nn.functional as F

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        # Inputs to hidden layer linear transformation
        self.hidden = nn.Linear(784, 256)
        # Output layer, 10 units - one for each digit
        self.output = nn.Linear(256, 10)
        
    def forward(self, x):
        # Hidden layer with sigmoid activation
        # (torch.sigmoid avoids the deprecation warning that some PyTorch releases emit for F.sigmoid)
        x = torch.sigmoid(self.hidden(x))
        # Output layer with softmax activation
        x = F.softmax(self.output(x), dim=1)
        
        return x
    
model = Network()
print(model)
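
The module and functional forms compute the same thing; the difference is only whether the operation is stored as an attribute of the network. A small sketch comparing the two on an arbitrary input (the names `module_out` and `functional_out` are illustrative):

```python
import torch
from torch import nn
import torch.nn.functional as F

x = torch.randn(3, 10)

module_out = nn.Softmax(dim=1)(x)     # layer object, then call it
functional_out = F.softmax(x, dim=1)  # plain function call

print(torch.allclose(module_out, functional_out))
print(functional_out.sum(dim=1))      # each row sums to 1 (up to rounding)
```

Since softmax and sigmoid have no parameters to learn, nothing is lost by not registering them in `__init__`.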

Other activation functions are defined in the functional module. Have a look at the **ReLU** operation.

In [0]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        # Defining the layers, 128, 64, 10 units each
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 64)
        # Output layer, 10 units - one for each digit
        self.fc3 = nn.Linear(64, 10)
        
    def forward(self, x):
        ''' Forward pass through the network, returns the class probabilities '''
        
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        x = F.relu(x)
        x = self.fc3(x)
        x = F.softmax(x, dim=1)
        
        return x

model = Network()
print(model)
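
For reference, ReLU simply replaces negative values with zero and leaves the rest unchanged. A minimal sketch on a hand-picked tensor (the values are arbitrary):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 1.5])

relu_out = F.relu(x)
print(relu_out)  # negative entries are clamped to 0, 1.5 is kept

# Equivalent formulation: element-wise max(x, 0)
print(torch.equal(relu_out, torch.clamp(x, min=0)))
```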

**Fastest way: use the Sequential class, which lets you define the layers and operations in the order they are applied.
Also check out the definition of the model with OrderedDict.**

In [0]:
# Hyperparameters for our network
input_size = 784
hidden_sizes = [128, 64]
output_size = 10

# Build a feed-forward network
model1 = nn.Sequential(nn.Linear(input_size, hidden_sizes[0]),
                       nn.ReLU(),
                       nn.Linear(hidden_sizes[0], hidden_sizes[1]),
                       nn.ReLU(),
                       nn.Linear(hidden_sizes[1], output_size),
                       nn.Softmax(dim=1))
print(model1)

# Ordered dict implementation
from collections import OrderedDict
model2 = nn.Sequential(OrderedDict([
                       ('fc1', nn.Linear(input_size, hidden_sizes[0])),
                       ('relu1', nn.ReLU()),
                       ('fc2', nn.Linear(hidden_sizes[0], hidden_sizes[1])),
                       ('relu2', nn.ReLU()),
                       ('output', nn.Linear(hidden_sizes[1], output_size)),
                       ('softmax', nn.Softmax(dim=1))]))
print(model2)

print(model1[0])
print(model2[0])
print(model2.fc1)
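
The naming choice also shows up in the parameter names, which matters when saving or inspecting weights. The sketch below uses deliberately tiny layers (sizes 4, 3 and 2 are arbitrary) to compare the `state_dict` keys of an unnamed and a named Sequential model; `unnamed` and `named` are illustrative variable names.

```python
from collections import OrderedDict
from torch import nn

unnamed = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
named = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(4, 3)),
    ('relu1', nn.ReLU()),
    ('output', nn.Linear(3, 2)),
]))

# Without names, parameters are keyed by position in the sequence
print(list(unnamed.state_dict().keys()))
# ['0.weight', '0.bias', '2.weight', '2.bias']

# With an OrderedDict, the chosen names are used instead
print(list(named.state_dict().keys()))
# ['fc1.weight', 'fc1.bias', 'output.weight', 'output.bias']

# Index and attribute access return the same layer object
print(named[0] is named.fc1)  # True
```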