# TOC

__Chapter 4 - Introduction to neural networks using PyTorch__

1. [Import](#Import)
1. [Recipe 4-1 : Working with activation functions](#Recipe-4-1-:-Working-with-activation-functions)
1. [Recipe 4-2 : Visualizing the shape of activation functions](#Recipe-4-2-:-Visualizing-the-shape-of-activation-functions)
1. [Recipe 4-3 : Basic neural network model](#Recipe-4-3-:-Basic-neural-network-model)
1. [Recipe 4-4 : Tensor differentiation](#Recipe-4-4-:-Tensor-differentiation)


# Import

<a id = 'Import'></a>

In [5]:
# standard library and settings
import os
import sys
import importlib
import itertools
import warnings

warnings.simplefilter("ignore")
from IPython.display import display, HTML

display(HTML("<style>.container { width:95% !important; }</style>"))

# data extensions and settings
import numpy as np

np.set_printoptions(threshold=np.inf, suppress=True)
import pandas as pd

pd.set_option("display.max_rows", 500)
pd.set_option("display.max_columns", 500)
pd.options.display.float_format = "{:,.6f}".format

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.init as init
import torch.nn.functional as F
from torch.autograd import Variable

# visualization extensions and settings
import seaborn as sns
import matplotlib.pyplot as plt

# magic functions
%matplotlib inline

# Recipe 4-1 : Working with activation functions

__Problem__: What are activation functions and how do they work in real projects? How do we implement an activation function using PyTorch?

__Solution__: Activation functions are mathematical formulas that transform a vector into another representation of that vector. These functions act upon data as it moves through the neural network. All activation functions can be broadly classified as linear or nonlinear.


<a id = 'Recipe-4-1-:-Working-with-activation-functions'></a>

In [7]:
# linear function - typically used to transfer information from the last hidden layer to the output layer. y = x * A^T + b
# bilinear function - a simple function of two inputs, typically used to transfer information. y = x1^T * A * x2 + b
x = Variable(torch.randn(100, 10))
y = Variable(torch.randn(100, 30))

linear = nn.Linear(in_features=10, out_features=5, bias=True)
output_linear = linear(x)
print("Output size : {}".format(output_linear.size()))

bilinear = nn.Bilinear(in1_features=10, in2_features=30, out_features=5, bias=True)
output_bilinear = bilinear(x, y)
print("Output size : {}".format(output_bilinear.size()))

Output size : torch.Size([100, 5])
Output size : torch.Size([100, 5])
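
The formulas in the comments above can be checked against the layers directly. The following is a minimal sketch, assuming the weight layouts PyTorch uses for these layers: `(out_features, in_features)` for `nn.Linear` and `(out_features, in1_features, in2_features)` for `nn.Bilinear`. The names `x1`, `x2`, `manual_linear` and `manual_bilinear` are introduced here for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the sketch is repeatable

x1 = torch.randn(100, 10)
x2 = torch.randn(100, 30)

# linear layer: y = x * A^T + b, where A has shape (5, 10)
linear = nn.Linear(in_features=10, out_features=5, bias=True)
manual_linear = x1 @ linear.weight.T + linear.bias
linear_matches = torch.allclose(linear(x1), manual_linear, atol=1e-5)

# bilinear layer: y_k = x1^T * A_k * x2 + b_k, where A has shape (5, 10, 30)
bilinear = nn.Bilinear(in1_features=10, in2_features=30, out_features=5, bias=True)
manual_bilinear = (
    torch.einsum("ni,kij,nj->nk", x1, bilinear.weight, x2) + bilinear.bias
)
bilinear_matches = torch.allclose(bilinear(x1, x2), manual_bilinear, atol=1e-4)

print("linear matches manual formula   : {}".format(linear_matches))
print("bilinear matches manual formula : {}".format(bilinear_matches))
```

Both layers map a batch of 100 rows to 5 output features, which is why the two output sizes printed above are identical even though the bilinear layer consumes two inputs.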


In [None]:
# sigmoid function - this nonlinear activation function is frequently used because it is easy to explain and implement. Its output is confined to the range (0, 1),
# so it can be read as the probability of belonging to a class, and it is mostly used in classification tasks. It saturates for large |x|, where gradients shrink toward zero and gradient descent can stall.
x = Variable(torch.randn(100, 10))
y = Variable(torch.randn(100, 30))

# apply the sigmoid element-wise; the output keeps the shape of the input
sigmoid = nn.Sigmoid()
output_sigmoid_x = sigmoid(x)
output_sigmoid_y = sigmoid(y)

print("Output size : {}".format(output_sigmoid_x.size()))
print("Output size : {}".format(output_sigmoid_y.size()))

In [None]:
# hyperbolic tangent function - better known as tanh, this function transforms information as it moves from the mapping layer to the
# hidden layer, and between hidden layers. it takes values in the range (-1, 1)
x = Variable(torch.randn(100, 10))
y = Variable(torch.randn(100, 30))

# apply tanh element-wise; the output keeps the shape of the input
tanh = nn.Tanh()
output_tanh_x = tanh(x)
output_tanh_y = tanh(y)

print("Output size : {}".format(output_tanh_x.size()))
print("Output size : {}".format(output_tanh_y.size()))

In [None]:
# log sigmoid transfer function - this function is generally used when mapping the input layer to the hidden layer. it is frequently
# applied when the data entering this activation function is non-binary, is of the data type float, and contains many outliers.
x = Variable(torch.randn(100, 10))
y = Variable(torch.randn(100, 30))

# apply log(sigmoid(x)) element-wise; the output keeps the shape of the input
log_sigmoid = nn.LogSigmoid()
output_log_sigmoid_x = log_sigmoid(x)
output_log_sigmoid_y = log_sigmoid(y)

print("Output size : {}".format(output_log_sigmoid_x.size()))
print("Output size : {}".format(output_log_sigmoid_y.size()))

In [None]:
# rectified linear unit - better known as ReLU, this function is used when transferring information from the input layer toward the output
# layer, mainly between hidden layers. it is mostly used in convolutional neural networks. it produces values in the range [0, inf).
x = Variable(torch.randn(100, 10))
y = Variable(torch.randn(100, 30))

# apply max(0, x) element-wise; the output keeps the shape of the input
relu = nn.ReLU()
output_relu_x = relu(x)
output_relu_y = relu(y)

print("Output size : {}".format(output_relu_x.size()))
print("Output size : {}".format(output_relu_y.size()))

In [None]:
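# leaky ReLU - a variant of ReLU that aims to address the 'dying ReLU' problem, in which a unit's gradient becomes zero. this activation function avoids this problem
# by allowing a small, non-zero gradient when the unit is not active
x = Variable(torch.randn(100, 10))
y = Variable(torch.randn(100, 30))

# negative inputs are scaled by negative_slope instead of being set to zero
leaky_relu = nn.LeakyReLU(negative_slope=0.01)
output_leaky_relu_x = leaky_relu(x)
output_leaky_relu_y = leaky_relu(y)

print("Output size : {}".format(output_leaky_relu_x.size()))
print("Output size : {}".format(output_leaky_relu_y.size()))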
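
The value ranges described in this recipe can be checked numerically. The sketch below applies each nonlinear activation to the same evenly spaced tensor and reports the minimum and maximum of the result. The input range of -5 to 5 and the dictionary name `outputs` are arbitrary choices made for illustration.

```python
import torch
import torch.nn.functional as F

# evenly spaced sample inputs covering negative and positive values
x = torch.linspace(-5.0, 5.0, steps=11)

outputs = {
    "sigmoid": torch.sigmoid(x),
    "tanh": torch.tanh(x),
    "logsigmoid": F.logsigmoid(x),
    "relu": F.relu(x),
    "leaky_relu": F.leaky_relu(x, negative_slope=0.01),
}

for name, out in outputs.items():
    print(
        "{:<11s} min = {: .4f}  max = {: .4f}".format(
            name, out.min().item(), out.max().item()
        )
    )
```

On this input, sigmoid stays inside (0, 1), tanh stays inside (-1, 1), the log sigmoid is always negative, ReLU never drops below zero, and leaky ReLU lets negative inputs through scaled by the slope.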

# Recipe 4-2 : Visualizing the shape of activation functions


__Problem__: How do we visualize activation functions?

__Solution__: The data transformed by an activation function can be plotted against the input tensor. For illustrative purposes, we can use a sample tensor, converted to a PyTorch variable. We apply an activation function, store the result as another tensor, and plot the result.

<a id = 'Recipe-4-2-:-Visualizing-the-shape-of-activation-functions'></a>

In [None]:
#
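
One way to carry out the steps in the solution is sketched below: a sample tensor is passed through several activation functions and each result is plotted against the input. The choice of functions, the input range, the 2x2 grid, and the leaky ReLU slope of 0.1 (exaggerated so the negative side is visible) are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt

# sample input tensor spanning negative and positive values
x = torch.linspace(-5.0, 5.0, steps=200)

# apply each activation and keep the result as a separate tensor
activations = {
    "sigmoid": torch.sigmoid(x),
    "tanh": torch.tanh(x),
    "ReLU": F.relu(x),
    "leaky ReLU (slope 0.1)": F.leaky_relu(x, negative_slope=0.1),
}

# one panel per activation, output plotted against the input
fig, axes = plt.subplots(2, 2, figsize=(10, 6), sharex=True)
for ax, (name, y) in zip(axes.ravel(), activations.items()):
    ax.plot(x.numpy(), y.numpy())
    ax.axhline(0.0, color="grey", linewidth=0.5)  # reference line at y = 0
    ax.set_title(name)
    ax.grid(True)
fig.tight_layout()
```

With `%matplotlib inline` active, the figure renders at the end of the cell. Outside a notebook, a call to `plt.show()` would be needed.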

# Recipe 4-3 : Basic neural network model


__Problem__: How do we build a basic neural network model using PyTorch?

__Solution__: A basic neural network model in PyTorch requires six steps: preparing training data, initializing weights, creating a basic network model, calculating the loss function, selecting the learning rate, and optimizing the loss function with respect to the model’s parameters.



<a id = 'Recipe-4-3-:-Basic-neural-network-model'></a>

In [None]:
#
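
The six steps in the solution can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the synthetic data (a noisy line), the one-hidden-layer architecture, Xavier initialization, the mean squared error loss, the learning rate of 0.05, and plain SGD are all choices made here for the example. Weights are initialized after the model is created because in PyTorch the weights live inside the layers.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # fixed seed so the sketch is repeatable

# prepare training data: synthetic samples of y = 2x + 1 with a little noise
x_train = torch.linspace(-1.0, 1.0, steps=100).unsqueeze(1)
y_train = 2.0 * x_train + 1.0 + 0.05 * torch.randn_like(x_train)

# create a basic network model: one hidden layer with a tanh activation
model = nn.Sequential(nn.Linear(1, 8), nn.Tanh(), nn.Linear(8, 1))

# initialize weights
for layer in model:
    if isinstance(layer, nn.Linear):
        nn.init.xavier_uniform_(layer.weight)
        nn.init.zeros_(layer.bias)

# loss function, learning rate, and optimizer
loss_fn = nn.MSELoss()
learning_rate = 0.05
optimizer = optim.SGD(model.parameters(), lr=learning_rate)

with torch.no_grad():
    initial_loss = loss_fn(model(x_train), y_train).item()

# optimize the loss with respect to the model's parameters
for epoch in range(500):
    optimizer.zero_grad()                     # clear gradients from the last step
    loss = loss_fn(model(x_train), y_train)   # forward pass and loss
    loss.backward()                           # backpropagate
    optimizer.step()                          # update parameters

with torch.no_grad():
    final_loss = loss_fn(model(x_train), y_train).item()

print("Initial loss : {:.6f}".format(initial_loss))
print("Final loss   : {:.6f}".format(final_loss))
```

The final loss should be lower than the initial loss if training is working. The exact values depend on the seed and the PyTorch version.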

# Recipe 4-4 : Tensor differentiation

__Problem__: What is tensor differentiation and how is it relevant in computation graph execution in PyTorch?

__Solution__: 

<a id = 'Recipe-4-4-:-Tensor-differentiation'></a>

In [None]:
#
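
As a small illustration of the question posed in this recipe, the sketch below uses PyTorch's autograd to differentiate a scalar function of a tensor. PyTorch records the operations applied to a tensor created with `requires_grad=True` as a computation graph, and `backward()` walks that graph in reverse to fill in `.grad`. The function y = sum(x^2 + 3x) and the input values are hypothetical choices made for this example.

```python
import torch

# leaf tensor: autograd tracks every operation applied to it
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# forward pass builds the computation graph: y = sum(x^2 + 3x)
y = (x ** 2 + 3.0 * x).sum()

# backward pass computes dy/dx and stores it in x.grad
y.backward()

# analytic derivative for comparison: dy/dx = 2x + 3
expected_grad = 2.0 * x.detach() + 3.0

print("autograd gradient : {}".format(x.grad))         # tensor([5., 7., 9.])
print("analytic gradient : {}".format(expected_grad))  # tensor([5., 7., 9.])
```

The two gradients agree. This is the same mechanism that `loss.backward()` relies on when a network is trained.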