A really good explainer on how neural networks approximate functions using only linear functions and activation functions is Michael Nielsen's *A visual proof that neural nets can compute any function*. See [this link](https://neuralnetworksanddeeplearning.com/chap4.html).

**MAKE SURE TO ADD NOTES ON ACTIVATION FUNCTIONS ONCE YOU HAVE A BETTER GRASP**

# Using PyTorch's NN modules
In our previous linear model (chap 05), we wrote a simple linear function to fit a line through our data points. Below is the function.

In [19]:
# where w is the 'weight' and b is the 'bias'
def model(unknown_unit, w, b):
    return w * unknown_unit + b

PyTorch provides *modules* (known as *layers* in other frameworks) in the sub-library `torch.nn`. A PyTorch module is a class deriving from the `torch.nn.Module` base class. A module can have one or more `Parameter` instances as attributes; these are the tensors to be optimized during training (in our case *w* and *b*). A module can also have one or more *submodules* (instances of `nn.Module` subclasses) as attributes, and it will track their parameters as well. <br><br>
To replace the model function above, we could use `nn.Linear`, a subclass of `nn.Module`. Its constructor takes the following arguments:
* `in_features` (int) - the size of each input sample.
* `out_features` (int) - the size of each output sample.
* `bias` (bool) - defaults to `True`. If set to `False`, the layer will not learn an additive bias. No *b* parameter!
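
As a minimal sketch of how these arguments map onto our earlier `model` function: a `nn.Linear(1, 1)` holds a 1 x 1 `weight` (our *w*) and a 1-element `bias` (our *b*). The input values and the variable names `linear`, `no_bias`, and `param_shapes` are only illustrative assumptions, not part of the original notes.

```python
import torch
import torch.nn as nn

# the hand-written linear model from chap 05
def model(unknown_unit, w, b):
    return w * unknown_unit + b

# one input feature, one output feature; bias=True by default
linear = nn.Linear(in_features=1, out_features=1)

# the module tracks its own parameters: weight is (out_features, in_features)
param_shapes = {name: tuple(p.shape) for name, p in linear.named_parameters()}
print(param_shapes)

# nn.Linear expects the last dimension to equal in_features,
# so a batch of 3 samples with 1 feature each has shape (3, 1)
x = torch.tensor([[1.0], [2.0], [3.0]])
out_module = linear(x)

# plugging the module's own weight and bias into the hand-written model
out_manual = model(x, linear.weight.item(), linear.bias.item())
print(torch.allclose(out_module, out_manual))

# with bias=False there is no b parameter at all
no_bias = nn.Linear(1, 1, bias=False)
print(no_bias.bias)
```

Both paths should give the same result, since `nn.Linear` computes `x @ weight.T + bias`, which reduces to `w * x + b` when both feature sizes are 1.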

---
### A little tangent on matrix-vector products to help me fully understand the above parameters


In [20]:
import torch
import torch.nn as nn # import torch.nn with a convenient alias

In [28]:
linear_model = nn.Linear(2, 3, bias=False) # input 2D vector, output 3D vector, NO BIAS
linear_model.weight # weights are a 3 x 2 matrix (out_features x in_features)

Parameter containing:
tensor([[ 0.3005,  0.3781],
        [ 0.1356,  0.4321],
        [ 0.4837, -0.1394]], requires_grad=True)

In [None]:
dummy_input = torch.tensor([1.0, 2.0]) # the input 2D vector
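
To connect this back to matrix-vector products: with no bias, passing the 2D vector through the layer should be the same as multiplying the 3 x 2 weight matrix by the vector, giving a 3D output. With the weights printed above, the first output component would be 0.3005·1.0 + 0.3781·2.0 = 1.0567. The sketch below checks this relationship on freshly initialised (random) weights, so the actual numbers will differ from the ones shown above; the names `out_manual` and `out_rows` are just illustrative.

```python
import torch
import torch.nn as nn

linear_model = nn.Linear(2, 3, bias=False)   # 2D input, 3D output, no bias
dummy_input = torch.tensor([1.0, 2.0])       # the input 2D vector

W = linear_model.weight                      # shape (3, 2)

# what the module computes
out_module = linear_model(dummy_input)       # shape (3,)

# the same thing written as an explicit matrix-vector product
out_manual = W @ dummy_input

# and once more, row by row: each output is a dot product of one row of W with the input
out_rows = torch.stack(
    [W[i, 0] * dummy_input[0] + W[i, 1] * dummy_input[1] for i in range(3)]
)

print(out_module.shape)
print(torch.allclose(out_module, out_manual))
print(torch.allclose(out_module, out_rows))
```

Each of the three rows of the weight matrix produces one output value, which is why `out_features` sets the number of rows and `in_features` sets the number of columns.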
