# Using a neural network to fit the data
---
This chapter covers
- Nonlinear activation functions as the key difference compared with linear models
- Working with PyTorch's nn module
- Solving a linear-fit problem with a neural network

## Activation functions
---
An activation function is needed to:
- give the model's output function different slopes at different input values (in the inner layers)
- concentrate the outputs into a given range (at the last layer)
  
It needs to be:
- nonlinear, because composing linear functions still yields a linear function
- differentiable, so that gradients can be computed through it
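
The nonlinearity requirement can be checked numerically. The following is a minimal sketch, assuming two small nn.Linear layers with arbitrary sizes (the names first, second, combined_weight and combined_bias are illustrative, not from the chapter). It folds the two stacked layers into a single weight and bias and confirms that both versions give the same output.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two linear layers stacked with no activation in between (sizes are arbitrary)
first = nn.Linear(1, 3)
second = nn.Linear(3, 1)

# Fold them into one equivalent linear map:
# second(first(x)) = (W2 @ W1) x + (W2 @ b1 + b2)
with torch.no_grad():
    combined_weight = second.weight @ first.weight            # shape (1, 1)
    combined_bias = second.weight @ first.bias + second.bias  # shape (1,)

x = torch.linspace(-2.0, 2.0, steps=5).unsqueeze(1)  # shape (5, 1)

with torch.no_grad():
    stacked_out = second(first(x))
    collapsed_out = x @ combined_weight.T + combined_bias

# The stacked model is no more expressive than a single linear layer
print(torch.allclose(stacked_out, collapsed_out, atol=1e-6))
```

Placing an activation such as torch.tanh between the two layers is what prevents this collapse.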

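Common activation functions:
- have at least one sensitive range, where a change in the input changes the output, so that training can make progress
- usually have an insensitive range, where the output barely changes
- usually have a lower bound and an upper bound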
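
These properties can be observed on a concrete activation. The sketch below uses torch.tanh as one example (the choice of tanh and the sample inputs are assumptions for illustration) and uses autograd to read off the slope at each input.

```python
import torch

# Sample inputs: far left, near the center, far right
x = torch.tensor([-10.0, -1.0, 0.0, 1.0, 10.0], requires_grad=True)

y = torch.tanh(x)
y.sum().backward()  # x.grad now holds d tanh(x) / dx at each sample point

print(y.detach())  # outputs stay within the bounds -1 and 1
print(x.grad)      # slope is largest at 0 and nearly vanishes at -10 and 10
```

Around 0 the gradient is large, so the weights feeding this unit receive a useful training signal. Far from 0 the output saturates and the gradient is close to zero.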

## The PyTorch nn module
---
Note: the submodules of a PyTorch module must not be stored in a plain Python list or dict. Otherwise the module does not register them, so parameters() does not return their parameters and the optimizer cannot update them.  
Use nn.ModuleList and nn.ModuleDict instead.
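
The difference shows up when counting parameters. This is a minimal sketch with two hypothetical classes, PlainListModel and ModuleListModel, which are not part of the chapter and define no forward method because only registration is being checked.

```python
import torch.nn as nn

class PlainListModel(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain Python list: the layers are NOT registered as submodules
        self.layers = [nn.Linear(1, 1), nn.Linear(1, 1)]

class ModuleListModel(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList registers each layer, so parameters() can find them
        self.layers = nn.ModuleList([nn.Linear(1, 1), nn.Linear(1, 1)])

plain_count = len(list(PlainListModel().parameters()))
registered_count = len(list(ModuleListModel().parameters()))

print(plain_count)       # 0: nothing for an optimizer to update
print(registered_count)  # 4: a weight and a bias for each of the two layers
```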

### \_\_call\_\_ and forward
Call the module as if it were a function, which invokes \_\_call\_\_, instead of calling forward directly.  
\_\_call\_\_ does other important things, such as running registered hooks, before and after it calls forward. The two only produce the same output value, so calling forward directly is a mistake.
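
One visible piece of that extra work is hook handling. The sketch below registers a forward hook on a small nn.Linear (the names record_call and calls are hypothetical helpers for illustration) and then runs the module both ways.

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
calls = []

def record_call(module, inputs, output):
    # Runs after forward, but only when the module is invoked via __call__
    calls.append(tuple(output.shape))

handle = model.register_forward_hook(record_call)

x = torch.ones(3, 1)
y_call = model(x)             # goes through __call__, so the hook fires
y_forward = model.forward(x)  # bypasses __call__, so the hook is skipped
handle.remove()

print(len(calls))                      # 1: only the model(x) call was recorded
print(torch.equal(y_call, y_forward))  # True: the output values still match
```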

In [1]:
import torch

t_c = [0.5, 14.0, 15, 28, 11, 8, 3, -4, 6, 13, 21]
t_u = [35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4]

t_c = torch.tensor(t_c).unsqueeze(1)  # convert the list to a B x 1 tensor (B = batch size)
t_u = torch.tensor(t_u).unsqueeze(1)

t_u.shape

torch.Size([11, 1])

In [5]:
# use nn to implement the model again
import torch.nn as nn
import torch.optim as optim

linear_model = nn.Linear(1, 1)
optimizer = optim.SGD(
    linear_model.parameters(),
    lr=1e-2
)

In [7]:
for name, param in linear_model.named_parameters():
    print(name, param)

weight Parameter containing:
tensor([[-0.0476]], requires_grad=True)
bias Parameter containing:
tensor([0.2949], requires_grad=True)


In [8]:
linear_model.bias

Parameter containing:
tensor([0.2949], requires_grad=True)