# `torch.nn`: Easy neural-network construction

Convention: the first index is the data-item (batch) index, so N images, each of shape 128 x 128, are stored in a tensor of shape N x 128 x 128.
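A minimal sketch of this convention, using random tensors as stand-in data (the variable names `images`, `batch`, and `samples` are illustrative only):

```python
import torch

# Four stand-in "images", each of shape 128 x 128 (random values, for illustration only)
images = [torch.rand(128, 128) for _ in range(4)]

# torch.stack adds a new leading dimension, giving shape N x 128 x 128
batch = torch.stack(images)
print(batch.shape)      # torch.Size([4, 128, 128])

# Indexing the first dimension recovers a single data item
print(batch[0].shape)   # torch.Size([128, 128])

# The same convention applies to scalar inputs: 100 samples of a
# one-dimensional quantity are stored as a tensor of shape 100 x 1
samples = torch.linspace(0, 1, 100).reshape(-1, 1)
print(samples.shape)    # torch.Size([100, 1])
```

This is why the scalar inputs `x` are reshaped to 100 x 1 before being passed to a model.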

## References

* C. F. Higham and D. J. Higham (2018) Deep Learning: An Introduction for Applied Mathematicians, https://arxiv.org/abs/1801.05894
* J. Berner, P. Grohs, G. Kutyniok, P. Petersen (2021) The Modern Mathematics of Deep Learning, https://arxiv.org/abs/2105.04026

In [None]:
import numpy as np
import torch
import matplotlib.pyplot as plt
%matplotlib inline

## Define a nonlinear function f(x) that we want to approximate

In [None]:
x = torch.linspace(0, 2*np.pi, 100)
y = torch.sin(x)
plt.plot(x.numpy(), y.numpy(), 'b.')
plt.xlabel('x')
plt.ylabel('f(x)');

## Create a basic fully-connected neural network with ReLU layers

In [None]:
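class MyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Three fully-connected layers: 1 input -> 8 hidden -> 8 hidden -> 1 output
        self.fc1 = torch.nn.Linear(1, 8)
        self.fc2 = torch.nn.Linear(8, 8)
        self.fc3 = torch.nn.Linear(8, 1)

    def forward(self, x):
        # x has shape (N, 1); ReLU follows each hidden layer
        y1 = torch.nn.functional.relu(self.fc1(x))
        y2 = torch.nn.functional.relu(self.fc2(y1))
        # The output layer is linear (no activation)
        y3 = self.fc3(y2)
        return y3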
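For a plain stack of layers like this one, the same architecture could also be sketched with `torch.nn.Sequential`. The name `seq_model` is hypothetical and is not used elsewhere in this notebook:

```python
import torch

# Same layer sizes as MyModel: 1 -> 8 -> 8 -> 1, with ReLU between the linear layers
seq_model = torch.nn.Sequential(
    torch.nn.Linear(1, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 1),
)

# A batch of 100 scalar inputs maps to 100 scalar outputs
out = seq_model(torch.zeros(100, 1))
print(out.shape)  # torch.Size([100, 1])
```

Subclassing `torch.nn.Module` is the more flexible choice when the forward pass needs branching or skip connections. `Sequential` is shorter when the layers are simply applied one after another.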

In [None]:
model = MyModel()

## What does this model predict?

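We'll plot the training data as blue dots and the model prediction as a red line. The model has not been trained yet, so its output reflects only the random initial weights.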

In [None]:
yp = model(x.reshape(-1, 1))  # shape (100, 1): one scalar input per data item

In [None]:
plt.plot(x.numpy(), y.numpy(), 'b.')
plt.plot(x.numpy(), yp.detach().numpy(), 'r')
plt.xlabel('x')
plt.ylabel('f(x)');

In [None]:
model.fc3.weight

In [None]:
model.fc3.bias

In [None]:
model.fc3.bias.data

In [None]:
for p in model.parameters():
    print(p.shape)
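As a sketch of how these shapes add up, the snippet below lists each parameter by name and counts the total number of trainable values. The class is repeated so that the snippet runs on its own, and `n_params` is an illustrative name:

```python
import torch

class MyModel(torch.nn.Module):
    # Same architecture as above, repeated so that this snippet runs on its own
    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(1, 8)
        self.fc2 = torch.nn.Linear(8, 8)
        self.fc3 = torch.nn.Linear(8, 1)

    def forward(self, x):
        x = torch.nn.functional.relu(self.fc1(x))
        x = torch.nn.functional.relu(self.fc2(x))
        return self.fc3(x)

model = MyModel()

# named_parameters() yields (name, tensor) pairs such as 'fc1.weight'
for name, p in model.named_parameters():
    print(name, tuple(p.shape), p.numel())

# Total: (1*8 + 8) + (8*8 + 8) + (8*1 + 1) = 16 + 72 + 9 = 97
n_params = sum(p.numel() for p in model.parameters())
print(n_params)  # 97
```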

## Optimize the model parameters so that the model predicts our function f(x)

In [None]:
opt = torch.optim.Adam(model.parameters(), lr=0.01)
loss_history = []

In [None]:
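loss_fn = torch.nn.MSELoss()  # create the loss function once, outside the loop
for i in range(10):
    opt.zero_grad()                          # clear gradients from the previous step
    yp = model(x.reshape(-1, 1))             # forward pass
    loss = loss_fn(yp, y.reshape(-1, 1))     # mean squared error against the targets
    loss_history.append(loss.item())
    loss.backward()                          # backpropagate to compute gradients
    opt.step()                               # update the parameters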

In [None]:
plt.plot(loss_history)
plt.xlabel('optimization step')
plt.ylabel('loss');

In [None]:
plt.plot(x.numpy(), y.numpy(), 'b.')
plt.plot(x.numpy(), yp.detach().numpy(), 'r')
plt.xlabel('x')
plt.ylabel('f(x)');

## Tasks

1. How many optimization steps are needed to get a reasonable approximation of our training data?

2. Can you adjust the learning rate or other parameters to speed up the optimization?
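One way to explore these tasks is to wrap the training loop in a function and call it with different settings. The helper `train` and its arguments `n_steps` and `lr` are hypothetical, and the fixed random seed is an assumption made so that every run starts from the same initial weights:

```python
import math
import torch

def train(n_steps, lr):
    """Train a fresh 1-8-8-1 ReLU network on sin(x) and return the loss history."""
    torch.manual_seed(0)  # same initial weights for every call, so runs are comparable
    x = torch.linspace(0, 2 * math.pi, 100).reshape(-1, 1)
    y = torch.sin(x)
    model = torch.nn.Sequential(
        torch.nn.Linear(1, 8), torch.nn.ReLU(),
        torch.nn.Linear(8, 8), torch.nn.ReLU(),
        torch.nn.Linear(8, 1),
    )
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    history = []
    for _ in range(n_steps):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        history.append(loss.item())
        loss.backward()
        opt.step()
    return history

# Compare a few learning rates over the same number of steps
for lr in (0.001, 0.01, 0.1):
    history = train(n_steps=200, lr=lr)
    print(f"lr = {lr}: first loss = {history[0]:.4f}, last loss = {history[-1]:.4f}")
```

Plotting the returned histories on a shared, logarithmic y-axis makes the differences between settings easier to see.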

# Saving and restoring models

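Save and load the model's parameters (its `state_dict`) rather than the full model object. Pickling the whole model ties the saved file to the exact class definition and module layout of the code that wrote it.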

In [None]:
torch.save(model.state_dict(), 'model_file.pkl')

In [None]:
model = MyModel()
model.load_state_dict(torch.load('model_file.pkl'))

In [None]:
model.state_dict()
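A minimal round-trip check, assuming a small stand-in model and a temporary directory (both hypothetical, chosen only so that the snippet runs on its own):

```python
import os
import tempfile
import torch

# Stand-in model; any torch.nn.Module is handled the same way
model = torch.nn.Linear(1, 8)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'model_file.pkl')
    torch.save(model.state_dict(), path)

    # A freshly constructed model starts with its own random weights...
    restored = torch.nn.Linear(1, 8)
    # ...until the saved parameters are loaded into it
    restored.load_state_dict(torch.load(path))

# Every restored tensor should match the original exactly
for name, tensor in model.state_dict().items():
    print(name, torch.equal(tensor, restored.state_dict()[name]))
```

The saved file holds only tensors, so the model must first be constructed in code with the same architecture before `load_state_dict` is called.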

# Visualizing run-time output with TensorBoard

Run TensorBoard with `tensorboard --logdir training_logs`

Navigate to http://localhost:6006/

In [None]:
import datetime
import tensorboardX

In [None]:
timestamp = datetime.datetime.now().isoformat(timespec='seconds')
writer = tensorboardX.SummaryWriter(f'training_logs/train-{timestamp}')
# writer.add_text('hyperparameters', f'param1 = {param1}, param2 = {param2}')
for i in range(100):
    loss = np.exp(-i/50)  # synthetic, exponentially decaying "loss" for demonstration
    writer.add_scalar('loss', loss, i)
    # writer.add_text('parameters', 'parameter value = 65', i)

    # plt.ioff()
    # fig = plt.gcf()
    # plt.plot(...)
    # writer.add_figure('output_visualization', fig, i)

    # writer.add_histogram('input_samples', xvec, i)
writer.close()  # flush pending events to disk