## Train ResNets

### Danish Rafiq and Asif Hamid

Code originally created by Yuying Liu.
This script is a template for training neural network time-steppers for different systems and different time scales. To reproduce the results in the paper, one needs to obtain all 11 neural network models for each nonlinear system under study. For setup details, please refer to Table 2 in the paper.
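As a quick orientation, the 11 models can be listed together with the time step each one covers. The sketch below is not part of the original notebook; `model_table` is a hypothetical helper. It assumes the step sizes are the powers of two 2^0 through 2^10 and reuses the file-name pattern `model_D{}_noise{}.pt` from the training cell.

```python
def model_table(dt=0.01, noise=0.0, n_models=11):
    """Return (step_size, Dt, file_name) for each of the n_models time-steppers.

    Hypothetical helper: assumes model k uses step size 2**k and the
    'model_D{}_noise{}.pt' naming pattern used in the training cell.
    """
    rows = []
    for k in range(n_models):
        step_size = 2 ** k
        file_name = 'model_D{}_noise{}.pt'.format(step_size, noise)
        rows.append((step_size, step_size * dt, file_name))
    return rows


# list every model that has to be trained for one system and one noise level
for step_size, Dt, file_name in model_table():
    print(step_size, Dt, file_name)
```

Changing `dt` or `noise` in the call gives the corresponding list for another system or noise level.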

In [121]:
import os
import sys
import torch
import numpy as np

module_path = os.path.abspath(os.path.join('../../src/'))
if module_path not in sys.path:
    sys.path.append(module_path)
    
import ResNet as net

In [122]:
# adjustables
k = list(range(11))           # model indices: {0, 1, ..., 10}; model k uses step size 2**k
dt = 0.01                     # time unit: 0.0005 for Lorenz and 0.01 for others
system = 'Hyperbolic'         # system name: 'Hyperbolic', 'Cubic', 'VanDerPol', 'Hopf'
noise = 0.0                   # noise levels: 0.0, 0.01, 0.02, 0.05, 0.1, 0.2

lr = 1e-3                     # learning rate
max_epoch = 10000            # the maximum training epoch
batch_size = 320              # training batch size
arch = [2, 128, 128, 128, 2]  # architecture of the neural network (check paper for details)

In [123]:
# paths
data_dir = os.path.join('../../data/', system)
model_dir = os.path.join('../../models/', system)

# global const
n_forward = 5

In [124]:
# load data
train_data = np.load(os.path.join(data_dir, 'train_noise{}.npy'.format(noise)))
val_data = np.load(os.path.join(data_dir, 'val_noise{}.npy'.format(noise)))
test_data = np.load(os.path.join(data_dir, 'test_noise{}.npy'.format(noise)))
n_train = train_data.shape[0]
n_val = val_data.shape[0]
n_test = test_data.shape[0]

# create dataset object
datasets = list()
step_sizes = list()
print('Dt\'s: ')
for i in k:
    step_size = 2**i
    print(step_size * dt)
    step_sizes.append(step_size)
    datasets.append(net.DataSet(train_data, val_data, test_data, dt, step_size=step_size, n_forward=n_forward))

# dataset = net.DataSet(train_data, val_data, test_data, dt, step_size, n_forward)

Dt's: 
0.01
0.02
0.04
0.08
0.16
0.32
0.64
1.28
2.56
5.12
10.24
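The `net.DataSet` class is not shown in this notebook, so the following is only a sketch of how a step size might translate into training pairs. It assumes the data array has shape `(n_trajectories, n_steps, n_dim)`, that the first sample of each trajectory is the input, and that the next `n_forward` samples taken every `step_size` steps are the targets. `slice_trajectories` is a hypothetical name, not a function of the library.

```python
import numpy as np


def slice_trajectories(data, step_size, n_forward):
    """Split trajectories into an input state and n_forward strided targets.

    Assumption (not confirmed by the notebook): data has shape
    (n_trajectories, n_steps, n_dim) and sampling starts at index 0.
    """
    x_idx = 0
    y_start = x_idx + step_size
    y_end = x_idx + step_size * n_forward + 1
    if data.shape[1] < y_end:
        raise ValueError('need at least {} time steps, got {}'.format(y_end, data.shape[1]))
    x = data[:, x_idx, :]                      # shape (n_trajectories, n_dim)
    ys = data[:, y_start:y_end:step_size, :]   # shape (n_trajectories, n_forward, n_dim)
    return x, ys


# toy data: 3 trajectories, 5121 time steps, 2 state variables
toy = np.arange(3 * 5121 * 2, dtype=float).reshape(3, 5121, 2)
x, ys = slice_trajectories(toy, step_size=1024, n_forward=5)
print(x.shape, ys.shape)
```

Under these assumptions, the coarsest model (step size 1024 with `n_forward = 5`) needs at least 5121 samples per trajectory, which is worth checking before training.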


In [125]:
datasets[0].device

'cuda'

In [126]:
models = list()
for (step_size, dataset) in zip(step_sizes, datasets):
    print('training model_D{} ...'.format(step_size))
    model_name = 'model_D{}_noise{}.pt'.format(step_size, noise)
    # device (only needed when loading a saved model with the commented-out lines below)
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    # model = torch.load(os.path.join(model_dir, model_name), map_location=device)
    # model.device = device
    # set up the network
    model = net.ResNet(arch=arch, dt=dt, step_size=step_size)
    # training
    model.train_net(dataset, max_epoch=max_epoch, batch_size=batch_size, lr=lr,
                    model_path=os.path.join(model_dir, model_name))
    models.append(model)
    models.append(model)

print('models trained successfully!')
# create/load model object
# try:
#     device = 'cuda' if torch.cuda.is_available() else 'cpu'
#     model = torch.load(os.path.join(model_dir, model_name), map_location=device)
#     model.device = device
# except:
#     print('create model {} ...'.format(model_name))
#     model = net.ResNet(arch=arch, dt=dt, step_size=step_size)
#
# # training
# model.train_net(dataset, max_epoch=max_epoch, batch_size=batch_size, lr=lr,
#                 model_path=os.path.join(model_dir, model_name))
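The log printed by this cell reports training and validation loss every 1000 epochs. A small parser makes it easier to see at which reported epoch the validation loss was lowest. This is a sketch and not part of the original notebook; `best_epoch` is a hypothetical helper, and it assumes the line format `epoch N, training loss X, validation loss Y` seen in the output. The sample lines are the first three `model_D1` entries.

```python
import re

# matches lines such as: "epoch 1000, training loss 0.0046, validation loss 0.0051"
LINE = re.compile(r'epoch (\d+), training loss ([\d.eE+-]+), validation loss ([\d.eE+-]+)')


def best_epoch(log_text):
    """Return (epoch, training_loss, validation_loss) with the lowest validation loss."""
    records = [(int(e), float(t), float(v)) for e, t, v in LINE.findall(log_text)]
    if not records:
        raise ValueError('no epoch lines found in log')
    return min(records, key=lambda r: r[2])


# first three reported epochs of model_D1, copied from the training output
log = """training model_D1 ...
epoch 1000, training loss 0.004603048786520958, validation loss 0.005144072230905294
(--> new model saved @ epoch 1000)
epoch 2000, training loss 0.004512108396738768, validation loss 0.0051805367693305016
epoch 3000, training loss 0.004212073050439358, validation loss 0.005234193056821823
"""

epoch, train_loss, val_loss = best_epoch(log)
print(epoch, val_loss)
```

For `model_D1` the validation loss is lowest at the first reported epoch and then rises while the training loss keeps falling, which matches the single "new model saved" message in the log.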

training model_D1 ...
epoch 1000, training loss 0.004603048786520958, validation loss 0.005144072230905294
(--> new model saved @ epoch 1000)
epoch 2000, training loss 0.004512108396738768, validation loss 0.0051805367693305016
epoch 3000, training loss 0.004212073050439358, validation loss 0.005234193056821823
epoch 4000, training loss 0.00443781353533268, validation loss 0.005252656061202288
epoch 5000, training loss 0.004236592445522547, validation loss 0.005396964494138956
epoch 6000, training loss 0.004090677015483379, validation loss 0.0054458430968225
epoch 7000, training loss 0.004101904109120369, validation loss 0.005491490475833416
epoch 8000, training loss 0.0038857031613588333, validation loss 0.005526662338525057
epoch 9000, training loss 0.003916052635759115, validation loss 0.005612837616354227
epoch 10000, training loss 0.003770360955968499, validation loss 0.005738550331443548
training model_D2 ...
epoch 1000, training loss 0.004377912729978561, validation loss 0.00497