# Evaluation

These are initial results from training a simple multilayered LSTM model to
predict the acoustic features of the next frame.

As a first check of whether a vanilla multilayered LSTM with `MSELoss` is
feasible for this task, the model was trained on **the entire** dataset. The
dataset consists of **4812** datapoints. Each datum consists of 152 frames, with
4 input values and 2 target values per frame. The inputs are the pitch and
intensity of the speaker and of the listener; the targets are the listener's
pitch and intensity in the subsequent frame.

The model is a multilayered LSTM with n layers and h hidden nodes, followed by a
fully connected output layer that maps the hidden state to the desired output
dimensions. Each configuration was trained for 100 epochs on the dataset with
`h=64` and `n=[1,2,5,10,15,20]`. Every backpropagation step covered the entire
152-frame sequence, and the model was optimized with Adam (lr=0.001) on the mean
squared error loss.
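The architecture and optimizer settings described above could be sketched roughly as follows. This is a minimal illustration, not the project's `BasicLstm`: the class name `SketchLstm`, the `batch_first` layout, the batch size of 8, and the random placeholder tensors are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.optim as optim


class SketchLstm(nn.Module):
    """Hypothetical stand-in for the described model (not the project's BasicLstm)."""

    def __init__(self, n_in=4, n_out=2, hidden=64, layers=5):
        super().__init__()
        # Assumed layout: inputs are (batch, frames, features)
        self.lstm = nn.LSTM(n_in, hidden, num_layers=layers, batch_first=True)
        # Fully connected layer mapping each hidden state to the 2 targets
        self.fc = nn.Linear(hidden, n_out)

    def forward(self, x):
        hidden_states, _ = self.lstm(x)  # (batch, frames, hidden)
        return self.fc(hidden_states)    # (batch, frames, n_out)


model = SketchLstm()
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.MSELoss()

# Random placeholder batch: 8 sequences of 152 frames (not real data)
inp = torch.randn(8, 152, 4)
target = torch.randn(8, 152, 2)

# One optimization step over the full 152-frame sequence
optimizer.zero_grad()
out = model(inp)
loss = criterion(out, target)
loss.backward()
optimizer.step()
```

Because the loss is computed on the output of every frame, a single backward pass propagates gradients through all 152 time steps.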

:bulb: **Warning** This model was accidentally trained with teacher forcing on
the entire sequence. Without teacher forcing, the best model behaved the same
way regardless of its input.
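To illustrate the distinction, here is a toy sketch of teacher-forced versus free-running prediction. It assumes that teacher forcing here means feeding the listener's true values back in at every frame rather than the model's own previous prediction. The `rollout` and `toy_step` names are hypothetical, and the averaging function merely stands in for a trained model.

```python
def rollout(step, speaker, listener, teacher_forcing):
    """Predict the listener's next-frame value for each frame.

    step(speaker_frame, listener_frame) returns the predicted next
    listener value. With teacher_forcing=True the ground-truth listener
    value is fed back in; otherwise the model's own prediction is.
    """
    preds = []
    prev = listener[0]
    for t in range(len(speaker) - 1):
        pred = step(speaker[t], prev)
        preds.append(pred)
        prev = listener[t + 1] if teacher_forcing else pred
    return preds


def toy_step(speaker_frame, listener_frame):
    # Toy stand-in for a trained model: average of the two inputs
    return (speaker_frame + listener_frame) / 2


speaker = [1.0, 1.0, 1.0, 1.0]
listener = [0.0, 2.0, 0.0, 2.0]

forced = rollout(toy_step, speaker, listener, teacher_forcing=True)
free = rollout(toy_step, speaker, listener, teacher_forcing=False)
print(forced)  # [0.5, 1.5, 0.5]
print(free)    # [0.5, 0.75, 0.875]
```

In the teacher-forced run every prediction is anchored to the true previous frame, so errors cannot accumulate. In the free-running run the predictions depend only on the model's own earlier outputs.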

The best loss came from using 5 layers.

<img width="200" src='images/basic_lstm_5_layer_loss.png' alt="training" >

Here is a [link](http://130.237.67.222:6007/#scalars) to the complete training
output in tensorboard.

In [1]:
import matplotlib.pyplot as plt
from tqdm import tqdm

from maptask.dataset import DSet
from maptask.models import BasicLstm

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import DataLoader

def plot(out, target):
    """Plot predicted vs. target curves; index 0 is treated as pitch, index 1 as intensity."""
    pitch = out[0].numpy()
    intensity = out[1].numpy()
    target_pitch = target[0].numpy()
    target_intensity = target[1].numpy()

    plt.subplot(2,1,1)
    plt.title('Pitch')
    plt.plot(pitch, 'b', label='Predicted')
    plt.plot(target_pitch, 'r', label='Target')
    plt.legend()
    plt.xlabel('frames')
    plt.ylabel('frequency', color='k')
    plt.tick_params('y', colors='k')
    plt.subplot(2,1,2)
    plt.title('Intensity')
    plt.plot(intensity, 'b', label='Predicted')
    plt.plot(target_intensity, 'r', label='Target')
    plt.legend()
    plt.xlabel('frames')
    plt.ylabel('intensity', color='k')
    plt.tick_params('y', colors='k')
    plt.tight_layout()
    plt.pause(0.1)


# Data
dset = DSet()
dloader = DataLoader(dset, batch_size=64, shuffle=True)

# Pick the device first so the checkpoint can be mapped onto it
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load model
model_path = 'checkpoints/basiclstm_l5_best.pt'

print('Loading model: ', model_path)
model = torch.load(model_path, map_location=device)
model.to(device)
model.lstm.flatten_parameters()
model.eval()

print('Iterate through data set comparing predictions with target: ')


Plot target and predicted values

In [None]:
inp, target = dset.get_random()
inp = inp.unsqueeze(0).to(device)
out = model(inp).squeeze()
plot(out.cpu().detach(), target)

In [None]:
inp, target = dset.get_random()
inp = inp.unsqueeze(0).to(device)
out = model(inp).squeeze()
plot(out.cpu().detach(), target)

In [None]:
inp, target = dset.get_random()
inp = inp.unsqueeze(0).to(device)
out = model(inp).squeeze()
plot(out.cpu().detach(), target)

In [None]:
inp, target = dset.get_random()
inp = inp.unsqueeze(0).to(device)
out = model(inp).squeeze()
plot(out.cpu().detach(), target)