# Training Environment Development 

In this notebook I develop and test the training script for the signature similarity model. The full training run will take place in a separate environment.

### Load Model

In [1]:
from model import BiEncoder

# create model instance with cosine similarity threshold of 0.5
signature_model = BiEncoder(threshold=0.5)

In [2]:
total_params = sum(p.numel() for p in signature_model.parameters())
print(f'Total number of parameters: {total_params}')

Total number of parameters: 460536
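
The count above includes every parameter, whether or not it is trainable. As a minimal sketch, assuming a small stand-in `nn.Sequential` model rather than the real `BiEncoder`, the total and trainable counts could be reported separately. The helper name `count_parameters` is hypothetical.

```python
from typing import Tuple

import torch.nn as nn


def count_parameters(model: nn.Module) -> Tuple[int, int]:
    """Return (total, trainable) parameter counts for a model."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable


# Stand-in model (assumption): two linear layers, the first one frozen.
toy_model = nn.Sequential(nn.Linear(4, 3), nn.Linear(3, 1))
for p in toy_model[0].parameters():
    p.requires_grad = False

total, trainable = count_parameters(toy_model)
print(f"Total parameters: {total:,} | trainable: {trainable:,}")
```

If no layers are frozen, the two numbers are identical.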


### Load Data

In [3]:
import os
import torch

data_dir = os.path.join(os.getcwd(), "data")

# load the training and validation examples and labels

train_examples = torch.load(os.path.join(data_dir, "train_examples.pt"))
train_labels = torch.load(os.path.join(data_dir, "train_labels.pt"))

val_examples = torch.load(os.path.join(data_dir, "val_examples.pt"))
val_labels = torch.load(os.path.join(data_dir, "val_labels.pt"))

# Note: the data tensors do not need requires_grad=True for backpropagation.
# Autograd tracks gradients for the model parameters, which require grad by default.
# Integer-dtype tensors (e.g. class labels) cannot require gradients at all.
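
As a side check on the role of `requires_grad`, the sketch below uses a toy model and random data (both assumptions, not the real `BiEncoder` or signature tensors). It suggests that parameter gradients are populated by `backward()` even when the input tensors are left at their default `requires_grad=False`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins (assumptions): 8 examples with 16 features, binary float labels.
examples = torch.randn(8, 16)
labels = torch.randint(0, 2, (8, 1)).float()

toy_model = nn.Sequential(nn.Linear(16, 1), nn.Sigmoid())
loss = nn.BCELoss()(toy_model(examples), labels)
loss.backward()

# Parameters receive gradients; the input tensor neither needs nor gets one.
print(examples.requires_grad)
print(toy_model[0].weight.grad is not None)
print(examples.grad is None)
```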

### Create Tensor Dataset
`TensorDataset` is a built-in subclass of torch's `Dataset` class that pairs each example with its label. It also provides the `__len__` and `__getitem__` methods that `DataLoader` relies on.

In [4]:
from torch.utils.data import TensorDataset

train_dataset = TensorDataset(train_examples, train_labels)

val_dataset = TensorDataset(val_examples, val_labels)
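
To illustrate what the wrapping provides, here is a minimal sketch on toy tensors (the shapes and values are assumptions, not the real signature data). `len()` returns the number of examples, and indexing returns an (example, label) tuple.

```python
import torch
from torch.utils.data import TensorDataset

# Toy stand-ins (assumptions): 5 examples of 3 features each, one label per example.
examples = torch.arange(15, dtype=torch.float32).reshape(5, 3)
labels = torch.tensor([0.0, 1.0, 1.0, 0.0, 1.0])

dataset = TensorDataset(examples, labels)

print(len(dataset))  # number of examples (size of the first dimension)
x, y = dataset[2]    # indexing returns an (example, label) tuple
print(x, y)
```

All tensors passed to `TensorDataset` must share the same size in their first dimension.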

### Wrap in Dataloader

In [5]:
from torch.utils.data import DataLoader

# create train and val dataloaders
train_dataloader = DataLoader(train_dataset, batch_size=8, shuffle=True)
# validation order does not affect aggregate metrics, so shuffling is unnecessary
val_dataloader = DataLoader(val_dataset, batch_size=8, shuffle=False)
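
A quick way to sanity-check the batching is to iterate over a loader and look at the batch sizes. The sketch below uses a toy dataset of 20 examples (an assumption) and disables shuffling so the result is deterministic. With `batch_size=8`, the last batch holds the remaining 4 examples.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins (assumptions): 20 examples of 4 features each.
dataset = TensorDataset(torch.randn(20, 4), torch.zeros(20))
loader = DataLoader(dataset, batch_size=8, shuffle=False)

batch_sizes = [x.shape[0] for x, _ in loader]
print(batch_sizes)                       # [8, 8, 4]
print(len(loader) == math.ceil(20 / 8))  # True
```

Passing `drop_last=True` to `DataLoader` would discard the smaller final batch instead.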

### Test Training Environment

In [6]:
from train import Trainer
import torch.nn as nn

#set criterion and optimizer
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(signature_model.parameters(), lr=0.001)

#set epochs
epochs = 10

#set device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

#set early stopping patience
patience = 5

#set log frequency for model eval 
log_frequency = 1

# create trainer instance
trainer = Trainer(
    model=signature_model,
    criterion=criterion,
    optimizer=optimizer,
    train_loader=train_dataloader,
    val_loader=val_dataloader,
    epochs=epochs,
    device=device,
    early_stopping_patience=patience,
    log_freq=log_frequency,
)
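
One thing worth checking with `nn.BCELoss` is that it expects inputs in the range [0, 1]. This notebook does not show what `BiEncoder` returns, so the following is only a sketch under the assumption that the model produces a raw cosine similarity in [-1, 1]. In that case the score would need rescaling before it reaches the loss. The embeddings and targets here are random toy values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-ins (assumptions): two batches of 4 embeddings with 32 dimensions.
emb_a = torch.randn(4, 32)
emb_b = torch.randn(4, 32)
targets = torch.tensor([1.0, 0.0, 1.0, 0.0])

cos_sim = F.cosine_similarity(emb_a, emb_b, dim=1)  # values in [-1, 1]
probs = (cos_sim + 1) / 2                           # rescaled to [0, 1]

loss = nn.BCELoss()(probs, targets)
print(loss.item())
```

If the model already applies a sigmoid or similar squashing step, no rescaling is needed.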

Calling the `train` method on the trainer instance starts the training loop.

In [None]:
trainer.train()
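
`train.py` is not shown in this notebook, so the following is only a sketch of how an `early_stopping_patience` setting is commonly interpreted: training stops once the validation loss has failed to improve for `patience` consecutive evaluations. The `EarlyStopper` class and the loss values are hypothetical and are not the `Trainer` implementation.

```python
class EarlyStopper:
    """Track validation loss and signal when to stop (hypothetical helper)."""

    def __init__(self, patience: int) -> None:
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_evals = 0

    def should_stop(self, val_loss: float) -> bool:
        if val_loss < self.best_loss:
            # Improvement: remember it and reset the counter.
            self.best_loss = val_loss
            self.bad_evals = 0
        else:
            # No improvement: count one more unproductive evaluation.
            self.bad_evals += 1
        return self.bad_evals >= self.patience


# Made-up validation losses, one per epoch, for illustration only.
val_losses = [0.70, 0.62, 0.60, 0.61, 0.63, 0.60, 0.64, 0.65]
stopper = EarlyStopper(patience=5)

stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break

print(stopped_at, stopper.best_loss)
```

In this sketch a loss that merely ties the best value counts as no improvement. A real implementation might instead use a minimum-delta threshold.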