This notebook tests the model by predicting on the so-far unseen dataset. It can also be used to load any set of observations and bulk-predict them.

# _Loading in Model_

Choose your model from the ones created by _model_train_ or _warm_start_.

___NB:___ the _saved_models_ folder below is listed in this repo's _.gitignore_ file because of the size of the .pt files.

In [None]:
BEST_MODEL = "saved_models/best_model.pt"

Import the usual packages.

In [None]:
import torch
import pandas as pd
from model import BERTClass, load_checkpoint

Set the _device_ object based on your machine's available processing units.

In [None]:
if torch.cuda.is_available():  # Check for a CUDA GPU
    device = torch.device("cuda")
elif torch.backends.mps.is_available():  # Check for an Apple M1/M2 chip (MPS backend)
    device = torch.device("mps")
else:
    device = torch.device("cpu")  # Otherwise just use the CPU

model = BERTClass()
model.to(device)

Now we can load in our best model.

In [None]:
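# load_checkpoint takes an optimizer and returns it alongside the model,
# so one is created here even though this notebook only predicts.
LEARNING_RATE = 1e-05
optimizer = torch.optim.Adam(params=model.parameters(), lr=LEARNING_RATE)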

model, optimizer, epoch, valid_loss_min_input = load_checkpoint(BEST_MODEL, model, optimizer)
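
Before predicting, it is usually worth putting a loaded PyTorch model into evaluation mode and turning off gradient tracking. The sketch below illustrates the idea on a toy stand-in network (`toy_model` is hypothetical and is not the `BERTClass` model); whether the prediction helpers in this repo already do this is an assumption to check.

```python
import torch

# Toy stand-in for the loaded model (hypothetical; the real one is BERTClass).
toy_model = torch.nn.Sequential(
    torch.nn.Linear(4, 3),
    torch.nn.Dropout(p=0.5),
)

toy_model.eval()  # switch off dropout so repeated calls give the same output

inputs = torch.ones(2, 4)
with torch.no_grad():  # no gradients are needed for inference
    first = toy_model(inputs)
    second = toy_model(inputs)

# In eval mode the two forward passes match exactly.
print(torch.equal(first, second))
```

In training mode the dropout layer would zero out random activations, so two passes over the same input could disagree.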

# Loading Test Data

The next cell reads in the untouched dataset and uses the _data_helpers_ module to prepare the one-hot encoded feature vector.

In [None]:
from data_helpers import feature_prep, df_loader

test_data = pd.read_csv("observations-finaltest.csv")
test_data = feature_prep(test_data)
test_data.head(5)
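
For readers unfamiliar with one-hot encoding, here is a minimal pure-Python sketch of the idea. It is not the `feature_prep` implementation; the helper names and the category labels are hypothetical and only serve the illustration.

```python
def one_hot(label, classes):
    """Return a list with 1 at the position of `label` and 0 elsewhere."""
    if label not in classes:
        raise ValueError(f"unknown label: {label!r}")
    return [1 if c == label else 0 for c in classes]


def decode(vector, classes):
    """Map a one-hot (or score) vector back to the label with the largest entry."""
    return classes[vector.index(max(vector))]


CLASSES = ["alpha", "beta", "gamma"]  # hypothetical category names

encoded = one_hot("beta", CLASSES)
print(encoded)                   # [0, 1, 0]
print(decode(encoded, CLASSES))  # beta
```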

Select a sample of the testing dataset.

In [None]:
TEST_SIZE = 100
RANDOM_STATE = 500
test_sample = test_data.sample(TEST_SIZE, random_state=RANDOM_STATE).reset_index(drop=True)

Batch up into a test loader ready to do predictions.

In [None]:
BATCH_SIZE = 32
test_loader = df_loader(test_sample, BATCH_SIZE)  # batch the sampled rows selected above
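
As a rough picture of what batching does, the sketch below splits a number of rows into consecutive chunks of at most `batch_size`. It is not how `df_loader` is implemented (that lives in `data_helpers`); `batch_slices` is a hypothetical helper, and the numbers simply reuse a 100-row sample and a batch size of 32.

```python
def batch_slices(n_rows, batch_size):
    """Yield (start, stop) index pairs that cover n_rows in order."""
    for start in range(0, n_rows, batch_size):
        yield start, min(start + batch_size, n_rows)


slices = list(batch_slices(100, 32))
sizes = [stop - start for start, stop in slices]
print(sizes)  # [32, 32, 32, 4]
```

The last batch is smaller than the rest whenever the row count is not a multiple of the batch size, so any code that consumes the batches should not assume a fixed size.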

## Predicting Test Data

To break things up, I have provided a couple of handy functions. One uses the loaded model to predict the category and returns the target one-hot encoded tensor.