# Model Evaluation

This notebook demonstrates how you can run my pre-trained model on unseen test data.

In [None]:
# My virtual environment is tracked using `pipenv`.
# From the top directory of the project, run:
!pipenv install

In [None]:
import sys; sys.path.append('..')

from functools import partial
import tqdm, os, json

import pandas as pd
import torch
import torch.nn as nn
import gvpgnn.datasets as datasets
import gvpgnn.models as models
import gvpgnn.paths as paths
import gvpgnn.data_models as dm
import gvpgnn.embeddings as embeddings
import gvpgnn.train_utils as train_utils
import numpy as np
import torch_geometric
from sklearn.metrics import confusion_matrix
from scripts.parser import parser

## Step 1: Preprocess the Data

### Required files:
- A CSV with the same format as `cath_w_seqs_share.csv` that contains test data (the exact filename can be changed below)
- A folder like `pdb_share` containing PDB files (the exact folder can be changed below)
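
Before preprocessing, it can help to confirm that both inputs are where the scripts will look for them. The following is a minimal sketch, not part of the repository: `check_inputs` is a hypothetical helper, and the paths are the placeholders used in this notebook. It only checks that the CSV exists and that the PDB folder contains at least one file; it does not validate the CSV columns or the PDB contents.

```python
from pathlib import Path


def check_inputs(csv_path, pdb_folder):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration, not part of the repository.
    """
    problems = []
    csv_path, pdb_folder = Path(csv_path), Path(pdb_folder)
    if not csv_path.is_file():
        problems.append(f"CSV not found: {csv_path}")
    if not pdb_folder.is_dir():
        problems.append(f"PDB folder not found: {pdb_folder}")
    elif not any(p.is_file() for p in pdb_folder.iterdir()):
        problems.append(f"PDB folder is empty: {pdb_folder}")
    return problems


# Placeholder paths; replace them with your own CSV and PDB folder.
for problem in check_inputs("path_to_your_test_data.csv", "../data/pdb_share"):
    print(problem)
```

If nothing is printed, both inputs were found.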

### 1a: Preprocess the Dataset

For convenience, I map the provided raw data to a format that's easier for my dataloader to use. You'll need to preprocess any unseen test data in the same way by following the next few steps.

In [None]:
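# From the top level of the repo. Each `!` line runs in its own subshell,
# so `cd` has to be chained with the command it applies to:
!cd scripts/ && python preprocess.py \
  --csv path_to_your_test_data.csv \
  --output-folder ../data/challenge_test_set \
  --pdb-folder ../data/pdb_share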

### 1b: Pre-Compute Language Model Embeddings

Next, I precompute language model embeddings for all of the examples in the dataset. These are placed alongside the `JSON` data as `.pt` files. The whole dataset is copied to a new folder to avoid overwriting any of the original data.

In [None]:
# Fetch the pre-trained weights for all language models.
# `cd` is chained because each `!` line runs in its own subshell:
!cd scripts/ && python download_esm.py

# Then precompute the embeddings:
!cd scripts/ && python precompute_embeddings.py --in-dataset ../data/challenge_test_set
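
Since the embeddings are written alongside the JSON data, a quick way to spot an incomplete run is to look for JSON files without a matching `.pt` file. This is only a sketch under an assumption: it supposes that `name.json` is paired with `name.pt` in the same folder, which is a guess about the naming, and `missing_embeddings` and the folder name are hypothetical.

```python
from pathlib import Path


def missing_embeddings(dataset_dir):
    """List JSON files under dataset_dir that have no sibling .pt file.

    Assumes each `name.json` is paired with `name.pt` in the same folder;
    that naming is a guess for illustration, so adjust it to the real layout.
    """
    dataset_dir = Path(dataset_dir)
    return sorted(
        json_path
        for json_path in dataset_dir.rglob("*.json")
        if not json_path.with_suffix(".pt").is_file()
    )


# Hypothetical folder name; point this at the folder the script wrote to.
dataset_dir = Path("../data/your_embedded_dataset")
if dataset_dir.is_dir():
    missing = missing_embeddings(dataset_dir)
    print(f"{len(missing)} example(s) without embeddings")
```

An empty result means every JSON file found has an embedding file next to it.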

## Step 2: Run the Model on the Test Set

With the pre-processed test dataset, you can run the `train.py` script in test mode.

You should see a progress bar appear as batches are processed.

Finally, you should get a printout of metrics like the following:
```text
*** Test set results:
Loss: 1.4339 acc: 0.5044
Accuracy (top1): 50.4362%
----------------------------------------------------

	(1, 10)	(1, 20)	(2, 30)	(2, 40)	(2, 60)	(3, 10)	(3, 20)	(3, 30)	(3, 40)	(3, 90)	Count
(1, 10)	659	265	0	0	0	0	0	68	0	8	132
(1, 20)	287	704	0	0	0	0	0	9	0	0	115
(2, 30)	0	0	39	305	367	109	0	180	0	0	128
(2, 40)	0	0	0	436	257	57	0	193	0	57	140
(2, 60)	0	0	0	31	915	15	0	38	0	0	130
(3, 10)	0	0	0	0	0	108	0	892	0	0	139
(3, 20)	0	0	0	0	0	0	1000	0	0	0	104
(3, 30)	44	0	0	0	88	88	0	555	182	44	137
(3, 40)	23	30	0	0	0	8	23	197	667	53	132
(3, 90)	0	0	0	0	731	269	0	0	0	0	104
```
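
The matrix can be cross-checked against the headline accuracy. Each row sums to roughly 1000, which suggests the entries are per-row rates in parts per thousand, with `Count` giving the number of test examples in that row's class. That reading is inferred from the numbers, not stated by the script. Under that assumption, weighting each diagonal entry by its row count should reproduce the reported top-1 accuracy up to rounding:

```python
# Diagonal entries and Count column copied from the printout above.
# Assumption: entries are per-row rates in parts per thousand.
diagonal = [659, 704, 39, 436, 915, 108, 1000, 555, 667, 0]
counts = [132, 115, 128, 140, 130, 139, 104, 137, 132, 104]

# Estimated number of correct predictions per class, summed over classes.
correct = sum(d / 1000 * n for d, n in zip(diagonal, counts))
accuracy = correct / sum(counts)
print(f"Reconstructed top-1 accuracy: {accuracy:.2%}")
```

The result lands at about 50.44%, in line with the 50.4362% reported in the printout; the small gap is consistent with the matrix entries being rounded.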

In [None]:
# `cd` is chained because each `!` line runs in its own subshell:
!cd scripts && ./test_best_model.sh