# Supervised Topt prediction with Prime

This tutorial demonstrates how to predict the optimal temperature (Topt) of a protein from its sequence using a pretrained Prime model.

We provide:

- The sequences, as a FASTA file.

Goal

Obtain a predicted Topt for each sequence in the file.


## Configure the import path

In [2]:
import sys
sys.path.append('..')  # add the parent directory so that `prime` can be imported

## Import the necessary libraries and modules

In [3]:
from prime.model import SupervisedRegression, Config
import torch
import pandas as pd
from Bio import SeqIO
from tqdm.notebook import tqdm

## Prepare data path

In [4]:
sequence_file = "example.fasta"
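Before loading the model, it can help to confirm that the FASTA file is readable and holds the expected number of records. The following is a minimal sketch using only the standard library. The helper name `count_fasta_records` is hypothetical (it is not part of Prime or Biopython), and it assumes that every record starts with a header line beginning with `>`.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count FASTA records by counting header lines (lines starting with '>')."""
    with Path(path).open() as handle:
        return sum(1 for line in handle if line.startswith(">"))


# In this tutorial the call would be:
#     count_fasta_records(sequence_file)
```

If the count is zero or differs from what you expect, check the path and the file contents before running the prediction loop.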

## Load model

In [5]:
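model_path = "../checkpoints/prime_topt.pt"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SupervisedRegression(Config())
# map_location lets a checkpoint saved on a GPU load on a CPU-only machine
model.load_state_dict(torch.load(model_path, map_location=device))
model.eval()  # switch to inference mode (disables dropout)
model = model.to(device)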

## Prediction

In [7]:
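topt = []
with torch.no_grad():  # inference only, so gradient tracking is not needed
    for record in tqdm(list(SeqIO.parse(sequence_file, "fasta"))):
        sequence = str(record.seq)
        sequence_ids = model.tokenize(sequence).to(device)
        # One unpadded sequence per call, so every position is attended to;
        # ones_like already places the mask on the same device as sequence_ids
        attention_mask = torch.ones_like(sequence_ids)
        logits = model(input_ids=sequence_ids, attention_mask=attention_mask)[0]
        topt.append(logits.item())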


In [8]:
topt

[40.687538146972656,
 43.96522521972656,
 43.535552978515625,
 37.044471740722656,
 39.86117172241211,
 58.468631744384766,
 50.20270538330078,
 34.72401428222656,
 54.70647048950195,
 37.22149658203125,
 36.48097610473633,
 42.725868225097656,
 32.847328186035156,
 37.01423645019531]
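The predictions are appended in the same order as the records in the FASTA file, so each value can be paired with its record ID and saved for later use. The sketch below shows one way to do this with the standard library `csv` module; a pandas DataFrame would work equally well, since pandas is already imported. The function names `write_predictions` and `extremes` and the column name `predicted_topt` are hypothetical choices for illustration, not part of Prime.

```python
import csv


def write_predictions(ids, values, out_path):
    """Write (id, predicted Topt) pairs to a CSV file and return the rows."""
    if len(ids) != len(values):
        raise ValueError("ids and values must have the same length")
    rows = list(zip(ids, values))
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id", "predicted_topt"])  # header row
        writer.writerows(rows)
    return rows


def extremes(rows):
    """Return the (id, value) pairs with the lowest and highest prediction."""
    lowest = min(rows, key=lambda row: row[1])
    highest = max(rows, key=lambda row: row[1])
    return lowest, highest
```

To use it in this notebook, collect `record.id` into a list alongside each prediction inside the loop, then pass that list and `topt` to `write_predictions` together with an output path of your choice.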