# Supervised OGT prediction with Prime

This tutorial shows how to predict the optimal growth temperature (OGT) for a protein sequence using a pretrained Prime model.

We provide:

- The sequences, a FASTA file.

Goal:

Obtain a predicted OGT for each sequence.


## Config for imports

In [9]:
import sys
sys.path.append('..')

## Import the necessary libraries and modules

In [10]:
from prime.model import SupervisedRegression, Config
import torch
import pandas as pd
from Bio import SeqIO
from tqdm.notebook import tqdm

## Prepare data path

In [11]:
sequence_file = "example.fasta"
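
Before loading the model, it can help to confirm that the FASTA file parses and contains only the residues you expect. The sketch below uses only the standard library. `read_fasta` and `nonstandard_residues` are hypothetical helpers, not part of Prime or Biopython, and the 20-letter alphabet is an assumption about what the tokenizer accepts.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Assumption: the 20 canonical amino acids; Prime's tokenizer may accept more.
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")


def read_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse FASTA text lines into (record_id, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            # The record ID is the first whitespace-delimited token after '>'
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def nonstandard_residues(records: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each record ID to the residues it contains outside CANONICAL."""
    report: Dict[str, Set[str]] = {}
    for rec_id, seq in records:
        extra = set(seq.upper()) - CANONICAL
        if extra:
            report[rec_id] = extra
    return report
```

Passing an open handle on `sequence_file` to `read_fasta` and then calling `nonstandard_residues` on the result gives a quick report of records that may need attention; an empty dictionary means every sequence uses only the canonical alphabet.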

## Load model

In [13]:
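model_path = "../checkpoints/prime_ogt.pt"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SupervisedRegression(Config())
# map_location lets a checkpoint saved on a GPU load on a CPU-only machine
model.load_state_dict(torch.load(model_path, map_location=device))
model.eval()  # inference mode (affects layers such as dropout)
model = model.to(device)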

## Prediction

In [15]:
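togt = []  # predicted OGT per sequence, in FASTA order
with torch.no_grad():  # gradients are not needed for inference
    for record in tqdm(list(SeqIO.parse(sequence_file, "fasta"))):
        sequence = str(record.seq)
        sequence_ids = model.tokenize(sequence).to(device)
        # Single unpadded sequence, so every position is attended to;
        # ones_like already places the mask on the same device
        attention_mask = torch.ones_like(sequence_ids)
        logits = model(input_ids=sequence_ids, attention_mask=attention_mask)[0]
        togt.append(logits.item())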

In [16]:
togt

[26.19342803955078,
 26.880943298339844,
 21.495868682861328,
 28.51004409790039,
 22.163387298583984,
 31.906055450439453,
 26.732147216796875,
 26.881406784057617,
 34.62818908691406,
 27.475366592407227,
 24.651473999023438,
 23.612808227539062,
 24.937366485595703,
 26.754934310913086]
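
The list above is positional: each value matches a FASTA record only by index. A minimal sketch of pairing predictions with record IDs and serializing them as CSV follows. `pair_predictions` and `to_csv_text` are hypothetical helper names, and the column names `id` and `predicted_ogt` are assumptions. In this notebook, the IDs could be collected from `record.id` inside the prediction loop.

```python
import csv
import io
from typing import List, Sequence, Tuple


def pair_predictions(ids: Sequence[str], values: Sequence[float],
                     ndigits: int = 2) -> List[Tuple[str, float]]:
    """Pair each record ID with its rounded prediction, checking lengths."""
    if len(ids) != len(values):
        raise ValueError(f"{len(ids)} IDs but {len(values)} predictions")
    return [(i, round(float(v), ndigits)) for i, v in zip(ids, values)]


def to_csv_text(rows: Sequence[Tuple[str, float]]) -> str:
    """Render (id, predicted_ogt) rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["id", "predicted_ogt"])  # assumed column names
    writer.writerows(rows)
    return buffer.getvalue()
```

The length check guards against a silent mismatch if a record is skipped during prediction. The returned text can be written to a file of your choice, or the rows can be passed to `pd.DataFrame` if you prefer to stay in pandas.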