# Experiment Overview

In this experiment, I compared several neural network architectures for protein family classification, a problem with practical applications in bioinformatics and drug discovery. Accurately assigning protein sequences to families helps in understanding their structure, function, and evolution, and supports the search for potential drug targets and new treatments.

Protein family classification is closely related to natural language processing (NLP) tasks such as text classification and sentiment analysis. A protein sequence, like a sentence, is a sequence of symbols (amino acid residues) that can be represented as tokens. In both cases, accurate classification depends on capturing the context in which each token appears.
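
As a minimal sketch of treating a protein as a token sequence, the snippet below maps residues to integer ids and pads to a fixed length. The 24-symbol alphabet, the padding symbol, and the helper `encode` are assumptions for illustration; the notebook itself only fixes `vocab_len = 24` and `max_seq_len = 128`.

```python
# Hypothetical 24-symbol alphabet: id 0 is padding, ids 1-20 are the standard
# amino acids, and the last three cover unknown or ambiguous residues.
# Only the size (24) comes from the notebook; the symbol set is an assumption.
PAD = "-"
ALPHABET = PAD + "ACDEFGHIKLMNPQRSTVWY" + "XBZ"
TOKEN_TO_ID = {symbol: index for index, symbol in enumerate(ALPHABET)}
UNKNOWN_ID = TOKEN_TO_ID["X"]


def encode(sequence, max_seq_len=128):
    """Map residues to integer ids, then truncate or right-pad to max_seq_len."""
    ids = [TOKEN_TO_ID.get(residue, UNKNOWN_ID) for residue in sequence.upper()]
    ids = ids[:max_seq_len]
    return ids + [TOKEN_TO_ID[PAD]] * (max_seq_len - len(ids))


encoded = encode("MKVLAAGIU")  # "U" is not in the alphabet and maps to UNKNOWN_ID
print(len(ALPHABET), len(encoded), encoded[:10])
```

Fixed-length integer sequences like this are what a one-hot encoding or an embedding layer would consume downstream.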

I ran a series of experiments on a subset of the Pfam dataset, restricted to the 100 most frequent protein families to keep the analysis manageable. My aim was to compare different models and determine which deep learning method works best for this problem. The metric I optimized throughout is the F1 score, a standard performance metric in protein sequence classification.
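
For reference, here is a small sketch of a multi-class F1 computation. The notebook does not say which averaging is used, so macro averaging (the unweighted mean of per-class F1) is an assumption here, and `macro_f1` is a hypothetical helper rather than the function behind the reported scores.

```python
from collections import Counter


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1; a class with no hits or misses scores 0."""
    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            true_pos[truth] += 1
        else:
            false_pos[guess] += 1  # predicted class gains a false positive
            false_neg[truth] += 1  # true class gains a false negative

    scores = []
    for label in range(num_classes):
        # F1 = 2TP / (2TP + FP + FN)
        denominator = 2 * true_pos[label] + false_pos[label] + false_neg[label]
        scores.append(2 * true_pos[label] / denominator if denominator else 0.0)
    return sum(scores) / num_classes


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(round(macro_f1(y_true, y_pred, num_classes=3), 4))
```

With 100 families of different sizes, macro averaging weights every family equally, which is one reason F1 can be more informative than plain accuracy.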

To this end, I tested six model architectures: Fully-Connected Feed-Forward Network, Recurrent Neural Network (RNN), RNN with embedding, Long Short-Term Memory (LSTM), Bi-Directional LSTM, and Transformer (encoder only). For each one, I ran a hyperparameter search and then trained a final model with the best parameters found.
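
The trial loop behind a hyperparameter search can be sketched as follows. This is a plain random search used as a stand-in, not the tuner inside `tuning_and_training`; the search ranges and `toy_objective` are assumptions for illustration only.

```python
import random


def random_search(objective, no_trials, seed=0):
    """Sample hyperparameters uniformly and keep the best-scoring trial."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(no_trials):
        params = {
            "hidden_dim": rng.randint(64, 512),  # assumed search range
            "num_layers": rng.randint(0, 4),     # assumed search range
        }
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    """Placeholder for 'train for a few epochs and return the validation F1'."""
    return 1.0 / (1.0 + abs(params["hidden_dim"] - 128) + params["num_layers"])


best_params, best_score = random_search(toy_objective, no_trials=20)
print(best_params, round(best_score, 4))
```

In a real run the objective would train the model for a few epochs with the sampled parameters and return its validation F1, which is the value each trial reports.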

Through these experiments, I aimed to show the strengths and weaknesses of each architecture and to identify which deep learning methods work best for protein family classification. The results may be useful to researchers and practitioners in protein sequence analysis, where better classification supports both the study of protein structure and function and the identification of new drug targets.

In [5]:
from operations import tuning_and_training, test, training_with_parameters

In [2]:
class_gap = 1
num_classes = 100
max_seq_len = 128
vocab_len = 24
batch_size = 64

## Fully Connected Model

The first model is a Fully-Connected Feed-Forward Network, chosen as a simple and fast baseline. I tuned its hidden dimension and number of layers to maximize the F1 score.
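
To get a feel for how these two hyperparameters control model size, here is a small parameter-count sketch. It rests on assumptions not stated in the notebook: that the input is a flattened one-hot encoding with `max_seq_len * vocab_len` features, and that `num_layers` counts additional hidden layers beyond the first (which would be consistent with trials reporting `nl 0`). The helper `fc_parameter_count` is hypothetical.

```python
def fc_parameter_count(hidden_dim, num_layers, max_seq_len=128, vocab_len=24, num_classes=100):
    """Weights plus biases for input->hidden, num_layers x (hidden->hidden), hidden->classes."""
    input_dim = max_seq_len * vocab_len  # assumes a flattened one-hot input
    first = input_dim * hidden_dim + hidden_dim
    extra = num_layers * (hidden_dim * hidden_dim + hidden_dim)
    output = hidden_dim * num_classes + num_classes
    return first + extra + output


# Three configurations that appear in the tuning log.
for hidden_dim, num_layers in [(65, 0), (173, 2), (469, 0)]:
    print(hidden_dim, num_layers, fc_parameter_count(hidden_dim, num_layers))
```

Under these assumptions the first layer dominates the count, so `hidden_dim` changes the model size far more than `num_layers` does.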

In [3]:
model_name = "fc"
label = model_name + "_iter1"
no_trials = 20

In [3]:
tuning_and_training(
    model_name=model_name,
    no_trials=no_trials,
    class_gap=class_gap,
    num_classes=num_classes,
    max_seq_len=max_seq_len,
    vocab_len=vocab_len,
    batch_size=batch_size,
)

[I 2023-04-10 19:57:07,190] A new study created in memory with name: no-name-c18a80a1-aea9-4e1d-929e-d9c04400c3f6


FC Trial 1 | hd 379, nl 1
Epoch   1/  5, train loss: 0.06, train f1: 0.08, val loss: 0.06, val f1: 0.11, duration: 6.2s


[I 2023-04-10 19:57:32,601] Trial 0 finished with value: 0.17193517807328434 and parameters: {'hidden_dim': 379, 'num_layers': 1}. Best is trial 0 with value: 0.17193517807328434.


Epoch   5/  5, train loss: 0.06, train f1: 0.15, val loss: 0.06, val f1: 0.17, duration: 4.5s
FC Trial 2 | hd 173, nl 2
Epoch   1/  5, train loss: 0.06, train f1: 0.12, val loss: 0.05, val f1: 0.20, duration: 4.6s


[I 2023-04-10 19:57:58,192] Trial 1 finished with value: 0.266648465701802 and parameters: {'hidden_dim': 173, 'num_layers': 2}. Best is trial 1 with value: 0.266648465701802.


Epoch   5/  5, train loss: 0.05, train f1: 0.21, val loss: 0.05, val f1: 0.27, duration: 4.6s
FC Trial 3 | hd 345, nl 0
Epoch   1/  5, train loss: 0.06, train f1: 0.07, val loss: 0.06, val f1: 0.12, duration: 3.9s


[I 2023-04-10 19:58:19,592] Trial 2 finished with value: 0.21747621017293706 and parameters: {'hidden_dim': 345, 'num_layers': 0}. Best is trial 1 with value: 0.266648465701802.


Epoch   5/  5, train loss: 0.06, train f1: 0.18, val loss: 0.05, val f1: 0.22, duration: 4.4s
FC Trial 4 | hd 469, nl 0
Epoch   1/  5, train loss: 0.07, train f1: 0.04, val loss: 0.07, val f1: 0.05, duration: 3.6s


[I 2023-04-10 19:58:40,061] Trial 3 finished with value: 0.080621592900695 and parameters: {'hidden_dim': 469, 'num_layers': 0}. Best is trial 1 with value: 0.266648465701802.


Epoch   5/  5, train loss: 0.06, train f1: 0.07, val loss: 0.07, val f1: 0.08, duration: 3.5s
FC Trial 5 | hd 93, nl 0
Epoch   1/  5, train loss: 0.05, train f1: 0.27, val loss: 0.04, val f1: 0.50, duration: 3.5s


[I 2023-04-10 19:59:00,514] Trial 4 finished with value: 0.5736501056764914 and parameters: {'hidden_dim': 93, 'num_layers': 0}. Best is trial 4 with value: 0.5736501056764914.


Epoch   5/  5, train loss: 0.04, train f1: 0.44, val loss: 0.03, val f1: 0.57, duration: 3.5s
FC Trial 6 | hd 220, nl 2
Epoch   1/  5, train loss: 0.06, train f1: 0.15, val loss: 0.05, val f1: 0.25, duration: 5.2s


[I 2023-04-10 19:59:29,267] Trial 5 finished with value: 0.28115887052554883 and parameters: {'hidden_dim': 220, 'num_layers': 2}. Best is trial 4 with value: 0.5736501056764914.


Epoch   5/  5, train loss: 0.05, train f1: 0.23, val loss: 0.05, val f1: 0.28, duration: 4.7s
FC Trial 7 | hd 350, nl 1
Epoch   1/  5, train loss: 0.06, train f1: 0.08, val loss: 0.06, val f1: 0.11, duration: 4.5s


[I 2023-04-10 19:59:53,351] Trial 6 finished with value: 0.1545625286783351 and parameters: {'hidden_dim': 350, 'num_layers': 1}. Best is trial 4 with value: 0.5736501056764914.


Epoch   5/  5, train loss: 0.06, train f1: 0.13, val loss: 0.06, val f1: 0.15, duration: 4.3s
FC Trial 8 | hd 306, nl 1
Epoch   1/  5, train loss: 0.06, train f1: 0.06, val loss: 0.06, val f1: 0.10, duration: 4.5s


[I 2023-04-10 20:00:17,659] Trial 7 finished with value: 0.12524805161554953 and parameters: {'hidden_dim': 306, 'num_layers': 1}. Best is trial 4 with value: 0.5736501056764914.


Epoch   5/  5, train loss: 0.06, train f1: 0.11, val loss: 0.06, val f1: 0.13, duration: 4.2s
FC Trial 9 | hd 492, nl 1
Epoch   1/  5, train loss: 0.06, train f1: 0.06, val loss: 0.06, val f1: 0.08, duration: 4.2s


[I 2023-04-10 20:00:42,075] Trial 8 finished with value: 0.1366059448201182 and parameters: {'hidden_dim': 492, 'num_layers': 1}. Best is trial 4 with value: 0.5736501056764914.


Epoch   5/  5, train loss: 0.06, train f1: 0.11, val loss: 0.06, val f1: 0.14, duration: 4.9s
FC Trial 10 | hd 284, nl 1
Epoch   1/  5, train loss: 0.06, train f1: 0.11, val loss: 0.06, val f1: 0.18, duration: 4.8s


[I 2023-04-10 20:01:07,372] Trial 9 finished with value: 0.22621305301297784 and parameters: {'hidden_dim': 284, 'num_layers': 1}. Best is trial 4 with value: 0.5736501056764914.


Epoch   5/  5, train loss: 0.05, train f1: 0.19, val loss: 0.05, val f1: 0.23, duration: 4.2s
FC Trial 11 | hd 65, nl 0
Epoch   1/  5, train loss: 0.05, train f1: 0.26, val loss: 0.04, val f1: 0.54, duration: 3.5s


[I 2023-04-10 20:01:28,364] Trial 10 finished with value: 0.6306640491658346 and parameters: {'hidden_dim': 65, 'num_layers': 0}. Best is trial 10 with value: 0.6306640491658346.


Epoch   5/  5, train loss: 0.04, train f1: 0.47, val loss: 0.03, val f1: 0.63, duration: 4.0s
FC Trial 12 | hd 67, nl 0
Epoch   1/  5, train loss: 0.05, train f1: 0.25, val loss: 0.04, val f1: 0.49, duration: 4.0s


[I 2023-04-10 20:01:51,296] Trial 11 finished with value: 0.5961351939691282 and parameters: {'hidden_dim': 67, 'num_layers': 0}. Best is trial 10 with value: 0.6306640491658346.


Epoch   5/  5, train loss: 0.04, train f1: 0.44, val loss: 0.03, val f1: 0.60, duration: 3.9s
FC Trial 13 | hd 73, nl 0
Epoch   1/  5, train loss: 0.05, train f1: 0.24, val loss: 0.04, val f1: 0.47, duration: 3.8s


[I 2023-04-10 20:02:13,372] Trial 12 finished with value: 0.5665516037727157 and parameters: {'hidden_dim': 73, 'num_layers': 0}. Best is trial 10 with value: 0.6306640491658346.


Epoch   5/  5, train loss: 0.04, train f1: 0.43, val loss: 0.03, val f1: 0.57, duration: 3.8s
FC Trial 14 | hd 139, nl 0
Epoch   1/  5, train loss: 0.05, train f1: 0.31, val loss: 0.03, val f1: 0.51, duration: 4.3s


[I 2023-04-10 20:02:36,905] Trial 13 finished with value: 0.5700343208839727 and parameters: {'hidden_dim': 139, 'num_layers': 0}. Best is trial 10 with value: 0.6306640491658346.


Epoch   5/  5, train loss: 0.04, train f1: 0.46, val loss: 0.03, val f1: 0.57, duration: 5.2s
FC Trial 15 | hd 64, nl 0
Epoch   1/  5, train loss: 0.05, train f1: 0.25, val loss: 0.04, val f1: 0.49, duration: 4.6s


[I 2023-04-10 20:02:59,199] Trial 14 finished with value: 0.5816243011647051 and parameters: {'hidden_dim': 64, 'num_layers': 0}. Best is trial 10 with value: 0.6306640491658346.


Epoch   5/  5, train loss: 0.04, train f1: 0.44, val loss: 0.03, val f1: 0.58, duration: 3.6s
FC Trial 16 | hd 197, nl 0
Epoch   1/  5, train loss: 0.06, train f1: 0.17, val loss: 0.05, val f1: 0.31, duration: 3.6s


[I 2023-04-10 20:03:19,815] Trial 15 finished with value: 0.40996327304128827 and parameters: {'hidden_dim': 197, 'num_layers': 0}. Best is trial 10 with value: 0.6306640491658346.


Epoch   5/  5, train loss: 0.05, train f1: 0.32, val loss: 0.04, val f1: 0.41, duration: 3.6s
FC Trial 17 | hd 131, nl 0
Epoch   1/  5, train loss: 0.05, train f1: 0.25, val loss: 0.04, val f1: 0.41, duration: 3.8s


[I 2023-04-10 20:03:40,661] Trial 16 finished with value: 0.475578751306846 and parameters: {'hidden_dim': 131, 'num_layers': 0}. Best is trial 10 with value: 0.6306640491658346.


Epoch   5/  5, train loss: 0.04, train f1: 0.38, val loss: 0.04, val f1: 0.48, duration: 3.9s
FC Trial 18 | hd 233, nl 2
Epoch   1/  5, train loss: 0.06, train f1: 0.18, val loss: 0.05, val f1: 0.29, duration: 4.8s


[I 2023-04-10 20:04:07,156] Trial 17 finished with value: 0.3596663182480019 and parameters: {'hidden_dim': 233, 'num_layers': 2}. Best is trial 10 with value: 0.6306640491658346.


Epoch   5/  5, train loss: 0.05, train f1: 0.28, val loss: 0.04, val f1: 0.36, duration: 4.7s
FC Trial 19 | hd 130, nl 0
Epoch   1/  5, train loss: 0.05, train f1: 0.22, val loss: 0.04, val f1: 0.38, duration: 3.9s


[I 2023-04-10 20:04:28,604] Trial 18 finished with value: 0.44319996948945184 and parameters: {'hidden_dim': 130, 'num_layers': 0}. Best is trial 10 with value: 0.6306640491658346.


Epoch   5/  5, train loss: 0.04, train f1: 0.34, val loss: 0.04, val f1: 0.44, duration: 4.2s
FC Trial 20 | hd 115, nl 1
Epoch   1/  5, train loss: 0.05, train f1: 0.21, val loss: 0.04, val f1: 0.37, duration: 4.3s


[I 2023-04-10 20:04:53,438] Trial 19 finished with value: 0.45963048183520394 and parameters: {'hidden_dim': 115, 'num_layers': 1}. Best is trial 10 with value: 0.6306640491658346.


Epoch   5/  5, train loss: 0.05, train f1: 0.34, val loss: 0.04, val f1: 0.46, duration: 4.6s
Best parameters: {'hidden_dim': 65, 'num_layers': 0}
FC Model Best Params | hd 65, nl 0
Epoch   1/ 10, train loss: 0.05, train f1: 0.25, val loss: 0.04, val f1: 0.49, duration: 3.9s
Epoch   5/ 10, train loss: 0.04, train f1: 0.44, val loss: 0.03, val f1: 0.60, duration: 3.8s
Epoch  10/ 10, train loss: 0.04, train f1: 0.46, val loss: 0.03, val f1: 0.61, duration: 3.9s


In [4]:
_, test_f1 = test(label)

Test F1 score: 0.6144948415345701


## RNN

Next, I used a Recurrent Neural Network (RNN), which reads the sequence one residue at a time and carries a hidden state forward, so it can model dependencies between positions in the protein sequence. As with the feed-forward network, I tuned the hidden dimension and number of layers.
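
The recurrence can be sketched in a few lines of plain Python. This is a single-layer Elman-style cell on one-hot inputs with random, untrained weights; the names and sizes are assumptions for illustration and not the model defined in `operations`.

```python
import math
import random


def rnn_forward(token_ids, w_in, w_rec, bias):
    """Apply h_t = tanh(W_in x_t + W_rec h_{t-1} + b) over the sequence; return the last state."""
    hidden = [0.0] * len(bias)
    for token in token_ids:
        hidden = [
            math.tanh(
                w_in[j][token]  # W_in times a one-hot vector selects one column
                + sum(w_rec[j][k] * hidden[k] for k in range(len(hidden)))
                + bias[j]
            )
            for j in range(len(bias))
        ]
    return hidden


rng = random.Random(0)
vocab_len, hidden_dim = 24, 4  # hidden_dim kept tiny for readability
w_in = [[rng.uniform(-0.5, 0.5) for _ in range(vocab_len)] for _ in range(hidden_dim)]
w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)] for _ in range(hidden_dim)]
bias = [0.0] * hidden_dim

state_a = rnn_forward([11, 9, 18], w_in, w_rec, bias)
state_b = rnn_forward([18, 9, 11], w_in, w_rec, bias)  # same tokens, reversed order
print(state_a)
print(state_b)
```

Because each state depends on the previous one, the same residues in a different order produce a different final state, which is the property a classifier head on top of the RNN relies on.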

In [7]:
model_name = "rnn"
label = model_name + "_iter1"
no_trials = 20

In [4]:
tuning_and_training(
    model_name=model_name,
    no_trials=no_trials,
    class_gap=class_gap,
    num_classes=num_classes,
    max_seq_len=max_seq_len,
    vocab_len=vocab_len,
    batch_size=batch_size,
)

[I 2023-04-10 20:38:16,147] A new study created in memory with name: no-name-89057f18-87d3-48f2-8f8e-8e0794cf8f50


RNN Trial 0 | hd 132, nl 3
Epoch   1/  5, train loss: 0.03, train f1: 0.55, val loss: 0.02, val f1: 0.70, duration: 8.2s


[I 2023-04-10 20:39:01,492] Trial 0 finished with value: 0.7657530711353085 and parameters: {'hidden_dim': 132, 'num_layers': 3}. Best is trial 0 with value: 0.7657530711353085.


Epoch   5/  5, train loss: 0.01, train f1: 0.77, val loss: 0.02, val f1: 0.75, duration: 8.2s
RNN Trial 1 | hd 129, nl 2
Epoch   1/  5, train loss: 0.03, train f1: 0.56, val loss: 0.02, val f1: 0.66, duration: 7.1s


[I 2023-04-10 20:39:38,199] Trial 1 finished with value: 0.8246473585215491 and parameters: {'hidden_dim': 129, 'num_layers': 2}. Best is trial 1 with value: 0.8246473585215491.


Epoch   5/  5, train loss: 0.01, train f1: 0.82, val loss: 0.01, val f1: 0.82, duration: 6.9s
RNN Trial 2 | hd 72, nl 3
Epoch   1/  5, train loss: 0.04, train f1: 0.44, val loss: 0.02, val f1: 0.60, duration: 7.2s


[I 2023-04-10 20:40:16,997] Trial 2 finished with value: 0.7630807568828937 and parameters: {'hidden_dim': 72, 'num_layers': 3}. Best is trial 1 with value: 0.8246473585215491.


Epoch   5/  5, train loss: 0.02, train f1: 0.74, val loss: 0.01, val f1: 0.76, duration: 7.4s
RNN Trial 3 | hd 256, nl 4
Epoch   1/  5, train loss: 0.03, train f1: 0.61, val loss: 0.02, val f1: 0.75, duration: 21.3s


[I 2023-04-10 20:42:10,975] Trial 3 finished with value: 0.8159934543438717 and parameters: {'hidden_dim': 256, 'num_layers': 4}. Best is trial 1 with value: 0.8246473585215491.


Epoch   5/  5, train loss: 0.03, train f1: 0.60, val loss: 0.07, val f1: 0.00, duration: 22.7s
RNN Trial 4 | hd 127, nl 4
Epoch   1/  5, train loss: 0.03, train f1: 0.53, val loss: 0.02, val f1: 0.67, duration: 8.9s


[I 2023-04-10 20:42:56,271] Trial 4 finished with value: 0.8272877721445038 and parameters: {'hidden_dim': 127, 'num_layers': 4}. Best is trial 4 with value: 0.8272877721445038.


Epoch   5/  5, train loss: 0.01, train f1: 0.80, val loss: 0.01, val f1: 0.83, duration: 8.8s
RNN Trial 5 | hd 241, nl 3
Epoch   1/  5, train loss: 0.02, train f1: 0.62, val loss: 0.02, val f1: 0.74, duration: 17.0s


[I 2023-04-10 20:44:25,354] Trial 5 finished with value: 0.8136566085722821 and parameters: {'hidden_dim': 241, 'num_layers': 3}. Best is trial 4 with value: 0.8272877721445038.


Epoch   5/  5, train loss: 0.05, train f1: 0.25, val loss: 0.03, val f1: 0.54, duration: 17.4s
RNN Trial 6 | hd 158, nl 2
Epoch   1/  5, train loss: 0.03, train f1: 0.57, val loss: 0.02, val f1: 0.71, duration: 7.2s


[I 2023-04-10 20:45:03,960] Trial 6 finished with value: 0.8215657115137107 and parameters: {'hidden_dim': 158, 'num_layers': 2}. Best is trial 4 with value: 0.8272877721445038.


Epoch   5/  5, train loss: 0.01, train f1: 0.81, val loss: 0.01, val f1: 0.82, duration: 7.3s
RNN Trial 7 | hd 222, nl 3
Epoch   1/  5, train loss: 0.02, train f1: 0.61, val loss: 0.02, val f1: 0.68, duration: 12.2s


[I 2023-04-10 20:46:08,237] Trial 7 finished with value: 0.8289530151288308 and parameters: {'hidden_dim': 222, 'num_layers': 3}. Best is trial 7 with value: 0.8289530151288308.


Epoch   5/  5, train loss: 0.01, train f1: 0.82, val loss: 0.01, val f1: 0.82, duration: 12.6s
RNN Trial 8 | hd 118, nl 4
Epoch   1/  5, train loss: 0.03, train f1: 0.49, val loss: 0.03, val f1: 0.54, duration: 8.8s


[I 2023-04-10 20:46:53,939] Trial 8 finished with value: 0.8025297804522031 and parameters: {'hidden_dim': 118, 'num_layers': 4}. Best is trial 7 with value: 0.8289530151288308.


Epoch   5/  5, train loss: 0.01, train f1: 0.78, val loss: 0.01, val f1: 0.80, duration: 8.8s
RNN Trial 9 | hd 137, nl 2
Epoch   1/  5, train loss: 0.03, train f1: 0.56, val loss: 0.02, val f1: 0.71, duration: 6.9s


[I 2023-04-10 20:47:31,101] Trial 9 finished with value: 0.8294072242208843 and parameters: {'hidden_dim': 137, 'num_layers': 2}. Best is trial 9 with value: 0.8294072242208843.


Epoch   5/  5, train loss: 0.01, train f1: 0.82, val loss: 0.01, val f1: 0.83, duration: 6.9s
RNN Trial 10 | hd 195, nl 2
Epoch   1/  5, train loss: 0.03, train f1: 0.59, val loss: 0.02, val f1: 0.70, duration: 8.5s


[I 2023-04-10 20:48:13,738] Trial 10 finished with value: 0.8230254925203694 and parameters: {'hidden_dim': 195, 'num_layers': 2}. Best is trial 9 with value: 0.8294072242208843.


Epoch   5/  5, train loss: 0.01, train f1: 0.81, val loss: 0.01, val f1: 0.82, duration: 8.0s
RNN Trial 11 | hd 209, nl 2
Epoch   1/  5, train loss: 0.03, train f1: 0.61, val loss: 0.02, val f1: 0.73, duration: 7.8s


[I 2023-04-10 20:48:55,639] Trial 11 finished with value: 0.7784551916433063 and parameters: {'hidden_dim': 209, 'num_layers': 2}. Best is trial 9 with value: 0.8294072242208843.


Epoch   5/  5, train loss: 0.01, train f1: 0.81, val loss: 0.01, val f1: 0.78, duration: 7.8s
RNN Trial 12 | hd 190, nl 3
Epoch   1/  5, train loss: 0.03, train f1: 0.58, val loss: 0.02, val f1: 0.73, duration: 10.8s


[I 2023-04-10 20:49:52,800] Trial 12 finished with value: 0.8239071926154452 and parameters: {'hidden_dim': 190, 'num_layers': 3}. Best is trial 9 with value: 0.8294072242208843.


Epoch   5/  5, train loss: 0.01, train f1: 0.78, val loss: 0.01, val f1: 0.81, duration: 11.4s
RNN Trial 13 | hd 165, nl 2
Epoch   1/  5, train loss: 0.03, train f1: 0.56, val loss: 0.02, val f1: 0.69, duration: 7.6s


[I 2023-04-10 20:50:31,930] Trial 13 finished with value: 0.8248584844551898 and parameters: {'hidden_dim': 165, 'num_layers': 2}. Best is trial 9 with value: 0.8294072242208843.


Epoch   5/  5, train loss: 0.01, train f1: 0.81, val loss: 0.01, val f1: 0.81, duration: 7.2s
RNN Trial 14 | hd 80, nl 3
Epoch   1/  5, train loss: 0.04, train f1: 0.43, val loss: 0.03, val f1: 0.61, duration: 7.3s


[I 2023-04-10 20:51:11,076] Trial 14 finished with value: 0.755001962481192 and parameters: {'hidden_dim': 80, 'num_layers': 3}. Best is trial 9 with value: 0.8294072242208843.


Epoch   5/  5, train loss: 0.01, train f1: 0.75, val loss: 0.02, val f1: 0.73, duration: 7.6s
RNN Trial 15 | hd 225, nl 2
Epoch   1/  5, train loss: 0.03, train f1: 0.61, val loss: 0.02, val f1: 0.74, duration: 10.2s


[I 2023-04-10 20:52:05,400] Trial 15 finished with value: 0.795551682956296 and parameters: {'hidden_dim': 225, 'num_layers': 2}. Best is trial 9 with value: 0.8294072242208843.


Epoch   5/  5, train loss: 0.01, train f1: 0.79, val loss: 0.01, val f1: 0.76, duration: 10.5s
RNN Trial 16 | hd 162, nl 4
Epoch   1/  5, train loss: 0.03, train f1: 0.56, val loss: 0.02, val f1: 0.71, duration: 13.0s


[I 2023-04-10 20:53:15,410] Trial 16 finished with value: 0.8142344215750955 and parameters: {'hidden_dim': 162, 'num_layers': 4}. Best is trial 9 with value: 0.8294072242208843.


Epoch   5/  5, train loss: 0.01, train f1: 0.81, val loss: 0.01, val f1: 0.81, duration: 13.7s
RNN Trial 17 | hd 101, nl 3
Epoch   1/  5, train loss: 0.03, train f1: 0.50, val loss: 0.02, val f1: 0.68, duration: 7.4s


[I 2023-04-10 20:53:56,765] Trial 17 finished with value: 0.8143658419665017 and parameters: {'hidden_dim': 101, 'num_layers': 3}. Best is trial 9 with value: 0.8294072242208843.


Epoch   5/  5, train loss: 0.01, train f1: 0.78, val loss: 0.01, val f1: 0.81, duration: 7.7s
RNN Trial 18 | hd 173, nl 2
Epoch   1/  5, train loss: 0.03, train f1: 0.57, val loss: 0.02, val f1: 0.71, duration: 7.8s


[I 2023-04-10 20:54:38,793] Trial 18 finished with value: 0.8231934867828025 and parameters: {'hidden_dim': 173, 'num_layers': 2}. Best is trial 9 with value: 0.8294072242208843.


Epoch   5/  5, train loss: 0.01, train f1: 0.82, val loss: 0.01, val f1: 0.82, duration: 7.8s
RNN Trial 19 | hd 216, nl 3
Epoch   1/  5, train loss: 0.02, train f1: 0.62, val loss: 0.02, val f1: 0.76, duration: 13.3s


[I 2023-04-10 20:55:47,973] Trial 19 finished with value: 0.7591574301485434 and parameters: {'hidden_dim': 216, 'num_layers': 3}. Best is trial 9 with value: 0.8294072242208843.


Epoch   5/  5, train loss: 0.05, train f1: 0.20, val loss: 0.04, val f1: 0.38, duration: 13.2s
Best parameters: {'hidden_dim': 137, 'num_layers': 2}
RNN Model Best Params  | hd 137, nl 2
Epoch   1/ 10, train loss: 0.03, train f1: 0.55, val loss: 0.02, val f1: 0.69, duration: 7.7s
Epoch   5/ 10, train loss: 0.01, train f1: 0.82, val loss: 0.01, val f1: 0.83, duration: 7.5s
Epoch  10/ 10, train loss: 0.01, train f1: 0.84, val loss: 0.01, val f1: 0.86, duration: 7.8s


In [5]:
_, test_f1 = test(label)

Test F1 score: 0.8583505189918742


## RNN (With Embedding)

Adding an embedding layer to the RNN replaces each token with a learned dense vector, giving a more compact and expressive input representation. Here I tuned the embedding size together with the hidden dimension and number of layers.
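
What an embedding layer does can be sketched as a table lookup. The code below is illustrative only: the table is randomly initialised rather than learned, and `make_embedding`, `one_hot`, and `embed_sequence` are hypothetical helpers, not part of `operations`.

```python
import random


def make_embedding(vocab_len, embed_size, seed=0):
    """One row of embed_size floats per token id (random here, trainable in a real model)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(embed_size)] for _ in range(vocab_len)]


def one_hot(token, vocab_len):
    return [1.0 if index == token else 0.0 for index in range(vocab_len)]


def embed_sequence(token_ids, table):
    """Replace each token id with its row of the embedding table."""
    return [table[token] for token in token_ids]


# 29 is the embed_size of the best trial reported in this section.
table = make_embedding(vocab_len=24, embed_size=29)
tokens = [11, 9, 18]
dense = embed_sequence(tokens, table)

# The lookup gives the same vector as multiplying a one-hot vector by the table.
first_one_hot = one_hot(tokens[0], 24)
via_matmul = [
    sum(first_one_hot[row] * table[row][col] for row in range(24))
    for col in range(29)
]
print(len(dense), len(dense[0]), via_matmul == dense[0])
```

The lookup is mathematically a linear layer on one-hot inputs, so the embedding mainly changes the input width seen by the recurrent layers (here 29 instead of 24) and lets that projection be tuned separately.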

In [17]:
model_name = "rnn_embed"
label = model_name + "_iter1"
no_trials = 20

In [18]:
tuning_and_training(
    model_name=model_name,
    no_trials=no_trials,
    class_gap=class_gap,
    num_classes=num_classes,
    max_seq_len=max_seq_len,
    vocab_len=vocab_len,
    batch_size=batch_size,
)

[I 2023-04-11 13:35:11,327] A new study created in memory with name: no-name-4117103e-2b91-4454-b151-ac64d8e17ba2
[W 2023-04-11 13:35:11,330] Trial 0 failed with parameters: {'hidden_dim': 424, 'num_layers': 3, 'embed_size': 59} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:35:11,332] Trial 0 failed with value None.
[W 2023-04-11 13:35:11,333] Trial 1 failed with parameters: {'hidden_dim': 477, 'num_layers': 4, 'embed_size': 56} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:35:11,334] Trial 1 failed with value None.


RNNEmbed Trial 2 | hd 343, nl 3, ne 55
Epoch   1/  5, train loss: 0.02, train f1: 0.68, val loss: 0.01, val f1: 0.80, duration: 97.4s


[I 2023-04-11 13:43:33,470] Trial 2 finished with value: 0.8257862570707792 and parameters: {'hidden_dim': 343, 'num_layers': 3, 'embed_size': 55}. Best is trial 2 with value: 0.8257862570707792.


Epoch   5/  5, train loss: 0.01, train f1: 0.80, val loss: 0.01, val f1: 0.80, duration: 100.2s
RNNEmbed Trial 3 | hd 192, nl 3, ne 48
Epoch   1/  5, train loss: 0.02, train f1: 0.63, val loss: 0.02, val f1: 0.74, duration: 15.0s


[I 2023-04-11 13:44:49,442] Trial 3 finished with value: 0.8334338859224628 and parameters: {'hidden_dim': 192, 'num_layers': 3, 'embed_size': 48}. Best is trial 3 with value: 0.8334338859224628.
[W 2023-04-11 13:44:49,444] Trial 4 failed with parameters: {'hidden_dim': 412, 'num_layers': 4, 'embed_size': 64} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:44:49,445] Trial 4 failed with value None.


Epoch   5/  5, train loss: 0.01, train f1: 0.80, val loss: 0.01, val f1: 0.82, duration: 14.9s
RNNEmbed Trial 5 | hd 206, nl 4, ne 63
Epoch   1/  5, train loss: 0.02, train f1: 0.62, val loss: 0.02, val f1: 0.75, duration: 21.1s


[I 2023-04-11 13:46:36,712] Trial 5 finished with value: 0.8129091021318721 and parameters: {'hidden_dim': 206, 'num_layers': 4, 'embed_size': 63}. Best is trial 3 with value: 0.8334338859224628.
[W 2023-04-11 13:46:36,714] Trial 6 failed with parameters: {'hidden_dim': 421, 'num_layers': 4, 'embed_size': 50} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:46:36,716] Trial 6 failed with value None.


Epoch   5/  5, train loss: 0.02, train f1: 0.73, val loss: 0.02, val f1: 0.61, duration: 21.3s
RNNEmbed Trial 7 | hd 176, nl 2, ne 64
Epoch   1/  5, train loss: 0.02, train f1: 0.61, val loss: 0.02, val f1: 0.75, duration: 9.8s


[I 2023-04-11 13:47:29,131] Trial 7 finished with value: 0.8174893097410383 and parameters: {'hidden_dim': 176, 'num_layers': 2, 'embed_size': 64}. Best is trial 3 with value: 0.8334338859224628.


Epoch   5/  5, train loss: 0.01, train f1: 0.82, val loss: 0.04, val f1: 0.47, duration: 10.4s
RNNEmbed Trial 8 | hd 298, nl 4, ne 58
Epoch   1/  5, train loss: 0.02, train f1: 0.65, val loss: 0.01, val f1: 0.78, duration: 106.1s


[I 2023-04-11 13:56:32,162] Trial 8 finished with value: 0.8203824717583259 and parameters: {'hidden_dim': 298, 'num_layers': 4, 'embed_size': 58}. Best is trial 3 with value: 0.8334338859224628.


Epoch   5/  5, train loss: 0.01, train f1: 0.79, val loss: 0.01, val f1: 0.77, duration: 108.3s
RNNEmbed Trial 9 | hd 381, nl 2, ne 53
Epoch   1/  5, train loss: 0.02, train f1: 0.70, val loss: 0.02, val f1: 0.72, duration: 80.0s


[I 2023-04-11 14:03:14,943] Trial 9 finished with value: 0.8445741137537678 and parameters: {'hidden_dim': 381, 'num_layers': 2, 'embed_size': 53}. Best is trial 9 with value: 0.8445741137537678.


Epoch   5/  5, train loss: 0.01, train f1: 0.82, val loss: 0.01, val f1: 0.81, duration: 79.9s
RNNEmbed Trial 10 | hd 197, nl 3, ne 62
Epoch   1/  5, train loss: 0.02, train f1: 0.63, val loss: 0.02, val f1: 0.75, duration: 16.2s


[I 2023-04-11 14:04:37,043] Trial 10 finished with value: 0.8237202379556726 and parameters: {'hidden_dim': 197, 'num_layers': 3, 'embed_size': 62}. Best is trial 9 with value: 0.8445741137537678.


Epoch   5/  5, train loss: 0.01, train f1: 0.79, val loss: 0.01, val f1: 0.82, duration: 15.9s
RNNEmbed Trial 11 | hd 85, nl 4, ne 27
Epoch   1/  5, train loss: 0.03, train f1: 0.49, val loss: 0.02, val f1: 0.67, duration: 13.4s


[I 2023-04-11 14:05:42,433] Trial 11 finished with value: 0.801955580873739 and parameters: {'hidden_dim': 85, 'num_layers': 4, 'embed_size': 27}. Best is trial 9 with value: 0.8445741137537678.
[W 2023-04-11 14:05:42,435] Trial 12 failed with parameters: {'hidden_dim': 382, 'num_layers': 4, 'embed_size': 23} because of the following error: The value None could not be cast to float..
[W 2023-04-11 14:05:42,436] Trial 12 failed with value None.


Epoch   5/  5, train loss: 0.01, train f1: 0.77, val loss: 0.01, val f1: 0.80, duration: 12.1s
RNNEmbed Trial 13 | hd 228, nl 3, ne 21
Epoch   1/  5, train loss: 0.02, train f1: 0.63, val loss: 0.01, val f1: 0.76, duration: 20.6s


[I 2023-04-11 14:07:33,631] Trial 13 finished with value: 0.8237525709533449 and parameters: {'hidden_dim': 228, 'num_layers': 3, 'embed_size': 21}. Best is trial 9 with value: 0.8445741137537678.


Epoch   5/  5, train loss: 0.01, train f1: 0.77, val loss: 0.01, val f1: 0.82, duration: 22.2s
RNNEmbed Trial 14 | hd 251, nl 4, ne 29
Epoch   1/  5, train loss: 0.02, train f1: 0.64, val loss: 0.01, val f1: 0.76, duration: 27.4s


[I 2023-04-11 14:09:50,644] Trial 14 finished with value: 0.8458220784700217 and parameters: {'hidden_dim': 251, 'num_layers': 4, 'embed_size': 29}. Best is trial 14 with value: 0.8458220784700217.
[W 2023-04-11 14:09:50,660] Trial 15 failed with parameters: {'hidden_dim': 481, 'num_layers': 4, 'embed_size': 35} because of the following error: The value None could not be cast to float..
[W 2023-04-11 14:09:50,661] Trial 15 failed with value None.
[W 2023-04-11 14:09:50,677] Trial 16 failed with parameters: {'hidden_dim': 495, 'num_layers': 4, 'embed_size': 34} because of the following error: The value None could not be cast to float..
[W 2023-04-11 14:09:50,678] Trial 16 failed with value None.
[W 2023-04-11 14:09:50,698] Trial 17 failed with parameters: {'hidden_dim': 500, 'num_layers': 4, 'embed_size': 35} because of the following error: The value None could not be cast to float..
[W 2023-04-11 14:09:5

Epoch   5/  5, train loss: 0.01, train f1: 0.82, val loss: 0.01, val f1: 0.85, duration: 26.8s
Best parameters: {'hidden_dim': 251, 'num_layers': 4, 'embed_size': 29}
RNNEmbed Model Best Params  |  hd 251, nl 4, ne 29
Epoch   1/ 10, train loss: 0.02, train f1: 0.64, val loss: 0.01, val f1: 0.77, duration: 27.0s
Epoch   5/ 10, train loss: 0.01, train f1: 0.81, val loss: 0.01, val f1: 0.83, duration: 26.5s
Epoch  10/ 10, train loss: 0.01, train f1: 0.76, val loss: 0.01, val f1: 0.77, duration: 26.4s


In [19]:
_, test_f1 = test(label)

Test F1 score: 0.8318741277213964


## LSTM

To mitigate the vanishing gradient problem of plain RNNs, I used a Long Short-Term Memory (LSTM) network, whose gated cell state makes long-range dependencies easier to learn. I tuned the hidden dimension, number of layers, and embedding size.
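
The gating idea can be illustrated with a single-unit LSTM step on scalar inputs. This is a toy sketch with hand-picked weights, chosen only to show the cell state persisting over many steps; `lstm_step` and `params` are hypothetical and unrelated to the trained model.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def lstm_step(x, hidden, cell, params):
    """One step of a single-unit LSTM; params maps gate name -> (w_x, w_h, b)."""
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * hidden + b)

    forget = gate("forget", sigmoid)          # share of the old cell state to keep
    write = gate("input", sigmoid)            # share of the candidate to write
    candidate = gate("candidate", math.tanh)  # proposed new content
    output = gate("output", sigmoid)          # share of the cell state to expose

    cell = forget * cell + write * candidate  # additive update of the memory
    hidden = output * math.tanh(cell)
    return hidden, cell


params = {
    "forget": (0.0, 0.0, 10.0),   # forget gate saturated near 1: keep memory
    "input": (10.0, 0.0, -5.0),   # write mostly when x is clearly positive
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 10.0),
}

hidden, cell = 0.0, 0.0
for x in [1.0] + [0.0] * 50:  # one informative input followed by 50 blanks
    hidden, cell = lstm_step(x, hidden, cell, params)
print(round(cell, 3), round(hidden, 3))
```

Because the cell state is updated additively and scaled by a forget gate that can stay close to 1, information from the first step survives the 50 blank steps, whereas a plain tanh recurrence would have to push it through a nonlinearity at every step.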

In [8]:
model_name = "lstm"
label = model_name + "_iter1"
no_trials = 20

In [7]:
tuning_and_training(
    model_name=model_name,
    no_trials=no_trials,
    class_gap=class_gap,
    num_classes=num_classes,
    max_seq_len=max_seq_len,
    vocab_len=vocab_len,
    batch_size=batch_size,
)

[I 2023-04-10 21:34:45,394] A new study created in memory with name: no-name-a2c9fc94-f09b-4bf0-9626-62aa4ff84cea


LSTM Trial 0 | hd 79, nl 3, ne 52
Epoch   1/  5, train loss: 0.02, train f1: 0.63, val loss: 0.01, val f1: 0.85, duration: 12.2s


[I 2023-04-10 21:35:46,520] Trial 0 finished with value: 0.9763102819167817 and parameters: {'hidden_dim': 79, 'num_layers': 3, 'embed_size': 52}. Best is trial 0 with value: 0.9763102819167817.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 12.0s
LSTM Trial 1 | hd 135, nl 3, ne 53
Epoch   1/  5, train loss: 0.02, train f1: 0.72, val loss: 0.01, val f1: 0.87, duration: 71.6s


[I 2023-04-10 22:02:34,867] Trial 1 finished with value: 0.9689115951884782 and parameters: {'hidden_dim': 135, 'num_layers': 3, 'embed_size': 53}. Best is trial 0 with value: 0.9763102819167817.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.97, duration: 502.8s
LSTM Trial 2 | hd 201, nl 3, ne 23
Epoch   1/  5, train loss: 0.02, train f1: 0.74, val loss: 0.01, val f1: 0.89, duration: 123.4s


[I 2023-04-10 22:12:56,582] Trial 2 finished with value: 0.9777497404821043 and parameters: {'hidden_dim': 201, 'num_layers': 3, 'embed_size': 23}. Best is trial 2 with value: 0.9777497404821043.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 123.7s
LSTM Trial 3 | hd 223, nl 4, ne 62
Epoch   1/  5, train loss: 0.02, train f1: 0.74, val loss: 0.01, val f1: 0.90, duration: 152.1s


[I 2023-04-10 22:25:45,380] Trial 3 finished with value: 0.9714767460568406 and parameters: {'hidden_dim': 223, 'num_layers': 4, 'embed_size': 62}. Best is trial 2 with value: 0.9777497404821043.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.97, duration: 154.6s
LSTM Trial 4 | hd 212, nl 2, ne 64
Epoch   1/  5, train loss: 0.01, train f1: 0.81, val loss: 0.00, val f1: 0.93, duration: 77.7s


[I 2023-04-10 22:29:31,550] Trial 4 finished with value: 0.9802261085890176 and parameters: {'hidden_dim': 212, 'num_layers': 2, 'embed_size': 64}. Best is trial 4 with value: 0.9802261085890176.


Epoch   5/  5, train loss: 0.00, train f1: 0.99, val loss: 0.00, val f1: 0.98, duration: 28.6s
LSTM Trial 5 | hd 109, nl 4, ne 18
Epoch   1/  5, train loss: 0.03, train f1: 0.60, val loss: 0.01, val f1: 0.82, duration: 24.9s


[I 2023-04-10 22:31:39,334] Trial 5 finished with value: 0.9530815128309272 and parameters: {'hidden_dim': 109, 'num_layers': 4, 'embed_size': 18}. Best is trial 4 with value: 0.9802261085890176.


Epoch   5/  5, train loss: 0.00, train f1: 0.95, val loss: 0.00, val f1: 0.95, duration: 25.3s
LSTM Trial 6 | hd 248, nl 3, ne 36
Epoch   1/  5, train loss: 0.01, train f1: 0.78, val loss: 0.00, val f1: 0.92, duration: 48.3s


[I 2023-04-10 22:35:45,902] Trial 6 finished with value: 0.973759973832268 and parameters: {'hidden_dim': 248, 'num_layers': 3, 'embed_size': 36}. Best is trial 4 with value: 0.9802261085890176.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.97, duration: 48.2s
LSTM Trial 7 | hd 127, nl 2, ne 35
Epoch   1/  5, train loss: 0.02, train f1: 0.73, val loss: 0.01, val f1: 0.89, duration: 14.3s


[I 2023-04-10 22:36:58,944] Trial 7 finished with value: 0.977492171405636 and parameters: {'hidden_dim': 127, 'num_layers': 2, 'embed_size': 35}. Best is trial 4 with value: 0.9802261085890176.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 14.1s
LSTM Trial 8 | hd 106, nl 2, ne 42
Epoch   1/  5, train loss: 0.02, train f1: 0.71, val loss: 0.01, val f1: 0.88, duration: 13.5s


[I 2023-04-10 22:38:08,277] Trial 8 finished with value: 0.9711897651025452 and parameters: {'hidden_dim': 106, 'num_layers': 2, 'embed_size': 42}. Best is trial 4 with value: 0.9802261085890176.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.97, duration: 13.6s
LSTM Trial 9 | hd 205, nl 3, ne 54
Epoch   1/  5, train loss: 0.01, train f1: 0.78, val loss: 0.00, val f1: 0.92, duration: 49.5s


[I 2023-04-10 22:42:20,068] Trial 9 finished with value: 0.9726219483233273 and parameters: {'hidden_dim': 205, 'num_layers': 3, 'embed_size': 54}. Best is trial 4 with value: 0.9802261085890176.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.97, duration: 49.9s
LSTM Trial 10 | hd 168, nl 2, ne 64
Epoch   1/  5, train loss: 0.01, train f1: 0.78, val loss: 0.01, val f1: 0.92, duration: 67.4s


[I 2023-04-10 22:47:59,894] Trial 10 finished with value: 0.9797704364501594 and parameters: {'hidden_dim': 168, 'num_layers': 2, 'embed_size': 64}. Best is trial 4 with value: 0.9802261085890176.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 67.5s
LSTM Trial 11 | hd 172, nl 2, ne 62
Epoch   1/  5, train loss: 0.01, train f1: 0.78, val loss: 0.00, val f1: 0.92, duration: 69.1s


[I 2023-04-10 22:53:48,490] Trial 11 finished with value: 0.9774561064940153 and parameters: {'hidden_dim': 172, 'num_layers': 2, 'embed_size': 62}. Best is trial 4 with value: 0.9802261085890176.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 69.3s
LSTM Trial 12 | hd 163, nl 2, ne 64
Epoch   1/  5, train loss: 0.01, train f1: 0.78, val loss: 0.01, val f1: 0.91, duration: 68.6s


[I 2023-04-10 22:59:34,179] Trial 12 finished with value: 0.9775010266444588 and parameters: {'hidden_dim': 163, 'num_layers': 2, 'embed_size': 64}. Best is trial 4 with value: 0.9802261085890176.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 68.8s
LSTM Trial 13 | hd 186, nl 2, ne 48
Epoch   1/  5, train loss: 0.01, train f1: 0.78, val loss: 0.01, val f1: 0.91, duration: 76.5s


[I 2023-04-10 23:05:58,681] Trial 13 finished with value: 0.9761399946876184 and parameters: {'hidden_dim': 186, 'num_layers': 2, 'embed_size': 48}. Best is trial 4 with value: 0.9802261085890176.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 76.3s
LSTM Trial 14 | hd 256, nl 2, ne 59
Epoch   1/  5, train loss: 0.01, train f1: 0.81, val loss: 0.00, val f1: 0.93, duration: 32.0s


[I 2023-04-10 23:08:38,290] Trial 14 finished with value: 0.9734037690522187 and parameters: {'hidden_dim': 256, 'num_layers': 2, 'embed_size': 59}. Best is trial 4 with value: 0.9802261085890176.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.97, duration: 31.3s
LSTM Trial 15 | hd 226, nl 2, ne 45
Epoch   1/  5, train loss: 0.01, train f1: 0.80, val loss: 0.01, val f1: 0.92, duration: 34.1s


[I 2023-04-10 23:11:30,958] Trial 15 finished with value: 0.9763310614004304 and parameters: {'hidden_dim': 226, 'num_layers': 2, 'embed_size': 45}. Best is trial 4 with value: 0.9802261085890176.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.97, duration: 34.1s
LSTM Trial 16 | hd 146, nl 2, ne 30
Epoch   1/  5, train loss: 0.02, train f1: 0.72, val loss: 0.01, val f1: 0.90, duration: 52.4s


[I 2023-04-10 23:15:56,670] Trial 16 finished with value: 0.9802672846781546 and parameters: {'hidden_dim': 146, 'num_layers': 2, 'embed_size': 30}. Best is trial 16 with value: 0.9802672846781546.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 52.8s
LSTM Trial 17 | hd 144, nl 2, ne 29
Epoch   1/  5, train loss: 0.02, train f1: 0.72, val loss: 0.01, val f1: 0.90, duration: 50.0s


[I 2023-04-10 23:20:09,438] Trial 17 finished with value: 0.9773707125766059 and parameters: {'hidden_dim': 144, 'num_layers': 2, 'embed_size': 29}. Best is trial 16 with value: 0.9802672846781546.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 50.2s
LSTM Trial 18 | hd 69, nl 4, ne 26
Epoch   1/  5, train loss: 0.03, train f1: 0.50, val loss: 0.01, val f1: 0.76, duration: 15.0s


[I 2023-04-10 23:21:25,935] Trial 18 finished with value: 0.9587981420245398 and parameters: {'hidden_dim': 69, 'num_layers': 4, 'embed_size': 26}. Best is trial 16 with value: 0.9802672846781546.


Epoch   5/  5, train loss: 0.00, train f1: 0.96, val loss: 0.00, val f1: 0.96, duration: 14.8s
LSTM Trial 19 | hd 191, nl 3, ne 33
Epoch   1/  5, train loss: 0.02, train f1: 0.75, val loss: 0.01, val f1: 0.90, duration: 122.5s


[I 2023-04-10 23:31:55,738] Trial 19 finished with value: 0.9786399176814623 and parameters: {'hidden_dim': 191, 'num_layers': 3, 'embed_size': 33}. Best is trial 16 with value: 0.9802672846781546.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 125.5s
Best parameters: {'hidden_dim': 146, 'num_layers': 2, 'embed_size': 30}
LSTM Model Best Params  | hd 146, nl 2, ne 30
Epoch   1/ 10, train loss: 0.02, train f1: 0.73, val loss: 0.01, val f1: 0.89, duration: 53.6s
Epoch   5/ 10, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 52.5s
Epoch  10/ 10, train loss: 0.00, train f1: 0.99, val loss: 0.00, val f1: 0.99, duration: 52.6s


In [8]:
_, test_f1 = test(label)

Test F1 score: 0.9830174947534691
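
With 100 protein families, a single F1 number depends on how the per-class scores are combined. The sketch below shows one common choice, macro averaging, in plain Python. The averaging mode is an assumption for illustration: the notebook output does not state which variant `test` reports, and `macro_f1` is a hypothetical helper, not a function from this project.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging).

    Assumption: macro averaging is used here for illustration only; the
    notebook's own `test` function may average differently.
    """
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        per_class.append(2 * tp / denominator if denominator else 0.0)
    return sum(per_class) / len(per_class)


# Three toy classes standing in for protein families.
true_labels = [0, 0, 1, 1, 2, 2]
predicted = [0, 0, 1, 2, 2, 2]
print(f"macro F1: {macro_f1(true_labels, predicted):.4f}")
```

Because every class counts equally under macro averaging, a model that does poorly on a small family is penalized as much as one that does poorly on a large family.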


## BiLSTM

A Bi-Directional LSTM processes each sequence in both directions, so the representation at every position draws on the residues before and after it. As with the LSTM, we tune the hidden dimension, number of layers, and embedding size.
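
To make the idea of reading in both directions concrete, here is a toy sketch in plain Python. It is not the model trained in this notebook: a decayed running sum stands in for the LSTM cell, and the names `run_direction`, `bidirectional_states`, and `decay` are assumptions made for illustration. What carries over to a real BiLSTM is the structure: one pass left to right, one pass right to left, and the two states paired at each position.

```python
def run_direction(values, decay=0.5):
    """Return one state per position; state t depends only on values[0..t].

    A decayed running sum is used as a stand-in for a recurrent cell.
    """
    states, state = [], 0.0
    for value in values:
        state = decay * state + value
        states.append(state)
    return states


def bidirectional_states(values, decay=0.5):
    """Pair each position's forward state with its backward state."""
    forward = run_direction(values, decay)
    # Run the same recurrence over the reversed sequence, then flip the result
    # so that backward[t] summarizes values[t..end].
    backward = run_direction(values[::-1], decay)[::-1]
    return list(zip(forward, backward))


sequence = [1.0, 0.0, 0.0, 2.0]
for position, (fwd, bwd) in enumerate(bidirectional_states(sequence)):
    print(f"position {position}: forward={fwd}, backward={bwd}")
```

At position 0 the forward state has seen only the first value, while the backward state already reflects the large value at the end of the sequence. A unidirectional model would not have that information until its last step.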

In [11]:
model_name = "bi-lstm"
label = model_name + "_iter1"
no_trials = 20

In [12]:
tuning_and_training(
    model_name = model_name,
    no_trials = no_trials,
    class_gap = class_gap,
    num_classes = num_classes,
    max_seq_len = max_seq_len,
    vocab_len = vocab_len,
    batch_size = batch_size
)

[I 2023-04-11 10:19:56,865] A new study created in memory with name: no-name-deb74425-8393-48b0-9f7d-fecb289e8c47


BiLSTM Trial 0 | hd 231, nl 3, ne 32
Epoch   1/  5, train loss: 0.01, train f1: 0.79, val loss: 0.00, val f1: 0.92, duration: 84.3s


[I 2023-04-11 10:27:49,734] Trial 0 finished with value: 0.9799523546218828 and parameters: {'hidden_dim': 231, 'num_layers': 3, 'embed_size': 32}. Best is trial 0 with value: 0.9799523546218828.


Epoch   5/  5, train loss: 0.00, train f1: 0.99, val loss: 0.00, val f1: 0.98, duration: 101.8s
BiLSTM Trial 1 | hd 111, nl 4, ne 30
Epoch   1/  5, train loss: 0.02, train f1: 0.68, val loss: 0.01, val f1: 0.88, duration: 52.0s


[I 2023-04-11 10:32:07,804] Trial 1 finished with value: 0.9717341806653917 and parameters: {'hidden_dim': 111, 'num_layers': 4, 'embed_size': 30}. Best is trial 0 with value: 0.9799523546218828.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.97, duration: 50.9s
BiLSTM Trial 2 | hd 183, nl 2, ne 37
Epoch   1/  5, train loss: 0.01, train f1: 0.79, val loss: 0.00, val f1: 0.92, duration: 138.3s


[I 2023-04-11 10:44:02,239] Trial 2 finished with value: 0.9805935944563495 and parameters: {'hidden_dim': 183, 'num_layers': 2, 'embed_size': 37}. Best is trial 2 with value: 0.9805935944563495.


Epoch   5/  5, train loss: 0.00, train f1: 0.99, val loss: 0.00, val f1: 0.98, duration: 145.3s
BiLSTM Trial 3 | hd 204, nl 4, ne 63
Epoch   1/  5, train loss: 0.01, train f1: 0.77, val loss: 0.00, val f1: 0.92, duration: 89.1s


[I 2023-04-11 10:51:34,245] Trial 3 finished with value: 0.9814587265733016 and parameters: {'hidden_dim': 204, 'num_layers': 4, 'embed_size': 63}. Best is trial 3 with value: 0.9814587265733016.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 92.8s
BiLSTM Trial 4 | hd 128, nl 3, ne 26
Epoch   1/  5, train loss: 0.02, train f1: 0.72, val loss: 0.01, val f1: 0.90, duration: 40.4s


[I 2023-04-11 10:55:03,064] Trial 4 finished with value: 0.9734063302487573 and parameters: {'hidden_dim': 128, 'num_layers': 3, 'embed_size': 26}. Best is trial 3 with value: 0.9814587265733016.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.97, duration: 41.0s
BiLSTM Trial 5 | hd 156, nl 2, ne 62
Epoch   1/  5, train loss: 0.01, train f1: 0.80, val loss: 0.00, val f1: 0.93, duration: 99.6s


[I 2023-04-11 11:03:24,292] Trial 5 finished with value: 0.9815632253559958 and parameters: {'hidden_dim': 156, 'num_layers': 2, 'embed_size': 62}. Best is trial 5 with value: 0.9815632253559958.


Epoch   5/  5, train loss: 0.00, train f1: 0.99, val loss: 0.00, val f1: 0.98, duration: 100.0s
BiLSTM Trial 6 | hd 103, nl 2, ne 16
Epoch   1/  5, train loss: 0.02, train f1: 0.66, val loss: 0.01, val f1: 0.88, duration: 25.3s


[I 2023-04-11 11:05:31,125] Trial 6 finished with value: 0.9814369500671026 and parameters: {'hidden_dim': 103, 'num_layers': 2, 'embed_size': 16}. Best is trial 5 with value: 0.9815632253559958.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 24.7s
BiLSTM Trial 7 | hd 256, nl 3, ne 23
Epoch   1/  5, train loss: 0.01, train f1: 0.77, val loss: 0.00, val f1: 0.92, duration: 76.1s


[I 2023-04-11 11:11:57,383] Trial 7 finished with value: 0.981965061349325 and parameters: {'hidden_dim': 256, 'num_layers': 3, 'embed_size': 23}. Best is trial 7 with value: 0.981965061349325.


Epoch   5/  5, train loss: 0.00, train f1: 0.99, val loss: 0.00, val f1: 0.98, duration: 76.5s
BiLSTM Trial 8 | hd 149, nl 3, ne 47
Epoch   1/  5, train loss: 0.01, train f1: 0.77, val loss: 0.01, val f1: 0.91, duration: 145.4s


[I 2023-04-11 11:24:22,160] Trial 8 finished with value: 0.9802554686227581 and parameters: {'hidden_dim': 149, 'num_layers': 3, 'embed_size': 47}. Best is trial 7 with value: 0.981965061349325.
[W 2023-04-11 11:24:22,163] Trial 9 failed with parameters: {'hidden_dim': 252, 'num_layers': 4, 'embed_size': 36} because of the following error: The value None could not be cast to float..
[W 2023-04-11 11:24:22,164] Trial 9 failed with value None.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 148.2s
BiLSTM Trial 10 | hd 153, nl 3, ne 43
Epoch   1/  5, train loss: 0.02, train f1: 0.76, val loss: 0.01, val f1: 0.91, duration: 152.3s


[I 2023-04-11 11:37:08,337] Trial 10 finished with value: 0.9756852590633562 and parameters: {'hidden_dim': 153, 'num_layers': 3, 'embed_size': 43}. Best is trial 7 with value: 0.981965061349325.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 153.0s
BiLSTM Trial 11 | hd 249, nl 4, ne 16
Epoch   1/  5, train loss: 0.02, train f1: 0.73, val loss: 0.01, val f1: 0.91, duration: 143.3s


[I 2023-04-11 11:49:04,619] Trial 11 finished with value: 0.9816676469046068 and parameters: {'hidden_dim': 249, 'num_layers': 4, 'embed_size': 16}. Best is trial 7 with value: 0.981965061349325.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 143.6s
BiLSTM Trial 12 | hd 247, nl 4, ne 16
Epoch   1/  5, train loss: 0.02, train f1: 0.72, val loss: 0.01, val f1: 0.90, duration: 141.8s


[I 2023-04-11 12:01:11,121] Trial 12 finished with value: 0.9811122741094955 and parameters: {'hidden_dim': 247, 'num_layers': 4, 'embed_size': 16}. Best is trial 7 with value: 0.981965061349325.
[W 2023-04-11 12:01:11,144] Trial 13 failed with parameters: {'hidden_dim': 256, 'num_layers': 4, 'embed_size': 23} because of the following error: The value None could not be cast to float..
[W 2023-04-11 12:01:11,145] Trial 13 failed with value None.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 144.6s
BiLSTM Trial 14 | hd 68, nl 4, ne 23
Epoch   1/  5, train loss: 0.03, train f1: 0.56, val loss: 0.01, val f1: 0.81, duration: 29.9s


[I 2023-04-11 12:03:43,808] Trial 14 finished with value: 0.9676840354034344 and parameters: {'hidden_dim': 68, 'num_layers': 4, 'embed_size': 23}. Best is trial 7 with value: 0.981965061349325.


Epoch   5/  5, train loss: 0.00, train f1: 0.97, val loss: 0.00, val f1: 0.97, duration: 30.0s
BiLSTM Trial 15 | hd 256, nl 3, ne 23
Epoch   1/  5, train loss: 0.01, train f1: 0.79, val loss: 0.00, val f1: 0.92, duration: 76.4s


[I 2023-04-11 12:10:10,308] Trial 15 finished with value: 0.9841632873791226 and parameters: {'hidden_dim': 256, 'num_layers': 3, 'embed_size': 23}. Best is trial 15 with value: 0.9841632873791226.


Epoch   5/  5, train loss: 0.00, train f1: 0.99, val loss: 0.00, val f1: 0.98, duration: 77.1s
BiLSTM Trial 16 | hd 211, nl 3, ne 24
Epoch   1/  5, train loss: 0.01, train f1: 0.76, val loss: 0.00, val f1: 0.92, duration: 78.6s


[I 2023-04-11 12:16:46,983] Trial 16 finished with value: 0.978268447592924 and parameters: {'hidden_dim': 211, 'num_layers': 3, 'embed_size': 24}. Best is trial 15 with value: 0.9841632873791226.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 79.2s
BiLSTM Trial 17 | hd 216, nl 3, ne 54
Epoch   1/  5, train loss: 0.01, train f1: 0.80, val loss: 0.00, val f1: 0.93, duration: 68.1s


[I 2023-04-11 12:22:31,970] Trial 17 finished with value: 0.9726380601242599 and parameters: {'hidden_dim': 216, 'num_layers': 3, 'embed_size': 54}. Best is trial 15 with value: 0.9841632873791226.


Epoch   5/  5, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.97, duration: 70.5s
BiLSTM Trial 18 | hd 177, nl 2, ne 35
Epoch   1/  5, train loss: 0.01, train f1: 0.79, val loss: 0.00, val f1: 0.93, duration: 143.0s


[I 2023-04-11 12:34:29,008] Trial 18 finished with value: 0.9827735986963481 and parameters: {'hidden_dim': 177, 'num_layers': 2, 'embed_size': 35}. Best is trial 15 with value: 0.9841632873791226.


Epoch   5/  5, train loss: 0.00, train f1: 0.99, val loss: 0.00, val f1: 0.98, duration: 143.0s
BiLSTM Trial 19 | hd 183, nl 2, ne 34
Epoch   1/  5, train loss: 0.01, train f1: 0.78, val loss: 0.01, val f1: 0.90, duration: 146.8s


[I 2023-04-11 12:46:51,362] Trial 19 finished with value: 0.9788340060809854 and parameters: {'hidden_dim': 183, 'num_layers': 2, 'embed_size': 34}. Best is trial 15 with value: 0.9841632873791226.


Epoch   5/  5, train loss: 0.00, train f1: 0.99, val loss: 0.00, val f1: 0.98, duration: 149.0s
Best parameters: {'hidden_dim': 256, 'num_layers': 3, 'embed_size': 23}
BiLSTM Model Best Params  | hd 256, nl 3, ne 23
Epoch   1/ 10, train loss: 0.01, train f1: 0.78, val loss: 0.00, val f1: 0.92, duration: 87.5s
Epoch   5/ 10, train loss: 0.00, train f1: 0.99, val loss: 0.00, val f1: 0.98, duration: 79.5s
Epoch  10/ 10, train loss: 0.00, train f1: 0.99, val loss: 0.00, val f1: 0.99, duration: 77.5s


In [13]:
_, test_f1 = test(label)

Test F1 score: 0.9834484386762391


## Transformer (Encoder only)

This model uses only the encoder half of the Transformer architecture and relies on its self-attention mechanism, which lets every position in a sequence attend to every other position. We tune the embedding size, hidden dimension, feed-forward dimension, number of layers, and number of attention heads per block.
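
The self-attention step at the core of the encoder can be sketched in a few lines of plain Python. This is a minimal single-head illustration under simplifying assumptions, not the model trained here: queries, keys, and values are the raw input vectors (a real encoder layer first applies learned linear projections and uses several heads), and the helper names `softmax` and `self_attention` are chosen for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def self_attention(vectors):
    """Single-head scaled dot-product self-attention without learned weights.

    Assumption: queries, keys, and values are the inputs themselves.
    Returns the attended outputs and the attention weight matrix.
    """
    dim = len(vectors[0])
    outputs, weights = [], []
    for query in vectors:
        # Similarity of this position to every position, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average of all value vectors.
        outputs.append(
            [sum(w * value[i] for w, value in zip(row, vectors)) for i in range(dim)]
        )
    return outputs, weights


# Three toy token embeddings of size 2.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attention = self_attention(tokens)
for row in attention:
    print([round(weight, 3) for weight in row])
```

Each row of the printed matrix sums to 1 and shows how much one position draws on every other position. Unlike a recurrent model, no position has to wait for information to be passed along step by step.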

In [14]:
model_name = "transformer"
label = model_name + "_iter1"
no_trials = 30

In [15]:
tuning_and_training(
    model_name = model_name,
    no_trials = no_trials,
    class_gap = class_gap,
    num_classes = num_classes,
    max_seq_len = max_seq_len,
    vocab_len = vocab_len,
    batch_size = batch_size,
    epochs = 20,
)

[I 2023-04-11 13:00:08,484] A new study created in memory with name: no-name-4a78b3b1-4da2-4f13-91bd-ed37403498ca
[W 2023-04-11 13:00:08,488] Trial 0 failed with parameters: {'embed_size': 64, 'hidden_dim': 301, 'feed_forward_dim': 382, 'num_layers': 4, 'num_heads': 16} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:00:08,488] Trial 0 failed with value None.


Transformer Trial 1 | hd 508, nl 2, ne 64, ff 169, nh 16
Epoch   1/  5, train loss: 0.05, train f1: 0.17, val loss: 0.04, val f1: 0.32, duration: 29.8s


[I 2023-04-11 13:02:38,891] Trial 1 finished with value: 0.8878068843951432 and parameters: {'embed_size': 64, 'hidden_dim': 508, 'feed_forward_dim': 169, 'num_layers': 2, 'num_heads': 16}. Best is trial 1 with value: 0.8878068843951432.
[W 2023-04-11 13:02:38,893] Trial 2 failed with parameters: {'embed_size': 256, 'hidden_dim': 507, 'feed_forward_dim': 231, 'num_layers': 4, 'num_heads': 16} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:02:38,894] Trial 2 failed with value None.


Epoch   5/  5, train loss: 0.01, train f1: 0.86, val loss: 0.01, val f1: 0.89, duration: 29.5s
Transformer Trial 3 | hd 207, nl 2, ne 16, ff 98, nh 4
Epoch   1/  5, train loss: 0.06, train f1: 0.03, val loss: 0.05, val f1: 0.07, duration: 17.1s


[I 2023-04-11 13:04:01,644] Trial 3 finished with value: 0.575015486157533 and parameters: {'embed_size': 16, 'hidden_dim': 207, 'feed_forward_dim': 98, 'num_layers': 2, 'num_heads': 4}. Best is trial 1 with value: 0.8878068843951432.


Epoch   5/  5, train loss: 0.03, train f1: 0.55, val loss: 0.02, val f1: 0.58, duration: 15.7s
Transformer Trial 4 | hd 136, nl 3, ne 16, ff 163, nh 16
Epoch   1/  5, train loss: 0.06, train f1: 0.04, val loss: 0.05, val f1: 0.07, duration: 24.9s


[I 2023-04-11 13:06:07,746] Trial 4 finished with value: 0.6182965060177821 and parameters: {'embed_size': 16, 'hidden_dim': 136, 'feed_forward_dim': 163, 'num_layers': 3, 'num_heads': 16}. Best is trial 1 with value: 0.8878068843951432.
[W 2023-04-11 13:06:07,748] Trial 5 failed with parameters: {'embed_size': 64, 'hidden_dim': 346, 'feed_forward_dim': 156, 'num_layers': 3, 'num_heads': 16} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:06:07,749] Trial 5 failed with value None.


Epoch   5/  5, train loss: 0.02, train f1: 0.59, val loss: 0.02, val f1: 0.62, duration: 24.8s
Transformer Trial 6 | hd 297, nl 3, ne 16, ff 249, nh 16
Epoch   1/  5, train loss: 0.06, train f1: 0.03, val loss: 0.05, val f1: 0.05, duration: 25.7s


[I 2023-04-11 13:08:18,279] Trial 6 finished with value: 0.6472148769162076 and parameters: {'embed_size': 16, 'hidden_dim': 297, 'feed_forward_dim': 249, 'num_layers': 3, 'num_heads': 16}. Best is trial 1 with value: 0.8878068843951432.


Epoch   5/  5, train loss: 0.02, train f1: 0.65, val loss: 0.02, val f1: 0.65, duration: 25.7s
Transformer Trial 7 | hd 248, nl 2, ne 16, ff 509, nh 8
Epoch   1/  5, train loss: 0.06, train f1: 0.04, val loss: 0.05, val f1: 0.08, duration: 21.2s


[I 2023-04-11 13:10:05,777] Trial 7 finished with value: 0.6653502727941105 and parameters: {'embed_size': 16, 'hidden_dim': 248, 'feed_forward_dim': 509, 'num_layers': 2, 'num_heads': 8}. Best is trial 1 with value: 0.8878068843951432.
[W 2023-04-11 13:10:05,778] Trial 8 failed with parameters: {'embed_size': 32, 'hidden_dim': 382, 'feed_forward_dim': 288, 'num_layers': 4, 'num_heads': 4} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:10:05,779] Trial 8 failed with value None.
[W 2023-04-11 13:10:05,781] Trial 9 failed with parameters: {'embed_size': 256, 'hidden_dim': 503, 'feed_forward_dim': 79, 'num_layers': 3, 'num_heads': 16} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:10:05,782] Trial 9 failed with value None.
[W 2023-04-11 13:10:05,786] Trial 10 failed with parameters: {'embed_size': 512, 'hidden_dim': 512, 'fee

Epoch   5/  5, train loss: 0.02, train f1: 0.61, val loss: 0.02, val f1: 0.67, duration: 21.1s
Transformer Trial 12 | hd 212, nl 3, ne 16, ff 261, nh 4
Epoch   1/  5, train loss: 0.06, train f1: 0.03, val loss: 0.05, val f1: 0.06, duration: 22.8s


[I 2023-04-11 13:12:04,176] Trial 12 finished with value: 0.6492613237557983 and parameters: {'embed_size': 16, 'hidden_dim': 212, 'feed_forward_dim': 261, 'num_layers': 3, 'num_heads': 4}. Best is trial 1 with value: 0.8878068843951432.
[W 2023-04-11 13:12:04,179] Trial 13 failed with parameters: {'embed_size': 32, 'hidden_dim': 283, 'feed_forward_dim': 354, 'num_layers': 4, 'num_heads': 4} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:12:04,180] Trial 13 failed with value None.
[W 2023-04-11 13:12:04,182] Trial 14 failed with parameters: {'embed_size': 256, 'hidden_dim': 436, 'feed_forward_dim': 462, 'num_layers': 4, 'num_heads': 16} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:12:04,183] Trial 14 failed with value None.


Epoch   5/  5, train loss: 0.02, train f1: 0.59, val loss: 0.02, val f1: 0.65, duration: 24.9s
Transformer Trial 15 | hd 198, nl 4, ne 64, ff 93, nh 8
Epoch   1/  5, train loss: 0.05, train f1: 0.18, val loss: 0.04, val f1: 0.35, duration: 38.2s


[I 2023-04-11 13:15:18,696] Trial 15 finished with value: 0.9346587262687649 and parameters: {'embed_size': 64, 'hidden_dim': 198, 'feed_forward_dim': 93, 'num_layers': 4, 'num_heads': 8}. Best is trial 15 with value: 0.9346587262687649.


Epoch   5/  5, train loss: 0.01, train f1: 0.90, val loss: 0.00, val f1: 0.93, duration: 38.4s
Transformer Trial 16 | hd 266, nl 2, ne 128, ff 422, nh 8
Epoch   1/  5, train loss: 0.04, train f1: 0.31, val loss: 0.03, val f1: 0.53, duration: 29.9s


[I 2023-04-11 13:17:55,141] Trial 16 finished with value: 0.9235295368174701 and parameters: {'embed_size': 128, 'hidden_dim': 266, 'feed_forward_dim': 422, 'num_layers': 2, 'num_heads': 8}. Best is trial 15 with value: 0.9346587262687649.
[W 2023-04-11 13:17:55,144] Trial 17 failed with parameters: {'embed_size': 128, 'hidden_dim': 347, 'feed_forward_dim': 107, 'num_layers': 3, 'num_heads': 8} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:17:55,145] Trial 17 failed with value None.


Epoch   5/  5, train loss: 0.00, train f1: 0.92, val loss: 0.00, val f1: 0.92, duration: 31.4s
Transformer Trial 18 | hd 235, nl 3, ne 128, ff 373, nh 4
Epoch   1/  5, train loss: 0.04, train f1: 0.31, val loss: 0.03, val f1: 0.60, duration: 35.6s


[I 2023-04-11 13:21:04,698] Trial 18 finished with value: 0.952891633434388 and parameters: {'embed_size': 128, 'hidden_dim': 235, 'feed_forward_dim': 373, 'num_layers': 3, 'num_heads': 4}. Best is trial 18 with value: 0.952891633434388.


Epoch   5/  5, train loss: 0.00, train f1: 0.95, val loss: 0.00, val f1: 0.95, duration: 37.7s
Transformer Trial 19 | hd 214, nl 2, ne 64, ff 292, nh 4
Epoch   1/  5, train loss: 0.05, train f1: 0.19, val loss: 0.04, val f1: 0.35, duration: 19.1s


[I 2023-04-11 13:22:42,895] Trial 19 finished with value: 0.8814624472103405 and parameters: {'embed_size': 64, 'hidden_dim': 214, 'feed_forward_dim': 292, 'num_layers': 2, 'num_heads': 4}. Best is trial 18 with value: 0.952891633434388.
[W 2023-04-11 13:22:42,911] Trial 20 failed with parameters: {'embed_size': 512, 'hidden_dim': 512, 'feed_forward_dim': 363, 'num_layers': 4, 'num_heads': 4} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:22:42,912] Trial 20 failed with value None.
[W 2023-04-11 13:22:42,928] Trial 21 failed with parameters: {'embed_size': 128, 'hidden_dim': 401, 'feed_forward_dim': 365, 'num_layers': 4, 'num_heads': 4} because of the following error: The value None could not be cast to float..
[W 2023-04-11 13:22:42,928] Trial 21 failed with value None.
[W 2023-04-11 13:22:42,947] Trial 22 failed with parameters: {'embed_size': 128, 'hidden_dim': 373

Epoch   5/  5, train loss: 0.01, train f1: 0.85, val loss: 0.01, val f1: 0.88, duration: 19.1s
Best parameters: {'embed_size': 128, 'hidden_dim': 235, 'feed_forward_dim': 373, 'num_layers': 3, 'num_heads': 4}
Transformer Model Best Params  | hd 235, nl 3, ne 128, ff 373, nh 4
Epoch   1/ 20, train loss: 0.04, train f1: 0.30, val loss: 0.03, val f1: 0.57, duration: 36.5s
Epoch   5/ 20, train loss: 0.00, train f1: 0.94, val loss: 0.00, val f1: 0.96, duration: 35.9s
Epoch  10/ 20, train loss: 0.00, train f1: 0.98, val loss: 0.00, val f1: 0.98, duration: 38.1s
Epoch  15/ 20, train loss: 0.00, train f1: 0.99, val loss: 0.00, val f1: 0.99, duration: 38.7s
Epoch  20/ 20, train loss: 0.00, train f1: 0.99, val loss: 0.00, val f1: 0.99, duration: 37.0s


In [16]:
_, test_f1 = test(label)

Test F1 score: 0.9945186048369115
