### Model Conversion
The models previously saved in the `run` folder were created from a plain `nn.Module`, so their checkpoints do not follow the format used by BERT-style models.
This notebook contains a script that converts such a checkpoint into a BERT-compatible format, so that you can use the `ModelClass.from_pretrained(...)` and `ModelClass.save_pretrained(...)` methods.
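
As a rough check that a conversion produced a loadable folder, the sketch below tests whether a directory contains the files that the stock Hugging Face `save_pretrained` writes: a `config.json` plus a weights file. It assumes `DNABERT_SL.save_pretrained` is the unmodified Hugging Face method; `looks_like_pretrained_dir` is a hypothetical helper, not part of this repository.

```python
import os

# File names written by Hugging Face `save_pretrained`
# (single-file and sharded variants, old and new weight formats).
WEIGHT_FILES = (
    "pytorch_model.bin",
    "model.safetensors",
    "pytorch_model.bin.index.json",
    "model.safetensors.index.json",
)


def looks_like_pretrained_dir(path):
    """Return True if `path` holds a config.json and at least one weights file."""
    if not os.path.isdir(path):
        return False
    has_config = os.path.isfile(os.path.join(path, "config.json"))
    has_weights = any(
        os.path.isfile(os.path.join(path, name)) for name in WEIGHT_FILES
    )
    return has_config and has_weights
```

For example, `looks_like_pretrained_dir(os.path.join("pretrained", "dnabert-sl-base"))` should return `True` once the Base cell has run, under the assumption stated above.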

### DNABERT-SL Base

In [1]:
from transformers import BertForMaskedLM
import models
import os
import json
import torch
dnabert_pretrained_path = os.path.join("pretrained", "3-new-12w-0")
base_config_path = os.path.join("models", "config", "seqlab", "base.json")
# model = models.seqlab.DNABERT_SL(
#     BertForMaskedLM.from_pretrained(
#         dnabert_pretrained_path
#     ).bert,
#     json.load(
#         open(
#             os.path.join("models", "config", "seqlab", "base.json"),
#             "r"
#         )
#     )
# )

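# Load the DNABERT_SL configuration; the context manager closes the file.
with open(base_config_path, "r") as config_file:
    base_config = json.load(config_file)

# Initialize DNABERT_SL from the pretrained DNABERT weights.
model = models.dnabert.DNABERT_SL.from_pretrained(dnabert_pretrained_path, base_config)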

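base_checkpoint_path = os.path.join("run", "latest-base-291mo307", "latest", "checkpoint.pth")
# Map tensors to the CPU so the conversion does not require a GPU.
checkpoint = torch.load(base_checkpoint_path, map_location="cpu")
# Replace the freshly initialized weights with the trained ones.
model.load_state_dict(checkpoint["model"])

# Save in the format that `from_pretrained` reads.
base_pretrained = os.path.join("pretrained", "dnabert-sl-base")
model.save_pretrained(base_pretrained)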

Some weights of the model checkpoint at pretrained\3-new-12w-0 were not used when initializing DNABERT_SL: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'bert.pooler.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'bert.pooler.dense.weight', 'cls.predictions.decoder.weight', 'cls.predictions.decoder.bias']
- This IS expected if you are initializing DNABERT_SL from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DNABERT_SL from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of DNABERT_SL were not initialized from the model checkpoint at pretrained\3-new-12w-0 and are newly initial
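
The warning above reports two kinds of key mismatch: weights in the checkpoint that the model has no parameter for (here the `cls.predictions.*` and `bert.pooler.*` entries) and model parameters that the checkpoint does not supply. The same comparison can be made by hand before `model.load_state_dict(...)`, which by default raises an error on any mismatch. The sketch below works on plain key names; `diff_state_dict_keys` is a hypothetical helper, and `head.weight` is an invented parameter name used only for illustration.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Compare parameter names of a model and a checkpoint.

    Returns (missing, unexpected):
      missing    - names the model has but the checkpoint lacks
      unexpected - names the checkpoint has but the model does not use
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Toy example. "head.weight" is a made-up name standing in for a task head.
missing, unexpected = diff_state_dict_keys(
    ["bert.embeddings.word_embeddings.weight", "head.weight"],
    [
        "bert.embeddings.word_embeddings.weight",
        "cls.predictions.bias",
        "bert.pooler.dense.weight",
    ],
)
print("missing:", missing)
print("unexpected:", unexpected)
```

In this notebook the inputs would be `model.state_dict().keys()` and `checkpoint["model"].keys()`; both lists being empty means the checkpoint matches the model exactly.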

### DNABERT-SL Lin1

In [None]:
# tldr; 

### DNABERT-SL Lin2

In [1]:
from transformers import BertForMaskedLM
import models
import os
import json
import torch
dnabert_pretrained_path = os.path.join("pretrained", "3-new-12w-0")
base_lin2_config_path = os.path.join("models", "config", "seqlab", "base.lin2.json")

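# Load the Lin2 configuration; the context manager closes the file.
with open(base_lin2_config_path, "r") as config_file:
    lin2_config = json.load(config_file)

# Initialize DNABERT_SL from the pretrained DNABERT weights.
model = models.dnabert.DNABERT_SL.from_pretrained(dnabert_pretrained_path, lin2_config)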

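checkpoint_path = os.path.join("run", "latest-base.lin2-u8psigt5", "latest", "checkpoint.pth")
# Map tensors to the CPU so the conversion does not require a GPU.
checkpoint = torch.load(checkpoint_path, map_location="cpu")
# Replace the freshly initialized weights with the trained ones.
model.load_state_dict(checkpoint["model"])

# Save in the format that `from_pretrained` reads.
lin2_pretrained = os.path.join("pretrained", "dnabert-sl-lin2")
model.save_pretrained(lin2_pretrained)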

Some weights of the model checkpoint at pretrained\3-new-12w-0 were not used when initializing DNABERT_SL: ['cls.predictions.transform.LayerNorm.bias', 'bert.pooler.dense.weight', 'cls.predictions.decoder.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.bias', 'bert.pooler.dense.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing DNABERT_SL from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DNABERT_SL from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of DNABERT_SL were not initialized from the model checkpoint at pretrained\3-new-12w-0 and are newly initial

### DNABERT-SL Lin3

In [1]:
from transformers import BertForMaskedLM
import models
import os
import json
import torch
dnabert_pretrained_path = os.path.join("pretrained", "3-new-12w-0")
base_lin3_config_path = os.path.join("models", "config", "seqlab", "base.lin3.json")

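# Load the Lin3 configuration; the context manager closes the file.
with open(base_lin3_config_path, "r") as config_file:
    lin3_config = json.load(config_file)

# Initialize DNABERT_SL from the pretrained DNABERT weights.
model = models.dnabert.DNABERT_SL.from_pretrained(dnabert_pretrained_path, lin3_config)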

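checkpoint_path = os.path.join("run", "latest-base.lin3-h91p4uhz", "latest", "checkpoint.pth")
# Map tensors to the CPU so the conversion does not require a GPU.
checkpoint = torch.load(checkpoint_path, map_location="cpu")
# Replace the freshly initialized weights with the trained ones.
model.load_state_dict(checkpoint["model"])

# Save in the format that `from_pretrained` reads.
lin3_pretrained = os.path.join("pretrained", "dnabert-sl-lin3")
model.save_pretrained(lin3_pretrained)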

Some weights of the model checkpoint at pretrained\3-new-12w-0 were not used when initializing DNABERT_SL: ['cls.predictions.decoder.bias', 'bert.pooler.dense.weight', 'bert.pooler.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing DNABERT_SL from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DNABERT_SL from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of DNABERT_SL were not initialized from the model checkpoint at pretrained\3-new-12w-0 and are newly initial
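
The Base, Lin2, and Lin3 cells differ only in three paths: the config file, the run folder, and the output folder. One way to avoid repeating them is a small lookup table, sketched below. The folder and file names are copied from the cells above; `VARIANTS` and `conversion_paths` are hypothetical names introduced here, and Lin1 is left out because its cell has no paths yet.

```python
import os

# (config file, run folder, output folder) per variant, copied from the cells above.
VARIANTS = {
    "base": ("base.json", "latest-base-291mo307", "dnabert-sl-base"),
    "lin2": ("base.lin2.json", "latest-base.lin2-u8psigt5", "dnabert-sl-lin2"),
    "lin3": ("base.lin3.json", "latest-base.lin3-h91p4uhz", "dnabert-sl-lin3"),
}


def conversion_paths(variant):
    """Return the config, checkpoint, and output paths for one variant."""
    config_file, run_folder, output_folder = VARIANTS[variant]
    return {
        "config": os.path.join("models", "config", "seqlab", config_file),
        "checkpoint": os.path.join("run", run_folder, "latest", "checkpoint.pth"),
        "output": os.path.join("pretrained", output_folder),
    }


# List the paths each conversion would read from and write to.
for name in VARIANTS:
    print(name, conversion_paths(name))
```

Each conversion could then read its three paths from `conversion_paths(name)` and run the same load, `load_state_dict`, and `save_pretrained` steps shown in the cells above.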