## Metric Learning over the Matching Model
### Use Case Example

In [3]:
# IMPORTS
from matching_model import MatchingLoss, retrain
from matching_data import get_complete_feedback
from transformers import LongformerTokenizer, LongformerModel
from torch.optim import AdamW
import torch

In [4]:
# DEVICE settings: use the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

We start by loading the pretrained Longformer base model, which we use to generate embeddings, and moving it to the selected device.

In [5]:
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")
model.to(device)

Some weights of the model checkpoint at allenai/longformer-base-4096 were not used when initializing LongformerModel: ['lm_head.dense.weight', 'lm_head.layer_norm.weight', 'lm_head.decoder.weight', 'lm_head.bias', 'lm_head.layer_norm.bias', 'lm_head.dense.bias']
- This IS expected if you are initializing LongformerModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LongformerModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


LongformerModel(
  (embeddings): LongformerEmbeddings(
    (word_embeddings): Embedding(50265, 768, padding_idx=1)
    (position_embeddings): Embedding(4098, 768, padding_idx=1)
    (token_type_embeddings): Embedding(1, 768)
    (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (encoder): LongformerEncoder(
    (layer): ModuleList(
      (0): LongformerLayer(
        (attention): LongformerAttention(
          (self): LongformerSelfAttention(
            (query): Linear(in_features=768, out_features=768, bias=True)
            (key): Linear(in_features=768, out_features=768, bias=True)
            (value): Linear(in_features=768, out_features=768, bias=True)
            (query_global): Linear(in_features=768, out_features=768, bias=True)
            (key_global): Linear(in_features=768, out_features=768, bias=True)
            (value_global): Linear(in_features=768, out_features=768, bias=True)
          )
          (o
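
The encoder above returns one 768-dimensional vector per token, so a document-level embedding needs some pooling step. This notebook does not show how `matching_model` pools, so the following is only a minimal sketch of one common choice, masked mean pooling, written in plain Python. The function name `masked_mean_pool` and the toy inputs are hypothetical.

```python
def masked_mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped).

    Hypothetical helper for illustration; not part of matching_model.
    """
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    # Component-wise mean over the kept token vectors
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy example: two real tokens and one padding token in a 2-dimensional space
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
embedding = masked_mean_pool(tokens, mask)
```

In practice the same averaging would run on the model's `last_hidden_state` tensor with the tokenizer's `attention_mask`; other pooling choices, such as taking the first token's vector, are equally plausible here.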

Next, we define the remaining training components:
 * Text tokenizer
 * AdamW optimizer
 * Our matching loss function

In [6]:
tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
matching_loss = MatchingLoss(margin=1, alpha=0.7, beta=0.3)
optimizer = AdamW(model.parameters())
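
The internals of `MatchingLoss` are not shown in this notebook. As a rough illustration of what a margin-based metric-learning loss does, here is a minimal sketch of a standard triplet margin loss over cosine distance, in plain Python. It is an assumption that `MatchingLoss` resembles this; the roles of `alpha` and `beta` are not described here, so the sketch leaves them out. The names `cosine_distance` and `triplet_margin_loss` are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Penalize triplets where the positive is not closer than the negative by `margin`.

    Hypothetical stand-in for illustration; not the MatchingLoss implementation.
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# A well-separated triplet: the positive matches the anchor, the negative opposes it
good = triplet_margin_loss([1.0, 0.0], [1.0, 0.0], [-1.0, 0.0])
# A badly ordered triplet: the negative matches the anchor instead of the positive
bad = triplet_margin_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

The loss is zero once the negative is at least `margin` farther from the anchor than the positive, so only triplets that violate the margin contribute gradients during retraining.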

Lastly, we load the collected feedback and fine-tune the model once per user, using that user's feedback.

In [7]:
feedback = get_complete_feedback()
for user_feedback in feedback:
  retrain(user_feedback[0], user_feedback[1], user_feedback[2], tokenizer, optimizer, model, matching_loss, device)

100%|██████████| 3/3 [03:50<00:00, 76.78s/it]
100%|██████████| 3/3 [03:40<00:00, 73.58s/it]
100%|██████████| 3/3 [03:33<00:00, 71.12s/it]
