Examples of fine-tuning

To fine-tune a model, you first need a pretrained language model (PLM). LangML currently supports BERT, RoBERTa, and ALBERT PLMs. You can download them from google-research/bert, google-research/albert, Chinese RoBERTa, etc.

1. Prepare datasets

Each PLM ships with its own vocabulary, so you need to use the matching tokenizer to convert texts to vocabulary indices. LangML wraps huggingface/tokenizers and google/sentencepiece behind a uniform interface: initialize a WordPiece tokenizer via langml.tokenizer.WPTokenizer, or a SentencePiece tokenizer via langml.tokenizer.SPTokenizer.

from langml import keras, L
from langml.tokenizer import WPTokenizer

vocab_path = '/path/to/vocab.txt'
tokenizer = WPTokenizer(vocab_path)
# specify max token length

class DataLoader:
   def __init__(self, data, tokenizer):
      # data: an iterable of (text, label) pairs
      self.data = data
      self.tokenizer = tokenizer

   def __iter__(self):
      # __iter__ takes no extra arguments, so the data is stored in __init__
      for text, label in self.data:
         tokenized = self.tokenizer.encode(text)
         token_ids = tokenized.ids
         segment_ids = tokenized.segment_ids
         # ... build and yield your batches here
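The loader above leaves batching open. As a minimal sketch of one way to fill that gap, the helpers below group examples into batches and right-pad token id lists to a common length. They use plain Python so they run without LangML; `iter_batches` and `pad_batch` are hypothetical names, not part of LangML, and the pad id of 0 is an assumption that matches the `[PAD]` position in the Google BERT vocabularies but should be checked against your own vocab file.

```python
from typing import Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(examples: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group examples into lists of at most batch_size items."""
    batch: List[T] = []
    for example in examples:
        batch.append(example)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # emit the final, possibly smaller, batch
        yield batch


def pad_batch(sequences: Sequence[Sequence[int]], pad_id: int = 0) -> List[List[int]]:
    """Right-pad every sequence to the length of the longest one.

    pad_id=0 is an assumption; use the index of [PAD] in your vocabulary.
    """
    max_len = max(len(seq) for seq in sequences)
    return [list(seq) + [pad_id] * (max_len - len(seq)) for seq in sequences]
```

The same `pad_batch` call can be applied to the segment ids, so both model inputs end up with identical shapes within a batch.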

2. Build models

You can use langml.plm.load_bert to load a BERT/RoBERTa model, and use langml.plm.load_albert to load an ALBERT model.

from langml import keras, L
from langml.plm import load_bert

config_path = '/path/to/bert_config.json'
ckpt_path = '/path/to/bert_model.ckpt'
vocab_path = '/path/to/vocab.txt'

bert_model, bert_instance = load_bert(config_path, ckpt_path)
# get CLS representation
cls_output = L.Lambda(lambda x: x[:, 0])(bert_model.output)
# binary classification head on top of the CLS representation
output = L.Dense(2, activation='softmax')(cls_output)
train_model = keras.Model(bert_model.input, output)
train_model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(1e-5))
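The `Lambda` layer in the snippet above uses `x[:, 0]` to keep only the first position of each sequence, which is where BERT places the `[CLS]` token. A plain-Python sketch of the same selection, with nested lists standing in for a tensor of shape (batch, sequence length, hidden size); `take_cls` is a hypothetical helper used only for illustration:

```python
from typing import List


def take_cls(batch: List[List[List[float]]]) -> List[List[float]]:
    """Return the first position of every sequence, like x[:, 0] on a 3-D tensor."""
    return [sequence[0] for sequence in batch]


# two sequences, two positions each, hidden size 2
hidden_states = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.5, 0.6], [0.7, 0.8]],
]
cls_vectors = take_cls(hidden_states)  # one vector per sequence
```

The result has shape (batch, hidden size), which is what the dense classification layer expects as input.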

3. Train and evaluate

After defining the data loader and the model, you can train and evaluate the model as you would any other Keras model.
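One detail to watch before training: the model in step 2 is compiled with `categorical_crossentropy`, which expects one-hot label vectors rather than integer class ids. A minimal sketch of that conversion in plain Python follows; `to_one_hot` is a hypothetical helper, and the default of two classes is an assumption that mirrors the `Dense(2, ...)` layer above.

```python
from typing import List, Sequence


def to_one_hot(labels: Sequence[int], num_classes: int = 2) -> List[List[float]]:
    """Turn integer class ids into one-hot rows, e.g. 1 -> [0.0, 1.0]."""
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside [0, {num_classes})")
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows
```

In practice, `keras.utils.to_categorical` performs the same conversion on arrays. Alternatively, compiling with `sparse_categorical_crossentropy` lets you pass integer labels directly. With inputs and labels prepared, the usual `train_model.fit(...)` and `train_model.evaluate(...)` calls apply.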