# Install packages

In [1]:
!pip install transformers==4.6.0



In [2]:
!pip install tokenizers==0.10.2



# Test the performance of the last checkpoint on the fill-mask task

The following code snippet tests the last checkpoint recorded during our collaborative training. You can change the `raw_text` variable and see which tokens our model predicts in place of the `[MASK]` token.

In [3]:
from transformers import AlbertForMaskedLM, FillMaskPipeline, PreTrainedTokenizerFast

In [5]:
# Initialize tokenizer
tokenizer = PreTrainedTokenizerFast.from_pretrained("SaulLu/bengali-tokenizer-v2")

# Initialize model
model = AlbertForMaskedLM.from_pretrained("Upload/bengali-albert")

# Initialize pipeline
pipeline = FillMaskPipeline(tokenizer=tokenizer, model=model)

Some weights of the model checkpoint at Upload/bengali-albert were not used when initializing AlbertForMaskedLM: ['sop_classifier.classifier.weight', 'sop_classifier.classifier.bias']
- This IS expected if you are initializing AlbertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing AlbertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [6]:
raw_text = "ধন্যবাদ। আপনার সাথে কথা [MASK] ভালো লাগলো"  # Change me
pipeline(raw_text)

[{'sequence': 'ধন্যবাদ। আপনার সাথে কথা বলতে ভালো লাগলো',
  'score': 0.28910231590270996,
  'token': 403,
  'token_str': 'বলতে'},
 {'sequence': 'ধন্যবাদ। আপনার সাথে কথা হলে ভালো লাগলো',
  'score': 0.09174645692110062,
  'token': 177,
  'token_str': 'হলে'},
 {'sequence': 'ধন্যবাদ। আপনার সাথে কথা করতে ভালো লাগলো',
  'score': 0.07234039902687073,
  'token': 45,
  'token_str': 'করতে'},
 {'sequence': 'ধন্যবাদ। আপনার সাথে কথা করলে ভালো লাগলো',
  'score': 0.05066966637969017,
  'token': 387,
  'token_str': 'করলে'},
 {'sequence': 'ধন্যবাদ। আপনার সাথে কথা লাগলে ভালো লাগলো',
  'score': 0.02641388401389122,
  'token': 7397,
  'token_str': 'লাগলে'}]
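
The pipeline returns a list of dictionaries, one per candidate token, each with a `score` and a `token_str`. To compare candidates at a glance, a small helper can keep only the predictions above a score threshold and round the scores. This is a minimal sketch: the helper name `summarize_predictions` and the `min_score` threshold are illustrative assumptions, not part of `transformers`, and `sample` simply copies the tokens and scores from the output above so the snippet runs without loading the model.

```python
def summarize_predictions(predictions, min_score=0.05):
    """Return (token_str, rounded score) pairs with score >= min_score, highest first.

    `predictions` is assumed to be a list of dicts shaped like the fill-mask
    output shown above. Both this helper and `min_score` are hypothetical.
    """
    kept = [p for p in predictions if p["score"] >= min_score]
    kept.sort(key=lambda p: p["score"], reverse=True)
    return [(p["token_str"], round(p["score"], 3)) for p in kept]


# Tokens and scores copied from the output above.
sample = [
    {"token_str": "বলতে", "score": 0.28910231590270996},
    {"token_str": "হলে", "score": 0.09174645692110062},
    {"token_str": "করতে", "score": 0.07234039902687073},
    {"token_str": "করলে", "score": 0.05066966637969017},
    {"token_str": "লাগলে", "score": 0.02641388401389122},
]

for token, score in summarize_predictions(sample):
    print(f"{token}\t{score:.3f}")
```

In the notebook, `summarize_predictions(pipeline(raw_text))` would apply the same filtering to live predictions.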