# Masked LM

<div class="alert alert-info">

This tutorial is available as an IPython notebook at [Malaya/example/mlm](https://github.com/huseinzol05/Malaya/tree/master/example/mlm).
    
</div>

Reference: Masked Language Model Scoring, https://arxiv.org/abs/1910.14659

Malaya can use BERT, ALBERT and RoBERTa models from HuggingFace to score text.
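To make the idea of scoring concrete, the referenced paper defines a pseudo-log-likelihood: mask each token in turn and sum the log-probability the model assigns to the original token. The sketch below illustrates that definition only. `toy_predict` is a hypothetical stand-in for a real masked LM, and nothing here is claimed to match Malaya's internal implementation.

```python
import math
from typing import Callable, Dict, List

MASK = "[MASK]"

# A masked-token predictor: given tokens with one position masked,
# return a probability for each candidate word at that position.
Predictor = Callable[[List[str], int], Dict[str, float]]


def pseudo_log_likelihood(tokens: List[str], predict: Predictor) -> float:
    """Sum of log P(token_i | all other tokens), masking one position at a time."""
    total = 0.0
    for i, target in enumerate(tokens):
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        probs = predict(masked, i)
        total += math.log(probs[target])
    return total


def toy_predict(masked_tokens: List[str], position: int) -> Dict[str, float]:
    """Hypothetical stand-in for a real MLM: a fixed distribution that ignores context."""
    return {"saya": 0.4, "suka": 0.3, "awak": 0.2, "suke": 0.1}


good = pseudo_log_likelihood("saya suka awak".split(), toy_predict)
bad = pseudo_log_likelihood("saya suke awak".split(), toy_predict)
print(good > bad)  # True: the likelier word gives a higher (less negative) score
```

Because the score is a sum of log-probabilities, it is always negative or zero, and values closer to zero indicate text the model finds more plausible.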

In [1]:
import malaya


### Dependency

Make sure `transformers` is installed:

```bash
pip3 install transformers
```

### List available MLM models

In [2]:
malaya.language_model.available_mlm()

| Model | Size (MB) |
|---|---|
| malay-huggingface/bert-base-bahasa-cased | 310.0 |
| malay-huggingface/bert-tiny-bahasa-cased | 66.1 |
| malay-huggingface/albert-base-bahasa-cased | 45.9 |
| malay-huggingface/albert-tiny-bahasa-cased | 22.6 |
| mesolitica/roberta-base-bahasa-cased | 443.0 |
| mesolitica/roberta-tiny-bahasa-cased | 66.1 |


### Load MLM model

```python
def mlm(model: str = 'malay-huggingface/bert-tiny-bahasa-cased', force_check: bool = True, **kwargs):
    """
    Load Masked language model.

    Parameters
    ----------
    model: str, optional (default='malay-huggingface/bert-tiny-bahasa-cased')
        Check available models at `malaya.language_model.available_mlm()`.
    force_check: bool, optional (default=True)
        Check that the model is one of the Malaya models.
        Set to False to load your own HuggingFace model.

    Returns
    -------
    result: malaya.torch_model.mask_lm.MLMScorer class
    """
```

To load a HuggingFace model that is not in the list above into `malaya.torch_model.mask_lm.MLMScorer`, set `force_check=False`.
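As a minimal sketch of that option, the helper below wraps the `mlm` call with `force_check=False`. The helper name `load_hf_scorer` and the model name in the commented example are illustrations only, and it is an assumption that `MLMScorer` can handle any given HuggingFace architecture.

```python
def load_hf_scorer(model_name: str):
    """Load a HuggingFace masked LM that is not in Malaya's own model list."""
    import malaya  # imported lazily so the helper can be defined without loading Malaya

    # force_check=False skips the check that `model_name` is a Malaya model.
    return malaya.language_model.mlm(model=model_name, force_check=False)


# Example call (downloads weights; the model name is only an illustration):
# scorer = load_hf_scorer('bert-base-multilingual-cased')
# scorer.score('saya suka awak')
```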

In [3]:
model = malaya.language_model.mlm(model = 'malay-huggingface/bert-tiny-bahasa-cased')

In [4]:
model.score('saya suke awak')

-41.13948

In [5]:
model.score('saya suka awak')

-23.288536

In [6]:
model.score('najib razak')

-45.52244

In [7]:
model.score('najib comel')

-27.509785
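Since a higher (less negative) score means more plausible text, the scores can be used to choose between candidate sentences, for example spelling variants. The sketch below shows one way to do this. `rank_by_score` is a hypothetical helper, and the stand-in scorer simply replays two of the outputs shown above so the example runs without loading a model.

```python
from typing import Callable, Iterable, List, Tuple


def rank_by_score(
    candidates: Iterable[str], score: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Return (text, score) pairs sorted from most to least plausible."""
    scored = [(text, score(text)) for text in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Stand-in scorer that replays the outputs shown above; with a loaded model,
# pass `model.score` instead.
recorded = {"saya suke awak": -41.13948, "saya suka awak": -23.288536}
ranking = rank_by_score(recorded, recorded.__getitem__)
print(ranking[0][0])  # saya suka awak
```

Scores are summed over tokens, so comparisons are most meaningful between candidates of similar length.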