# Fill-Mask


### **Masked language modeling** 

---

Masked language modeling is the task of masking some of the words in a sentence and predicting which words should replace those masks.

These models are useful when we want a statistical understanding of the language on which the model was trained.

### **Use Cases**

--- 

#### **Domain Adaptation 👩‍⚕️**



* Masked language models do not require labelled data.

* They are trained by masking a few words in each sentence; the model is then expected to guess the masked words.

* For example, masked language modeling is used to train large models for domain-specific problems. 

* If you have to work on a domain-specific task, such as retrieving information from medical research papers, you can train a masked language model using those papers. 📄

* The resulting model has a statistical understanding of the language used in medical research papers. It can then be further trained, in a process called fine-tuning, on tasks such as Text Classification or Question Answering to build an information extraction system for medical research papers.

* 👩‍⚕️ Pre-training on domain-specific data tends to yield better results.
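The masking step described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the procedure any particular model uses: it masks whole whitespace-separated words, whereas real pre-training works on subword tokens (in 🤗 Transformers this is typically handled by `DataCollatorForLanguageModeling`). The function name `mask_tokens`, the `MASK_TOKEN` constant, and the example sentence are all hypothetical.

```python
import random

# Assumption: a RoBERTa-style mask token, matching the examples in this notebook.
MASK_TOKEN = "<mask>"


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels) for one whitespace-tokenized sentence.

    labels[i] holds the original word wherever a mask was placed, else None.
    No human labelling is needed: the text itself supplies the targets.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)  # hide the word from the model
            labels.append(token)       # the model must recover this word
        else:
            masked.append(token)
            labels.append(None)        # position is not scored
    return masked, labels


# Hypothetical sentence standing in for a line from a medical paper.
sentence = "The patient was treated with a low dose of aspirin".split()
masked, labels = mask_tokens(sentence, mask_prob=0.3, seed=0)
print(" ".join(masked))
print(labels)
```

Because the labels come from the text itself, any collection of unlabelled domain documents can serve as training data.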

### **Inference with Fill-Mask Pipeline**

---

* You can use the 🤗 Transformers library fill-mask pipeline to do inference with masked language models. 

* If a model name is not provided, the pipeline will be initialized with distilroberta-base. 

* You can provide masked text and it will return a list of possible mask values ranked according to the score.
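As a minimal sketch of the points above, the pipeline can also be created with an explicit model name, and the number of returned candidates can be limited with `top_k`. The helper names `summarize` and `fill` are hypothetical, and the code assumes the input contains exactly one mask token. Note that the mask token depends on the model: `distilroberta-base` uses `<mask>`, while BERT-style models use `[MASK]`.

```python
def summarize(predictions):
    """Reduce fill-mask output dicts to (word, rounded score) pairs.

    Assumes the input had exactly one mask, so `predictions` is a flat
    list of dicts with 'token_str' and 'score' keys.
    """
    return [(p["token_str"].strip(), round(p["score"], 3)) for p in predictions]


def fill(text, model_name="distilroberta-base", top_k=3):
    """Run the fill-mask pipeline on `text` and summarize the top_k candidates."""
    # Imported lazily so that summarize() can be used without loading a model.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=model_name)
    return summarize(unmasker(text, top_k=top_k))


# Example call (downloads the model on first use):
# fill("Paris is the <mask> of France.")
```

Passing the model name explicitly keeps the notebook reproducible if the library's default model changes.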

### **Metrics for Fill-Mask**

---

* **cross_entropy** : Cross-entropy measures the difference between two probability distributions. Here, these are the model's predicted distribution over the vocabulary at a masked position and the true distribution, which puts all probability on the original word.

* **perplexity** : Perplexity is the exponential of the cross-entropy loss. For a masked language model, it reflects the probabilities the model assigns to the original words at the masked positions. Lower perplexity indicates better performance.
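The relationship between the two metrics can be sketched as follows. This assumes we already have, for each masked position, the probability the model assigned to the original word; the function names and the example probabilities are hypothetical and chosen only for illustration.

```python
import math


def cross_entropy(true_token_probs):
    """Average negative log-probability assigned to the original words."""
    return -sum(math.log(p) for p in true_token_probs) / len(true_token_probs)


def perplexity(true_token_probs):
    """Perplexity is the exponential of the average cross-entropy."""
    return math.exp(cross_entropy(true_token_probs))


# Hypothetical probabilities that two models assign to the correct words
# at the same three masked positions.
confident_model = [0.7, 0.6, 0.8]
uncertain_model = [0.1, 0.05, 0.2]

print(perplexity(confident_model))  # lower value: better predictions
print(perplexity(uncertain_model))  # higher value: worse predictions
```

A model that assigned probability 1 to every original word would reach a cross-entropy of 0 and a perplexity of 1, the best possible values.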

---

##### Importing Transformers

In [1]:
from transformers import pipeline

##### Initializing Pipeline for Fill-Mask Task

In [2]:
classifier = pipeline("fill-mask")

##### Sample run

In [3]:
classifier("Paris is the <mask> of France.")

[{'sequence': 'Paris is the capital of France.',
  'score': 0.6790187954902649,
  'token': 812,
  'token_str': ' capital'},
 {'sequence': 'Paris is the birthplace of France.',
  'score': 0.051779814064502716,
  'token': 32357,
  'token_str': ' birthplace'},
 {'sequence': 'Paris is the heart of France.',
  'score': 0.03825274854898453,
  'token': 1144,
  'token_str': ' heart'},
 {'sequence': 'Paris is the envy of France.',
  'score': 0.024349061772227287,
  'token': 29778,
  'token_str': ' envy'},
 {'sequence': 'Paris is the Capital of France.',
  'score': 0.022851280868053436,
  'token': 1867,
  'token_str': ' Capital'}]

In [4]:
classifier("The goal of life is <mask>.")

[{'sequence': 'The goal of life is happiness.',
  'score': 0.06897156685590744,
  'token': 11098,
  'token_str': ' happiness'},
 {'sequence': 'The goal of life is immortality.',
  'score': 0.06554929167032242,
  'token': 45075,
  'token_str': ' immortality'},
 {'sequence': 'The goal of life is yours.',
  'score': 0.0323575995862484,
  'token': 14314,
  'token_str': ' yours'},
 {'sequence': 'The goal of life is liberation.',
  'score': 0.024313777685165405,
  'token': 22211,
  'token_str': ' liberation'},
 {'sequence': 'The goal of life is simplicity.',
  'score': 0.023767927661538124,
  'token': 25342,
  'token_str': ' simplicity'}]

In [5]:
classifier("True love never <mask>.")

[{'sequence': 'True love never dies.',
  'score': 0.34872037172317505,
  'token': 8524,
  'token_str': ' dies'},
 {'sequence': 'True love never fails.',
  'score': 0.17603053152561188,
  'token': 10578,
  'token_str': ' fails'},
 {'sequence': 'True love never ends.',
  'score': 0.11449392884969711,
  'token': 3587,
  'token_str': ' ends'},
 {'sequence': 'True love never stops.',
  'score': 0.05086619779467583,
  'token': 6897,
  'token_str': ' stops'},
 {'sequence': 'True love never sleeps.',
  'score': 0.04755270481109619,
  'token': 36831,
  'token_str': ' sleeps'}]

In [7]:
classifier("NED University is famous for <mask>.")

[{'sequence': 'NED University is famous for censorship.',
  'score': 0.03990253061056137,
  'token': 23915,
  'token_str': ' censorship'},
 {'sequence': 'NED University is famous for diversity.',
  'score': 0.02948944829404354,
  'token': 5845,
  'token_str': ' diversity'},
 {'sequence': 'NED University is famous for excellence.',
  'score': 0.01603603921830654,
  'token': 12411,
  'token_str': ' excellence'},
 {'sequence': 'NED University is famous for academics.',
  'score': 0.014674575999379158,
  'token': 16839,
  'token_str': ' academics'},
 {'sequence': 'NED University is famous for athletics.',
  'score': 0.014362143352627754,
  'token': 16015,
  'token_str': ' athletics'}]

In [9]:
classifier("Karachi has the world's largest <mask>.")

[{'sequence': "Karachi has the world's largest population.",
  'score': 0.3576110601425171,
  'token': 1956,
  'token_str': ' population'},
 {'sequence': "Karachi has the world's largest airport.",
  'score': 0.09124165773391724,
  'token': 3062,
  'token_str': ' airport'},
 {'sequence': "Karachi has the world's largest mosque.",
  'score': 0.07742006331682205,
  'token': 12958,
  'token_str': ' mosque'},
 {'sequence': "Karachi has the world's largest refinery.",
  'score': 0.027445845305919647,
  'token': 13628,
  'token_str': ' refinery'},
 {'sequence': "Karachi has the world's largest economy.",
  'score': 0.02396676503121853,
  'token': 866,
  'token_str': ' economy'}]

In [13]:
classifier("Canada is famous for <mask>.")

[{'sequence': 'Canada is famous for earthquakes.',
  'score': 0.023486588150262833,
  'token': 20396,
  'token_str': ' earthquakes'},
 {'sequence': 'Canada is famous for diversity.',
  'score': 0.02198144793510437,
  'token': 5845,
  'token_str': ' diversity'},
 {'sequence': 'Canada is famous for corruption.',
  'score': 0.01776721514761448,
  'token': 3198,
  'token_str': ' corruption'},
 {'sequence': 'Canada is famous for doping.',
  'score': 0.015277360565960407,
  'token': 16485,
  'token_str': ' doping'},
 {'sequence': 'Canada is famous for cycling.',
  'score': 0.015154481865465641,
  'token': 12731,
  'token_str': ' cycling'}]

In [14]:
classifier("Students are very <mask>.")

[{'sequence': 'Students are very welcome.',
  'score': 0.04593575745820999,
  'token': 2814,
  'token_str': ' welcome'},
 {'sequence': 'Students are very excited.',
  'score': 0.038177426904439926,
  'token': 2283,
  'token_str': ' excited'},
 {'sequence': 'Students are very polite.',
  'score': 0.03741106018424034,
  'token': 24908,
  'token_str': ' polite'},
 {'sequence': 'Students are very supportive.',
  'score': 0.030727164819836617,
  'token': 8440,
  'token_str': ' supportive'},
 {'sequence': 'Students are very grateful.',
  'score': 0.030097436159849167,
  'token': 6161,
  'token_str': ' grateful'}]

In [16]:
classifier("The <mask> barked at me.")

[{'sequence': 'The wolf barked at me.',
  'score': 0.098700150847435,
  'token': 23255,
  'token_str': ' wolf'},
 {'sequence': 'The dog barked at me.',
  'score': 0.08881120383739471,
  'token': 2335,
  'token_str': ' dog'},
 {'sequence': 'The cat barked at me.',
  'score': 0.06390288472175598,
  'token': 4758,
  'token_str': ' cat'},
 {'sequence': 'The fox barked at me.',
  'score': 0.04156745225191116,
  'token': 23602,
  'token_str': ' fox'},
 {'sequence': 'The owl barked at me.',
  'score': 0.027686724439263344,
  'token': 37323,
  'token_str': ' owl'}]

### **Source**

* https://huggingface.co/tasks/fill-mask