Source: https://huggingface.co/learn/nlp-course/chapter1/8?fw=pt

# Bias and limitations

If your intent is to use a pretrained model or a fine-tuned version in production, please be aware that, while these models are powerful tools, they come with limitations. The biggest of these is that, to enable pretraining on large amounts of data, <span style="color:blue">researchers often scrape all the content they can find, <span style="color:green">taking the best </span><span style="color:red">as well as the worst</span> of what is available on the internet</span>.

To give a quick illustration, let’s go back to the example of a `fill-mask` pipeline with the BERT model:

```python
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

result = unmasker("This man works as a [MASK].")
print([r["token_str"] for r in result])

result = unmasker("This woman works as a [MASK].")
print([r["token_str"] for r in result])
```


```
['carpenter', 'lawyer', 'farmer', 'businessman', 'doctor']
['nurse', 'maid', 'teacher', 'waitress', 'prostitute']
```


When asked to fill in the missing word in these two sentences, <span style="color:red">the model returns two lists with no occupation in common. Most of the suggestions are occupations stereotypically associated with one specific gender, and yes, prostitute ended up in the top 5 possibilities the model associates with “woman” and “work”</span>. This happens even though BERT is one of the rare Transformer models not built by scraping data from all over the internet, but rather using apparently neutral data (it’s trained on the English Wikipedia and BookCorpus datasets).
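One rough way to put a number on this gap is to measure how much the two prediction lists overlap. The sketch below is only an illustration: it copies the two lists from the output above, and `jaccard_overlap` is a hypothetical helper, not part of 🤗 Transformers.

```python
# Top-5 predictions copied from the output shown above.
man_jobs = ["carpenter", "lawyer", "farmer", "businessman", "doctor"]
woman_jobs = ["nurse", "maid", "teacher", "waitress", "prostitute"]


def jaccard_overlap(a, b):
    """Return |a ∩ b| / |a ∪ b| for two lists of predicted tokens."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    # Two empty lists are treated as identical.
    return len(set_a & set_b) / len(union) if union else 1.0


shared = sorted(set(man_jobs) & set(woman_jobs))
print("shared occupations:", shared)
print("Jaccard overlap:", jaccard_overlap(man_jobs, woman_jobs))
```

An overlap near 1 would mean the model suggests much the same occupations regardless of the subject; a value near 0 means the suggestions depend heavily on it. A single template is a very small sample, so treat the number as a hint rather than a measurement.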

When you use these tools, you therefore need to keep in the back of your mind that the <span style="color:red">original model you are using could very easily generate sexist, racist, or homophobic content. <b>Fine-tuning the model on your data won’t make this intrinsic bias disappear</b></span>.
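Because fine-tuning does not remove this bias, it can be worth re-running the same probe on every checkpoint you plan to ship. The following is a minimal sketch of such a probe, not an established auditing method. It assumes only that the model is exposed as a callable that behaves like the `unmasker` pipeline above (string in, list of dicts with a `"token_str"` key out). The names `probe` and `fake_fill_mask` are hypothetical, and the stand-in simply returns the first-ranked token from the output shown earlier so that the example runs without downloading a model.

```python
def probe(fill_mask, template, subjects):
    """Fill `template` once per subject and collect the predicted tokens.

    `fill_mask` is any callable that takes a string and returns a list of
    dicts with a "token_str" key, like the fill-mask pipeline used earlier.
    """
    return {
        subject: [
            pred["token_str"]
            for pred in fill_mask(template.format(subject=subject))
        ]
        for subject in subjects
    }


def fake_fill_mask(text):
    # Offline stand-in for a real pipeline (assumption for illustration only):
    # it echoes the first-ranked token from the output shown earlier.
    word = "nurse" if "woman" in text else "carpenter"
    return [{"token_str": word}]


report = probe(fake_fill_mask, "This {subject} works as a [MASK].", ["man", "woman"])
print(report)
```

To audit a real model, pass `unmasker` (or a pipeline built from your fine-tuned checkpoint) in place of `fake_fill_mask` and compare the reports before and after fine-tuning.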

# Summary

In this chapter, you saw how to approach different NLP tasks using the high-level `pipeline()` function from 🤗 Transformers. You also saw how to search for and use models in the Hub, as well as how to use the Inference API to test the models directly in your browser.

We discussed how Transformer models work at a high level, and talked about the importance of transfer learning and fine-tuning. <span style="color:blue">A key aspect is that you can use the full architecture or only the encoder or decoder, depending on what kind of task you aim to solve</span>. The following table summarizes this:

| Model | Examples | Tasks |
| :- | :- | :- |
| Encoder | ALBERT, BERT, DistilBERT, ELECTRA, RoBERTa | Sentence classification, named entity recognition, extractive question answering |
| Decoder | CTRL, GPT, GPT-2, Transformer XL | Text generation |
| Encoder-decoder | BART, T5, Marian, mBART | Summarization, translation, generative question answering |
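The table can also be read in the other direction: start from a task and look up which family is usually a good fit. The sketch below encodes the rows as a plain dictionary. The `pipeline()` task identifiers are my own assumed mapping from the task names in the table (generative question answering is left out because it has no obvious one-to-one identifier), and `families_for` is a hypothetical helper.

```python
# Rows of the table above. The task strings are pipeline() identifiers chosen
# as an assumed mapping from the table's task names.
ARCHITECTURES = {
    "encoder": {
        "examples": ["ALBERT", "BERT", "DistilBERT", "ELECTRA", "RoBERTa"],
        "tasks": ["text-classification", "ner", "question-answering"],
    },
    "decoder": {
        "examples": ["CTRL", "GPT", "GPT-2", "Transformer XL"],
        "tasks": ["text-generation"],
    },
    "encoder-decoder": {
        "examples": ["BART", "T5", "Marian", "mBART"],
        "tasks": ["summarization", "translation"],
    },
}


def families_for(task):
    """Return the architecture families whose row lists `task`."""
    return [name for name, row in ARCHITECTURES.items() if task in row["tasks"]]


print(families_for("summarization"))
print(families_for("ner"))
```

This is a rule of thumb, not a constraint: the table lists typical pairings, and individual models can be used outside them.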

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
result = classifier(
    "This is a course about the Transformers library",
    candidate_labels=["educational", "not educational"],
)
print(result)
```

```
No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.

{'sequence': 'This is a course about the Transformers library', 'labels': ['educational', 'not educational'], 'scores': [0.9890398979187012, 0.010960041545331478]}
```
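The result is a dictionary in which `labels` and `scores` line up by position. Here is a minimal sketch of turning it into a single decision, using the output above as literal data. `best_label` and its `threshold` parameter are hypothetical, and the 0.5 default is an arbitrary choice for illustration.

```python
# Output copied from the zero-shot example above.
result = {
    "sequence": "This is a course about the Transformers library",
    "labels": ["educational", "not educational"],
    "scores": [0.9890398979187012, 0.010960041545331478],
}


def best_label(result, threshold=0.5):
    """Return (label, score) for the highest score, or None if below `threshold`."""
    label, score = max(
        zip(result["labels"], result["scores"]), key=lambda pair: pair[1]
    )
    return (label, score) if score >= threshold else None


print(best_label(result))
```

Returning `None` below the threshold lets the caller treat low-confidence inputs separately instead of forcing every input into one of the candidate labels.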
