Link to the course: https://huggingface.co/course

# Content

# 1. Transformers Models

The `pipeline` object from the `transformers` library connects a model with all of its necessary preprocessing and postprocessing steps.

**Examples:**

In [1]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("I've been waiting for a HuggingFace course my whole life."))
print(classifier([
    "I've been waiting for a HuggingFace course my whole life.", 
    "I hate this so much!"
]))

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english)

[{'label': 'POSITIVE', 'score': 0.9598046541213989}]

A list of available pipelines can be found [here](https://huggingface.co/transformers/main_classes/pipelines.html).
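To make the three stages that `pipeline` bundles more concrete, here is a minimal pure-Python sketch. It is not how the library is implemented: the whitespace tokenizer, the hand-picked `TOY_WEIGHTS` table, and the function names are all hypothetical stand-ins for a real tokenizer and a trained model.

```python
import math

# Hypothetical hand-picked word weights; a real model learns its parameters.
TOY_WEIGHTS = {"love": 2.0, "great": 1.5, "hate": -2.5, "awful": -1.5}
LABELS = ["NEGATIVE", "POSITIVE"]


def preprocess(text):
    """Stand-in for a tokenizer: lowercase, split on spaces, strip punctuation."""
    return [tok.strip(".,!?") for tok in text.lower().split()]


def model(tokens):
    """Stand-in for the model: return one raw score (logit) per label."""
    score = sum(TOY_WEIGHTS.get(tok, 0.0) for tok in tokens)
    return [-score, score]  # logits for NEGATIVE, POSITIVE


def postprocess(logits):
    """Convert logits to probabilities with softmax and keep the best label."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    probs = [e / sum(exps) for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": LABELS[best], "score": probs[best]}


def toy_pipeline(text):
    """Chain the three stages, as a pipeline object does conceptually."""
    return postprocess(model(preprocess(text)))


print(toy_pipeline("I hate this so much!"))
print(toy_pipeline("I love this course."))
```

The returned dictionary mirrors the `label`/`score` shape of the classifier output shown above, but the scores themselves are meaningless outside this toy setup.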

## How do Transformers work?

Key concepts: language models, self-supervised learning, transfer learning.

Transformers are **language models**. This type of model develops a statistical understanding of the language it was trained on.
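As a rough illustration of what a "statistical understanding" of language can mean, the sketch below counts which word follows which in a tiny made-up corpus and turns the counts into next-word probabilities. This bigram model is far simpler than a Transformer and is only meant to show the idea of learning statistics from raw text; the corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; real language models are trained on vastly more text.
corpus = "the cat sat on the mat . the cat ate the fish ."


def train_bigram(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    tokens = text.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, word):
    """Turn the follow-up counts for `word` into probabilities."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}


counts = train_bigram(corpus)
probs = next_word_probs(counts, "the")
print(probs)  # "cat" follows "the" in 2 of 4 cases, so it gets probability 0.5
```

No labels were needed here: the text itself provides the training signal, which is the essence of self-supervised learning.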

## Encoder Models

Encoder models use only the encoder part of the transformer. Such models are best suited for tasks that require an understanding of the whole sentence (e.g., NER, sentence classification).

## Decoder Models

Decoder models use only the decoder part of the transformer. At each stage, they have access only to the words preceding the current word. These models are often called auto-regressive models. Such models are best suited for tasks that involve text generation.
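The difference between encoder and decoder models can be pictured as a difference in attention masks. The sketch below builds both masks in plain Python, where `1` means "may attend to" and `0` means "hidden". It assumes the common convention that a decoder position can see itself and everything before it while predicting the next word; the helper names and the example sentence are illustrative only.

```python
def full_mask(n):
    """Encoder-style mask: every position may attend to every position."""
    return [[1] * n for _ in range(n)]


def causal_mask(n):
    """Decoder-style mask: position i may attend only to positions 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


tokens = ["My", "name", "is", "Sylvain"]  # illustrative sentence
for name, mask in [("encoder", full_mask(len(tokens))),
                   ("decoder", causal_mask(len(tokens)))]:
    print(name)
    for tok, row in zip(tokens, mask):
        print(f"  {tok:>8}: {row}")
```

Each row of the decoder mask has zeros to the right of the diagonal, which is what prevents the model from looking at future words during generation.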

## Encoder-Decoder Models (Sequence to Sequence)

These models use both parts of the transformer architecture. Sequence-to-sequence models are best suited for tasks that revolve around generating new sentences from a given input (e.g., summarization, translation).
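The data flow of a sequence-to-sequence model can be sketched as follows: the encoder reads the whole input once, then the decoder produces the output one token at a time, looking at both the encoder output and what it has generated so far. Everything here is a hypothetical toy: a lookup table stands in for learned weights, and `encode`, `decode_step`, and `generate` are names made up for this example.

```python
# Hypothetical lookup table standing in for a trained translation model.
TOY_DICTIONARY = {"hello": "bonjour", "world": "monde"}
EOS = "<eos>"  # end-of-sequence marker


def encode(source_tokens):
    """Encoder: sees the whole input at once and returns a representation."""
    return list(source_tokens)  # here the "representation" is just the tokens


def decode_step(encoded, generated):
    """Decoder: predict the next token from the encoder output and the
    tokens generated so far."""
    position = len(generated)
    if position >= len(encoded):
        return EOS
    source_token = encoded[position]
    return TOY_DICTIONARY.get(source_token, source_token)


def generate(source_tokens, max_len=10):
    """Greedy auto-regressive loop: stop at EOS or after max_len tokens."""
    encoded = encode(source_tokens)
    generated = []
    for _ in range(max_len):
        token = decode_step(encoded, generated)
        if token == EOS:
            break
        generated.append(token)
    return generated


print(generate(["hello", "world"]))
```

A real model replaces the lookup with attention over the encoder output, but the outer loop (encode once, decode step by step) has the same shape.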

## Limitations and Bias

In the code below, note how the model generates professions that are skewed by gender. For instance, "carpenter" is often associated with men's work, while "nurse" is associated with women's work.

In [1]:
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")
result = unmasker("This man works as a [MASK].")
print([r["token_str"] for r in result])

result = unmasker("This woman works as a [MASK].")
print([r["token_str"] for r in result])

['carpenter', 'lawyer', 'farmer', 'businessman', 'doctor']
['nurse', 'maid', 'teacher', 'waitress', 'prostitute']
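One simple way to quantify how different the two prediction lists are is to measure their overlap. The sketch below copies the two lists from the output above and computes their shared items and Jaccard similarity; the `jaccard` helper is written for this example and is only one of many possible measures.

```python
# Top-5 predictions copied from the output above.
man_preds = ['carpenter', 'lawyer', 'farmer', 'businessman', 'doctor']
woman_preds = ['nurse', 'maid', 'teacher', 'waitress', 'prostitute']


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


shared = sorted(set(man_preds) & set(woman_preds))
print("shared:", shared)
print("jaccard:", jaccard(man_preds, woman_preds))
```

The two lists have no profession in common, so the similarity is zero even though the prompts differ by a single word.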


# 2. Using Hugging Face Transformers

# 3. Fine-tuning a pretrained model

# 4. Sharing models and tokenizers