### Classic Tasks
- `"audio-classification"`: will return a [`AudioClassificationPipeline`].
- `"automatic-speech-recognition"`: will return a [`AutomaticSpeechRecognitionPipeline`].
- `"conversational"`: will return a [`ConversationalPipeline`].
- `"depth-estimation"`: will return a [`DepthEstimationPipeline`].
- `"document-question-answering"`: will return a [`DocumentQuestionAnsweringPipeline`].
- `"feature-extraction"`: will return a [`FeatureExtractionPipeline`].
- `"fill-mask"`: will return a [`FillMaskPipeline`].
- `"image-classification"`: will return a [`ImageClassificationPipeline`].
- `"image-segmentation"`: will return a [`ImageSegmentationPipeline`].
- `"image-to-text"`: will return a [`ImageToTextPipeline`].
- `"mask-generation"`: will return a [`MaskGenerationPipeline`].
- `"object-detection"`: will return a [`ObjectDetectionPipeline`].
- `"question-answering"`: will return a [`QuestionAnsweringPipeline`].
- `"summarization"`: will return a [`SummarizationPipeline`].
- `"table-question-answering"`: will return a [`TableQuestionAnsweringPipeline`].
- `"text2text-generation"`: will return a [`Text2TextGenerationPipeline`].
- `"text-classification"` (alias `"sentiment-analysis"` available): will return a
  [`TextClassificationPipeline`].
- `"text-generation"`: will return a [`TextGenerationPipeline`].
- `"token-classification"` (alias `"ner"` available): will return a [`TokenClassificationPipeline`].
- `"translation"`: will return a [`TranslationPipeline`].
- `"translation_xx_to_yy"`: will return a [`TranslationPipeline`].
- `"video-classification"`: will return a [`VideoClassificationPipeline`].
- `"visual-question-answering"`: will return a [`VisualQuestionAnsweringPipeline`].
- `"zero-shot-classification"`: will return a [`ZeroShotClassificationPipeline`].
- `"zero-shot-image-classification"`: will return a [`ZeroShotImageClassificationPipeline`].
- `"zero-shot-audio-classification"`: will return a [`ZeroShotAudioClassificationPipeline`].
- `"zero-shot-object-detection"`: will return a [`ZeroShotObjectDetectionPipeline`].
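The aliases in the list above resolve to a canonical task name before a pipeline class is chosen. A minimal sketch of that lookup follows. `TASK_ALIASES` is a hand-written illustrative table holding only the two aliases named in the list, and `resolve_task` is a hypothetical helper, not the library's own function.

```python
# Illustrative alias table: only the two aliases named in the list above.
TASK_ALIASES: dict[str, str] = {
    "sentiment-analysis": "text-classification",
    "ner": "token-classification",
}


def resolve_task(task: str) -> str:
    """Map a task string to its canonical task name (hypothetical helper)."""
    # Names of the form "translation_xx_to_yy" all use the translation pipeline.
    if task.startswith("translation_"):
        return "translation"
    # Unknown strings are returned unchanged; they are assumed to be canonical.
    return TASK_ALIASES.get(task, task)


print(resolve_task("sentiment-analysis"))    # text-classification
print(resolve_task("translation_en_to_fr"))  # translation
```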

In [13]:
# Pipeline for classic NLP tasks
# Example: sentiment analysis
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("Some sentence")

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'NEGATIVE', 'score': 0.8669572472572327}]
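The `label` / `score` pair in the output is derived from the classifier's raw logits. As a rough sketch (not the pipeline's actual code), a softmax turns the logits into probabilities and the most probable class is reported. The logits and the `id2label` mapping below are made-up values for illustration, and `postprocess` is a hypothetical helper.

```python
import math


def postprocess(logits: list[float], id2label: dict[int, str]) -> list[dict]:
    """Apply a softmax to the logits and report the most probable label."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return [{"label": id2label[best], "score": probs[best]}]


# Hypothetical logits for a two-class sentiment model.
result = postprocess([0.9, -0.9], {0: "NEGATIVE", 1: "POSITIVE"})
print(result)
```

The score is a probability over the model's labels, so a value close to 0.5 in the two-class case means the model is close to undecided.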

### Tokenizer

Special Tokens: 
- [CLS]: classification token. Its output vector is used as the input to classification; because the token has no meaning of its own, it can aggregate information from the whole text more evenly than any real word.
- [PAD]: padding token
- [SEP]: separator token, marks the end of a segment
- [UNK]: unknown token
- [MASK]: mask token

Implementation: `transformers.PreTrainedTokenizer` or `transformers.PreTrainedTokenizerFast` (base classes for tokenizers)

Output format: dict
- `input_ids`: indices of the tokens in the vocabulary (tokens_tensor)
- `token_type_ids`: segment ids, 0 for the first sentence and 1 for the second sentence (segments_tensor)
- `attention_mask`: 1 marks a token that should be attended to, 0 marks padding (mask_tensor)
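To make the three fields concrete, here is a small sketch that assembles them by hand for a sentence pair padded to a fixed length. `encode_pair` is a hypothetical helper written for this note, not part of `transformers`; the token ids are the ones that appear in the cell outputs in this notebook, and the sketch assumes padding positions get segment id 0.

```python
# Token ids as they appear in this notebook's outputs for bert-base-uncased.
VOCAB: dict[str, int] = {
    "[PAD]": 0, "[CLS]": 101, "[SEP]": 102,
    "hello": 7592, ",": 1010, "bert": 14324,
}


def encode_pair(tokens_a: list[str], tokens_b: list[str], max_length: int) -> dict[str, list[int]]:
    """Build BERT-style inputs for a sentence pair (hypothetical helper)."""
    first = ["[CLS]", *tokens_a, "[SEP]"]   # segment 0
    second = [*tokens_b, "[SEP]"]           # segment 1
    tokens = first + second
    n_pad = max_length - len(tokens)
    if n_pad < 0:
        raise ValueError("pair is longer than max_length")
    return {
        "input_ids": [VOCAB[t] for t in tokens] + [VOCAB["[PAD]"]] * n_pad,
        "token_type_ids": [0] * len(first) + [1] * len(second) + [0] * n_pad,
        "attention_mask": [1] * len(tokens) + [0] * n_pad,
    }


encoded = encode_pair(["hello", ",", "bert"], ["hello"], max_length=9)
for name, values in encoded.items():
    print(name, values)
```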

In [6]:
from transformers import BertTokenizerFast
tokenizer: BertTokenizerFast = BertTokenizerFast.from_pretrained("bert-base-uncased")
type(tokenizer)

transformers.models.bert.tokenization_bert_fast.BertTokenizerFast

In [11]:
# Directly callable
raw_sentence: str = "Hello, BERT"
tokenizer(raw_sentence)

{'input_ids': [101, 7592, 1010, 14324, 102], 'token_type_ids': [0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1]}

In [12]:
tokenizer.tokenize(raw_sentence)

['hello', ',', 'bert']
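BERT tokenizers split each word into WordPiece sub-tokens with a greedy longest-match-first search, marking continuation pieces with `##`. The sketch below covers only that sub-word step (no lowercasing or punctuation splitting). `wordpiece` is a hypothetical helper and `toy_vocab` is an invented vocabulary, not the real `bert-base-uncased` one.

```python
def wordpiece(word: str, vocab: set[str], unk: str = "[UNK]") -> list[str]:
    """Greedy longest-match-first split of one word (illustrative sketch)."""
    pieces: list[str] = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation of a word
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits: the whole word becomes the unknown token.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Invented toy vocabulary for illustration only.
toy_vocab = {"hello", "bert", "py", "##tor", "##ch"}
print(wordpiece("bert", toy_vocab))     # ['bert']
print(wordpiece("pytorch", toy_vocab))  # ['py', '##tor', '##ch']
print(wordpiece("xyz", toy_vocab))      # ['[UNK]']
```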

In [14]:
encode_sentence = tokenizer.encode(raw_sentence)
encode_sentence

[101, 7592, 1010, 14324, 102]

In [17]:
tokenizer.decode(encode_sentence, skip_special_tokens=False)

'[CLS] hello, bert [SEP]'

In [20]:
tokenizer.vocab_size

30522

### BERT Model

In [8]:
from transformers import BertModel
bert: BertModel = BertModel.from_pretrained("bert-base-uncased")
type(bert)

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


transformers.models.bert.modeling_bert.BertModel

In [22]:
bert.config

BertConfig {
  "_name_or_path": "bert-base-uncased",
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.29.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
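The config values above are enough to work out the per-head dimension and the parameter count by hand. The arithmetic below is a sketch that assumes the standard BERT layout: a bias on every linear layer, LayerNorm with a weight and a bias, and a pooler on top of the encoder. The helper `linear` is introduced here for illustration.

```python
# Values copied from the BertConfig printed above.
config = {
    "vocab_size": 30522,
    "hidden_size": 768,
    "intermediate_size": 3072,
    "max_position_embeddings": 512,
    "num_attention_heads": 12,
    "num_hidden_layers": 12,
    "type_vocab_size": 2,
}

h = config["hidden_size"]
ffn = config["intermediate_size"]


def linear(n_in: int, n_out: int) -> int:
    """Parameters of a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out


layer_norm = 2 * h  # weight and bias

# Word, position and segment embeddings, followed by one LayerNorm.
embeddings = (
    config["vocab_size"] + config["max_position_embeddings"] + config["type_vocab_size"]
) * h + layer_norm

# One encoder layer: Q, K, V and output projections, then the feed-forward block.
per_layer = 4 * linear(h, h) + layer_norm + linear(h, ffn) + linear(ffn, h) + layer_norm

pooler = linear(h, h)
total = embeddings + config["num_hidden_layers"] * per_layer + pooler

head_dim = h // config["num_attention_heads"]
print(head_dim)  # 64
print(total)     # 109482240
```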

In [45]:
import torch

def to_tensor(src: dict[str, list[list[int]]]) -> dict[str, torch.Tensor]:
    # Build a new dict so the tokenizer output is not modified in place
    return {k: torch.tensor(v, dtype=torch.long) for k, v in src.items()}

corpus: list[str] = ["I like PyTorch", "Hello, BERT"]
# padding=True pads every sentence to the longest one in the batch
src = tokenizer(corpus, padding=True, truncation=True)
result_dict = bert(**to_tensor(src))
result_dict

BaseModelOutputWithPoolingAndCrossAttentions(last_hidden_state=tensor([[[ 5.7107e-02,  2.5669e-01, -9.5842e-02,  ..., -1.2954e-01,
           5.1437e-01,  5.7981e-01],
         [ 1.9249e-01,  3.8698e-01, -3.5842e-01,  ...,  3.5617e-01,
           9.1635e-01,  3.7469e-01],
         [ 1.8221e-01,  1.3408e-01,  9.1152e-01,  ..., -6.2215e-02,
           6.5364e-01,  4.2190e-01],
         ...,
         [ 4.7319e-01,  4.3217e-01,  9.9005e-01,  ..., -9.9500e-04,
           3.9321e-01,  8.0758e-01],
         [-3.5663e-01, -2.3165e-01, -5.6852e-01,  ...,  9.7276e-01,
           5.3078e-01, -3.6599e-02],
         [ 9.1939e-01,  2.7100e-01, -7.1249e-02,  ..., -7.5035e-02,
          -6.7383e-01, -1.5287e-01]],

        [[-3.1059e-02,  4.1652e-01, -2.8671e-01,  ...,  2.3529e-03,
           2.7787e-01,  4.6885e-01],
         [-2.7230e-02, -1.7579e-02,  5.1009e-02,  ...,  8.4250e-02,
           5.9173e-01, -4.7261e-02],
         [-1.0027e+00,  6.8936e-01,  4.3878e-01,  ..., -6.8877e-01,
           2.

In [46]:
# Output dimensions:
# "last_hidden_state": batch_size * token_count * 768
# "pooler_output": batch_size * 768
for k, v in result_dict.items():
    print(f'{k}: {v.shape}')

last_hidden_state: torch.Size([2, 8, 768])
pooler_output: torch.Size([2, 768])
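`pooler_output` is not simply the [CLS] row of `last_hidden_state`: in `BertModel` that row is additionally passed through a dense layer and a tanh. The sketch below reproduces the shapes with stand-ins, so the numbers are meaningless: `last_hidden_state` is random and `dense` is a freshly initialised layer, not the trained pooler weights.

```python
import torch

torch.manual_seed(0)

batch_size, seq_len, hidden_size = 2, 8, 768

# Stand-in for the encoder output; random values, not real BERT activations.
last_hidden_state = torch.randn(batch_size, seq_len, hidden_size)

# Stand-in for the pooler's dense layer (randomly initialised here).
dense = torch.nn.Linear(hidden_size, hidden_size)

with torch.no_grad():
    cls_vectors = last_hidden_state[:, 0]           # first token of each sequence
    pooler_output = torch.tanh(dense(cls_vectors))  # dense layer followed by tanh

print(last_hidden_state.shape)  # torch.Size([2, 8, 768])
print(pooler_output.shape)      # torch.Size([2, 768])
```

Because of the tanh, every entry of `pooler_output` lies between -1 and 1, which is not true of `last_hidden_state` in general.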
