# 1. Transformer models

## Natural Language Processing (NLP) and Large Language Models (LLMs)

- NLP
  - Focuses on understanding human language
  - Understands the context
  - Identifies the grammatical components
- LLMs
  - AI models trained on large text datasets, such as books, newspapers, etc.
  - "Advanced NLP"
- LLM problems
  - Hallucinations: generating plausible-sounding things that are false or do not exist
  - Bias: outputs reflect only what is in the documents/data that the model was trained on, including its skews and gaps
  - Context windows: the amount of text the model can access at once is limited. Older models: around 4K tokens. Newer models: around 128K tokens
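
To make the context-window limit concrete, here is a minimal sketch of one possible way to handle input that is too long: keep only the most recent tokens. The helper `truncate_to_window` is hypothetical, and a whitespace split stands in for a real subword tokenizer, so the token counts are only illustrative.

```python
def truncate_to_window(tokens, max_tokens):
    """Keep only the most recent `max_tokens` tokens (hypothetical helper)."""
    if max_tokens <= 0:
        return []
    return tokens[-max_tokens:]


text = "the quick brown fox jumps over the lazy dog"
tokens = text.split()  # assumption: whitespace split instead of a real tokenizer

# Pretend the model can only see 4 tokens at a time
window = truncate_to_window(tokens, max_tokens=4)
print(window)
```

Everything before the window is simply invisible to the model, which is why long documents may need chunking or summarising first.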

## Transformers, what can they do?

- Used for NLP, computer vision, audio processing
- Introduced in 2017 in the paper "Attention Is All You Need": https://arxiv.org/pdf/1706.03762
- Self-attention: lets the model focus on the relevant parts of the input
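
A minimal sketch of the self-attention idea, assuming scaled dot-product attention where queries, keys and values are all the raw token vectors. A real Transformer uses learned projections and multiple heads, which are left out here. The toy 2-dimensional "embeddings" are made up for illustration.

```python
import math


def softmax(xs):
    """Turn a list of scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (no learned weights)."""
    d = len(x[0])
    outputs, weights = [], []
    for q in x:
        # Similarity of this token to every token, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        # New representation: weighted average of all token vectors
        outputs.append([sum(wj * v[i] for wj, v in zip(w, x)) for i in range(d)])
    return outputs, weights


# Made-up embeddings: tokens 0 and 1 are similar, token 2 is different
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, attn = self_attention(tokens)
for row in attn:
    print([round(w, 3) for w in row])
```

Each printed row shows how much one token attends to every token in the input. Similar tokens get higher weights, which is the "focus on relevant parts" behaviour.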

### Examples

- Pipeline: sentiment analysis
- Pre-trained model, fine-tuned for sentiment analysis in English

In [1]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis", device=0)
result = classifier("I've been waiting for a HuggingFace course my whole life.")
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cuda:0


[{'label': 'POSITIVE', 'score': 0.9598046541213989}]
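
The `score` above is a probability. A minimal sketch of where it could come from, assuming the pipeline applies a softmax to the classifier's two raw logits and reports the highest-scoring label. The logit values below are made up and are not the model's real outputs.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["NEGATIVE", "POSITIVE"]
logits = [-1.5, 1.7]  # hypothetical logits, for illustration only

probs = softmax(logits)
best = max(range(len(labels)), key=lambda i: probs[i])

# Same shape as the pipeline output: one dict with the top label and its score
result = [{"label": labels[best], "score": probs[best]}]
print(result)
```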


In [2]:
import torch

# Check that a CUDA GPU is visible before asking for its name
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))

True
NVIDIA GeForce RTX 2060


### Zero-shot classification

- Classifies text into labels it has not been labelled with before; the candidate labels are supplied at inference time

In [3]:
classifier = pipeline("zero-shot-classification")
classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)

No model was supplied, defaulted to facebook/bart-large-mnli and revision d7645e1 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cuda:0


{'sequence': 'This is a course about the Transformers library',
 'labels': ['education', 'business', 'politics'],
 'scores': [0.844596266746521, 0.11197623610496521, 0.04342750087380409]}
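
A minimal sketch of how zero-shot classification can work with an NLI model such as the one used above. The assumption is that each candidate label is turned into a hypothesis of the form "This example is {label}." and that the entailment scores are normalised across labels with a softmax. The entailment logits below are made up; a real run would get them from the model.

```python
import math


def softmax(xs):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


sequence = "This is a course about the Transformers library"
candidate_labels = ["education", "politics", "business"]
hypothesis_template = "This example is {}."  # assumed template

# One (premise, hypothesis) pair per candidate label
pairs = [(sequence, hypothesis_template.format(label)) for label in candidate_labels]

# Hypothetical entailment logits, one per pair (not real model outputs)
entailment_logits = [2.5, -0.5, 0.5]
scores = softmax(entailment_logits)

# Sort labels from most to least likely, like the pipeline output
ranked = sorted(zip(candidate_labels, scores), key=lambda p: p[1], reverse=True)
result = {
    "sequence": sequence,
    "labels": [label for label, _ in ranked],
    "scores": [score for _, score in ranked],
}
print(result)
```

Because the labels only appear in the hypothesis text, new labels can be tried without any retraining.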