# Hugging Face 

Hugging Face is a go-to toolkit for working with pretrained models.

In [45]:
import transformers
transformers.__version__

'4.36.2'

In [46]:
import evaluate  # metrics
evaluate.__version__

'0.4.1'

In [47]:
import datasets
datasets.__version__

'2.16.0'

In [48]:
import accelerate
accelerate.__version__

'0.25.0'

## 1. Pipeline 

The pipeline is the simplest entry point in Hugging Face: pass in a task name and a pretrained model, then call the result directly for inference.

In [50]:
#sentiment analysis
from transformers import pipeline

clf = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")
clf("I do not love huggingface so much")

[{'label': 'NEGATIVE', 'score': 0.9867877960205078}]

In [51]:
clf = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
clf("This is a NLP course on Huggingface", candidate_labels=["education", "tech", "sports"])

{'sequence': 'This is a NLP course on Huggingface',
 'labels': ['tech', 'education', 'sports'],
 'scores': [0.585610032081604, 0.39745786786079407, 0.016932116821408272]}
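
Under the hood, the zero-shot pipeline reframes classification as natural language inference: each candidate label is inserted into a hypothesis template, the NLI model scores whether the input entails that hypothesis, and the entailment scores are normalized across the labels. The sketch below mimics that flow in plain Python. The entailment logits are made-up numbers for illustration, not outputs of `facebook/bart-large-mnli`, and the template string is assumed to match the pipeline default.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

sequence = "This is a NLP course on Huggingface"
candidate_labels = ["education", "tech", "sports"]

# Each label becomes an NLI hypothesis that is paired with the input sequence.
hypothesis_template = "This example is {}."  # assumed default template
hypotheses = [hypothesis_template.format(label) for label in candidate_labels]

# Hypothetical entailment logits, one per (sequence, hypothesis) pair.
# A real run would obtain these from the NLI model.
entailment_logits = [1.2, 1.6, -2.0]

# Single-label mode: softmax across labels so the scores sum to 1.
scores = softmax(entailment_logits)
ranked = sorted(zip(candidate_labels, scores), key=lambda pair: pair[1], reverse=True)

for label, score in ranked:
    print(f"{label}: {score:.3f}")
```

Because the scores are normalized across the candidate labels, adding or removing a label changes every score, not just the ranking.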

In [53]:
gen = pipeline("text-generation", model="distilgpt2")
gen("AI is transforming our everyday lives", max_length=100, num_return_sequences=2)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'AI is transforming our everyday lives into the most vibrant, safe and fulfilling life for our members.\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n'},
 {'generated_text': 'AI is transforming our everyday lives. Even the most modest, affluent, and socially-connected, people we know may feel a little bit of guilt about doing so.'}]

In [55]:
mlm = pipeline('fill-mask', model="distilroberta-base")
mlm("Chaky loves to teach deep <mask>.", top_k=3)

Some weights of the model checkpoint at distilroberta-base were not used when initializing RobertaForMaskedLM: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[{'score': 0.15232914686203003,
  'token': 2239,
  'token_str': ' learning',
  'sequence': 'Chaky loves to teach deep learning.'},
 {'score': 0.10399948805570602,
  'token': 9589,
  'token_str': ' breathing',
  'sequence': 'Chaky loves to teach deep breathing.'},
 {'score': 0.07009342312812805,
  'token': 30079,
  'token_str': ' truths',
  'sequence': 'Chaky loves to teach deep truths.'}]

In [56]:
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")
qa(question="Where to Chaky work?", context="My name is Chaky and I love to teach at AIT.")

{'score': 0.9159268140792847, 'start': 40, 'end': 43, 'answer': 'AIT'}
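
The `start` and `end` fields are character offsets into the context string, so the answer can be recovered by slicing. A minimal check, using the values copied from the output above:

```python
context = "My name is Chaky and I love to teach at AIT."

# Values copied from the pipeline output above.
result = {"score": 0.9159268140792847, "start": 40, "end": 43, "answer": "AIT"}

# `start` is inclusive and `end` is exclusive, like a Python slice.
span = context[result["start"]:result["end"]]
print(span)
```

This is useful when you need to highlight the answer inside the original text rather than just display it.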

In [58]:
# Probing gender bias: inspect which occupations the model predicts for "woman"
mlm = pipeline("fill-mask", model="distilroberta-base")
result = mlm("This woman works as a <mask>.")
result


[{'score': 0.10098593682050705,
  'token': 35698,
  'token_str': ' waitress',
  'sequence': 'This woman works as a waitress.'},
 {'score': 0.08963349461555481,
  'token': 28894,
  'token_str': ' translator',
  'sequence': 'This woman works as a translator.'},
 {'score': 0.07987944036722183,
  'token': 9008,
  'token_str': ' nurse',
  'sequence': 'This woman works as a nurse.'},
 {'score': 0.06407161056995392,
  'token': 33080,
  'token_str': ' bartender',
  'sequence': 'This woman works as a bartender.'},
 {'score': 0.04693792760372162,
  'token': 8298,
  'token_str': ' consultant',
  'sequence': 'This woman works as a consultant.'}]

## 2. Tokenization

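Tokenization is the first component of the pipeline: the tokenizer splits raw text into tokens, maps them to integer IDs, and adds the special tokens the model expects.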

In [59]:
from transformers import AutoTokenizer

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer  = AutoTokenizer.from_pretrained(checkpoint)

In [60]:
raw_inputs = ['Chaky has been waiting in queue for sushi.',
              "Huggingface can do lots of stuffs so make sure you try everything."]

In [61]:
inputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors='pt')

In [62]:
inputs

{'input_ids': tensor([[  101, 15775,  4801,  2038,  2042,  3403,  1999, 24240,  2005, 10514,
          6182,  1012,   102,     0,     0,     0,     0],
        [  101, 17662, 12172,  2064,  2079,  7167,  1997,  4933,  2015,  2061,
          2191,  2469,  2017,  3046,  2673,  1012,   102]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}
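
With `padding=True`, the shorter sequence is right-padded to the length of the longest one, and `attention_mask` marks real tokens with 1 and padding with 0 so the model can ignore the padded positions. The sketch below reproduces that behavior in plain Python. `pad_batch` is a hypothetical helper written for illustration, not part of `transformers`, and the pad ID of 0 is taken from the output above.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad token ID lists to the longest length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask

# Unpadded token IDs taken from the tokenizer output above.
first = [101, 15775, 4801, 2038, 2042, 3403, 1999, 24240, 2005, 10514,
         6182, 1012, 102]
second = [101, 17662, 12172, 2064, 2079, 7167, 1997, 4933, 2015, 2061,
          2191, 2469, 2017, 3046, 2673, 1012, 102]

input_ids, attention_mask = pad_batch([first, second])
print(input_ids[0])
print(attention_mask[0])
```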

In [64]:
tokenizer.decode([  101, 17662, 12172,  2064,  2079,  7167,  1997,  4933,  2015,  2061,
          2191,  2469,  2017,  3046,  2673,  1012,   102])

'[CLS] huggingface can do lots of stuffs so make sure you try everything. [SEP]'
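
The decoded string shows that the tokenizer lowercased the text (the checkpoint is uncased) and wrapped it in the special tokens `[CLS]` and `[SEP]`. Words missing from the vocabulary are split into subword pieces, and WordPiece marks continuation pieces with a `##` prefix. The sketch below shows how such pieces are merged back into words. The token strings are assumed for illustration rather than read from the vocabulary, and `merge_wordpieces` is a hypothetical helper, not a `transformers` function.

```python
def merge_wordpieces(tokens):
    """Join WordPiece tokens, gluing '##' continuation pieces to the previous token."""
    words = []
    for token in tokens:
        if token.startswith("##") and words:
            words[-1] += token[2:]  # drop the '##' marker and append to the last word
        else:
            words.append(token)
    return " ".join(words)

# Assumed token strings, for illustration only.
tokens = ["[CLS]", "hugging", "##face", "can", "do", "lots", "of", "stuff", "##s", "[SEP]"]
print(merge_wordpieces(tokens))
```

To see the actual pieces for a given input, `tokenizer.convert_ids_to_tokens` can be applied to the ID list.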

## 3. Model

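The model is the second component of the pipeline (after the tokenizer). `AutoModel` loads only the base Transformer, so its output is one hidden-state vector per token rather than class scores.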

In [65]:
from transformers import AutoModel

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model      = AutoModel.from_pretrained(checkpoint)

In [66]:
inputs

{'input_ids': tensor([[  101, 15775,  4801,  2038,  2042,  3403,  1999, 24240,  2005, 10514,
          6182,  1012,   102,     0,     0,     0,     0],
        [  101, 17662, 12172,  2064,  2079,  7167,  1997,  4933,  2015,  2061,
          2191,  2469,  2017,  3046,  2673,  1012,   102]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}

In [67]:
outputs = model(**inputs)

In [70]:
outputs.last_hidden_state.shape  # (batch size, sequence length, hidden size)

torch.Size([2, 17, 768])

## 4. Postprocessing

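Postprocessing is the last step of the pipeline (after the model): the classification head's raw logits are converted into probabilities with a softmax, and `id2label` maps each class index to its label.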

In [71]:
from transformers import AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model      = AutoModelForSequenceClassification.from_pretrained(checkpoint)
outputs    = model(**inputs)

In [75]:
outputs.logits

tensor([[ 2.6290, -2.1602],
        [-2.7711,  2.8129]], grad_fn=<AddmmBackward0>)

In [73]:
model.config.id2label

{0: 'NEGATIVE', 1: 'POSITIVE'}

In [76]:
import torch
predictions = torch.nn.functional.softmax(outputs.logits, dim=1)  # normalize over the class dimension

In [77]:
predictions

tensor([[0.9917, 0.0083],
        [0.0037, 0.9963]], grad_fn=<SoftmaxBackward0>)
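
To finish the postprocessing, take the most probable class for each sentence and look up its name in `id2label`. The sketch below redoes the softmax in plain Python on the logits printed above, so it runs without loading the model. It is a minimal illustration, not what the pipeline executes internally.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Logits and label mapping copied from the outputs above.
logits = [[2.6290, -2.1602],
          [-2.7711, 2.8129]]
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

results = []
for row in logits:
    probs = softmax(row)
    best = max(range(len(probs)), key=probs.__getitem__)  # index of the top class
    results.append({"label": id2label[best], "score": probs[best]})

for result in results:
    print(result["label"], f"{result['score']:.4f}")
```

The resulting list has the same label and score structure that `pipeline("sentiment-analysis")` returns for each input.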