# Play with HuggingFace 🤗 Transformers

## Installing requirements

In [1]:
%%capture
!pip install transformers

## 1- Using Pipelines

In [2]:
from transformers import pipeline

### 1-1. Pipeline creation with task

In [3]:
my_pipeline = pipeline(task='sentiment-analysis')

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
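
The warning above can be addressed by passing the model name and revision explicitly. The snippet below is a minimal sketch: the model name and revision are copied from the warning message, while `PINNED` and `build_pinned_pipeline` are hypothetical names chosen for illustration. The import is done lazily so the configuration can be inspected without loading the library.

```python
# Model name and revision are copied from the warning printed above.
# PINNED and build_pinned_pipeline are illustrative names, not part of transformers.
PINNED = {
    "task": "sentiment-analysis",
    "model": "distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "af0f99b",
}


def build_pinned_pipeline(config=PINNED):
    """Create a pipeline with an explicit model and revision.

    The first call downloads the model files, so it needs network access.
    """
    from transformers import pipeline  # lazy import: only needed when building

    return pipeline(**config)
```

Calling `my_pipeline = build_pinned_pipeline()` should then give the same default model as above without triggering the warning.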


In [4]:
my_pred = my_pipeline(["I'm very positive with HuggingFace courses",
                       "I'm not very positive with HuggingFace courses"])
print(my_pred)

[{'label': 'POSITIVE', 'score': 0.9998395442962646}, {'label': 'NEGATIVE', 'score': 0.9997095465660095}]


### 1-2. Pipeline creation with model

In [5]:
pipe = pipeline("text-classification",
                model="mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis")

In [17]:
my_pred_2 = pipe(["I'm very positive with HuggingFace courses",
                  "I'm not very positive with HuggingFace courses"])
print(my_pred_2)

[{'label': 'positive', 'score': 0.9993441700935364}, {'label': 'negative', 'score': 0.7747668027877808}]


## 2- Using Auto classes

In [7]:
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

### 2-1. Auto Tokenizer (pre-processing)

In [8]:
tokenizer = AutoTokenizer.from_pretrained("mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis")

In [9]:
inputs = ["I'm very positive with HuggingFace courses",
          "I'm not very positive with HuggingFace courses"]

processed_inputs = tokenizer(inputs, return_tensors='pt',
                             padding=True, truncation=True)

print(processed_inputs)

{'input_ids': tensor([[    0,   100,   437,   182,  1313,    19, 30581,  3923, 34892,  7484,
             2,     1],
        [    0,   100,   437,    45,   182,  1313,    19, 30581,  3923, 34892,
          7484,     2]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}
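
To make the `padding=True` behaviour visible, here is a plain-Python sketch of what the printed tensors suggest: the shorter sequence is right-padded to the length of the longest one, and the attention mask marks real tokens with 1 and padding with 0. The helper `pad_batch` is a hypothetical name, and the padding ID of 1 is an assumption read off the output above, not looked up from the tokenizer.

```python
def pad_batch(sequences, pad_id):
    """Right-pad token-ID lists to a common length and build attention masks."""
    longest = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = longest - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Token IDs copied from the printed output, with the padding stripped.
first = [0, 100, 437, 182, 1313, 19, 30581, 3923, 34892, 7484, 2]
second = [0, 100, 437, 45, 182, 1313, 19, 30581, 3923, 34892, 7484, 2]

# pad_id=1 is assumed from the output above (the token paired with mask 0).
ids, mask = pad_batch([first, second], pad_id=1)
print(ids)
print(mask)
```

The model uses the attention mask to ignore padded positions, which is why both values are passed together via `**processed_inputs`.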


### 2-2. Auto Model

In [10]:
model = AutoModelForSequenceClassification.from_pretrained("mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis")

In [11]:
outputs = model(**processed_inputs)
print(outputs)

SequenceClassifierOutput(loss=None, logits=tensor([[-2.4257, -2.7008,  5.4683],
        [ 1.6448, -1.6648,  0.2751]], grad_fn=<AddmmBackward0>), hidden_states=None, attentions=None)


### 2-3. Post-processing

In [12]:
print(model.config.id2label)

{0: 'negative', 1: 'neutral', 2: 'positive'}


In [13]:
import torch

# Normalize over the class dimension; omitting `dim` triggers a deprecation warning
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(predictions)

tensor([[3.7271e-04, 2.8308e-04, 9.9934e-01],
        [7.7477e-01, 2.8303e-02, 1.9693e-01]], grad_fn=<SoftmaxBackward0>)
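
The last step is to turn these probabilities into labels. The sketch below redoes the post-processing in plain Python so the arithmetic is explicit: a softmax over each row of logits, then an argmax mapped through `id2label`. The logits and the label mapping are copied from the outputs above; `softmax` and `decode` are hypothetical helper names.

```python
import math

# Values copied from the outputs printed above.
logits = [[-2.4257, -2.7008, 5.4683],
          [1.6448, -1.6648, 0.2751]]
id2label = {0: "negative", 1: "neutral", 2: "positive"}


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, id2label):
    """Return one {'label', 'score'} dict per row, mirroring the pipeline output."""
    results = []
    for row in logits:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append({"label": id2label[best], "score": probs[best]})
    return results


decoded = decode(logits, id2label)
for item in decoded:
    print(item["label"], item["score"])
```

Up to rounding of the copied logits, the labels and scores should agree with the pipeline result in section 1-2.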



## 3- Pipeline for question-answering

In [14]:
qa_pipeline = pipeline("question-answering")

No model was supplied, defaulted to distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [15]:
context = """
🤗 Transformers is backed by the three most popular deep learning libraries — Jax, PyTorch, and TensorFlow — with a seamless integration
between them. It's straightforward to train your models with one before loading them for inference with the other.
"""

question = "Which deep learning libraries back 🤗 Transformers?"

answer = qa_pipeline(question=question, context=context)

In [16]:
print(answer)

{'score': 0.9802603125572205, 'start': 78, 'end': 106, 'answer': 'Jax, PyTorch, and TensorFlow'}
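
The `start` and `end` fields are character offsets into `context`, so the answer can be recovered by slicing the string. The short check below uses only the values printed above and needs no model.

```python
context = """
🤗 Transformers is backed by the three most popular deep learning libraries — Jax, PyTorch, and TensorFlow — with a seamless integration
between them. It's straightforward to train your models with one before loading them for inference with the other.
"""

# Result dictionary copied from the output above.
answer = {'score': 0.9802603125572205, 'start': 78, 'end': 106,
          'answer': 'Jax, PyTorch, and TensorFlow'}

# Slice the context with the reported offsets (end is exclusive).
span = context[answer["start"]:answer["end"]]
print(span)
```

The offsets count the leading newline of the triple-quoted string, so stripping or reformatting `context` after calling the pipeline would shift them.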
