<h2 align="center">NLP Tutorial: Hugging Face Pipelines</h2>

In [1]:
from transformers import pipeline


### Sentiment Classification

In [2]:
cls = pipeline("sentiment-analysis")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cuda:0


In [3]:
cls("Pushpa 2 movie is full of violence and gave me a headache")

[{'label': 'NEGATIVE', 'score': 0.9987161159515381}]

In [4]:
cls("12th fail is such an inspiring movie")

[{'label': 'POSITIVE', 'score': 0.9983184337615967}]
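
Each call returns a list with one dict per input, holding a `label` and a `score`. The sketch below shows one possible way to post-process that output, using the two results above as sample data. The helpers `signed_score` and `confident_label`, the `"UNSURE"` fallback, and the 0.9 threshold are hypothetical choices made for illustration and are not part of Transformers.

```python
# Sample outputs copied from the two cells above.
results = [
    {"label": "NEGATIVE", "score": 0.9987161159515381},
    {"label": "POSITIVE", "score": 0.9983184337615967},
]


def signed_score(result):
    """Map a sentiment result to [-1, 1]: non-positive labels get a negative sign."""
    sign = 1.0 if result["label"] == "POSITIVE" else -1.0
    return sign * result["score"]


def confident_label(result, threshold=0.9):
    """Return the label only if the score clears the (assumed) threshold."""
    return result["label"] if result["score"] >= threshold else "UNSURE"


signed = [signed_score(r) for r in results]
labels = [confident_label(r) for r in results]
print(signed)
print(labels)
```

A signed score like this can be convenient when averaging sentiment over many reviews, since positive and negative results partly cancel out.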

**Specify Model Explicitly**

On Windows, enable Developer Mode first so that the Hugging Face cache can use symlinks: https://learn.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development

In [6]:
pipe = pipeline(model="FacebookAI/roberta-large-mnli")
pipe("This restaurant is awesome")

Some weights of the model checkpoint at FacebookAI/roberta-large-mnli were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cuda:0


[{'label': 'NEUTRAL', 'score': 0.7313137650489807}]

Note that `FacebookAI/roberta-large-mnli` is a natural language inference model, so `NEUTRAL` here is an NLI label (alongside `ENTAILMENT` and `CONTRADICTION`), not a sentiment.

### Language Translation

In [7]:
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-hi")

translation = translator("How are you?")
translation

Device set to use cuda:0


[{'translation_text': 'आप कैसे हैं?'}]

### Zero-Shot Classification

In [9]:
classifier = pipeline("zero-shot-classification")
classifier(
    "I bought the product but it is faulty, I would like to return it and get my money back",
    candidate_labels=["refund", "new order", "existing order"],
)

No model was supplied, defaulted to facebook/bart-large-mnli and revision d7645e1 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cuda:0


{'sequence': 'I bought the product but it is faulty, I would like to return it and get my money back',
 'labels': ['refund', 'existing order', 'new order'],
 'scores': [0.8122069239616394, 0.17987750470638275, 0.007915527559816837]}
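
The result pairs each candidate label with a score, sorted from highest to lowest. As a minimal sketch of how such a result might be consumed, the snippet below picks the best label and falls back to a manual-review bucket when the model is not confident. The helper `top_label`, the 0.5 cutoff, and the `"needs review"` fallback are assumptions for illustration only.

```python
# Result dict copied from the cell above.
result = {
    "sequence": "I bought the product but it is faulty, I would like to return it and get my money back",
    "labels": ["refund", "existing order", "new order"],
    "scores": [0.8122069239616394, 0.17987750470638275, 0.007915527559816837],
}


def top_label(result, min_score=0.5, fallback="needs review"):
    """Pick the highest-scoring label, or a fallback when no score is high enough."""
    # max() is used instead of result["labels"][0] so the helper does not depend on ordering.
    best_label, best_score = max(
        zip(result["labels"], result["scores"]), key=lambda pair: pair[1]
    )
    return best_label if best_score >= min_score else fallback


print(top_label(result))
print(top_label(result, min_score=0.9))
```

Raising `min_score` trades coverage for precision: fewer tickets are routed automatically, but those that are routed carry a higher score.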

### Text Generation

In [13]:
generator = pipeline("text-generation")
generator("To become happy in life, we need to focus on healthy diet and ")

No model was supplied, defaulted to openai-community/gpt2 and revision 607a30d (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cuda:0
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "To become happy in life, we need to focus on healthy diet and \xa0improve the productivity of our lives, not on a lifestyle that we simply don't care about.\nSome of you have suggested that you should be reading the book you are"}]
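
The `generated_text` field echoes the prompt and can contain non-breaking spaces (`\xa0`) and newlines. The sketch below, which uses the output above as sample data, strips the prompt and normalises whitespace in plain Python. The helper name `continuation` is hypothetical. The text-generation pipeline also accepts `return_full_text=False` to omit the prompt at generation time, which may be preferable when you control the call.

```python
prompt = "To become happy in life, we need to focus on healthy diet and "

# Output copied from the cell above.
outputs = [{"generated_text": "To become happy in life, we need to focus on healthy diet and \xa0improve the productivity of our lives, not on a lifestyle that we simply don't care about.\nSome of you have suggested that you should be reading the book you are"}]


def continuation(prompt, generated_text):
    """Drop the echoed prompt and normalise whitespace in the remainder."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    # Replace non-breaking spaces explicitly, then collapse whitespace runs
    # (including newlines) into single spaces.
    return " ".join(generated_text.replace("\xa0", " ").split())


new_text = continuation(prompt, outputs[0]["generated_text"])
print(new_text)
```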

### Named Entity Recognition (NER)

In [17]:
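# aggregation_strategy="simple" merges word pieces into whole entities;
# it replaces the deprecated grouped_entities=True argument.
ner = pipeline("ner", aggregation_strategy="simple")
ner("I am Dhaval, I work for Codebasics and live in New Jersey, USA")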

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision 4c53496 (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cuda:0


[{'entity_group': 'PER',
  'score': 0.9971852,
  'word': 'Dhaval',
  'start': 5,
  'end': 11},
 {'entity_group': 'ORG',
  'score': 0.9914946,
  'word': 'Codebasics',
  'start': 24,
  'end': 34},
 {'entity_group': 'LOC',
  'score': 0.99895823,
  'word': 'New Jersey',
  'start': 47,
  'end': 57},
 {'entity_group': 'LOC',
  'score': 0.99873096,
  'word': 'USA',
  'start': 59,
  'end': 62}]
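
Each entity carries `start` and `end` character offsets into the input string, so the original span can be recovered by slicing. The sketch below uses the output above as sample data to group entity words by type and to check the offsets. The helper `group_by_type` is a hypothetical name for illustration.

```python
from collections import defaultdict

text = "I am Dhaval, I work for Codebasics and live in New Jersey, USA"

# Entities copied from the output above.
entities = [
    {"entity_group": "PER", "score": 0.9971852, "word": "Dhaval", "start": 5, "end": 11},
    {"entity_group": "ORG", "score": 0.9914946, "word": "Codebasics", "start": 24, "end": 34},
    {"entity_group": "LOC", "score": 0.99895823, "word": "New Jersey", "start": 47, "end": 57},
    {"entity_group": "LOC", "score": 0.99873096, "word": "USA", "start": 59, "end": 62},
]


def group_by_type(entities):
    """Collect entity words under their entity_group, preserving order."""
    grouped = defaultdict(list)
    for ent in entities:
        grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


by_type = group_by_type(entities)

# Slicing the input with start/end recovers each entity's surface text.
spans = [text[ent["start"]:ent["end"]] for ent in entities]

print(by_type)
print(spans)
```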

## Exercise

Try other tasks, such as fill-mask, question answering, and sentence similarity, on your own text.