In [30]:
from transformers import pipeline

# Sentiment Analysis
## Classify text as positive or negative

In [31]:
classifier = pipeline('sentiment-analysis')

classifier(
    ["This is going to be interesting", "I hate to have to do this", "I dont dislike this"]
)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


[{'label': 'POSITIVE', 'score': 0.9996521472930908},
 {'label': 'NEGATIVE', 'score': 0.8994091153144836},
 {'label': 'POSITIVE', 'score': 0.8432842493057251}]
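
The third input shows why the score is worth checking: the double negative "I dont dislike this" is labeled POSITIVE, but with a lower score than the first sentence. Below is a minimal sketch of flagging low-confidence predictions for review, with the output above copied in as plain data. The 0.9 cutoff and the names `THRESHOLD` and `needs_review` are illustrative assumptions, not part of the Transformers API.

```python
# Output of the sentiment-analysis cell above, copied in as plain data.
texts = [
    "This is going to be interesting",
    "I hate to have to do this",
    "I dont dislike this",
]
results = [
    {"label": "POSITIVE", "score": 0.9996521472930908},
    {"label": "NEGATIVE", "score": 0.8994091153144836},
    {"label": "POSITIVE", "score": 0.8432842493057251},
]

THRESHOLD = 0.9  # hypothetical cutoff; tune it for your own data

# Pair each input with its prediction and keep those below the cutoff.
needs_review = [
    (text, result["label"], result["score"])
    for text, result in zip(texts, results)
    if result["score"] < THRESHOLD
]

for text, label, score in needs_review:
    print(f"{label} ({score:.3f}): {text}")
```

With this cutoff, the second and third sentences are flagged and the first is accepted as is.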

In [29]:
summarizer = pipeline("summarization", device=0)

summarizer("That does not seem to bad")

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.
Your max_length is set to 142, but your input_length is only 8. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=4)


[{'summary_text': " That does not seem to seem to bad. That does seem to be a bad thing. That doesn't seem to matter. That's a good thing. It's a long way to go, but it's not a long road back to the top of the world's most famous buildings ."}]

In [22]:
classifier = pipeline("zero-shot-classification", device=0) # device=0 will use the first GPU
classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)

No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.


{'sequence': 'This is a course about the Transformers library',
 'labels': ['education', 'business', 'politics'],
 'scores': [0.8445970416069031, 0.11197544634342194, 0.04342753440141678]}
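
In the result above, `labels` and `scores` are parallel lists, ordered here from highest to lowest score, and the three scores sum to approximately 1. A small sketch of turning them into a lookup table and picking the best label follows; `label_scores` and `top_label` are names chosen for illustration only.

```python
# Output of the zero-shot-classification cell above, copied in as plain data.
result = {
    "sequence": "This is a course about the Transformers library",
    "labels": ["education", "business", "politics"],
    "scores": [0.8445970416069031, 0.11197544634342194, 0.04342753440141678],
}

# Pair each candidate label with its score.
label_scores = dict(zip(result["labels"], result["scores"]))

# Pick the label with the highest score rather than relying on list order.
top_label = max(label_scores, key=label_scores.get)
total = sum(result["scores"])

print(f"top label: {top_label} ({label_scores[top_label]:.3f})")
print(f"sum of scores: {total:.6f}")
```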

In [27]:
generator = pipeline("text-generation", device=0, model="distilgpt2")

generator(
    "I think that we will",
    max_length=30,
    num_return_sequences=10,  # number of different sequences to generate
)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'I think that we will never find a way to protect the public.\u200d'},
 {'generated_text': 'I think that we will learn through this experience, and that we can learn from the experiences. If we can get that insight, and that we can'},
 {'generated_text': 'I think that we will be talking about this when it’s time to meet.” We think that.\n\nOn October 25th'},
 {'generated_text': "I think that we will get back to the game a little bit. I'm not gonna have to worry about that, I'm just gonna be more"},
 {'generated_text': 'I think that we will be a much richer world."'},
 {'generated_text': 'I think that we will face a new reality.\n\n\nAdvertisements'},
 {'generated_text': 'I think that we will both stand by each other and have a great year. And that we will both strive to be the best we can together."'},
 {'generated_text': 'I think that we will see how it changes when it comes to the American people and the economy."'},
 {'generated_text': 'I think that we will need 

# Fill Mask
### Fills in the blanks in a given text

In [28]:
unmasker = pipeline("fill-mask", device=0)

unmasker(
    "This course will teach you all about <mask> models.",  # <mask> is the placeholder token the model fills in
    top_k=2,  # number of predictions to return
)

No model was supplied, defaulted to distilbert/distilroberta-base and revision ec58a5b (https://huggingface.co/distilbert/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at distilbert/distilroberta-base were not used when initializing RobertaForMaskedLM: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[{'score': 0.19198563694953918,
  'token': 30412,
  'token_str': ' mathematical',
  'sequence': 'This course will teach you all about mathematical models.'},
 {'score': 0.042092159390449524,
  'token': 38163,
  'token_str': ' computational',
  'sequence': 'This course will teach you all about computational models.'}]
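
In each prediction, `token_str` is the text of the predicted token (with a leading space here) and `sequence` is the input with `<mask>` filled in. The sketch below only illustrates that relationship using plain string replacement on the output above; it is an assumption-level illustration, not a description of how the pipeline builds `sequence` internally.

```python
# Input and output of the fill-mask cell above, copied in as plain data.
text = "This course will teach you all about <mask> models."
predictions = [
    {
        "score": 0.19198563694953918,
        "token": 30412,
        "token_str": " mathematical",
        "sequence": "This course will teach you all about mathematical models.",
    },
    {
        "score": 0.042092159390449524,
        "token": 38163,
        "token_str": " computational",
        "sequence": "This course will teach you all about computational models.",
    },
]

# Substitute each predicted token (minus its leading space) for the mask.
filled_sequences = [
    text.replace("<mask>", p["token_str"].strip()) for p in predictions
]

for p, filled in zip(predictions, filled_sequences):
    print(f"{p['score']:.3f}  {filled}")
```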

# Named Entity Recognition
### NER is a task where the model finds which parts of the input text correspond to entities such as persons, locations, or organizations

In [32]:
ner = pipeline("ner", aggregation_strategy="simple", device=0)  # replaces the deprecated grouped_entities=True
ner("My name is Sylvain and I work at Hugging Face in Brooklyn.")

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[{'entity_group': 'PER',
  'score': np.float32(0.9981694),
  'word': 'Sylvain',
  'start': 11,
  'end': 18},
 {'entity_group': 'ORG',
  'score': np.float32(0.9796019),
  'word': 'Hugging Face',
  'start': 33,
  'end': 45},
 {'entity_group': 'LOC',
  'score': np.float32(0.9932106),
  'word': 'Brooklyn',
  'start': 49,
  'end': 57}]
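
The `start` and `end` values are character offsets into the input string, with `end` exclusive, so slicing the original text recovers each entity. A short sketch using the output above as plain data follows; the scores are written as ordinary floats to avoid a NumPy dependency, and `spans` is an illustrative name.

```python
# Input and output of the NER cell above, copied in as plain data.
text = "My name is Sylvain and I work at Hugging Face in Brooklyn."
entities = [
    {"entity_group": "PER", "score": 0.9981694, "word": "Sylvain", "start": 11, "end": 18},
    {"entity_group": "ORG", "score": 0.9796019, "word": "Hugging Face", "start": 33, "end": 45},
    {"entity_group": "LOC", "score": 0.9932106, "word": "Brooklyn", "start": 49, "end": 57},
]

# Slice the original text with each entity's character offsets.
spans = [text[e["start"]:e["end"]] for e in entities]

for e, span in zip(entities, spans):
    print(f"{e['entity_group']}: {span!r} (chars {e['start']}-{e['end']})")
```

Working from offsets rather than `word` is useful when you need to highlight or redact entities in the original text.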