In [2]:
# !pip install datasets evaluate transformers[sentencepiece]

You can load a model suited to a given task with the pipeline() function from transformers. If no model is named, a default checkpoint for that task is downloaded.

**SENTIMENT ANALYSIS**

In [None]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("Sometimes, all we need to do is be present and be aware")

# Uses the distilbert/distilbert-base-uncased-finetuned-sst-2-english model if no model is specified

In [6]:
classifier(
    ["I hate how quickly I forget the way heaps work", "Hashmaps are the best thing ever"]
)

[{'label': 'NEGATIVE', 'score': 0.9986183643341064},
 {'label': 'POSITIVE', 'score': 0.9998331069946289}]
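When a list is passed, the predictions come back in the same order as the inputs, so they can be zipped back onto the original texts. The following is a minimal sketch that works on the output copied from the cell above; `pair_predictions` is a hypothetical helper name, not part of transformers.

```python
# Inputs and predictions copied from the cell above.
texts = [
    "I hate how quickly I forget the way heaps work",
    "Hashmaps are the best thing ever",
]
predictions = [
    {"label": "NEGATIVE", "score": 0.9986183643341064},
    {"label": "POSITIVE", "score": 0.9998331069946289},
]


def pair_predictions(texts, predictions):
    """Attach each prediction to the text it was made for (hypothetical helper)."""
    return [
        {"text": text, "label": pred["label"], "score": round(pred["score"], 4)}
        for text, pred in zip(texts, predictions)
    ]


rows = pair_predictions(texts, predictions)
for row in rows:
    print(f'{row["label"]:<8} {row["score"]:.4f}  {row["text"]}')
```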

**ZERO-SHOT CLASSIFICATION**

In [10]:
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
classifier(
    "I hope that one day I could meet Martin Lurther King",
    candidate_labels=["anticipation", "politics", "business", "super mario"],
)

# Uses the facebook/bart-large-mnli model if no model is specified

No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.


{'sequence': 'I hope that one day I could meet Martin Lurther King',
 'labels': ['anticipation', 'politics', 'business', 'super mario'],
 'scores': [0.9636815786361694,
  0.01820194162428379,
  0.0123277073726058,
  0.005788732785731554]}
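The output above lists the labels from most to least likely, with one score per label. In the default single-label mode the scores appear to be normalized across the candidate labels, so they should sum to roughly 1. A small sketch that reads the top label and checks that assumption, using the result copied verbatim from the cell above:

```python
# Result copied from the zero-shot cell above.
result = {
    "sequence": "I hope that one day I could meet Martin Lurther King",
    "labels": ["anticipation", "politics", "business", "super mario"],
    "scores": [
        0.9636815786361694,
        0.01820194162428379,
        0.0123277073726058,
        0.005788732785731554,
    ],
}

# Labels and scores are parallel lists, so zip pairs them up.
ranked = list(zip(result["labels"], result["scores"]))
top_label, top_score = ranked[0]

# Assumption being checked: single-label scores sum to about 1.
total = sum(result["scores"])

print(f"top label: {top_label} ({top_score:.3f})")
print(f"sum of scores: {total:.6f}")
```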

**TEXT GENERATION**

In [11]:
from transformers import pipeline

generator = pipeline("text-generation")
generator("The day I walk down")

# Uses the openai-community/gpt2 model if no model is specified

No model was supplied, defaulted to openai-community/gpt2 and revision 6c0e608 (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "The day I walk down the hall I hear him talking about doing what he loves -- reading my letters. He's really trying to write good stuff. I'm telling him I'll write you something. I'll give him his favorite thing, but his"}]
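As the output shows, `generated_text` starts with the prompt itself. If only the continuation is wanted, the prompt can be sliced off afterwards. This is a sketch on the output copied from the cell above; `continuation_only` is a hypothetical helper, and the text-generation pipeline's `return_full_text=False` argument should give a similar result at generation time.

```python
prompt = "The day I walk down"

# Output copied from the cell above.
outputs = [
    {
        "generated_text": (
            "The day I walk down the hall I hear him talking about doing what he "
            "loves -- reading my letters. He's really trying to write good stuff. "
            "I'm telling him I'll write you something. I'll give him his favorite "
            "thing, but his"
        )
    }
]


def continuation_only(prompt, generated_text):
    """Return the generated text without the leading prompt (hypothetical helper)."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    return generated_text


continuation = continuation_only(prompt, outputs[0]["generated_text"])
print(continuation)
```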

In [15]:
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
    "The day I walk down",
    max_length=100,
    num_return_sequences=3,
)

# After authenticating with a Hugging Face access token, gated models such as Llama 3 and Mistral-7B-Instruct-v0.3 can also be loaded at no cost, once access has been granted on their model pages.

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "The day I walk down from here, I’m talking to the other guy.’ I’m sitting on my desk, sitting on his side in my chair. His eyebrows are raised. He seems to grow more relaxed while I work toward it. He looks around, his hands clasped. He looks around, his legs trembling as I walk down. I get up and sit over him. I'm trying to explain why you’re talking of your mother and his father"},
 {'generated_text': 'The day I walk down to this church, the men ask how many men in this world are of the same birth who come to this temple, and it takes two women to say, \'Are you married?\' and for they tell us \'We can\'t have a temple where we go and we can\'t have our first one.\'"\n\n\n\nOn this Saturday afternoon, we are taking a bus tour on our way to my Temple, where we get our chance to live with those who are still'},
 {'generated_text': 'The day I walk down Fifth Avenue and found the intersection. I was just so excited to get out there the next day."\n\n\n\n\nWhile I wal

**FILL-MASK (models that fill in a masked placeholder token)**

In [16]:
from transformers import pipeline

unmasker = pipeline("fill-mask")
unmasker("What I love about Spiderman is how <mask> he is", top_k=3)

# Uses the distilbert/distilroberta-base model if no model is specified

No model was supplied, defaulted to distilbert/distilroberta-base and revision ec58a5b (https://huggingface.co/distilbert/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.


Some weights of the model checkpoint at distilbert/distilroberta-base were not used when initializing RobertaForMaskedLM: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[{'score': 0.07190816849470139,
  'token': 6344,
  'token_str': ' awesome',
  'sequence': 'What I love about Spiderman is how awesome he is'},
 {'score': 0.04135068506002426,
  'token': 38525,
  'token_str': ' badass',
  'sequence': 'What I love about Spiderman is how badass he is'},
 {'score': 0.03186088800430298,
  'token': 3035,
  'token_str': ' cool',
  'sequence': 'What I love about Spiderman is how cool he is'}]
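Each candidate above carries a `token_str` with a leading space, because this tokenizer encodes the preceding space as part of the token. The scores are also small: the top three candidates together cover only a modest share of the probability mass. A short sketch over the output copied from the cell above:

```python
# Candidates copied from the fill-mask cell above.
candidates = [
    {"score": 0.07190816849470139, "token": 6344, "token_str": " awesome",
     "sequence": "What I love about Spiderman is how awesome he is"},
    {"score": 0.04135068506002426, "token": 38525, "token_str": " badass",
     "sequence": "What I love about Spiderman is how badass he is"},
    {"score": 0.03186088800430298, "token": 3035, "token_str": " cool",
     "sequence": "What I love about Spiderman is how cool he is"},
]

# Strip the leading space that is part of each token.
words = [c["token_str"].strip() for c in candidates]

# Probability mass covered by the top-3 candidates.
top3_mass = sum(c["score"] for c in candidates)

print(words)
print(f"top-3 probability mass: {top3_mass:.3f}")
```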

**NAMED ENTITY RECOGNITION**

In [None]:
from transformers import pipeline

# aggregation_strategy="simple" replaces the deprecated grouped_entities=True
ner = pipeline("ner", aggregation_strategy="simple")
ner("My name is Sylvain and I work at Hugging Face in Brooklyn.")

[{'entity_group': 'PER', 'score': 0.99816, 'word': 'Sylvain', 'start': 11, 'end': 18}, 
 {'entity_group': 'ORG', 'score': 0.97960, 'word': 'Hugging Face', 'start': 33, 'end': 45}, 
 {'entity_group': 'LOC', 'score': 0.99321, 'word': 'Brooklyn', 'start': 49, 'end': 57}
]
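The `start` and `end` fields are character offsets into the input string, so slicing the text with them should give back each entity's `word`. The sketch below checks that on the output copied from the cell above and groups the entities by type; the variable names are illustrative only.

```python
from collections import defaultdict

text = "My name is Sylvain and I work at Hugging Face in Brooklyn."

# Entities copied from the NER cell above.
entities = [
    {"entity_group": "PER", "score": 0.99816, "word": "Sylvain", "start": 11, "end": 18},
    {"entity_group": "ORG", "score": 0.97960, "word": "Hugging Face", "start": 33, "end": 45},
    {"entity_group": "LOC", "score": 0.99321, "word": "Brooklyn", "start": 49, "end": 57},
]

grouped = defaultdict(list)
for entity in entities:
    # The offsets index into the original text (end is exclusive).
    span = text[entity["start"]:entity["end"]]
    assert span == entity["word"]
    grouped[entity["entity_group"]].append(span)

by_type = dict(grouped)
print(by_type)
```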

**QUESTION ANSWERING**

In [None]:
from transformers import pipeline

question_answerer = pipeline("question-answering")
question_answerer(
    question="Where do I work?",
    context="My name is Sylvain and I work at Hugging Face in Brooklyn",
)

{'score': 0.6385916471481323, 'start': 33, 'end': 45, 'answer': 'Hugging Face'}
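The answer is extracted from the context rather than generated, and `start` and `end` give its character position. A minimal sketch that uses those offsets to mark the answer inside the context; `highlight` is a hypothetical helper, and the result is copied from the cell above.

```python
context = "My name is Sylvain and I work at Hugging Face in Brooklyn"

# Result copied from the question-answering cell above.
result = {"score": 0.6385916471481323, "start": 33, "end": 45, "answer": "Hugging Face"}


def highlight(context, result, open_mark="[", close_mark="]"):
    """Wrap the answer span in markers using its character offsets (hypothetical helper)."""
    start, end = result["start"], result["end"]
    return context[:start] + open_mark + context[start:end] + close_mark + context[end:]


highlighted = highlight(context, result)
print(highlighted)
```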

**SUMMARIZATION**

In [None]:
from transformers import pipeline

summarizer = pipeline("summarization")
summarizer(
    """
    America has changed dramatically during recent years. Not only has the number of
    graduates in traditional engineering disciplines such as mechanical, civil,
    electrical, chemical, and aeronautical engineering declined, but in most of
    the premier American universities engineering curricula now concentrate on
    and encourage largely the study of engineering science. As a result, there
    are declining offerings in engineering subjects dealing with infrastructure,
    the environment, and related issues, and greater concentration on high
    technology subjects, largely supporting increasingly complex scientific
    developments. While the latter is important, it should not be at the expense
    of more traditional engineering.

    Rapidly developing economies such as China and India, as well as other
    industrial countries in Europe and Asia, continue to encourage and advance
    the teaching of engineering. Both China and India, respectively, graduate
    six and eight times as many traditional engineers as does the United States.
    Other industrial countries at minimum maintain their output, while America
    suffers an increasingly serious decline in the number of engineering graduates
    and a lack of well-educated engineers.
"""
)

[{'summary_text': ' America has changed dramatically during recent years . The '
                  'number of engineering graduates in the U.S. has declined in '
                  'traditional engineering disciplines such as mechanical, civil '
                  ', electrical, chemical, and aeronautical engineering . Rapidly '
                  'developing economies such as China and India, as well as other '
                  'industrial countries in Europe and Asia, continue to encourage '
                  'and advance engineering .'}]

**TRANSLATION**

In [None]:
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")
translator("Ce cours est produit par Hugging Face.")

[{'translation_text': 'This course is produced by Hugging Face.'}]
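Several cells above print a warning that using a pipeline without a model name and revision is not recommended in production. One way to address it is to pass both explicitly, as the translation cell already does for the model. This sketch reuses the model names and revisions printed in those warnings; `PINNED` and `build_pipeline` are hypothetical names, and the import is done lazily so that nothing is downloaded until the function is called.

```python
# Model names and revisions taken from the warnings printed earlier in this notebook.
PINNED = {
    "zero-shot-classification": {"model": "facebook/bart-large-mnli", "revision": "c626438"},
    "text-generation": {"model": "openai-community/gpt2", "revision": "6c0e608"},
    "fill-mask": {"model": "distilbert/distilroberta-base", "revision": "ec58a5b"},
}


def build_pipeline(task):
    """Create a pipeline with an explicit model and revision (hypothetical helper)."""
    # Imported here so that defining the helper does not require transformers.
    from transformers import pipeline

    return pipeline(task, **PINNED[task])


# Example (downloads the checkpoint on first use):
# classifier = build_pipeline("zero-shot-classification")
```

Pinning the revision means a later update to the model repository should not silently change the results.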