# Introduction

Hugging Face 🤗 [Transformers](https://github.com/huggingface/transformers) provides access to state-of-the-art pre-trained NLP models and pipelines that turn raw text into useful results. Many state-of-the-art deep learning architectures have been published and made available on the Hugging Face model [hub](https://huggingface.co/models).

In this tutorial, you will run the following NLP tasks using Hugging Face pipelines. The objective is to get familiar with different NLP tasks and with navigating the Hugging Face model hub.
- Text Classification
  - Sentiment Analysis
  - Natural Language Inference
  - Question Natural Language Inference
  - Quora Question Pair
  - Grammatical Correctness
- Zero-shot classification
- Token Classification
  - Named Entity Recognition (NER)
  - Part of Speech Tagging (POS)
- Translation
- Summarization
- Question Answering
- Text Generation
  - In-context Learning


In [1]:
!pip install transformers[torch] datasets emoji==0.6.0 sentencepiece


Import pipeline

In [2]:
from transformers import pipeline
from rich import print

# Text classification

Text classification involves assigning a label or category to a given text. Common use cases include sentiment analysis, natural language inference, and the assessment of grammatical correctness.

## Sentiment Analysis

Sentiment analysis is a type of natural language processing technique that involves analyzing a piece of text to determine the sentiment or emotion expressed within it. It can be used to classify a text as positive, negative, or neutral, and has a wide range of applications in fields such as marketing, customer service, and political analysis.

In [3]:
text_classification_pipeline = pipeline("text-classification")

inputs = ["I love how amazingly simple ML has become!",
          "I hate doing mundane and thankless tasks. ☹️"]

results = text_classification_pipeline(inputs)
print(results)

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


## Specific model

The default model for text classification is distilbert-base-uncased-finetuned-sst-2-english. To use one of the more than 19,000 other models available on Hugging Face, pass its name to the pipeline through the `model` argument.

In [4]:
pipe = pipeline(task="text-classification", model="finiteautomata/bertweet-base-sentiment-analysis")
pipe(["I love how amazingly simple ML has become!", "I hate doing mundane and thankless tasks. ☹️"])

[{'label': 'POS', 'score': 0.9929322004318237},
 {'label': 'NEG', 'score': 0.9755997657775879}]

### Industry-specific model

By selecting a model that has been designed for a particular industry, you can achieve more accurate and relevant text classification. An example is FinBERT, a pre-trained NLP model optimized for analyzing sentiment in financial text. FinBERT was created by further training the BERT language model on a large financial corpus and then fine-tuning it to classify financial sentiment. The model provides softmax outputs for three labels: positive, negative, and neutral.

In [5]:
pipe = pipeline(task="text-classification",model="ProsusAI/finbert")
pipe(["Stocks rallied and the British pound gained.","Stocks making the biggest moves midday: Nvidia, Palantir and more"])

[{'label': 'positive', 'score': 0.8983612656593323},
 {'label': 'neutral', 'score': 0.8062630891799927}]
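
The scores above are softmax probabilities, so the values for the three labels of one sentence sum to 1 and the pipeline reports the largest. A minimal sketch of that step in plain Python follows; the logits and the label order are made up for illustration and are not real FinBERT outputs.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one sentence; the label order is an assumption.
labels = ["positive", "negative", "neutral"]
logits = [2.1, -1.3, 0.4]

probs = softmax(logits)
# The predicted label is the one with the highest probability.
best = max(range(len(labels)), key=lambda i: probs[i])
print({label: round(p, 3) for label, p in zip(labels, probs)})
print("predicted:", labels[best])
```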

## Natural Language Inference (NLI)
Natural Language Inference (NLI) is a task that determines the relationship between two texts. An NLI model takes a premise and a hypothesis as inputs and returns one of three classes:

* Entailment: the hypothesis is true given the premise.
* Contradiction: the hypothesis is false given the premise.
* Neutral: the premise neither supports nor contradicts the hypothesis.

The GLUE benchmark includes several NLI datasets that are commonly used to evaluate these models, such as Multi-Genre NLI (MNLI), Question NLI (QNLI), and Winograd NLI (WNLI). You can find NLI models on the 🤗 Hugging Face model hub by searching for models with "mnli" in their name.

Example:
```
Premise: Soccer game with multiple males playing.
Hypothesis: Some men are playing a sport.
Label: Entailment
```

In [None]:
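pipe = pipeline(task="text-classification",model="roberta-large-mnli")
# Pass the premise and hypothesis as a text pair so the tokenizer encodes them as two segments
pipe({"text": "A soccer game with multiple males playing.", "text_pair": "Some men are playing a sport."})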

## Question Natural Language Inference (QNLI)
The QNLI task involves determining whether a given question can be answered by the information in a provided document. If the answer can be found in the document, the label assigned is "entailment". Conversely, if the answer cannot be found in the document, the label assigned is "not entailment".

If you want to use a QNLI model, you can find them on the 🤗 Hugging Face model hub. Look for models with "qnli" in their name.

In [6]:
pipe = pipeline(task="text-classification",model="cross-encoder/qnli-electra-base")
pipe("Who was the London Weekend Television’s Managing Director?,The managing director of London Weekend Television (LWT), Greg Dyke, met with the representatives of the \"big five\" football clubs in England in 1990.")


[{'label': 'LABEL_0', 'score': 0.998712420463562}]
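
Pair tasks such as QNLI take two texts, a question and a passage. The text-classification pipeline accepts a dict with `text` and `text_pair` keys, which lets the tokenizer encode the two parts as separate segments instead of one comma-joined string. The sketch below shows that input shape; `make_pair`, `classify_pair`, and `fake_pipe` are hypothetical helper names, and the stand-in callable exists only so the example runs without downloading a model.

```python
def make_pair(text, text_pair):
    """Build the dict form that lets the tokenizer encode two segments."""
    return {"text": text, "text_pair": text_pair}

def classify_pair(pipe, first, second):
    """Run a text-classification pipeline (or a stand-in callable) on a pair."""
    return pipe(make_pair(first, second))

# Stand-in for a real pipeline so the sketch runs offline. It only echoes the
# keys it received; swap in pipeline("text-classification", model=...) to
# get actual predictions.
def fake_pipe(inputs):
    return {"received_keys": sorted(inputs)}

result = classify_pair(
    fake_pipe,
    "Who was the London Weekend Television's Managing Director?",
    'The managing director of London Weekend Television (LWT), Greg Dyke, met '
    'with the representatives of the "big five" football clubs in England in 1990.',
)
print(result)
```

The same input shape applies to the other pair tasks in this section, such as NLI and QQP.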

## Quora Question Pairs (QQP)
The Quora Question Pairs task evaluates whether two given questions are paraphrases of each other. A QQP model takes the two questions and outputs a binary label. In the GLUE version of the dataset, label 1 marks a duplicate pair and label 0 a non-duplicate pair; checkpoints that only expose generic names such as LABEL_0 and LABEL_1 usually keep that order, but confirm the mapping on the model card. The benchmark dataset used for this task is the Quora Question Pairs dataset within the GLUE benchmark, which contains a collection of question pairs and their corresponding labels.

If you want to use a QQP model, you can find them on the 🤗 Hugging Face model hub. Look for models with "qqp" in their name.

In [7]:
pipe = pipeline("text-classification", model = "textattack/bert-base-uncased-QQP")
pipe("Which city is the capital of France?, Where is the capital of France?")

[{'label': 'LABEL_0', 'score': 0.9988721013069153}]

## Grammatical Correctness
Linguistic Acceptability is a task that involves evaluating the grammatical correctness of a sentence. The model assigns one of two classes to the sentence, "acceptable" or "unacceptable". In the GLUE version of CoLA, label 1 marks an acceptable sentence and label 0 an unacceptable one; checkpoints that only expose generic names such as LABEL_0 and LABEL_1 usually keep that order, but confirm the mapping on the model card. The benchmark dataset used for training and evaluating models for this task is the Corpus of Linguistic Acceptability (CoLA), which consists of sentences along with their acceptability labels.

If you want to use a grammatical correctness model, you can find them on the 🤗 Hugging Face model hub. Look for models with "cola" in their name.

In [8]:
pipe = pipeline("text-classification", model = "textattack/distilbert-base-uncased-CoLA")
pipe("I will walk to home when I went through the bus.")

[{'label': 'LABEL_1', 'score': 0.9576480388641357}]

# Zero-Shot Classification

Zero-shot classification is a task in which the model predicts classes that it did not see during training. It leverages a pre-trained language model and is a form of transfer learning, in which a model trained for one task is reused in a different application. Zero-shot classification is especially helpful when labeled data for the task at hand is scarce.

In [9]:
from transformers import pipeline
classifier = pipeline(task="zero-shot-classification",model="facebook/bart-large-mnli")
text_to_classify= "I have a problem with my iphone that needs to be resolved asap!!"
candidate_labels=["urgent", "not urgent", "phone", "tablet", "computer"]
classifier(text_to_classify, candidate_labels, multi_label=True)


{'sequence': 'I have a problem with my iphone that needs to be resolved asap!!',
 'labels': ['urgent', 'phone', 'computer', 'not urgent', 'tablet'],
 'scores': [0.998576283454895,
  0.9949977993965149,
  0.1349661946296692,
  0.0006789116305299103,
  0.00041479477658867836]}
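
With `multi_label=True`, each candidate label is scored independently, so the scores do not sum to 1 and several labels can apply at once. A small sketch of how the result above might be turned into a set of accepted labels follows; the 0.5 cutoff and the `labels_above` helper are assumptions for illustration, not part of the pipeline.

```python
# Result dict copied from the output above.
result = {
    "sequence": "I have a problem with my iphone that needs to be resolved asap!!",
    "labels": ["urgent", "phone", "computer", "not urgent", "tablet"],
    "scores": [
        0.998576283454895,
        0.9949977993965149,
        0.1349661946296692,
        0.0006789116305299103,
        0.00041479477658867836,
    ],
}

def labels_above(result, threshold=0.5):
    """Keep every label whose independent score reaches the threshold."""
    return [
        label
        for label, score in zip(result["labels"], result["scores"])
        if score >= threshold
    ]

selected = labels_above(result)
print(selected)
```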

# Token Classification

Token classification is a task in natural language understanding, where labels are assigned to certain tokens in a text. Some popular subtasks of token classification include Named Entity Recognition (NER) and Part-of-Speech (PoS) tagging. NER models can be trained to identify specific entities in a text, such as individuals, places, and dates. PoS tagging, on the other hand, is used to identify the different parts of speech in a text, such as nouns, verbs, and punctuation marks.

## Named Entity Recognition
Named Entity Recognition (NER) is a task that involves identifying named entities in a text. These entities can include the names of people, locations, or organizations. The task is completed by labeling each token with the class of the named entity it belongs to, or with a class named "O" (for "outside") when the token is not part of any entity. In this task, the input is text, and the output is the annotated text with named entities.

In [10]:
pipe = pipeline(task="token-classification")
pipe("I am John and I live in New York City.")


No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[{'entity': 'I-PER',
  'score': 0.9974554,
  'index': 3,
  'word': 'John',
  'start': 5,
  'end': 9},
 {'entity': 'I-LOC',
  'score': 0.9992238,
  'index': 8,
  'word': 'New',
  'start': 24,
  'end': 27},
 {'entity': 'I-LOC',
  'score': 0.99931407,
  'index': 9,
  'word': 'York',
  'start': 28,
  'end': 32},
 {'entity': 'I-LOC',
  'score': 0.99942446,
  'index': 10,
  'word': 'City',
  'start': 33,
  'end': 37}]
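
The output above is token-level: "New", "York", and "City" come back as three separate `I-LOC` entries. One way to merge them into a single entity is to join neighbouring tokens that share a label, using the character offsets. The sketch below does this by hand on the output above; `group_entities` and its `max_gap` parameter are hypothetical names. The token-classification pipeline also offers an `aggregation_strategy` argument (for example `"simple"`) that performs a similar grouping for you.

```python
text = "I am John and I live in New York City."

# Token-level predictions copied from the output above (scores omitted).
tokens = [
    {"entity": "I-PER", "word": "John", "start": 5, "end": 9},
    {"entity": "I-LOC", "word": "New", "start": 24, "end": 27},
    {"entity": "I-LOC", "word": "York", "start": 28, "end": 32},
    {"entity": "I-LOC", "word": "City", "start": 33, "end": 37},
]

def group_entities(text, tokens, max_gap=1):
    """Merge neighbouring tokens that share an entity type into one span."""
    spans = []
    for tok in tokens:
        label = tok["entity"].split("-")[-1]  # "I-LOC" -> "LOC"
        last = spans[-1] if spans else None
        if last and last["label"] == label and tok["start"] - last["end"] <= max_gap:
            last["end"] = tok["end"]  # extend the current span
        else:
            spans.append({"label": label, "start": tok["start"], "end": tok["end"]})
    # Recover the surface text of each span from the character offsets.
    for span in spans:
        span["text"] = text[span["start"]:span["end"]]
    return spans

entities = group_entities(text, tokens)
print([(e["label"], e["text"]) for e in entities])
```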

## Part-of-Speech (PoS) Tagging
PoS tagging is a task that involves identifying the parts of speech, such as nouns, pronouns, adjectives, or verbs, in a given text. In this task, the model labels each word with a specific part of speech.

To find a PoS tagging model on the 🤗 Hugging Face model hub, look for models with "pos" in their name.

In [11]:
pipe = pipeline(task="token-classification", model="vblagoje/bert-english-uncased-finetuned-pos")
pipe("I am George and I live in Phoenix.")

Some weights of the model checkpoint at vblagoje/bert-english-uncased-finetuned-pos were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[{'entity': 'PRON',
  'score': 0.99950683,
  'index': 1,
  'word': 'i',
  'start': 0,
  'end': 1},
 {'entity': 'AUX',
  'score': 0.99707437,
  'index': 2,
  'word': 'am',
  'start': 2,
  'end': 4},
 {'entity': 'PROPN',
  'score': 0.9988508,
  'index': 3,
  'word': 'george',
  'start': 5,
  'end': 11},
 {'entity': 'CCONJ',
  'score': 0.99917895,
  'index': 4,
  'word': 'and',
  'start': 12,
  'end': 15},
 {'entity': 'PRON',
  'score': 0.99950755,
  'index': 5,
  'word': 'i',
  'start': 16,
  'end': 17},
 {'entity': 'VERB',
  'score': 0.99875176,
  'index': 6,
  'word': 'live',
  'start': 18,
  'end': 22},
 {'entity': 'ADP',
  'score': 0.99939656,
  'index': 7,
  'word': 'in',
  'start': 23,
  'end': 25},
 {'entity': 'PROPN',
  'score': 0.99888057,
  'index': 8,
  'word': 'phoenix',
  'start': 26,
  'end': 33},
 {'entity': 'PUNCT',
  'score': 0.9996618,
  'index': 9,
  'word': '.',
  'start': 33,
  'end': 34}]
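
Because this model is uncased, the `word` field is lowercased ("george", "phoenix"). The `start` and `end` offsets still point into the original sentence, so they can be used to recover the original casing. The sketch below filters the output above for proper nouns; `words_with_tag` is a hypothetical helper and only a subset of the predictions is reproduced.

```python
text = "I am George and I live in Phoenix."

# Subset of the predictions shown above (scores and indices omitted).
tags = [
    {"entity": "PRON", "word": "i", "start": 0, "end": 1},
    {"entity": "AUX", "word": "am", "start": 2, "end": 4},
    {"entity": "PROPN", "word": "george", "start": 5, "end": 11},
    {"entity": "VERB", "word": "live", "start": 18, "end": 22},
    {"entity": "PROPN", "word": "phoenix", "start": 26, "end": 33},
]

def words_with_tag(text, tags, wanted):
    """Return the original-cased words whose predicted tag equals `wanted`."""
    return [text[t["start"]:t["end"]] for t in tags if t["entity"] == wanted]

proper_nouns = words_with_tag(text, tags, "PROPN")
print(proper_nouns)
```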

# Translation
Translation is the task of converting text written in one language into another language. You have the option to select from over 2000 models available on the Hugging Face hub for translation.

In [12]:
pipe = pipeline(task="translation_en_to_fr")
pipe("How are you?")

No model was supplied, defaulted to t5-base and revision 686f1db (https://huggingface.co/t5-base).
Using a pipeline without specifying a model name and revision in production is not recommended.


For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
- Be aware that you SHOULD NOT rely on t5-base automatically truncating your input to 512 when padding/encoding.
- If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.


[{'translation_text': 'Comment êtes-vous?'}]
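
The default model here, t5-base, is a text-to-text model: the task is selected by a short instruction placed in front of the input, and the translation pipeline adds it for you. The sketch below shows what the prefixed input looks like if you were to build it yourself; `add_task_prefix` and the `T5_PREFIXES` table are hypothetical names, and only two language pairs are listed for illustration.

```python
# Task prefixes used by the original T5 checkpoints for translation.
T5_PREFIXES = {
    ("en", "fr"): "translate English to French: ",
    ("en", "de"): "translate English to German: ",
}

def add_task_prefix(text, src="en", tgt="fr"):
    """Prepend the instruction that tells T5 which translation to perform."""
    try:
        prefix = T5_PREFIXES[(src, tgt)]
    except KeyError:
        raise ValueError(f"no prefix known for {src} -> {tgt}") from None
    return prefix + text

model_input = add_task_prefix("How are you?")
print(model_input)
```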

# Summarization

Summarization involves creating a condensed version of a document that includes the important information while reducing its length. Different models can be used for this task, with some models extracting the most relevant text from the original document, while other models generate completely new text that captures the essence of the original content.

In [13]:
document = """
The unanimous Declaration of the thirteen united States of America, When in the Course of human events, it becomes necessary for one people to dissolve the political bands which have connected them with another, and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation.

We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness.--That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed, --That whenever any Form of Government becomes destructive of these ends, it is the Right of the People to alter or to abolish it, and to institute new Government, laying its foundation on such principles and organizing its powers in such form, as to them shall seem most likely to effect their Safety and Happiness. Prudence, indeed, will dictate that Governments long established should not be changed for light and transient causes; and accordingly all experience hath shewn, that mankind are more disposed to suffer, while evils are sufferable, than to right themselves by abolishing the forms to which they are accustomed. But when a long train of abuses and usurpations, pursuing invariably the same Object evinces a design to reduce them under absolute Despotism, it is their right, it is their duty, to throw off such Government, and to provide new Guards for their future security.--Such has been the patient sufferance of these Colonies; and such is now the necessity which constrains them to alter their former Systems of Government. The history of the present King of Great Britain is a history of repeated injuries and usurpations, all having in direct object the establishment of an absolute Tyranny over these States. To prove this, let Facts be submitted to a candid world.
"""
print(len(document.split()))
pipe = pipeline(task="summarization")
pipe(document)

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'summary_text': ' The unanimous Declaration of the thirteen united States of America . The Declaration of Independence declared that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness . The U.S. Constitution was established in 17th Amendment of 18th Amendment to First Amendment to the Constitution .'}]
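
The model used above is abstractive: it writes new text, which is why the last sentence of its summary contains claims that are not in the source. The other family mentioned in this section, extractive summarization, only selects sentences that already exist in the document. The toy sketch below illustrates that idea with a simple word-frequency score; it is not how the pipeline works, and `extractive_summary` is a hypothetical helper with no stop-word handling.

```python
import re
from collections import Counter

def extractive_summary(text, num_sentences=1):
    """Pick the sentences whose words are most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        # Average word frequency, so long sentences are not favoured by length alone.
        return sum(freq[w] for w in words) / max(len(words), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)

sample = "Cats sleep a lot. Cats chase mice. Dogs bark."
summary = extractive_summary(sample, num_sentences=1)
print(summary)
```

Every sentence in the result is copied verbatim from the input, so an extractive method cannot introduce new facts, at the cost of less fluent summaries.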

# Question Answering
Question Answering models are designed to retrieve the answer to a question from a given text, which can be particularly useful for searching for information within a document. It's worth noting that some question answering models are capable of generating answers even without any contextual information.

In [14]:
qa_model = pipeline("question-answering")
question = "Where do I live?"
context = "My name is Merve and I live in İstanbul."
qa_model(question = question, context = context)

No model was supplied, defaulted to distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


{'score': 0.9538118243217468, 'start': 31, 'end': 39, 'answer': 'İstanbul'}
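
This kind of model is extractive: the answer is a span of the context, and `start` and `end` are character offsets into it. A short sketch that checks this against the output above:

```python
context = "My name is Merve and I live in İstanbul."

# Result dict copied from the output above.
result = {"score": 0.9538118243217468, "start": 31, "end": 39, "answer": "İstanbul"}

# Slicing the context with the returned offsets reproduces the answer text.
span = context[result["start"]:result["end"]]
print(span)
```

The offsets are useful when you want to highlight the answer inside the original document rather than only display it.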

# Text Generation

Text generation is the task of producing new text. These models can, for example, complete an unfinished text or paraphrase an existing one.

In [15]:
generator = pipeline('text-generation', model = 'gpt2')
output = generator("Hello, I'm a language model")
print(output)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


## In-context Learning
In-context learning (ICL) is a specific method of prompt engineering where demonstrations of the task are provided to the model as part of the prompt (in natural language). With ICL, you can use off-the-shelf large language models (LLMs) to solve novel tasks without fine-tuning. In the example below, you will perform Named Entity Recognition using few-shot examples.

In [16]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Few-shot prompt: three solved examples followed by the sentence to label
prompt = """
Extract the main person and place from a sentence:

###
Paul is playing football in New York with Heather.
Person: Paul, Place: New York, Person: Heather
###
Jeff is in a hurry to go to Boston.
Person: Jeff, Place: Boston
###
Max is going to Philadelphia.
Person: Max, Place: Philadelphia
###
Sam is from Phoenix
"""
model_name = 'gpt2'
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Use the GPU when one is available; otherwise fall back to the CPU
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)
# Stop generating at the next "###" separator
outputs = model.generate(input_ids=input_ids, do_sample=True, max_new_tokens=10, temperature=0.01, eos_token_id=tokenizer.encode("###"), pad_token_id = tokenizer.eos_token_id)
# Decode only the newly generated tokens and drop the last one, which is the separator when generation stops on it
print(tokenizer.decode(outputs[0][len(input_ids[0]):-1]))
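
A few-shot prompt like the one in this section follows a regular pattern: an instruction, then solved examples separated by a marker, then the new input. The sketch below assembles such a prompt from a list of demonstrations so that examples can be added or swapped without editing a long string by hand; `build_few_shot_prompt` is a hypothetical helper, and the `###` separator simply mirrors the format used in this section.

```python
def build_few_shot_prompt(instruction, examples, query, separator="###"):
    """Join an instruction, solved examples, and a new query into one prompt."""
    parts = [instruction, ""]
    for sentence, answer in examples:
        parts += [separator, sentence, answer]
    # End with the unsolved query so the model continues with its answer.
    parts += [separator, query, ""]
    return "\n".join(parts)

examples = [
    ("Paul is playing football in New York with Heather.",
     "Person: Paul, Place: New York, Person: Heather"),
    ("Jeff is in a hurry to go to Boston.", "Person: Jeff, Place: Boston"),
    ("Max is going to Philadelphia.", "Person: Max, Place: Philadelphia"),
]

few_shot_prompt = build_few_shot_prompt(
    "Extract the main person and place from a sentence:",
    examples,
    "Sam is from Phoenix",
)
print(few_shot_prompt)
```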