

![alt text](https://d3.harvard.edu/platform-digit/wp-content/uploads/sites/2/2022/04/demo-huggingface_optimized-370x200.png)


# Hugging Face Tutorial 

*Hugging Face is a company that offers a range of tools, models, and infrastructure for building and deploying machine learning models, with a strong focus on natural language processing (NLP). Its popular open-source library, Transformers, provides a unified API for working with a wide range of pre-trained transformer models.*

**What is NLP?**

NLP is a field of linguistics and machine learning focused on understanding everything related to human language. The aim of NLP tasks is not only to understand single words individually, but to be able to understand the context of those words.

The following is a list of common NLP tasks, with some examples of each:



* Classifying whole sentences: Getting the sentiment of a review, detecting if an email is spam, determining if a sentence is grammatically correct or whether two sentences are logically related or not
* Classifying each word in a sentence: Identifying the grammatical components of a sentence (noun, verb, adjective), or the named entities (person, location, organization)
* Generating text content: Completing a prompt with auto-generated text, filling in the blanks in a text with masked words
* Extracting an answer from a text: Given a question and a context, extracting the answer to the question based on the information provided in the context
* Generating a new sentence from an input text: Translating a text into another language, summarizing a text

**What is a Transformer?**

In Hugging Face, a Transformer is a pre-trained neural network model that is available through the Transformers library and can be fine-tuned for a wide range of NLP tasks. The library has become increasingly popular in the NLP community due to its ease of use, flexibility, and wide range of pre-trained models.

Let's now have a look at how easy it is to use a transformer. First we need to install the Transformers library.

In [1]:
!pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.27.4-py3-none-any.whl (6.8 MB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
Collecting huggingface-hub<1.0,>=0.11.0
  Downloading huggingface_hub-0.13.4-py3-none-any.whl (200 kB)
Installing collected packages: tokenizers, huggingface-hub, transformers
Successfully installed huggingface-hub-0.13.4 tokenizers-0.13.3 transformers-4.27.4


In [2]:
from transformers import pipeline

**Pipeline**

In the Hugging Face Transformers library, a pipeline is a convenient way to apply a pre-trained model to input text and generate an output. Pipelines encapsulate the entire process of pre-processing the input text, running it through the model, and post-processing the output into a readable format.

The Transformers library provides several pre-defined pipelines for common NLP tasks such as text classification, sentiment analysis, and named entity recognition. These pipelines are pre-configured to use specific pre-trained models and can be used with just a few lines of code.

**1st Example - Sentiment Analysis**

In the code below, the pipeline function loads a pre-trained sentiment analysis model, which we can then call directly on our input text.

In [3]:
classifier = pipeline("sentiment-analysis")

input_text = 'Bp have announced a profit warning'
classifier(input_text)

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'NEGATIVE', 'score': 0.992615818977356}]

**Try More Sentiment Analysis Examples**

Use the cell below to explore how robust sentiment analysis is. Think about the kinds of text that could be tricky for an algorithm, such as sarcasm or mixed sentiment.

In [4]:
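# Try one example at a time: each assignment overwrites the previous one,
# so only the last input_text is passed to the classifier.
input_text = 'That was amazingly rubbish'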
input_text = 'He was the best, the very best at being terrible'
input_text = 'The company was concerned about the war in Ukraine. However they did see oppertuntites to invest in renewables as a reuslt of the current economic climate which could be a benefit in the future'
classifier(input_text)

[{'label': 'NEGATIVE', 'score': 0.9416939616203308}]

**What is going on in the Pipeline**

1. The text is preprocessed into a format the model can understand.
2. The preprocessed inputs are passed to the model.
3. The predictions of the model are post-processed, so you can make sense of them.

**Types of Pipeline**

* feature-extraction (get the vector representation of a text)
* fill-mask
* ner (named entity recognition)
* question-answering
* sentiment-analysis
* summarization
* text-generation
* translation
* zero-shot-classification

Let's look at several of these in turn.


**Zero Shot Classification**

In zero-shot classification you give the transformer some text and a set of candidate categories. It attempts to use its general knowledge to place the text in the correct category, without having been trained on those categories specifically.

In [5]:
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
classifier(
    "TBoriis johnson is a liar",
    candidate_labels=["education", "politics", "business"],
)

No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.


{'sequence': 'TBoriis johnson is a liar',
 'labels': ['business', 'politics', 'education'],
 'scores': [0.509779691696167, 0.4088508188724518, 0.08136940747499466]}

In [6]:
classifier(
    "The person lost his arm in an accident",
    candidate_labels=["Safety", "Operational", "engineering"],
)

{'sequence': 'The person lost his arm in an accident',
 'labels': ['Safety', 'Operational', 'engineering'],
 'scores': [0.46833640336990356, 0.4126538038253784, 0.11900977045297623]}

Let's look at another example with a real-world application.




In [7]:
interest_rates = """
In 2025, the UK central bank predicted interest rates would fall back to 3.6%, declining further to 3.3% in 2026.13 Mar 2023"""

classifier(
    interest_rates,
    candidate_labels=["Falling interest rate", "No change interest rate", "rising interest rate"],
)

{'sequence': '\nIn 2025, the UK central bank predicted interest rates would fall back to 3.6%, declining further to 3.3% in 2026.13 Mar 2023',
 'labels': ['Falling interest rate',
  'rising interest rate',
  'No change interest rate'],
 'scores': [0.8426926136016846, 0.13479645550251007, 0.022510966286063194]}
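
The result is a plain dictionary whose `labels` and `scores` lists line up position by position, so picking the winning category takes only a few lines. The sketch below reuses the output printed above; the helper name `top_label` and the `min_score` threshold are illustrative assumptions, not part of the Transformers API.

```python
# Output copied from the zero-shot cell above (the 'sequence' key is omitted for brevity).
result = {
    "labels": ["Falling interest rate", "rising interest rate", "No change interest rate"],
    "scores": [0.8426926136016846, 0.13479645550251007, 0.022510966286063194],
}


def top_label(result, min_score=0.5):
    """Return the best-scoring label, or None if no label reaches min_score.

    `min_score` is a hypothetical cut-off chosen for illustration only.
    """
    # Pair each label with its score and keep the pair with the highest score.
    label, score = max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])
    return label if score >= min_score else None


print(top_label(result))
```

A cut-off like this is one way to flag uncertain predictions: the earlier "Safety" example, whose best score is only about 0.47, would return `None` here.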

And one more example

In [8]:
abstract = """
High-harmonic generation (HHG) provides a laboratory-scale source of coherent radiation ideally suited to lensless coherent diffractive imaging (CDI) in the EUV and x-ray spectral region. Here we demonstrate transmission extreme ultraviolet (EUV) ptychography, a scanning variant of CDI, using radiation at a wavelength around 29 nm from an HHG source. Image resolution is diffraction-limited at 54 nm and fields of view up to ∼100 μm
 are demonstrated. These results demonstrate the potential for wide-field, high-resolution, laboratory-scale EUV imaging using HHG-based sources with potential application in biological imaging or EUV lithography pellicle inspection."""

classifier(
    abstract,
    candidate_labels=["Physics", "Geography", "History", "Biology"],
)

{'sequence': '\nHigh-harmonic generation (HHG) provides a laboratory-scale source of coherent radiation ideally suited to lensless coherent diffractive imaging (CDI) in the EUV and x-ray spectral region. Here we demonstrate transmission extreme ultraviolet (EUV) ptychography, a scanning variant of CDI, using radiation at a wavelength around 29 nm from an HHG source. Image resolution is diffraction-limited at 54 nm and fields of view up to ∼100\u2009μm\n are demonstrated. These results demonstrate the potential for wide-field, high-resolution, laboratory-scale EUV imaging using HHG-based sources with potential application in biological imaging or EUV lithography pellicle inspection.',
 'labels': ['Physics', 'History', 'Biology', 'Geography'],
 'scores': [0.4279579818248749,
  0.29872632026672363,
  0.18375477194786072,
  0.08956091850996017]}

**Text Generation**

Text generation continues a prompt with auto-generated text, which makes it useful for tasks such as autocomplete.

In [9]:
from transformers import pipeline

generator = pipeline("text-generation")
generator("Thank you for your")

No model was supplied, defaulted to gpt2 and revision 6c0e608 (https://huggingface.co/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'Thank you for your continued support!\n\nFollow us on Twitter @theredhats'}]

In [10]:

generator("Thank you for your", max_length=4, num_return_sequences=4)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Input length of input_ids is 4, but `max_length` is set to 4. This can lead to unexpected behavior. You should consider increasing `max_new_tokens`.


[{'generated_text': 'Thank you for your attention'},
 {'generated_text': 'Thank you for your support'},
 {'generated_text': 'Thank you for your continued'},
 {'generated_text': 'Thank you for your understanding'}]

In [11]:
generator("The holiday was", max_length=20, num_return_sequences=4)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "The holiday was celebrated in South Africa.\n\n'Hate and violence': People protest against the"},
 {'generated_text': 'The holiday was so full of life.\n\n"We took them to a place called a family'},
 {'generated_text': 'The holiday was also held in honour of the memory of the late Dr Michael Kelly, who gave birth'},
 {'generated_text': "The holiday was a day of good feeling for the residents of L'Enquête\n\n"}]

**Mask Filling**

Here the model fills in the blanks (masked tokens) in a text with suitable words.

In [12]:
unmasker = pipeline("fill-mask")
unmasker("BP is a comapany that deal in the <mask> sector.", top_k=2)

No model was supplied, defaulted to distilroberta-base and revision ec58a5b (https://huggingface.co/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'score': 0.1934245526790619,
  'token': 1007,
  'token_str': ' energy',
  'sequence': 'BP is a comapany that deal in the energy sector.'},
 {'score': 0.11976183205842972,
  'token': 14304,
  'token_str': ' petroleum',
  'sequence': 'BP is a comapany that deal in the petroleum sector.'}]

The score indicates how well the word fits the sentence. The token is the integer ID of the word in the model's vocabulary. Remember that neural networks always work with numbers, so the model predicts this ID rather than the word itself. We can decode it back into text using the tokenizer:


In [13]:
unmasker.tokenizer.decode(14304)

' petroleum'

**Bias Alert**

These models are trained on language from all over the web. Any biases present in that corpus of text may show up in the model's responses. Let's look at one now.


In [14]:
unmasker = pipeline("fill-mask")
results = unmasker("She worked very hard at her job as a <mask>.", top_k=5)
for result in results:
    print(result['token_str'])


No model was supplied, defaulted to distilroberta-base and revision ec58a5b (https://huggingface.co/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.


 waitress
 nurse
 teacher
 bartender
 maid


In [15]:
unmasker = pipeline("fill-mask")
results = unmasker("He worked very hard at his job as a <mask>.", top_k=5)
for result in results:
    print(result['token_str'])

No model was supplied, defaulted to distilroberta-base and revision ec58a5b (https://huggingface.co/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.


 mechanic
 waiter
 teacher
 firefighter
 lawyer
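
One quick way to compare the two runs is to treat the suggestions as sets. The sketch below hard-codes the words printed above (with the leading spaces stripped) rather than calling the model again; the variable names are illustrative.

```python
# Top-5 suggestions copied from the two cells above.
she_jobs = {"waitress", "nurse", "teacher", "bartender", "maid"}
he_jobs = {"mechanic", "waiter", "teacher", "firefighter", "lawyer"}

shared = she_jobs & he_jobs    # suggested for both sentences
only_she = she_jobs - he_jobs  # suggested only for "She ..."
only_he = he_jobs - she_jobs   # suggested only for "He ..."

print("shared:", sorted(shared))
print("only she:", sorted(only_she))
print("only he:", sorted(only_he))
```

Only one suggestion, teacher, is shared between the two prompts.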


**Named Entity Recognition**

The task here is to identify which parts of the text correspond to entities such as people, locations and organisations. Let's try one.

In [16]:
from transformers import pipeline

ner = pipeline("ner", grouped_entities=True)
ner("My Name is Peter and I work for Corndel")

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'entity_group': 'PER',
  'score': 0.9990478,
  'word': 'Peter',
  'start': 11,
  'end': 16},
 {'entity_group': 'ORG',
  'score': 0.9538417,
  'word': 'Corndel',
  'start': 32,
  'end': 39}]

Notice that the output gives the entity type along with a confidence score, the word it corresponds to, and the start and end character positions of that word in the input.
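
The `start` and `end` values are character offsets into the input string, with `end` exclusive, so they can be used directly as a Python slice. The sketch below reuses the output shown above instead of running the model again.

```python
text = "My Name is Peter and I work for Corndel"

# Entities copied from the NER output above (scores omitted).
entities = [
    {"entity_group": "PER", "word": "Peter", "start": 11, "end": 16},
    {"entity_group": "ORG", "word": "Corndel", "start": 32, "end": 39},
]

for entity in entities:
    # Slice the original text using the character offsets.
    span = text[entity["start"]:entity["end"]]
    print(entity["entity_group"], repr(span))
```

Working from offsets rather than from the `word` field is useful when you need to highlight or replace the entity in the original text.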

**Summarization**

This feature allows us to reduce the length of a text whilst trying to keep the key messages.

In [17]:
from transformers import pipeline

summarizer = pipeline("summarization")
summarizer(
    """
5 Phases of NLP
Natural Language Processing is separated into five primary stages or phases, starting with simple word processing and progressing to identifying complicated phrase meanings.

1. Lexical or Morphological Analysis
Lexical or Morphological Analysis is the initial step in NLP. It entails recognizing and analyzing word structures. The collection of words and phrases in a language is referred to as the lexicon. Lexical analysis is the process of breaking down a text file into paragraphs, phrases, and words. The source code is scanned as a stream of characters and converted into intelligible lexemes in this phase. The entire book is divided into paragraphs, phrases, and words.
It refers to the study of text at the level of individual words. It searches for morphemes, which are the smallest units of a word. The lexical analysis identifies the relationship between these morphemes and transforms the word into its root form. The word’s probable parts of speech (POS) are also assigned by a lexical analyzer.
2. Syntax Analysis or Parsing
Syntactic or Syntax analysis is a technique for checking grammar, arranging words, and displaying relationships between them. It entails examining the syntax of the words in the phrase and arranging them in a way that demonstrates the relationship between them. Syntax analysis guarantees that the structure of a particular piece of text is proper. It tries to parse the sentence in order to ensure that the grammar is correct at the sentence level. A syntax analyzer assigns POS tags based on the sentence structure given the probable POS created in the preceding stage.
For example, New York goes to John.
This sentence New York goes to John is rejected by the Syntactic Analyzer as it makes no sense.
3. Semantic Analysis
Semantic analysis is the process of looking for meaning in a statement. It concentrates mostly on the literal meaning of words, phrases, and sentences is the main focus. It also deals with putting words together to form sentences. It extracts the text’s exact meaning or dictionary definition. The meaning of the text is examined. It is accomplished by mapping the task domain’s syntactic structures and objects.
Take the following sentence for example: “The guava ate an apple.” The line is syntactically valid, yet it is illogical because guavas cannot eat.
4. Discourse Integration
The term “discourse integration” refers to a feeling of context. The meaning of any sentence is determined by the meaning of the sentence immediately preceding it. In addition, it establishes the meaning of the sentence that follows. The sentences that come before it play a role in discourse integration. That is to say, that statement or word is dependent on the preceding sentence or words. It’s the same with the use of proper nouns and pronouns.

For example, Billy bought it.
The word “it” in the above sentence is dependent on the preceding discourse context. We can see that the “it” does not make sense in this statement. In fact, it refers to anything we don’t understand. That is nothing more than the fact that the word “it” is dependent on the preceding sentence, which is not provided. So, once we’ve learned about “it,” we’ll be able to simply locate the reference. Discourse is concerned with the impact of a prior sentence on the current sentence

5. Pragmatic Analysis
The fifth and final phase of NLP is pragmatic analysis. The overall communicative and social content, as well as its impact on interpretation, are the focus of pragmatic analysis. Pragmatic Analysis uses a set of rules that describe cooperative dialogues to help you find the intended result. It covers things like word repetition, who said what to whom, and so on. It comprehends how people communicate with one another, the context in which they converse, and a variety of other factors. It refers to the process of abstracting or extracting the meaning of a situation’s use of language. It translates the given text using the knowledge gathered in the preceding stages. “Switch on the TV” when used in a sentence, is an order or request to switch the TV on.
"""
)

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'summary_text': ' Natural Language Processing is separated into five primary stages or phases, starting with simple word processing and progressing to identifying complicated phrase meanings . Syntax Analysis or Parsing is a technique for checking grammar, arranging words, and displaying relationships between them . Semantic Analysis concentrates mostly on the literal meaning of words, phrases, and sentences .'}]

We can change the minimum or maximum length of the summary using the `min_length` or `max_length` parameter.
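
As a minimal sketch, the function below passes both parameters to the summarizer. The helper name `summarize` and the default values of 20 and 60 are assumptions chosen for illustration; both limits are counted in tokens, not words or characters. The import sits inside the function so that nothing is downloaded until it is called.

```python
def summarize(text, min_length=20, max_length=60):
    """Summarise `text`, bounding the summary length in tokens.

    The default bounds are illustrative, not recommended values.
    """
    if min_length > max_length:
        raise ValueError("min_length must not exceed max_length")

    # Imported lazily: building the pipeline downloads the default model.
    from transformers import pipeline

    summarizer = pipeline("summarization")
    result = summarizer(text, min_length=min_length, max_length=max_length)
    return result[0]["summary_text"]


# Example call (downloads the model on first use); `long_text` is a
# placeholder for your own document:
# print(summarize(long_text, min_length=10, max_length=40))
```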


## So what are Transformers?

Transformers are models that we use to perform NLP tasks. Broadly speaking, there are three types of transformer architecture:

* GPT-like (also called auto-regressive Transformer models) **

* BERT-like (also called auto-encoding Transformer models)

* BART/T5-like (also called sequence-to-sequence Transformer models)

** Autoregressive Transformer models are a type of neural network architecture that combines the concepts of autoregressive modeling (generating predictions based on previously generated outputs) and the Transformer architecture (a type of neural network that is particularly effective in handling sequential data by leveraging attention mechanisms). They are widely used in natural language processing tasks such as language modeling, text generation, and machine translation, and have achieved state-of-the-art results in these fields.
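
The autoregressive idea, feeding each prediction back in as input for the next one, can be illustrated without a neural network. The toy sketch below uses a hand-written lookup table (`NEXT_WORD`, purely hypothetical) in place of a trained Transformer and always picks the single listed continuation; a real model would instead score every token in its vocabulary at each step.

```python
# Hypothetical "model": maps the last word to the next word.
NEXT_WORD = {
    "thank": "you",
    "you": "for",
    "for": "your",
    "your": "support",
}


def generate(prompt, max_new_tokens=3):
    """Extend `prompt` one word at a time, feeding each output back in."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_tokens):
        next_word = NEXT_WORD.get(words[-1])
        if next_word is None:  # no known continuation: stop early
            break
        words.append(next_word)  # the new word becomes part of the input
    return " ".join(words)


print(generate("Thank you"))
```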

** Auto encoding models take long strings 


## Transfer Learning

Training transformers from scratch is time-consuming and resource-intensive. Not only is this expensive, it also works against efforts to limit the carbon footprint associated with models.

This is one of the motivations behind Hugging Face bringing together many pre-trained models: instead of developing their own models from scratch, teams and researchers can use existing models as a starting point for training their own. This is known as transfer learning.

In transfer learning, pre-training is performed on a huge amount of data, and this training can take weeks or months. Once the model is pre-trained, we can fine-tune it for a specific task in later, typically much shorter, training runs.

## More Transformer theory

## What is going on in the Pipeline Process

In the above examples we simply called the pipeline on some text and got an output. But what is going on within the pipeline? Let's have a look.

Broadly speaking, the pipeline is broken down into 3 main parts:
1. The raw text undergoes tokenization. The text is split into tokens (words or pieces of words) and each token is mapped to an integer. Tokenization is tied to the model that will subsequently process the text, so it is important that the tokenizer matches the model. For many deep neural networks the output of tokenization is a vector of integers together with an attention mask. The attention mask tells the model which positions hold real tokens and which are padding that should be ignored.
2. The tokenized data are passed into the model. The model's embedding layer converts each token ID into a vector. These vectors then pass through the hidden layers of the network, which produce a numerical output.
3. The numerical output is then converted back into a human-readable form, such as a label or generated text.

We look at each of these steps in code in the sections below.
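
As a warm-up, the three steps can be mimicked in plain Python. Everything in this sketch is a stand-in: the vocabulary, the per-token weights and the function names are hypothetical, and the "model" is a simple sum rather than a neural network. It only shows how the data changes shape from text, to integers, to scores, to a label.

```python
import math

# Hypothetical vocabulary: token -> integer id (0 is used for unknown words).
VOCAB = {"i": 1, "love": 2, "hate": 3, "this": 4}
# Hypothetical "model" weights: token id -> contribution to the positive score.
WEIGHTS = {1: 0.0, 2: 2.0, 3: -2.0, 4: 0.0}
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}


def tokenize(text):
    """Step 1: map raw text to integer ids."""
    return [VOCAB.get(word, 0) for word in text.lower().split()]


def model(input_ids):
    """Step 2: turn ids into raw scores (logits) for [NEGATIVE, POSITIVE]."""
    positive = sum(WEIGHTS.get(token_id, 0.0) for token_id in input_ids)
    return [-positive, positive]


def postprocess(logits):
    """Step 3: convert logits to probabilities and return the most likely label."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": ID2LABEL[best], "score": probs[best]}


print(postprocess(model(tokenize("I love this"))))
```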

## The Types of Transformers

Encoder models. These are suited to tasks that require an understanding of the input, such as sentence classification or named entity recognition.

Decoder models. These are good for tasks where natural language is the output, such as text generation.

Encoder-decoder models. These are suited to tasks that require both an understanding of the input and the ability to produce an output, such as text summarisation.
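
To tie these families back to the earlier examples, the sketch below groups the default checkpoints that the pipelines reported above by architecture family. The grouping reflects the published architecture of each checkpoint; the dictionary and function names are illustrative.

```python
# Default checkpoints reported by the pipelines earlier in this tutorial,
# grouped by architecture family.
FAMILIES = {
    "encoder": [
        "distilbert-base-uncased-finetuned-sst-2-english",   # sentiment-analysis
        "distilroberta-base",                                # fill-mask
        "dbmdz/bert-large-cased-finetuned-conll03-english",  # ner
    ],
    "decoder": [
        "gpt2",                                              # text-generation
    ],
    "encoder-decoder": [
        "facebook/bart-large-mnli",                          # zero-shot-classification
        "sshleifer/distilbart-cnn-12-6",                     # summarization
    ],
}


def family_of(checkpoint):
    """Return the architecture family for a checkpoint listed above, else None."""
    for family, checkpoints in FAMILIES.items():
        if checkpoint in checkpoints:
            return family
    return None


print(family_of("gpt2"))
```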

## Applying transformers

Previously we looked at the pipeline feature, which is a great low-code way to get started. However, many things happen within the pipeline, and we look at them below.

### Tokenization

- Let's look at the process for sentiment analysis

In [18]:
!pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [19]:
from transformers import AutoTokenizer

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

The checkpoint names the pre-trained model whose tokenizer we want to load. We pass it to the `AutoTokenizer` class. Let's take a closer look at what the tokenizer outputs.

In [20]:
raw_inputs = [
    "I really love learning NLP",
    "I hate this so much!",
]
inputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors="pt")
print(inputs)

{'input_ids': tensor([[  101,  1045,  2428,  2293,  4083, 17953,  2361,   102],
        [  101,  1045,  5223,  2023,  2061,  2172,   999,   102]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1]])}


Notice that the output from the tokenizer contains a tensor of input IDs, which correspond roughly to each word in our text (some words are split into several tokens), and an attention mask, which marks the positions that hold real tokens (1) as opposed to padding (0).

Note that `padding=True` pads every example that is shorter than the longest one in the batch with 0s, so that each example going into the network has the same length. In the output above both sentences already produce eight tokens, so no padding is needed. Similarly, `truncation=True` shortens any example that is longer than the maximum length the model accepts.
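
To make padding, truncation and the attention mask concrete, here is a plain-Python sketch of what happens to a batch of token ID lists. It is an illustration, not the library's implementation: the function name `pad_and_truncate` and its `max_length` argument are assumptions, and the padding ID of 0 matches the `[PAD]` token of BERT-style vocabularies. A real tokenizer also keeps the special tokens in place when truncating, whereas this sketch simply cuts the list.

```python
def pad_and_truncate(batch, max_length=None, pad_id=0):
    """Pad every sequence to a common length and build attention masks.

    Sequences longer than `max_length` are cut; shorter ones are padded
    with `pad_id`. The mask holds 1 for real tokens and 0 for padding.
    """
    if max_length is not None:
        batch = [ids[:max_length] for ids in batch]
    target = max(len(ids) for ids in batch)  # pad to the longest sequence

    input_ids, attention_mask = [], []
    for ids in batch:
        n_pad = target - len(ids)
        input_ids.append(ids + [pad_id] * n_pad)
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = [
    [101, 1045, 2428, 2293, 4083, 17953, 2361, 102],  # first sentence above
    [101, 1045, 5223, 102],  # a shorter sequence, built from IDs above for illustration
]
encoded = pad_and_truncate(batch)
print(encoded["input_ids"][1])
print(encoded["attention_mask"][1])
```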

### The Model

As with tokenization, we name the checkpoint we want to use and then load the model from it.

In [21]:
from transformers import AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

In [22]:
outputs = model(**inputs)  # the ** unpacks the dictionary into keyword arguments
print(outputs.logits.shape)

torch.Size([2, 2])


The model outputs a tensor of raw, unnormalised scores (logits), with one row per input sentence and one column per class. To express these as probabilities between 0 and 1, we apply a softmax across the classes.

In [23]:
import torch

predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(predictions)

tensor([[5.4148e-04, 9.9946e-01],
        [9.9946e-01, 5.4418e-04]], grad_fn=<SoftmaxBackward0>)
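
For reference, the softmax applied above turns a row of logits $z_1, \dots, z_K$ into probabilities that sum to 1. With only two classes, as here, it reduces to a sigmoid of the difference between the two logits. The symbols $z$, $K$ and $\sigma$ are notation introduced for this note.

$$
\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}},
\qquad
\mathrm{softmax}(z_1, z_2)_2 = \frac{e^{z_2}}{e^{z_1} + e^{z_2}} = \frac{1}{1 + e^{-(z_2 - z_1)}} = \sigma(z_2 - z_1)
$$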


In [24]:
model.config.id2label

{0: 'NEGATIVE', 1: 'POSITIVE'}
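
Putting the last two outputs together gives a label per sentence: take the index of the largest probability in each row and look it up in `id2label`. The sketch below uses the values printed above as plain Python lists so that it runs without the model.

```python
# Probabilities and label mapping copied from the outputs above.
predictions = [
    [5.4148e-04, 9.9946e-01],  # "I really love learning NLP"
    [9.9946e-01, 5.4418e-04],  # "I hate this so much!"
]
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

labels = []
for row in predictions:
    best = max(range(len(row)), key=row.__getitem__)  # index of the largest value
    labels.append(id2label[best])
    print(id2label[best], round(row[best], 4))
```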

Next Steps

- Fine Tuning your NLP Models- See Traditional Classification V Transformer 