# **Overview**

Hugging Face is an open source AI community where you can host and train your own AI models and collaborate with others. You can browse thousands of models for use cases such as natural language processing (NLP), audio, and computer vision. Its NLP collection covers tasks like translation, sentiment analysis, summarization, and text generation.

In this learning path, you will download a popular RoBERTa sentiment analysis NLP model from Hugging Face and deploy it using PyTorch on your Arm machine. Sentiment analysis is an NLP task that identifies and classifies the emotional tone of a piece of text. The model used here, `cardiffnlp/twitter-roberta-base-sentiment-latest`, was trained on about 124 million tweets.

**Who is this for?**

This is an introductory topic for software developers who want to learn how to run a Natural Language Processing (NLP) model from Hugging Face using PyTorch on Arm-based servers.

**What will you learn?**

Upon completion of this learning path, you will be able to:

*   Deploy a PyTorch NLP model from Hugging Face on an Arm AArch64 CPU
*   Use the PyTorch profiler to analyze the execution time of the model

**Prerequisites**

Before starting, you will need the following:

*   An Arm-based instance from a cloud service provider or an on-premises Arm server.
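
If you are unsure whether the machine you are logged in to is an Arm machine, a quick check like the one below can help. This is a minimal sketch using only the Python standard library; the helper name `is_arm64` is hypothetical and not part of the learning path.

```python
import platform


def is_arm64(machine=None):
    """Return True if the machine string names a 64-bit Arm CPU."""
    machine = (machine or platform.machine()).lower()
    # Linux reports "aarch64"; macOS reports "arm64".
    return machine in ("aarch64", "arm64")


print(f"Machine: {platform.machine()}, 64-bit Arm: {is_arm64()}")
```

On an Arm-based Linux server, the reported machine string is typically `aarch64`.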

# **Install dependencies**

The Hugging Face Transformers library provides APIs and tools that let you easily download and fine-tune pre-trained models. It supports multiple machine learning frameworks, including PyTorch, TensorFlow, and JAX. You will use Transformers with PyTorch to download the model from Hugging Face.

To install the Transformers library for PyTorch, run the following command:

`pip install 'transformers[torch]'`

The full classification example script used in this learning path uses SciPy, an open source Python library, to process the inference output from the NLP model. To install SciPy, run the following command:

`pip install scipy`
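
Before moving on, you may want to confirm that the packages are visible to the Python interpreter you plan to use. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` and the list of packages checked are assumptions for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Packages the example scripts in this learning path import
for name in ("transformers", "torch", "scipy", "numpy"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'NOT INSTALLED'}")
```

If any package is reported as not installed, re-run the corresponding `pip install` command in the same environment.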

# **Run the sentiment analysis NLP model**

You are now ready to download this model and run a full classification example from Hugging Face on your machine. Using a file editor of your choice, create a file named `sentiment-analysis.py` with the following contents:

```python
import numpy as np
import transformers
from scipy.special import softmax
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

# Show only error messages from the Transformers library
transformers.logging.set_verbosity_error()

# Preprocess text (username and link placeholders)
def preprocess(text):
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)

MODEL = "cardiffnlp/twitter-roberta-base-sentiment-latest"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
config = AutoConfig.from_pretrained(MODEL)
# Load the PyTorch version of the model
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

text = "Covid cases are increasing fast!"
text = preprocess(text)
# Tokenize the text and return PyTorch tensors
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
# Convert the raw model outputs (logits) to probabilities
scores = output[0][0].detach().numpy()
scores = softmax(scores)

# Print labels and scores, highest score first
ranking = np.argsort(scores)
ranking = ranking[::-1]
for i in range(scores.shape[0]):
    l = config.id2label[ranking[i]]
    s = scores[ranking[i]]
    print(f"{i+1}) {l} {np.round(float(s), 4)}")
```

The classification output should look similar to the following:

```
1) negative 0.7236
2) neutral 0.2287
3) positive 0.0477
```


This example does the following:

*   Downloads and creates an instance of the RoBERTa model
*   Creates a tokenizer which prepares the inputs as tensors for the model
*   Pre-processes the input text to the model
*   Encodes the input text to the model
*   Passes the encoded input text to the model and performs the sentiment analysis
*   Obtains the output classification score
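
The last two steps in this list turn the raw model outputs (logits) into ranked probabilities. The sketch below reproduces that post-processing with only the Python standard library so you can see what `softmax` and the ranking loop compute. The logits are made-up values, not real model output, and the `id2label` mapping and the helper name `rank_labels` are assumptions for illustration.

```python
import math


def softmax(logits):
    """Convert a list of logits to probabilities that sum to 1."""
    # Subtracting the maximum improves numerical stability
    # without changing the result.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits, id2label):
    """Return (label, score) pairs sorted from highest to lowest score."""
    scores = softmax(logits)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(id2label[i], scores[i]) for i in order]


# Hypothetical label mapping and logits, for illustration only
id2label = {0: "negative", 1: "neutral", 2: "positive"}
logits = [2.0, 1.0, -1.0]

for position, (label, score) in enumerate(rank_labels(logits, id2label), start=1):
    print(f"{position}) {label} {score:.4f}")
```

The label with the largest logit always receives the highest probability, because softmax preserves the ordering of its inputs.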

You have successfully performed sentiment analysis on the input text, all running on your Arm AArch64 CPU. You can change the input text in your example and re-run the classification example.
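
When you try your own input text, keep in mind that the script's `preprocess` function replaces user mentions and links with placeholders before tokenization. The sketch below runs that same function on its own, without loading the model; the sample tweet is made up for illustration.

```python
def preprocess(text):
    """Replace @mentions with '@user' and links with 'http'."""
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)


# Hypothetical input text containing a mention and a link
sample = "@arm_dev great results on Graviton! https://example.com/post"
print(preprocess(sample))
# prints: @user great results on Graviton! http
```

A lone `@` is left unchanged because the function only replaces tokens that start with `@` and are longer than one character.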

# **Sentiment Analysis Profile**

Now that you have run the model, let’s add the ability to profile the model execution. You can use the PyTorch Profiler to analyze the execution time on the CPU. Copy the contents shown below into a file named `sentiment-analysis-profile.py`:

```python
import numpy as np
import transformers
from scipy.special import softmax
from torch.profiler import profile, record_function, ProfilerActivity
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

# Show only error messages from the Transformers library
transformers.logging.set_verbosity_error()

# Preprocess text (username and link placeholders)
def preprocess(text):
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)

MODEL = "cardiffnlp/twitter-roberta-base-sentiment-latest"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
config = AutoConfig.from_pretrained(MODEL)
# Load the PyTorch version of the model
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

text = "Covid cases are increasing fast!"
text = preprocess(text)
encoded_input = tokenizer(text, return_tensors='pt')

# Profile CPU activity during a single inference and record operator input shapes
with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    # Label the inference so it appears as its own row in the profiler table
    with record_function("model_inference"):
        output = model(**encoded_input)

# Print basic stats, sorted by the time spent in each operator itself
print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10))

# Convert the raw model outputs (logits) to probabilities
scores = output[0][0].detach().numpy()
scores = softmax(scores)

# Print labels and scores, highest score first
ranking = np.argsort(scores)
ranking = ranking[::-1]
for i in range(scores.shape[0]):
    l = config.id2label[ranking[i]]
    s = scores[ranking[i]]
    print(f"{i+1}) {l} {np.round(float(s), 4)}")
```

The profiler output should look similar to the following (truncated here):

```
---------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                       Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls
---------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                aten::addmm        63.03%      85.831ms        64.39%      87.681ms       1.185ms            74
            model_inference        21.30%      29.005ms       100.00%     136.181ms     136.181ms             1
           aten::layer_norm         1.89%       2.572ms         2.97%       4.040ms     161.600us            25
                 aten::gelu         1.73%       2.353ms         1.73%       2.353ms     196.083us            12
                   aten::to         1.52%       2.064ms         2.22%       3.018ms     150.900us            20
                aten::copy_         1.38%       1.884ms         1.38%       1.884ms
```

In addition to the classification output from the model, you can now see the execution time for the different operators.
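
The profiler reports a single run, which can include one-time startup costs. As a complement, you could measure wall-clock latency over several runs. The following is a minimal sketch using only the standard library; the helper name `time_call` and its `warmup` and `runs` parameters are hypothetical, and the workload shown is a stand-in that you would replace with your own inference call, for example `lambda: model(**encoded_input)`.

```python
import statistics
import time


def time_call(fn, warmup=2, runs=10):
    """Call fn repeatedly and return (mean_ms, best_ms) over the timed runs."""
    # Warm-up calls are executed but not timed
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), min(samples)


# Stand-in workload for illustration; replace with your inference call
mean_ms, best_ms = time_call(lambda: sum(i * i for i in range(10_000)))
print(f"mean: {mean_ms:.3f} ms, best: {best_ms:.3f} ms")
```

Comparing the mean against the best run gives a rough sense of how much run-to-run variation there is on your machine.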

You can experiment with the BFloat16 floating-point number format and transparent huge pages (THP) settings in PyTorch to see how they affect the performance of your model.
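
One way to try these settings is through environment variables that are set before PyTorch is imported. The sketch below is an assumption-laden example, not a definitive recipe: the variable names `DNNL_DEFAULT_FPMATH_MODE` and `THP_MEM_ALLOC_ENABLE` come from PyTorch tuning guidance for Arm CPUs, and whether they have any effect depends on your PyTorch build and on your CPU supporting BFloat16. The helper name `configure_arm_tuning` is hypothetical.

```python
import os

# Assumed variable names from PyTorch tuning guidance for Arm CPUs.
# Their effect depends on your PyTorch build and hardware.
ARM_TUNING_DEFAULTS = {
    "DNNL_DEFAULT_FPMATH_MODE": "BF16",  # request BFloat16 fast math
    "THP_MEM_ALLOC_ENABLE": "1",         # request transparent huge pages for allocations
}


def configure_arm_tuning(env=None):
    """Set tuning variables unless already defined; return the values in effect."""
    env = os.environ if env is None else env
    for name, value in ARM_TUNING_DEFAULTS.items():
        # setdefault keeps any value the user has already exported
        env.setdefault(name, value)
    return {name: env[name] for name in ARM_TUNING_DEFAULTS}


print(configure_arm_tuning())
# Import torch and load the model only after this point,
# so the settings are in place when the libraries initialize.
```

Run the profiling script with and without these settings and compare the reported times to see whether they help for your model.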

You have successfully run and profiled a sentiment analysis NLP model from Hugging Face on your Arm machine. You can explore running other models and use cases just as easily.