Hugging Face NLP sentiment analysis LP #744

Merged
@@ -0,0 +1,33 @@
---
title: Run a Natural Language Processing (NLP) model from Hugging Face on Arm servers

minutes_to_complete: 20

who_is_this_for: This is an introductory topic for software developers who want to learn how to run an NLP model from Hugging Face using PyTorch on Arm-based servers.

learning_objectives:
- Deploy a PyTorch NLP model from Hugging Face on an Arm AArch64 CPU
- Use the PyTorch profiler to analyze the execution time of the model

prerequisites:
- An [Arm-based instance](/learning-paths/servers-and-cloud-computing/csp/) from a cloud service provider or an on-premises Arm server.

author_primary: Pareena Verma

### Tags
skilllevels: Introductory
subjects: ML
armips:
- Neoverse
operatingsystems:
- Linux
tools_software_languages:
- Python
- PyTorch

### FIXED, DO NOT MODIFY
# ================================================================================
weight: 1 # _index.md always has weight of 1 to order correctly
layout: "learningpathall" # All files under learning paths have this same wrapper
learning_path_main_page: "yes" # This should be surfaced when looking for related content. Only set for _index.md of learning path content.
---
@@ -0,0 +1,32 @@
---
next_step_guidance: >
    Thank you for completing this learning path on how to run an NLP sentiment analysis model on an Arm server. You might be interested in learning how to use Keras Core with TensorFlow, PyTorch, and JAX backends.

recommended_path: "/learning-paths/servers-and-cloud-computing/keras-core/"

further_reading:
    - resource:
        title: Hugging Face Documentation
        link: https://huggingface.co/docs
        type: documentation
    - resource:
        title: PyTorch Inference Performance Tuning on AWS Graviton Processors
        link: https://pytorch.org/tutorials/recipes/inference_tuning_on_aws_graviton.html
        type: documentation
    - resource:
        title: ML inference on Graviton CPUs with PyTorch
        link: https://github.com/aws/aws-graviton-getting-started/blob/main/machinelearning/pytorch.md
        type: documentation
    - resource:
        title: PyTorch Documentation
        link: https://pytorch.org/docs/stable/index.html
        type: documentation


# ================================================================================
# FIXED, DO NOT MODIFY
# ================================================================================
weight: 21 # set to always be larger than the content in this path, and one more than 'review'
title: "Next Steps" # Always the same
layout: "learningpathall" # All files under learning paths have this same wrapper
---
@@ -0,0 +1,29 @@
---
review:
    - questions:
        question: >
            Does PyTorch run on Arm servers?
        answers:
            - "Yes"
            - "No"
        correct_answer: 1
        explanation: >
            PyTorch is an open-source machine learning framework. It can be installed and used on Arm servers to build and deploy various neural networks.

    - questions:
        question: >
            Can you run a Hugging Face model through PyTorch on an Arm AArch64 CPU?
        answers:
            - "Yes"
            - "No"
        correct_answer: 1
        explanation: >
            You can run and deploy models from Hugging Face on Arm CPUs using PyTorch.

# ================================================================================
# FIXED, DO NOT MODIFY
# ================================================================================
title: "Review" # Always the same title
weight: 20 # Set to always be larger than the content in this path
layout: "learningpathall" # All files under learning paths have this same wrapper
---
@@ -0,0 +1,187 @@
---
title: Run a
weight: 2

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Before you begin
The instructions in this learning path are for any Arm server running Ubuntu 22.04 LTS.

Before you begin, you will need to install [PyTorch](/install-guides/pytorch) on your Arm machine.
PyTorch is a widely used machine learning framework for Python. You will use PyTorch to deploy a Natural Language Processing (NLP) model on your Arm machine.

## Overview

[Hugging Face](https://huggingface.co/) is an open source AI community where you can host your own AI models, train them, and collaborate with others in the community. You can browse thousands of models that are available for a variety of use cases such as natural language processing, audio, and computer vision. Hugging Face has a large collection of NLP models for tasks like translation, sentiment analysis, summarization, and text generation.

In this learning path, you will download a popular [RoBERTa sentiment analysis](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest) NLP model from Hugging Face and deploy it using PyTorch on your Arm machine. Sentiment analysis is a type of NLP algorithm used to identify and classify the emotional tone of a piece of text. This model has been trained with over 124 million tweets.

## Install dependencies

The Hugging Face Transformers library provides APIs and tools that let you easily download and train pre-trained models. Transformers supports multiple machine learning frameworks, including PyTorch, TensorFlow, and JAX. You will use Transformers with PyTorch to download the model from Hugging Face.

To install the Transformers library for PyTorch, run the following command:

```bash
pip install 'transformers[torch]'
```

The example script for the RoBERTa sentiment analysis NLP model uses SciPy, an open source Python library for scientific and mathematical computing, to convert the model output into probabilities. To install SciPy, run the following command:

```bash
pip install scipy
```
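
Before moving on, it can help to confirm that the packages are visible to the Python interpreter you plan to use. The following is a minimal sketch and not part of the original instructions: the helper name `installed_versions` is hypothetical, and it relies only on the standard library `importlib.metadata` module to look up the distribution names `torch`, `transformers`, and `scipy`.

```python
import platform
from importlib import metadata


def installed_versions(names):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # On a 64-bit Arm Linux machine this typically reports "aarch64".
    print("Machine:", platform.machine())
    found = installed_versions(["torch", "transformers", "scipy"])
    for name, version in found.items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If any package is reported as not installed, re-run the corresponding `pip install` command in the same environment.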

## Run the sentiment analysis NLP model

You are now ready to download this model and run a full classification example from Hugging Face on your machine. Using a file editor of your choice, create a file named `sentiment-analysis.py`:

```python
from transformers import AutoModelForSequenceClassification
from transformers import AutoTokenizer, AutoConfig
import numpy as np
from scipy.special import softmax
import transformers

# Only report errors from the Transformers library
transformers.logging.set_verbosity_error()

# Preprocess text (username and link placeholders)
def preprocess(text):
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)

MODEL = "cardiffnlp/twitter-roberta-base-sentiment-latest"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
config = AutoConfig.from_pretrained(MODEL)

# Load the PyTorch (PT) version of the model
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

text = "Covid cases are increasing fast!"
text = preprocess(text)
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)

# Convert the raw model output to probabilities
scores = output[0][0].detach().numpy()
scores = softmax(scores)

# Print labels and scores
ranking = np.argsort(scores)
ranking = ranking[::-1]
for i in range(scores.shape[0]):
    l = config.id2label[ranking[i]]
    s = scores[ranking[i]]
    print(f"{i+1}) {l} {np.round(float(s), 4)}")
```
This example does the following:

* Downloads and creates an instance of the RoBERTa sentiment analysis model.
* Creates a `tokenizer` which prepares the inputs as tensors for the model.
* Pre-processes the input text to the model.
* Encodes the input text to the model.
* Passes the encoded input text to the model and performs the sentiment analysis.
* Obtains the output classification scores and prints them, ranked from most to least likely.
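
To make the pre-processing step concrete, here is the same `preprocess` function run on its own, without the model. The sample sentence is a made-up input used only for illustration; it is not from the original example.

```python
def preprocess(text):
    """Replace user mentions with '@user' and links with 'http'."""
    new_text = []
    for t in text.split(" "):
        t = "@user" if t.startswith("@") and len(t) > 1 else t
        t = "http" if t.startswith("http") else t
        new_text.append(t)
    return " ".join(new_text)


# Hypothetical input, chosen only to exercise both placeholder rules.
sample = "@alice did you read https://example.com/report today?"
print(preprocess(sample))
# prints: @user did you read http today?
```

The input text used in `sentiment-analysis.py` contains no mentions or links, so pre-processing leaves it unchanged.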

Run this script:

```bash
python sentiment-analysis.py
```

The output from this script should look like:

```output
1) negative 0.7236
2) neutral 0.2287
3) positive 0.0477
```

You have successfully performed sentiment analysis on the input text, all running on your Arm AArch64 CPU. You can change the input text in your example and re-run the classification example.
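
The three scores in the output are probabilities that sum to 1. The model itself returns raw scores (logits), and the softmax step converts them. The sketch below shows that conversion and the ranking in plain Python. The logits and the label order are hypothetical values chosen for illustration, not values produced by the RoBERTa model.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum avoids overflow in exp() without changing the result.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical labels and logits, used only to illustrate the calculation.
labels = ["negative", "neutral", "positive"]
logits = [2.0, 1.0, -1.0]

probs = softmax(logits)

# Order the label indices from highest to lowest probability.
ranking = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
for rank, idx in enumerate(ranking, start=1):
    print(f"{rank}) {labels[idx]} {probs[idx]:.4f}")
```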

Now that you have run the model, let's add the ability to profile the model execution. You can use the [PyTorch Profiler](https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html) to analyze the execution time on the CPU. Copy the contents shown below into a file named `sentiment-analysis-profile.py`:

```python
from transformers import AutoModelForSequenceClassification
from transformers import AutoTokenizer, AutoConfig
import numpy as np
from scipy.special import softmax
import transformers
from torch.profiler import profile, record_function, ProfilerActivity

# Only report errors from the Transformers library
transformers.logging.set_verbosity_error()

# Preprocess text (username and link placeholders)
def preprocess(text):
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)

MODEL = "cardiffnlp/twitter-roberta-base-sentiment-latest"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
config = AutoConfig.from_pretrained(MODEL)

# Load the PyTorch (PT) version of the model
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

text = "Covid cases are increasing fast!"
text = preprocess(text)
encoded_input = tokenizer(text, return_tensors='pt')

# Profile CPU activity while the model runs inference
with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    with record_function("model_inference"):
        output = model(**encoded_input)

# Print basic stats
print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10))

# Convert the raw model output to probabilities
scores = output[0][0].detach().numpy()
scores = softmax(scores)

# Print labels and scores
ranking = np.argsort(scores)
ranking = ranking[::-1]
for i in range(scores.shape[0]):
    l = config.id2label[ranking[i]]
    s = scores[ranking[i]]
    print(f"{i+1}) {l} {np.round(float(s), 4)}")
```

Run this Python script:

```bash
python sentiment-analysis-profile.py
```

The output should look similar to:

```output
STAGE:2024-02-27 17:26:22 18170:18170 ActivityProfilerController.cpp:314] Completed Stage: Warm Up
STAGE:2024-02-27 17:26:22 18170:18170 ActivityProfilerController.cpp:320] Completed Stage: Collection
STAGE:2024-02-27 17:26:22 18170:18170 ActivityProfilerController.cpp:324] Completed Stage: Post Processing
---------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                       Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls
---------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                aten::addmm        56.56%      29.355ms        57.96%      30.085ms     406.554us            74
            model_inference        15.24%       7.910ms       100.00%      51.903ms      51.903ms             1
                  aten::bmm         4.86%       2.521ms         7.37%       3.823ms     159.292us            24
               aten::select         2.55%       1.323ms         2.58%       1.337ms       1.535us           871
                 aten::view         1.98%       1.030ms         1.98%       1.030ms       3.962us           260
               aten::linear         1.97%       1.022ms        62.89%      32.640ms     441.081us            74
    aten::native_layer_norm         1.87%     968.000us         2.07%       1.072ms      42.880us            25
                 aten::gelu         1.76%     912.000us         1.76%     912.000us      76.000us            12
                aten::copy_         1.36%     706.000us         1.36%     706.000us       6.660us           106
               aten::expand         0.95%     492.000us         0.98%     509.000us       4.138us           123
---------------------------  ------------  ------------  ------------  ------------  ------------  ------------
Self CPU time total: 51.903ms

1) negative 0.7236
2) neutral 0.2287
3) positive 0.0477
```
In addition to the classification output from the model, you can now see the execution time for the different operators.
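
When reading the table, note that "Self CPU" counts only the time spent in an operator itself, while "CPU total" also includes the operators it calls. This is most likely why `aten::linear` shows a small self time (1.022ms) but a large total (32.640ms): most of its work happens inside `aten::addmm`. The short sketch below recomputes a few percentages from the example output to show how the "Self CPU %" column is derived. The numbers are copied from that output and will differ on your machine.

```python
# Self CPU times in milliseconds, copied from the example profiler output.
self_cpu_ms = {
    "aten::addmm": 29.355,
    "model_inference": 7.910,
    "aten::bmm": 2.521,
}
total_ms = 51.903  # "Self CPU time total" from the same output


def share(ms, total):
    """Return ms as a percentage of total."""
    return 100.0 * ms / total


for name, ms in self_cpu_ms.items():
    print(f"{name}: {share(ms, total_ms):.2f}% of self CPU time")

# Combined share of the two matrix multiplication operators.
matmul_ms = self_cpu_ms["aten::addmm"] + self_cpu_ms["aten::bmm"]
print(f"addmm + bmm: {share(matmul_ms, total_ms):.1f}% of self CPU time")
```

In this example run, matrix multiplication accounts for roughly 61% of the self CPU time, which makes it the first place to look when tuning performance.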

You can experiment with the [BFloat16 floating-point number format](/install-guides/pytorch.md#bfloat16-floating-point-number-format) and [Transparent huge pages](/install-guides/pytorch.md#transparent-huge-pages) settings with PyTorch to see how they affect the performance of your model.
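
Before changing the transparent huge pages setting, you may want to check its current value. The sketch below is an assumption-laden helper rather than part of the install guide: it reads the standard Linux sysfs file `/sys/kernel/mm/transparent_hugepage/enabled`, where the active mode is shown in square brackets, and the function name `active_thp_mode` is hypothetical.

```python
from pathlib import Path

# Standard Linux sysfs location for the THP mode; it may be absent on some kernels.
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(contents):
    """Return the bracketed entry, e.g. 'madvise' from 'always [madvise] never'."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


try:
    print("Transparent huge pages:", active_thp_mode(THP_PATH.read_text()))
except OSError:
    # The file is missing or unreadable, for example on a non-Linux system.
    print("Transparent huge pages setting not available on this system")
```

Recording this value alongside each profiler run makes it easier to compare results across settings.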

You have successfully run and profiled a sentiment analysis NLP model from Hugging Face on your Arm machine. You can explore running other models and use cases just as easily.