ArmDeveloperEcosystem · pareenaverma · Mar 4, 2024 · Feb 27, 2024 · Feb 27, 2024 · Feb 27, 2024
diff --git a/content/learning-paths/servers-and-cloud-computing/nlp-hugging-face/_index.md b/content/learning-paths/servers-and-cloud-computing/nlp-hugging-face/_index.md
@@ -0,0 +1,33 @@
+---
+title: Run a Natural Language Processing (NLP) model from Hugging Face on Arm servers
+
+minutes_to_complete: 20
+
+who_is_this_for: This is an introductory topic for software developers who want to learn how to run an NLP model from Hugging Face using PyTorch on Arm-based servers.
+
+learning_objectives:
+    - Deploy a PyTorch NLP model from Hugging Face on an Arm AArch64 CPU
+    - Use the PyTorch profiler to analyze the execution time of the model
+
+prerequisites:
+    - An [Arm-based instance](/learning-paths/servers-and-cloud-computing/csp/) from a cloud service provider or an on-premises Arm server.
+
+author_primary: Pareena Verma
+
+### Tags
+skilllevels: Introductory
+subjects: ML
+armips:
+    - Neoverse 
+operatingsystems:
+    - Linux 
+tools_software_languages:
+    - Python
+    - PyTorch
+
+### FIXED, DO NOT MODIFY
+# ================================================================================
+weight: 1                       # _index.md always has weight of 1 to order correctly
+layout: "learningpathall"       # All files under learning paths have this same wrapper
+learning_path_main_page: "yes"  # This should be surfaced when looking for related content. Only set for _index.md of learning path content.
+---
diff --git a/content/learning-paths/servers-and-cloud-computing/nlp-hugging-face/_next-steps.md b/content/learning-paths/servers-and-cloud-computing/nlp-hugging-face/_next-steps.md
@@ -0,0 +1,32 @@
+---
+next_step_guidance: >
+    Thank you for completing this learning path on how to run an NLP sentiment analysis model on an Arm server. You might be interested in learning how to use Keras Core with TensorFlow, PyTorch, and JAX backends.
+
+recommended_path: "/learning-paths/servers-and-cloud-computing/keras-core/"
+
+further_reading:
+    - resource:
+        title: Hugging Face Documentation
+        link: https://huggingface.co/docs
+        type: documentation
+    - resource:
+        title: PyTorch Inference Performance Tuning on AWS Graviton Processors
+        link: https://pytorch.org/tutorials/recipes/inference_tuning_on_aws_graviton.html
+        type: documentation
+    - resource:
+        title: ML inference on Graviton CPUs with PyTorch
+        link: https://github.com/aws/aws-graviton-getting-started/blob/main/machinelearning/pytorch.md
+        type: documentation
+    - resource:
+        title: PyTorch Documentation
+        link: https://pytorch.org/docs/stable/index.html
+        type: documentation
+
+
+# ================================================================================
+#       FIXED, DO NOT MODIFY
+# ================================================================================
+weight: 21                  # set to always be larger than the content in this path, and one more than 'review'
+title: "Next Steps"         # Always the same
+layout: "learningpathall"   # All files under learning paths have this same wrapper
+---
diff --git a/content/learning-paths/servers-and-cloud-computing/nlp-hugging-face/_review.md b/content/learning-paths/servers-and-cloud-computing/nlp-hugging-face/_review.md
@@ -0,0 +1,29 @@
+---
+review:
+    - questions:
+        question: >
+            Does PyTorch run on Arm servers?
+        answers:
+            - "Yes"
+            - "No"
+        correct_answer: 1
+        explanation: >
+            PyTorch is an open-source machine learning framework. It can be installed and used on Arm servers to build and deploy various neural networks.
+
+    - questions:
+        question: >
+            Can you run a Hugging Face model through PyTorch on an Arm AArch64 CPU?
+        answers:
+            - "Yes"
+            - "No"
+        correct_answer: 1
+        explanation: >
+            You can run and deploy models from Hugging Face on Arm CPUs using PyTorch.
+
+# ================================================================================
+#       FIXED, DO NOT MODIFY
+# ================================================================================
+title: "Review"                 # Always the same title
+weight: 20                      # Set to always be larger than the content in this path
+layout: "learningpathall"       # All files under learning paths have this same wrapper
+---
diff --git a/...t/learning-paths/servers-and-cloud-computing/nlp-hugging-face/pytorch-nlp-hf.md b/...t/learning-paths/servers-and-cloud-computing/nlp-hugging-face/pytorch-nlp-hf.md
@@ -0,0 +1,187 @@
+---
+title: Run a 
+weight: 2
+
+### FIXED, DO NOT MODIFY
+layout: learningpathall
+---
+
+## Before you begin
+The instructions in this learning path are for any Arm server running Ubuntu 22.04 LTS.
+
+Before you begin, install [PyTorch](/install-guides/pytorch) on your Arm machine.
+PyTorch is a widely used machine learning framework for Python. You will use PyTorch to deploy a Natural Language Processing (NLP) model on your Arm machine.
+
+## Overview
+
+[Hugging Face](https://huggingface.co/) is an open-source AI community where you can host your own AI models, train them, and collaborate with others. You can browse thousands of models for a variety of use cases such as natural language processing, audio, and computer vision. Hugging Face has a large collection of NLP models for tasks like translation, sentiment analysis, summarization, and text generation.
+
+In this learning path, you will download a popular [RoBERTa sentiment analysis](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest) NLP model from Hugging Face and deploy it using PyTorch on your Arm machine. Sentiment analysis is an NLP task that identifies and classifies the emotional tone of a piece of text. This model has been trained on approximately 124 million tweets.
+
+## Install dependencies
+
+The Hugging Face Transformers library provides APIs and tools that let you easily download and train pre-trained models. Hugging Face Transformers supports multiple machine learning frameworks, including PyTorch, TensorFlow, and JAX. You will use Transformers with PyTorch to download the model from Hugging Face.
+
+To install the Transformers library for PyTorch, run the following command:
+
+```bash
+pip install 'transformers[torch]'
+```
+
+The example script for the RoBERTa sentiment analysis NLP model uses SciPy, an open-source Python library for scientific and mathematical computing, to convert the model output into probabilities. To install SciPy, run the following command:
+
+```bash 
+pip install scipy
+```
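Before moving on, you may want to confirm that the packages are visible to your Python interpreter. The following is a minimal sketch that uses only the standard library; the helper name `check_packages` is hypothetical and is not part of PyTorch, Transformers, or SciPy.

```python
from importlib.util import find_spec


def check_packages(names):
    """Return a dict mapping each top-level package name to True if it can be found."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Report which of the packages used in this learning path are importable
    for name, found in check_packages(["torch", "transformers", "scipy"]).items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If a package is reported as missing, re-run the corresponding `pip install` command in the same Python environment.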
+
+## Run the sentiment analysis NLP model 
+
+You are now ready to download this model and run a full classification example from Hugging Face on your machine. Using a file editor of your choice, create a file named `sentiment-analysis.py`:
+
+```python
+from transformers import AutoModelForSequenceClassification
+from transformers import AutoTokenizer, AutoConfig
+import numpy as np
+from scipy.special import softmax
+import transformers
+# Show only error messages from the Transformers library
+transformers.logging.set_verbosity_error()
+# Preprocess text (username and link placeholders)
+def preprocess(text):
+    new_text = []
+    for t in text.split(" "):
+        t = '@user' if t.startswith('@') and len(t) > 1 else t
+        t = 'http' if t.startswith('http') else t
+        new_text.append(t)
+    return " ".join(new_text)
+MODEL = "cardiffnlp/twitter-roberta-base-sentiment-latest"
+tokenizer = AutoTokenizer.from_pretrained(MODEL)
+config = AutoConfig.from_pretrained(MODEL)
+# Load the PyTorch version of the model
+model = AutoModelForSequenceClassification.from_pretrained(MODEL)
+text = "Covid cases are increasing fast!"
+text = preprocess(text)
+encoded_input = tokenizer(text, return_tensors='pt')
+output = model(**encoded_input)
+scores = output[0][0].detach().numpy()
+scores = softmax(scores)
+# Print labels and scores
+ranking = np.argsort(scores)
+ranking = ranking[::-1]
+for i in range(scores.shape[0]):
+    l = config.id2label[ranking[i]]
+    s = scores[ranking[i]]
+    print(f"{i+1}) {l} {np.round(float(s), 4)}")
+```
+This example does the following:
+
+* Creates a `tokenizer`, which prepares the input text as tensors for the model.
+* Downloads and creates an instance of the RoBERTa sentiment analysis model.
+* Pre-processes the input text, replacing user mentions and links with placeholders.
+* Encodes the pre-processed text.
+* Passes the encoded input to the model, which performs the sentiment analysis.
+* Converts the model output into classification scores and prints them from highest to lowest.
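The final scoring and ranking steps can be sketched without downloading the model. The snippet below is an illustration only: it uses the standard library in place of SciPy and NumPy, and the logits and the label order are made-up values, not output from the RoBERTa model.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits, labels):
    """Return (label, probability) pairs sorted from most to least likely."""
    probs = softmax(logits)
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)


# Hypothetical values for illustration only
labels = ["negative", "neutral", "positive"]
logits = [2.0, 1.0, -1.0]

for position, (label, prob) in enumerate(rank_labels(logits, labels), start=1):
    print(f"{position}) {label} {round(prob, 4)}")
```

In the script, `scipy.special.softmax` and `np.argsort` play the same roles on the tensor returned by the model.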
+
+Run this script:
+
+```bash
+python sentiment-analysis.py
+```
+
+The output from this script should look like:
+
+```output
+1) negative 0.7236
+2) neutral 0.2287
+3) positive 0.0477
+```
+
+You have successfully performed sentiment analysis on the input text, all running on your Arm AArch64 CPU. You can change the input text in your example and re-run the classification example.
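When you try your own text, it can help to see what the `preprocess` function does before tokenization. The sketch below reuses that function from the script on its own; the sample strings are invented for illustration.

```python
def preprocess(text):
    """Replace user mentions and links with the placeholders used in the script."""
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)


# Made-up sample inputs
samples = [
    "@alice check https://example.com now",
    "Great weather today!",
]
for sample in samples:
    print(preprocess(sample))
```

Text without mentions or links passes through unchanged, so only tweet-style input is affected by this step.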
+
+Now that you have run the model, let's add the ability to profile the model execution. You can use the [PyTorch Profiler](https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html) to analyze the execution time on the CPU. Copy the contents shown below into a file named `sentiment-analysis-profile.py`:
+
+```python
+from transformers import AutoModelForSequenceClassification
+from transformers import AutoTokenizer, AutoConfig
+import numpy as np
+from scipy.special import softmax
+import transformers
+# Show only error messages from the Transformers library
+transformers.logging.set_verbosity_error()
+import torch
+from torch.profiler import profile, record_function, ProfilerActivity
+# Preprocess text (username and link placeholders)
+def preprocess(text):
+    new_text = []
+    for t in text.split(" "):
+        t = '@user' if t.startswith('@') and len(t) > 1 else t
+        t = 'http' if t.startswith('http') else t
+        new_text.append(t)
+    return " ".join(new_text)
+MODEL = "cardiffnlp/twitter-roberta-base-sentiment-latest"
+tokenizer = AutoTokenizer.from_pretrained(MODEL)
+config = AutoConfig.from_pretrained(MODEL)
+# Load the PyTorch version of the model
+model = AutoModelForSequenceClassification.from_pretrained(MODEL)
+text = "Covid cases are increasing fast!"
+text = preprocess(text)
+encoded_input = tokenizer(text, return_tensors='pt')
+with profile(activities=[ProfilerActivity.CPU],
+             record_shapes=True) as prof:
+    with record_function("model_inference"):
+        output = model(**encoded_input)
+
+# Print the 10 operators with the highest self CPU time
+print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10))
+
+scores = output[0][0].detach().numpy()
+scores = softmax(scores)
+# Print labels and scores
+ranking = np.argsort(scores)
+ranking = ranking[::-1]
+for i in range(scores.shape[0]):
+    l = config.id2label[ranking[i]]
+    s = scores[ranking[i]]
+    print(f"{i+1}) {l} {np.round(float(s), 4)}")
+```
+
+Run this Python script:
+
+```bash
+python sentiment-analysis-profile.py
+```
+
+The output should look similar to:
+
+```output
+STAGE:2024-02-27 17:26:22 18170:18170 ActivityProfilerController.cpp:314] Completed Stage: Warm Up
+STAGE:2024-02-27 17:26:22 18170:18170 ActivityProfilerController.cpp:320] Completed Stage: Collection
+STAGE:2024-02-27 17:26:22 18170:18170 ActivityProfilerController.cpp:324] Completed Stage: Post Processing
+---------------------------  ------------  ------------  ------------  ------------  ------------  ------------
+                       Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls
+---------------------------  ------------  ------------  ------------  ------------  ------------  ------------
+                aten::addmm        56.56%      29.355ms        57.96%      30.085ms     406.554us            74
+            model_inference        15.24%       7.910ms       100.00%      51.903ms      51.903ms             1
+                  aten::bmm         4.86%       2.521ms         7.37%       3.823ms     159.292us            24
+               aten::select         2.55%       1.323ms         2.58%       1.337ms       1.535us           871
+                 aten::view         1.98%       1.030ms         1.98%       1.030ms       3.962us           260
+               aten::linear         1.97%       1.022ms        62.89%      32.640ms     441.081us            74
+    aten::native_layer_norm         1.87%     968.000us         2.07%       1.072ms      42.880us            25
+                 aten::gelu         1.76%     912.000us         1.76%     912.000us      76.000us            12
+                aten::copy_         1.36%     706.000us         1.36%     706.000us       6.660us           106
+               aten::expand         0.95%     492.000us         0.98%     509.000us       4.138us           123
+---------------------------  ------------  ------------  ------------  ------------  ------------  ------------
+Self CPU time total: 51.903ms
+
+1) negative 0.7236
+2) neutral 0.2287
+3) positive 0.0477
+```
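+In addition to the classification output from the model, you can now see the execution time for the different operators. In this run, `aten::addmm`, the matrix multiply-and-add operator used by the linear layers, accounts for about 57% of the self CPU time.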
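The profiler table describes a single run. If you also want a rough end-to-end latency figure, one option is to time repeated calls after a few warm-up iterations. This is a sketch that uses only the standard library; `measure_latency` is a hypothetical helper, and the lambda is a placeholder for a call such as `model(**encoded_input)`.

```python
import time


def measure_latency(fn, warmup=3, runs=10):
    """Return the mean wall-clock time in seconds of fn() over `runs` calls."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


# Placeholder workload; in the profiling script you could pass
# lambda: model(**encoded_input) instead.
mean_seconds = measure_latency(lambda: sum(range(10_000)))
print(f"mean latency: {mean_seconds * 1000:.3f} ms")
```

Averaging over several runs smooths out one-time costs, such as memory allocation on the first call, that a single profiled run includes.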
+
+You can experiment with the [BFloat16 floating-point number format](/install-guides/pytorch.md#bfloat16-floating-point-number-format) and [Transparent huge pages](/install-guides/pytorch.md#transparent-huge-pages) settings with PyTorch and see how they affect the performance of your model.
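As a starting point for that experiment, the sketch below sets two environment variables from Python. It assumes the settings are controlled by `DNNL_DEFAULT_FPMATH_MODE` and `THP_MEM_ALLOC_ENABLE`; treat these names and values as assumptions and confirm them in the PyTorch install guide for your PyTorch version. The helper `apply_settings` is hypothetical.

```python
import os

# Assumed variable names and values; confirm them in the PyTorch install guide.
ASSUMED_SETTINGS = {
    "DNNL_DEFAULT_FPMATH_MODE": "BF16",  # BFloat16 fast math mode
    "THP_MEM_ALLOC_ENABLE": "1",         # transparent huge pages for allocations
}


def apply_settings(settings, environ=os.environ):
    """Set each variable only if it is not already defined, and return what was set."""
    applied = {}
    for key, value in settings.items():
        if key not in environ:
            environ[key] = value
            applied[key] = value
    return applied


# Call this before importing torch so the variables are in place
# when PyTorch initializes.
print(apply_settings(ASSUMED_SETTINGS))
```

Because variables that are already defined are left untouched, you can still override either setting from the shell and compare the profiler output between runs.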
+
+You have successfully run and profiled a sentiment analysis NLP model from Hugging Face on your Arm machine. You can explore running other models and use cases just as easily.
+
+
+