<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/polyhedron-gdl/halloween-seminar-2023/blob/main/1_notebooks/chapter-10-01.ipynb">
        <img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
</table>

# Introduction to Hugging Face

## What is Hugging Face?

Hugging Face is an AI company that has become a major hub for open-source machine learning (ML). Their platform has 3 major elements which allow users to access and share machine learning resources.

- First is their rapidly growing repository of pre-trained open-source ML models for things such as natural language processing (NLP), computer vision, and more.

- Second is their library of datasets for training ML models for almost any task.

- Third, and finally, is Spaces, a collection of open-source ML apps hosted by Hugging Face.

The power of these resources is that they are community generated, which brings all the benefits of open source (i.e., cost-free, wide diversity of tools, high-quality resources, and rapid pace of innovation). While these make building powerful ML projects more accessible than before, there is another key element of the Hugging Face ecosystem: the Transformers library.

## Sentiment Analysis with Hugging Face Libraries

In this notebook, you’ll learn how to use pre-trained machine learning models from Hugging Face to perform sentiment analysis on various text examples. We’ll walk you through the entire process, from installing the required packages to running the model and interpreting its output. By the end of this tutorial, you’ll know how to use Hugging Face Transformers as a library for analyzing the sentiment of text data.

**Step 1: Install Required Packages**

First, you’ll need to install the transformers library from Hugging Face. You can do this using pip:

In [34]:
# !pip install transformers



**Transformers** is a Python library that makes downloading and training state-of-the-art ML models easy. Although it was initially made for developing language models, its functionality has expanded to include models for computer vision, audio processing, and beyond.

Two big strengths of this library are:

- it integrates easily with Hugging Face’s (previously mentioned) Models, Datasets, and Spaces repositories, and

- the library supports other popular ML frameworks such as PyTorch and TensorFlow.

This results in a simple and flexible all-in-one platform for downloading, training, and deploying machine learning models and apps.

In this example we use PyTorch as the underlying framework. You can install PyTorch by running the following command in your notebook:

In [None]:
# !pip install torch

> **Restart the Kernel**: After installing, you may need to restart the notebook kernel so that the newly installed packages are recognized. In Jupyter you can usually do this by clicking on “Kernel” in the menu and then selecting “Restart Kernel”; in Colab the equivalent option is under the “Runtime” menu.

**Step 2: Import Libraries**

Import the necessary Python libraries.

In [None]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

> **AutoModels** In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you supply to the `from_pretrained()` method. AutoClasses do this job for you: they automatically retrieve the relevant model given the name or path of the pretrained weights, config, or vocabulary. Instantiating one of *AutoModel*, *AutoConfig*, or *AutoTokenizer* directly creates a class of the relevant architecture.

> **Example** `model = AutoModel.from_pretrained('bert-base-cased')` will create an instance of `BertModel`. In particular, `AutoModelForSequenceClassification` is a generic model class that will be instantiated as one of the sequence classification model classes of the library when created with the `AutoModelForSequenceClassification.from_pretrained(pretrained_model_name_or_path)` class method.

**Step 3: Load Pre-trained Model and Tokenizer**

An important remark on Hugging Face terminology

> When we work with Hugging Face Libraries we must remember that the term **architecture** refers to the skeleton of the model and **checkpoints** are the weights for a given architecture. For example, **BERT is an architecture**, while **bert-base-uncased is a checkpoint**. Model is a general term that can mean either architecture or checkpoint.

With so many different Transformer architectures, it can be challenging to create the right one for your checkpoint. As noted above, part of the Hugging Face Transformers core philosophy is to make the library easy, simple, and flexible to use, so an `AutoClass` automatically infers and loads the correct architecture from a given checkpoint. The `from_pretrained()` method lets you quickly load a pretrained model for any architecture, so you don’t have to devote time and resources to training a model from scratch. Writing this kind of checkpoint-agnostic code means that if your code works for one checkpoint, it will work with another checkpoint, as long as it was trained for a similar task, even if the architecture is different.
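To make the idea of "inferring the architecture from the checkpoint" more concrete, here is a small, dependency-free sketch. It assumes that a checkpoint ships with a configuration that names its model type, and that this name is looked up in a table of known classes. The class names, the registry, and the `toy_auto_from_config` function are all hypothetical stand-ins for illustration, not the actual `transformers` implementation.

```python
# Toy illustration only: these classes and the registry are hypothetical
# stand-ins, not the real transformers implementation.
class ToyBertClassifier:
    def __init__(self, config):
        self.config = config


class ToyDistilBertClassifier:
    def __init__(self, config):
        self.config = config


# Maps the "model_type" stored with a checkpoint to the class that implements it.
TOY_REGISTRY = {
    "bert": ToyBertClassifier,
    "distilbert": ToyDistilBertClassifier,
}


def toy_auto_from_config(config):
    """Pick the class that matches the checkpoint's declared model type."""
    model_type = config["model_type"]
    if model_type not in TOY_REGISTRY:
        raise ValueError(f"Unrecognized model_type: {model_type!r}")
    return TOY_REGISTRY[model_type](config)


# The same calling code works for both checkpoints; only the config differs.
for config in ({"model_type": "bert"}, {"model_type": "distilbert"}):
    model = toy_auto_from_config(config)
    print(config["model_type"], "->", type(model).__name__)
```

The calling code never names a concrete class, which is what makes it checkpoint-agnostic in the sense described above.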

For this example, let’s use the **distilbert-base-uncased-finetuned-sst-2-english** model for sentiment analysis.

In [None]:
# -----> HERE you define the tokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
# -----> HERE you define the model that will be used for the inference phase
model     = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

**Step 4: Preprocess Text**

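Tokenize the text you want to analyze. The two cells below define alternative review texts; if you run both, the second assignment overwrites the first, so run only the one you want to analyze.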

In [None]:
text = '''
Katia Winter star in this understated, psychological thriller about a British operative who ends up having
to confiscate a negative from a freelance photographer of a picture he took of her in order to remain as a
ghost from those who are seeking to kill her. In the process, both she and the reluctant photographer find
themselves on the run. Directed by Joshua Caldwell who also directed Be Somebody (2016), a comedy romance
drama that also happens to echo a similar understated tone and offers a refreshing diversion into a more
realistic interplay between characters that Negative presents. Negative together with Jennifer Lawrence in
Red Sparrow (2018) and Daryl Hannah in The Job (2003) allow female assassin or professional agents that
emphasize the psychological drama over the intensity of special effects, explosions, and action scenes of
over the top mass killings or hand to hand combat in a choreographed martial arts. At the same time, the
sustainability of audience's interest is made much more difficult, especially with the marketing of Negative's
trailer with emphasized the action portion of the movie much to the disappointment of a good segment of the audience.'''

In [None]:
text = '''
The Burial is a captivating film that tells the inspiring story of a small business owner's fight against a
corporate giant, highlighting themes of justice and greed. The film seamlessly blends humor, heartwarming
family moments, and intense courtroom drama, making it a compelling watch for all audiences. Jamie Foxx delivers
a standout performance, showcasing his range and chemistry with co-star Tommy Lee Jones, while the supporting
cast also shines in their scene-stealing roles.
'''

In [None]:
print(text)

In [None]:
tokens = tokenizer(text, padding=True, truncation=True, return_tensors="pt")
tokens
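To see what `padding=True` and `truncation=True` do conceptually, the following sketch uses a toy whitespace "tokenizer". It is only an illustration under simplifying assumptions: the function `toy_encode`, its vocabulary, and the id values are hypothetical, and a real tokenizer splits text into subword pieces and adds special tokens.

```python
def toy_encode(texts, max_length=6, pad_id=0):
    """Toy whitespace tokenizer (hypothetical, for illustration only)."""
    vocab = {}
    batch = []
    for text in texts:
        ids = []
        for word in text.lower().split():
            # Assign ids starting at 1; 0 is reserved for padding.
            ids.append(vocab.setdefault(word, len(vocab) + 1))
        # Truncation: keep at most max_length tokens.
        batch.append(ids[:max_length])

    # Padding: extend every sequence to the length of the longest one.
    longest = max(len(ids) for ids in batch)
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in batch]
    # The attention mask marks real tokens with 1 and padding with 0.
    attention_mask = [[1] * len(ids) + [0] * (longest - len(ids)) for ids in batch]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


encoded = toy_encode(["a great movie", "boring"])
print(encoded["input_ids"])       # [[1, 2, 3], [4, 0, 0]]
print(encoded["attention_mask"])  # [[1, 1, 1], [1, 0, 0]]
```

With a single input text, as in this notebook, there is nothing to pad against, so the attention mask consists only of ones.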

**Step 5: Model Inference**

Pass the tokenized text through the model. The cell below uses a **context manager**; you can find a complete and clear explanation of context managers in Python and their use [here](https://realpython.com/python-with-statement/).

In [None]:
with torch.no_grad():
    outputs       = model(**tokens)
    logits        = outputs.logits
    probabilities = torch.softmax(logits, dim=1)

Let's explain the previous code step by step:

1. `with torch.no_grad():`
   - This line of code is starting a context manager using the `with` statement. The purpose of this context manager is to temporarily disable gradient computation in PyTorch. In deep learning, when you train a neural network, you typically compute gradients for the model's parameters during the forward and backward passes to perform optimization (e.g., gradient descent). However, in some cases, you might want to perform inference or evaluation without computing gradients because they are not needed for these tasks. This context manager ensures that gradients are not computed within the indented block of code that follows.

2. `outputs = model(**tokens)`:
   - Here, the code makes a forward pass through the neural network. `model` is the PyTorch model loaded in Step 3, and it is called with `**tokens` as its argument. The `**` syntax unpacks the dictionary-like `tokens` object into keyword arguments (for this tokenizer, `input_ids` and `attention_mask`), so the call is equivalent to `model(input_ids=..., attention_mask=...)`. This line computes the model's output for the input tokens.


3. `logits = outputs.logits`:
   - `outputs` is the object returned by the model; this line extracts its `logits` attribute. Logits are the raw, unnormalized scores the model produces for each class before a softmax function is applied.

4. `probabilities = torch.softmax(logits, dim=1)`:
   - In this line, the code is taking the `logits` obtained from the model and applying the softmax function to them. The `torch.softmax` function is used to convert raw logits into probabilities. The `dim=1` argument specifies that the softmax operation should be performed along dimension 1 of the `logits` tensor. This typically corresponds to the class dimension in a classification problem, and it ensures that the resulting probabilities sum to 1 along that dimension.

In summary, the code is performing a forward pass through a PyTorch neural network model, extracting the raw logits from the model's output, and then converting these logits into probability values using the softmax function. The use of `torch.no_grad()` ensures that gradient computation is turned off during this process, which is typically done during inference or evaluation to save computation resources and memory.
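For readers who want to see the arithmetic behind the softmax step, here is a dependency-free sketch for a single pair of logits. The logit values are made up for illustration and are not the output of the model above.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits in the order [negative, positive].
probs = softmax([-2.0, 3.0])
print([round(p, 4) for p in probs])  # [0.0067, 0.9933]
```

Only the difference between the logits matters: a gap of 5 between the two scores already pushes the larger class above 99%.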

**Step 6: Interpret Results**

Interpret the model’s output to get the sentiment.

In [None]:
print(probabilities)

In [None]:
label_ids = torch.argmax(probabilities, dim=1)
labels = ['Negative', 'Positive']
label = labels[label_ids]
print(f"The sentiment is: {label}")

Let's break this code down step by step:

1. `label_ids = torch.argmax(probabilities, dim=1)`:
   - In this line of code, the `torch.argmax` function is used to find the index of the maximum probability along dimension 1 of the `probabilities` tensor. This essentially determines the predicted class (sentiment label) for each input. The `dim=1` argument specifies that the operation should be performed along the second dimension of the `probabilities` tensor, which is often the dimension corresponding to classes in classification problems.

2. `labels = ['Negative', 'Positive']`
   - Here, a list `labels` is defined with the two sentiment labels, 'Negative' and 'Positive'. The order matters: for this checkpoint, class index 0 is the negative class and index 1 is the positive class (the same mapping is stored in `model.config.id2label`).

3. `label = labels[label_ids]`:
   - This line uses the `label_ids` obtained in step 1 to index the `labels` list, mapping the predicted class (the one with the maximum probability) to its sentiment label. Here a single text was passed in, so `label_ids` holds exactly one index and can be used directly as a list index. For a batch of inputs, for example `label_ids` equal to `[1, 0, 1]`, you would look up each index in turn, giving 'Positive', 'Negative', 'Positive'.

4. `print(f"The sentiment is: {label}")`
   - Finally, this line prints out the predicted sentiment label. It uses an f-string to format the output string, where `{label}` is replaced with the actual sentiment label determined in step 3.

So, the overall purpose of this code snippet is to take the probabilities predicted by the model for different sentiment classes (typically 'Negative' and 'Positive') and select the sentiment label with the highest probability as the predicted sentiment for a given input. It then prints this predicted sentiment label.
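The same selection logic can be written out in plain Python for a batch of several inputs. This is a minimal sketch: the probability values are made up, and the `argmax` helper is defined here rather than taken from PyTorch.

```python
def argmax(values):
    """Return the index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


labels = ["Negative", "Positive"]

# Made-up probabilities for three inputs, one row per input.
batch_probabilities = [
    [0.02, 0.98],
    [0.91, 0.09],
    [0.30, 0.70],
]

# One predicted class index per row, then one label per index.
label_ids = [argmax(row) for row in batch_probabilities]
predicted = [labels[i] for i in label_ids]

print(label_ids)  # [1, 0, 1]
print(predicted)  # ['Positive', 'Negative', 'Positive']
```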

### Using Hugging Face Pipeline

The easiest way to start using the library is via the `pipeline()` function, which abstracts NLP (and other) tasks into one line of code. For example, to do sentiment analysis by hand, as in Steps 3 to 6 above, we need to select a model, tokenize the input text, pass it through the model, and decode the numerical output to determine the sentiment label (positive or negative).

While this may seem like a lot of steps, we can do all this in 1 line via the pipeline() function, as shown in the code snippet below.

In [35]:
from transformers import pipeline

In [None]:
classifier = pipeline(task="sentiment-analysis",
                      model="distilbert-base-uncased-finetuned-sst-2-english")

classifier(text)
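Conceptually, a pipeline chains three stages: preprocessing, a forward pass, and postprocessing. The sketch below imitates that structure with toy stand-ins so it runs without downloading a model. The word lists, the scoring rule, and all function names are hypothetical; only the shape of the result, a list of dictionaries with a `label` and a `score`, mirrors what the sentiment-analysis pipeline returns.

```python
import math

# Hypothetical word lists standing in for a trained model.
POSITIVE_WORDS = {"captivating", "inspiring", "compelling"}
NEGATIVE_WORDS = {"boring", "disappointment", "dull"}


def preprocess(text):
    """Toy tokenization: split on whitespace, strip punctuation, lowercase."""
    return [word.strip(".,!?").lower() for word in text.split()]


def forward(words):
    """Toy 'model': count matching words to get [negative, positive] logits."""
    negative = sum(word in NEGATIVE_WORDS for word in words)
    positive = sum(word in POSITIVE_WORDS for word in words)
    return [float(negative), float(positive)]


def postprocess(logits):
    """Softmax, then report the most likely label and its probability."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return [{"label": ["NEGATIVE", "POSITIVE"][best], "score": probs[best]}]


def toy_pipeline(pre, model, post):
    """Chain the three stages into a single callable."""
    def run(text):
        return post(model(pre(text)))
    return run


toy_classifier = toy_pipeline(preprocess, forward, postprocess)
result = toy_classifier("A captivating and compelling film.")
print(result[0]["label"], round(result[0]["score"], 3))  # POSITIVE 0.881
```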

## Summarization

Another way we can use the pipeline() function is for text summarization. Although this is an entirely different task than sentiment analysis, the syntax is almost identical.

We first load a summarization model, then pass in some text along with a couple of input parameters.

In [None]:
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

text = """
Hugging Face is an AI company that has become a major hub for open-source machine learning.
Their platform has 3 major elements which allow users to access and share machine learning resources.
First, is their rapidly growing repository of pre-trained open-source machine learning models for things such as natural language processing (NLP), computer vision, and more.
Second, is their library of datasets for training machine learning models for almost any task.
Third, and finally, is Spaces which is a collection of open-source ML apps.

The power of these resources is that they are community generated, which leverages all the benefits of open source i.e. cost-free, wide diversity of tools, high quality resources, and rapid pace of innovation.
While these make building powerful ML projects more accessible than before, there is another key element of the Hugging Face ecosystem—their Transformers library.
"""
summarized_text = summarizer(text, min_length=5, max_length=100)[0]['summary_text']
print('\n\n' + summarized_text)

## Conversational

Finally, we can use models developed specifically to generate conversational text. Since conversations require past prompts and responses to be passed to subsequent model responses, the syntax is a little different here. However, we start by instantiating our model using the pipeline() function.

In [36]:
chatbot = pipeline(model="facebook/blenderbot-400M-distill")


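Next, we can use the `Conversation` class to handle the back-and-forth. We initialize it with a user prompt, then pass it into the chatbot model from the previous code block. Note that `Conversation` belongs to the `transformers` releases this notebook was written for; more recent releases have removed it, so the cells below require an older version of the library.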

In [38]:
from transformers import Conversation

conversation = Conversation("Hi I'm Giovanni, how are you?")
conversation = chatbot(conversation)
print(conversation)

Conversation id: 1e1f7ef0-5286-4fa2-928b-d1ea09bb0b98 
user >> Hi I'm Giovanni, how are you? 
bot >>  I'm doing well, thank you. How are you this evening? Do you have any hobbies? 



To keep the conversation going, we can use the add_user_input() method to add another prompt to the conversation. We then pass the conversation object back into the chatbot.

In [39]:
conversation.add_user_input("Where do you work?")
conversation = chatbot(conversation)
print(conversation)

Conversation id: 1e1f7ef0-5286-4fa2-928b-d1ea09bb0b98 
user >> Hi I'm Giovanni, how are you? 
bot >>  I'm doing well, thank you. How are you this evening? Do you have any hobbies? 
user >> Where do you work? 
bot >>  I work at a grocery store as a cashier. What do you do for a living? 
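The bookkeeping that a conversation object has to do can be sketched in a few lines of plain Python. `ToyConversation` and `toy_chatbot` below are hypothetical illustrations that borrow the attribute and method names used in this notebook; they are not the `transformers` implementation, and the "bot" simply echoes the input.

```python
class ToyConversation:
    """Minimal conversation state: past inputs, past replies, one pending input."""

    def __init__(self, text):
        self.past_user_inputs = []
        self.generated_responses = []
        self.new_user_input = text

    def add_user_input(self, text):
        if self.new_user_input is not None:
            raise ValueError("The previous input has not been answered yet.")
        self.new_user_input = text

    def append_response(self, response):
        # Move the pending input into the history together with its reply.
        self.past_user_inputs.append(self.new_user_input)
        self.generated_responses.append(response)
        self.new_user_input = None


def toy_chatbot(conversation):
    """Stand-in for the model: echoes the pending input and records the reply."""
    reply = f"You said: {conversation.new_user_input}"
    conversation.append_response(reply)
    return conversation


conversation = toy_chatbot(ToyConversation("Hi I'm Giovanni, how are you?"))
conversation.add_user_input("Where do you work?")
conversation = toy_chatbot(conversation)

for user, bot in zip(conversation.past_user_inputs, conversation.generated_responses):
    print("user >>", user)
    print("bot >>", bot)
```

Keeping the two histories in step is what allows a real model to condition each new reply on everything said so far.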



### Chatbot UI with Gradio

While we get the base chatbot functionality with the Transformers library, this is an inconvenient way to interact with a chatbot. To make the interaction a bit more intuitive, we can use Gradio to spin up a front end in a few lines of Python code.

This is done with the code shown below. At the top, we initialize two lists to store user messages and model responses, respectively. Then we define a function that will take the user prompt and generate a chatbot output. Next, we create the chat UI using the Gradio ChatInterface() class. Finally, we launch the app.

In [41]:
# !pip install gradio


In [42]:
import gradio as gr

In [43]:
message_list = []
response_list = []

def vanilla_chatbot(message, history):
    conversation = Conversation(text=message, past_user_inputs=message_list, generated_responses=response_list)
    conversation = chatbot(conversation)

    return conversation.generated_responses[-1]

demo_chatbot = gr.ChatInterface(vanilla_chatbot, title="Vanilla Chatbot", description="Enter text to start chatting.")

demo_chatbot.launch()

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Note: opening Chrome Inspector may crash demo inside Colab notebooks.

To create a public link, set `share=True` in `launch()`.
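As a possible alternative to module-level lists, the chat function could rebuild the two histories from the `history` argument it receives on every call. The helper below is a sketch under one assumption: that `history` arrives as a list of `[user_message, bot_reply]` pairs, which is how the Gradio version installed above passes it as far as we know. The name `split_history` and the example history are hypothetical.

```python
def split_history(history):
    """Split [user, bot] pairs into separate lists of inputs and replies."""
    past_user_inputs = [user for user, _ in history]
    generated_responses = [bot for _, bot in history]
    return past_user_inputs, generated_responses


# Made-up history in the assumed [user_message, bot_reply] pair format.
example_history = [
    ["Hi I'm Giovanni, how are you?", "I'm doing well, thank you."],
    ["Where do you work?", "I work at a grocery store as a cashier."],
]

past_inputs, past_responses = split_history(example_history)
print(past_inputs)
print(past_responses)
```

Deriving the state from `history` keeps the chat function free of shared mutable state, which matters if several users open the app at the same time.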





## References and Credits

- Pavan Belegatti, "[Hugging Face Tutorial for Beginners!](https://levelup.gitconnected.com/hugging-face-tutorial-for-beginners-e3a1c770cf9b)"

- Shawhin Talebi, "[Cracking Open the Hugging Face Transformers Library](https://towardsdatascience.com/cracking-open-the-hugging-face-transformers-library-350aa0ef0161)"

- Shashank Mohan Jain, "Introduction to Transformers for NLP", Apress