Cracking Open the Hugging Face Transformers Library

Setup

In [2]:
# note: Conversation ships only with older transformers releases; later versions removed it
from transformers import pipeline, Conversation
import gradio as gr




Sentiment Analysis

In [3]:
# toy example 1
pipeline(task="sentiment-analysis")("Love this!")

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'POSITIVE', 'score': 0.9998745918273926}]

In [4]:
# toy example 2
pipeline(task="sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")("Love this!")

[{'label': 'POSITIVE', 'score': 0.9998745918273926}]
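The warning above recommends naming both a model and a revision. A minimal sketch of one way to do that, assuming the model name and revision printed in that warning; the names `PINNED_SENTIMENT` and `build_pinned_classifier` are made up for illustration.

```python
# Pinned settings; the model name and revision are copied from the warning printed above.
PINNED_SENTIMENT = {
    "task": "sentiment-analysis",
    "model": "distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "af0f99b",
}

def build_pinned_classifier(settings=PINNED_SENTIMENT):
    """Build the pipeline from pinned settings (hypothetical helper)."""
    # imported here so the settings can be inspected without loading transformers
    from transformers import pipeline
    return pipeline(**settings)
```

Calling `build_pinned_classifier()("Love this!")` should then load the same weights on every run, as long as that revision stays available on the Hub.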

More Sentiment Analysis

In [5]:
# defining classifier
classifier = pipeline(task="sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

In [6]:
classifier("Hate this.")

[{'label': 'NEGATIVE', 'score': 0.9997110962867737}]

In [7]:
# we can also pass in a list to classifier
text_list = ["This is great",
             "Thanks for nothing",
             "You've got to work on your face",
             "You're beautiful, never change!"]

classifier(text_list)

[{'label': 'POSITIVE', 'score': 0.9998785257339478},
 {'label': 'POSITIVE', 'score': 0.9680057168006897},
 {'label': 'NEGATIVE', 'score': 0.8776118159294128},
 {'label': 'POSITIVE', 'score': 0.9998120665550232}]
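The results come back in the same order as the inputs, so they can be zipped with the original texts. A small sketch of that pairing, using the outputs printed above as literal data (no model is loaded); `pair_with_labels` is a hypothetical helper name.

```python
texts = ["This is great",
         "Thanks for nothing",
         "You've got to work on your face",
         "You're beautiful, never change!"]

# copied from the classifier output above
results = [{'label': 'POSITIVE', 'score': 0.9998785257339478},
           {'label': 'POSITIVE', 'score': 0.9680057168006897},
           {'label': 'NEGATIVE', 'score': 0.8776118159294128},
           {'label': 'POSITIVE', 'score': 0.9998120665550232}]

def pair_with_labels(texts, results):
    """Return (text, label, rounded score) tuples, one per input text."""
    return [(t, r["label"], round(r["score"], 3)) for t, r in zip(texts, results)]

paired = pair_with_labels(texts, results)
for text, label, score in paired:
    print(f"{label:<8} {score:.3f}  {text}")
```

Laid out this way, it is easy to spot that the sarcastic "Thanks for nothing" was scored POSITIVE.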

In [8]:
# if there are multiple target labels, we can return them all
classifier = pipeline(task="text-classification", model="SamLowe/roberta-base-go_emotions", top_k=None)


In [9]:
classifier(text_list[0])

[[{'label': 'admiration', 'score': 0.9526104927062988},
  {'label': 'approval', 'score': 0.03047208860516548},
  {'label': 'neutral', 'score': 0.015236231498420238},
  {'label': 'excitement', 'score': 0.006063777953386307},
  {'label': 'gratitude', 'score': 0.005296194460242987},
  {'label': 'joy', 'score': 0.004475215449929237},
  {'label': 'curiosity', 'score': 0.004322333727031946},
  {'label': 'realization', 'score': 0.004089601803570986},
  {'label': 'optimism', 'score': 0.00407722033560276},
  {'label': 'disapproval', 'score': 0.004076561890542507},
  {'label': 'annoyance', 'score': 0.0035287411883473396},
  {'label': 'surprise', 'score': 0.002973071299493313},
  {'label': 'disappointment', 'score': 0.002734640846028924},
  {'label': 'love', 'score': 0.0026945830322802067},
  {'label': 'amusement', 'score': 0.0024867451284080744},
  {'label': 'confusion', 'score': 0.0023607409093528986},
  {'label': 'pride', 'score': 0.002101339166983962},
  {'label': 'sadness', 'score': 0.001773
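With `top_k=None` every label is returned, and most of them have scores near zero. A sketch of keeping only the labels above a cutoff, using the first few scores printed above as sample data; the 0.01 threshold and the helper name `labels_above` are arbitrary choices for illustration.

```python
# first four entries of the output above, used as sample data
sample_scores = [
    {'label': 'admiration', 'score': 0.9526104927062988},
    {'label': 'approval', 'score': 0.03047208860516548},
    {'label': 'neutral', 'score': 0.015236231498420238},
    {'label': 'excitement', 'score': 0.006063777953386307},
]

def labels_above(scores, threshold=0.01):
    """Keep labels whose score meets the (illustrative) threshold."""
    return [item["label"] for item in scores if item["score"] >= threshold]

print(labels_above(sample_scores))
print(labels_above(sample_scores, threshold=0.5))
```

On real pipeline output the same helper would be applied to `classifier(text)[0]`, since the result is nested one level deep.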

Summarization

In [10]:
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")


In [11]:
text = """
Hugging Face is an AI company that has become a major hub for open-source machine learning. 
Their platform has 3 major elements which allow users to access and share machine learning resources. 
First, is their rapidly growing repository of pre-trained open-source machine learning models for things such as natural language processing (NLP), computer vision, and more. 
Second, is their library of datasets for training machine learning models for almost any task. 
Third, and finally, is Spaces which is a collection of open-source ML apps.

The power of these resources is that they are community generated, which leverages all the benefits of open source i.e. cost-free, wide diversity of tools, high quality resources, and rapid pace of innovation. 
While these make building powerful ML projects more accessible than before, there is another key element of the Hugging Face ecosystem—their Transformers library.
"""
summarized_text = summarizer(text, min_length=5, max_length=140)[0]['summary_text']
summarized_text

'Hugging Face is an AI company that has become a major hub for open-source machine learning. They have 3 major elements which allow users to access and share machine learning resources.'

In [12]:
classifier(summarized_text)

[[{'label': 'neutral', 'score': 0.9101783633232117},
  {'label': 'approval', 'score': 0.08781373500823975},
  {'label': 'realization', 'score': 0.02325628511607647},
  {'label': 'annoyance', 'score': 0.006623796187341213},
  {'label': 'admiration', 'score': 0.004981071222573519},
  {'label': 'disapproval', 'score': 0.004730124492198229},
  {'label': 'optimism', 'score': 0.0033590742386877537},
  {'label': 'disappointment', 'score': 0.002619005972519517},
  {'label': 'confusion', 'score': 0.001953981351107359},
  {'label': 'excitement', 'score': 0.001241705846041441},
  {'label': 'disgust', 'score': 0.001140780746936798},
  {'label': 'joy', 'score': 0.0010540124494582415},
  {'label': 'amusement', 'score': 0.0009572373819537461},
  {'label': 'love', 'score': 0.000887105125002563},
  {'label': 'desire', 'score': 0.0008553274674341083},
  {'label': 'curiosity', 'score': 0.0008261060575023293},
  {'label': 'anger', 'score': 0.0007336385897360742},
  {'label': 'caring', 'score': 0.000697113

Conversational

In [13]:
chatbot = pipeline(model="facebook/blenderbot-400M-distill")


In [14]:
conversation = Conversation("Hi I'm Shaw, how are you?")
conversation = chatbot(conversation)


No chat template is defined for this tokenizer - using the default template for the BlenderbotTokenizerFast class. If the default is not appropriate for your model, please set `tokenizer.chat_template` to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.



In [15]:
conversation

Conversation id: f086d000-c2c3-424e-86b5-f2745d2bb2f8
user: Hi I'm Shaw, how are you?
assistant:  I'm doing well. How are you doing this evening? I just got home from work.

In [16]:
conversation.add_user_input("Where do you work?")
conversation = chatbot(conversation)

In [17]:
conversation

Conversation id: f086d000-c2c3-424e-86b5-f2745d2bb2f8
user: Hi I'm Shaw, how are you?
assistant:  I'm doing well. How are you doing this evening? I just got home from work.
user: Where do you work?
assistant:  I work at a grocery store. What about you? What do you do for a living?

Deploy Chatbot UI

Text Sentiment Chatbot

In [18]:
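def top3_text_classes(message, history):
    # classifier returns [[{label, score}, ...]] sorted by score;
    # keep the top 3 and strip the brackets so each label lands on its own line
    return str(classifier(message)[0][:3]).replace('}, {', '\n').replace('[{', '').replace('}]', '')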

demo_sentiment = gr.ChatInterface(top3_text_classes, title="Text Sentiment Chatbot", description="Enter your text, and the chatbot will classify the sentiment.")

demo_sentiment.launch()

Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
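The chained `replace` calls above depend on the exact string form of a Python list. A sketch of an alternative formatter that builds the text from the dictionaries directly, shown on scores copied from the earlier summary classification; `format_top_k` is a hypothetical name and the three-decimal rounding is a presentation choice.

```python
def format_top_k(scores, k=3):
    """Return the k highest-scoring labels as 'label: score' lines."""
    top = sorted(scores, key=lambda item: item["score"], reverse=True)[:k]
    return "\n".join(f"{item['label']}: {item['score']:.3f}" for item in top)

# sample data copied from the classifier output on the summarized text
sample_scores = [
    {'label': 'neutral', 'score': 0.9101783633232117},
    {'label': 'approval', 'score': 0.08781373500823975},
    {'label': 'realization', 'score': 0.02325628511607647},
    {'label': 'annoyance', 'score': 0.006623796187341213},
]

formatted = format_top_k(sample_scores)
print(formatted)
```

Inside the Gradio callback this could be called as `format_top_k(classifier(message)[0])`.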




Summarizer Chatbot

In [19]:
def summarizer_bot(message, history):
    return summarizer(message, min_length=5, max_length=140)[0]['summary_text']

demo_summarizer = gr.ChatInterface(summarizer_bot, title="Summarizer Chatbot", description="Enter your text, and the chatbot will return the summarized version.")

demo_summarizer.launch()

Running on local URL:  http://127.0.0.1:7861

To create a public link, set `share=True` in `launch()`.




Your max_length is set to 140, but your input_length is only 7. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=3)
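The warning above appears when a short message is sent with a fixed `max_length=140`. One possible way to avoid it is to cap `max_length` by the input size. This is only a sketch: it uses a whitespace word count as a rough stand-in for the tokenizer's `input_length`, and `capped_max_length` is a made-up helper.

```python
def capped_max_length(text, max_length=140, min_length=5):
    """Pick a max_length no larger than the (approximate) input length."""
    approx_len = len(text.split())  # crude proxy: words, not model tokens
    return max(min_length, min(max_length, approx_len))

print(capped_max_length("one two three four five six seven"))
print(capped_max_length("word " * 300))
```

The callback could then pass `max_length=capped_max_length(message)` to `summarizer`. Because words are not tokens, the warning may still appear for some inputs.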


Vanilla Chatbot

In [20]:
# a single shared Conversation keeps earlier turns between calls
# (module-level state, so every user of the app shares one history)
chat_state = Conversation()

def vanilla_chatbot(message, history):
    # add the new message, then let the model append its reply
    chat_state.add_user_input(message)
    conversation = chatbot(chat_state)

    return conversation.generated_responses[-1]

demo_chatbot = gr.ChatInterface(vanilla_chatbot, title="Vanilla Chatbot", description="Enter text to start chatting.")

demo_chatbot.launch()

Running on local URL:  http://127.0.0.1:7862

To create a public link, set `share=True` in `launch()`.
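The `history` argument that Gradio passes to the callback is not used above. A sketch of how it could be split into user and bot turns, assuming it arrives as a list of `[user_message, bot_reply]` pairs; the format depends on the Gradio version and settings, so treat that as an assumption. `split_history` is a hypothetical helper.

```python
def split_history(history):
    """Split [[user, bot], ...] pairs into two parallel lists."""
    user_inputs = [user for user, _ in history]
    bot_replies = [bot for _, bot in history]
    return user_inputs, bot_replies

# sample history built from the conversation shown earlier
sample_history = [
    ["Hi I'm Shaw, how are you?",
     " I'm doing well. How are you doing this evening? I just got home from work."],
    ["Where do you work?",
     " I work at a grocery store. What about you? What do you do for a living?"],
]

past_user_inputs, generated_responses = split_history(sample_history)
print(past_user_inputs)
```

Rebuilding the conversation from `history` on each call would keep separate browser sessions from sharing state.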


