# Hugging Face

by: Aveontae Frazier

### What is Hugging Face?

A major hub for open-source machine learning (ML)! 

Three major offerings: 
1) Models
2) Datasets 
3) Spaces

#### Transformers

A Python library that simplifies training ML models and running them for inference.

Installation: https://huggingface.co/docs/transformers/installation

## Sentiment Analysis Example

In [3]:
# Import mods
from transformers import pipeline, Conversation

In [7]:
pipeline(task="sentiment-analysis")("Love this!") # really simple application, no model specified

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'POSITIVE', 'score': 0.9998745918273926}]
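
The warning above recommends naming the model and revision explicitly. Here is a minimal sketch of doing so, reusing the model name and revision hash printed in the log. `PINNED_PIPELINE` and `build_classifier` are illustrative names, not part of the library.

```python
# Model name and revision copied from the warning printed above.
PINNED_PIPELINE = {
    "task": "sentiment-analysis",
    "model": "distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "af0f99b",
}


def build_classifier():
    """Create the sentiment pipeline with an explicit model and revision."""
    # Imported lazily so the settings above can be inspected
    # without loading transformers.
    from transformers import pipeline

    return pipeline(**PINNED_PIPELINE)
```

Calling `build_classifier()("Love this!")` should then behave like the cell above, without relying on whatever default the library picks.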

## Pipelines are able to handle: 

*Summarization*, *Translation*, *Question-Answering*, *Feature-Extraction (i.e., text embedding)*, *Text generation*, *and more!*

Pipeline() Docs: https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.pipeline.task

In [9]:
# Specify a Model for the Pipeline task
pipeline(task="sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english")("Love this!") # default distilbert model for sentiment analysis

[{'label': 'POSITIVE', 'score': 0.9998745918273926}]

#### What makes Transformers powerful is that we can easily use any one of the thousands of open-source models available on Hugging Face!

**Below is a link to the growing repository of pre-trained open-source ML models for many tasks!**

*Hugging Face Open-Source Models: https://huggingface.co/models*

In [16]:
pipeline(task="sentiment-analysis",
        model="michellejieli/emotion_text_classifier")("Love this!") # Another trending model on Hugging Face

[{'label': 'joy', 'score': 0.9533290266990662}]

In [18]:
# Defining a classifier
classifier=pipeline(task="sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

In [20]:
classifier("I hate this!")

[{'label': 'NEGATIVE', 'score': 0.9995765089988708}]

### Batch Predictions

In [23]:
batch_list=["this is great",
            "that is it for now!",
            "thanks for nothing",
            "I spilled my coffee!",
            "you are great the way you are, why change?"]

classifier(batch_list) # pass the list to the classifier

[{'label': 'POSITIVE', 'score': 0.9998785257339478},
 {'label': 'POSITIVE', 'score': 0.9973287582397461},
 {'label': 'POSITIVE', 'score': 0.9680058360099792},
 {'label': 'NEGATIVE', 'score': 0.9567683339118958},
 {'label': 'POSITIVE', 'score': 0.9998712539672852}]
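
The classifier returns one result per input, in the same order as the list. A small sketch of lining the two up so that doubtful predictions (such as "thanks for nothing" coming back POSITIVE) are easier to spot. The scores are copied from the output above and rounded; `pair_predictions` and `least_confident` are hypothetical helper names.

```python
# Inputs and predictions copied from the batch example above (scores rounded).
batch_list = [
    "this is great",
    "that is it for now!",
    "thanks for nothing",
    "I spilled my coffee!",
    "you are great the way you are, why change?",
]
predictions = [
    {"label": "POSITIVE", "score": 0.9999},
    {"label": "POSITIVE", "score": 0.9973},
    {"label": "POSITIVE", "score": 0.9680},
    {"label": "NEGATIVE", "score": 0.9568},
    {"label": "POSITIVE", "score": 0.9999},
]


def pair_predictions(texts, results):
    """Attach each input text to its predicted label and score."""
    return [
        {"text": text, "label": result["label"], "score": result["score"]}
        for text, result in zip(texts, results)
    ]


def least_confident(rows, n=2):
    """Return the n rows with the lowest scores, lowest first."""
    return sorted(rows, key=lambda row: row["score"])[:n]


rows = pair_predictions(batch_list, predictions)
for row in least_confident(rows):
    print(f'{row["score"]:.4f}  {row["label"]:<8}  {row["text"]}')
```

Sorting by score is only a rough way to pick rows for a manual check; a high score does not guarantee the label is right.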

### Multiple Targets

In [26]:
classifier=pipeline(task="sentiment-analysis", 
                    model="SamLowe/roberta-base-go_emotions",
                    top_k=None)

In [27]:
classifier(batch_list[0]) # applied to first element of list

[[{'label': 'admiration', 'score': 0.9523261785507202},
  {'label': 'approval', 'score': 0.03107648715376854},
  {'label': 'neutral', 'score': 0.015463285148143768},
  {'label': 'excitement', 'score': 0.005772939883172512},
  {'label': 'gratitude', 'score': 0.0054391901940107346},
  {'label': 'curiosity', 'score': 0.004312193486839533},
  {'label': 'joy', 'score': 0.0042666420340538025},
  {'label': 'disapproval', 'score': 0.004173447843641043},
  {'label': 'optimism', 'score': 0.0041002132929861546},
  {'label': 'realization', 'score': 0.00409337691962719},
  {'label': 'annoyance', 'score': 0.0035506037529557943},
  {'label': 'surprise', 'score': 0.0028309354092925787},
  {'label': 'disappointment', 'score': 0.0027473466470837593},
  {'label': 'love', 'score': 0.0026830036658793688},
  {'label': 'amusement', 'score': 0.002467394806444645},
  {'label': 'confusion', 'score': 0.0024103664327412844},
  {'label': 'pride', 'score': 0.0020465166307985783},
  {'label': 'sadness', 'score': 0.0

In [30]:
classifier(batch_list[3]) 

[[{'label': 'neutral', 'score': 0.9341973066329956},
  {'label': 'approval', 'score': 0.021860238164663315},
  {'label': 'annoyance', 'score': 0.012884113937616348},
  {'label': 'realization', 'score': 0.012337401509284973},
  {'label': 'disgust', 'score': 0.0045599909499287605},
  {'label': 'amusement', 'score': 0.0033169311936944723},
  {'label': 'sadness', 'score': 0.003295227652415633},
  {'label': 'anger', 'score': 0.0029102263506501913},
  {'label': 'disappointment', 'score': 0.0028773448430001736},
  {'label': 'optimism', 'score': 0.002037769416347146},
  {'label': 'confusion', 'score': 0.001899832976050675},
  {'label': 'disapproval', 'score': 0.0018558281008154154},
  {'label': 'embarrassment', 'score': 0.0017885810229927301},
  {'label': 'fear', 'score': 0.001707057817839086},
  {'label': 'curiosity', 'score': 0.0015119265299290419},
  {'label': 'excitement', 'score': 0.0014253061963245273},
  {'label': 'joy', 'score': 0.0013963377568870783},
  {'label': 'caring', 'score': 0.
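
With `top_k=None` the output lists every label, which is long to read. A sketch of keeping only the highest-scoring labels, using the first few entries of the output above as sample data. `top_labels` is a hypothetical helper, and the nested-list handling assumes the `[[...]]` shape shown in these outputs.

```python
def top_labels(result, n=3):
    """Return the n highest-scoring (label, score) pairs, scores rounded to 3 places."""
    # A single string input produced a nested list ([[...]]) above; unwrap it.
    scores = result[0] if result and isinstance(result[0], list) else result
    ranked = sorted(scores, key=lambda item: item["score"], reverse=True)
    return [(item["label"], round(item["score"], 3)) for item in ranked[:n]]


# First four entries of the output for "I spilled my coffee!".
sample = [[
    {"label": "neutral", "score": 0.9341973066329956},
    {"label": "approval", "score": 0.021860238164663315},
    {"label": "annoyance", "score": 0.012884113937616348},
    {"label": "realization", "score": 0.012337401509284973},
]]

print(top_labels(sample, n=2))  # [('neutral', 0.934), ('approval', 0.022)]
```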

### Summarization

In [33]:
# Define summarizer
summarizer=pipeline(task="summarization", model="facebook/bart-large-cnn")

In [34]:
# My reflection from a data ethics course
text = """
One way to institutionalize the Communality Value might be to mandate an annual certification of universal Ethical Guidelines. 
Given the American Statistical Association’s (ASA) focus on the individual, this could help address the ethical lapses 
within the statistical community as a whole. However, for this to succeed, the ASA would need to become more ubiquitous in 
statistics. The ASA must permeate every industry and profession where a statistical practitioner works to inform as 
many people as possible about the Ethical Guidelines. Only then could a mandate like the annual certification of the 
guidelines work to change the ethical landscape of statistics. Additionally, this solution wouldn’t impose a compliance 
burden on any one organization, making it more adoptable and even preferred from a hiring standpoint. The only burden 
would be a quick annual review of the guidelines and certification of conformity, which could take just an hour or two a
year. This would require the ASA to also review the guidelines annually to ensure they remain adequate.
"""

summarized_text=summarizer(text, min_length=75, max_length=100)[0]["summary_text"]

summarized_text # Pretty good!

'The American Statistical Association (ASA) must become more ubiquitous in statistics. The ASA must permeate every industry and profession where a statistical practitioner works to inform as many people as possible about the Ethical Guidelines. This would require the ASA to also review the guidelines annually to ensure they remain adequate. The only burden would be a quick annual review of the guidelines and certification of conformity.'

In [35]:
classifier(summarized_text) # Reusing the multi-target emotion classifier defined above

[[{'label': 'neutral', 'score': 0.8097553849220276},
  {'label': 'approval', 'score': 0.1958303451538086},
  {'label': 'optimism', 'score': 0.03372952714562416},
  {'label': 'realization', 'score': 0.027438772842288017},
  {'label': 'desire', 'score': 0.014154449105262756},
  {'label': 'annoyance', 'score': 0.009135137312114239},
  {'label': 'admiration', 'score': 0.009024866856634617},
  {'label': 'disapproval', 'score': 0.0054798307828605175},
  {'label': 'caring', 'score': 0.005192108917981386},
  {'label': 'disappointment', 'score': 0.00373662356287241},
  {'label': 'gratitude', 'score': 0.001778473611921072},
  {'label': 'excitement', 'score': 0.0015567636583000422},
  {'label': 'confusion', 'score': 0.001510564354248345},
  {'label': 'disgust', 'score': 0.001298467512242496},
  {'label': 'relief', 'score': 0.001092472462914884},
  {'label': 'curiosity', 'score': 0.0010280683636665344},
  {'label': 'pride', 'score': 0.0009520385065115988},
  {'label': 'anger', 'score': 0.000859615
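
The top scores above add up to more than 1 (neutral plus approval already exceeds it), which suggests each label is scored on its own rather than as one shared probability distribution. Under that assumption, several labels can apply at once, and a cutoff is one way to pick them. The 0.1 threshold and the `labels_above` helper are illustrative choices, not something the model prescribes.

```python
def labels_above(result, threshold=0.1):
    """Return every label whose score is at or above the threshold."""
    # Unwrap the nested list ([[...]]) returned for a single string input.
    scores = result[0] if result and isinstance(result[0], list) else result
    return [item["label"] for item in scores if item["score"] >= threshold]


# First three entries of the output for the summarized text.
summary_scores = [[
    {"label": "neutral", "score": 0.8097553849220276},
    {"label": "approval", "score": 0.1958303451538086},
    {"label": "optimism", "score": 0.03372952714562416},
]]

total = sum(item["score"] for item in summary_scores[0])
print(total > 1)                     # True
print(labels_above(summary_scores))  # ['neutral', 'approval']
```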

### Conversational

In [40]:
# Chatbot object
chatbot=pipeline("conversational", model="facebook/blenderbot-400M-distill")

In [42]:
# Conversation object to handle back and forth between user and bot
conversation=Conversation("Hi, I'm Tae, how are you?")

conversation=chatbot(conversation)

In [44]:
conversation

Conversation id: c7fc90bd-38b8-4001-bc65-da7c1af5d041 
user >> Hi, I'm Tae, how are you? 
bot >>  I'm doing well. How are you doing today? I'm just hanging out with my cat. 

In [46]:
# Keep the conversation going
conversation.add_user_input("How many cats do you have?")
conversation=chatbot(conversation)

In [48]:
conversation

Conversation id: c7fc90bd-38b8-4001-bc65-da7c1af5d041 
user >> Hi, I'm Tae, how are you? 
bot >>  I'm doing well. How are you doing today? I'm just hanging out with my cat. 
user >> How many cats do you have? 
bot >>  I have two. They're my babies. What do you like to do in your spare time? 

### Chatbot UI with Gradio

In [53]:
import gradio

message_list=[]
response_list=[]

def simply_chatbot(message, history):
    conversation=Conversation(text=message, # Latest user input
                              past_user_inputs=message_list, # Earlier user messages, so the chatbot has context before responding
                              generated_responses=response_list) # Earlier bot responses
    conversation=chatbot(conversation)
    
    return conversation.generated_responses[-1] # Return the last generated response

demo_chatbot=gradio.ChatInterface(simply_chatbot, # Function called for each user message
                                  title="Simply-Chatbot",
                                  description="Enter text to begin chatting!")

demo_chatbot.launch()

Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
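
The `history` argument that `gradio.ChatInterface` passes to the function goes unused above. A sketch of deriving the two context lists from it instead of from module-level lists, assuming `history` arrives as a list of `[user_message, bot_response]` pairs. The actual format depends on the Gradio version, so treat this as an assumption; `split_history` is a hypothetical helper.

```python
def split_history(history):
    """Split [[user_message, bot_response], ...] into two parallel lists."""
    past_user_inputs = [user for user, _ in history]
    generated_responses = [bot for _, bot in history]
    return past_user_inputs, generated_responses


# Sample history built from the conversation shown earlier.
history = [
    ["Hi, I'm Tae, how are you?", "I'm doing well. How are you doing today?"],
    ["How many cats do you have?", "I have two. They're my babies."],
]

past_user_inputs, generated_responses = split_history(history)
print(past_user_inputs)
print(generated_responses)
```

The two lists could then be passed as `past_user_inputs` and `generated_responses` when building the `Conversation`, which keeps each chat session's context separate.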




### Hosted on Hugging Face Spaces :)

https://huggingface.co/spaces/amfrazier01/simply-chatbot