# Hugging Face Transformers 🤗

The Hugging Face transformers package is a widely used Python library that provides pretrained models for a variety of natural language processing (NLP) tasks. It originally supported only PyTorch; TensorFlow 2 support was added in late 2019. Although the library covers many tasks, from natural language inference (NLI) to question answering, text classification remains one of the most popular and practical use cases.

Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides general-purpose architectures (BERT, GPT-2, RoBERTa, XLM, DistilBERT, XLNet…) for Natural Language Understanding (NLU) and Natural Language Generation (NLG), with 32+ pretrained models in 100+ languages and deep interoperability between JAX, PyTorch and TensorFlow.

https://huggingface.co/transformers/

In [0]:
from transformers import pipeline

# Sentiment Analysis

## distilbert-base-uncased-finetuned-sst-2-english

In [0]:
classifier = pipeline('sentiment-analysis', model='distilbert-base-uncased-finetuned-sst-2-english')

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
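The log message above recommends naming both the model and its revision. A minimal sketch of that, assuming the short revision hash reported in the log (`af0f99b`) is accepted as the `revision` argument of `pipeline`; the names `PINNED` and `load_pinned_classifier` are hypothetical and only used for illustration:

```python
# Settings taken from the log message above. Pinning them keeps results
# reproducible if the task's default model or its weights change later.
PINNED = {
    "task": "sentiment-analysis",
    "model": "distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "af0f99b",  # revision reported in the log (assumed to be usable as-is)
}


def load_pinned_classifier():
    """Build the sentiment pipeline with an explicit model and revision."""
    # Imported here so the settings can be inspected without loading the library.
    from transformers import pipeline

    return pipeline(**PINNED)
```

Calling `load_pinned_classifier()` downloads the model files on first use, so it needs network access and an installed `transformers` package.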



In [0]:
classifier("The food was good overall but Pizza was horrible")

Out[3]: [{'label': 'NEGATIVE', 'score': 0.982955813407898}]

In [0]:
classifier("I am loving it.")

Out[4]: [{'label': 'POSITIVE', 'score': 0.9998760223388672}]

In [0]:
classifier("I hate waking up early on the weekends.")

Out[5]: [{'label': 'NEGATIVE', 'score': 0.9970771074295044}]

In [0]:
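# Load the amazon_cells table through Spark, then convert it to a pandas DataFrame
data = spark.sql('SELECT * FROM amazon_cells')
data = data.toPandas()
# Preview the first five rows
data.head()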

Unnamed: 0,Text,Label
0,So there is no way for me to plug it in here i...,0
1,"Good case, Excellent value.",1
2,Great for the jawbone.,1
3,Tied to charger for conversations lasting more...,0
4,The mic is great.,1


In [0]:
text = list(data['Text'][:5])
text

Out[7]: ['So there is no way for me to plug it in here in the US unless I go by a converter.',
 'Good case, Excellent value.',
 'Great for the jawbone.',
 'Tied to charger for conversations lasting more than 45 minutes.MAJOR PROBLEMS!!',
 'The mic is great.']

In [0]:
# Classify each of the five reviews and print the prediction
for sentence in text:
    print(classifier(sentence))

[{'label': 'NEGATIVE', 'score': 0.9994086027145386}]
[{'label': 'POSITIVE', 'score': 0.9998685121536255}]
[{'label': 'POSITIVE', 'score': 0.9997796416282654}]
[{'label': 'NEGATIVE', 'score': 0.9994041919708252}]
[{'label': 'POSITIVE', 'score': 0.9998689889907837}]
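The predictions above can be checked against the `Label` column shown by `data.head()`. The sketch below copies the five outputs and labels from the cells above and assumes that `Label` 1 means positive and 0 means negative (the notebook does not state this). The helper names `to_int_labels` and `accuracy` are hypothetical.

```python
# Outputs copied from the loop above: one single-item list per sentence.
predictions = [
    [{'label': 'NEGATIVE', 'score': 0.9994086027145386}],
    [{'label': 'POSITIVE', 'score': 0.9998685121536255}],
    [{'label': 'POSITIVE', 'score': 0.9997796416282654}],
    [{'label': 'NEGATIVE', 'score': 0.9994041919708252}],
    [{'label': 'POSITIVE', 'score': 0.9998689889907837}],
]

# Label column for the same five rows, as shown by data.head().
labels = [0, 1, 1, 0, 1]

# Assumption: in this dataset 1 = positive review, 0 = negative review.
LABEL_TO_INT = {"NEGATIVE": 0, "POSITIVE": 1}


def to_int_labels(predictions):
    """Map each single-item pipeline result to 0 or 1."""
    return [LABEL_TO_INT[result[0]["label"]] for result in predictions]


def accuracy(predicted, expected):
    """Fraction of positions where the prediction equals the expected label."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


predicted = to_int_labels(predictions)
print(predicted)                    # [0, 1, 1, 0, 1]
print(accuracy(predicted, labels))  # 1.0
```

Five rows are far too few to judge the model; the same helpers could be applied to the full `Text` column for a more meaningful estimate.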


## cardiffnlp/twitter-roberta-base-emotion
https://huggingface.co/cardiffnlp/twitter-roberta-base-emotion?text=I+am+very+excited+about+Big+Data+course.+I+was+looking+forward+to+it.

In [0]:
classifier2 = pipeline('sentiment-analysis', model='cardiffnlp/twitter-roberta-base-emotion')


In [0]:
# Run the emotion model on the same five reviews
for sentence in text:
    print(classifier2(sentence))

[{'label': 'joy', 'score': 0.6960296034812927}]
[{'label': 'optimism', 'score': 0.4840291738510132}]
[{'label': 'optimism', 'score': 0.6911888718605042}]
[{'label': 'joy', 'score': 0.8931247591972351}]
[{'label': 'optimism', 'score': 0.9212433695793152}]
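The scores above vary much more than those of the sentiment model, so it can help to flag uncertain predictions. The sketch below passes the whole list to the classifier in one call (a text-classification pipeline returns one dict per input when given a list) and then filters by score. The helper names and the 0.5 threshold are hypothetical choices for illustration, not part of the library.

```python
def classify_batch(classifier, texts):
    """Classify all texts in one call and pair each text with its top prediction."""
    texts = list(texts)
    results = classifier(texts)  # list in, list of {'label', 'score'} dicts out
    return [(t, r["label"], r["score"]) for t, r in zip(texts, results)]


def low_confidence(rows, threshold=0.5):
    """Keep only the rows whose score is below the (assumed) threshold."""
    return [row for row in rows if row[2] < threshold]


# Usage with the objects defined in the cells above:
# rows = classify_batch(classifier2, text)
# for sentence, label, score in low_confidence(rows):
#     print(f"{label} ({score:.2f}): {sentence}")
```

With the outputs shown above, only 'Good case, Excellent value.' (optimism, 0.48) falls below a 0.5 threshold.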


