# HuggingFace - Basic Usage

This notebook covers basic usage of the Hugging Face `transformers` library through a few simple use cases built on the `pipeline` API.

# Notebook Setup

## Imports

In [1]:
# Import third-party libraries
from transformers import pipeline

# Usage

## Pipeline

### Sentiment Analysis

In [3]:
# Instantiate the pipeline (no model specified, so the task default is used)
sentiment_classifier = pipeline("sentiment-analysis")

# Inference
sentiment_result = sentiment_classifier("I've been waiting for a HuggingFace course for long time! I'm so happy to start it!")

print(sentiment_result)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use mps:0


[{'label': 'POSITIVE', 'score': 0.9998413324356079}]
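
The warning above recommends naming the model and revision explicitly. A minimal sketch of how that could look, assuming the same checkpoint and revision hash that the log message reports. `build_sentiment_classifier` and `summarize` are hypothetical helper names introduced here for illustration; they are not part of the library. The recorded output is reused as sample data so the result-handling part runs without loading a model.

```python
# Checkpoint and revision copied from the log message above.
MODEL_NAME = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
MODEL_REVISION = "714eb0f"


def build_sentiment_classifier():
    """Create the sentiment pipeline with the model and revision pinned."""
    # Imported lazily so the helper below also works without transformers installed.
    from transformers import pipeline

    return pipeline("sentiment-analysis", model=MODEL_NAME, revision=MODEL_REVISION)


def summarize(result):
    """Turn the pipeline output (a list with one dict per input) into readable strings."""
    return [f"{item['label']} ({item['score']:.2%})" for item in result]


# Output recorded in the cell above, reused here as sample data.
recorded_result = [{'label': 'POSITIVE', 'score': 0.9998413324356079}]
summary = summarize(recorded_result)
print(summary)
```

Calling `summarize(build_sentiment_classifier()("some text"))` would apply the same formatting to a live prediction.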


### Text Generation

In [4]:
# Instantiate the pipeline with an explicit model
text_generator = pipeline("text-generation", model="distilgpt2")

# Inference
generation_result = text_generator(
    "In this course we will teach you how to",
    max_length=30,
    num_return_sequences=2) # Two possible sequences to choose from

print(generation_result)

Device set to use mps:0
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'In this course we will teach you how to create a new language and to build a new language.'}, {'generated_text': 'In this course we will teach you how to solve and adapt to social change by helping you learn how to use the tools that bring you ideas in one'}]
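
The result is a list with one dictionary per returned sequence, and each `generated_text` in the output above begins with the prompt. A small sketch of separating the continuation from the prompt, using the recorded output as sample data; the variable names are illustrative only.

```python
# Output recorded in the cell above: one dict per returned sequence.
recorded_generation = [
    {'generated_text': 'In this course we will teach you how to create a new language and to build a new language.'},
    {'generated_text': 'In this course we will teach you how to solve and adapt to social change by helping you learn how to use the tools that bring you ideas in one'},
]

prompt = "In this course we will teach you how to"

# Strip the prompt from the front of each sequence to keep only the continuation.
continuations = [
    item["generated_text"][len(prompt):].strip()
    for item in recorded_generation
    if item["generated_text"].startswith(prompt)
]

for index, text in enumerate(continuations, start=1):
    print(f"Candidate {index}: {text}")
```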


### Zero-Shot Classification

In [5]:
# Instantiate the pipeline (no model specified, so the task default is used)
text_classifier = pipeline("zero-shot-classification")

# Inference
text_classification_result = text_classifier(
    "This course is about Python list comprehension",
    candidate_labels=["education", "politics", "business"])

print(text_classification_result)

No model was supplied, defaulted to facebook/bart-large-mnli and revision d7645e1 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use mps:0


{'sequence': 'This course is about Python list comprehension', 'labels': ['education', 'business', 'politics'], 'scores': [0.9043459892272949, 0.06733746081590652, 0.028316549956798553]}
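
The result keeps labels and scores in two parallel lists, so pairing them makes the output easier to work with. A short sketch using the recorded output above as sample data; in this run the scores sum to roughly 1 across the candidate labels. The variable names are illustrative only.

```python
# Output recorded in the cell above.
recorded_classification = {
    'sequence': 'This course is about Python list comprehension',
    'labels': ['education', 'business', 'politics'],
    'scores': [0.9043459892272949, 0.06733746081590652, 0.028316549956798553],
}

# Pair each label with its score.
label_scores = dict(zip(recorded_classification["labels"],
                        recorded_classification["scores"]))

# Pick the highest-scoring label without relying on the list order.
best_label = max(label_scores, key=label_scores.get)
total_score = sum(label_scores.values())

for label, score in label_scores.items():
    print(f"{label:<10} {score:.3f}")
print(f"Best label: {best_label}")
```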
