# Pipelines

The Hugging Face pipelines API is designed for inference, meaning it runs models that are already trained.

### Training vs. Inference

**Training**: The process of updating a model's parameters using data to improve its performance. Continuing to train an already-trained model, usually on new or task-specific data, is called fine-tuning.

**Inference**: Using a trained model to generate outputs from new inputs, that is, running the model without changing its parameters.
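The distinction can be illustrated with a toy one-parameter model in plain Python. This is only a sketch and does not use the Transformers API: the functions `predict` and `train`, the data, and the learning rate are all made up for illustration.

```python
def predict(weight, x):
    """Inference: compute an output from an input; the weight is only read."""
    return weight * x


def train(weight, data, lr=0.05, epochs=200):
    """Training: repeatedly adjust the weight to reduce the squared error."""
    for _ in range(epochs):
        for x, y in data:
            error = predict(weight, x) - y
            # Gradient of (weight * x - y)^2 with respect to weight is 2 * error * x.
            weight -= lr * 2 * error * x
    return weight


# Toy data following y = 2 * x (an illustrative assumption).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

trained_weight = train(0.0, data)         # training changes the parameter
prediction = predict(trained_weight, 4.0) # inference leaves it untouched
print(round(trained_weight, 3), round(prediction, 3))
```

A pipeline plays the role of `predict` here: it loads weights that someone else produced with a (much larger) training loop and only applies them.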

In [1]:
!pip install -q transformers datasets diffusers soundfile


In [1]:
import torch
from google.colab import userdata
from huggingface_hub import login
from transformers import pipeline
from diffusers import DiffusionPipeline
from datasets import load_dataset
import soundfile as sf
from IPython.display import Audio
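The cells in this notebook pass `device="cuda"` directly, which assumes a GPU runtime. A minimal sketch of a fallback is shown here; `pick_device` is a hypothetical helper name, not part of any library, and it relies only on `torch.cuda.is_available()`.

```python
def pick_device():
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

The resulting string could then be passed as the `device` argument wherever `"cuda"` is hard-coded, at the cost of much slower inference on CPU.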

**Logging into Hugging Face Hub**

First, create a free Hugging Face account, open Settings, and generate an access token with read and write permissions.

In Colab, click the key icon in the left panel and add a new secret:

1- Name: HF_TOKEN \
2- Value: Your API token (hf_...) \
3- Ensure that "Notebook access" is turned ON.

Then, run the login cell below.

In [2]:
hf_token = userdata.get("HF_TOKEN")
login(hf_token, add_to_git_credential=True)
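Outside Colab, `userdata` is not available, so the token has to come from somewhere else. The sketch below reads it from an environment variable and runs a basic sanity check before it is handed to `login`. The helper names `looks_like_hf_token` and `read_token_from_env` are hypothetical, and the `hf_` prefix check is only a heuristic based on the token format mentioned above.

```python
import os


def looks_like_hf_token(token):
    """Heuristic check: a non-empty string that starts with the `hf_` prefix."""
    return isinstance(token, str) and token.startswith("hf_") and len(token) > len("hf_")


def read_token_from_env(env=os.environ):
    """Fetch HF_TOKEN from a mapping (the process environment by default)."""
    token = env.get("HF_TOKEN")
    if not looks_like_hf_token(token):
        raise ValueError("HF_TOKEN is missing or does not start with 'hf_'")
    return token


# Example with a placeholder value instead of a real secret.
print(read_token_from_env({"HF_TOKEN": "hf_example"}))
```

Failing early with a clear message is usually easier to debug than an authentication error raised later during a model download.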

## 1- Sentiment Analysis

In [7]:
classifier = pipeline("sentiment-analysis", device="cuda")
result = classifier("I love playing the piano!")
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cuda


[{'label': 'POSITIVE', 'score': 0.9997634291648865}]
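The log above warns against relying on the default model in production. One way to address this is to pass the model name and revision explicitly. The sketch below collects the defaults reported in this notebook's logs into a table; `PINNED_DEFAULTS`, `pinned_pipeline_kwargs`, and `build_pipeline` are hypothetical names, and whether these particular checkpoints suit a given use case is a separate question.

```python
# Model names and revisions copied from the "No model was supplied" log lines.
PINNED_DEFAULTS = {
    "sentiment-analysis": (
        "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
        "714eb0f",
    ),
    "ner": ("dbmdz/bert-large-cased-finetuned-conll03-english", "4c53496"),
    "question-answering": (
        "distilbert/distilbert-base-cased-distilled-squad",
        "564e9b5",
    ),
}


def pinned_pipeline_kwargs(task, device="cuda"):
    """Build keyword arguments that pin both the model and its revision."""
    model, revision = PINNED_DEFAULTS[task]
    return {"task": task, "model": model, "revision": revision, "device": device}


def build_pipeline(task, device="cuda"):
    """Create the pipeline; transformers is imported only when this is called."""
    from transformers import pipeline

    return pipeline(**pinned_pipeline_kwargs(task, device))


print(pinned_pipeline_kwargs("sentiment-analysis"))
```

Calling `build_pipeline("sentiment-analysis")` should then load the same checkpoint as the cell above, without the warning.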


## 2- Named Entity Recognition

In [8]:
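# aggregation_strategy="simple" merges sub-word tokens into whole entities;
# it replaces the deprecated grouped_entities=True flag.
ner = pipeline("ner", aggregation_strategy="simple", device="cuda")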
result = ner("Elon Musk is the CEO of Tesla and SpaceX, and he was born in South Africa.")
print(result)

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision 4c53496 (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Device set to use cuda


[{'entity_group': 'PER', 'score': np.float32(0.99925745), 'word': 'Elon Musk', 'start': 0, 'end': 9}, {'entity_group': 'ORG', 'score': np.float32(0.9966955), 'word': 'Tesla', 'start': 24, 'end': 29}, {'entity_group': 'ORG', 'score': np.float32(0.99891424), 'word': 'SpaceX', 'start': 34, 'end': 40}, {'entity_group': 'LOC', 'score': np.float32(0.9993361), 'word': 'South Africa', 'start': 61, 'end': 73}]
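The `start` and `end` fields in the output are character offsets into the input string, which makes the result easy to post-process. The sketch below reuses the output shown above (scores rounded and converted to plain floats for brevity) and groups the entities by type; `group_by_type` is a hypothetical helper.

```python
from collections import defaultdict

text = "Elon Musk is the CEO of Tesla and SpaceX, and he was born in South Africa."

# Entities copied from the pipeline output above, with rounded scores.
entities = [
    {"entity_group": "PER", "score": 0.9993, "word": "Elon Musk", "start": 0, "end": 9},
    {"entity_group": "ORG", "score": 0.9967, "word": "Tesla", "start": 24, "end": 29},
    {"entity_group": "ORG", "score": 0.9989, "word": "SpaceX", "start": 34, "end": 40},
    {"entity_group": "LOC", "score": 0.9993, "word": "South Africa", "start": 61, "end": 73},
]


def group_by_type(entities):
    """Map each entity type to the list of entity strings found for it."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


# The offsets slice the original text back to the same surface strings.
for entity in entities:
    assert text[entity["start"]:entity["end"]] == entity["word"]

print(group_by_type(entities))
```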




## 3- Question Answering with Context

In [9]:
question_answer = pipeline("question-answering", device="cuda")

# Define the context and question
context = "Marie Curie was a physicist and chemist who conducted pioneering research on radioactivity. She was the first woman to win a Nobel Prize."
question = "Who was the first woman to win a Nobel Prize?"

result = question_answer(question=question, context=context)
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cuda


{'score': 0.9983916878700256, 'start': 0, 'end': 11, 'answer': 'Marie Curie'}
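This pipeline is extractive: the answer is a span of the context, identified by the `start` and `end` character offsets, and `score` indicates how confident the model is in that span. A small sketch of how the result might be consumed follows; `answer_or_none` and its `min_score` threshold are hypothetical and would need tuning for real data.

```python
context = (
    "Marie Curie was a physicist and chemist who conducted pioneering research "
    "on radioactivity. She was the first woman to win a Nobel Prize."
)

# Result copied from the pipeline output above.
result = {"score": 0.9983916878700256, "start": 0, "end": 11, "answer": "Marie Curie"}


def answer_or_none(result, context, min_score=0.5):
    """Return the answer span from the context, or None if the score is too low."""
    if result["score"] < min_score:
        return None
    return context[result["start"]:result["end"]]


print(answer_or_none(result, context))
```

Returning `None` for low scores gives the caller a chance to fall back to "no answer found" instead of showing a doubtful span.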


## 4- Text Summarization

In [10]:
summarizer = pipeline("summarization", device="cuda")

# Define the long text to summarize
text = """Hugging Face is an AI research company that specializes in natural language processing (NLP).
It has developed the Transformers library, which provides state-of-the-art models for various NLP tasks
such as text generation, translation, summarization, and more. The company has played a significant role
in making AI more accessible by open-sourcing powerful machine learning models and tools."""

summary = summarizer(text, max_length=50, min_length=25, do_sample=False)
print(summary[0]["summary_text"])

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cuda


 Hugging Face is an AI research company that specializes in natural language processing (NLP) It has developed the Transformers library, which provides state-of-the-art models for various NLP tasks such as text generation, translation,


## 5- Translation

In [15]:
# English-to-Persian translation model
translator = pipeline("translation", model="persiannlp/mt5-small-parsinlu-translation_en_fa", device="cuda")

# Define the text to translate
text = "Hugging Face provides state-of-the-art machine learning models for NLP tasks."

result = translator(text, max_length=50)
print(result[0]["translation_text"])

Device set to use cuda


صورت خمیده مدل های ماشین یادگیری ماشینی برای وظایف NLP را فراهم می کند.


## 6- Zero-Shot Classification

In [16]:
classifier = pipeline("zero-shot-classification", device="cuda")
result = classifier("I absolutely love this new feature! It's amazing.", candidate_labels=["technology", "sports", "politics"])
print(result)

No model was supplied, defaulted to facebook/bart-large-mnli and revision d7645e1 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cuda


{'sequence': "I absolutely love this new feature! It's amazing.", 'labels': ['technology', 'sports', 'politics'], 'scores': [0.9238524436950684, 0.056067146360874176, 0.020080426707863808]}
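The output pairs each candidate label with a score by position in the `labels` and `scores` lists. The sketch below, using the result shown above, zips the two lists and picks the best label; `rank_labels` is a hypothetical helper, and it sorts explicitly so that it does not depend on the order in which the pipeline returns the labels.

```python
# Result copied from the pipeline output above.
result = {
    "sequence": "I absolutely love this new feature! It's amazing.",
    "labels": ["technology", "sports", "politics"],
    "scores": [0.9238524436950684, 0.056067146360874176, 0.020080426707863808],
}


def rank_labels(result):
    """Return (label, score) pairs sorted from most to least likely."""
    pairs = zip(result["labels"], result["scores"])
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


top_label, top_score = rank_labels(result)[0]
print(f"{top_label} ({top_score:.2%})")
```

In this result the scores sum to approximately 1 across the candidate labels, so they read as a distribution over the labels supplied rather than as absolute probabilities.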


## 7- Text Generation

In [17]:
txt_generator = pipeline("text-generation", device="cuda")
result = txt_generator("If there's one thing I want you to remember about using HuggingFace pipelines, it's")
print(result[0]["generated_text"])

No model was supplied, defaulted to openai-community/gpt2 and revision 607a30d (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cuda
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


If there's one thing I want you to remember about using HuggingFace pipelines, it's that it's incredibly simple to understand. It's easy to understand if you look at some of the examples I've made. Hugging is an algorithm that


## 8- Image Generation

In [3]:
img_generator = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2",
    torch_dtype=torch.float16,  # half precision to reduce GPU memory use
    use_safetensors=True,
    variant="fp16",  # download the fp16 weight files
).to("cuda")

prompt = "A futuristic city skyline at sunset, cyberpunk style"
image = img_generator(prompt=prompt).images[0]

image.save("generated_image.png")
display(image)  # renders inline in a notebook; image.show() needs an external viewer


## 9- Audio Generation

In [4]:
tts = pipeline("text-to-speech", "microsoft/speecht5_tts", device="cuda")

embeddings_dataset = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embedding = torch.tensor(embeddings_dataset[7306]["xvector"]).unsqueeze(0)

# Define the text input
text = "Hello! Welcome to the world of AI-generated speech."
speech = tts(text, forward_params={"speaker_embeddings": speaker_embedding})

sf.write("speech.wav", speech["audio"], samplerate=speech["sampling_rate"])
Audio("speech.wav")

Device set to use cuda

