In [1]:
!pip install -q transformers datasets diffusers


In [2]:
import torch
import soundfile as sf
from transformers import pipeline
from diffusers import DiffusionPipeline
from datasets import load_dataset
from IPython.display import Audio
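
The cells in this notebook hard-code `device="cuda"`, which fails on a machine without a GPU. A minimal sketch of a fallback, assuming a hypothetical helper named `pick_device` (not part of the original notebook) whose result would be passed as the `device` argument of `pipeline`:

```python
def pick_device():
    """Return "cuda" when a GPU is usable through PyTorch, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper also works where torch is missing
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

With this in place, a call such as `pipeline("sentiment-analysis", device=device)` would run on either kind of machine, only slower on CPU.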

In [22]:
# Sentiment Analysis

classifier = pipeline("sentiment-analysis", device='cuda')
result = classifier("I'm super excited to be on the way to LLM mastery!")
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cuda


[{'label': 'POSITIVE', 'score': 0.9993460774421692}]
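
The log above warns against relying on the default model in production. One way to address it is to pin the model and revision explicitly. The sketch below copies both values from the log message; `PINNED` and `build_classifier` are hypothetical names introduced here for illustration.

```python
# Model id and revision copied from the "No model was supplied" log message.
PINNED = {
    "task": "sentiment-analysis",
    "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "714eb0f",
}


def build_classifier(device=None):
    """Create the sentiment pipeline with model and revision pinned."""
    # Imported inside the function so PINNED can be inspected without the library.
    from transformers import pipeline
    return pipeline(**PINNED, device=device)
```

Calling `build_classifier(device="cuda")` should load the same checkpoint as the cell above, without the default-model warning.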


In [19]:
# Named Entity Recognition

ner = pipeline("ner", grouped_entities=True, device="cuda")
result = ner("Barack Obama was the 44th president of the United States")
print(result[0])

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision 4c53496 (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cuda


{'entity_group': 'PER', 'score': 0.9992727, 'word': 'Barack Obama', 'start': 0, 'end': 12}
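
The `start` and `end` fields in the result are character offsets into the input string, so the entity text can be recovered by slicing. A small sketch using the recorded output above; `entity_span` is a hypothetical helper name:

```python
def entity_span(text, entity):
    """Slice the original text using the entity's character offsets."""
    return text[entity["start"]:entity["end"]]


sentence = "Barack Obama was the 44th president of the United States"
# Entity dict copied from the recorded NER output.
entity = {"entity_group": "PER", "score": 0.9992727, "word": "Barack Obama", "start": 0, "end": 12}

span = entity_span(sentence, entity)
print(entity["entity_group"], span)
```

Slicing by offsets is useful when the `word` field has been normalized by the tokenizer and no longer matches the source text exactly.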




In [11]:
# Text Summarization

summarizer = pipeline("summarization", device="cuda")
text = """
Life is a journey filled with endless possibilities. Every challenge is an opportunity to grow stronger. Happiness comes from appreciating the little things.
Dream big and work hard to achieve your goals. Kindness has the power to change the world. Believe in yourself, and anything is possible.
"""

summary = summarizer(text, max_length=50, min_length=25, do_sample=False)
print(summary[0]['summary_text'])

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cuda


 Happiness comes from appreciating the little things, says author . Dream big and work hard to achieve your goals. Believe in yourself, and anything is possible .


In [24]:
# Translation

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-tc-big-en-pt", device=0)  # Use device=0 for the first GPU
result = translator("The Data Scientists were truly amazed by the power and simplicity of the HuggingFace pipeline API.")  # computed but not printed below
# `text` is the passage defined in the Text Summarization cell above
translation = translator(text, max_length=512)
print(translation[0]['translation_text'])

Device set to use cuda:0


A vida é uma jornada cheia de infinitas possibilidades. Cada desafio é uma oportunidade para crescer mais forte. A felicidade vem de apreciar as pequenas coisas. Sonhe grande e trabalhe duro para alcançar seus objetivos. A bondade tem o poder de mudar o mundo. Acredite em si mesmo, e tudo é possível.


In [29]:
# Zero-Shot Classification

classifier = pipeline("zero-shot-classification", device="cuda")
result = classifier("Basketball is very fun", candidate_labels=["technology", "sports", "politics"])
print(result)

No model was supplied, defaulted to facebook/bart-large-mnli and revision d7645e1 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cuda


{'sequence': 'Basketball is very fun', 'labels': ['sports', 'technology', 'politics'], 'scores': [0.9977923035621643, 0.0014158233534544706, 0.0007919972995296121]}
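
The `labels` and `scores` lists in the result are parallel and sorted by descending score. A short sketch of turning them into a lookup table and picking the top label, using the recorded output above; `top_label` is a hypothetical helper name:

```python
def top_label(result):
    """Return the (label, score) pair with the highest score."""
    pairs = zip(result["labels"], result["scores"])
    return max(pairs, key=lambda pair: pair[1])


# Dict copied from the recorded zero-shot output.
recorded = {
    "sequence": "Basketball is very fun",
    "labels": ["sports", "technology", "politics"],
    "scores": [0.9977923035621643, 0.0014158233534544706, 0.0007919972995296121],
}

by_label = dict(zip(recorded["labels"], recorded["scores"]))
best, best_score = top_label(recorded)
print(best, round(best_score, 4))
```

Using `max` rather than index 0 keeps the helper correct even if the lists are reordered before it is called.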


In [32]:
# Text Generation

generator = pipeline("text-generation", device="cuda")
result = generator("if there's one thing I want you to remeber about using HugginFace pipelines, it's")
print(result[0]['generated_text'])

No model was supplied, defaulted to openai-community/gpt2 and revision 607a30d (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cuda
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


if there's one thing I want you to remeber about using HugginFace pipelines, it's that it will get rid of the fact that your input is a vector. That means you can write programs that perform simple things in the absence of


In [41]:
# Audio Generation

# Create the speech-synthesis pipeline
synthesiser = pipeline("text-to-speech", model="microsoft/speecht5_tts", device="cuda")

# Load the speaker embeddings
embeddings_dataset = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")

# Pick one speaker's x-vector (index 7306) and add a batch dimension
speaker_embeddings = torch.tensor(embeddings_dataset[7306]["xvector"]).unsqueeze(0)

# Generate speech
speech = synthesiser("Hi to an artificial intelligence engineer, on the way to mastery!", forward_params={"speaker_embeddings": speaker_embeddings})

sf.write("speech.wav", speech["audio"], samplerate=speech["sampling_rate"])
Audio("speech.wav")

Device set to use cuda
