<a href="https://colab.research.google.com/github/Soroushav/pipelines_llm/blob/main/pipelines.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install -q transformers datasets diffusers

In [1]:
import torch
from transformers import pipeline
from diffusers import DiffusionPipeline
from datasets import load_dataset
import soundfile as sf
from IPython.display import Audio

In [6]:
# Sentiment analysis: no model is named here, so the pipeline falls back to its default checkpoint
classifier = pipeline("sentiment-analysis")
result = classifier("Im super excited to be part of this journey but i dont feel confident but ready to go and take this to an end also i dont feel ok")
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cpu


[{'label': 'POSITIVE', 'score': 0.9687568545341492}]
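
The warning above recommends naming the model and revision explicitly. A minimal sketch of one way to do that, reusing the checkpoint name and revision hash printed in the log. The `PINNED_SENTIMENT` dictionary and the `build_classifier` helper are hypothetical names introduced only for illustration:

```python
# Checkpoint name and revision copied from the warning printed by the default pipeline.
PINNED_SENTIMENT = {
    "task": "sentiment-analysis",
    "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "714eb0f",
}


def build_classifier(config=PINNED_SENTIMENT):
    """Create the pipeline with an explicit model and revision (hypothetical helper)."""
    # Imported lazily so that defining the helper does not load transformers.
    from transformers import pipeline

    return pipeline(config["task"], model=config["model"], revision=config["revision"])
```

Calling `build_classifier()` should then load the same checkpoint as the default run, presumably without the "No model was supplied" warning.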


In [8]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [13]:
# Text generation with GPT-2: request four continuations and return only the newly generated text
generator = pipeline("text-generation", model="openai-community/gpt2")
outputs = generator("My tart needs some", num_return_sequences=4, return_full_text=False)

Device set to use cpu
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In [14]:
outputs

[{'generated_text': ' attention, as it may have been a little over a week late. The tart is made from a very hard white paper with fine white hairs. The white paper is a thick, slightly thick, very fine paper that is soft but smooth, with a slightly uneven or soft texture. It has many small, fine hairs, which it is then glued onto. I did not need a lot of glue, as it was a lot of white paper. The tart is the only tart I made from a piece of paper, and it was good at making a little tart, but was not very popular with my friends. The only tart I made from a piece of paper was the one with the sharp edges. The tart was then glue dried and glued onto the tart as a base.\n\nThe tart is the perfect thickness to work with, and has a medium-sized finish. It is well-balanced and soft, and has a nice soft and very dry shape. I would recommend this as a base for any tart you might make.\n\nDrying and Glazing\n\nThis is a pretty easy and simple drying and glazing. It is a dryer than the tradition

In [16]:
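# Summarization with T5-small.
# Note: the input below is one triple-quoted string, so the inner double quotes
# and the line breaks are part of the text that reaches the model.
summarizer = pipeline("summarization", model="google-t5/t5-small", tokenizer="google-t5/t5-small")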
output = summarizer("""
    "Hugging Face is an open-source company and platform that has become a central hub "
    "for natural language processing (NLP) and machine learning models. It provides an "
    "extensive library called Transformers, which allows developers to easily use "
    "state-of-the-art models for tasks like text classification, translation, question "
    "answering, summarization, and more. With just a few lines of code, users can load "
    "pre-trained models and fine-tune them on their own data, making advanced AI "
    "capabilities more accessible to everyone."
""", min_length=5, max_length=20)
print(output)

Device set to use cpu
Both `max_new_tokens` (=256) and `max_length`(=20) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'summary_text': '"Hugging Face" is an open-source company and platform . it provides an "extensive library" called Transformers . users can load "pre-trained models and fine-tune them on their own data .'}]
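
The stray double quotes in the summary come from the input: inside a triple-quoted string, the inner quote characters are ordinary text. The warning also says that `max_new_tokens` takes precedence over `max_length`. A hedged sketch of one way to address both, using adjacent string literals and passing the new-token limits directly. `summarize_short` is a hypothetical helper, and the limits are illustrative values rather than settings verified against this model:

```python
# Adjacent string literals inside parentheses are concatenated by Python,
# so no quote characters or newlines end up in the text itself.
TEXT = (
    "Hugging Face is an open-source company and platform that has become a central hub "
    "for natural language processing (NLP) and machine learning models. It provides an "
    "extensive library called Transformers, which allows developers to easily use "
    "state-of-the-art models for tasks like text classification, translation, question "
    "answering, summarization, and more. With just a few lines of code, users can load "
    "pre-trained models and fine-tune them on their own data, making advanced AI "
    "capabilities more accessible to everyone."
)


def summarize_short(summarizer, text, min_new_tokens=5, max_new_tokens=20):
    """Call a summarization pipeline with explicit new-token limits (hypothetical helper)."""
    return summarizer(text, min_new_tokens=min_new_tokens, max_new_tokens=max_new_tokens)
```

With the `summarizer` object created earlier, this would be called as `summarize_short(summarizer, TEXT)`.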


In [21]:
# Zero-shot classification: score the sentence against labels the model was never fine-tuned on
text_classification = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
output = text_classification(
    "Hugging Face is an open-source company and platform that has become a central hub",
    candidate_labels=["technology", "sports", "news", "economy"],
)
print(output)

Device set to use cpu


{'sequence': 'Hugging Face is an open-source company and platform that has become a central hub', 'labels': ['technology', 'news', 'economy', 'sports'], 'scores': [0.8234268426895142, 0.12631866335868835, 0.0416354276239872, 0.00861904863268137]}
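
The zero-shot result keeps labels and scores in two parallel lists. A small sketch, using the values printed above, of pairing them up and picking the best label; the variable names are illustrative:

```python
# Result dictionary copied from the zero-shot output above.
result = {
    "sequence": "Hugging Face is an open-source company and platform that has become a central hub",
    "labels": ["technology", "news", "economy", "sports"],
    "scores": [0.8234268426895142, 0.12631866335868835, 0.0416354276239872, 0.00861904863268137],
}

# Pair each label with its score; in this output they are already in descending order.
label_scores = dict(zip(result["labels"], result["scores"]))

# Pick the label with the highest score.
top_label = max(label_scores, key=label_scores.get)
print(top_label, round(label_scores[top_label], 3))
```

In this run the four scores add up to approximately 1, so they can be read as a distribution over the candidate labels.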


In [22]:
# Extractive question answering: the answer is a span copied out of the context
question_answer = pipeline("question-answering", model="deepset/roberta-base-squad2")
output = question_answer(question="Who was Jim Henson?", context="Jim Henson was a nice puppet")
print(output)

Device set to use cpu


{'score': 0.3306240737438202, 'start': 22, 'end': 28, 'answer': 'puppet'}
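
The `start` and `end` fields in the result are character offsets into the context string. A minimal sketch, using the values printed above, that recovers the answer by slicing:

```python
# Context and result copied from the question-answering cell above.
context = "Jim Henson was a nice puppet"
result = {"score": 0.3306240737438202, "start": 22, "end": 28, "answer": "puppet"}

# Slice the context with the returned offsets; `end` behaves as an exclusive bound here.
span = context[result["start"]:result["end"]]
print(span)
```

The slice matches the `answer` field. The fairly low score (about 0.33) is plausibly a sign that the short context only loosely answers a "who" question.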


In [23]:
# Translation with NLLB-200: source and target are given as NLLB language codes
translator = pipeline("translation", model="facebook/nllb-200-distilled-600M")
output = translator("My name is Soroush, nice to meet you", src_lang="eng_Latn", tgt_lang="ita_Latn")
print(output)

Device set to use cpu


[{'translation_text': 'Mi chiamo Soroush. Piacere di conoscerti.'}]


In [26]:
# Text-to-image with Stable Diffusion 2, loading the fp16 safetensors weights.
# float16 weights are intended for GPU inference.
image_gen = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
)
text = "A class of birds learning about data science, in a surreal style of Salvador Dali"
image = image_gen(prompt=text).images[0]
image

KeyboardInterrupt: 