[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1PTHXJIJjop-3NrcqBN9VSJ4kACqAAIV-?usp=sharing)


# Pretrained models in HuggingFace - Overview Notebook

This notebook is a self-contained way to start using the Hugging Face `transformers` library.

**Learning goals:** In this tutorial you will learn how to:

1. Use pre-trained pipelines
2. Get embeddings
3. Build multimodal models

**Steps to Do:** How to best use this notebook

1. Make a copy of this notebook, so you can keep your changes



In [None]:
%pip install --quiet transformers datasets sentence-transformers

## Pre-Trained Models with Pipelines -> ✨ Easy Mode ✨

The [pipeline()](https://huggingface.co/docs/transformers/main/en/main_classes/pipelines#transformers.pipeline) function supports 20+ common tasks out-of-the-box:

**Text**:
* Sentiment analysis: classify the polarity of a given text.
* Text generation (in English): generate text from a given input.
* Named entity recognition (NER): label each word with the entity it represents (person, date, location, etc.).
* Question answering: extract the answer from the context, given some context and a question.

**Image**:
* Image classification: classify an image.
* Image segmentation: classify every pixel in an image.
* Object detection: detect objects within an image.

**Audio**:
* Audio classification: assign a label to a given segment of audio.
* Automatic speech recognition (ASR): transcribe audio data into text.

**MultiModal**:
* Visual question answering: answer open-ended questions about an image.
* Image to text: predict a caption for a given image.

### Sentiment Analysis

In [None]:
from transformers import pipeline
sent_classifier = pipeline("sentiment-analysis")

In [None]:
sent_classifier("I am sad about today")
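A classification pipeline returns a list of dictionaries with `label` and `score` keys. The sketch below shows one way to turn that structure into a decision with a confidence cutoff. The helper name `summarize_prediction`, the `threshold` value, and the sample values are all illustrative assumptions; the sample is hand-written and is not real model output.

```python
def summarize_prediction(predictions, threshold=0.75):
    """Return the top label, or 'UNCERTAIN' if its score is below threshold.

    `predictions` is a list of {'label': str, 'score': float} dicts, the
    shape a classification pipeline returns for a single input.
    (Hypothetical helper, not part of the transformers library.)
    """
    best = max(predictions, key=lambda p: p["score"])
    if best["score"] < threshold:
        return "UNCERTAIN"
    return best["label"]


# Hand-written placeholder values, NOT real model output.
sample_output = [{"label": "NEGATIVE", "score": 0.99}]
low_confidence = [{"label": "POSITIVE", "score": 0.55}]

print(summarize_prediction(sample_output))   # NEGATIVE
print(summarize_prediction(low_confidence))  # UNCERTAIN
```

In the notebook, the same helper could be applied to the real result, for example `summarize_prediction(sent_classifier("I am sad about today"))`.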

### Text Generation

If you want to see what other tasks are available, check out all the [pipeline tasks](https://huggingface.co/docs/transformers/main/en/main_classes/pipelines#the-task-specific-pipelines) in the docs.

In [None]:
from transformers import pipeline
generator = pipeline("text-generation")

In [None]:
generator("Once upon a time,")

In [None]:
generator("In this course, we will teach you how to")

### Image

In [None]:
from IPython.display import Image
Image('https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg')

In [None]:
vision_classifier = pipeline(task="image-classification")
imagepic="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
# Pass the image URL positionally; the pipeline downloads and preprocesses the image
result = vision_classifier(imagepic)
print("\n".join([f"Class {d['label']} with score {round(d['score'], 4)}" for d in result]))

### MultiModal

In [None]:
# `pipeline` was imported in an earlier cell; `imagepic` is the image URL defined above
vqa_pipeline = pipeline("visual-question-answering")
# Another question to try: "What color are the bushes?"
vqa = vqa_pipeline(image=imagepic,
                   question="What is the weather like?")
vqa

### Image embeddings using [Distilled data-efficient Image Transformer (DeiT)](https://huggingface.co/facebook/deit-base-distilled-patch16-224)

In [None]:
from PIL import Image
import requests
im = Image.open(requests.get(imagepic, stream=True).raw)

In [None]:
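from transformers import AutoFeatureExtractor, AutoModel
deit_checkpoint = 'facebook/deit-base-distilled-patch16-224'
feature_extractor = AutoFeatureExtractor.from_pretrained(deit_checkpoint)
model = AutoModel.from_pretrained(deit_checkpoint)
# The feature extractor only resizes and normalizes the image; the model produces the embedding
inputs = feature_extractor(images=im, return_tensors="pt")
embeddings = model(**inputs).last_hidden_state[:, 0].detach()  # embedding of the [CLS] token
embeddings.shape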


### Text Embeddings using Transformers

In [None]:
from transformers import pipeline
checkpoint = "facebook/bart-base"
feature_extractor = pipeline("feature-extraction",framework="pt",model=checkpoint)
text = "Transformers is an awesome library!"

In [None]:
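# The pipeline output has shape (1, num_tokens, hidden_size); average over tokens to get one vector for the text
embeddings = feature_extractor(text, return_tensors="pt")[0].numpy().mean(axis=0)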
embeddings
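Averaging the per-token vectors into a single vector is often called mean pooling. As a minimal sketch of what `.mean(axis=0)` does, the example below pools a tiny hand-made matrix in plain Python so the arithmetic is visible. The function name `mean_pool` and the toy values are assumptions for illustration, not model output.

```python
def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one vector."""
    n_tokens = len(token_vectors)
    # zip(*rows) iterates over columns, i.e. one embedding dimension at a time
    return [sum(column) / n_tokens for column in zip(*token_vectors)]


# Toy stand-in for a (3 tokens x 4 dimensions) model output; values are made up.
toy_tokens = [
    [1.0, 0.0, 2.0, 4.0],
    [3.0, 0.0, 2.0, 0.0],
    [2.0, 3.0, 2.0, 2.0],
]

sentence_vector = mean_pool(toy_tokens)
print(sentence_vector)  # [2.0, 1.0, 2.0, 2.0]
```

The result has one value per embedding dimension regardless of how many tokens the input had, which is what makes pooled vectors comparable across texts of different lengths.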

### Text Embeddings using Sentence Transformers

There are many embedding models. The [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) model is generally recommended as a good all-around choice, and [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) is a more lightweight alternative. The cell below uses `paraphrase-MiniLM-L6-v2`, another lightweight MiniLM-based model. For a comprehensive comparison of embedding models, see the [Massive Text Embedding Benchmark leaderboard](https://huggingface.co/spaces/mteb/leaderboard).

In [None]:
from sentence_transformers import SentenceTransformer
modelst = SentenceTransformer('paraphrase-MiniLM-L6-v2')
sentence = ['It is a rainy and snowy day in Chicago']
embedding = modelst.encode(sentence)
embedding.shape

In [None]:
embedding
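A common next step with sentence embeddings is to compare two of them with cosine similarity. The sketch below implements the formula in plain Python on toy vectors. The function name `cosine_similarity` and the vectors are illustrative assumptions, not output from the model above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for sentence embeddings; values are made up.
v_a = [1.0, 2.0, 0.0]
v_same_direction = [2.0, 4.0, 0.0]  # a scaled copy of v_a
v_orthogonal = [0.0, 0.0, 3.0]      # shares no direction with v_a

print(round(cosine_similarity(v_a, v_same_direction), 4))  # 1.0
print(round(cosine_similarity(v_a, v_orthogonal), 4))      # 0.0
```

With real embeddings, you could encode two sentences with `modelst.encode([...])` and pass the two resulting rows to this function. The `sentence_transformers.util.cos_sim` helper computes the same quantity on whole batches.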