# Introduction to Hugging Face and Using its Models

Welcome to this session on using Hugging Face models! This guide is designed to introduce you to the world of Hugging Face and empower you to leverage its powerful tools and pre-trained models for various machine learning tasks.

## What is Hugging Face?

Hugging Face is an open-source platform that has revolutionized the field of Natural Language Processing (NLP) and is rapidly expanding into other domains like computer vision and audio. Its core mission is to democratize access to cutting-edge machine learning models and tools, making it easier for everyone to build and deploy AI applications.

Think of Hugging Face as a central hub for the ML community, offering:

*   **A vast Model Hub:** A repository of thousands of pre-trained models for various tasks, contributed by researchers and developers worldwide. You can find models for text classification, translation, summarization, image recognition, audio transcription, and much more.
*   **Powerful Libraries:** Open-source libraries like `transformers`, `datasets`, and `tokenizers` that provide easy-to-use interfaces for working with models, datasets, and text processing.
*   **A Collaborative Community:** A vibrant community of ML practitioners who share models, datasets, and expertise.

Hugging Face significantly reduces the barrier to entry for using state-of-the-art ML models, allowing you to quickly experiment and build applications without having to train models from scratch.

## Managing Hugging Face Tokens

To access some models or features on the Hugging Face Hub, you might need to use an API token. This token helps authenticate your requests and can be used to interact with the Hub programmatically, including downloading gated models or uploading your own.

Here's how you can manage and verify your Hugging Face token:

1.  **Obtain a Token:** Go to your Hugging Face profile settings (https://huggingface.co/settings/tokens) and generate a new access token. You can choose different roles for the token (e.g., read, write).
2.  **Store your Token Securely:** Never paste the token directly into a notebook cell. In Google Colab, use the "Secrets" feature (🔑 icon in the left panel) to store it as a secret named `HF_TOKEN`, and grant the notebook access to it. The code can then read the secret with `userdata.get('HF_TOKEN')`.
3.  **Log in Programmatically:** You can use the `huggingface_hub` library to log in to the Hugging Face Hub using your token.
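
Outside Colab there is no Secrets panel, so the token has to come from somewhere else. The sketch below assumes it is stored in an environment variable named `HF_TOKEN`. The helper names (`get_hf_token`, `mask_token`, `login_with_env_token`) are hypothetical and only illustrate one possible approach; `login` is the function provided by `huggingface_hub`.

```python
import os
from typing import Optional


def get_hf_token(env_var: str = "HF_TOKEN") -> Optional[str]:
    """Return the token stored in the given environment variable, or None."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def mask_token(token: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the token is safe to print."""
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]


def login_with_env_token() -> bool:
    """Log in to the Hub if a token is available; return False if none is set."""
    token = get_hf_token()
    if token is None:
        print("No HF_TOKEN found; continuing without authentication.")
        return False
    # Imported lazily so the helpers above work even without the library.
    from huggingface_hub import login

    login(token=token)
    print(f"Logged in with token {mask_token(token)}")
    return True
```

Calling `login_with_env_token()` once at the top of a script is enough; printing only the masked token keeps the secret out of saved notebook outputs.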


In [None]:
# Install the huggingface_hub library and import what is needed to verify your token.

!pip install huggingface_hub

from huggingface_hub import whoami
from google.colab import userdata




In [None]:
# Get your Hugging Face token from Colab Secrets
hf_token = userdata.get('HF_TOKEN')

# Verify the token by checking your identity
try:
    user_info = whoami(token=hf_token)
    print(f"Logged in as: {user_info['name']}")
except Exception as e:
    print(f"Could not log in: {e}")
    print("Please make sure you have added your Hugging Face token to Colab Secrets with the name 'HF_TOKEN'")


Logged in as: monalisa1983


## Showcasing Different Model Types

Hugging Face isn't just about text! Let's explore how to use models for other modalities like images and audio, and also how to work with datasets.


In [None]:
### Image Classification
#Image classification is the task of categorizing an image into one of several classes. We can use a pre-trained image classification model from the Hugging Face Hub.

from transformers import pipeline
from PIL import Image
import requests


In [None]:
# Load an image classification pipeline
classifier = pipeline("image-classification")

# Get an image from a URL (replace with your image URL)
url = "https://i.guim.co.uk/img/media/327aa3f0c3b8e40ab03b4ae80319064e401c6fbc/377_133_3542_2834/master/3542.jpg?width=1200&height=1200&quality=85&auto=format&fit=crop&s=34d32522f47e4a67286f9894fc81c863"
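response = requests.get(url, stream=True, timeout=30)
response.raise_for_status()  # fail early if the download did not succeed
image = Image.open(response.raw)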

# Classify the image
predictions = classifier(image)

print("Image Classification Results:")
for prediction in predictions:
    print(f"- {prediction['label']}: {prediction['score']:.2f}")


No model was supplied, defaulted to google/vit-base-patch16-224 and revision 3f49326 (https://huggingface.co/google/vit-base-patch16-224).
Using a pipeline without specifying a model name and revision in production is not recommended.


config.json: 0.00B [00:00, ?B/s]

model.safetensors:   0%|          | 0.00/346M [00:00<?, ?B/s]

preprocessor_config.json:   0%|          | 0.00/160 [00:00<?, ?B/s]

Fast image processor class <class 'transformers.models.vit.image_processing_vit_fast.ViTImageProcessorFast'> is available for this model. Using slow image processor class. To use the fast image processor class set `use_fast=True`.
Device set to use cpu


Image Classification Results:
- tiger cat: 0.78
- tabby, tabby cat: 0.17
- Egyptian cat: 0.04
- lynx, catamount: 0.00
- Persian cat: 0.00


In [None]:
# Load an image classification pipeline
classifier = pipeline("image-classification")

# Use a local image stored in Google Drive (replace with your own file path).
# This requires Google Drive to be mounted at /content/drive first.
image = "/content/drive/MyDrive/Picture1.jpg"

# Classify the image
predictions = classifier(image)

print("Image Classification Results:")
for prediction in predictions:
    print(f"- {prediction['label']}: {prediction['score']:.2f}")

No model was supplied, defaulted to google/vit-base-patch16-224 and revision 3f49326 (https://huggingface.co/google/vit-base-patch16-224).
Using a pipeline without specifying a model name and revision in production is not recommended.
Fast image processor class <class 'transformers.models.vit.image_processing_vit_fast.ViTImageProcessorFast'> is available for this model. Using slow image processor class. To use the fast image processor class set `use_fast=True`.
Device set to use cpu


Image Classification Results:
- suit, suit of clothes: 0.10
- oboe, hautboy, hautbois: 0.04
- Windsor tie: 0.04
- academic gown, academic robe, judge's robe: 0.03
- bassoon: 0.02
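
The two runs above differ sharply in confidence: 0.78 for the top label in the first image, but only 0.10 in the second. One way to read this, assuming the scores come from a softmax over the model's raw outputs (the usual setup for single-label classifiers), is that the second image does not match any class well, so the probability is spread thinly across many labels. The sketch below illustrates the effect with invented logits; it is not the pipeline's internal code.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(labels, logits, k=5):
    """Return the k most probable (label, score) pairs, best first."""
    scored = zip(labels, softmax(logits))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical logits, invented for illustration only.
labels = ["tiger cat", "tabby", "Egyptian cat", "lynx"]
confident = top_k(labels, [4.0, 2.5, 1.0, -1.0], k=2)
uncertain = top_k(labels, [0.2, 0.1, 0.0, -0.1], k=2)

for name, results in [("confident", confident), ("uncertain", uncertain)]:
    print(name, [(label, round(score, 2)) for label, score in results])
```

When the logits are close together, even the best label receives a small share of the probability, which is a useful signal that the prediction should not be trusted.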


In [None]:
### Audio Classification

#Audio classification is the task of categorizing audio data into different classes, such as identifying the type of sound or the speaker's emotion.

from transformers import pipeline
import torch
import soundfile as sf

# Load an audio classification pipeline
# We use a smaller model for demonstration purposes
classifier = pipeline("audio-classification", model="superb/wav2vec2-base-superb-ks")

# Generate 1 second of random noise at 16 kHz as placeholder audio.
# In practice, you would load your own recording instead.
dummy_audio = torch.randn(16000)
sf.write("dummy_audio.wav", dummy_audio.numpy(), 16000)


# Classify the audio
audio_file = "dummy_audio.wav"
predictions = classifier(audio_file)

print("Audio Classification Results:")
for prediction in predictions:
    print(f"- {prediction['label']}: {prediction['score']:.2f}")

config.json: 0.00B [00:00, ?B/s]



pytorch_model.bin:   0%|          | 0.00/378M [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/378M [00:00<?, ?B/s]

preprocessor_config.json:   0%|          | 0.00/215 [00:00<?, ?B/s]

Device set to use cpu


Audio Classification Results:
- _silence_: 1.00
- stop: 0.00
- down: 0.00
- left: 0.00
- _unknown_: 0.00
- off: 0.00
- up: 0.00
- yes: 0.00
- go: 0.00
- right: 0.00
- no: 0.00
- on: 0.00
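
The placeholder audio in the cell above is random noise. If a deterministic test signal is preferred, a pure tone can be written with nothing but the Python standard library. The sketch below is one way to do that; the function name `write_sine_wav`, the 440 Hz frequency, and the output file name are arbitrary choices for illustration.

```python
import math
import struct
import wave

SAMPLE_RATE = 16_000  # samples per second, matching the 16 kHz used above


def write_sine_wav(path, frequency_hz=440.0, duration_s=1.0, amplitude=0.5):
    """Write a mono 16-bit PCM WAV file containing a pure sine tone."""
    n_samples = int(SAMPLE_RATE * duration_s)
    frames = bytearray()
    for n in range(n_samples):
        value = amplitude * math.sin(2 * math.pi * frequency_hz * n / SAMPLE_RATE)
        frames += struct.pack("<h", int(value * 32767))  # little-endian int16
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(bytes(frames))
    return n_samples


n_written = write_sine_wav("sine_440hz.wav")
print(f"Wrote {n_written} samples to sine_440hz.wav")
```

The resulting file path can be passed to the classifier in the same way as `dummy_audio.wav`.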


In [None]:
### Working with Datasets

#Hugging Face provides the `datasets` library, which makes it easy to access and work with a wide variety of datasets for various ML tasks.

from datasets import load_dataset

# Load a dataset (e.g., the SQuAD dataset for question answering)
dataset = load_dataset("squad")

# Print information about the dataset
print(dataset)

# Access an example from the training set
print("\nExample from the training set:")
print(dataset["train"][0])

README.md: 0.00B [00:00, ?B/s]

plain_text/train-00000-of-00001.parquet:   0%|          | 0.00/14.5M [00:00<?, ?B/s]

plain_text/validation-00000-of-00001.par(…):   0%|          | 0.00/1.82M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/87599 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/10570 [00:00<?, ? examples/s]

DatasetDict({
    train: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 87599
    })
    validation: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 10570
    })
})

Example from the training set:
{'id': '5733be284776f41900661182', 'title': 'University_of_Notre_Dame', 'context': 'Architecturally, the school has a Catholic character. Atop the Main Building\'s gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects thr
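
The printed example is cut off before the `answers` field. In SQuAD, that field holds parallel lists of answer strings and their character offsets into `context`. The sketch below shows how those offsets can be used, with a small made-up record instead of real dataset content; the keys `text` and `answer_start` are assumed to match the dataset's schema, and `answer_spans` is a hypothetical helper.

```python
# Hypothetical record that mirrors the SQuAD feature names shown above.
example = {
    "id": "example-0",
    "title": "Example",
    "context": "The Model Hub hosts pre-trained models shared by the community.",
    "question": "What does the Model Hub host?",
    "answers": {"text": ["pre-trained models"], "answer_start": [20]},
}


def answer_spans(record):
    """Yield (start, end, text) for each answer, checked against the context."""
    answers = record["answers"]
    for text, start in zip(answers["text"], answers["answer_start"]):
        end = start + len(text)
        if record["context"][start:end] != text:
            raise ValueError(f"Answer {text!r} not found at offset {start}")
        yield start, end, text


for start, end, text in answer_spans(example):
    print(f"characters {start}-{end}: {text}")
```

The same function should work on `dataset["train"][0]` if the loaded record follows this layout.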

## Introduction to Gradio

Gradio is an open-source Python library that allows you to quickly create customizable UI components for your machine learning models. It's a great way to build interactive demos and share your models with others.

While we won't cover Gradio in detail in this Colab, it's a valuable tool for building user interfaces for the Hugging Face models we'll be using. You can learn more about Gradio in a separate Colab notebook dedicated to it. We will, however, use it in our final assignment to build a simple demo.

## Transcription with Hugging Face

Audio transcription is the task of converting spoken language into text. Hugging Face also offers models for this task.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
#Here's how you can use a pre-trained model for audio transcription:

from transformers import pipeline
import soundfile as sf
import torch

# Load the automatic speech recognition pipeline
transcriber = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")

# Path to the recording to transcribe (replace with your own file).
# Decoding a video container such as .mp4 relies on ffmpeg being installed.
audio_file = "/content/drive/MyDrive/AI Research Assistant (2).mp4"

# Transcribe the audio
transcription = transcriber(audio_file)

print("Transcription:")
print(transcription['text'])

Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Device set to use cpu


Transcription:
TO DAY YOU'RE ABOUT TO WITNESS THE TRANSFORMATIVE FORCE OF ARTIFICIAL INTELLIGENTS UNLOCKING A NEW ERA OF KNOWLEDGE AND DISCOVERY WITH A BREAK THROUGH THAT'S JUST BEEN REVEALED THIS A I POWERED RESEARCH ASSISTANT CONDUCTS MULTI HOB MULTI SOURCE INVESTIGATIONS THE SYSTEM USES SPECIALIZED AGENTS CONTEXTUAL RETRIEVER AGENT HOLDS DATA FROM RESEARCH PAPERS NEWS ARTICLES REPORTS AND APIIES CRITICAL ANALYSIS AGENT SUMMERIZES FINDINGS HIGHLIHTES CONTRADICTIONS AND VALIDATE SOURCES INSIDE GENERATION AGENT SUGGEST HYPOTHESES OR TRENS USING REASONING CHAINS REPORT BILDER AGENT COMPILES ALL INSIGHTS INTO A STRUCTURED REPORT LET'S GO AHEAD TO SEE A DEMONSTRATION
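
The model returns upper-case text without punctuation, and the transcript contains recognition errors such as "INTELLIGENTS" and "MULTI HOB". A small cosmetic clean-up can at least make the casing easier to read. The helper below is a hypothetical sketch: it only changes capitalization and does not correct misrecognized words or restore punctuation.

```python
def tidy_transcript(text: str) -> str:
    """Lower-case an all-caps transcript, keeping 'I' and the first letter capitalized."""
    words = text.lower().split()
    # Restore the pronoun "I" and its contractions ("i'm" -> "I'm").
    words = ["I" + w[1:] if w == "i" or w.startswith("i'") else w for w in words]
    sentence = " ".join(words)
    return sentence[:1].upper() + sentence[1:]


raw = "TO DAY YOU'RE ABOUT TO WITNESS THE TRANSFORMATIVE FORCE"
print(tidy_transcript(raw))
```

In the notebook this would be applied as `tidy_transcript(transcription['text'])`.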


## Summarization with Hugging Face

Text summarization is the task of creating a shorter version of a text while preserving its main ideas. Hugging Face provides several models that can be used for this purpose.

Here's how you can use a pre-trained model from Hugging Face for summarization:

In [None]:
from transformers import pipeline

# Load the summarization pipeline
summarizer = pipeline("summarization")

# Text to summarize
text = """
Hugging Face is a company and open-source platform that provides tools and models for natural language processing (NLP). It has become a central hub for the ML community, offering a wide range of pre-trained models that can be easily used or fine-tuned for specific applications. Key aspects of Hugging Face include the Transformers library, Model Hub, Datasets library, and Tokenizers library. Hugging Face democratizes access to powerful ML models, making it easier for developers and researchers to build and deploy applications.
"""

# Summarize the text
summary = summarizer(text, max_length=50, min_length=25, do_sample=False)

print("Original Text:")
print(text)
print("\nSummary:")
print(summary[0]['summary_text'])
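
The `max_length` and `min_length` arguments are hard-coded above, which works for this paragraph but can be a poor fit for much shorter or longer inputs. One option is to scale them to the input size. The sketch below does this with a hypothetical helper, `summary_length_bounds`; it uses a whitespace word count as a crude stand-in for the model's token count, and the `ratio` and `floor` defaults are arbitrary assumptions.

```python
def summary_length_bounds(text: str, ratio: float = 0.5, floor: int = 10):
    """Return (min_length, max_length) scaled to a rough size of the input.

    Word count is used as a crude stand-in for the model's token count.
    """
    n_words = len(text.split())
    max_length = max(floor, int(n_words * ratio))
    min_length = max(1, max_length // 2)
    return min_length, max_length


short_text = "Hugging Face hosts models, datasets, and libraries for machine learning."
min_len, max_len = summary_length_bounds(short_text)
print(f"min_length={min_len}, max_length={max_len}")
```

The values can then be passed along as `summarizer(text, max_length=max_len, min_length=min_len, do_sample=False)`.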