
2.1 Installation


2.1.1 Prerequisites

Ensure you have Python installed (recent Transformers releases, including the 4.57 series used here, require Python 3.9 or later).
Install a compatible deep learning framework such as PyTorch, TensorFlow, or JAX.
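
Before installing anything, it can help to check which of these frameworks are already present. The following is a minimal sketch that uses only the standard library; the helper name `detect_frameworks` is illustrative and not part of any library.

```python
import importlib.util
import sys

# Import names of the frameworks mentioned above (PyTorch installs as "torch").
FRAMEWORKS = ("torch", "tensorflow", "jax")

def detect_frameworks(names=FRAMEWORKS):
    """Return a dict mapping each import name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

print("Python version:", ".".join(map(str, sys.version_info[:3])))
for name, installed in detect_frameworks().items():
    print(f"{name}: {'installed' if installed else 'not found'}")
```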

2.1.2 Installing the Transformers Library

Use pip to install the Hugging Face Transformers library. The following command installs the library and its required Python dependencies; the deep learning framework from the prerequisites is installed separately:

In [None]:
pip install transformers



2.1.3 Verifying Installation

Verify that the installation was successful by importing the library in a Python script or interactive shell:

In [None]:
import transformers
print(transformers.__version__)

4.57.1
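
If importing the library is slow or fails, the installed version can also be read from package metadata without importing it. This is a small sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is an assumption made for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

found = installed_version("transformers")
print(found if found else "transformers is not installed in this environment")
```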


2.2 Basic Usage


2.2.1 Overview of the Pipeline API

The pipeline API is designed to streamline the use of pre-trained models for common NLP tasks. It provides a simple interface to load models and perform tasks such as sentiment analysis, text generation, and translation.
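
Conceptually, a pipeline chains three stages: it preprocesses raw text into model inputs, runs the model, and postprocesses the raw outputs into readable results. The sketch below mimics that flow with a toy word-counting "model". Everything in it (the class `ToySentimentPipeline`, the word lists, and the scoring rule) is a hypothetical stand-in for illustration, not the library's implementation.

```python
class ToySentimentPipeline:
    """Toy stand-in that mirrors the preprocess -> forward -> postprocess flow."""

    POSITIVE = {"love", "great", "amazing"}
    NEGATIVE = {"hate", "awful", "terrible"}

    def preprocess(self, text):
        # A real pipeline would run a tokenizer; here we lowercase and split.
        return [word.strip(".,!?") for word in text.lower().split()]

    def forward(self, tokens):
        # A real pipeline would run a neural network; here we count word hits.
        pos = sum(token in self.POSITIVE for token in tokens)
        neg = sum(token in self.NEGATIVE for token in tokens)
        return pos, neg

    def postprocess(self, counts):
        # Turn raw counts into the list-of-dicts shape that pipelines return.
        pos, neg = counts
        label = "POSITIVE" if pos >= neg else "NEGATIVE"
        score = (max(pos, neg) + 1) / (pos + neg + 2)
        return [{"label": label, "score": score}]

    def __call__(self, text):
        return self.postprocess(self.forward(self.preprocess(text)))

toy = ToySentimentPipeline()
print(toy("I love using Hugging Face!"))
```

The real pipelines follow the same call pattern: construct once, then call the object with text.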


2.2.2 Sentiment Analysis Example

Here’s a basic example of using the pipeline API for sentiment analysis:

In [None]:
from transformers import pipeline

# Load the sentiment-analysis pipeline
sentiment_analyzer = pipeline("sentiment-analysis")

# Analyze sentiment of a sample text
result = sentiment_analyzer("I love using Hugging Face!")
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cpu


[{'label': 'POSITIVE', 'score': 0.9997085928916931}]
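
The result is a list with one dictionary per input, each holding a `label` and a `score`. A common next step is to act on the label only when the score is high enough. The sketch below reuses the output shown above; the helper `confident_label`, the 0.9 threshold, and the `"UNCERTAIN"` fallback are assumptions chosen for illustration.

```python
# Output copied from the run above.
result = [{"label": "POSITIVE", "score": 0.9997085928916931}]

def confident_label(predictions, threshold=0.9):
    """Return the top label if its score meets the threshold, else 'UNCERTAIN'."""
    top = max(predictions, key=lambda p: p["score"])
    return top["label"] if top["score"] >= threshold else "UNCERTAIN"

print(confident_label(result))  # prints POSITIVE
```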


2.2.3 Text Generation Example

Example of using the pipeline API for text generation:

In [1]:
from transformers import pipeline

# Load the text-generation pipeline
text_generator = pipeline("text-generation", model="Qwen/Qwen2.5-3B-Instruct")

# Generate text based on a prompt
result = text_generator("what is the role of transformers in llm models?")
print(result)

Device set to use cpu


[{'generated_text': "what is the role of transformers in llm models? Transformers are a type of deep learning model that have become increasingly popular in natural language processing (NLP) tasks, including large language models (LLMs). In LLMs, transformers play a crucial role in processing and understanding text. Here's an overview of their key roles:\n\n1. Encoding input text:\nTransformers use a technique called self-attention to encode the input text into a fixed-length vector representation. This allows the model to capture the importance of different words or subword units within the context.\n\n2. Multi-head attention mechanism:\nTransformers employ multiple attention heads to simultaneously attend to different parts of the input sequence. This helps in capturing various relationships between tokens and improves the model's ability to handle long-range dependencies.\n\n3. Positional encoding:\nUnlike recurrent neural networks (RNNs), transformers don't rely on sequential inform
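
As the output shows, `generated_text` starts with the prompt itself. The text-generation pipeline also accepts a `return_full_text` argument to control this; alternatively, the prompt can be removed afterwards. The sketch below takes the second route. The helper `strip_prompt` and the sample output are hypothetical, written by hand to show the shape only.

```python
def strip_prompt(prompt, outputs):
    """Return only the newly generated text from text-generation outputs."""
    continuations = []
    for item in outputs:
        text = item["generated_text"]
        # The pipeline echoes the prompt by default, so drop it when present.
        if text.startswith(prompt):
            text = text[len(prompt):]
        continuations.append(text.strip())
    return continuations

prompt = "In a faraway land,"
# Hypothetical output, written by hand to show the shape only.
outputs = [{"generated_text": "In a faraway land, there lived a quiet dragon."}]
print(strip_prompt(prompt, outputs))  # prints ['there lived a quiet dragon.']
```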

2.2.4 Named Entity Recognition (NER) Example

Example of using the pipeline API for named entity recognition:

In [None]:
from transformers import pipeline

# Load the NER pipeline
ner = pipeline("ner")

# Perform NER on a sample text
result = ner("Hugging Face Inc. is a company based in New York.")
print(result)
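
By default the NER pipeline reports one entry per token, so a name such as "New York" can arrive as several pieces. The library offers an `aggregation_strategy` argument (for example `"simple"`) to merge them; the sketch below shows the idea by hand. The helper `group_entities` and the sample tokens (tags, indices, and offsets) are hypothetical and written to match the usual token-level output shape, not copied from a real run.

```python
def group_entities(text, tokens):
    """Merge adjacent token-level tags of the same type into entity spans."""
    groups = []
    for token in tokens:
        prefix, _, kind = token["entity"].partition("-")
        # Continue a group only for an adjacent "I-" tag of the same type.
        adjacent = bool(groups) and token["index"] == groups[-1]["last_index"] + 1
        if adjacent and prefix == "I" and groups[-1]["type"] == kind:
            groups[-1]["end"] = token["end"]
            groups[-1]["last_index"] = token["index"]
        else:
            groups.append({"type": kind, "start": token["start"],
                           "end": token["end"], "last_index": token["index"]})
    # Slice the original text so spacing and casing are preserved.
    return [(g["type"], text[g["start"]:g["end"]]) for g in groups]

text = "Hugging Face Inc. is a company based in New York."
# Hypothetical token-level output (scores omitted for brevity).
tokens = [
    {"entity": "B-ORG", "index": 1, "start": 0, "end": 7},
    {"entity": "I-ORG", "index": 2, "start": 8, "end": 12},
    {"entity": "I-ORG", "index": 3, "start": 13, "end": 16},
    {"entity": "B-LOC", "index": 9, "start": 40, "end": 43},
    {"entity": "I-LOC", "index": 10, "start": 44, "end": 48},
]
print(group_entities(text, tokens))
```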

2.2.5 Zero-Shot Classification Example

Example of using the pipeline API for zero-shot classification:

In [None]:
from transformers import pipeline

# Load the zero-shot-classification pipeline
classifier = pipeline("zero-shot-classification")

# Classify a text with candidate labels
result = classifier("Hugging Face is creating a tool that democratizes AI.", candidate_labels=["education", "technology", "finance"])
print(result)
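
The zero-shot result is a dictionary with the input `sequence`, the candidate `labels`, and a parallel list of `scores`. The sketch below pairs labels with scores and picks the best one. The helper names are illustrative, and the scores are made up to show the shape only.

```python
# Hypothetical output: the scores are made up to show the shape only.
result = {
    "sequence": "Hugging Face is creating a tool that democratizes AI.",
    "labels": ["technology", "education", "finance"],
    "scores": [0.85, 0.10, 0.05],
}

def label_scores(output):
    """Pair each candidate label with its score."""
    return dict(zip(output["labels"], output["scores"]))

def best_label(output):
    """Return the (label, score) pair with the highest score."""
    return max(label_scores(output).items(), key=lambda pair: pair[1])

print(best_label(result))  # prints ('technology', 0.85)
```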

2.2.6 Translation Example

Example of using the pipeline API for translation:

In [None]:
from transformers import pipeline

# Load the translation pipeline
translator = pipeline("translation_en_to_fr")

# Translate a sample text from English to French
result = translator("Hugging Face is amazing!")
print(result)
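
The translation pipeline returns a list of dictionaries keyed by `translation_text`. When translating several sentences, a small wrapper can return plain strings instead. The sketch below assumes the translator returns one dictionary per input string; `translate_all` is an illustrative helper, and `fake_translator` is a hypothetical stand-in so the example runs without downloading a model.

```python
def translate_all(translator, texts):
    """Translate a list of strings and return only the translated strings."""
    outputs = translator(texts)
    return [item["translation_text"] for item in outputs]

def fake_translator(texts):
    # Hypothetical stand-in for a real pipeline so the sketch runs offline.
    lookup = {"Hello": "Bonjour", "Thank you": "Merci"}
    return [{"translation_text": lookup.get(text, text)} for text in texts]

print(translate_all(fake_translator, ["Hello", "Thank you"]))  # prints ['Bonjour', 'Merci']
```

With a real pipeline, `translate_all(translator, [...])` would be called the same way.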

2.2.7 Question Answering Example

Example of using the pipeline API for question answering:

In [None]:
from transformers import pipeline

# Load the question-answering pipeline
qa = pipeline("question-answering")

# Answer a question based on context
context = "Hugging Face Inc. is a company based in New York."
result = qa(question="Where is Hugging Face based?", context=context)
print(result)
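
The question-answering result is a dictionary with a `score`, the `answer` text, and `start`/`end` character offsets into the context. The sketch below checks that the offsets match the answer and rejects low-confidence results. The helper `answer_or_none`, the 0.5 threshold, and the score value are assumptions for illustration; the result is written by hand rather than taken from a real run.

```python
context = "Hugging Face Inc. is a company based in New York."

# Hypothetical output: the score is made up; the offsets are computed here.
start = context.index("New York")
result = {"score": 0.98, "start": start, "end": start + len("New York"),
          "answer": "New York"}

def answer_or_none(output, context, min_score=0.5):
    """Return the answer if it is confident and consistent with its offsets."""
    span = context[output["start"]:output["end"]]
    if output["score"] < min_score or span != output["answer"]:
        return None
    return output["answer"]

print(answer_or_none(result, context))  # prints New York
```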

2.2.8 Customizing Pipelines

Loading Specific Models: You can choose the model by passing a Hub model name or a local path as the model argument.

In [None]:
from transformers import pipeline

# Load a specific pre-trained model
custom_sentiment_analyzer = pipeline("sentiment-analysis", model="distilbert/distilbert-base-uncased-finetuned-sst-2-english")
result = custom_sentiment_analyzer("I enjoy learning about machine learning.")
print(result)
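
The warning printed by the default sentiment pipeline recommends specifying both a model name and a revision in production. One way to do that is to keep the pinned values in a single place. In the sketch below, the model name and revision come from that warning, while `PINNED_SENTIMENT` and `load_pinned` are hypothetical names introduced for illustration.

```python
# Model name and revision taken from the warning printed by the default pipeline.
PINNED_SENTIMENT = {
    "task": "sentiment-analysis",
    "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "714eb0f",
}

def load_pinned(spec):
    """Build a pipeline from a pinned spec (downloads the model on first use)."""
    # Imported lazily so the spec can be inspected without transformers installed.
    from transformers import pipeline
    return pipeline(spec["task"], model=spec["model"], revision=spec["revision"])

# Usage (requires transformers and network access):
# analyzer = load_pinned(PINNED_SENTIMENT)
# print(analyzer("I enjoy learning about machine learning."))
print(PINNED_SENTIMENT["model"], "@", PINNED_SENTIMENT["revision"])
```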

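Adjusting Parameters: Pass generation parameters such as max_new_tokens, which caps the number of newly generated tokens. (max_length, by contrast, counts the prompt tokens as well, and recent releases may warn when it conflicts with the pipeline's default max_new_tokens.)

In [None]:
from transformers import pipeline

# Generate text with a cap on the number of new tokens
text_generator = pipeline("text-generation", model="gpt2")
result = text_generator("In a faraway land,", max_new_tokens=50)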
print(result)
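
When choosing a length limit, it helps to remember that `max_length` bounds the whole sequence (prompt plus generated tokens), whereas `max_new_tokens` bounds only the generated part. The arithmetic can be sketched as follows; `new_token_budget` is an illustrative helper, and the prompt token counts are assumed values, since the real count depends on the tokenizer.

```python
def new_token_budget(prompt_tokens, max_length):
    """Tokens left for generation when max_length counts the prompt as well."""
    return max(max_length - prompt_tokens, 0)

# Assumed prompt size for illustration; the real count depends on the tokenizer.
prompt_tokens = 6
print(new_token_budget(prompt_tokens, max_length=50))  # prints 44
print(new_token_budget(60, max_length=50))             # prints 0
```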