[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vuhung16au/hf-transformer-trove/blob/main/examples/01_intro_hf_transformers.ipynb)
[![View on GitHub](https://img.shields.io/badge/View_on-GitHub-blue?logo=github)](https://github.com/vuhung16au/hf-transformer-trove/blob/main/examples/01_intro_hf_transformers.ipynb)

# 01 - Introduction to Hugging Face and the Transformers Library

## Learning Objectives
By the end of this notebook, you will:
- Understand what Hugging Face is and its ecosystem
- Know how to install and import the transformers library
- Load and use pre-trained models for various NLP tasks
- Understand the basic workflow with pipelines
- Explore the Model Hub and find relevant models

## What is Hugging Face?

Hugging Face is an AI company that maintains an open-source ecosystem for machine learning, with a strong focus on Natural Language Processing (NLP). Its main components are:

1. **The Transformers Library**: A unified API for using transformer models
2. **Model Hub**: A repository of thousands of pre-trained models
3. **Datasets Library**: Easy access to NLP datasets
4. **Spaces**: Platform for hosting ML demos and applications
5. **Tokenizers**: Fast tokenization library

## Installation

In [None]:
# Install required packages (run only once)
# !pip install transformers torch datasets tokenizers

# Import essential libraries
import torch
import os
from transformers import pipeline, AutoModel, AutoTokenizer, AutoModelForSequenceClassification
import warnings
warnings.filterwarnings('ignore')

# Load environment variables from .env.local for local development
try:
    from dotenv import load_dotenv
    load_dotenv('.env.local', override=True)
except ImportError:
    # dotenv not available - that's fine for Colab
    pass

# Credential loading utility for API keys
from typing import Optional

def get_api_key(key_name: str) -> Optional[str]:
    """Get an API key from Colab secrets or the local environment; return None if missing."""
    try:
        # Try to import Colab userdata (only available in Colab)
        from google.colab import userdata
        return userdata.get(key_name)
    except ImportError:
        # Not in Colab - check local environment
        api_key = os.getenv(key_name)
        if not api_key:
            print(f"⚠️  {key_name} not found. Some features may be limited.")
            print(f"   For local use: Add {key_name} to .env.local")
            print(f"   For Colab: Add {key_name} to secrets manager")
            return None
        return api_key
    except Exception:
        # In Colab, but the secret is missing or the notebook has no access to it
        print(f"⚠️  {key_name} not found in Colab secrets.")
        return None

# Try to load Hugging Face token (useful for gated models)
hf_token = get_api_key('HF_TOKEN')
if hf_token:
    print("✓ Hugging Face token loaded successfully")
    os.environ['HF_TOKEN'] = hf_token

print(f"PyTorch version: {torch.__version__}")

# Device awareness: Automatic optimization for CUDA, MPS (Apple Silicon), and CPU
def get_device():
    """Get the best available device for training/inference."""
    if torch.cuda.is_available():
        return torch.device("cuda")
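    # Older PyTorch builds have no torch.backends.mps, so check for it first
    elif getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
        return torch.device("mps")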
    else:
        return torch.device("cpu")

device = get_device()
print(f"Using device: {device}")
if device.type == 'cuda':
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")
elif device.type == 'mps':
    print("Using Apple Silicon (MPS) acceleration")

## Part 1: Using Pipelines - The Quickest Way to Get Started

Pipelines provide a high-level API for common NLP tasks. They handle:
- Model loading
- Tokenization
- Inference
- Post-processing
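
To make the four stages concrete before touching a real model, here is a toy sketch in plain Python. Everything in it is hypothetical: the function names, the word-score table, and the scoring rule are invented for illustration and are not part of the transformers API. It only shows how the stages hand data to each other.

```python
import math

# Toy illustration only: none of these names come from transformers,
# and the word scores are invented.
WORD_SCORES = {"love": 2.0, "great": 1.5, "terrible": -2.0, "boring": -1.5}


def load_model():
    """Stage 1 (model loading): here the 'model' is just a word-score table."""
    return WORD_SCORES


def tokenize(text):
    """Stage 2 (tokenization): lowercase, split on whitespace, strip edge punctuation."""
    return [word.strip(".,!?") for word in text.lower().split()]


def infer(model, tokens):
    """Stage 3 (inference): sum the scores of known words into one raw score."""
    return sum(model.get(token, 0.0) for token in tokens)


def postprocess(raw_score):
    """Stage 4 (post-processing): turn the raw score into a label and a 0.5-1.0 score."""
    label = "POSITIVE" if raw_score >= 0 else "NEGATIVE"
    confidence = 1.0 / (1.0 + math.exp(-abs(raw_score)))  # sigmoid of the magnitude
    return {"label": label, "score": confidence}


def toy_pipeline(text):
    """Chain the four stages, mirroring the shape of a pipeline result."""
    model = load_model()
    return postprocess(infer(model, tokenize(text)))


print(toy_pipeline("This movie was terrible and boring."))
```

A real pipeline replaces each stage with a trained tokenizer and a neural network, but the order of the stages is the same.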

### Text Classification

In [None]:
# Create a sentiment analysis pipeline
# With no model argument, the pipeline falls back to a default sentiment checkpoint;
# pass model="..." to pin a specific one for reproducible results
classifier = pipeline("sentiment-analysis")

# Test with some examples
texts = [
    "I love using Hugging Face transformers!",
    "This movie was terrible and boring.",
    "The weather today is okay, nothing special."
]

# Get predictions
results = classifier(texts)

for text, result in zip(texts, results):
    print(f"Text: {text}")
    print(f"Sentiment: {result['label']}, Confidence: {result['score']:.4f}\n")

### Text Generation

In [None]:
# Create a text generation pipeline
generator = pipeline("text-generation", model="gpt2")

# Generate text
prompt = "The future of artificial intelligence is"
generated = generator(
    prompt,
    max_length=100,  # total length in tokens, prompt included
    num_return_sequences=2,
    temperature=0.7,
    do_sample=True
)

print(f"Prompt: {prompt}\n")
for i, result in enumerate(generated):
    print(f"Generation {i+1}: {result['generated_text']}\n")
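
The `temperature=0.7` argument above rescales the model's next-token scores before sampling. The sketch below shows the effect in plain Python. The logits are made-up values rather than GPT-2 output, and `softmax_with_temperature` is a hypothetical helper, not a transformers function.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    max_scaled = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - max_scaled) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for four candidate next tokens (not taken from GPT-2).
example_logits = [2.0, 1.0, 0.5, 0.1]

for t in (1.0, 0.7):
    probs = softmax_with_temperature(example_logits, temperature=t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

Temperatures below 1.0 concentrate probability on the highest-scoring tokens, so sampled text becomes more predictable; temperatures above 1.0 flatten the distribution.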

### Question Answering

In [None]:
# Create a question-answering pipeline
qa_pipeline = pipeline("question-answering")

# Define context and questions
context = """
Hugging Face is an American company based in New York City that develops tools for building 
applications using machine learning. It is most notable for its transformers library built 
for natural language processing applications and its platform that allows users to share 
machine learning models and datasets.
"""

questions = [
    "Where is Hugging Face based?",
    "What is Hugging Face most notable for?",
    "What can users share on the platform?"
]

for question in questions:
    answer = qa_pipeline(question=question, context=context)
    print(f"Question: {question}")
    print(f"Answer: {answer['answer']}")
    print(f"Confidence: {answer['score']:.4f}\n")
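
An extractive question-answering model does not generate text. It scores every context token as a possible start and a possible end of the answer, and the answer is the best-scoring span. The following is a minimal sketch of one common way to pick that span, using invented logits and a hypothetical `best_span` helper; the real pipeline works on tokenizer output and handles more edge cases.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    max_value = max(values)
    exps = [math.exp(v - max_value) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def best_span(start_logits, end_logits, max_answer_len=5):
    """Return (start, end, score) for the best span with start <= end.

    The score is the product of the start and end probabilities, and spans
    longer than max_answer_len tokens are not considered.
    """
    start_probs = softmax(start_logits)
    end_probs = softmax(end_logits)
    best = (0, 0, 0.0)
    for start, p_start in enumerate(start_probs):
        last = min(start + max_answer_len, len(end_probs))
        for end in range(start, last):
            score = p_start * end_probs[end]
            if score > best[2]:
                best = (start, end, score)
    return best


# Invented tokens and logits, standing in for real model output.
tokens = ["hugging", "face", "is", "based", "in", "new", "york", "city"]
start_logits = [0.1, 0.0, 0.2, 0.3, 0.5, 4.0, 1.0, 0.5]
end_logits = [0.0, 0.1, 0.1, 0.2, 0.3, 1.0, 2.0, 4.5]

start, end, score = best_span(start_logits, end_logits)
print(" ".join(tokens[start:end + 1]), round(score, 4))
```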

## Part 2: Working with Models and Tokenizers Directly

While pipelines are convenient, sometimes you need more control over the process. Let's see how to work with models and tokenizers directly.

### Loading a Pre-trained Model and Tokenizer

In [None]:
# Load a specific model for sequence classification
model_name = "distilbert-base-uncased-finetuned-sst-2-english"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

print(f"Model: {model_name}")
print(f"Number of parameters: {model.num_parameters():,}")
print(f"Model configuration: {model.config}")

### Manual Inference Process

In [None]:
# Text to classify
text = "I really enjoyed this movie, it was fantastic!"

# Step 1: Tokenize the input
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
print("Tokenized inputs:")
print(f"Input IDs shape: {inputs['input_ids'].shape}")
print(f"Input IDs: {inputs['input_ids']}")
print(f"Attention mask: {inputs['attention_mask']}")

# Step 2: Forward pass through the model
with torch.no_grad():
    outputs = model(**inputs)

print(f"\nModel outputs shape: {outputs.logits.shape}")
print(f"Raw logits: {outputs.logits}")

# Step 3: Convert logits to probabilities
probabilities = torch.softmax(outputs.logits, dim=1)
print(f"Probabilities: {probabilities}")

# Step 4: Get predicted class
predicted_class = torch.argmax(probabilities, dim=1).item()  # plain int index
label = model.config.id2label[predicted_class]  # label names come from the model config
confidence = probabilities[0, predicted_class].item()

print(f"\nPrediction: {label} (confidence: {confidence:.4f})")
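
The tokenizer call in Step 1 passes `padding=True` and `truncation=True`. With a single text they change nothing, but they matter once several texts of different lengths go into one batch. Below is a minimal plain-Python sketch of the idea. The token ids are made up, the pad id of 0 is an assumption, and `pad_batch` is a hypothetical helper rather than the tokenizer's actual implementation.

```python
def pad_batch(batch_ids, pad_id=0, max_length=None):
    """Truncate each sequence to max_length, then right-pad to the longest one.

    Returns (input_ids, attention_mask); the mask is 1 for real tokens and
    0 for padding. Note: this naive truncation simply cuts the tail, whereas
    a real tokenizer keeps the closing special token.
    """
    if max_length is not None:
        batch_ids = [ids[:max_length] for ids in batch_ids]
    longest = max(len(ids) for ids in batch_ids)
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in batch_ids]
    attention_mask = [[1] * len(ids) + [0] * (longest - len(ids)) for ids in batch_ids]
    return input_ids, attention_mask


# Made-up token ids; a real tokenizer would produce these from text.
batch = [[101, 7592, 102], [101, 2023, 2003, 2307, 102]]
input_ids, attention_mask = pad_batch(batch)

for ids, mask in zip(input_ids, attention_mask):
    print(ids, mask)
```

The attention mask is what lets the model ignore the padded positions, which is why it is returned alongside the input ids.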

## Part 3: Exploring the Model Hub

The Hugging Face Model Hub contains thousands of pre-trained models. Let's explore how to find and use different models.

In [None]:
# List of popular models for different tasks
popular_models = {
    "Text Classification": [
        "distilbert-base-uncased-finetuned-sst-2-english",
        "cardiffnlp/twitter-roberta-base-sentiment-latest",
        "j-hartmann/emotion-english-distilroberta-base"
    ],
    "Question Answering": [
        "distilbert-base-cased-distilled-squad",
        "deepset/roberta-base-squad2",
        "bert-large-uncased-whole-word-masking-finetuned-squad"
    ],
    "Text Generation": [
        "gpt2",
        "microsoft/DialoGPT-medium",
        "distilgpt2"
    ],
    "Summarization": [
        "facebook/bart-large-cnn",
        "t5-small",
        "sshleifer/distilbart-cnn-12-6"
    ]
}

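# Use model_id as the loop variable so the `model` object loaded earlier is not overwritten
for task, model_ids in popular_models.items():
    print(f"\n{task}:")
    for model_id in model_ids:
        print(f"  - {model_id}")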

### Comparing Different Models

In [None]:
# Compare different sentiment analysis models
models_to_compare = [
    "distilbert-base-uncased-finetuned-sst-2-english",
    "cardiffnlp/twitter-roberta-base-sentiment-latest"
]

test_text = "The new iPhone available in Sydney stores is amazing, but the price is too high."

print(f"Test text: {test_text}\n")

for model_name in models_to_compare:
    try:
        classifier = pipeline("sentiment-analysis", model=model_name)
        result = classifier(test_text)
        print(f"Model: {model_name}")
        print(f"Result: {result[0]['label']} (confidence: {result[0]['score']:.4f})\n")
    except Exception as e:
        print(f"Error with {model_name}: {e}\n")
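
The two models compared above may not share a label set: one is a binary classifier and the other is assumed here to have a three-way negative/neutral/positive scheme, possibly with different capitalization. A small sketch of putting such results on a common scale follows. The label names, the invented scores, and the `to_polarity` helper are all assumptions for illustration, so check each model card before relying on them.

```python
# Assumed label vocabulary; check each model card before relying on it.
LABEL_TO_POLARITY = {
    "negative": -1,
    "neutral": 0,
    "positive": 1,
}


def to_polarity(result):
    """Map a pipeline-style result dict to (polarity, score), ignoring label case."""
    label = result["label"].lower()
    if label not in LABEL_TO_POLARITY:
        raise ValueError(f"Unknown label: {result['label']!r}")
    return LABEL_TO_POLARITY[label], result["score"]


# Hand-written results in the shape the sentiment pipeline returns (scores are invented).
binary_result = {"label": "POSITIVE", "score": 0.91}
three_way_result = {"label": "neutral", "score": 0.55}

print(to_polarity(binary_result))
print(to_polarity(three_way_result))
```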

## Part 4: Understanding Model Architectures

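Different transformer architectures are suited for different tasks. Encoder-only models such as BERT, DistilBERT, and RoBERTa are typically used for understanding tasks like classification and question answering, while decoder-only models such as GPT-2 are built for text generation.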

In [None]:
# Load different model architectures
architectures = {
    "BERT": "bert-base-uncased",
    "DistilBERT": "distilbert-base-uncased",
    "RoBERTa": "roberta-base",
    "GPT-2": "gpt2"
}

for arch_name, model_name in architectures.items():
    try:
        model = AutoModel.from_pretrained(model_name)
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        
        print(f"\n{arch_name} ({model_name}):")
        print(f"  Parameters: {model.num_parameters():,}")
        print(f"  Max position embeddings: {getattr(model.config, 'max_position_embeddings', 'N/A')}")
        print(f"  Hidden size: {model.config.hidden_size}")
        print(f"  Number of attention heads: {model.config.num_attention_heads}")
        print(f"  Number of layers: {model.config.num_hidden_layers}")
        
        # Show vocabulary size
        print(f"  Vocabulary size: {len(tokenizer)}")
        
    except Exception as e:
        print(f"Error loading {arch_name}: {e}")
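
The parameter count printed for each architecture follows directly from its configuration values. As a worked example, the sketch below rebuilds the count for a BERT-style encoder by hand. The configuration numbers are the values assumed here for BERT base, and `bert_encoder_param_count` is a hypothetical helper; the result should line up with what `num_parameters()` reports if those assumptions hold.

```python
def bert_encoder_param_count(vocab_size, hidden_size, num_layers,
                             intermediate_size, max_positions, type_vocab_size):
    """Count weights and biases in a BERT-style encoder that ends with a pooler."""
    layer_norm = 2 * hidden_size  # scale + bias

    embeddings = (
        vocab_size * hidden_size          # word embeddings
        + max_positions * hidden_size     # position embeddings
        + type_vocab_size * hidden_size   # segment (token type) embeddings
        + layer_norm
    )

    dense = hidden_size * hidden_size + hidden_size  # one square linear layer
    attention = 4 * dense + layer_norm               # query, key, value, output
    feed_forward = (
        hidden_size * intermediate_size + intermediate_size   # up projection
        + intermediate_size * hidden_size + hidden_size       # down projection
        + layer_norm
    )

    pooler = dense
    return embeddings + num_layers * (attention + feed_forward) + pooler


# Assumed BERT base configuration values.
bert_base = bert_encoder_param_count(
    vocab_size=30522, hidden_size=768, num_layers=12,
    intermediate_size=3072, max_positions=512, type_vocab_size=2,
)
print(f"{bert_base:,}")
```

Most of the total sits in the twelve transformer layers, with the word embedding table as the second-largest contributor.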

## Part 5: Understanding Model Outputs

Let's examine what models actually output and how to interpret these outputs.

In [None]:
# Load BERT model
model_name = "bert-base-uncased"
model = AutoModel.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Sample text
text = "Hello world, this is BERT!"
inputs = tokenizer(text, return_tensors="pt")

# Get model outputs with different options
with torch.no_grad():
    # Basic output
    basic_output = model(**inputs)
    
    # Output with attention weights
    attention_output = model(**inputs, output_attentions=True)
    
    # Output with hidden states
    hidden_states_output = model(**inputs, output_hidden_states=True)

print("Model Output Analysis:")
print(f"Last hidden state shape: {basic_output.last_hidden_state.shape}")
print(f"Pooler output shape: {basic_output.pooler_output.shape}")

if attention_output.attentions:
    print(f"Number of attention layers: {len(attention_output.attentions)}")
    print(f"Attention shape (layer 0): {attention_output.attentions[0].shape}")

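if hidden_states_output.hidden_states:
    # hidden_states holds the embedding output plus one entry per transformer layer
    print(f"Number of hidden states (embeddings + layers): {len(hidden_states_output.hidden_states)}")
    print(f"Hidden state shape (index 0, embedding output): {hidden_states_output.hidden_states[0].shape}")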

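# Show token-level representations
# Convert the input IDs back to tokens so the special tokens ([CLS], [SEP]) are included
# and the token count matches the sequence dimension of last_hidden_state
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
print(f"\nTokens: {tokens}")
print(f"Token representations shape: {basic_output.last_hidden_state.shape}")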
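
The shapes printed above can be predicted from the configuration and the input length. The sketch below spells out the relationship in plain Python. `expected_output_shapes` is a hypothetical helper, the configuration numbers are the values assumed for BERT base, and the sequence length of 9 assumes the sample sentence splits into seven tokens plus `[CLS]` and `[SEP]`.

```python
def expected_output_shapes(batch_size, seq_len, hidden_size, num_layers, num_heads):
    """Describe the tensor shapes a BERT-style encoder is expected to return."""
    return {
        # One vector per token
        "last_hidden_state": (batch_size, seq_len, hidden_size),
        # One vector per sequence
        "pooler_output": (batch_size, hidden_size),
        # One attention map per layer: every head attends from each token to each token
        "attentions": [(batch_size, num_heads, seq_len, seq_len)] * num_layers,
        # Embedding output plus the output of every layer
        "hidden_states": [(batch_size, seq_len, hidden_size)] * (num_layers + 1),
    }


# Assumed values: BERT base configuration and a 9-token input.
shapes = expected_output_shapes(
    batch_size=1, seq_len=9, hidden_size=768, num_layers=12, num_heads=12
)
print(shapes["last_hidden_state"])
print(len(shapes["attentions"]), shapes["attentions"][0])
print(len(shapes["hidden_states"]), shapes["hidden_states"][0])
```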

## Part 6: Practical Tips and Common Patterns

Here are some practical tips for working with Hugging Face models.

In [None]:
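# Tip 1: Check model capabilities
def analyze_model_capabilities(model_name):
    """Return the tasks for which a pipeline can be constructed with this checkpoint.

    A successful load only shows that the architecture has a matching head class.
    It does not show that the checkpoint was fine-tuned for the task: heads that
    are missing from the checkpoint are randomly initialized (with a warning).
    """
    # Try different task types
    tasks_to_try = [
        "text-classification",
        "question-answering",
        "text-generation",
        "fill-mask",
        "summarization"
    ]

    compatible_tasks = []

    for task in tasks_to_try:
        try:
            pipeline(task, model=model_name)
            compatible_tasks.append(task)
        except Exception:
            # No model class for this task/architecture combination
            continue

    return compatible_tasks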

# Test with BERT
bert_capabilities = analyze_model_capabilities("bert-base-uncased")
print(f"BERT capabilities: {bert_capabilities}")

# Test with GPT-2
gpt2_capabilities = analyze_model_capabilities("gpt2")
print(f"GPT-2 capabilities: {gpt2_capabilities}")

In [None]:
# Tip 2: Memory-efficient loading
def load_model_efficiently(model_name, task=None):
    """Load model with memory optimization"""
    
    if task:
        # Use pipeline for specific tasks; passing the torch.device also covers MPS
        return pipeline(task, model=model_name, device=device)
    else:
        # Load with optimizations
        model = AutoModel.from_pretrained(
            model_name,
            torch_dtype=torch.float16 if device.type == 'cuda' else torch.float32,
            low_cpu_mem_usage=True
        )
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        return model, tokenizer

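# Example usage (a checkpoint fine-tuned for classification, so the labels are meaningful)
efficient_classifier = load_model_efficiently(
    "distilbert-base-uncased-finetuned-sst-2-english", "text-classification"
)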
result = efficient_classifier("This is a test.")
print(f"Efficient loading result: {result}")
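
Loading in `float16` instead of `float32` matters because each parameter then takes two bytes instead of four. A back-of-the-envelope sketch of the effect on weight memory is shown below. `weight_memory_mib` is a hypothetical helper and the parameter count is a rough assumed figure for a DistilBERT-sized model, not a measured value.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_mib(num_parameters, dtype):
    """Approximate memory for the weights alone, in MiB (activations not included)."""
    return num_parameters * BYTES_PER_PARAM[dtype] / (1024 ** 2)


# Rough, assumed parameter count for a DistilBERT-sized model.
assumed_params = 66_000_000

for dtype in ("float32", "float16"):
    print(f"{dtype}: ~{weight_memory_mib(assumed_params, dtype):.0f} MiB")
```

Actual memory use during inference is higher, since activations, buffers, and framework overhead come on top of the weights.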

## Summary

In this notebook, we covered:

1. **Hugging Face Overview**: Understanding the ecosystem and its components
2. **Pipelines**: Quick and easy way to use pre-trained models
3. **Manual Model Usage**: Working with tokenizers and models directly for more control
4. **Model Hub Exploration**: Finding and comparing different models
5. **Model Architectures**: Understanding different transformer architectures
6. **Model Outputs**: Interpreting what models return
7. **Practical Tips**: Efficient loading and usage patterns

## Next Steps

- **Notebook 02**: Deep dive into tokenizers
- **Notebook 03**: Working with the datasets library
- **Notebook 04**: Mini-project combining concepts from 01-03

## Key Takeaways

- Pipelines are great for quick experimentation and prototyping
- Direct model usage gives you more control and insight
- The Model Hub has models for almost every NLP task
- Different architectures (BERT, GPT-2, RoBERTa) have different strengths
- Always consider memory usage and efficiency in your applications

---

## About the Author

**Vu Hung Nguyen** - AI Engineer & Researcher

Connect with me:
- 🌐 **Website**: [vuhung16au.github.io](https://vuhung16au.github.io/)
- 💼 **LinkedIn**: [linkedin.com/in/nguyenvuhung](https://www.linkedin.com/in/nguyenvuhung/)
- 💻 **GitHub**: [github.com/vuhung16au](https://github.com/vuhung16au/)

*This notebook is part of the [HF Transformer Trove](https://github.com/vuhung16au/hf-transformer-trove) educational series.*