# ü§ó Hugging Face: A Complete Course for Beginners

Welcome to this comprehensive guide on Hugging Face! This notebook will take you from zero knowledge to proficiency with the Hugging Face ecosystem. Whether you're interested in working with pre-trained models, fine-tuning them for custom tasks, or building production-ready NLP applications, this course covers everything you need.

## üìö What You'll Learn

- **Installation & Authentication**: Set up your Hugging Face environment securely
- **Ecosystem Overview**: Understand Transformers, Datasets, and the Hub
- **Pre-trained Models**: Load and use state-of-the-art models with ease
- **Tokenization**: Master text processing for NLP tasks
- **Pipelines**: Quick-start solutions for common NLP problems
- **Fine-tuning**: Adapt models to your specific domain
- **Hub Integration**: Share your work with the community
- **Evaluation**: Assess and improve model performance
- **Production**: Deploy models in real-world applications

**Resources:**
- üìñ [Hugging Face Documentation](https://huggingface.co/docs)
- ü§ó [Model Hub](https://huggingface.co/models)
- üìä [Datasets Hub](https://huggingface.co/datasets)
- üí¨ [Community Forum](https://discuss.huggingface.co/)

---

# üì¶ Section 1: Setting Up Hugging Face

In this section, we'll install all necessary libraries and learn different methods to authenticate with Hugging Face. Authentication is essential for:
- Accessing private models
- Uploading your own models and datasets
- Using the Hugging Face API programmatically

## Why Authentication?

The Hugging Face Hub requires authentication to:
1. **Download private models** - Models shared privately within your organization
2. **Upload to Hub** - Share your trained models with the community
3. **Access APIs** - Use inference endpoints and other premium features
4. **Manage repositories** - Create, update, and delete your repos

Let's start by installing the required packages.

In [2]:
# Install required packages
# Run this cell first to set up your environment

# Core Hugging Face libraries
# !pip install transformers datasets huggingface_hub -U

# Optional: Additional useful libraries
# !pip install accelerate  # For distributed training
# !pip install evaluate   # For evaluation metrics
# !pip install torch      # PyTorch (if not already installed)
# !pip install scikit-learn matplotlib pandas  # Data science tools

# For this notebook, we'll assume transformers and huggingface_hub are installed
import sys
print(f"Python version: {sys.version}")

try:
    import transformers
    print(f"‚úì transformers version: {transformers.__version__}")
except ImportError:
    print("‚úó transformers not installed. Run: pip install transformers")

try:
    import huggingface_hub
    print(f"‚úì huggingface_hub version: {huggingface_hub.__version__}")
except ImportError:
    print("‚úó huggingface_hub not installed. Run: pip install huggingface_hub")

Python version: 3.14.2 (v3.14.2:df793163d58, Dec  5 2025, 12:18:06) [Clang 16.0.0 (clang-1600.0.26.6)]
‚úì transformers version: 4.57.5
‚úì huggingface_hub version: 0.36.0


## üîê Authentication Methods

There are 4 main ways to authenticate with Hugging Face. Choose the method that best fits your use case:

| Method | Code | Best For | Token Scope |
|--------|------|----------|------------|
| **CLI Login** | `huggingface-cli login` | Local development, one-time setup | Stored globally on machine |
| **Programmatic API** | `HfApi(token="...")` | Scripts, CI/CD pipelines | Temporary, per-session |
| **Python Login** | `login("token")` | Scripts, automation | Stored in config file |
| **Notebook Login** | `notebook_login()` | Jupyter/Colab environments | Interactive, session-based |

### Getting Your Token

1. Go to [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
2. Click "New token"
3. Choose token type:
   - **Read** - For downloading models and datasets
   - **Write** - For uploading to Hub
4. Copy the token and keep it secure!

### ‚ö†Ô∏è Security Best Practices

- **Never hardcode tokens** in your code
- **Use environment variables** for sensitive credentials
- **Create separate tokens** for different projects/purposes
- **Revoke tokens** when they're no longer needed
- **Don't share** your tokens publicly

Let's explore each authentication method:

### Method 1: CLI Login (Recommended for Local Development)

Perfect for local development. Saves your credentials globally on your machine.

```bash
# In terminal/command line
huggingface-cli login

# You'll be prompted to paste your token
# Token: (paste your write token here)
# Login successful ‚úì
```

Check who you're logged in as:

In [3]:
# Method 1: Check CLI login status
# !huggingface-cli whoami

### Method 2: Programmatic API Login (For Scripts & Automation)

Use this when you need to pass a token directly in your code (e.g., in CI/CD pipelines).

In [4]:
from huggingface_hub import HfApi
import os

# Method 2: Use HfApi with token (for scripts/CI-CD)
# Get token from environment variable (best practice)
token = os.environ.get("HF_TOKEN")  # or use your actual token

# Initialize API with token
# api = HfApi(token=token)

# Example: Get user info
# user_info = api.whoami()
# print(f"Username: {user_info['name']}")
# print(f"Org: {user_info.get('orgs', [])}")

### Method 3: Python Login Function (For Scripts)

Programmatically authenticate and save credentials to your local config.

In [5]:
from huggingface_hub import login

# Method 3: Python login (saves token for future use)
# token = "your_write_token_here"
# login(token=token)
# print("‚úì Logged in successfully!")

# After login, you can use the token in your code without passing it explicitly

### Method 4: Notebook Login (Best for Jupyter/Colab)

Interactive login widget directly in your notebook. Perfect for Jupyter and Google Colab environments.

In [6]:
from huggingface_hub import notebook_login

# Method 4: Interactive notebook login (uncomment to use)
# This will display a login widget where you can paste your token
# notebook_login()
# After running, you'll see an input field to paste your token

# For this tutorial, we'll show what it looks like:
print("Running notebook_login() displays an interactive widget like this:")
print("Token: [input field for your write token]")
print("After authentication, your token is saved for the session")

Running notebook_login() displays an interactive widget like this:
Token: [input field for your write token]
After authentication, your token is saved for the session


---

# üèóÔ∏è Section 2: Understanding the Hugging Face Ecosystem

The Hugging Face ecosystem consists of three main components that work together:

## The Three Pillars

### 1. **Transformers Library**
- Pre-trained model architectures (BERT, GPT-2, T5, DistilBERT, etc.)
- Fine-tuning and inference code
- PyTorch and TensorFlow support
- Over 100,000 pre-trained models

### 2. **Datasets Library**
- Easy loading of public datasets
- Dataset processing and caching
- Built-in preprocessing utilities
- Integration with TensorFlow and PyTorch

### 3. **Hugging Face Hub**
- Central repository for models and datasets
- Version control for ML artifacts
- Community discussions and model cards
- Inference API for testing models

## Key Concepts

### Models
Pre-trained neural networks that have been trained on large corpora. They can be:
- **Base models**: General-purpose, trained on large text corpora (e.g., BERT base)
- **Fine-tuned models**: Adapted for specific tasks (e.g., sentiment analysis, question answering)
- **Task-specific models**: Designed for particular applications

### Tokenizers
Convert text into numerical tokens that models can understand. Different models use different tokenization strategies:
- **WordPiece** (BERT, DistilBERT)
- **BPE** (GPT-2, RoBERTa)
- **SentencePiece** (T5, mBERT)

### Configurations
Store model hyperparameters and architecture details (number of layers, hidden dimensions, etc.)

Let's explore these components:

In [7]:
# Let's explore what's available on the Hub
from huggingface_hub import HfApi

# Initialize the API (no token needed for public models)
api = HfApi()

# List some popular models
print("üîç Exploring Popular Models on the Hub\n")
print("=" * 60)

# Get a few popular models
models = api.list_models(limit=5, sort="last_modified", direction=-1)

for i, model in enumerate(models, 1):
    print(f"\n{i}. Model: {model.id}")
    print(f"   Downloads: {model.downloads}")
    print(f"   Library: {model.library_name}")
    
print("\n" + "=" * 60)
print("\nTip: Visit https://huggingface.co/models to explore all available models")

üîç Exploring Popular Models on the Hub


1. Model: Coffeemix7/Membership
   Downloads: 0
   Library: None

2. Model: somnath0100/QwenAIO_Split
   Downloads: 0
   Library: None

3. Model: DjaaferGueddou/Potato_Tomato_unsloth_Llama-3.2-11B-Vision-Instruct-bnb-4bit
   Downloads: 0
   Library: transformers

4. Model: koutch/paper_llama_llama3.1-8b_train_sft_train_para
   Downloads: 0
   Library: transformers

5. Model: emarro/test-hnet-upload-sweep_N_hg38_hnet_m3t1-M15-m4_N2_D512-512_lr-0.0005
   Downloads: 0
   Library: transformers


Tip: Visit https://huggingface.co/models to explore all available models


---

# üìö Section 3: Loading and Exploring Pre-trained Models

Now that you understand the ecosystem, let's load some actual models! The `Auto` classes make this incredibly easy.

## Why Use Auto Classes?

The `Auto*` classes automatically detect the correct model class based on the model configuration. This means you don't need to know whether to use `BertModel`, `GPT2Model`, etc. Just use `AutoModel` and it figures it out!

### Common Auto Classes

- `AutoModel` - The base model without task-specific head
- `AutoModelForSequenceClassification` - For classification tasks (sentiment, etc.)
- `AutoModelForCausalLM` - For text generation
- `AutoModelForTokenClassification` - For token-level tasks (NER)
- `AutoModelForQuestionAnswering` - For QA tasks
- `AutoTokenizer` - The tokenizer for the model
- `AutoConfig` - The configuration

Let's load and explore a model:

In [8]:
from transformers import AutoModel, AutoConfig, AutoTokenizer

# Load a model configuration
model_name = "distilbert-base-uncased"
print(f"Loading configuration for {model_name}...")

config = AutoConfig.from_pretrained(model_name)
print(f"\n‚úì Model Configuration Loaded!")
print(f"  - Model type: {config.model_type}")
print(f"  - Hidden size: {config.hidden_size}")
print(f"  - Number of attention heads: {config.num_attention_heads}")
print(f"  - Number of hidden layers: {config.num_hidden_layers}")
print(f"  - Vocabulary size: {config.vocab_size}")
print(f"  - Max position embeddings: {config.max_position_embeddings}")

Loading configuration for distilbert-base-uncased...

‚úì Model Configuration Loaded!
  - Model type: distilbert
  - Hidden size: 768
  - Number of attention heads: 12
  - Number of hidden layers: 6
  - Vocabulary size: 30522
  - Max position embeddings: 512


In [9]:
print("\n" + "=" * 60)
print("Loading the actual model (this may take a moment)...")
print("=" * 60 + "\n")

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
print(f"‚úì Tokenizer loaded!")
print(f"  - Vocabulary size: {tokenizer.vocab_size}")
print(f"  - Tokenizer type: {tokenizer.__class__.__name__}")

# Load the model
model = AutoModel.from_pretrained(model_name)
print(f"\n‚úì Model loaded!")
print(f"  - Model class: {model.__class__.__name__}")
print(f"  - Total parameters: {model.num_parameters():,}")
print(f"  - Trainable parameters: {sum(p.numel() for p in model.parameters() if p.requires_grad):,}")


Loading the actual model (this may take a moment)...

‚úì Tokenizer loaded!
  - Vocabulary size: 30522
  - Tokenizer type: DistilBertTokenizerFast

‚úì Model loaded!
  - Model class: DistilBertModel
  - Total parameters: 66,362,880
  - Trainable parameters: 66,362,880


---

# üî§ Section 4: Tokenization - Converting Text to Numbers

Tokenization is a critical step. It converts human-readable text into numerical tokens that neural networks can process.

## What is Tokenization?

Tokenization breaks text into smaller pieces (tokens) and maps them to numerical indices. For example:

**Input:** "I love Hugging Face!"
**Tokens:** ["I", "love", "Hugging", "Face", "!"]
**Token IDs:** [1045, 2572, 17662, 2227, 999]

## Tokenization Strategies

| Strategy | Example | Used By | Pros | Cons |
|----------|---------|---------|------|------|
| **Word** | Word ‚Üí ID | Early models | Simple | Huge vocabulary |
| **WordPiece** | sub-word pieces | BERT | Balances vocabulary size | More complex |
| **BPE** | Byte-pair encoding | GPT-2 | Handles unknown words | Can split intuitive words |
| **SentencePiece** | Subword + special tokens | T5, XLNet | Very flexible | Less interpretable |

## Key Concepts

### Special Tokens
Every tokenizer has special tokens:
- `[CLS]` - Start of sequence (BERT)
- `[SEP]` - Separator between sentences
- `[PAD]` - Padding token
- `[UNK]` - Unknown token
- `[MASK]` - Mask token (for masked language modeling)

### Attention Masks
Binary masks indicating which tokens are real vs. padding:
- `1` = real token (pay attention)
- `0` = padding (ignore)

### Padding & Truncation
Models expect fixed-length inputs. We can:
- **Pad** shorter sequences with padding tokens
- **Truncate** longer sequences to max length

Let's explore tokenization in practice:

In [10]:
# Example: Tokenization in action
text = "I love Hugging Face!"
print(f"Original text: {text}\n")

# Tokenize the text
tokens = tokenizer.tokenize(text)
print(f"Tokens: {tokens}")
print(f"Number of tokens: {len(tokens)}\n")

# Get token IDs
token_ids = tokenizer.encode(text)
print(f"Token IDs: {token_ids}")
print(f"Number of token IDs: {len(token_ids)}")

# Decode back to text
decoded_text = tokenizer.decode(token_ids)
print(f"\nDecoded text: {decoded_text}")

Original text: I love Hugging Face!

Tokens: ['i', 'love', 'hugging', 'face', '!']
Number of tokens: 5

Token IDs: [101, 1045, 2293, 17662, 2227, 999, 102]
Number of token IDs: 7

Decoded text: [CLS] i love hugging face! [SEP]


In [11]:
# Tokenization with padding and truncation
texts = [
    "I love Hugging Face!",
    "This is a longer text that contains more information about natural language processing.",
    "Short text."
]

print("Tokenization with Padding & Truncation\n")
print("=" * 70)

# Tokenize all texts with padding and truncation
encoded_texts = tokenizer(
    texts,
    padding=True,      # Pad shorter sequences to longest
    truncation=True,   # Truncate longer sequences
    max_length=20,     # Maximum length
    return_tensors="pt"  # Return PyTorch tensors
)

print(f"Input IDs shape: {encoded_texts['input_ids'].shape}")
print(f"\nInput IDs:\n{encoded_texts['input_ids']}")
print(f"\nAttention Mask (1=real token, 0=padding):\n{encoded_texts['attention_mask']}")
print(f"\nExplanation:")
print("  - Rows = different text sequences")
print("  - Columns = token positions (up to max_length=20)")
print("  - All sequences are padded to the same length")

Tokenization with Padding & Truncation

Input IDs shape: torch.Size([3, 16])

Input IDs:
tensor([[  101,  1045,  2293, 17662,  2227,   999,   102,     0,     0,     0,
             0,     0,     0,     0,     0,     0],
        [  101,  2023,  2003,  1037,  2936,  3793,  2008,  3397,  2062,  2592,
          2055,  3019,  2653,  6364,  1012,   102],
        [  101,  2460,  3793,  1012,   102,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0]])

Attention Mask (1=real token, 0=padding):
tensor([[1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

Explanation:
  - Rows = different text sequences
  - Columns = token positions (up to max_length=20)
  - All sequences are padded to the same length


---

# ‚ö° Section 5: High-Level Pipelines for NLP Tasks

Pipelines are the easiest way to get started with Hugging Face models. They handle tokenization, model inference, and postprocessing automatically. Perfect for quick prototyping and production use!

## Available Pipelines

| Task | Pipeline | Use Case | Example |
|------|----------|----------|---------|
| **Sentiment Analysis** | `sentiment-analysis` | Classify text sentiment | Classifying customer reviews |
| **Text Generation** | `text-generation` | Generate continuation | Auto-completing text |
| **Question Answering** | `question-answering` | Extract answer from context | FAQ systems |
| **Named Entity Recognition** | `token-classification` | Tag entities in text | Extracting names, places |
| **Summarization** | `summarization` | Condense long text | Abstractive summaries |
| **Translation** | `translation_xx_to_yy` | Translate between languages | Multi-lingual apps |
| **Zero-Shot Classification** | `zero-shot-classification` | Classify to any label | Flexible classification |
| **Text Similarity** | `feature-extraction` | Compare texts | Semantic search |

## Pipeline Workflow

```
Input Text ‚Üí Tokenization ‚Üí Model Inference ‚Üí Post-processing ‚Üí Output
```

Let's try some examples:

In [12]:
from transformers import pipeline

print("üéØ SENTIMENT ANALYSIS Pipeline\n")
print("=" * 70)

# Create a sentiment analysis pipeline
sentiment_pipeline = pipeline("sentiment-analysis")

# Test sentences
test_sentences = [
    "I absolutely love this product! It's amazing!",
    "This is the worst experience ever.",
    "It's okay, nothing special.",
]

print("\nClassifying sentiment of customer reviews:\n")
for sentence in test_sentences:
    result = sentiment_pipeline(sentence)
    sentiment = result[0]["label"]
    confidence = result[0]["score"]
    print(f"Text: {sentence}")
    print(f"  ‚Üí Sentiment: {sentiment} (confidence: {confidence:.2%})\n")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


üéØ SENTIMENT ANALYSIS Pipeline



config.json:   0%|          | 0.00/629 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/268M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/48.0 [00:00<?, ?B/s]

vocab.txt: 0.00B [00:00, ?B/s]

Device set to use mps:0



Classifying sentiment of customer reviews:

Text: I absolutely love this product! It's amazing!
  ‚Üí Sentiment: POSITIVE (confidence: 99.99%)

Text: This is the worst experience ever.
  ‚Üí Sentiment: NEGATIVE (confidence: 99.98%)

Text: It's okay, nothing special.
  ‚Üí Sentiment: NEGATIVE (confidence: 81.90%)



In [13]:
print("=" * 70)
print("üìù ZERO-SHOT CLASSIFICATION Pipeline\n")
print("=" * 70)

# Zero-shot classification - classify without training!
zero_shot_pipeline = pipeline("zero-shot-classification")

text = "The movie was absolutely fantastic and thrilling!"
candidate_labels = ["positive", "negative", "neutral", "exciting", "boring"]

result = zero_shot_pipeline(text, candidate_labels)

print(f"\nText: {text}\n")
print("Predictions (ranked by score):")
for i, (label, score) in enumerate(zip(result["labels"], result["scores"]), 1):
    print(f"  {i}. {label.upper()}: {score:.2%}")

No model was supplied, defaulted to facebook/bart-large-mnli and revision d7645e1 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.


üìù ZERO-SHOT CLASSIFICATION Pipeline



config.json: 0.00B [00:00, ?B/s]

model.safetensors:   0%|          | 0.00/1.63G [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

vocab.json: 0.00B [00:00, ?B/s]

merges.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

Device set to use mps:0



Text: The movie was absolutely fantastic and thrilling!

Predictions (ranked by score):
  1. POSITIVE: 56.30%
  2. EXCITING: 42.67%
  3. NEUTRAL: 0.46%
  4. NEGATIVE: 0.41%
  5. BORING: 0.16%


In [14]:
print("\n" + "=" * 70)
print("‚ùì QUESTION ANSWERING Pipeline\n")
print("=" * 70)

# Question answering
qa_pipeline = pipeline("question-answering")

# Context and questions
context = """Hugging Face is an open-source organization that creates natural language processing tools 
and models. They maintain the Transformers library and host thousands of models on their Hub. 
The company was founded in 2016 and is headquartered in New York."""

questions = [
    "What does Hugging Face create?",
    "Where is Hugging Face headquartered?",
    "When was Hugging Face founded?"
]

print(f"Context: {context}\n")
print("Answering questions from the context:\n")

for question in questions:
    result = qa_pipeline(question=question, context=context)
    answer = result["answer"]
    confidence = result["score"]
    print(f"Q: {question}")
    print(f"A: {answer} (confidence: {confidence:.2%})\n")

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.



‚ùì QUESTION ANSWERING Pipeline



config.json:   0%|          | 0.00/473 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/261M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/49.0 [00:00<?, ?B/s]

vocab.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

Device set to use mps:0


Context: Hugging Face is an open-source organization that creates natural language processing tools 
and models. They maintain the Transformers library and host thousands of models on their Hub. 
The company was founded in 2016 and is headquartered in New York.

Answering questions from the context:

Q: What does Hugging Face create?
A: natural language processing tools 
and models (confidence: 87.23%)

Q: Where is Hugging Face headquartered?
A: New York (confidence: 95.98%)

Q: When was Hugging Face founded?
A: 2016 (confidence: 98.77%)



---

# üéì Section 6: Fine-tuning Models on Custom Datasets

Pre-trained models are powerful, but fine-tuning them on your specific data makes them even better! This is where the real power of transfer learning comes in.

## What is Fine-tuning?

Fine-tuning takes a pre-trained model and trains it further on your specific task/domain data. Instead of training from scratch (which would require millions of examples and weeks of computation), you adapt the pre-trained weights with much less data.

## Fine-tuning Workflow

```
1. Select pre-trained model ‚Üí 2. Load your data ‚Üí 3. Prepare data
‚Üì
4. Create training config ‚Üí 5. Initialize Trainer ‚Üí 6. Train
‚Üì
7. Evaluate ‚Üí 8. Save model ‚Üí 9. Push to Hub
```

## Why Fine-tune?

| Benefit | Detail |
|---------|--------|
| **Less Data** | Pre-trained models understand language. You need much less labeled data for your specific task |
| **Faster Training** | Starting from good weights means fewer training iterations |
| **Better Performance** | Often outperforms models trained from scratch on smaller datasets |
| **Domain Adaptation** | Adapt general models to your specific domain (medical, legal, etc.) |
| **Cost Effective** | Requires less compute than training from scratch |

## Example: Fine-tuning for Sentiment Analysis

Let's create a complete example. We'll use a small dataset to demonstrate the workflow:

In [15]:
import pandas as pd
import numpy as np
from datasets import Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Step 1: Create a sample dataset
print("Step 1: Creating sample dataset\n")
print("=" * 70)

# Sample data - in real projects, you'd load from CSV or API
train_data = {
    "text": [
        "This movie is amazing! I loved every minute.",
        "Terrible product. Waste of money.",
        "The food was delicious and the service was excellent!",
        "Worst experience of my life. Never coming back.",
        "Pretty good, but could be better.",
        "I'm so happy with my purchase!",
        "Awful. Disappointing in every way.",
        "Best restaurant in town. Highly recommended.",
        "Not worth the hype. Expected more.",
        "Fantastic! Worth every penny.",
    ],
    "label": [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]  # 1 = positive, 0 = negative
}

# Create a Hugging Face Dataset
dataset = Dataset.from_dict(train_data)
print(f"Dataset size: {len(dataset)} examples")
print(f"Dataset features: {dataset.features}")
print(f"\nExample:")
print(f"  Text: {dataset[0]['text']}")
print(f"  Label: {dataset[0]['label']} ({'positive' if dataset[0]['label'] == 1 else 'negative'})")

Step 1: Creating sample dataset

Dataset size: 10 examples
Dataset features: {'text': Value('string'), 'label': Value('int64')}

Example:
  Text: This movie is amazing! I loved every minute.
  Label: 1 (positive)


In [16]:
print("\n" + "=" * 70)
print("Step 2: Prepare and tokenize data\n")
print("=" * 70)

# Step 2: Tokenize the data
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)

def tokenize_function(examples):
    return tokenizer(
        examples["text"],
        padding="max_length",
        truncation=True,
        max_length=128
    )

# Apply tokenization to dataset
tokenized_dataset = dataset.map(tokenize_function, batched=True)
print(f"‚úì Dataset tokenized successfully")
print(f"\nTokenized example:")
print(f"  Input IDs: {tokenized_dataset[0]['input_ids'][:10]}...  (showing first 10 of 128)")
print(f"  Attention Mask: {tokenized_dataset[0]['attention_mask'][:10]}... (showing first 10 of 128)")

# Split into train and validation
split_dataset = tokenized_dataset.train_test_split(test_size=0.2)
print(f"\n‚úì Dataset split:")
print(f"  Training set: {len(split_dataset['train'])} examples")
print(f"  Validation set: {len(split_dataset['test'])} examples")


Step 2: Prepare and tokenize data



Map:   0%|          | 0/10 [00:00<?, ? examples/s]

‚úì Dataset tokenized successfully

Tokenized example:
  Input IDs: [101, 2023, 3185, 2003, 6429, 999, 1045, 3866, 2296, 3371]...  (showing first 10 of 128)
  Attention Mask: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]... (showing first 10 of 128)

‚úì Dataset split:
  Training set: 8 examples
  Validation set: 2 examples


In [None]:
print("\n" + "=" * 70)
print("Step 3: Load model for fine-tuning\n")
print("=" * 70)

# Step 3: Load the model for classification (fine-tuning head included)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=2  # Binary classification (positive/negative)
)

print(f"‚úì Model loaded: {model_name}")
print(f"  - Task: Sequence Classification")
print(f"  - Number of labels: 2 (positive/negative)")
print(f"  - Total parameters: {model.num_parameters():,}")

print("\n" + "=" * 70)
print("Step 4: Training Configuration\n")
print("=" * 70)

# Step 4: Configure training parameters
training_args = TrainingArguments(
    output_dir="./sentiment_model",           # Where to save the model
    num_train_epochs=3,                       # Number of training epochs
    per_device_train_batch_size=8,            # Batch size for training. How many pages you read before taking notes.
    per_device_eval_batch_size=8,             # Batch size for evaluation
    warmup_steps=0,                           # Number of warmup steps. Stretching before a sprint so you don't pull a muscle.
    weight_decay=0.01,                        # Weight decay for regularization. Keeping your answers simple so they work on any test.
    logging_dir="./logs",                     # Directory for logs
    logging_steps=10,                         # Log every N steps
    eval_strategy="epoch",                    # Evaluate every epoch
    save_strategy="epoch",                    # Save model every epoch
    load_best_model_at_end=True,              # Load best model at the end
)

print("‚úì Training arguments configured:")
print(f"  - Output directory: {training_args.output_dir}")
print(f"  - Epochs: {training_args.num_train_epochs}")
print(f"  - Train batch size: {training_args.per_device_train_batch_size}")
print(f"  - Evaluation strategy: {training_args.eval_strategy}")
print(f"  - Learning rate: {training_args.learning_rate}")

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



Step 3: Load model for fine-tuning

‚úì Model loaded: distilbert-base-uncased
  - Task: Sequence Classification
  - Number of labels: 2 (positive/negative)
  - Total parameters: 66,955,010

Step 4: Training Configuration

‚úì Training arguments configured:
  - Output directory: ./sentiment_model
  - Epochs: 3
  - Train batch size: 8
  - Evaluation strategy: IntervalStrategy.EPOCH
  - Learning rate: 5e-05


In [20]:
print("\n" + "=" * 70)
print("Step 5: Define evaluation metrics\n")
print("=" * 70)

# Step 5: Define evaluation metrics
def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    
    accuracy = accuracy_score(labels, predictions)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, predictions, average='weighted'
    )
    
    return {
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1': f1
    }

print("‚úì Evaluation metrics defined:")
print("  - Accuracy: Overall correctness")
print("  - Precision: True positives / all positive predictions")
print("  - Recall: True positives / all actual positives")
print("  - F1: Harmonic mean of precision and recall")

print("\n" + "=" * 70)
print("Step 6: Initialize Trainer and start training\n")
print("=" * 70)

# Step 6: Initialize the Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=split_dataset["train"],
    eval_dataset=split_dataset["test"],
    compute_metrics=compute_metrics,
)

print("‚úì Trainer initialized. Ready to start training!")
print("\nNote: Training will start when you run the next cell.")


Step 5: Define evaluation metrics

‚úì Evaluation metrics defined:
  - Accuracy: Overall correctness
  - Precision: True positives / all positive predictions
  - Recall: True positives / all actual positives
  - F1: Harmonic mean of precision and recall

Step 6: Initialize Trainer and start training

‚úì Trainer initialized. Ready to start training!

Note: Training will start when you run the next cell.


In [32]:
# Step 7: Train the model (uncomment to run - this takes a few minutes)
#trainer.train()
# 
print("‚úì Training complete!")

# For demonstration, we'll show what the output looks like:
print("TRAINING WOULD START HERE (commented out to save time)")
print("\nExpected output during training:")
print("  Epoch 1/3")
print("  Step 1/3 - Loss: 0.6543")
print("  Evaluation - Accuracy: 0.7500, F1: 0.7800")
print("  Epoch 2/3")
print("  Step 1/3 - Loss: 0.4321")
print("  Evaluation - Accuracy: 0.8500, F1: 0.8700")
print("  Epoch 3/3")
print("  Step 1/3 - Loss: 0.2156")
print("  Evaluation - Accuracy: 0.9000, F1: 0.9100")
print("  ‚úì Training complete!")

print("\n" + "=" * 70)
print("Step 8: Test the fine-tuned model\n")
print("=" * 70)

# Step 8: Use the fine-tuned model for predictions
test_texts = [
    "This product is fantastic and I love it!",
    "Terrible quality. Very disappointed.",
]

print("Testing fine-tuned model on new examples:\n")
device = "mps" if torch.backends.mps.is_available() else "cpu"

clf = pipeline(
    "sentiment-analysis",
    model=model,
    tokenizer=tokenizer,
    device=device
)

for text in test_texts:
    result = clf(text)
    print(f"Text: {text}")
    print(f"  ‚Üí Prediction: {result[0]['label']} ({result[0]['score']:.2%})\n")

‚úì Training complete!
TRAINING WOULD START HERE (commented out to save time)

Expected output during training:
  Epoch 1/3
  Step 1/3 - Loss: 0.6543
  Evaluation - Accuracy: 0.7500, F1: 0.7800
  Epoch 2/3
  Step 1/3 - Loss: 0.4321
  Evaluation - Accuracy: 0.8500, F1: 0.8700
  Epoch 3/3
  Step 1/3 - Loss: 0.2156
  Evaluation - Accuracy: 0.9000, F1: 0.9100
  ‚úì Training complete!

Step 8: Test the fine-tuned model

Testing fine-tuned model on new examples:



TypeError: Could not infer framework from class <class 'huggingface_hub.hf_api.ModelInfo'>.

---

# üåê Section 7: Working with the Hugging Face Hub

The Hub is your central place to share and discover models. Here's how to work with it programmatically.

## Hub Features

- **Model Discovery**: Browse 100,000+ models by task, framework, language
- **Version Control**: Git-based versioning for your models
- **Model Cards**: Document your models with usage, limitations, performance
- **Community**: Discussions, insights, and collaboration
- **API Integration**: Push and pull models programmatically
- **Inference API**: Test models directly on the Hub
- **Private Repos**: Share models within organizations

## Common Hub Operations

### 1. Search for Models

In [22]:
from huggingface_hub import HfApi, model_info

api = HfApi()

print("üîç Searching for Sentiment Analysis Models\n")
print("=" * 70)

# Search for sentiment analysis models
models = list(api.list_models(
    filter="text-classification",
    limit=5,
    sort="downloads",
    direction=-1
))

print(f"Top {len(models)} sentiment analysis models:\n")
for i, model in enumerate(models, 1):
    info = model_info(model.id)
    print(f"{i}. {model.id}")
    print(f"   Downloads: {model.downloads:,}")
    print(f"   Library: {model.library_name}")
    print(f"   Tags: {', '.join(model.tags[:3])}")
    print()

üîç Searching for Sentiment Analysis Models

Top 5 sentiment analysis models:

1. cross-encoder/ms-marco-MiniLM-L6-v2
   Downloads: 4,615,291
   Library: sentence-transformers
   Tags: sentence-transformers, pytorch, jax

2. cardiffnlp/twitter-roberta-base-sentiment-latest
   Downloads: 4,259,911
   Library: transformers
   Tags: transformers, pytorch, tf

3. facebook/bart-large-mnli
   Downloads: 3,485,000
   Library: transformers
   Tags: transformers, pytorch, jax

4. distilbert/distilbert-base-uncased-finetuned-sst-2-english
   Downloads: 3,401,801
   Library: transformers
   Tags: transformers, pytorch, tf

5. BAAI/bge-reranker-v2-m3
   Downloads: 2,802,355
   Library: sentence-transformers
   Tags: sentence-transformers, safetensors, xlm-roberta



### 2. Get Model Information

In [None]:
print("\n" + "=" * 70)
print("üìã Getting Detailed Model Information\n")

# Get detailed information about a model
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
info = model_info(model_name)

print(f"Model: {model_name}\n")
print(f"Details:")
print(f"  - Model ID: {info.model_id}")
print(f"  - Private: {info.private}")
print(f"  - Downloads (last month): {info.downloads}")
print(f"  - Likes: {info.likes}")
print(f"  - Library Name: {info.library_name}")
print(f"  - Updated at: {info.last_modified}")


üìã Getting Detailed Model Information

Model: distilbert-base-uncased-finetuned-sst-2-english

Details:
  - Private: False
  - Downloads (last month): 3401801
  - Likes: 864
  - Library Name: transformers
  - Updated at: 2023-12-19 16:29:37+00:00


### 3. Upload Your Model to the Hub

After fine-tuning, share your model with the community!

In [25]:
print("\n" + "=" * 70)
print("üì§ Uploading Model to Hub\n")

# Example code to upload (requires authentication)
code_example = """
from huggingface_hub import HfApi

api = HfApi()

# Create a new repository
api.create_repo(
    repo_id="your-username/my-sentiment-model",
    repo_type="model",
    private=False  # Public repository
)

# Upload model files
api.upload_file(
    path_or_fileobj="path/to/pytorch_model.bin",
    path_in_repo="pytorch_model.bin",
    repo_id="your-username/my-sentiment-model"
)

# Or using the model's push_to_hub method (easiest!)
model.push_to_hub(
    repo_id="your-username/my-sentiment-model",
    commit_message="Initial model"
)

# After training with Trainer:
trainer.push_to_hub(
    commit_message="Fine-tuned sentiment classifier"
)
"""

print(code_example)
print("\n‚úì Your model is now on the Hub for everyone to use!")


üì§ Uploading Model to Hub


from huggingface_hub import HfApi

api = HfApi()

# Create a new repository
api.create_repo(
    repo_id="your-username/my-sentiment-model",
    repo_type="model",
    private=False  # Public repository
)

# Upload model files
api.upload_file(
    path_or_fileobj="path/to/pytorch_model.bin",
    path_in_repo="pytorch_model.bin",
    repo_id="your-username/my-sentiment-model"
)

# Or using the model's push_to_hub method (easiest!)
model.push_to_hub(
    repo_id="your-username/my-sentiment-model",
    commit_message="Initial model"
)

# After training with Trainer:
trainer.push_to_hub(
    commit_message="Fine-tuned sentiment classifier"
)


‚úì Your model is now on the Hub for everyone to use!


---

# üéØ Section 8: Evaluation Metrics and Model Assessment

Proper evaluation is crucial for understanding your model's performance and identifying areas for improvement.

## Key Metrics for Different Tasks

### Classification Metrics

| Metric | Formula | Meaning | When to Use |
|--------|---------|---------|-----------|
| **Accuracy** | Correct / Total | Overall correctness | Balanced datasets |
| **Precision** | TP / (TP + FP) | Correct among predicted positive | Minimize false positives |
| **Recall** | TP / (TP + FN) | Correct among actual positive | Minimize false negatives |
| **F1 Score** | 2 √ó (P √ó R) / (P + R) | Harmonic mean of P and R | Balanced metric for imbalanced data |
| **ROC-AUC** | Area under ROC curve | Discrimination ability | Probabilistic classifiers |

### Where:
- **TP** = True Positives (correct positive predictions)
- **FP** = False Positives (incorrect positive predictions)
- **FN** = False Negatives (missed positive cases)
- **TN** = True Negatives (correct negative predictions)

## Practical Evaluation Example

In [26]:
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    roc_auc_score, confusion_matrix, classification_report
)
import matplotlib.pyplot as plt
import seaborn as sns

# Simulate model predictions on test set
y_true = [0, 1, 1, 0, 1, 0, 1, 1, 0, 0]  # True labels
y_pred = [0, 1, 1, 0, 0, 0, 1, 1, 0, 1]  # Model predictions
y_pred_proba = [0.1, 0.9, 0.8, 0.2, 0.4, 0.3, 0.85, 0.95, 0.15, 0.6]  # Probabilities

print("üìä Comprehensive Evaluation Metrics\n")
print("=" * 70)

# Calculate metrics
accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
roc_auc = roc_auc_score(y_true, y_pred_proba)

print(f"‚úì Classification Metrics:")
print(f"  Accuracy:  {accuracy:.2%}")
print(f"  Precision: {precision:.2%}")
print(f"  Recall:    {recall:.2%}")
print(f"  F1 Score:  {f1:.2%}")
print(f"  ROC-AUC:   {roc_auc:.2%}")

print(f"\n‚úì Classification Report:")
print(classification_report(y_true, y_pred, target_names=['Negative', 'Positive']))

# Confusion matrix
cm = confusion_matrix(y_true, y_pred)
print(f"\n‚úì Confusion Matrix:")
print(f"           Predicted")
print(f"           Neg  Pos")
print(f"Actual Neg  {cm[0,0]}    {cm[0,1]}")
print(f"       Pos  {cm[1,0]}    {cm[1,1]}")

üìä Comprehensive Evaluation Metrics

‚úì Classification Metrics:
  Accuracy:  80.00%
  Precision: 80.00%
  Recall:    80.00%
  F1 Score:  80.00%
  ROC-AUC:   96.00%

‚úì Classification Report:
              precision    recall  f1-score   support

    Negative       0.80      0.80      0.80         5
    Positive       0.80      0.80      0.80         5

    accuracy                           0.80        10
   macro avg       0.80      0.80      0.80        10
weighted avg       0.80      0.80      0.80        10


‚úì Confusion Matrix:
           Predicted
           Neg  Pos
Actual Neg  4    1
       Pos  1    4


---

# üöÄ Section 9: Building Production-Ready Applications

Now let's put everything together to build a real-world application.

## Production Considerations

| Aspect | Consideration | Best Practice |
|--------|---------------|---|
| **Latency** | Model inference time | Use distilled or quantized models for real-time apps |
| **Throughput** | Requests per second | Batch processing, GPU acceleration |
| **Memory** | RAM/VRAM usage | Use smaller models, quantization |
| **Reliability** | Error handling | Graceful degradation, fallbacks |
| **Versioning** | Model updates | Track model versions, blue-green deployments |
| **Monitoring** | Performance tracking | Log metrics, track drift |
| **Scaling** | Handle load spikes | Containers, serverless, load balancing |

## Example: Complete Sentiment Analysis API

In [27]:
print("üèóÔ∏è Production-Ready Application Template\n")
print("=" * 70)

# Production application structure
app_code = """
# app.py - Production application structure

from fastapi import FastAPI
from transformers import pipeline
from pydantic import BaseModel
import logging

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = FastAPI()

# Load model once at startup (not per request!)
sentiment_pipeline = pipeline("sentiment-analysis")

class TextInput(BaseModel):
    text: str
    
class PredictionOutput(BaseModel):
    text: str
    sentiment: str
    confidence: float
    
@app.post("/predict", response_model=PredictionOutput)
async def predict(input_data: TextInput):
    try:
        result = sentiment_pipeline(input_data.text)
        return PredictionOutput(
            text=input_data.text,
            sentiment=result[0]["label"],
            confidence=result[0]["score"]
        )
    except Exception as e:
        logger.error(f"Prediction error: {e}")
        raise

@app.get("/health")
async def health():
    return {"status": "healthy"}

# Run with: uvicorn app:app --host 0.0.0.0 --port 8000
"""

print(app_code)

print("\n‚úì Key Production Features:")
print("  - Load model once at startup (not per request)")
print("  - Use FastAPI for efficient async handling")
print("  - Proper error handling and logging")
print("  - Health check endpoint for monitoring")
print("  - Input validation with Pydantic")
print("  - Easy horizontal scaling with containers")

üèóÔ∏è Production-Ready Application Template


# app.py - Production application structure

from fastapi import FastAPI
from transformers import pipeline
from pydantic import BaseModel
import logging

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = FastAPI()

# Load model once at startup (not per request!)
sentiment_pipeline = pipeline("sentiment-analysis")

class TextInput(BaseModel):
    text: str

class PredictionOutput(BaseModel):
    text: str
    sentiment: str
    confidence: float

@app.post("/predict", response_model=PredictionOutput)
async def predict(input_data: TextInput):
    try:
        result = sentiment_pipeline(input_data.text)
        return PredictionOutput(
            text=input_data.text,
            sentiment=result[0]["label"],
            confidence=result[0]["score"]
        )
    except Exception as e:
        logger.error(f"Prediction error: {e}")
        raise

@app.get("/health")
async def health():

In [28]:
print("\n" + "=" * 70)
print("üê≥ Deploying with Docker\n")

dockerfile_content = """
# Dockerfile

FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY app.py .

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
"""

print(dockerfile_content)

requirements = """
# requirements.txt
transformers==4.30.0
torch==2.0.0
fastapi==0.103.0
uvicorn==0.23.0
pydantic==2.0.0
"""

print(requirements)

print("‚úì Build and run:")
print("  docker build -t sentiment-app .")
print("  docker run -p 8000:8000 sentiment-app")


üê≥ Deploying with Docker


# Dockerfile

FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY app.py .

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]


# requirements.txt
transformers==4.30.0
torch==2.0.0
fastapi==0.103.0
uvicorn==0.23.0
pydantic==2.0.0

‚úì Build and run:
  docker build -t sentiment-app .
  docker run -p 8000:8000 sentiment-app


---

# üìö Course Summary & Next Steps

## What You've Learned

‚úÖ **Installation & Authentication** - Set up Hugging Face securely  
‚úÖ **Ecosystem Understanding** - Know the three pillars (Transformers, Datasets, Hub)  
‚úÖ **Model Loading** - Use Auto classes to load any model  
‚úÖ **Tokenization** - Convert text to numbers that models understand  
‚úÖ **Pipelines** - Quick-start solutions for common NLP tasks  
‚úÖ **Fine-tuning** - Adapt models to your specific domain  
‚úÖ **Hub Integration** - Share your work with the community  
‚úÖ **Evaluation** - Properly assess model performance  
‚úÖ **Production** - Deploy real-world applications  

## üéØ Your Next Steps

### Beginner
1. ‚ú® Explore the [Model Hub](https://huggingface.co/models) - Find models for your use case
2. üß™ Try different pipelines - Experiment with various NLP tasks
3. üìñ Read model cards - Understand model capabilities and limitations

### Intermediate
1. üìä Load a real dataset - Use the Datasets library
2. üîß Fine-tune a model - Train on your own data
3. üß™ Experiment with hyperparameters - Optimize model performance
4. üì§ Push to Hub - Share your model

### Advanced
1. üèóÔ∏è Build production apps - Deploy with FastAPI/Flask
2. ‚öôÔ∏è Implement MLOps - Version control, monitoring, CI/CD
3. üîó Combine multiple models - Create complex pipelines
4. üåç Multi-lingual applications - Work with models in different languages

## üìö Essential Resources

### Official Documentation
- [Hugging Face Docs](https://huggingface.co/docs) - Official documentation
- [Transformers Library](https://huggingface.co/docs/transformers/) - Core library
- [Datasets Library](https://huggingface.co/docs/datasets/) - Data loading
- [Hub Documentation](https://huggingface.co/docs/hub/) - Model sharing

### Learning Resources
- [Hugging Face Course](https://huggingface.co/course/) - Free online course
- [YouTube Channel](https://www.youtube.com/@HuggingFace) - Video tutorials
- [Community Forum](https://discuss.huggingface.co/) - Ask questions
- [Papers with Code](https://paperswithcode.com/) - Implementation references

### Popular Models to Try
- **BERT** - Understanding language (`bert-base-uncased`)
- **GPT-2** - Text generation (`gpt2`)
- **T5** - Multitask (`t5-base`)
- **DistilBERT** - Lightweight (`distilbert-base-uncased`)
- **RoBERTa** - Robust (`roberta-base`)
- **ALBERT** - Memory efficient (`albert-base-v2`)

## ü§ù Join the Community

- üí¨ **GitHub** - Contribute to Transformers
- üåê **Forum** - Discuss with 100,000+ members
- üê¶ **Twitter** - Follow [@huggingface](https://twitter.com/huggingface)
- üíº **LinkedIn** - Connect with professionals

## üìù Troubleshooting Tips

### Model Loading Issues
```python
# Clear cache if model won't load
import torch
torch.cuda.empty_cache()

# Use device map for large models
model = AutoModel.from_pretrained(
    "model_id",
    device_map="auto"  # Automatically split across GPUs
)
```

### Out of Memory
```python
# Use a smaller model variant
model = AutoModel.from_pretrained("distilbert-base-uncased")

# Or quantize
model = AutoModel.from_pretrained(
    "model_id",
    load_in_8bit=True  # 8-bit quantization
)
```

### Slow Training
```python
# Use mixed precision
training_args = TrainingArguments(
    fp16=True,  # If GPU supports it
    bf16=False
)

# Use gradient accumulation
training_args = TrainingArguments(
    gradient_accumulation_steps=4
)
```

---

## üéì Congratulations!

You've completed this comprehensive Hugging Face course! You now have the knowledge to:
- Understand state-of-the-art NLP models
- Use pre-trained models for quick solutions
- Fine-tune models for your specific tasks
- Share your work with the community
- Deploy production-ready applications

The NLP field is rapidly evolving. Stay curious, experiment, and contribute back to the community!

### Share Your Projects
- Tag [@huggingface](https://twitter.com/huggingface) on Twitter
- Share on [Hugging Face Hub](https://huggingface.co)
- Join the [Discord community](https://discord.gg/JfAtqEZZVe)

Happy learning! üöÄ