In [None]:
!pip install datasets huggingface_hub

In [None]:
# Option 1: Log in via the CLI (run in a terminal, or uncomment the line below to run it from the notebook)
# !huggingface-cli login

# Option 2: Set token from environment variable or directly
import os
from huggingface_hub import login

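# Read the token from the HF_TOKEN environment variable.
# Note: os.getenv does not read a .env file by itself; export the variable
# or load the .env file into the environment before running this cell.
HF_TOKEN = os.getenv("HF_TOKEN", "")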

if HF_TOKEN:
    login(token=HF_TOKEN)
    print("✓ Logged in to Hugging Face")
else:
    print("⚠️  No HF_TOKEN found. Get one from: https://huggingface.co/settings/tokens")
    print("   Then either:")
    print("   1. Add to .env file: HF_TOKEN=your_token_here (and load it into the environment)")
    print("   2. Or run: !huggingface-cli login")

In [None]:
In this context, row is a single integer value from the df["label"] column, not an entire row of the DataFrame.

When you use .apply() on a pandas Series (a single column), it passes each individual element to the function one at a time. So row here represents one numeric label value (e.g., 0, 1, 2, 3, 4, or 5) from the "label" column of the DataFrame.

The parameter name row is somewhat misleading. It would be more accurate to call it label_value or label_int, since it is just a single integer, not an entire row.

Example flow:

If df["label"] contains [0, 1, 2, 0, 3, ...], the function gets called with label_int2str(0), then label_int2str(1), then label_int2str(2), and so on. Each call converts that single integer to its corresponding emotion string.
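
The element-by-element behavior described above can be checked on a small Series without downloading the dataset. This is only a minimal sketch: the LABEL_NAMES list is a stand-in assumed here in place of the dataset's int2str mapping, and label_int2str is redefined locally so that it records every argument it receives.

```python
import pandas as pd

# Assumed stand-in for emotions["train"].features["label"].int2str;
# the names and their order are an assumption made for illustration.
LABEL_NAMES = ["sadness", "joy", "love", "anger", "fear", "surprise"]

seen = []  # records every argument the function is called with


def label_int2str(row):
    # `row` is a single integer label, not a DataFrame row
    seen.append(row)
    return LABEL_NAMES[row]


labels = pd.Series([0, 1, 2, 0, 3], name="label")
label_names = labels.apply(label_int2str)

print(seen)                  # the individual integers, one per call
print(label_names.tolist())  # the corresponding label strings
```

Because seen ends up holding plain integers rather than row objects, the sketch suggests that a name such as label_int would describe the parameter better than row.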

head() is a pandas DataFrame method that returns the first few rows of a DataFrame.

Default behavior: head() returns the first 5 rows.

With a parameter: head(n) returns the first n rows.

Purpose: quick data inspection, verifying the data structure, checking column names and types, and previewing data after transformations.
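
As a quick illustration of the head() behavior described above, the following sketch uses a small toy DataFrame. Its columns and values are hypothetical and only stand in for the real dataset.

```python
import pandas as pd

# Hypothetical toy data with 8 rows, standing in for the real DataFrame
df = pd.DataFrame({
    "label": list(range(8)),
    "text": [f"example {i}" for i in range(8)],
})

first_five = df.head()    # no argument: the first 5 rows
first_three = df.head(3)  # head(n): the first n rows

print(first_five)
print(first_three)
```

head() returns a new DataFrame and does not modify df, so it is safe to call at any point to preview the result of a transformation.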

In [None]:
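from datasets import load_dataset
from huggingface_hub import list_datasets
import pandas as pd

# list_datasets() returns a lazy iterable in recent huggingface_hub releases,
# so materialize it once; calling list() on it a second time would yield an empty list.
# Note: this fetches metadata for every dataset on the Hub and can take a while.
all_datasets = list(list_datasets())
print(f"There are {len(all_datasets)} datasets currently available on the Hub")
print(f"The first 10 are: {[d.id for d in all_datasets[:10]]}")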

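# Load the emotion dataset; if the bare "emotion" id does not resolve,
# its namespaced id on the Hub is "dair-ai/emotion".
emotions = load_dataset("emotion")
# Make indexing return pandas objects instead of Python dicts
emotions.set_format(type="pandas")
df = emotions["train"][:]  # the full train split as a DataFrame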

def label_int2str(row):
    return emotions["train"].features["label"].int2str(row)

df["label_name"] = df["label"].apply(label_int2str)
df.head()