# 01a Environment Variables (Hugging Face)

This notebook validates Hugging Face access without using the CLI. It relies on the `HF_TOKEN` environment variable to authenticate and then performs a small, controlled download check.

Run order:
1. Set `HF_TOKEN` (and optional test model env vars).
2. Validate token + account access.
3. Run the model/tokenizer download test.


## 1) Set `HF_TOKEN` (no CLI)

You must create a Hugging Face access token and set it as an environment variable **before** running the auth check.

### Option A: Set in your shell

```bash
export HF_TOKEN="your_hf_token_here"
```

### Option B: Set inside the notebook (temporary)

Use the next cell to paste your token (input is hidden). This sets `HF_TOKEN` for the current kernel only.

Notes:
- Make sure you have accepted any gated model licenses you plan to download.
- Do **not** print or share your token.


In [2]:
import os  # Python standard library (environment variables)
from getpass import getpass  # Python standard library (hidden input)


def get_hf_token() -> str:
    """Return a usable Hugging Face token from env or a hidden prompt.

    Raises:
        ValueError: If the token is missing or empty after prompting.
    """
    token = os.environ.get("HF_TOKEN")
    if token:
        return token

    print("HF_TOKEN not found in environment.")
    token = getpass("Paste HF_TOKEN (input hidden): ").strip()
    if not token:
        raise ValueError("HF_TOKEN is required to authenticate with Hugging Face.")

    # Set it for this kernel session only (not persisted to your shell).
    os.environ["HF_TOKEN"] = token
    return token


hf_token = get_hf_token()
print("HF_TOKEN loaded (masked):", f"{hf_token[:4]}...{hf_token[-4:]}")

HF_TOKEN not found in environment.
HF_TOKEN loaded (masked): hf_Z...Ssdi
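
The masked print above still reveals eight characters of the token. If the notebook output may be shared, a stricter summary is one option. The sketch below is not part of the original notebook; `mask_token` and its `visible_prefix` parameter are hypothetical names chosen for illustration.

```python
def mask_token(token: str, visible_prefix: int = 3) -> str:
    """Return a display-safe summary: a short prefix plus the total length.

    Hypothetical helper (an assumption, not from the original notebook).
    """
    if not token:
        return "<empty>"
    # Show at most a quarter of the token so short values stay hidden.
    shown = min(visible_prefix, len(token) // 4)
    return f"{token[:shown]}... ({len(token)} chars)"


# Example with a made-up value (not a real token).
print("HF_TOKEN loaded (masked):", mask_token("hf_abcdefghijklmnop"))
```

Reporting the length instead of the trailing characters still confirms that a full token was pasted while exposing less of it.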


In [3]:
from huggingface_hub import whoami  # huggingface_hub package (HF auth utilities)

try:
    user_info = whoami(token=hf_token)
    print("Hugging Face authentication: ✓")
    print(f"Logged in as: {user_info.get('name', 'Unknown')}")
except Exception as exc:
    print("Hugging Face authentication: ✗")
    print(f"Error: {exc}")
    raise



Hugging Face authentication: ✓
Logged in as: goblevsp


## 2) Configure the download test

To avoid hardcoding a model ID in the notebook, set `HF_TEST_MODEL_ID` to a small, public model ID in your environment.

### Option A: Set in your shell

```bash
export HF_TEST_MODEL_ID="<your-small-public-model-id>"
```

If you use the shell approach, make sure you **launch Jupyter from the same shell** so the kernel inherits the environment variables.
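
To confirm that the kernel really inherited the variables, a quick check from inside the notebook can help. This is a minimal sketch; `missing_env_vars` is a hypothetical helper, and it reports only variable names, never values.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_env_vars(
    names: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the names that are unset or blank in the given environment.

    Hypothetical helper (an assumption, not from the original notebook).
    """
    env = os.environ if env is None else env
    # Treat whitespace-only values as missing; never return the values themselves.
    return [name for name in names if not (env.get(name) or "").strip()]


missing = missing_env_vars(["HF_TOKEN", "HF_TEST_MODEL_ID"])
print("Missing in this kernel:", ", ".join(missing) if missing else "none")
```

If a variable shows up as missing even though it was exported, the kernel was probably started from a different shell or before the `export` ran.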

### Option B: Set inside the notebook (temporary)

Use the next cell to set `HF_TEST_MODEL_ID` for this kernel session only.

Optional:
- Full model weights are downloaded by default. Set `HF_DOWNLOAD_WEIGHTS=false` to skip them.
- When weights are skipped, the test only downloads the model config + tokenizer.


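Note: in VS Code, the input prompt for the next cell appears at the top of the window. Paste the model ID there, for example `google/gemma-3-1b-it`. Gemma models are gated, so accept the license on the model page first.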

In [4]:
import os  # Python standard library (environment variables)


def get_test_model_id() -> str:
    """Return a test model ID from env or prompt the user.

    Raises:
        ValueError: If the model ID is missing or empty after prompting.
    """
    model_id = os.environ.get("HF_TEST_MODEL_ID")
    if model_id:
        return model_id

    print("HF_TEST_MODEL_ID not found in environment.")
    model_id = input("Enter a small public model ID: ").strip()
    if not model_id:
        raise ValueError("HF_TEST_MODEL_ID is required for the download test.")

    # Set it for this kernel session only (not persisted to your shell).
    os.environ["HF_TEST_MODEL_ID"] = model_id
    return model_id


test_model_id = get_test_model_id()
print("HF_TEST_MODEL_ID set for this kernel:", test_model_id)

HF_TEST_MODEL_ID not found in environment.
HF_TEST_MODEL_ID set for this kernel: google/gemma-3-1b-it
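
`get_hf_token` and `get_test_model_id` follow the same pattern: read an environment variable, fall back to a prompt, then store the value for the kernel session. One way to factor that out is sketched below. `get_env_or_prompt` and its `ask` and `env` parameters are hypothetical names, and unlike the original cells this version also strips whitespace from values read from the environment.

```python
import os
from typing import Callable, MutableMapping, Optional


def get_env_or_prompt(
    name: str,
    prompt: str,
    ask: Callable[[str], str] = input,
    env: Optional[MutableMapping[str, str]] = None,
) -> str:
    """Return env[name] if set, else prompt for it and store it in env.

    Hypothetical helper (an assumption, not from the original notebook).

    Raises:
        ValueError: If the value is still empty after prompting.
    """
    env = os.environ if env is None else env
    value = (env.get(name) or "").strip()
    if value:
        return value

    value = ask(prompt).strip()
    if not value:
        raise ValueError(f"{name} is required.")

    # Stored for this process only (not persisted to your shell).
    env[name] = value
    return value


# Possible usage (commented out so the sketch does not prompt when run):
# from getpass import getpass
# hf_token = get_env_or_prompt("HF_TOKEN", "Paste HF_TOKEN (input hidden): ", ask=getpass)
# test_model_id = get_env_or_prompt("HF_TEST_MODEL_ID", "Enter a small public model ID: ")
```

Passing `getpass` as `ask` keeps the token input hidden, while the default `input` suits non-secret values such as the model ID.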


In [5]:
import os  # Python standard library (environment variables)

from transformers import AutoConfig, AutoModel, AutoTokenizer  # transformers package (model utilities)

model_id = globals().get("test_model_id") or os.environ.get("HF_TEST_MODEL_ID")
if not model_id:
    raise ValueError(
        "HF_TEST_MODEL_ID is not set. Set it to a small public model ID and rerun."
    )

download_weights = os.environ.get("HF_DOWNLOAD_WEIGHTS", "true").lower() == "true"

print(f"Download test model_id: {model_id}")
print(f"Download full weights: {download_weights}")

try:
    config = AutoConfig.from_pretrained(model_id, token=hf_token)
    tokenizer = AutoTokenizer.from_pretrained(model_id, token=hf_token)
    print("Config + tokenizer download: ✓")

    if download_weights:
        _ = AutoModel.from_pretrained(model_id, token=hf_token)
        print("Model weights download: ✓")
    else:
        print("Model weights download: skipped (set HF_DOWNLOAD_WEIGHTS=true to enable)")
except Exception as exc:
    print("Download test failed.")
    print(f"Error: {exc}")
    raise

Download test model_id: google/gemma-3-1b-it
Download full weights: True
Config + tokenizer download: ✓


Some weights of Gemma3TextModel were not initialized from the model checkpoint at google/gemma-3-1b-it and are newly initialized: ['embed_tokens.weight', 'layers.0.input_layernorm.weight', 'layers.0.mlp.down_proj.weight', 'layers.0.mlp.gate_proj.weight', 'layers.0.mlp.up_proj.weight', 'layers.0.post_attention_layernorm.weight', 'layers.0.post_feedforward_layernorm.weight', 'layers.0.pre_feedforward_layernorm.weight', 'layers.0.self_attn.k_norm.weight', 'layers.0.self_attn.k_proj.weight', 'layers.0.self_attn.o_proj.weight', 'layers.0.self_attn.q_norm.weight', 'layers.0.self_attn.q_proj.weight', 'layers.0.self_attn.v_proj.weight', 'layers.1.input_layernorm.weight', 'layers.1.mlp.down_proj.weight', 'layers.1.mlp.gate_proj.weight', 'layers.1.mlp.up_proj.weight', 'layers.1.post_attention_layernorm.weight', 'layers.1.post_feedforward_layernorm.weight', 'layers.1.pre_feedforward_layernorm.weight', 'layers.1.self_attn.k_norm.weight', 'layers.1.self_attn.k_proj.weight', 'layers.1.self_attn.o_pr

Model weights download: ✓
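
The download cell compares `HF_DOWNLOAD_WEIGHTS` against the exact string `true`, so values such as `1` or `yes` silently count as false. A more forgiving parser is sketched below. `env_flag` and the accepted spellings are assumptions made for illustration, not a convention defined by the Hugging Face libraries.

```python
import os
from typing import Mapping, Optional

# Accepted spellings are an assumption of this sketch.
TRUTHY = {"1", "true", "yes", "on"}
FALSY = {"0", "false", "no", "off"}


def env_flag(
    name: str, default: bool = False, env: Optional[Mapping[str, str]] = None
) -> bool:
    """Parse a boolean environment variable, failing loudly on unknown values.

    Hypothetical helper (an assumption, not from the original notebook).
    """
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or not raw.strip():
        return default

    value = raw.strip().lower()
    if value in TRUTHY:
        return True
    if value in FALSY:
        return False
    raise ValueError(f"{name} must be one of {sorted(TRUTHY | FALSY)}, got {raw!r}")


# Example with an explicit mapping instead of the real environment.
print(env_flag("HF_DOWNLOAD_WEIGHTS", default=True, env={"HF_DOWNLOAD_WEIGHTS": "no"}))
```

Raising on unrecognized values avoids the case where a typo such as `ture` quietly disables the weights download.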
