# Downloading Hugging Face Models Locally

This notebook walks through downloading models from Hugging Face to your local machine using the `huggingface-cli` command-line tool and the `huggingface_hub` Python API.

## Table of Contents
1. Installing Required Libraries
2. Creating a Hugging Face Account and Getting an Access Token
3. Logging in with huggingface-cli
4. Downloading Models
5. Using Downloaded Models

## 1. Installing Required Libraries

First, we need to install the Hugging Face Hub library which includes the CLI tool.

In [None]:
# Install the huggingface_hub library
!pip install huggingface_hub

Depending on what you want to do with the models, you may also need:
- `transformers` - for working with transformer models
- `torch` or `tensorflow` - for running the models
- `diffusers` - for working with diffusion models (like Stable Diffusion)
- `datasets` - for working with Hugging Face datasets

In [None]:
# Install additional common libraries (optional, based on your needs)
!pip install transformers torch datasets
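
Before installing anything, it can help to check which of these packages are already present. The sketch below uses only the standard library; `installed_versions` is an illustrative helper name, not part of any Hugging Face package, and the package list simply mirrors the names mentioned above.

```python
from importlib import metadata

# Package names as used with pip; adjust the list to your needs.
PACKAGES = ["huggingface_hub", "transformers", "torch", "tensorflow", "diffusers", "datasets"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

Anything reported as "not installed" can then be added to the `pip install` line as needed.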

## 2. Creating a Hugging Face Account and Getting an Access Token

### Steps to get your access token:

1. **Create an account** (if you don't have one):
   - Go to [huggingface.co](https://huggingface.co)
   - Click "Sign Up" and create your account

2. **Generate an access token**:
   - Log in to your Hugging Face account
   - Click on your profile picture (top right)
   - Go to **Settings** → **Access Tokens**
   - Click **"New token"**
   - Give it a name (e.g., "local-downloads")
   - Select the token type:
     - **Read**: For downloading public and private models you have access to
     - **Write**: If you also want to upload models
   - Click **"Generate token"**
   - **Important**: Copy the token immediately - you won't be able to see it again!

### Token Types:
- **Read tokens**: Can download models and datasets
- **Write tokens**: Can upload and modify models and datasets
- **Fine-grained tokens**: Custom permissions for specific repositories

## 3. Logging in with huggingface-cli

There are two main ways to authenticate with Hugging Face:

### Method 1: Using the Command Line (Recommended)

Open your terminal and run:

```bash
huggingface-cli login
```

You'll be prompted to:
1. Enter your access token
2. Choose whether to add the token as a git credential (recommended: yes)

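The token is saved to a local file in your Hugging Face home directory (by default `~/.cache/huggingface/token`), so treat that file like a password.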

In [None]:
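# You can also run it from within the notebook.
# The interactive prompt may not work in every notebook front end;
# if the cell hangs, use the programmatic login (Method 2) instead.
!huggingface-cli login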

### Method 2: Programmatic Login (Alternative)

You can also log in directly from Python code:

In [None]:
from huggingface_hub import login

# Option 1: Interactive - will prompt for token
login()

# Option 2: Pass token directly (not recommended for notebooks you'll share)
# login(token="your_token_here")
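
To keep the token out of a notebook you plan to share, one option is to read it from an environment variable. The sketch below assumes the token was exported as `HF_TOKEN` (a variable that `huggingface_hub` also reads on its own); the helpers `read_token` and `mask` are illustrative only and not part of the library.

```python
import os


def read_token(env=os.environ, var_name="HF_TOKEN"):
    """Return the token stored in an environment variable, or None if unset or blank."""
    token = env.get(var_name, "").strip()
    return token or None


def mask(token):
    """Replace all but the last four characters with asterisks for safe printing."""
    return "*" * max(len(token) - 4, 0) + token[-4:]


token = read_token()
if token is None:
    print("HF_TOKEN is not set; interactive login() is still an option.")
else:
    print(f"Found token: {mask(token)}")
    # With huggingface_hub installed, the token could then be passed on:
    # from huggingface_hub import login
    # login(token=token)
```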

### Verify Your Login

In [None]:
from huggingface_hub import whoami

# Check if you're logged in
try:
    user_info = whoami()
    print(f"Successfully logged in as: {user_info['name']}")
    print(f"Email: {user_info.get('email', 'N/A')}")
except Exception as e:
    print(f"Not logged in or error occurred: {e}")

## 4. Downloading Models

### Finding Models on Hugging Face

1. Go to [huggingface.co/models](https://huggingface.co/models)
2. Browse or search for models
3. Filter by:
   - Task (text generation, image classification, etc.)
   - Library (transformers, diffusers, etc.)
   - Language
   - License
4. Click on a model to see its page
5. Copy the model ID (e.g., `bert-base-uncased`, `meta-llama/Llama-2-7b-hf`)

### Method 1: Download Using CLI

In [None]:
# Download every file in a model repository into the local cache
!huggingface-cli download bert-base-uncased

# Any other model ID works the same way, for example GPT-2
!huggingface-cli download gpt2
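
The CLI also accepts options that narrow a download. As a sketch, the hypothetical helper `build_download_command` below (not part of `huggingface_hub`) assembles an argument list using the `--include`, `--revision`, and `--local-dir` options of `huggingface-cli download`. It only builds the command and does not run it; check `huggingface-cli download --help` for the options your installed version supports.

```python
import shlex


def build_download_command(repo_id, include=None, revision=None, local_dir=None):
    """Build the argument list for `huggingface-cli download` without running it."""
    cmd = ["huggingface-cli", "download", repo_id]
    if include:
        # Several glob patterns can follow a single --include flag
        cmd += ["--include", *include]
    if revision:
        cmd += ["--revision", revision]
    if local_dir:
        cmd += ["--local-dir", local_dir]
    return cmd


cmd = build_download_command(
    "gpt2",
    include=["*.json", "*.txt"],
    local_dir="./my_models/gpt2",
)
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems with glob patterns such as `*.json`.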

### Method 2: Download Using Python API

In [None]:
from huggingface_hub import hf_hub_download, snapshot_download

# Download a single file from a model
file_path = hf_hub_download(
    repo_id="bert-base-uncased",
    filename="config.json"
)
print(f"Downloaded file to: {file_path}")

# Download entire model repository
model_path = snapshot_download(
    repo_id="gpt2"
)
print(f"Downloaded model to: {model_path}")

### Download Specific Files or Revisions

In [None]:
from huggingface_hub import snapshot_download

# Download only specific file patterns
model_path = snapshot_download(
    repo_id="gpt2",
    allow_patterns=["*.json", "*.txt"],  # Only download JSON and TXT files
)

# Download a specific revision (branch or tag)
model_path = snapshot_download(
    repo_id="gpt2",
    revision="main"  # or a specific commit hash
)

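# Download to a specific local directory
# Note: local_dir_use_symlinks is deprecated in recent huggingface_hub releases,
# where files placed in local_dir are always regular files; omit it there.
model_path = snapshot_download(
    repo_id="gpt2",
    local_dir="./my_models/gpt2",
    local_dir_use_symlinks=False  # Copy files instead of using symlinks
)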
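
To get a feel for which files a pattern list selects, the sketch below applies glob patterns to a made-up file listing with the standard library's `fnmatch`. The file names are hypothetical and `select_files` is an illustrative helper; it approximates the filtering done by `allow_patterns` and `ignore_patterns` rather than reproducing the library's exact implementation.

```python
from fnmatch import fnmatch


def select_files(filenames, allow_patterns=None, ignore_patterns=None):
    """Keep names that match any allow pattern and no ignore pattern."""
    selected = []
    for name in filenames:
        if allow_patterns is not None and not any(fnmatch(name, p) for p in allow_patterns):
            continue
        if ignore_patterns is not None and any(fnmatch(name, p) for p in ignore_patterns):
            continue
        selected.append(name)
    return selected


# Hypothetical repository listing, for illustration only
repo_files = ["config.json", "vocab.json", "merges.txt", "model.safetensors", "README.md"]

only_text = select_files(repo_files, allow_patterns=["*.json", "*.txt"])
no_weights = select_files(repo_files, ignore_patterns=["*.safetensors"])
print(only_text)
print(no_weights)
```

Filtering out the weight files this way is a cheap way to inspect a repository's configuration before committing to a multi-gigabyte download.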

### Where Are Models Downloaded?

By default, models are cached in:
- **Linux/Mac**: `~/.cache/huggingface/hub/`
- **Windows**: `C:\Users\<username>\.cache\huggingface\hub\`

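You can change this by setting the `HF_HOME` environment variable, which moves the whole Hugging Face directory (the model cache is its `hub` subfolder). To move only the model cache, set `HF_HUB_CACHE` instead.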

In [None]:
import os
from pathlib import Path

# Check your cache directory (the model cache is the "hub" folder inside HF_HOME)
hf_home = Path(os.getenv('HF_HOME', Path.home() / '.cache' / 'huggingface'))
cache_dir = hf_home / 'hub'
print(f"Hugging Face cache directory: {cache_dir}")

# To change it (set before importing huggingface_hub):
# os.environ['HF_HOME'] = '/path/to/your/cache'
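
The lookup order can be sketched with the standard library alone. The version below is a simplification under stated assumptions: it checks `HF_HUB_CACHE` first, then `HF_HOME`, then the default location, and it ignores other variables such as `XDG_CACHE_HOME`. The helper names `resolve_hub_cache` and `repo_cache_folder` are illustrative; the `models--<org>--<name>` folder naming reflects the layout the hub cache uses on disk.

```python
import os
from pathlib import Path


def resolve_hub_cache(env):
    """Resolve the hub cache directory from a mapping of environment variables."""
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    return Path.home() / ".cache" / "huggingface" / "hub"


def repo_cache_folder(repo_id, repo_type="model"):
    """Folder name used inside the hub cache, e.g. models--org--name."""
    return f"{repo_type}s--" + repo_id.replace("/", "--")


hub_cache = resolve_hub_cache(os.environ)
print(hub_cache / repo_cache_folder("gpt2"))
```

Passing the environment in as a mapping makes it easy to try out different settings without modifying the real environment.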

## 5. Using Downloaded Models

Once models are downloaded, you can use them with the transformers library:

In [None]:
from transformers import AutoTokenizer, AutoModel

# Load a model - it will use the cached version if already downloaded
model_name = "bert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

print(f"Successfully loaded {model_name}")
print(f"Model has {model.num_parameters():,} parameters")

### Example: Text Generation with GPT-2

In [None]:
from transformers import pipeline

# Create a text generation pipeline
generator = pipeline('text-generation', model='gpt2')

# Generate text
result = generator(
    "Artificial intelligence is",
    max_length=50,
    num_return_sequences=1
)

print(result[0]['generated_text'])

### Using a Local Model Path

If you've downloaded a model to a specific directory, you can load it directly:

In [None]:
import os
from transformers import AutoModel, AutoTokenizer

# Load from a local directory
local_model_path = "./my_models/gpt2"

# Make sure the path exists first
if os.path.exists(local_model_path):
    tokenizer = AutoTokenizer.from_pretrained(local_model_path)
    model = AutoModel.from_pretrained(local_model_path)
    print("Model loaded from local path")
else:
    print(f"Path {local_model_path} does not exist")
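
An existing directory is not necessarily a complete download. As a minimal sketch, the check below looks for `config.json`, on the assumption that the directory holds a Transformers-style checkpoint; `missing_model_files` is an illustrative helper and the list of required files is something to adapt to your model.

```python
from pathlib import Path


def missing_model_files(model_dir, required=("config.json",)):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


missing = missing_model_files("./my_models/gpt2")
if missing:
    print(f"Directory is missing: {', '.join(missing)}")
else:
    print("Directory looks like a model checkpoint")
```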

## Additional Tips

### Downloading Gated Models

Some models (like Llama 2) are gated. Before you can download them, you must:
1. Accept the license on the model's Hugging Face page
2. Authenticate with a token that has read access to the model

### Managing Cache

To see what's in your cache and manage it:

In [None]:
from huggingface_hub import scan_cache_dir

# Scan your cache
cache_info = scan_cache_dir()

print(f"Total cache size: {cache_info.size_on_disk / (1024**3):.2f} GB")
print(f"Number of repos: {len(cache_info.repos)}")

# List all cached repos
for repo in cache_info.repos:
    print(f"- {repo.repo_id}: {repo.size_on_disk / (1024**2):.2f} MB")

### Delete Unused Cache

In [None]:
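from huggingface_hub import scan_cache_dir

# Get cache info
cache_info = scan_cache_dir()

# delete_revisions() expects revision commit hashes, not a model name,
# so collect the hashes of every cached revision of the repo first.
# Uncomment and set repo_id to use:
# repo_id = "model-name"
# hashes = [
#     rev.commit_hash
#     for repo in cache_info.repos if repo.repo_id == repo_id
#     for rev in repo.revisions
# ]
# delete_strategy = cache_info.delete_revisions(*hashes)
# print(f"Will free {delete_strategy.expected_freed_size_str}")
# delete_strategy.execute()

print("Cache management commands ready to use")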

## Summary

In this notebook, you've learned:
1. How to install the necessary Hugging Face libraries
2. How to create an account and get an access token
3. How to log in using `huggingface-cli login`
4. Multiple methods to download models from Hugging Face
5. How to use downloaded models in your projects

## Resources

- [Hugging Face Hub Documentation](https://huggingface.co/docs/huggingface_hub/index)
- [Transformers Documentation](https://huggingface.co/docs/transformers/index)
- [Hugging Face Models](https://huggingface.co/models)
- [Hugging Face CLI Guide](https://huggingface.co/docs/huggingface_hub/guides/cli)