A comprehensive, production-ready development environment for working with Hugging Face large language models, transformers, and AI applications. Features safe model downloading with interrupt handling, interactive web interfaces, and a full set of supporting tools.
python quickstart.py

Access all features through an easy-to-use menu system.

python model_explorer.py

Browse, download, and manage models with Ctrl+C-safe interrupts.
# Basic text generation
python main.py --prompt "The future of AI is" --model gpt2
# Web interface
python examples/gradio_interface.py
# Setup everything
./setup.sh

huggingface/
├── README.md                  # This comprehensive guide
├── requirements.txt           # Python dependencies
├── quickstart.py              # Interactive menu system
├── model_explorer.py          # Safe model browser & downloader
├── test_download_safety.py    # Download safety testing
├── main.py                    # CLI text generation
├── setup.sh                   # Automated setup script
├── .env.template              # Environment variables template
├── MODEL_GUIDE.md             # Model recommendations & info
├── DOWNLOAD_SAFETY.md         # Interrupt handling documentation
├── config/
│   └── config.json            # Model configurations
├── examples/
│   ├── text_generation.py     # Text generation examples
│   ├── model_loading.py       # Model loading & comparison
│   └── gradio_interface.py    # Interactive web interfaces
├── utils/
│   ├── config.py              # Configuration management
│   └── model_utils.py         # Utility functions & analysis
├── models/                    # Model cache directory
└── results/                   # Output and results directory
./setup.sh

Automated setup with dependency checking and safety verification.
- Python 3.8+
- 8GB+ RAM (16GB+ recommended for larger models)
- Optional: NVIDIA GPU with CUDA support
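To sanity-check these prerequisites before installing anything, a minimal sketch (standard library only, plus PyTorch if it happens to be present already):

```python
# Quick prerequisite check (a sketch; adjust to your environment)
import sys

assert sys.version_info >= (3, 8), "Python 3.8+ required"
try:
    import torch  # only available after installing requirements
    print(f"CUDA available: {torch.cuda.is_available()}")
except ImportError:
    print("PyTorch not installed yet; see the setup steps below.")
```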
- Install Python Dependencies

  pip install -r requirements.txt

- Configure Environment

  # Copy the environment template
  cp .env.template .env
  # Edit .env file with your settings
  nano .env

- Essential Environment Variables

  # Get your token from https://huggingface.co/settings/tokens
  HUGGINGFACE_HUB_TOKEN=your_token_here
  # Optional: For experiment tracking
  WANDB_API_KEY=your_wandb_key_here
  # Device configuration
  DEVICE=auto  # or cpu, cuda, cuda:0, etc.

- Test Installation

  python -c "from transformers import pipeline; print('✅ Installation successful!')"
- Interrupt-safe downloading - Press Ctrl+C anytime without corruption
- Automatic resume capability - Interrupted downloads continue seamlessly
- Integrity verification - Ensure downloaded models are complete
- Smart caching - No duplicate downloads, efficient storage
- Web-based UI with Gradio for all AI tasks
- Interactive menu system for easy navigation
- Model comparison tools for side-by-side testing
- Real-time generation with parameter controls
- Text Generation - GPT-style language models
- Text Classification - Sentiment, emotion, topic analysis
- Question Answering - Context-based Q&A systems
- Named Entity Recognition - Extract entities from text
- Text Summarization - Automatic content summarization
- Quantization support for large models (4-bit, 8-bit)
- Multi-GPU support with automatic device mapping
- Experiment tracking with Weights & Biases integration
- Configuration management with environment variables
- Model benchmarking and performance analysis
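All of the tasks listed above are reachable through the standard transformers pipeline API; a minimal sketch (the model choices here are illustrative, not the workspace's required defaults):

```python
from transformers import pipeline

# One pipeline per task; transformers picks a default model if none is given
generator = pipeline("text-generation", model="gpt2")
classifier = pipeline("sentiment-analysis")
qa = pipeline("question-answering")
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

print(generator("The future of AI is", max_length=30)[0]["generated_text"])
print(classifier("I love this library!")[0])
print(qa(question="What does the pipeline do?",
         context="This pipeline answers questions about a given context."))
```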
python quickstart.py

One-stop access to all features with guided workflows.

python model_explorer.py

Browse 100,000+ models and download safely with Ctrl+C interrupt support.

python examples/gradio_interface.py

A complete AI playground with text generation, classification, Q&A, and more.
# Basic generation
python main.py --prompt "The future of AI is" --model gpt2
# Advanced generation with parameters
python main.py --prompt "Once upon a time" --max-length 150 --temperature 0.8# Browse and download models safely
python model_explorer.py
# Check model integrity
python -c "
from model_explorer import ModelExplorer
explorer = ModelExplorer()
result = explorer.check_download_integrity('gpt2')
print('Complete:', result.get('complete', False))
"
# View available models
python -c "from utils.config import config; print(config.get_model_list('text_generation'))"# Model comparison
python examples/text_generation.py
# Comprehensive model analysis
python examples/model_loading.py
# Performance benchmarking
python utils/model_utils.py

- Interactive Web Interface

  python examples/gradio_interface.py --port 7860

- Model Analysis

  python utils/model_utils.py
Access the entire Hugging Face model hub through our safe download system:
python model_explorer.py

- openai-community/gpt2 - 10.6M downloads (550MB)
- Qwen/Qwen2.5-7B-Instruct - 7.6M downloads (~14GB)
- meta-llama/Llama-3.1-8B-Instruct - 5.2M downloads (~16GB)
- distilbert-base-uncased-finetuned-sst-2-english - 5.2M downloads
- facebook/bart-large-cnn - 2.7M downloads
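Under the hood, this kind of Hub browsing can be done with the huggingface_hub client; a hedged sketch (model_explorer.py's actual implementation may differ):

```python
from huggingface_hub import list_models

# List the most-downloaded text-generation models on the Hub
for model in list_models(filter="text-generation", sort="downloads", direction=-1, limit=5):
    print(model.id, model.downloads)
```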
# Download these first for testing
gpt2 # 550MB - Basic text generation
distilgpt2 # 350MB - Faster text generation
t5-small # 242MB - Multi-task model
distilbert-base-uncased # 268MB - Classification tasks

# Upgrade to these for better performance
gpt2-medium # 1.5GB - Better text quality
t5-base # 892MB - Better summarization
facebook/bart-large-cnn # 1.6GB - News summarization

- Text Generation: GPT-2, GPT-Neo, OPT, Qwen, LLaMA
- Classification: BERT, RoBERTa, DistilBERT variants
- Question Answering: BERT, DeBERTa, RoBERTa fine-tuned
- Summarization: BART, T5, Pegasus
- Translation: mBART, Helsinki-NLP models
- NER: BERT, spaCy, domain-specific models
See MODEL_GUIDE.md for comprehensive model recommendations.
The Gradio interface (examples/gradio_interface.py) provides:
- Text Generation: Interactive text generation with parameter controls
- Text Classification: Sentiment and emotion analysis
- Question Answering: Ask questions about any text
- Model Comparison: Compare outputs from multiple models
- Real-time Results: Instant feedback and results
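For orientation, a stripped-down version of such an interface might look like the sketch below; the real examples/gradio_interface.py is more featureful:

```python
import gradio as gr
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

def generate(prompt, temperature, max_length):
    # do_sample=True is required for temperature to take effect
    out = generator(prompt, temperature=temperature,
                    max_length=int(max_length), do_sample=True)
    return out[0]["generated_text"]

demo = gr.Interface(
    fn=generate,
    inputs=[
        gr.Textbox(label="Prompt"),
        gr.Slider(0.1, 1.5, value=0.8, label="Temperature"),
        gr.Slider(20, 200, value=50, step=10, label="Max length"),
    ],
    outputs=gr.Textbox(label="Generated text"),
)
demo.launch()  # pass share=True for a public link
```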
# Launch all interfaces
python examples/gradio_interface.py
# Launch specific interface
python examples/gradio_interface.py --interface generation
python examples/gradio_interface.py --interface classification
python examples/gradio_interface.py --interface qa
# Share publicly (creates public link)
python examples/gradio_interface.py --share

- .env - Environment variables and API keys
- config/config.json - Model lists, default parameters, hardware requirements
from utils.config import config
# Print current configuration
config.print_summary()
# Get model lists
text_models = config.get_model_list("text_generation")
classification_models = config.get_model_list("classification")
# Get default parameters
gen_params = config.get_generation_params()

import torch
from transformers import BitsAndBytesConfig, AutoModelForCausalLM

# 4-bit quantization configuration
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4"
)
model = AutoModelForCausalLM.from_pretrained(
    "gpt2-large",
    quantization_config=quantization_config
)

from utils.model_utils import benchmark_model, print_model_summary
# Analyze a model
print_model_summary("gpt2")
# Benchmark model performance
results = benchmark_model("gpt2", num_runs=3)

from examples.model_loading import ModelLoader
loader = ModelLoader()
model, tokenizer = loader.load_causal_lm("gpt2", quantize=True)
# Get model information
info = loader.get_model_info("gpt2")
print(f"Model size: {info['model_size_mb']:.1f} MB")import wandb
from utils.config import config
# Initialize W&B (configure WANDB_API_KEY in .env)
wandb.init(project="huggingface-experiments")
# Log model outputs
wandb.log({"generated_text": generated_text, "prompt": prompt})from utils.model_utils import save_results, load_results
# Save experiment results
results = {"model": "gpt2", "prompt": prompt, "output": output}
save_results(results, "experiment_1.json")
# Load previous results
previous_results = load_results("experiment_1.json")

- Use Quantization: Enable 4-bit or 8-bit quantization for large models
- GPU Memory: Monitor GPU memory usage with nvidia-smi
- Batch Processing: Process multiple inputs together when possible
- Prompt Engineering: Craft clear, specific prompts
- Temperature Control: Lower temperature (0.1-0.3) for focused output, higher (0.7-1.0) for creativity
- Model Selection: Choose appropriate model size for your task
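To make the temperature guidance concrete, a small sketch using the transformers pipeline (do_sample=True is required for temperature to have any effect):

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Low temperature: focused, near-deterministic continuations
focused = generator("The capital of France is",
                    temperature=0.2, do_sample=True, max_length=20)
# High temperature: more varied, creative continuations
creative = generator("Once upon a time",
                     temperature=0.9, do_sample=True, max_length=50)
print(focused[0]["generated_text"])
print(creative[0]["generated_text"])
```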
# Enable optimized memory usage
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
# Use optimized CUDA kernels
export CUDA_LAUNCH_BLOCKING=0
# Cache models locally
export HF_HOME=/workspaces/huggingface/models/.cache

- CUDA Out of Memory

  # Use quantization or smaller batch sizes
  model = AutoModelForCausalLM.from_pretrained(
      model_name,
      torch_dtype=torch.float16,
      device_map="auto"
  )

- Model Download Issues

  # Set Hugging Face token for private models
  huggingface-cli login

- Import Errors

  # Reinstall dependencies
  pip install -r requirements.txt --force-reinstall

- Gradio Interface Not Accessible

  # Check if port is available
  python examples/gradio_interface.py --port 7861
- Check the Hugging Face Documentation
- Visit Transformers GitHub
- Join the Hugging Face Discord
- Text generation with different models
- Fine-tuning for custom tasks
- Model comparison and evaluation
- Prompt engineering techniques
Our workspace includes production-ready safety features:
- ⚡ Interrupt Safety: Graceful handling of Ctrl+C during downloads
- 🔄 Resume Downloads: Automatic resume for interrupted downloads
- ✅ Integrity Checking: Verify model completeness after download
- 🧹 Cleanup Tools: Remove incomplete downloads safely
- 📊 Progress Tracking: Real-time download progress with ETA
See DOWNLOAD_SAFETY.md for technical details.
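The general pattern behind these features is resumable snapshot downloads; a hedged sketch with huggingface_hub (the actual model_explorer.py implementation may differ):

```python
from huggingface_hub import snapshot_download

def safe_download(repo_id: str, cache_dir: str = "models/.cache"):
    """Download a model snapshot; partial files survive Ctrl+C and resume on retry."""
    try:
        path = snapshot_download(repo_id=repo_id, cache_dir=cache_dir)
        print(f"✅ Download complete: {path}")
        return path
    except KeyboardInterrupt:
        # huggingface_hub keeps *.incomplete files, so a rerun resumes the transfer
        print("Download interrupted; rerun to resume.")
        return None
```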
- GPU Acceleration: Automatic CUDA detection and usage
- Memory Management: Efficient model loading with garbage collection
- Quantization Support: 8-bit and 4-bit model compression
- Batch Processing: Optimize inference for multiple inputs
- Caching: Smart model and tokenizer caching
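As an illustration of the first two points, device selection and memory cleanup typically look something like this (a sketch, not the workspace's exact code):

```python
import gc
import torch

def pick_device() -> str:
    # Prefer CUDA when a GPU is visible, otherwise fall back to CPU
    return "cuda" if torch.cuda.is_available() else "cpu"

def release_memory() -> None:
    # Call after `del model` to return cached GPU memory to the driver
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```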
# Test download safety features
python test_download_safety.py
# Validate model integrity
python -c "from model_explorer import ModelExplorer; ModelExplorer().check_all_models()"
# Monitor GPU usage
python -c "import torch; print(f'GPU Available: {torch.cuda.is_available()}')"- Hugging Face Model Hub - Browse 100,000+ models
- Transformers Documentation - Official API docs
- MODEL_GUIDE.md - Our model recommendations
- DOWNLOAD_SAFETY.md - Safety feature details
Contributions are welcome! Please feel free to submit issues, fork the repository, and create pull requests.
# Install development dependencies
pip install -r requirements.txt
# Run tests
python -m pytest tests/
# Format code
black .

This project is licensed under the MIT License. See the LICENSE file for details.
- Hugging Face for their incredible transformers library
- Gradio for the easy-to-use interface framework
- The open-source AI community for making these tools accessible
Ready to explore AI? Start with python quickstart.py and dive into the world of large language models!