A comprehensive, production-ready development environment for working with Hugging Face large language models, transformers, and AI applications. Features safe model downloading with interrupt handling, interactive web interfaces, and a full set of supporting tools.
python quickstart.py

Access all features through an easy-to-use menu system.

python model_explorer.py

Browse, download, and manage models with Ctrl+C-safe interrupts.
# Basic text generation
python main.py --prompt "The future of AI is" --model gpt2
# Web interface
python examples/gradio_interface.py
# Setup everything
./setup.sh

huggingface/
├── README.md                  # This comprehensive guide
├── requirements.txt           # Python dependencies
├── quickstart.py              # Interactive menu system
├── model_explorer.py          # Safe model browser & downloader
├── test_download_safety.py    # Download safety testing
├── main.py                    # CLI text generation
├── setup.sh                   # Automated setup script
├── .env.template              # Environment variables template
├── MODEL_GUIDE.md             # Model recommendations & info
├── DOWNLOAD_SAFETY.md         # Interrupt handling documentation
├── config/
│   └── config.json            # Model configurations
├── examples/
│   ├── text_generation.py     # Text generation examples
│   ├── model_loading.py       # Model loading & comparison
│   └── gradio_interface.py    # Interactive web interfaces
├── utils/
│   ├── config.py              # Configuration management
│   └── model_utils.py         # Utility functions & analysis
├── models/                    # Model cache directory
└── results/                   # Output and results directory
./setup.sh

Automated setup with dependency checking and safety verification.
- Python 3.8+
- 8GB+ RAM (16GB+ recommended for larger models)
- Optional: NVIDIA GPU with CUDA support
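To sanity-check these prerequisites before installing anything, a minimal sketch (standard library only, plus PyTorch if it happens to be present already):

```python
# Quick prerequisite check (a sketch; adjust to your environment)
import sys

assert sys.version_info >= (3, 8), "Python 3.8+ required"
try:
    import torch  # only available after installing requirements
    print(f"CUDA available: {torch.cuda.is_available()}")
except ImportError:
    print("PyTorch not installed yet; see the setup steps below.")
```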
- Install Python Dependencies

  pip install -r requirements.txt

- Configure Environment

  # Copy the environment template
  cp .env.template .env
  # Edit .env file with your settings
  nano .env

- Essential Environment Variables

  # Get your token from https://huggingface.co/settings/tokens
  HUGGINGFACE_HUB_TOKEN=your_token_here
  # Optional: For experiment tracking
  WANDB_API_KEY=your_wandb_key_here
  # Device configuration
  DEVICE=auto  # or cpu, cuda, cuda:0, etc.

- Test Installation

  python -c "from transformers import pipeline; print('✅ Installation successful!')"
- Interrupt-safe downloading - Press Ctrl+C anytime without corruption
- Automatic resume capability - Interrupted downloads continue seamlessly
- Integrity verification - Ensure downloaded models are complete
- Smart caching - No duplicate downloads, efficient storage
- Web-based UI with Gradio for all AI tasks
- Interactive menu system for easy navigation
- Model comparison tools for side-by-side testing
- Real-time generation with parameter controls
- Text Generation - GPT-style language models
- Text Classification - Sentiment, emotion, topic analysis
- Question Answering - Context-based Q&A systems
- Named Entity Recognition - Extract entities from text
- Text Summarization - Automatic content summarization
- Quantization support for large models (4-bit, 8-bit)
- Multi-GPU support with automatic device mapping
- Experiment tracking with Weights & Biases integration
- Configuration management with environment variables
- Model benchmarking and performance analysis
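All of the tasks listed above are reachable through the standard transformers pipeline API; a minimal sketch (the model choices here are illustrative, not the workspace's required defaults):

```python
from transformers import pipeline

# One pipeline per task; transformers picks a default model if none is given
generator = pipeline("text-generation", model="gpt2")
classifier = pipeline("sentiment-analysis")
qa = pipeline("question-answering")
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

print(generator("The future of AI is", max_length=30)[0]["generated_text"])
print(classifier("I love this library!")[0])
print(qa(question="What does the pipeline do?",
         context="This pipeline answers questions about a given context."))
```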
python quickstart.py

One-stop access to all features with guided workflows.

python model_explorer.py

Browse 100,000+ models and download safely with Ctrl+C interrupt support.

python examples/gradio_interface.py

A complete AI playground with text generation, classification, Q&A, and more.
# Basic generation
python main.py --prompt "The future of AI is" --model gpt2
# Advanced generation with parameters
python main.py --prompt "Once upon a time" --max-length 150 --temperature 0.8# Browse and download models safely
python model_explorer.py
# Check model integrity
python -c "
from model_explorer import ModelExplorer
explorer = ModelExplorer()
result = explorer.check_download_integrity('gpt2')
print('Complete:', result.get('complete', False))
"
# View available models
python -c "from utils.config import config; print(config.get_model_list('text_generation'))"# Model comparison
python examples/text_generation.py
# Comprehensive model analysis
python examples/model_loading.py
# Performance benchmarking
python utils/model_utils.py

- Interactive Web Interface

  python examples/gradio_interface.py --port 7860

- Model Analysis

  python utils/model_utils.py
Access the entire Hugging Face model hub through our safe download system:
python model_explorer.py

- openai-community/gpt2 - 10.6M downloads (550MB)
- Qwen/Qwen2.5-7B-Instruct - 7.6M downloads (~14GB)
- meta-llama/Llama-3.1-8B-Instruct - 5.2M downloads (~16GB)
- distilbert-base-uncased-finetuned-sst-2-english - 5.2M downloads
- facebook/bart-large-cnn - 2.7M downloads
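Under the hood, this kind of Hub browsing can be done with the huggingface_hub client; a hedged sketch (model_explorer.py's actual implementation may differ):

```python
from huggingface_hub import list_models

# List the most-downloaded text-generation models on the Hub
for model in list_models(filter="text-generation", sort="downloads", direction=-1, limit=5):
    print(model.id, model.downloads)
```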
# Download these first for testing
gpt2 # 550MB - Basic text generation
distilgpt2 # 350MB - Faster text generation
t5-small # 242MB - Multi-task model
distilbert-base-uncased # 268MB - Classification tasks

# Upgrade to these for better performance
gpt2-medium # 1.5GB - Better text quality
t5-base # 892MB - Better summarization
facebook/bart-large-cnn # 1.6GB - News summarization

- Text Generation: GPT-2, GPT-Neo, OPT, Qwen, LLaMA
- Classification: BERT, RoBERTa, DistilBERT variants
- Question Answering: BERT, DeBERTa, RoBERTa fine-tuned
- Summarization: BART, T5, Pegasus
- Translation: mBART, Helsinki-NLP models
- NER: BERT, spaCy, domain-specific models
See MODEL_GUIDE.md for comprehensive model recommendations.
The Gradio interface (examples/gradio_interface.py) provides:
- Text Generation: Interactive text generation with parameter controls
- Text Classification: Sentiment and emotion analysis
- Question Answering: Ask questions about any text
- Model Comparison: Compare outputs from multiple models
- Real-time Results: Instant feedback and results
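For orientation, a stripped-down version of such an interface might look like the sketch below; the real examples/gradio_interface.py is more featureful:

```python
import gradio as gr
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

def generate(prompt, temperature, max_length):
    # do_sample=True is required for temperature to take effect
    out = generator(prompt, temperature=temperature,
                    max_length=int(max_length), do_sample=True)
    return out[0]["generated_text"]

demo = gr.Interface(
    fn=generate,
    inputs=[
        gr.Textbox(label="Prompt"),
        gr.Slider(0.1, 1.5, value=0.8, label="Temperature"),
        gr.Slider(20, 200, value=50, step=10, label="Max length"),
    ],
    outputs=gr.Textbox(label="Generated text"),
)
demo.launch()  # pass share=True for a public link
```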
# Launch all interfaces
python examples/gradio_interface.py
# Launch specific interface
python examples/gradio_interface.py --interface generation
python examples/gradio_interface.py --interface classification
python examples/gradio_interface.py --interface qa
# Share publicly (creates public link)
python examples/gradio_interface.py --share

- .env - Environment variables and API keys
- config/config.json - Model lists, default parameters, hardware requirements
from utils.config import config
# Print current configuration
config.print_summary()
# Get model lists
text_models = config.get_model_list("text_generation")
classification_models = config.get_model_list("classification")
# Get default parameters
gen_params = config.get_generation_params()

import torch
from transformers import BitsAndBytesConfig, AutoModelForCausalLM

# 4-bit quantization configuration
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4"
)
model = AutoModelForCausalLM.from_pretrained(
    "gpt2-large",
    quantization_config=quantization_config
)

from utils.model_utils import benchmark_model, print_model_summary
# Analyze a model
print_model_summary("gpt2")
# Benchmark model performance
results = benchmark_model("gpt2", num_runs=3)

from examples.model_loading import ModelLoader
loader = ModelLoader()
model, tokenizer = loader.load_causal_lm("gpt2", quantize=True)
# Get model information
info = loader.get_model_info("gpt2")
print(f"Model size: {info['model_size_mb']:.1f} MB")import wandb
from utils.config import config
# Initialize W&B (configure WANDB_API_KEY in .env)
wandb.init(project="huggingface-experiments")
# Log model outputs
wandb.log({"generated_text": generated_text, "prompt": prompt})from utils.model_utils import save_results, load_results
# Save experiment results
results = {"model": "gpt2", "prompt": prompt, "output": output}
save_results(results, "experiment_1.json")
# Load previous results
previous_results = load_results("experiment_1.json")

- Use Quantization: Enable 4-bit or 8-bit quantization for large models
- GPU Memory: Monitor GPU memory usage with nvidia-smi
- Batch Processing: Process multiple inputs together when possible
- Prompt Engineering: Craft clear, specific prompts
- Temperature Control: Lower temperature (0.1-0.3) for focused output, higher (0.7-1.0) for creativity
- Model Selection: Choose appropriate model size for your task
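To make the temperature guidance concrete, a small sketch using the transformers pipeline (do_sample=True is required for temperature to have any effect):

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Low temperature: focused, near-deterministic continuations
focused = generator("The capital of France is",
                    temperature=0.2, do_sample=True, max_length=20)
# High temperature: more varied, creative continuations
creative = generator("Once upon a time",
                     temperature=0.9, do_sample=True, max_length=50)
print(focused[0]["generated_text"])
print(creative[0]["generated_text"])
```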
# Enable optimized memory usage
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
# Use optimized CUDA kernels
export CUDA_LAUNCH_BLOCKING=0
# Cache models locally
export HF_HOME=/workspaces/huggingface/models/.cache

- CUDA Out of Memory

  # Use quantization or smaller batch sizes
  model = AutoModelForCausalLM.from_pretrained(
      model_name,
      torch_dtype=torch.float16,
      device_map="auto"
  )

- Model Download Issues

  # Set Hugging Face token for private models
  huggingface-cli login

- Import Errors

  # Reinstall dependencies
  pip install -r requirements.txt --force-reinstall

- Gradio Interface Not Accessible

  # Check if port is available
  python examples/gradio_interface.py --port 7861
- Check the Hugging Face Documentation
- Visit Transformers GitHub
- Join the Hugging Face Discord
- Text generation with different models
- Fine-tuning for custom tasks
- Model comparison and evaluation
- Prompt engineering techniques
Our workspace includes production-ready safety features:
- ⚡ Interrupt Safety: Graceful handling of Ctrl+C during downloads
- 🔄 Resume Downloads: Automatic resume for interrupted downloads
- ✅ Integrity Checking: Verify model completeness after download
- 🧹 Cleanup Tools: Remove incomplete downloads safely
- 📊 Progress Tracking: Real-time download progress with ETA
See DOWNLOAD_SAFETY.md for technical details.
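The general pattern behind these features is resumable snapshot downloads; a hedged sketch with huggingface_hub (the actual model_explorer.py implementation may differ):

```python
from huggingface_hub import snapshot_download

def safe_download(repo_id: str, cache_dir: str = "models/.cache"):
    """Download a model snapshot; partial files survive Ctrl+C and resume on retry."""
    try:
        path = snapshot_download(repo_id=repo_id, cache_dir=cache_dir)
        print(f"✅ Download complete: {path}")
        return path
    except KeyboardInterrupt:
        # huggingface_hub keeps *.incomplete files, so a rerun resumes the transfer
        print("Download interrupted; rerun to resume.")
        return None
```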
- GPU Acceleration: Automatic CUDA detection and usage
- Memory Management: Efficient model loading with garbage collection
- Quantization Support: 8-bit and 4-bit model compression
- Batch Processing: Optimize inference for multiple inputs
- Caching: Smart model and tokenizer caching
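As an illustration of the first two points, device selection and memory cleanup typically look something like this (a sketch, not the workspace's exact code):

```python
import gc
import torch

def pick_device() -> str:
    # Prefer CUDA when a GPU is visible, otherwise fall back to CPU
    return "cuda" if torch.cuda.is_available() else "cpu"

def release_memory() -> None:
    # Call after `del model` to return cached GPU memory to the driver
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```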
# Test download safety features
python test_download_safety.py
# Validate model integrity
python -c "from model_explorer import ModelExplorer; ModelExplorer().check_all_models()"
# Monitor GPU usage
python -c "import torch; print(f'GPU Available: {torch.cuda.is_available()}')"- Hugging Face Model Hub - Browse 100,000+ models
- Transformers Documentation - Official API docs
- MODEL_GUIDE.md - Our model recommendations
- DOWNLOAD_SAFETY.md - Safety feature details
Contributions are welcome! Please feel free to submit issues, fork the repository, and create pull requests.
# Install development dependencies
pip install -r requirements.txt
# Run tests
python -m pytest tests/
# Format code
black .

This project is licensed under the MIT License. See the LICENSE file for details.
- Hugging Face for their incredible transformers library
- Gradio for the easy-to-use interface framework
- The open-source AI community for making these tools accessible
Ready to explore AI? Start with python quickstart.py and dive into the world of large language models!