# Qwen3-Coder-480B-A35B-Instruct Setup

Complete automated installation guide for the Qwen3-Coder-480B-A35B-Instruct model on Ubuntu systems.

> 📌 **Important:** For the verified working installation process, see INSTALLATION_GUIDE_480B.md. That guide documents the exact steps that successfully installed the 480B model in GGUF format.
## Quick Start

```bash
# One-line installation
curl -fsSL https://raw.githubusercontent.com/twobitapps/480b-setup/main/install.sh | bash
```
## System Requirements

### Hardware
- GPU: NVIDIA H100 80GB HBM3 (recommended) or A100 80GB
- RAM: 64GB+ system RAM
- Storage: 500GB+ free space (~450GB for the model plus dependencies)
- CPU: 16+ cores recommended
- Network: high-speed connection for the initial model download

### Software
- OS: Ubuntu 20.04+ or 22.04 LTS (recommended)
- Python: 3.8-3.11 (3.10 recommended)
- CUDA: 12.1+ (installed automatically by the script)
- Git: latest version
- Git LFS: for large-file handling
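Before running the installer, you can sanity-check these requirements yourself. Below is a minimal Python sketch approximating what `scripts/system_check.sh` verifies; the thresholds mirror the list above, and the actual script may check more:

```python
#!/usr/bin/env python3
"""Rough requirements check, approximating scripts/system_check.sh.
Thresholds mirror the requirements list above; Linux-only (/proc/meminfo)."""
import os
import shutil
import subprocess

def main() -> None:
    # CPU: 16+ cores recommended
    cores = os.cpu_count() or 0
    print(f"CPU cores: {cores} ({'OK' if cores >= 16 else 'below recommendation'})")

    # RAM: 64GB+ recommended; MemTotal is the first line of /proc/meminfo (in kB)
    with open("/proc/meminfo") as f:
        ram_gb = int(f.readline().split()[1]) / 1e6
    print(f"System RAM: {ram_gb:.0f} GB ({'OK' if ram_gb >= 64 else 'insufficient'})")

    # Storage: 500GB+ free (the model alone is ~450GB)
    free_gb = shutil.disk_usage(".").free / 1e9
    print(f"Free disk: {free_gb:.0f} GB ({'OK' if free_gb >= 500 else 'insufficient'})")

    # GPU: H100 80GB / A100 80GB recommended; requires the NVIDIA driver
    try:
        gpus = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            text=True,
        ).strip()
        print(f"GPU(s): {gpus}")
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("GPU: nvidia-smi not available (driver/CUDA not installed yet)")

if __name__ == "__main__":
    main()
```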
## Installation Options

### Option 1: Automated (recommended)
```bash
./install.sh
```

### Option 2: Manual
Follow the detailed instructions in MANUAL_INSTALL.md.

### Option 3: Docker
```bash
docker-compose up -d
```
## Repository Structure

```
480b-setup/
├── README.md                  # This file
├── install.sh                 # Main installation script
├── MANUAL_INSTALL.md          # Step-by-step manual guide
├── scripts/
│   ├── system_check.sh        # System requirements verification
│   ├── dependencies.sh        # Install system dependencies
│   ├── python_env.sh          # Python environment setup
│   ├── cuda_setup.sh          # CUDA installation
│   ├── model_download.sh      # Model download with resume
│   ├── test_installation.sh   # Installation verification
│   └── benchmark.sh           # Performance testing
├── config/
│   ├── requirements.txt       # Python dependencies
│   ├── environment.yml        # Conda environment
│   └── model_config.json      # Model configuration
├── examples/
│   ├── basic_inference.py     # Simple inference example
│   ├── benchmark_test.py      # Performance benchmark
│   └── comparison_demo.py     # 480B vs 7B comparison
├── docker/
│   ├── Dockerfile             # Docker container setup
│   └── docker-compose.yml     # Docker Compose configuration
└── docs/
    ├── TROUBLESHOOTING.md     # Common issues and solutions
    ├── PERFORMANCE.md         # Performance tuning guide
    └── API_REFERENCE.md       # API usage documentation
```
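The heaviest step is the weights download, which `scripts/model_download.sh` performs with resume support. A minimal Python sketch of the same idea, assuming the `huggingface_hub` package; the repo id and target path here are illustrative, so verify them against the Hugging Face model card:

```python
"""Resumable download sketch (assumes: pip install huggingface_hub).
huggingface_hub resumes interrupted files automatically on re-run."""
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Qwen/Qwen3-Coder-480B-A35B-Instruct",  # assumed id; check the model card
    local_dir="models/qwen3-coder-480b",            # illustrative target path
    max_workers=4,  # parallel file downloads; tune to your connection
)
```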
## Verify the Installation

After installation, verify that everything works:

```bash
# Run system verification
./scripts/test_installation.sh

# Run basic inference test
python examples/basic_inference.py

# Run performance benchmark
./scripts/benchmark.sh
```
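For reference, here is a minimal sketch of what `examples/basic_inference.py` amounts to, assuming the `transformers` library and a standard Hugging Face checkpoint; the repo's actual example, and the GGUF loading path from INSTALLATION_GUIDE_480B.md, may differ:

```python
"""Minimal inference sketch (assumes: pip install torch transformers accelerate).
Model id and prompt are illustrative; see examples/basic_inference.py for the real script."""
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-Coder-480B-A35B-Instruct"  # assumed id; check the model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # H100/A100 handle bf16 natively
    device_map="auto",           # shard layers across available GPUs
)

messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```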
## Troubleshooting

If you encounter issues:

- Check `docs/TROUBLESHOOTING.md`
- Run `./scripts/system_check.sh` to verify requirements
- Check the logs in `~/qwen480b_env/logs/`
- Open an issue with detailed error logs
## Expected Performance

On an NVIDIA H100 80GB:

- Model loading: ~2-3 minutes
- First inference: ~10-15 seconds (cold start)
- Subsequent inference: ~3-6 seconds
- Throughput: 100-200 tokens per second (depends on prompt complexity)
- Memory usage: ~45-50GB VRAM
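The tokens-per-second figure comes from timing generation after a warm-up pass. Here is a sketch of the measurement that `scripts/benchmark.sh` could wrap; it assumes `model` and `tokenizer` are already loaded as in the inference example above, and the prompt is illustrative:

```python
"""Throughput timing sketch. Assumes `model` and `tokenizer` are already loaded
as in the inference example; the warm-up call absorbs the ~10-15 s cold start."""
import time

def tokens_per_second(prompt: str, max_new_tokens: int = 128) -> float:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    model.generate(**inputs, max_new_tokens=8)  # warm-up run (cold start)
    start = time.perf_counter()
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    elapsed = time.perf_counter() - start
    new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
    return new_tokens / elapsed

print(f"Throughput: {tokens_per_second('def quicksort(arr):'):.1f} tokens/s")
```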
## Related Resources

- Qwen Model Comparison Platform (visual comparison demo)
- Original Qwen Repository
- Hugging Face Model (Full)
- Hugging Face Model (GGUF)
## License

This setup guide is provided under the MIT License. The Qwen model follows its own licensing terms.
## Contributing

1. Fork the repository
2. Create a feature branch
3. Test your changes thoroughly
4. Submit a pull request
## Support

- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Wiki