A simple and easy-to-use GPU memory calculator for Large Language Models (LLMs). Helps you quickly estimate the GPU memory requirements and number of devices needed to run models of various sizes.
- Calculate required GPU memory based on model parameter count
- Support for multiple precision formats: FP32, FP16/BFLOAT16, FP8, INT8, INT4 (byte sizes per parameter are sketched after this feature list)
- Built-in presets for popular LLM models (DeepSeek-R1, Qwen, Llama, etc.)
- Support for over 130 GPU models, including:
  - NVIDIA Data Center GPUs (H100, H200, B100, B200, etc.)
  - NVIDIA Consumer GPUs (RTX series)
  - AMD Data Center GPUs (Instinct series)
  - AMD Consumer GPUs (RX series)
  - Apple Silicon (M1-M4 series)
  - Huawei Ascend series
- Complete internationalization support (English, Simplified Chinese, Traditional Chinese, Russian, Japanese, Korean, Arabic)
- Responsive design for desktop and mobile devices
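Each precision format corresponds to a fixed number of bytes per parameter. A minimal sketch of that mapping in TypeScript (the constant name is illustrative, not taken from this project's source):

```typescript
// Bytes needed to store one model parameter in each supported precision.
// Illustrative only; the project's own constants may be named differently.
const BYTES_PER_PARAM: Record<string, number> = {
  FP32: 4,
  FP16: 2, // BFLOAT16 also uses 2 bytes per parameter
  FP8: 1,
  INT8: 1,
  INT4: 0.5, // two 4-bit parameters packed per byte
};
```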
This calculator uses the following formulas to estimate the GPU memory required for LLM inference (a worked sketch follows the list):
- Model Weight Memory = Number of Parameters × Bytes per Parameter
- Inference Memory = Model Weight Memory × 10% (for activations, KV cache, etc.)
- Total Memory Requirement = Model Weight Memory + Inference Memory
- Required GPUs = Total Memory Requirement ÷ Single GPU Memory Capacity (rounded up)
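A minimal TypeScript sketch of these formulas (function and parameter names are assumptions for illustration, not this project's actual code):

```typescript
/**
 * Estimate total memory and GPU count for LLM inference.
 * Sketch only: names, units (decimal GB = 1e9 bytes), and structure are assumptions.
 */
function estimateGpuRequirements(
  numParams: number,     // e.g. 70e9 for a 70B model
  bytesPerParam: number, // e.g. 2 for FP16/BFLOAT16
  gpuMemoryGB: number    // memory of a single GPU, e.g. 80
) {
  const weightMemoryGB = (numParams * bytesPerParam) / 1e9;    // Model Weight Memory
  const inferenceMemoryGB = weightMemoryGB * 0.1;              // 10% for activations, KV cache, etc.
  const totalMemoryGB = weightMemoryGB + inferenceMemoryGB;    // Total Memory Requirement
  const requiredGpus = Math.ceil(totalMemoryGB / gpuMemoryGB); // rounded up
  return { weightMemoryGB, totalMemoryGB, requiredGpus };
}
```

Under these assumptions, a 70B-parameter model in FP16 needs about 70 × 2 = 140 GB for weights, roughly 154 GB after the 10% overhead, and therefore 2 GPUs with 80 GB each (154 ÷ 80 ≈ 1.93, rounded up).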
- Next.js - React framework
- TypeScript - Type-safe JavaScript superset
- Tailwind CSS - Utility-first CSS framework
- shadcn/ui - Reusable UI components
- next-intl - Internationalization support
# Install dependencies
pnpm install
# Start development server
pnpm dev
# Build for production
pnpm build
# Start production server
pnpm start

Contributions via Pull Requests or Issues are welcome to help improve this project. Potential areas for contribution include:
- Adding more GPU models
- Adding more LLM model presets
- Optimizing calculation methods
- Adding support for more languages
- Improving UI/UX
MIT