# Model Size Calculations: Memory Requirements by Data Type

This notebook shows how different data types affect the memory requirements for LLMs of various sizes.

**Key Question**: How much memory do you need to load a 7B parameter model in FP16 vs FP32?


In [None]:
# !pip install pandas

In [1]:
import pandas as pd


## Setup: Define Model Sizes and Data Types

We'll calculate memory for common LLM sizes:


In [2]:
# Standard model sizes (in billions of parameters)
model_sizes = [1, 7, 13, 70]

# Data types and bytes per parameter (INT4 packs two values into one byte)
data_types = {
    "FP32": 4,
    "FP16": 2,
    "BF16": 2,
    "INT8": 1,
    "INT4": 0.5,
}


In [3]:
def calculate_model_memory(num_params_billions, bytes_per_param):
    """Calculate weight memory in binary gigabytes (1024**3 bytes), labeled GB in the table."""
    num_params = num_params_billions * 1e9
    bytes_total = num_params * bytes_per_param
    gb = bytes_total / (1024 ** 3)
    return gb
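As a quick check of the key question, the function can be applied to a 7B model directly. This is a minimal sketch that repeats the formula from the cell above so it runs on its own; the variable names `fp32_gb` and `fp16_gb` are introduced here for illustration.

```python
def calculate_model_memory(num_params_billions, bytes_per_param):
    """Same formula as the cell above: parameters x bytes, divided by 1024**3."""
    return num_params_billions * 1e9 * bytes_per_param / (1024 ** 3)

# 7B parameters at 4 bytes (FP32) and 2 bytes (FP16)
fp32_gb = calculate_model_memory(7, 4)
fp16_gb = calculate_model_memory(7, 2)

print(f"7B in FP32: {fp32_gb:.2f} GB")
print(f"7B in FP16: {fp16_gb:.2f} GB")
print(f"FP32 / FP16 ratio: {fp32_gb / fp16_gb:.1f}x")
```

Halving the bytes per parameter halves the result, since the formula is linear in both inputs.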


## Calculate Memory Requirements for All Combinations

In [4]:
# Build one row (a dict) per model size and collect the rows in a list
results = []

for num_params_b in model_sizes:
    row = {"Model Size": f"{num_params_b}B"}
    for dtype, bytes_size in data_types.items():
        memory_gb = calculate_model_memory(num_params_b, bytes_size)
        row[dtype] = f"{memory_gb:.2f} GB"
    results.append(row)

# Create DataFrame
df = pd.DataFrame(results)

# Display the table
print("=" * 80)
print("MODEL SIZE CALCULATIONS: Memory Requirements by Data Type")
print("=" * 80)
print()
print(df.to_string(index=False))
print()


MODEL SIZE CALCULATIONS: Memory Requirements by Data Type

Model Size      FP32      FP16      BF16     INT8     INT4
        1B   3.73 GB   1.86 GB   1.86 GB  0.93 GB  0.47 GB
        7B  26.08 GB  13.04 GB  13.04 GB  6.52 GB  3.26 GB
       13B  48.43 GB  24.21 GB  24.21 GB 12.11 GB  6.05 GB
       70B 260.77 GB 130.39 GB 130.39 GB 65.19 GB 32.60 GB



---

## Key Insights

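From the table above, we can see the patterns below. The round figures are decimal gigabytes (10⁹ bytes); the table divides by 1024³, so its values are about 7% smaller and are shown in parentheses.

1. **FP32 → FP16/BF16**: Cuts memory in half (4 bytes → 2 bytes)
   - A 7B model: 28 GB → 14 GB (table: 26.08 → 13.04)
   - A 70B model: 280 GB → 140 GB (table: 260.77 → 130.39)

2. **FP16 → INT8**: Cuts memory in half again (2 bytes → 1 byte)
   - A 7B model: 14 GB → 7 GB (table: 13.04 → 6.52)
   - Enables running larger models on consumer hardware

3. **INT4 (4-bit quantization)**: Most aggressive compression
   - A 7B model: Only 3.5 GB (table: 3.26)
   - Makes it possible to run LLMs on laptops and mobile devices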
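The list quotes 28 GB for a 7B FP32 model while the table shows 26.08; the two differ only in the unit. A minimal sketch that prints both side by side, assuming the same bytes-per-parameter values as the notebook (`weight_bytes`, `GB`, and `GIB` are names introduced here for illustration):

```python
GB = 10 ** 9      # decimal gigabyte
GIB = 1024 ** 3   # binary gigabyte (gibibyte), the divisor used in the table

def weight_bytes(num_params_billions, bytes_per_param):
    """Raw size of the weights in bytes."""
    return num_params_billions * 1e9 * bytes_per_param

# Compare the two units for a 7B model across data types
for dtype, bytes_per_param in {"FP32": 4, "FP16": 2, "INT8": 1, "INT4": 0.5}.items():
    total = weight_bytes(7, bytes_per_param)
    print(f"7B {dtype}: {total / GB:.2f} GB (decimal) = {total / GIB:.2f} GiB (binary)")
```

Either unit is fine as long as it is used consistently; hardware spec sheets and software tools do not always agree on which one "GB" means.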

**Important Note**: This calculation only accounts for model weights. In practice, you need additional memory for:
- Activations during the forward pass
- Gradients during training (one value per parameter)
- Optimizer states (Adam keeps two extra values per parameter, so ~2x model size)
- KV cache for generation

**Rule of thumb**: For inference, multiply the weight memory by about 1.2x. For training, multiply by 4-6x depending on the optimizer.
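The rule of thumb can be turned into a rough estimator. This is a sketch under simplifying assumptions, not a precise memory model: gradients and optimizer states are assumed to use the same data type as the weights, and `activation_copies` is a hypothetical knob, since real activation memory depends on batch size and sequence length. The function names are introduced here for illustration.

```python
def inference_estimate(weights_gb, overhead=1.2):
    """Weights plus a flat allowance for activations and KV cache (rule of thumb)."""
    return weights_gb * overhead

def training_estimate(weights_gb, optimizer_copies=2, activation_copies=0.0):
    """Weights + gradients + optimizer states (+ optional activation allowance).

    optimizer_copies=2 follows the "Adam uses ~2x model size" note above.
    """
    gradient_copies = 1  # one gradient value per weight
    return weights_gb * (1 + gradient_copies + optimizer_copies + activation_copies)

weights_gb = 7e9 * 4 / 1024 ** 3  # 7B model in FP32, as in the table

print(f"Inference: {inference_estimate(weights_gb):.1f} GB")
print(f"Training, no activations: {training_estimate(weights_gb):.1f} GB")
print(f"Training, activations = 2x weights: "
      f"{training_estimate(weights_gb, activation_copies=2):.1f} GB")
```

Weights, gradients, and two Adam states already add up to 4x the weight memory, which is where the lower end of the 4-6x range comes from; activations account for the rest.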


## Memory Calculation Function

The formula is simple: **Memory (GB) = Number of Parameters × Bytes per Parameter / (1024³)**. Dividing by 1024³ gives binary gigabytes (GiB); divide by 10⁹ instead for decimal gigabytes.
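The same formula can be rearranged to ask the reverse question: how many parameters fit in a given amount of memory? A minimal sketch, counting weights only; `max_params_billions` and the 24 GB budget are hypothetical, chosen purely for illustration.

```python
def max_params_billions(memory_gb, bytes_per_param):
    """Invert the formula: parameters = memory x 1024**3 / bytes per parameter."""
    return memory_gb * (1024 ** 3) / bytes_per_param / 1e9

budget_gb = 24  # hypothetical memory budget, weights only
for dtype, bytes_per_param in {"FP32": 4, "FP16": 2, "INT8": 1, "INT4": 0.5}.items():
    limit = max_params_billions(budget_gb, bytes_per_param)
    print(f"{dtype}: up to {limit:.1f}B parameters")
```

Because this ignores activations and the KV cache, the usable limit in practice will be lower than the printed figure.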
