# Calculate VRAM requirements for open-source models
---

This notebook shows how to run [`hf-mem`](https://github.com/alvarobartt/hf-mem) from a Google Colab runtime using `uvx`.

`hf-mem` estimates inference memory requirements for models hosted on Hugging Face Hub.

## Before you proceed
---

1. In Colab, go to **Runtime > Change runtime type** and select **T4 GPU** (or better).
2. Then run **Runtime > Run all**.


## Install Dependencies
---

We install `uv` and verify that `uvx` is available in this runtime.


In [1]:
!apt -qq update
!curl -LsSf https://astral.sh/uv/install.sh | sh
!/usr/local/bin/uv --version
!/usr/local/bin/uvx --version

82 packages can be upgraded. Run 'apt list --upgradable' to see them.
W: Skipping acquire of configured file 'main/source/Sources' as repository 'https://r2u.stat.illinois.edu/ubuntu jammy InRelease' does not seem to provide it (sources.list entry misspelt?)
downloading uv 0.10.4 x86_64-unknown-linux-gnu
no checksums to verify
installing to /usr/local/bin
  uv
  uvx
everything's installed!
uv 0.10.4
uvx 0.10.4


## Run hf-mem on Popular Hugging Face LLMs
---

Each command runs separately (no loop).


In [2]:
!/usr/local/bin/uvx hf-mem --model-id MiniMaxAI/MiniMax-M2

Installed 11 packages in 12ms
┌┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┐
├┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┤
│            INFERENCE MEMORY ESTIMATE FOR            │
│      https://hf.co/MiniMaxAI/MiniMax-M2 @ main      │
├────────────────┬────────────────────────────────────┤
│ TOTAL MEMORY   │ 214.32 GB (228.70B PARAMS)         │
│ REQUIREMENTS   │ ██████████████████████████████████ │
├────────────────┼────────────────────────────────────┤
│ F32            │ 0.23 / 214.32 GB                   │
│ 62.65M PARAMS  │ ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │
├────────────────┼────────────────────────────────────┤
│ F8_E4M3        │ 211

In [3]:
!/usr/local/bin/uvx hf-mem --model-id zai-org/GLM-5

┌┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┐
├┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┤
│            INFERENCE MEMORY ESTIMATE FOR             │
│          https://hf.co/zai-org/GLM-5 @ main          │
├────────────────┬─────────────────────────────────────┤
│ TOTAL MEMORY   │ 1404.18 GB (753.86B PARAMS)         │
│ REQUIREMENTS   │ ███████████████████████████████████ │
├────────────────┼─────────────────────────────────────┤
│ BF16           │ 1404.18 / 1404.18 GB                │
│ 753.86B PARAMS │ ███████████████████████████████████ │
├────────────────┼─────────────────────────────────────┤
│ F32

In [16]:
!/usr/local/bin/uvx hf-mem --model-id Qwen/Qwen3-4B

┌┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┐
├┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┴┤
│          INFERENCE MEMORY ESTIMATE FOR          │
│       https://hf.co/Qwen/Qwen3-4B @ main        │
├────────────────┬────────────────────────────────┤
│ TOTAL MEMORY   │ 7.49 GB (4.02B PARAMS)         │
│ REQUIREMENTS   │ ██████████████████████████████ │
├────────────────┼────────────────────────────────┤
│ BF16           │ 7.49 / 7.49 GB                 │
│ 4.02B PARAMS   │ ██████████████████████████████ │
└────────────────┴────────────────────────────────┘
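
As a sanity check on the figures above, the per-dtype rows look consistent with a simple rule of thumb: parameter count times bytes per parameter. The sketch below is not `hf-mem`'s implementation; it is a back-of-the-envelope calculation that assumes the tool's "GB" means 2**30 bytes (GiB) and uses the rounded parameter counts printed in the tables. The names `BYTES_PER_PARAM` and `weights_gib` are made up for this illustration.

```python
# Bytes per parameter for the dtypes that appear in the outputs above.
BYTES_PER_PARAM = {"F32": 4, "BF16": 2, "F8_E4M3": 1}

# Assumption: the "GB" shown by the tool is 2**30 bytes (GiB).
GIB = 1024 ** 3


def weights_gib(num_params: float, dtype: str) -> float:
    """Memory needed to store `num_params` weights of `dtype`, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / GIB


# Parameter counts copied (already rounded) from the tables above.
qwen3_4b = weights_gib(4.02e9, "BF16")
glm_5 = weights_gib(753.86e9, "BF16")
minimax_f32_part = weights_gib(62.65e6, "F32")

print(f"Qwen/Qwen3-4B (BF16):           {qwen3_4b:.2f} GiB")
print(f"zai-org/GLM-5 (BF16):           {glm_5:.2f} GiB")
print(f"MiniMaxAI/MiniMax-M2 (F32 row): {minimax_f32_part:.2f} GiB")
```

Each result lands within about 0.01 of the corresponding table value (7.49, 1404.18, and 0.23). The small gaps come from the parameter counts being rounded in the printed output.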


In [18]:
!nvidia-smi

Sat Feb 21 11:39:27 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.07              Driver Version: 580.82.07      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   34C    P8             10W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+----------------------------------------------
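
To relate the estimates to this runtime, we can compare them with the 15360 MiB the T4 reports above. This is a rough sketch under stated assumptions: the estimates are treated as GiB, and the 10% `headroom` (for the CUDA context, activations, and KV cache) is an arbitrary illustrative value, not something `hf-mem` or `nvidia-smi` reports. The helper names `usable_gib`, `fits`, and `min_gpus` are hypothetical.

```python
import math

T4_TOTAL_MIB = 15360  # total memory from the nvidia-smi output above

# Total estimates copied from the hf-mem tables (assumed to be GiB).
ESTIMATES_GIB = {
    "Qwen/Qwen3-4B": 7.49,
    "MiniMaxAI/MiniMax-M2": 214.32,
    "zai-org/GLM-5": 1404.18,
}


def usable_gib(total_mib: int, headroom: float = 0.10) -> float:
    """GPU memory in GiB after reserving a fraction as headroom (assumed 10%)."""
    return total_mib / 1024 * (1 - headroom)


def fits(estimate_gib: float, total_mib: int, headroom: float = 0.10) -> bool:
    """True if the estimate fits on a single GPU with the given headroom."""
    return estimate_gib <= usable_gib(total_mib, headroom)


def min_gpus(estimate_gib: float, total_mib: int, headroom: float = 0.10) -> int:
    """Lower bound on GPU count by memory alone; ignores sharding overhead."""
    return math.ceil(estimate_gib / usable_gib(total_mib, headroom))


for model_id, estimate in ESTIMATES_GIB.items():
    verdict = "fits" if fits(estimate, T4_TOTAL_MIB) else "does not fit"
    print(f"{model_id}: {estimate} GiB {verdict} on one T4 "
          f"(at least {min_gpus(estimate, T4_TOTAL_MIB)} T4-sized GPU(s) by memory alone)")
```

Of the three models checked here, only `Qwen/Qwen3-4B` fits on a single T4 under these assumptions. The GPU counts are a memory-only lower bound and say nothing about whether such a deployment is practical.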

## Notes
---

- If a model is gated or private, authenticate in Colab first (e.g., `hf auth login`, or `huggingface-cli login` on older `huggingface_hub` releases).
- `hf-mem` is experimental and may change across releases.
