# 00 — Check GPU and sampling backends

This notebook helps you verify that `profgpu` can read utilization metrics in your environment.

It will:

- show `nvidia-smi` output (if available)
- run a short `GpuMonitor` block
- optionally compare NVML vs `nvidia-smi` backends

> Tip: for the lowest-overhead sampling, install NVML support:
>
> ```bash
> pip install "profgpu[nvml]"
> ```
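
Before running the cells below, it can help to confirm that an NVML Python binding is importable at all. The sketch below assumes the binding is exposed as a module named `pynvml` (the module installed by packages such as `nvidia-ml-py3`); whether `profgpu` imports exactly this module is an assumption, and the helper name `nvml_binding_available` is hypothetical.

```python
import importlib.util


def nvml_binding_available() -> bool:
    """Return True if a module named `pynvml` can be found (assumed binding name)."""
    # find_spec only locates the module; it does not import it or touch the driver.
    return importlib.util.find_spec("pynvml") is not None


print("pynvml importable:", nvml_binding_available())
```

A `False` result only means the Python binding is missing; the `nvidia-smi` backend may still work.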


In [1]:
import shutil
import subprocess

# Resolve the executable once and reuse the result.
smi_path = shutil.which("nvidia-smi")
print("nvidia-smi on PATH:", smi_path is not None)
if smi_path:
    # Run nvidia-smi with no arguments to show its standard status table.
    out = subprocess.check_output([smi_path], text=True)
    print(out)
else:
    print(
        "nvidia-smi not found; NVML may still work if drivers expose it and you installed nvidia-ml-py3."
    )

nvidia-smi on PATH: True
Sat Feb 21 21:09:34 2026       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA A10G                    On  | 00000000:00:1E.0 Off |                    0 |
|  0%   22C    P8              15W / 300W |      4MiB / 23028MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                           

In [2]:
import time

from profgpu import GpuMonitor

# strict=False means: if no backend is available, we get a summary with notes instead of an exception.
with GpuMonitor(interval_s=0.5, strict=False) as mon:
    time.sleep(3)

print(mon.summary.format())

[GPU 0] NVIDIA A10G
  duration: 3.000s | samples: 6 @ 0.500s
  util.gpu: mean 0.0% | p50 0.0% | p95 0.0% | max 0.0%
  util.mem: mean 0.0%
  memory: max used 517 MB / total 23028 MB
  power: mean 15.6 W | max 15.6 W
  temp: max 22 °C
  busy time (est): 0.000s
  util trace: ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
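
To make the summary fields above concrete, here is a minimal sketch of how values such as mean, p50, p95, max and an estimated busy time could be derived from raw utilization samples. This is not `profgpu`'s actual implementation: the helper names `percentile` and `summarize`, the nearest-rank percentile rule, and the busy-time formula (sum of utilization fractions multiplied by the sampling interval) are all assumptions chosen for illustration.

```python
import math
from statistics import mean


def percentile(values, q):
    """Nearest-rank percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    # Nearest-rank: the smallest value with at least q% of samples at or below it.
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(util_samples, interval_s):
    """Summarize GPU utilization samples given in percent (0..100)."""
    return {
        "mean": mean(util_samples),
        "p50": percentile(util_samples, 50),
        "p95": percentile(util_samples, 95),
        "max": max(util_samples),
        # Assumed estimate: each sample stands for one interval of wall time,
        # weighted by the fraction of that interval the GPU was busy.
        "busy_s": sum(u / 100.0 for u in util_samples) * interval_s,
    }


# Hypothetical samples; an idle GPU like the one above would be all zeros.
stats = summarize([0, 0, 40, 80, 100, 20], interval_s=0.5)
print(stats)
```

With six samples of 0% at a 0.5 s interval, every field in this sketch comes out as zero, which is consistent with the idle summary printed above.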


In [3]:
import time

from profgpu import GpuBackendError, GpuMonitor

for backend in ["nvml", "smi"]:
    print()
    print("--- backend:", backend, "---")
    # strict=True raises GpuBackendError if the requested backend is unavailable.
    try:
        with GpuMonitor(backend=backend, interval_s=0.5, strict=True) as mon:
            time.sleep(2)
        print(mon.summary.format())
    except GpuBackendError as e:
        print("backend not available:", e)


--- backend: nvml ---
[GPU 0] NVIDIA A10G
  duration: 2.000s | samples: 4 @ 0.500s
  util.gpu: mean 0.0% | p50 0.0% | p95 0.0% | max 0.0%
  util.mem: mean 0.0%
  memory: max used 517 MB / total 23028 MB
  power: mean 15.6 W | max 15.7 W
  temp: max 22 °C
  busy time (est): 0.000s
  util trace: ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

--- backend: smi ---
[GPU 0] NVIDIA A10G
  duration: 2.000s | samples: 4 @ 0.500s
  util.gpu: mean 0.0% | p50 0.0% | p95 0.0% | max 0.0%
  util.mem: mean 0.0%
  memory: max used 4 MB / total 23028 MB
  power: mean 15.5 W | max 15.6 W
  temp: max 22 °C
  busy time (est): 0.000s
  util trace: ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
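
The two summaries above agree on utilization but report different memory usage (517 MB via NVML, 4 MB via `nvidia-smi`); the cause is not established here. One way to cross-check the `smi` numbers is to query `nvidia-smi` directly in CSV form. The sketch below uses the real `--query-gpu` and `--format=csv,noheader,nounits` options; the helper names `parse_smi_csv` and `query_smi` are hypothetical, and this is not necessarily how `profgpu` invokes the tool.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi; with `nounits`, memory values are in MiB.
QUERY_FIELDS = ["utilization.gpu", "memory.used", "memory.total"]


def parse_smi_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output, one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.split(",")]
        row = {}
        for name, cell in zip(QUERY_FIELDS, cells):
            try:
                row[name] = float(cell)
            except ValueError:
                # Unsupported fields are printed as text such as "[N/A]".
                row[name] = None
        rows.append(row)
    return rows


def query_smi():
    """Run nvidia-smi once and return parsed rows, or [] if it is not on PATH."""
    smi_path = shutil.which("nvidia-smi")
    if smi_path is None:
        return []
    out = subprocess.check_output(
        [
            smi_path,
            "--query-gpu=" + ",".join(QUERY_FIELDS),
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    return parse_smi_csv(out)


# Parse a sample line shaped like the idle A10G readings shown earlier.
sample = "0, 4, 23028\n"
print(parse_smi_csv(sample))
```

On a machine with a GPU, calling `query_smi()` returns live values that can be compared against the `smi` backend summary.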
