# Milvus GPU on WSL2 — Verified Setup (Docker Compose)

This runbook captures a clean setup of Milvus Standalone **with GPU acceleration** on Windows 11/WSL2 using Docker Desktop. It reflects the current Milvus documentation for GPU deployments and adds the extra checks needed to confirm that work actually lands on the GPU.

## 1. Check the official GPU prerequisites

Milvus requires specific NVIDIA hardware and drivers before you ever pull a container:

- GPU compute capability must be 6.0, 7.0, 7.5, 8.0, 8.6, or 9.0. Use NVIDIA's tables to verify your card.
- Install a recent NVIDIA driver (545+) on Windows and inside WSL, and add the **NVIDIA Container Toolkit** so Docker can forward the GPU into containers.
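- Validate the driver stack with `modinfo nvidia | grep "^version"` inside WSL; it should report the same version you installed. If WSL2 reports that the module cannot be found (the GPU is exposed through the Windows driver rather than a Linux kernel module), run `nvidia-smi` and check the driver version it reports instead.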

See the "Requirements for Installing Milvus with GPU" page for the full prerequisite checklist and driver guidance. Keep that page handy if you need to troubleshoot a driver mismatch later.
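
As a rough aid for the compute-capability bullet, the sketch below checks a capability string against the list above. It assumes a driver recent enough that `nvidia-smi` supports the `compute_cap` query field. The function names are illustrative and are not part of Milvus or NVIDIA tooling.

```python
import shutil
import subprocess

# Compute capabilities listed in the prerequisites above.
SUPPORTED_CAPS = {"6.0", "7.0", "7.5", "8.0", "8.6", "9.0"}


def is_supported(compute_cap: str) -> bool:
    """Return True if the capability string (e.g. '8.6') is in the list."""
    return compute_cap.strip() in SUPPORTED_CAPS


def query_compute_caps() -> list[str]:
    """Ask nvidia-smi for each GPU's compute capability.

    Assumption: nvidia-smi is on PATH and understands the compute_cap field.
    Returns an empty list when the tool is missing or the query fails.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

Calling `[(cap, is_supported(cap)) for cap in query_compute_caps()]` inside WSL gives one entry per visible GPU; an empty result means the query itself did not work and the driver checks above need attention first.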

> **Tip:** On Windows, keep the standard GeForce/RTX driver current. Inside WSL2, use the Ubuntu packages that match that version (example below). If the versions diverge, GPU passthrough frequently fails.

## 2. Baseline the WSL2 environment

From an elevated PowerShell session:

```powershell
wsl --status
wsl -l -v
```

Then, in Ubuntu (WSL2):

```bash
# Kernel + systemd
uname -r
cat /proc/version
systemctl is-system-running || true

# Docker Desktop integration
docker version
docker info | head -n 20
```

If Docker is unreachable, open **Docker Desktop → Settings → Resources → WSL Integration** and enable your Ubuntu distro.

## 3. Install/refresh NVIDIA packages inside WSL2

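The Milvus docs recommend installing the headless driver build plus the container toolkit on Ubuntu 22.04+. Two caveats apply here. First, `nvidia-container-toolkit` is distributed through NVIDIA's apt repository, so configure that repository (see NVIDIA's installation guide) if `apt-get` cannot find the package. Second, NVIDIA's CUDA on WSL guide says the Windows driver already exposes the GPU inside WSL2 and advises against installing a Linux driver in the distro; if `nvidia-smi` already works in Ubuntu, consider leaving out the two driver packages.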

```bash
sudo apt-get update
sudo apt-get install --no-install-recommends -y \
  nvidia-headless-545 nvidia-utils-545 \
  nvidia-container-toolkit

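# Verify the driver (fall back to nvidia-smi if WSL2 has no nvidia kernel module)
modinfo nvidia | grep "^version" || nvidia-smi

# Verify the container toolkit CLI is installed
nvidia-ctk --version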
```

If you use a different distribution, follow NVIDIA's official installation guide for the container toolkit and pick the matching driver package. After installation, restart WSL (`wsl --shutdown`) so the new kernel modules load.

## 4. Confirm Docker sees the GPU

Back in Ubuntu, run the CUDA base image to ensure `nvidia-smi` works inside containers:

```bash
docker run --rm --gpus all nvidia/cuda:12.5.1-base-ubuntu24.04 nvidia-smi
```

You should see your GPU(s) listed with the correct driver version. If this fails, revisit the driver and container toolkit setup before proceeding.

## 5. Download the official GPU docker-compose file

Create a workspace and pull the compose file published with the Milvus release you are targeting:

```bash
mkdir -p ~/milvus && cd ~/milvus
wget https://github.com/milvus-io/milvus/releases/download/v2.6.5/milvus-standalone-docker-compose-gpu.yml -O docker-compose.yml
```

Edit `docker-compose.yml` so the `standalone` service reserves the GPU IDs you want Milvus to use:

```yaml
standalone:
  ...
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            capabilities: ["gpu"]
            device_ids: ["0"]        # one GPU
```

Milvus supports multiple GPUs; list them as `device_ids: ["0", "1"]` if you want to expose two cards. This mirrors the guidance in the Milvus GPU installation manual.

> **Optional:** If you need to further constrain GPU visibility at runtime, wrap the start command with `CUDA_VISIBLE_DEVICES=0` (or similar); the official docs note this as the supported override.

## 6. Launch Milvus with GPU support

From the same directory:

```bash
docker compose up -d
docker compose ps
docker logs -f milvus-standalone
```

You should see `milvus-etcd`, `milvus-minio`, and `milvus-standalone` in the **Up** state. Keep the logs open until Milvus reports it has started services; seeing only `tini` means the wrong image was pulled.
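
If you would rather script the wait than watch the logs, here is a minimal polling sketch. It assumes the compose file publishes Milvus's health endpoint at `http://127.0.0.1:9091/healthz`; check the `ports` section of your `docker-compose.yml` and adjust the URL if it differs. `wait_until_healthy` is a hypothetical helper name.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(url: str, timeout_s: float = 120.0, interval_s: float = 2.0) -> bool:
    """Poll `url` until it answers HTTP 200 or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Not reachable or not healthy yet; fall through and retry.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For example, `wait_until_healthy("http://127.0.0.1:9091/healthz")` returns `True` once the endpoint responds and `False` if it never does within two minutes. A `False` result is a cue to go back to `docker logs` rather than proof of a specific failure.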

## 7. Verify the GPU inside the Milvus container

Check that the Milvus container has access to the GPU device:

```bash
# `docker compose exec` takes the service name (standalone), not the container name
docker compose exec standalone nvidia-smi
```

The GPUs should appear under `GPU` with matching driver/runtime versions. If the `Processes` table is empty, that's expected while Milvus is idle.

## 8. Drive an actual GPU workload (Python quick test)

Milvus only touches the GPU while building or querying GPU-backed indexes (for example `GPU_CAGRA`, `GPU_IVF_FLAT`, `GPU_IVF_PQ`, or `GPU_BRUTE_FORCE`). Use the snippet below to create a collection, insert data, build a `GPU_CAGRA` index, and run a search. The index build/search phases will momentarily show up in `nvidia-smi`.

```bash
python - <<'PY'
import numpy as np
from pymilvus import (
    Collection, CollectionSchema, DataType, FieldSchema,
    connections, utility,
)

# Connect to the standalone Milvus instance
connections.connect(host="127.0.0.1", port="19530")

# Drop any leftover collection from previous runs
COLLECTION_NAME = "gpu_quickstart"
if utility.has_collection(COLLECTION_NAME):
    utility.drop_collection(COLLECTION_NAME)

# Define schema
fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema(name="emb", dtype=DataType.FLOAT_VECTOR, dim=384),
]
schema = CollectionSchema(fields, description="GPU validation demo")

# Create collection and insert random vectors
collection = Collection(name=COLLECTION_NAME, schema=schema)
np.random.seed(42)
N = 20000
ids = np.arange(N).tolist()
embeddings = np.random.random((N, 384)).astype(np.float32)
collection.insert([ids, embeddings])
collection.flush()  # seal the inserted rows before building the index

# Build a GPU index (GPU_CAGRA) and load it
collection.create_index(
    field_name="emb",
    index_params={
        "index_type": "GPU_CAGRA",
        "metric_type": "L2",
        "params": {"graph_degree": 32},
    },
)
collection.load()

# Run a search to exercise the GPU
search_vec = embeddings[:5]
results = collection.search(
    data=search_vec,
    anns_field="emb",
    param={"metric_type": "L2", "params": {"search_width": 8}},
    limit=5,
)
print(f"Search completed, hits for first query: {[hit.id for hit in results[0]]}")

# Clean up so the next run starts fresh
collection.drop()
PY
```

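While the index build or the search runs, execute `watch -n 1 nvidia-smi` in another terminal; Milvus should briefly appear in the process list with GPU memory usage. On WSL2 the process list may stay empty even under load, in which case watch the memory-usage and utilization columns instead.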
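
To capture that activity without watching the screen, the sketch below samples GPU memory and utilization through `nvidia-smi`'s CSV query mode. `parse_gpu_sample` and `sample_gpus` are illustrative helper names, and the code assumes `nvidia-smi` is on `PATH`. Fields that the platform reports as non-numeric (for example `[N/A]`) come back as `None`.

```python
import shutil
import subprocess
from typing import Optional

Sample = tuple[Optional[int], Optional[int], Optional[int]]


def _to_int(field: str) -> Optional[int]:
    """Convert a CSV field to int, or None when it is not a plain number."""
    field = field.strip()
    return int(field) if field.isdigit() else None


def parse_gpu_sample(line: str) -> Sample:
    """Parse one row of 'index, memory.used (MiB), utilization.gpu (%)'."""
    index, mem_used, util = (_to_int(part) for part in line.split(","))
    return index, mem_used, util


def sample_gpus() -> list[Sample]:
    """Take one sample per GPU; returns [] if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,memory.used,utilization.gpu",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [parse_gpu_sample(row) for row in result.stdout.splitlines() if row.strip()]
```

Comparing a `sample_gpus()` reading taken while Milvus is idle with one taken during the index build is one way to see whether GPU memory use rises; a flat reading suggests the work is not reaching the GPU.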

## 9. Optional tuning (memory pool)

If you need to adjust GPU memory reservations, copy `/milvus/configs/milvus.yaml` out of the container, edit the `gpu.initMemSize` and `gpu.maxMemSize` values, then copy the file back and restart the container. This mirrors the memory-pool guidance in the Milvus GPU install manual.

## 10. Shutdown / restart

When finished:

```bash
docker compose down
```

To restart later, re-open Ubuntu, `cd ~/milvus`, and run `docker compose up -d` again. Verify with `docker compose ps` before reconnecting from Python.