[Feature] Blackwell GPU (RTX 50-series, sm_120) support for immich-machine-learning:release-cuda #28582

ekropotin · 2026-05-24T04:32:55Z

I have searched the existing feature requests, both open and closed, to make sure this is not a duplicate request.

Yes

The feature

Opening this as a Discussion rather than an Issue because the previous Issue (#28031) was auto-closed by the duplicate-detection bot with no actual duplicate identified, and my reply asking for the duplicate link went unanswered. A Discussion seems like the right venue to consolidate the problem, the workaround, and the open question on how upstream wants to track this.

Problem

The ghcr.io/immich-app/immich-machine-learning:release-cuda image silently falls back to CPU on Blackwell GPUs (RTX 50-series, compute capability sm_120):

CUDA failure 35: CUDA driver version is insufficient for CUDA runtime version
EP Error: Failed to parse provider option "device_id" ... cudaGetDeviceCount
Falling back to ['CPUExecutionProvider']

Reproduced on an RTX 5060 Ti (driver 580.126.18, CUDA 13.0 capable). Also reported in the review of #28032 and by @yopichy on a separate RTX 50-series setup. The DGX Spark GB10 subthread in #10647 hits the same sm_120/sm_121 family from the arm64 side.
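
Because the fallback is silent, one way to surface it is to scan the container log for the fallback line. The following is a minimal sketch, not part of the image: the function names are hypothetical, and the pattern is taken only from the log excerpt above, so other ONNX Runtime versions may word the message differently.

```python
import re

# Pattern copied from the log excerpt in this post (assumption: the wording
# is stable across the ONNX Runtime versions you care about).
FALLBACK_PATTERN = re.compile(r"Falling back to \[(?P<providers>[^\]]*)\]")


def fallback_providers(log_text: str) -> list[str]:
    """Return the providers named in the first fallback line, or [] if none."""
    match = FALLBACK_PATTERN.search(log_text)
    if match is None:
        return []
    # Split "'A', 'B'" into bare provider names.
    raw = match.group("providers").split(",")
    return [p.strip().strip("'\"") for p in raw if p.strip()]


def fell_back_to_cpu(log_text: str) -> bool:
    """True when the log shows a fallback that leaves only the CPU provider."""
    return fallback_providers(log_text) == ["CPUExecutionProvider"]
```

Feeding it the output of the machine-learning container's log (for example, captured with `docker logs`) gives a yes/no answer that a health check or CI step could act on.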

Root cause

The prod-cuda stage is built on nvidia/cuda:12.2.2-runtime-ubuntu22.04. CUDA 12.2 predates Blackwell — sm_120 support landed in CUDA 12.8 (Feb 2025).
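
The version argument above can be expressed as a small check on the base image tag. This is only an illustrative sketch: the helper names are hypothetical, the 12.8 threshold is the one stated in this post, and the tag format is assumed to start with `major.minor` as in the `nvidia/cuda` tag quoted above.

```python
import re

# Threshold taken from the root-cause note: sm_120 needs CUDA 12.8 or newer.
MIN_BLACKWELL_CUDA = (12, 8)


def cuda_version_from_tag(image: str) -> tuple[int, int]:
    """Extract (major, minor) from e.g. 'nvidia/cuda:12.2.2-runtime-ubuntu22.04'."""
    _, _, tag = image.rpartition(":")
    match = re.match(r"(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"no CUDA version found in image tag: {image!r}")
    return int(match.group(1)), int(match.group(2))


def supports_blackwell(image: str) -> bool:
    """True when the base image's CUDA version meets the sm_120 threshold."""
    return cuda_version_from_tag(image) >= MIN_BLACKWELL_CUDA
```

With the current base image, `supports_blackwell("nvidia/cuda:12.2.2-runtime-ubuntu22.04")` is `False`; a 12.8.x tag passes. Note that this only checks the tag, not what `/usr/local/cuda` ends up pointing at inside the built image.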

Solution

A naive bump to 12.8 doesn't fully work. The prod-cuda stage installs libcudnn9-cuda-12=9.10.2.21-1 (kept at 9.10 because 9.11 drops Pascal). Its apt dependency chain drags in cuda-cudart-12-2 / cuda-libraries-12-2 and runs update-alternatives, so /usr/local/cuda is silently re-pointed at /usr/local/cuda-12.2/, leaving the image effectively on CUDA 12.2 again. apt-get autoremove in the later prod stage re-triggers update-alternatives, so even an explicit ln -sfn or update-alternatives --set in prod-cuda gets overridden.

A working mitigation (currently running in production on my fork): reinstall cuda-cudart-12-8 after cuDNN, apt-mark manual it so autoremove doesn't strip it, set it as the primary alternative, and re-pin the symlink as a guard in the final prod stage. Full Dockerfile and root-cause notes:

Fork: https://github.com/ekropotin/immich
Diff
Image (interim workaround for affected users): ghcr.io/ekropotin/immich-machine-learning
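
Since the re-pointing of /usr/local/cuda described above happens silently, a guard that fails loudly when the symlink resolves to the wrong toolkit can help when trying the bump locally. The sketch below is not taken from the fork; the function names and the `cuda-12.8` directory name are assumptions based on the paths mentioned in this post.

```python
import os


def cuda_symlink_target(link: str = "/usr/local/cuda") -> str:
    """Resolve the link fully, following any update-alternatives indirection."""
    return os.path.realpath(link)


def assert_cuda_pinned(link: str = "/usr/local/cuda",
                       expected_dir: str = "cuda-12.8") -> None:
    """Raise if the link does not end up at the expected toolkit directory."""
    target = cuda_symlink_target(link)
    # Assumption: toolkits live in directories named like 'cuda-12.8'.
    if os.path.basename(target) != expected_dir:
        raise RuntimeError(
            f"{link} resolves to {target}, expected a {expected_dir} directory"
        )
```

Running `assert_cuda_pinned()` as the last step of the final stage (or at container start) would turn a silent revert to 12.2 into a visible failure.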

Asks

Is there a tracking issue or renovate-config follow-up I can subscribe to? @bo0tzz mentioned in #28032 (fix(ml): update CUDA base image to 12.8.1 for Blackwell GPU support) that renovate was holding back the CUDA base image incorrectly; I'd like to follow that thread but didn't find a public reference.
If upstream prefers to keep the base image at 12.2 for some reason I'm missing, would a release-cuda-blackwell variant tag (or build-arg-driven matrix) be acceptable? Happy to draft a PR that conforms to the template / changelog-label rules this time.
The libcudnn9 / cudart apt trap above is non-obvious and bites anyone who tries the version bump locally, so it is worth a note in the ML Dockerfile regardless of which direction this goes.

Platform

Server
Web
Mobile

2026-05-24T04:33:18Z

github-actions[bot]
Bot May 24, 2026

This discussion has automatically been closed as it is likely a duplicate. We get a lot of duplicate threads each day, which is why we ask you in the template to confirm that you searched for duplicates before opening one. If you're sure this is not a duplicate, please leave a comment and we will reopen the thread if necessary.
