Description
Is this a duplicate?
- I confirmed there appear to be no duplicate issues for this bug and that I agree to the Code of Conduct
Type of Bug
Performance
Component
Not sure
Describe the bug
We are working on training generative AI models.
We have noticed a massive slowdown on Windows compared to Linux when doing large data transfers between RAM and GPU.
The gap is so large that Linux runs 2x faster than Windows, or even more, on the same GPU (RTX 5090).
You can read more here: kohya-ss/musubi-tuner#700
It turns out that if we enable TCC mode on Windows, the speed matches Linux.
However, NVIDIA blocks TCC mode on consumer GPUs at the driver level.
I found a Chinese article showing that, by patching nvlddmkm.sys to change just a few characters, TCC mode becomes fully functional on consumer GPUs.
So my question is: why can't we get Linux speed on Windows?
Moreover, it seems Microsoft has added a feature for this: MCDM
https://learn.microsoft.com/en-us/windows-hardware/drivers/display/mcdm-architecture
Why is it still not available for consumer GPUs?
How can we solve this slowness on Windows compared to Linux?
Thank you so much
How to Reproduce
Run a large data transfer between GPU and RAM on the same GPU, and compare the speed on Windows and Linux
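A minimal sketch of such a comparison, assuming PyTorch with CUDA support is installed (the issue does not say which tool was used for the measurements, so PyTorch is an assumption here). The names `gib_per_s` and `measure_transfers`, the 1 GiB buffer size, and the repeat count are hypothetical choices for illustration. Running the same script on Windows and on Linux with the same GPU should give comparable numbers if the two platforms behave the same.

```python
import time


def gib_per_s(num_bytes: int, seconds: float) -> float:
    """Convert a transfer of num_bytes that took `seconds` into GiB/s."""
    return num_bytes / seconds / (1024 ** 3)


def measure_transfers(size_mib: int = 1024, repeats: int = 10, pinned: bool = True):
    """Time host-to-device and device-to-host copies.

    Returns (h2d, d2h) in GiB/s, or None if PyTorch with CUDA is unavailable.
    Assumption: PyTorch is the measurement tool; the issue does not specify one.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    num_bytes = size_mib * 1024 * 1024
    # Pinned (page-locked) host memory allows asynchronous DMA transfers.
    host = torch.empty(num_bytes, dtype=torch.uint8, pin_memory=pinned)
    device = torch.empty(num_bytes, dtype=torch.uint8, device="cuda")

    def timed(copy) -> float:
        copy()  # warm-up so one-time setup cost is not measured
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(repeats):
            copy()
        torch.cuda.synchronize()  # wait for all queued copies to finish
        return (time.perf_counter() - start) / repeats

    h2d = timed(lambda: device.copy_(host, non_blocking=True))
    d2h = timed(lambda: host.copy_(device, non_blocking=True))
    return gib_per_s(num_bytes, h2d), gib_per_s(num_bytes, d2h)


if __name__ == "__main__":
    result = measure_transfers()
    if result is None:
        print("PyTorch with CUDA is not available; nothing measured.")
    else:
        print(f"host -> device: {result[0]:.2f} GiB/s")
        print(f"device -> host: {result[1]:.2f} GiB/s")
```

Trying both `pinned=True` and `pinned=False` may help narrow down whether the gap comes from pageable-memory staging or from the transfer path itself.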
Expected behavior
The same speed on Windows as on Linux
Operating System
Windows 11
nvidia-smi output
Microsoft Windows [Version 10.0.26200.7019]
(c) Microsoft Corporation. All rights reserved.
C:\Users\Furkan>nvidia-smi
Sun Nov 2 13:18:29 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 581.57                 Driver Version: 581.57         CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5090      WDDM  |   00000000:01:00.0 Off |                  N/A |
|  0%   40C    P8             10W /  575W |     641MiB /  32607MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090 Ti   WDDM  |   00000000:0E:00.0  On |                  Off |
| 30%   52C    P0            102W /  450W |    8847MiB /  24564MiB |      4%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
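The Driver-Model column in the table above (WDDM for both GPUs) can also be read programmatically, which may be handy when comparing machines. The following is a sketch that assumes `nvidia-smi` is on the PATH and uses its `driver_model.current` query field; the helper names `parse_driver_models` and `query_driver_models` are hypothetical. On Linux the field is reported as not applicable.

```python
import csv
import io
import subprocess


def parse_driver_models(csv_text: str) -> dict:
    """Map 'index: name' to the driver model from nvidia-smi CSV output."""
    models = {}
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        fields = [f.strip() for f in row]
        if len(fields) == 3:  # skip blank or malformed lines
            index, name, model = fields
            models[f"{index}: {name}"] = model
    return models


def query_driver_models() -> dict:
    """Ask nvidia-smi for the current driver model of every GPU."""
    out = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,name,driver_model.current",
            "--format=csv,noheader",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_driver_models(out)


if __name__ == "__main__":
    try:
        for gpu, model in query_driver_models().items():
            print(f"{gpu} -> {model}")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"Could not query nvidia-smi: {exc}")
```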