LACT sometimes completely disables power limit (6900XT) #207

Closed
adolfintel opened this issue Nov 13, 2023 · 6 comments · Fixed by #233

Comments

@adolfintel

If the power limit is set to a value that's too low for the power state the card is currently in, the driver seems to disable the power limit completely. Power consumption then spikes (I've seen it go as high as 420 W, up from a 293 W TDP) until the card leaves that state, at which point the limit is applied correctly.

This is easy to reproduce: set the power limit while running any GPU-intensive application that keeps the card at its highest power state (demonstrated in the video using Stable Diffusion). It could potentially lead to damage when running something like FurMark, which is not memory-bound.

OS: Arch Linux, Kernel 6.6.1, GPU: 6900XT

Here's a video of the problem:
plbug.webm

@ilya-zlobintsev
Owner

As you've noted, this is the behaviour of the driver, over which LACT has no direct control.
However, I think there can be a workaround: if the performance mode were set to "lowest clocks" when a new power limit is applied, and then reverted to whatever it was before (most likely "automatic"), then in theory this issue would be avoided.

Could you test if doing this manually avoids the problem on your system? Set the performance level to lowest together with the new power limit, apply, and then change it back to automatic. If it does help, then I'll implement it to be done automatically when setting the limit.
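
For reference, the manual sequence proposed above could be sketched roughly as follows. This is only an illustration and not LACT's code: it assumes the amdgpu sysfs files `power_dpm_force_performance_level` (accepting `low` and `auto`) and the hwmon file `power1_cap` (in microwatts), and the function name and the directory arguments are hypothetical. Writing to these files normally requires root.

```python
from pathlib import Path


def apply_limit_at_lowest_clocks(device_dir, hwmon_dir, limit_watts):
    """Force the lowest clocks, write the new power cap, then restore the old level.

    device_dir and hwmon_dir are assumed to look like
    /sys/class/drm/card0/device and its hwmon/hwmonN subdirectory.
    """
    level_file = Path(device_dir) / "power_dpm_force_performance_level"
    cap_file = Path(hwmon_dir) / "power1_cap"

    # Remember the current performance level (most likely "auto").
    previous_level = level_file.read_text().strip()

    level_file.write_text("low")
    try:
        # power1_cap is expressed in microwatts.
        cap_file.write_text(str(int(limit_watts * 1_000_000)))
    finally:
        # Restore the previous level even if writing the cap failed.
        level_file.write_text(previous_level)
    return previous_level
```

A call might look like `apply_limit_at_lowest_clocks("/sys/class/drm/card0/device", "/sys/class/drm/card0/device/hwmon/hwmon1", 250)`, where the card and hwmon indices depend on the system.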

@adolfintel
Author

adolfintel commented Nov 14, 2023

Switching to lowest clocks doesn't seem to apply until the card exits the highest power state either. Weird...

Maybe there's some way to "suspend" work on the GPU for a fraction of a second when clicking apply; that would cause it to switch power states.

Also, this might be an issue worth reporting to whoever develops the kernel part of the AMD driver. Regardless of overclocking settings, the card should never disable its power limit.

@ilya-zlobintsev
Owner

I don't think there's a good way to "suspend" the GPU.
Maybe there could be a wait interval until the clock speed or power usage comes down, though I'm not sure it would be a good user experience. I'll see if I can reproduce this on my machine.

Kernel issues are generally reported here: https://gitlab.freedesktop.org/drm/amd

@adolfintel
Author

@ilya-zlobintsev
Owner

@adolfintel if this is still relevant: could you try building the power-limit-clockdown branch and see if it solves the issue? I've added a workaround that waits until the GPU enters the lowest power state before applying the limit.
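
The branch's implementation is not shown in this thread, so the following is only a rough sketch of the general idea of waiting for the lowest power state. It assumes the amdgpu sysfs file `pp_dpm_sclk` lists one clock level per line and marks the active one with a trailing `*`; the function names, timeout, and polling interval are hypothetical.

```python
import time
from pathlib import Path


def active_level(pp_dpm_text):
    """Return the index of the level marked with '*', or None if none is marked."""
    for line in pp_dpm_text.splitlines():
        if line.rstrip().endswith("*"):
            return int(line.split(":", 1)[0])
    return None


def wait_for_lowest_state(device_dir, timeout_s=5.0, poll_s=0.1):
    """Poll pp_dpm_sclk until level 0 is active; return False on timeout."""
    sclk_file = Path(device_dir) / "pp_dpm_sclk"
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if active_level(sclk_file.read_text()) == 0:
            return True
        time.sleep(poll_s)
    return False
```

A caller would write the new power limit only once `wait_for_lowest_state(...)` returns `True`, and could report an error or retry on `False`.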

@adolfintel
Author

@ilya-zlobintsev that works! Good job :)
