This is Broken #1
Thanks for the report, I suspect this is related to the overdrive changes in stable kernel 5.5, will try to test it as soon as possible. Can you try to disable amdgpu.ppfeaturemask (as (power)upp does not rely on that)? |
I also see a conflict between |
@azeam but if you disable |
And as far as
|
Same with |
With
|
Actually it doesn't even do anything at all without |
At work now, will get back with a longer reply later tonight, but try reading the current values with |
Wait yeah it does. Still doesn't work man:
It seems this program is just broken |
I've now updated to stable kernel 5.5 (from rc2) but I'm not able to reproduce this on my 5700 XT, (in fact nor any other issues even with OverDrive enabled as far as I can tell). I still don't get the:
values, like you have on your 5600, so something is different with the OverDrive implementation, either between our systems or the way the 5600/5700 XT cards are working. This is of less importance with the OverDrive turned off though, just a remark. But as @sibradzic noted above there still seem to be issues with the OverDrive settings in combination with the pp table, so keep OverDrive disabled. Powerupp checks for the pp table revision number and the only one implemented is "12", so our pp tables should be constructed the same, and from the information you have given the application also seems to (read and) change the expected parameters (even if the results aren't). A few notes: The Have you checked the performance? If the card is dropping to 300 MHz there should be a noticeable performance drop when you change the value from 1780 to 1781.
Powerupp does not read any actual clock speeds, it only reads the values that are set in the pp table (using upp). It is however on top of my to-do list to add some simple monitoring feature.
That is expected, if you successfully apply values and then load them they should appear the same. Powerupp only reads the values set in the pp table.
It is not actually a bash script as in a file on the system, but it sends a couple of bash commands under the same pkexec (kdesu) prompt (to avoid having to type the password multiple times): including the upp commands containing the values entered to write to the pp table and also a write to the hwmon power limit (as the pp table power limit is oddly implemented). If you do a "persistent save" it will however create a bash script (containing basically the same things as when applying) in |
This is actually expected when your GPU is idle. Are you sure you are actually putting your GPU under any load when you are checking these values? Try running this little monitoring script in a terminal, before running some game or 3D test, in a window (or full-screen on another monitor, in case you have more than one):
and start it with Now start some GPU load and check those values changing (and change they should, regardless of whether you have any over/under clock/volt applied). MCLK values should fluctuate even if you do simple things on your desktop, like moving a window around, for example... |
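The monitoring script referred to in the comment above did not survive in this copy of the thread. As a stand-in, here is a minimal sketch of the same idea, not the original script. It assumes the GPU is `card0` and relies on the amdgpu `pp_dpm_sclk` and `pp_dpm_mclk` sysfs files, which list one state per line and flag the active one with `*`. The names `active_state` and `monitor` are made up for illustration.

```python
import re
import time
from pathlib import Path

# Assumption: the GPU is card0; adjust on multi-GPU systems.
DEVICE = Path("/sys/class/drm/card0/device")

# Matches lines such as "1: 800Mhz *" (the '*' marks the active state).
STATE_RE = re.compile(r"\s*(\d+):\s*(\d+)\s*MHz\s*(\*)?", re.IGNORECASE)


def active_state(dpm_text):
    """Return (index, MHz) of the state flagged with '*', or None."""
    for line in dpm_text.splitlines():
        match = STATE_RE.match(line)
        if match and match.group(3):
            return int(match.group(1)), int(match.group(2))
    return None


def monitor(interval=1.0):
    """Print the active SCLK/MCLK states until interrupted with Ctrl+C."""
    while True:
        sclk = active_state((DEVICE / "pp_dpm_sclk").read_text())
        mclk = active_state((DEVICE / "pp_dpm_mclk").read_text())
        print(f"sclk={sclk} mclk={mclk}")
        time.sleep(interval)
```

Calling `monitor()` in one terminal while starting a 3D load in another should show the active state index moving away from 0 if power management is behaving normally.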
It's not the load. All I have to do is run |
Have you tried forcing performance level without any overclocking applied (regardless if it's powerupp, or just upp or pp_od_clk_voltage)? The above totally do work on my 5700 in any 5.5rcX or 5.5 final release, without any additional patches, and regardless of how I modify Do you have the latest radeon firmware binaries deployed? |
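For reference, forcing the performance level goes through the amdgpu sysfs file `power_dpm_force_performance_level`. The sketch below is only an illustration, assuming the GPU is `card0` and that the script runs as root. The helper names are hypothetical, and the list of levels is the commonly documented set, which may differ between kernel versions.

```python
from pathlib import Path

# Assumption: the GPU is card0.
LEVEL_FILE = Path("/sys/class/drm/card0/device/power_dpm_force_performance_level")

# Commonly documented amdgpu levels; treat this list as an assumption.
VALID_LEVELS = {
    "auto", "low", "high", "manual",
    "profile_standard", "profile_min_sclk",
    "profile_min_mclk", "profile_peak",
}


def is_valid_level(level):
    """Return True if the level name is one this sketch knows about."""
    return level in VALID_LEVELS


def set_performance_level(level, path=LEVEL_FILE):
    """Write the requested level to sysfs (needs root privileges)."""
    if not is_valid_level(level):
        raise ValueError(f"unknown performance level: {level!r}")
    Path(path).write_text(level + "\n")


def get_performance_level(path=LEVEL_FILE):
    """Read back the level the driver currently reports."""
    return Path(path).read_text().strip()
```

Setting `"high"` with no overclock applied and then reading the clocks back is a quick way to separate a power-management problem from an overclocking problem.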
Changing the performance level to high in Now we're getting somewhere: I pulled down the
You'll notice up there, that the frequency table starts with state 0 at 300Mhz, then goes to state 1 at 1780, and down from there. This is everything default, no
as does `radeon-profile`. So, I took your advice and ran something to stress the gpu to make sure they weren't just inaccurately reported for some reason. So I ran unigine heaven, and unless you think 8 or 9 fps on an RX 5600 XT sounds right, then no, they're definitely being reported correctly and the clocks actually DO get set to 300Mhz. I reapplied the default stock card settings and ran unigine heaven again, at the reported 1780MHz, and sure enough, I was back up to 78-79 fps average. So this isn't a reporting error, running powerupp (and upp as well, which would make sense) and making it anything above the stock clock frequency forces the clock to run at 300MHz. So, naturally my next thought was to try to change the 300MHz state 0, right? Bad idea. Running
At which point the entire So obviously that 0 state can't be modified, or something. Now again, all of this is WITHOUT
Regarding that comment, I'm not sure what you're referring to. I wasn't trying to illustrate that the card was running at 800MHz in the quote you were replying to there, I was pointing out that if you set the core clock to anything 1780MHz and BELOW, then it worked fine, and Not really sure where to go from here, I mean maybe it's something to do with the new kernel patch that they added (which is why my |
UPDATE: This is a crazy coincidence, but I was commenting in the comments section on the Phoronix article about the 5600 XT and the new firmware, and I was asking about something to do with memory clocks completely unrelated to overclocking, and someone who is apparently an engineer replied "If you go outside the firmware's limits the clock defaults to 300 MHz. That matches the performance Michael was seeing." So it sounds like for some reason the way that upp tries to edit |
OK, so setting anything larger than 1780 is making the amdgpu driver power-management go nuts for you? Have you tried to see if there is anything significant in the kernel log (
Sorry, I have no clue what radeonjet is, but I guess that should match the output of
Indeed :) Lowest state clocks for both GPU & VRAM are not meant to be changed at all. Hence the very unpredictable driver behaviour or just hang.
It looks to me that your card firmware is blocking your max FreqTableGfx/1 clock. Try setting a lower clock to confirm if the pp_table interface works at all in the first place (you may also try changing all instances of 1780 to 1800 for example, just for the lulz). Then make sure you have the latest VBIOS as well as the latest firmware, check https://www.phoronix.com/scan.php?page=news_item&px=Ubuntu-19.10-Radeon-RX-5700. As I see no report of an issue similar to yours on any 5700 cards, my gut feeling is telling me this is totally about 5600XT firmware / VBIOS. Are you running the factory VBIOS (the one with the AMD pre-announced lower clocks) or the one after the card was released? btw, can you please share your pp_table, in its raw form? |
https://people.freedesktop.org/~agd5f/radeon_ucode/navi10/new/navi10_smc.bin oh, wait, you are there already... |
That's literally the link that I posted above, it's the same comments section. I already have that firmware, I got it from the devs days ago. Also, there IS no "after the card was released" vBIOS for the Sapphire Pulse, Sapphire actually flashed the new vBIOS on all of their 5600 XTs in North America before launch, which is why my stock frequency is 1780Mhz instead of 1650 or 1675 or whatever, which was the original one. But, I do have a copy of the original vBIOS but that would be useless because the new firmware is for the new vBIOS, and the old vBIOS has lower limits than the new one. Also yes, if you read my original comments, like I said if you set it to anything under 1780 in upp or powerupp it does in fact work. Going over 1780, though, does not. And from the engineer in the Phoronix forum's comments, it sounds like it's something to do with the way upp tries to change clock speeds which violates the firmware's settings, as opposed to |
Well, I guess you have your answer there, it's a firmware limitation. I agree that it's odd not being able to increase the clock to 1820 MHz with pp table though. Anyway this is not an issue with neither powerupp nor upp, they are doing what they are supposed to (i.e. reading and adjusting the pp table) afaict, but I find it interesting and would like to know more so I'll keep the issue open for a while if there's more information to be had.
Does this mean that if you keep the Gfx clock at 1780 you can increase the memory clock without anything breaking or does the Gfx clock drop to 300 if you increase the memory clock? What about Gfx voltage, can that be increased (if you keep the clock at 1780)? I noticed that you experienced similar issues earlier in Manjaro. Was this without any overclocking applied and did you solve that? |
What upp does is simply change a value in pp_table, and it does its job correctly, as you have already demonstrated. It is the amdgpu driver logic that processes the Power Play changes when tables are changed, and re-applies all the clock/voltage parameters from scratch (basically, modifying pp_table causes driver power management to be completely re-initialized). I guess this re-init would fail with an "unexpected" setting in Power Play. On the other hand, the sysfs API clock change does not re-init everything, it just triggers a clock change in the driver logic, which is likely the reason for the success with setting the clock above 1780. Since you have both old & new Sapphire Pulse 5600XT vBIOS files, can you please share? I totally need them for comparing Power Play tables and double-checking if upp works as expected on both. |
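To make the "sysfs API" path concrete: on Navi cards the OverDrive file `pp_od_clk_voltage` accepts short text commands (`s 1 <MHz>` for the maximum core clock, `m 1 <MHz>` for the maximum memory clock, `c` to commit). The sketch below only builds those command strings and checks them against a range. The helper name is hypothetical, and the default ranges are the `OD_RANGE` values reported for this particular 5600 XT, not general limits.

```python
def build_od_commands(sclk_mhz=None, mclk_mhz=None,
                      sclk_range=(800, 1820), mclk_range=(625, 930)):
    """Return the strings to write, one at a time, to pp_od_clk_voltage.

    The default ranges are taken from the OD_RANGE output quoted in this
    issue; read them from your own card instead of trusting these numbers.
    """
    commands = []
    if sclk_mhz is not None:
        low, high = sclk_range
        if not low <= sclk_mhz <= high:
            raise ValueError(f"SCLK {sclk_mhz} MHz outside {low}-{high} MHz")
        commands.append(f"s 1 {sclk_mhz}")  # state 1 = maximum core clock
    if mclk_mhz is not None:
        low, high = mclk_range
        if not low <= mclk_mhz <= high:
            raise ValueError(f"MCLK {mclk_mhz} MHz outside {low}-{high} MHz")
        commands.append(f"m 1 {mclk_mhz}")  # state 1 = maximum memory clock
    if commands:
        commands.append("c")  # commit the staged values
    return commands
```

Each returned string would be written separately (as root, with OverDrive enabled) to `/sys/class/drm/card0/device/pp_od_clk_voltage`. Unlike a pp_table write, this path does not trigger the full power-management re-initialization described above.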
Yes. Memory overclocking worked. I don't know about the voltages, because Navi doesn't have a voltage for each state, only a voltage curve, and I don't feel comfortable in my knowledge of the 5600 XT safe voltages to test out raising voltage limits. I've tried lowering them, and that works.
No, but I never tested the new firmware or anything like that. After that very initial testing, I just went back to Arch since the card was working fine there, and I haven't used Manjaro since, I've just been using Arch. I'll try it out later today though and use the new firmware and see if anything is fixed. I imagine it was the firmware issue though.
That's what I'm saying. It looks like editing pp_table is the issue, as apparently that causes the firmware to freak out. It does seem like this is something due to the firmware, like I said, BUT I wouldn't say it's an "issue" with the card OR with upp/powerupp, it sounds like upp just isn't compatible with this card. But anyway, I'll upload the new and original versions of the performance vBIOS if you want: Details on which is which are in the README in the zip |
Ok, so possibly the only firmware limit is the maximum target frequency. It could be possible to do some workarounds only for 5600 XT in powerupp by setting the target frequency using OverDrive instead of the pp table, but, will consider it... Here is something you can try: First enable OverDrive ( In terminal (with proper path to upp, and yes it's supposed to be 1830, or anything above 1820 at least):
Another thing to note is that the target frequency and the actual working frequency of the GPU are not (always) the same, meaning that in order to actually get the card running at clocks higher than 1820 you would probably have to increase the voltage (for example I can only run at a maximum of 30 MHz below the target frequency and 80 below the OverDrive max with stock settings on my 5700 XT, so a bit surprising that you can run at 1820 without increasing the voltage on your 5600 XT). But I think it would break to 300 MHz when increasing the OverDrive limit if it affects some firmware limit. In case it works (
|
Doing the above doesn't cause an error or anything, but it has no actual effect other than changing the max clock in
So the As requested:
And I would like to say I really appreciate you guys helping try to get this work, even if powerupp and upp seem to be incompatible with the 5600 XT. Maybe we can get it working but even if not, it's very much appreciated. |
You can try increasing the |
And if not, in terminal I believe it is |
But, have you not been able to get the card running at 1820 MHz before (regardless of method)? I was under that impression but maybe I made that up myself. If not, I would guess that it needs more voltage to run higher than 1780 MHz. |
Just FYI, both 5600XT VBIOSes pp_tables are fully decode-able and modifiable by upp, no issue there. Maybe @gardotd426 may find this diff between Power Play settings between old and new VBIOS interesting:
Note the |
@azeam, yes I've been able to get the overclocking working with other methods including @sibradzic, setting Also, the vBIOS memory dump thing might be because on the new vBIOS that I'm currently using, I had a default memory overclock to 900MHz set by default in Funnily enough, it seems that But yeah, if it's possible to run two commands at once with |
(One line) Also make sure there are no profiles auto-loaded with radeon-profile or CoreCtrl when testing upp. I don't think that is the cause here but I've noticed some weird things, even without patches and with OverDrive disabled, so it's safer to keep them off for troubleshooting purposes. For example, with OverDrive disabled on my system I can set the CoreCtrl clock slider to 300 MHz and this will cause the card to lock to "manual" dpm performance level and it will stay at state 0 (300 MHz)/manual even if I try to overwrite the performance level manually or change the pp table. As for how the OverDrive overclocking works I'm not very familiar with that, but did you try setting |
Seems like new firmware was released today https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=b791e15d3e0ac2705eaa7965ed9b6d4c85fef2a2 |
It does absolutely nothing to help Manjaro, so it's not a firmware issue.
Manjaro is still forced at 300MHz, and even trying to load defaults with
like powerupp or anything doesn't work. I'm about to file a bug report with
Manjaro because this has been an issue since I got the card, regardless of
which vBIOS or firmware I used. And for some reason, the new firmware isn't
available on vanilla Arch yet. But like I said I already downloaded the
firmware from the link from the devs and it helped with performance on the
new vBIOS but that's the only change it made, I've had the new firmware
since before I even filed this issue.
…On Tue, Feb 4, 2020 at 2:54 PM azeam ***@***.***> wrote:
Seems like new firmware was released today
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=b791e15d3e0ac2705eaa7965ed9b6d4c85fef2a2
|
Is it "the same kind of stuck at 300 MHz" you get in Manjaro as when setting the clock >1780 with upp, i.e. radeonjet etc. show all three states at 300 MHz? Do try the one line triple upp command above (in Arch), in case it helps it would be great. |
No, it's stuck at 300MHz in Manjaro right out of the box no matter what. Also no, I'm in Arch right now, but from what I remember from earlier it was like state 1 was 300MHz, state 2 was 850 or 800MHz, and state 3 was 300MHz. I'll check again here in a bit, I'm cloning my Manjaro install to back it up and then I'm gonna install Pop OS where Manjaro was, to see if it happens on Ubuntu-based distros as well, plus I wanna see if the same rendering issues I'm having in RE2 (and RE7, apparently. I just installed it today, and same thing happens) also happen in Pop. Also, no such luck:
|
The restrictive limits with the 5600 XT seem to apply under Windows as well. Did you try to increase the OverDrive limits above 1820
and then overclock above 1820 in CoreCtrl/radeon-profile? |
No luck:
Then after setting the overclock to 1830 in radeon-profile:
But:
And that's confirmed, radeon-profile shows 300 as well (and again we've figured out that those are indeed accurate numbers already). Also, that shouldn't even matter anyway because the issue was never that I couldn't overclock using |
And just to confirm, I lowered the range in
|
Thanks. Yes, I know this is a different matter, I was just curious if it is possible to increase the OverDrive limits under Linux, but it seems to be the same as in Windows (I believe what you did now is what MorePowerTool does), as suspected. I don't know for sure why it won't allow the pp table to be set above 1780 but the explanation by @sibradzic makes sense. |
My bad, I wasn't trying to insinuate that you like, didn't grasp it, it's just this has been such a long thread I didn't know if maybe it got lost in all the messages, and since as of late we've been trying all sorts of stuff it seemed like a possibility. I know you know what you're doing lol. I love Linux and open source, so I'm happy to try anything to help out, since the 5600 XT is so brand new and I'm probably one of the very few people that is using Linux, has a 5600 XT, AND is wanting to overclock, so I suppose in this instance I can actually be somewhat useful in my contributions, I just wish I knew more so I could try and help out more than I'm currently able to. |
No worries, it's interesting to find out more about this card. It's a pity that it doesn't allow the full potential of the pp table, hopefully it will change in the future. I don't think I will add any workarounds by setting the clock in a different way in powerupp, at least for now. It would basically mean just as much hassle (if not more, by complicating the code maintenance and dependencies even for other cards, depending on implementation) as using some other software for setting the OverDrive clock frequency (and powerupp for the other things, to the extent they are adjustable), as is possible now. If it would have been possible to increase the OverDrive limits it would have made more sense to do it, imho, but it seems like the OverDrive restrictions also apply to the pp table so it wouldn't add anything that is not possible to do with other software. I will add some of the information we've gathered in the readme at least. Closing this issue now but please let me know if there are any changes later on! (On a totally unrelated note, I noticed in your initial screenshot that the memory dpm selection radiobuttons are not displayed on your system as intended. The positioning of certain GTK elements is for some reason different between different systems, and on your system the size of the radiobuttons are smaller than what they appear for me but I haven't figured out how to set it consistently yet. I'm opening an issue for that and will try to work something out). |
I would hold off on looking into that, I'm using i3 and it's probably i3's fault. If I remember correctly, when I was in Plasma it didn't do that. I'll log into a Plasma session at some point today and make sure, at which point you can chalk it up to tiling WM weirdness. i3 has trouble with windows that are supposed to be floating like that, sometimes even if you set them to float. |
I'm on kernel 5.5 on Arch Linux, and on my 5600 XT, if I set the core clock to anything below the stock boost (1780MHz), it correctly applies.
![Screenshot_20200131_230224](https://user-images.githubusercontent.com/54234607/73586671-a56a5a00-447e-11ea-9644-2065e04dc336.png)
`sudo cat /sys/class/drm/card0/device/pp_od_clk_voltage` shows whichever value I set (as do `radeonjet` and `radeon-profile`). However, if I set it to ANYTHING above 1780, even 1781MHz, it breaks. If I have my settings like this:
`sudo cat /sys/class/drm/card0/device/pp_od_clk_voltage` gives me:
OD_SCLK:
0: 800Mhz
1: 300Mhz
OD_MCLK:
1: 900MHz
OD_VDDC_CURVE:
0: 800MHz @ 0mV
1: 550MHz @ 0mV
2: 300MHz @ 0mV
OD_RANGE:
SCLK: 800Mhz 1820Mhz
MCLK: 625Mhz 930Mhz
VDDC_CURVE_SCLK[0]: 800Mhz 1820Mhz
VDDC_CURVE_VOLT[0]: 800mV 1050mV
VDDC_CURVE_SCLK[1]: 800Mhz 1820Mhz
VDDC_CURVE_VOLT[1]: 800mV 1050mV
VDDC_CURVE_SCLK[2]: 800Mhz 1820Mhz
VDDC_CURVE_VOLT[2]: 800mV 1050mV
You'll notice that it makes state 0 800MHz, and the "boost" state, state 1 is 300MHz. I've tried this with a dozen values going all the way up to 1820 (the max of the card). Same thing every time. And it's not a reporting error.
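The symptom described above (the top state reporting a lower clock than state 0) can be checked mechanically. Here is a rough sketch, assuming the `pp_od_clk_voltage` layout shown in this report. The function names are made up for illustration and are not part of upp or powerupp.

```python
import re

# Matches state lines such as "0: 800Mhz" inside a section.
STATE_RE = re.compile(r"(\d+):\s*(\d+)\s*MHz", re.IGNORECASE)


def parse_od_sclk(text):
    """Return {state_index: MHz} for the OD_SCLK section only."""
    states = {}
    in_sclk = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("OD_"):
            # Section header: only OD_SCLK is of interest here.
            in_sclk = line == "OD_SCLK:"
            continue
        if in_sclk:
            match = STATE_RE.match(line)
            if match:
                states[int(match.group(1))] = int(match.group(2))
    return states


def looks_like_fallback(states):
    """True if the highest state is clocked below the lowest state."""
    if len(states) < 2:
        return False
    return states[max(states)] < states[min(states)]
```

Fed the output quoted above, `parse_od_sclk` yields state 0 at 800 and state 1 at 300, so `looks_like_fallback` reports the inverted table.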
If I raise the memory clock but keep the core at 1780 (or below), it actually applies correctly. And the thing is, when this happens, everything reports the frequency at 300MHz, except `powerupp`. If I try to apply a value of 1785MHz, click "Apply Current", type my password, and then hit "Load Current", everything in `powerupp` stays the same, so it's not properly reading `/sys/class/drm/card0/device/pp_od_clk_voltage`. This sucks, I was super pumped to find such an easy-to-use GUI, and I tried to look at the code since `kdesu` said my password was needed to run `/usr/bin/bash`, so I figured it was a bash script. But `/usr/bin/powerupp` isn't a bash script and I can't read it (I'm assuming that `powerupp` executes a second bash script but I can't find it).
Yes, I have all the dependencies, and `radeon-profile` will correctly set frequency states.