[drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed #270

Closed
alicektx opened this Issue May 23, 2017 · 5 comments

alicektx commented May 23, 2017

Hello linrunner,
I'm using TLP 0.9 on Mint 18 with kernel 4.10.0-19-generic, on a Lenovo Ideapad 300 with hybrid graphics (Intel & Radeon).

I get the following error message when the laptop boots on AC:

May 23 02:59:31 Lenovo-300-17ISK kernel: [ 26.336643] [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x850C)=0xCAFEDEAD)
May 23 02:59:31 Lenovo-300-17ISK kernel: [ 26.336690] [drm:si_resume [radeon]] *ERROR* si startup failed on resume

That is with the default, unchanged TLP configuration. What I tried further:

  1. I enabled RUNTIME_PM_BLACKLIST="03:00.0"
    [where 03:00.0 is, according to lspci:
    Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Sun XT [Radeon HD 8670A/8670M/8690M / R5 M330] (rev 83)].
  2. Then, in addition to the above, I also removed 'radeon' from "RUNTIME_PM_DRIVER_BLACKLIST=".
    In both cases the behavior is the same. Only after uninstalling TLP do the boot and the logs appear to be OK again.
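To double-check which of these settings are actually active (uncommented) in the TLP configuration, a minimal Python sketch like the following may help. The function name `active_tlp_settings` and the sample text are hypothetical and only for illustration. The parsing is deliberately simple: it assumes plain `KEY="value"` lines and ignores anything starting with `#`.

```python
def active_tlp_settings(config_text):
    """Return the uncommented KEY=VALUE pairs from TLP-style config text.

    Hypothetical helper: assumes simple shell-style assignments,
    one per line, with optional surrounding quotes.
    """
    settings = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


# Illustrative sample only, not the reporter's real configuration.
SAMPLE = '''
#RUNTIME_PM_BLACKLIST="bb:dd.f 11:22.3"
RUNTIME_PM_BLACKLIST="03:00.0"
RUNTIME_PM_DRIVER_BLACKLIST="radeon nouveau"
'''

for key, value in active_tlp_settings(SAMPLE).items():
    print(f"{key} = {value}")
```

On a real system the text would come from the TLP configuration file (assumed here to be /etc/default/tlp for this TLP version).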

However, I noticed that I get no such error messages when I boot on battery instead of AC.

One more note: on older 4.4.x kernels there are no such errors either, so I'm kind of puzzled now. Is it a matter of 4.10, TLP, or possibly a combination of both?
tlp-log.zip

I've attached a couple of kern / syslog files in case it's of any help...

All the best & thanks in advance for your reply.

PS: Sorry, I forgot to mention above: in the logs, everything after May 23 07:05:19 is with TLP temporarily uninstalled.

Owner

linrunner commented May 23, 2017

Hi,

the most important part is missing: the full output of

sudo tlp-stat

after AC boot and – for comparison – on BAT too. Thanks.

Via Gist – no attachments please.

alicektx commented May 24, 2017

Hi linrunner,
here is on battery, right after boot:
https://gist.github.com/alicektx/69ea23dd91e81aa5e1900d8e70601af5

And after booting on AC:
https://gist.github.com/alicektx/81ae88facd79f21ed7aeb6024dc39ad4

I've further tested with the "radeon.nopm=0" parameter passed via GRUB. In that case, the warnings that appear with TLP on AC disappear, but at the cost of the battery lasting at least 40% less than before. So maybe it's a 4.10.x kernel problem and not a TLP issue per se?

Thanks in advance for any further help / assistance.

Owner

linrunner commented May 24, 2017

The great majority of issues with TLP are in fact bugs in a kernel driver's pm features.

(1) RUNTIME_PM_BLACKLIST="03:00.0" is not configured currently. But this is not important, because driver blacklisting via RUNTIME_PM_DRIVER_BLACKLIST="radeon ..." actually works as the AC output shows:

/sys/bus/pci/devices/0000:03:00.0/power/control = auto (0x038000, Display controller, radeon)

"auto" is the driver default and shows that TLP didn't touch the setting. All other devices are "on".

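The same check can be done directly against sysfs without tlp-stat. The following is a minimal Python sketch, assuming the standard `/sys/bus/pci/devices/<address>/power/control` layout shown in the output above; the function name `runtime_pm_controls` is hypothetical.

```python
from pathlib import Path


def runtime_pm_controls(pci_root="/sys/bus/pci/devices"):
    """Map each PCI address to the content of its power/control file.

    Hypothetical helper: devices without a readable control file are skipped.
    """
    controls = {}
    root = Path(pci_root)
    if not root.is_dir():
        return controls
    for device in sorted(root.iterdir()):
        control_file = device / "power" / "control"
        try:
            controls[device.name] = control_file.read_text().strip()
        except OSError:
            # No control file (or no permission): leave this device out.
            continue
    return controls


if __name__ == "__main__":
    for address, mode in runtime_pm_controls().items():
        print(f"{address}: {mode}")
```

In the AC case discussed here, one would expect the entry for 0000:03:00.0 to read "auto" and the other devices "on".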
(2) TLP uses DPM too

+++ Radeon Graphics
/sys/class/drm/card1/device/power_dpm_state = performance
/sys/class/drm/card1/device/power_dpm_force_performance_level = off

and this seems to be the root of your error message. You may remove your boot option and try either to disable the DPM setting on AC by commenting it out

#RADEON_DPM_STATE_ON_AC=performance

or try the BAT setting

RADEON_DPM_STATE_ON_AC=battery
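After changing the setting and rebooting, the resulting DPM state can be read back from the sysfs files that tlp-stat lists above. This is a minimal Python sketch, assuming the files live under `/sys/class/drm/cardN/device/` as in that output; the function name `radeon_dpm_settings` is hypothetical, and cards without these files are simply skipped.

```python
import re
from pathlib import Path

# File names taken from the tlp-stat output quoted in this thread.
DPM_FILES = ("power_dpm_state", "power_dpm_force_performance_level")


def radeon_dpm_settings(drm_root="/sys/class/drm"):
    """Return {path: value} for the DPM files of every cardN entry found."""
    settings = {}
    root = Path(drm_root)
    if not root.is_dir():
        return settings
    for card in sorted(root.iterdir()):
        # Only plain card entries (card0, card1, ...), not connector entries.
        if not re.fullmatch(r"card\d+", card.name):
            continue
        for name in DPM_FILES:
            path = card / "device" / name
            try:
                settings[str(path)] = path.read_text().strip()
            except OSError:
                # File missing or unreadable, e.g. a non-Radeon card.
                continue
    return settings


if __name__ == "__main__":
    for path, value in radeon_dpm_settings().items():
        print(f"{path} = {value}")
```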

linrunner added the "kernel quirk" label and removed the "incomplete" label May 24, 2017

alicektx commented May 25, 2017

Hi linrunner,
many thanks for all of your help and support, I really appreciate it. Sorry for the somewhat late reply: I had to google quite a bit to get an idea of why and when those specific errors pop up.

Unfortunately, none of the TLP tweaks above removes the errors. However, I'm convinced that this behavior is merely 'triggered' by TLP and that the underlying problem lies much deeper elsewhere. By far the most interesting and very similar bug report that I stumbled upon was in the Arch forums here (with kernel 4.8.x, yet the very same Radeon card):
https://bbs.archlinux.org/viewtopic.php?id=218909
So I'm closing this.

All the best & thanks again for all!

alicektx closed this May 25, 2017

HorselessHorseperson commented Dec 8, 2017

Hi. I've encountered the same problem with the same laptop on Ubuntu 16.04.
