Linux 3.6 and bbswitch power on / off issues. #35

Closed
cybercyst opened this issue Oct 10, 2012 · 44 comments

Comments

@cybercyst

I recently upgraded to Linux 3.6 from the testing repo in Arch Linux, and now bbswitch won't power the card on or off.
I came from a working Bumblebee installation on Linux 3.5, so it seems something may have changed in the ACPI setup.

Here's my dmesg: http://pastebin.archlinux.fr/450067

I have nvidia and nouveau blacklisted and all combinations of bbswitch module options yield the same result.
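
A quick way to confirm that both drivers really are blacklisted is to scan the modprobe configuration. The sketch below is a minimal illustration, not part of bbswitch or Bumblebee; the function name `is_blacklisted` is made up, and it only looks at `*.conf` files in one directory.

```shell
#!/bin/sh
# Hypothetical helper: succeed if MODULE has a "blacklist MODULE" line
# in any *.conf file under DIR (default /etc/modprobe.d).
# Note: a blacklist only stops alias-based autoloading; an explicit
# "modprobe MODULE" still loads the module.
is_blacklisted() {
    module=$1
    dir=${2:-/etc/modprobe.d}
    grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*$" "$dir"/*.conf
}

# Example use on a real system:
#   for m in nvidia nouveau; do
#       is_blacklisted "$m" && echo "$m: blacklisted" || echo "$m: not blacklisted"
#   done
```

Running `lsmod` afterwards shows whether either module was loaded anyway.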

@cybercyst
Author

Also, disabling powersaving in bumblebee.conf lets everything work great. So this is definitely an issue with bbswitch.

@Lekensteyn
Member

I am using 3.6.1 and it works fine. Where does this come from:

[   86.072692] bbswitch: enabling discrete graphics

When using PMMethod=bbswitch (or auto) in bumblebee.conf, bumblebeed will only enable the card if you run optirun (or primusrun). Maybe you have a script somewhere that does echo ON > /proc/acpi/bbswitch ?

@cybercyst
Author

No scripts, that's actually a bbswitch.conf file in /etc/modprobe.d/ telling bbswitch to enable the card when loaded. I've tried enabling the card on loading, disabling the card on loading, and leaving the whole dern thing alone, and none of it works. If I disable bbswitch power management everything works fine.

@Lekensteyn
Member

You mean options bbswitch load_state=0? That takes effect when the module is loaded, and it happens before the following is printed:

bbswitch: Succesfully loaded. Discrete card ... is ...
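
To see which load-time options a modprobe.d file actually sets, something like the following could be used. It is only a sketch: `bbswitch_option` is a made-up name, and the parser handles just the simple `options bbswitch key=value ...` form. Per the bbswitch README, load_state and unload_state accept -1 (leave unchanged), 0 (off) and 1 (on).

```shell
#!/bin/sh
# Hypothetical helper: print the value of a bbswitch module option
# (e.g. load_state) found on an "options bbswitch ..." line in FILE.
# Prints nothing if the option is not set there.
bbswitch_option() {
    opt=$1
    file=$2
    awk -v opt="$opt" '
        $1 == "options" && $2 == "bbswitch" {
            for (i = 3; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == opt) print kv[2]
            }
        }' "$file"
}

# Example: bbswitch_option load_state /etc/modprobe.d/bbswitch.conf
```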

@cybercyst
Author

Well, it would appear that something was wrong and that everything is OK now. I wish I could say more than that... the only thing I've changed is my bbswitch.conf file in /etc/modprobe.d/, and again, this exact same configuration wasn't working before.

cat /etc/modprobe.d/bbswitch.conf 
blacklist nvidia
blacklist nouveau

my bumblebee.conf:
http://codepad.org/O7tztBUx

and my dmesg now:
http://codepad.org/XsMnrAcI

While I'm happy it is working... I'm completely confused as to why! I've reviewed all my services and can't find anything that would have been trying to turn on the card before bbswitch was loaded.
shrug It works, it doesn't have anything to do with linux 3.6... I guess we can close this one!

@cybercyst cybercyst reopened this Oct 12, 2012
@cybercyst
Author

OK, well, my mistake: I hadn't upgraded back to 3.6.1 from 3.5.4. So THAT's why it worked! Consider the above a baseline. I'll post the Linux 3.6.1 results soon.

@cybercyst
Author

Alright, I upgraded to linux 3.6.1 and didn't change ANYTHING from the working 3.5.4 installation. bbswitch won't power the card. Here's my dmesg: http://codepad.org/bgQi273g

Interesting to note: my laptop has a little indicator light that shows the status of the Nvidia GPU.
On 3.5.4 it is on at POST up until X launches, at which point it would turn off.
On 3.6.1 it is on at POST up through X launching, never turning off. bbswitch reports it as already off.

@Lekensteyn
Member

Please check journalctl -ba instead of dmesg so you can get daemon messages. Who is writing ON to bbswitch?
I see you are using 3.6.1, which is in the testing repo. Does that also mean that you have upgraded Xorg? In that case you may need to create /etc/X11/xorg.conf.d/disable-auto-add-gpu.conf containing the following, then reboot to test.

Section "ServerLayout"
    Option         "AutoAddGPU" "false"
EndSection

(untested in this setting, you might need to add an Identifier and Screen keyword to it, see /etc/bumblebee/xorg.conf.nouveau for an example).

This is necessary because Xorg 1.13 supports multiple drivers (intel, nouveau, ...) for PRIME. Try the above xorg.conf addition only if /var/log/Xorg.0.log shows Xorg trying to bind the nvidia card. I do not see the nouveau/nvidia driver being loaded (blacklisted?), so that is a good thing.
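
Whether Xorg tried to load a driver for the discrete card can be checked from the log. The helper below is a rough sketch with a made-up name (`xorg_loads_driver`); it assumes the usual `(II) LoadModule: "name"` lines in the Xorg log and does not prove that the driver actually bound the device.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the Xorg log shows an attempt to load
# the given driver module. Assumes '(II) LoadModule: "name"' style lines.
xorg_loads_driver() {
    driver=$1
    log=${2:-/var/log/Xorg.0.log}
    grep -qs "LoadModule: \"${driver}\"" "$log"
}

# Example:
#   for d in nvidia nouveau; do
#       xorg_loads_driver "$d" && echo "Xorg loaded $d"
#   done
```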

@cybercyst
Author

Alright, I've been working on this some more. I added the above to my xorg.conf and made it so that nvidia and nouveau are no longer blacklisted. The computer turns on and powers off the nvidia card. After this, it won't turn back on!
http://codepad.org/n7dG50dF
I did hear that linux 3.6 supports PCIe D3cold power states, something that is like a deeper sleep state for the card. Could this be causing what I'm seeing?
Once the card powers off now, I can't get nvidia to load manually, it says no device present.

@cybercyst
Author

Also, any attempts to change bbswitch's power state through the command line gives me a Permission Denied error. Thanks again for all the helpful tips! I was sure the AutoAddGPU was going to do it! I was excited to see my card power off after booting up!

@cybercyst
Author

And it seems this guy is having the same problem:
http://www.mail-archive.com/acpi-bugzilla@lists.sourceforge.net/msg35503.html

@Lekensteyn
Member

@cybercyst You have to keep nouveau and nvidia blacklisted, otherwise bbswitch will refuse to power off the GPU. Instead of attaching dmesg, can you please attach the output of journalctl -ba > journal.txt?

C'mon... permission denied, what do you think that the issue is then? ls -l /proc/acpi/bbswitch .... guess what ;)

I have seen the linked bug. According to its comments, it was already fixed before the final 3.4 release.
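
On the Permission denied error: /proc/acpi/bbswitch is normally writable only by root, and `sudo echo ON > /proc/acpi/bbswitch` does not help because the redirection is performed by the unprivileged shell. A minimal sketch of a safer write follows; `bbswitch_set` is a made-up name, and the FILE argument exists only so the function can be tried against a scratch file.

```shell
#!/bin/sh
# Hypothetical helper: write ON or OFF to the bbswitch control file.
# Must run as root when pointed at the real /proc/acpi/bbswitch.
bbswitch_set() {
    state=$1
    file=${2:-/proc/acpi/bbswitch}
    case $state in
        ON|OFF) ;;
        *) echo "usage: bbswitch_set ON|OFF [FILE]" >&2; return 2 ;;
    esac
    printf '%s\n' "$state" > "$file"
}

# From an unprivileged shell, the equivalent one-liner is:
#   echo OFF | sudo tee /proc/acpi/bbswitch
```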

@cybercyst
Author

[cybercyst@optimus ~]$ cat /etc/modprobe.d/bbswitch.conf 
blacklist nvidia
blacklist nouveau
[cybercyst@optimus ~]$ cat /etc/X11/xorg.conf.d/60-disable-gpu.conf 
Section "ServerLayout"
    Identifier  "DisableAutoAddGPU"
    Option  "AutoAddGPU" "false"
EndSection

journalctl -ba > journal.txt
http://codepad.org/9mJmdbp3
Edit: the paste above was truncated. For the complete journalctl -ba output see:
http://pastebin.com/pvW9ryC0

@Lekensteyn Thanks again for your patience AND help man. ;) Just one reminder, if I take this exact same system and setup and downgrade to linux 3.5.4 everything works, and with linux 3.6.1 it doesn't.
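
Full journal dumps like the ones linked above are long. A filter along these lines can narrow them to the lines that matter for GPU power switching. This is only a sketch: `gpu_power_lines` is a made-up name, and 0000:01:00.0 is the PCI address of the discrete card on this particular laptop, so it may differ elsewhere.

```shell
#!/bin/sh
# Hypothetical filter: keep only lines that mention bbswitch, bumblebeed,
# or the PCI address given as $1. Reads journal text on stdin.
gpu_power_lines() {
    addr=${1:-0000:01:00.0}
    grep -E "bbswitch|bumblebeed|pci ${addr}"
}

# Example: journalctl -b -a | gpu_power_lines
```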

@Lekensteyn
Member

The relevant lines are:

Oct 13 13:05:06 optimus kernel: bbswitch: enabling discrete graphics
Oct 13 13:05:06 optimus kernel: pci 0000:01:00.0: power state changed by ACPI to D0
Oct 13 13:05:06 optimus kernel: pci 0000:01:00.0: Refused to change power state, currently in D3
Oct 13 13:05:06 optimus kernel: pci 0000:01:00.0: power state changed by ACPI to D0
Oct 13 13:05:06 optimus bumblebeed[165]: [  159.610255] [ERROR]Could not enable discrete graphics card
Oct 13 13:05:06 optimus kernel: pci 0000:01:00.0: Refused to change power state, currently in D3

This means that something asked bumblebeed (=> bbswitch) to power on the nvidia card (optirun ....). Something in your story is not clear to me: can you manually power the card on/off by writing ON or OFF to /proc/acpi/bbswitch as root? Then check the state by reading that file.

I think that something has become asynchronous (or takes longer) in 3.6, causing bbswitch to return before the card is actually on. bumblebeed then reads the file immediately after the write returns and thus reports the old state (which is off).

If the above does not make things clear, please disable bumblebeed, reboot and try writing OFF/ON to bbswitch manually.
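
If the switch really has become asynchronous, reading the state once right after the write is not enough; polling for a few seconds would tell the two cases apart. The following is a sketch under that assumption. The names `bbswitch_state` and `bbswitch_wait` are made up, and the FILE argument is there so the functions can be exercised on a scratch file.

```shell
#!/bin/sh
# Hypothetical helper: print the state field of the bbswitch control file,
# e.g. "OFF" for the line "0000:01:00.0 OFF".
bbswitch_state() {
    awk '{ print $2; exit }' "${1:-/proc/acpi/bbswitch}"
}

# Hypothetical helper: poll until the reported state equals WANT, giving up
# after TRIES attempts spaced one second apart. Succeeds only if reached.
bbswitch_wait() {
    want=$1
    file=${2:-/proc/acpi/bbswitch}
    tries=${3:-5}
    while [ "$tries" -gt 0 ]; do
        [ "$(bbswitch_state "$file")" = "$want" ] && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}

# Example (as root, with bumblebeed stopped):
#   echo ON > /proc/acpi/bbswitch
#   bbswitch_wait ON || dmesg | tail
```

If the state never changes even after several seconds, the failure is not just a timing issue.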

@cybercyst
Author

That was actually me running optirun glxspheres after booting into X. As you've seen, the nvidia card doesn't power on and so the whole thing doesn't work. I've tried manually writing ON and OFF to bbswitch, with the same result. The card just won't respond and it puts the same errors in my syslog. I did try disabling bumblebeed.service, rebooting, manually loading the bbswitch module and manually turning the card on and off with /proc/acpi/bbswitch, and this didn't work either.

@Lekensteyn
Member

Please try bbswitch from the develop branch, I have changed D3hot to D3cold. Not sure if that actually fixes things, but it is worth testing.

If that fails, try bisecting the kernel.

@cybercyst
Author

@Lekensteyn Hrm, well that also didn't work. /me rolls up his sleeves. Git bisection here we come!
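
For anyone repeating the bisection, the general shape of a session is sketched below. The tag names are an assumption (mainline tags v3.5 and v3.6; the stable tags for 3.5.4 and 3.6.1 would work the same way), and `bisect_verdict` is a made-up helper that keys on the failure message quoted earlier in this thread. A "good" verdict only means something if you actually tried to power on the card before checking the log.

```shell
#!/bin/sh
# Typical manual bisect session in a kernel git tree (tags assumed):
#   git bisect start
#   git bisect bad v3.6
#   git bisect good v3.5
#   ... build, boot, try to power on the card, then mark the result:
#   git bisect "$(dmesg | bisect_verdict)"
#
# Hypothetical helper: print "bad" if the kernel log on stdin contains the
# failure message seen in this thread, otherwise "good".
bisect_verdict() {
    if grep -q 'Refused to change power state, currently in D3'; then
        echo bad
    else
        echo good
    fi
}
```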

@cybercyst
Author

After bisecting the kernel, here's the kernel commit that breaks this for me:

71a83bd727cc31c5fe960c3758cb396267ff710e is the first bad commit
commit 71a83bd727cc31c5fe960c3758cb396267ff710e
Author: Zheng Yan <zheng.z.yan@intel.com>
Date:   Sat Jun 23 10:23:49 2012 +0800

    PCI/PM: add runtime PM support to PCIe port

    This patch adds runtime PM support to PCIe port.  This is needed by
    PCIe D3cold support, where PCIe device without ACPI node may be
    powered on/off by PCIe port.

    Because runtime suspend is broken for some chipsets, a black list is
    used to disable runtime PM support for these chipsets.

    Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
    Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
    Signed-off-by: Huang Ying <ying.huang@intel.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

:040000 040000 944ceff6044404c594b78ebfb69db88b96e0003a f3415c164e5c2c0e44a5a8f4252a98b3da03f351 M  drivers

@uKev

uKev commented Oct 17, 2012

Hi cybercyst and Lekensteyn,

thanks for diving into the issue. I just want to report that I can reproduce the issue after updating to 3.6.2 (became archlinux stable).

With disabled bumblebeed:

# echo ON > /proc/acpi/bbswitch 
(wait a few seconds)
# cat /proc/acpi/bbswitch 
0000:01:00.0 OFF

# dmesg|tail -n 3 (reduced)
bbswitch: enabling discrete graphics
power state changed by ACPI to D0
Refused to change power state, currently in D3

# dmesg|grep bbswitch (reduced)
bbswitch: version 0.4.2
bbswitch: Found integrated VGA device 0000:00:02.0: \_SB_.PCI0.GFX0
bbswitch: Found discrete VGA device 0000:01:00.0: \_SB_.PCI0.PEG0.GFX0
bbswitch: detected an Optimus _DSM function
bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is on
bbswitch: enabling discrete graphics
bbswitch: disabling discrete graphics
bbswitch: Result of Optimus _DSM call: 01000059

nvidia/nouveau is blacklisted.

@cybercyst
Author

So just a question... does bbswitch need to change to work with the newer kernel, or would a bug report upstream to the kernel devs be appropriate?

@Lekensteyn
Member

I think you need to file a bug at the kernel bugzilla. I also cannot see why the commit you bisected causes issues here, but let upstream help figure it out.

@cybercyst
Author

Here's the bug report over there, which links back to here. INFINITE LOOP!
https://bugzilla.kernel.org/show_bug.cgi?id=48981

@cybercyst
Author

FIXED:
I edited /etc/laptop-mode/conf.d/runtime-pm.conf and changed the line

# Control Runtime Power Management ?
CONTROL_RUNTIME_PM="auto"

to:

# Control Runtime Power Management ?
CONTROL_RUNTIME_PM="0"

Everything works as expected now.

@cybercyst
Author

On further testing, the above does let the card turn on and off, but the battery still drains quickly, I guess because the kernel can no longer power devices down at runtime?
Anyway, here's what a kernel developer suggested: "I still think this maybe a bumblebee / bbswitch issue. It need to resume device before operating on the device. Can you suggest that?"
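
The kernel exposes each PCI device's runtime PM policy and status in sysfs, which makes it possible to see whether the card (or the PCIe port above it) is runtime-suspended when the switch fails. Presumably settings such as CONTROL_RUNTIME_PM end up changing the power/control attribute, but that is an assumption. The helper name `pci_runtime_pm` is made up.

```shell
#!/bin/sh
# Hypothetical helper: print the runtime PM policy and status of a PCI
# device, given its sysfs directory. For the card in this thread that
# would be /sys/bus/pci/devices/0000:01:00.0.
#   power/control        : "auto" (runtime PM allowed) or "on" (kept active)
#   power/runtime_status : e.g. "active" or "suspended"
pci_runtime_pm() {
    dev=$1
    printf 'control=%s status=%s\n' \
        "$(cat "$dev/power/control")" \
        "$(cat "$dev/power/runtime_status")"
}

# Example: pci_runtime_pm /sys/bus/pci/devices/0000:01:00.0
```

Writing `on` to power/control as root keeps a single device active, which is a narrower workaround than disabling runtime PM for everything.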

@stephencox

I am on Fedora 18 and the 3.6 kernel is compiled with CONFIG_VGA_SWITCHEROO=y
Maybe switcheroo is loading nouveau?
Check with "grep -i switcheroo /boot/config-3.6.*"
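
The same check can be wrapped so that it also reports symbols that are missing or commented out. This is a sketch only; `kconfig_value` is a made-up name. On distributions that do not ship /boot/config-*, the running kernel's config may be available as /proc/config.gz instead.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a kernel config SYMBOL (y, m, ...)
# from a config FILE, or "unset" if it is absent or commented out.
kconfig_value() {
    sym=$1
    file=$2
    val=$(sed -n "s/^${sym}=//p" "$file" | head -n 1)
    echo "${val:-unset}"
}

# Example: kconfig_value CONFIG_VGA_SWITCHEROO "/boot/config-$(uname -r)"
```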

@stephencox

Switching off the card with "echo OFF > /sys/kernel/debug/vgaswitcheroo/switch" works for me
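
Reading the same file back shows what vgaswitcheroo thinks the card's power state is. The sketch below assumes the file contains lines such as `1:DIS: :Off:0000:01:00.0`, with the power state (Pwr or Off, possibly prefixed with "Dyn" on newer kernels) in the fourth colon-separated field; treat that format as an assumption and check it against your own kernel. The helper name is made up, and the file requires root and a mounted debugfs.

```shell
#!/bin/sh
# Hypothetical helper: print the power field for the discrete GPU ("DIS")
# from vgaswitcheroo's switch file. Assumes lines of the form
#   0:IGD:+:Pwr:0000:00:02.0
#   1:DIS: :Off:0000:01:00.0
switcheroo_dis_power() {
    awk -F: '$2 == "DIS" { print $4 }' "${1:-/sys/kernel/debug/vgaswitcheroo/switch}"
}

# Example (as root): switcheroo_dis_power
```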

@Lekensteyn
Member

switcheroo works because nouveau claims the device. I am working on a solution right now.

Lekensteyn added a commit that referenced this issue Oct 19, 2012
Fixes regression in Linux 3.6 when run-time power management is enabled. See
https://bugzilla.kernel.org/show_bug.cgi?id=48981 for a discussion.
@Lekensteyn
Member

bbswitch should be fixed now. Please test bbswitch.

I think that Bumblebee is still broken though because /proc/pci/01/00.0 still reports the wrong values. I am waiting for upstream to report whether this is intended or not.

@cybercyst
Author

The new bbswitch in the develop branch does indeed work, even with CONTROL_RUNTIME_PM="auto". I can launch stuff on my card and it turns off afterwards. Thanks!

@Jodell88

bbswitch from the develop branch does not seem to work for me.

@Lekensteyn
Member

@Jodell88 You might have to reboot for the new version. Does manually writing ON/OFF work? Logs or it did not happen.

@stephencox

ON/OFF works for me

@Jodell88

I've rebooted numerous times.

Dmesg http://pastebin.com/SrDjm5hA

If I use bbswitch as my PMMethod I can't turn my card on. If I don't use bbswitch, I can't turn it off.

@Dreamer4135

Works great for me (Arch 3.6.2) but I have this in dmesg every time I run applications with optirun:

[  369.235748] bbswitch: disabling discrete graphics
[  369.246521] pci 0000:01:00.0: Refused to change power state, currently in D0
[  369.247295] pci 0000:01:00.0: power state changed by ACPI to D3cold

It's not actually refusing to change state:

$ cat /proc/acpi/bbswitch 
0000:01:00.0 OFF
$ optirun  cat /proc/acpi/bbswitch
0000:01:00.0 ON
$ cat /proc/acpi/bbswitch 
0000:01:00.0 OFF

Great job anyway, thanks!
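
The dmesg excerpt above shows a "Refused" line followed by a successful ACPI transition, so the last ACPI message is the one worth checking. A small sketch for pulling it out of a kernel log follows; `last_acpi_state` is a made-up name and the message wording is taken from the logs quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the state named in the last
# "power state changed by ACPI to ..." line for PCI address ADDR.
# Reads kernel log text on stdin; prints nothing if no such line exists.
last_acpi_state() {
    addr=$1
    sed -n "s/.*pci ${addr}: power state changed by ACPI to \([A-Za-z0-9]*\).*/\1/p" | tail -n 1
}

# Example: dmesg | last_acpi_state 0000:01:00.0
```

With the fixed bbswitch the result after powering off should read D3cold rather than D3, which matches what later comments in this thread report.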

@dapolinario

@Jodell88, how did you install bbswitch?

@Jodell88

@dapolinario I modified the dkms-bbswitch-git PKGBUILD from the AUR to use the develop branch. You made the same changes I did based on your comment in the AUR.

@dapolinario

What is your laptop? And your video card? My dmesg output matches @Dreamer4135's (D3cold), but yours only shows D3.

@Jodell88

Dell XPS L502X, Nvidia GT 525M, Intel Core i7-2630QM

@dapolinario

Same as mine, except for the processor. Remove everything (bumblebee and bbswitch, including their configuration files) and reinstall to see if that resolves it.

@Jodell88

@dapolinario I did as you suggested and I can report that bbswitch works! It also mentions D3cold and not D3 in dmesg. Thanks to everyone for helping get this problem fixed.

@Lekensteyn
Member

Fixed with bbswitch 0.5 - closing.

@Lekensteyn
Member

Any issues in Bumblebee related to PCI config space that could occur in 3.6 should be fixed in future 3.6 versions:
http://www.spinics.net/lists/linux-pci/msg18282.html

@postadelmaga

I have a similar problem on kernel 4.8.11-1 (I am using Antergos Linux, based on Arch).

I thought the issue was related to tlp and bbswitch conflicting, so I disabled tlp's PCI power management for the nvidia card, but nothing has really changed.

Sometimes it works, sometimes I get the error:
Refused to change power state, currently in D3

@bluca
Member

bluca commented Dec 5, 2016

@postadelmaga I would recommend having a look at the following threads:

Bumblebee-Project/Bumblebee#808
#140
#112
