Linux 3.6 and bbswitch power on / off issues. #35

Closed
cybercyst opened this Issue · 42 comments

7 participants

@cybercyst

I recently upgraded to Linux 3.6 from the testing repo in Arch Linux, and now bbswitch won't power the card on or off.
I'm coming from a working Bumblebee installation on Linux 3.5, so it seems that something may have changed in the ACPI setup.

Here's my dmesg: http://pastebin.archlinux.fr/450067

I have nvidia and nouveau blacklisted and all combinations of bbswitch module options yield the same result.

@cybercyst

Also, disabling powersaving in bumblebee.conf lets everything work great. So this is definitely an issue with bbswitch.

@Lekensteyn
Owner

I am using 3.6.1 and it works fine. Where does this come from:

[   86.072692] bbswitch: enabling discrete graphics

When using PMMethod=bbswitch (or auto) in bumblebee.conf, Bumblebee will only enable the card if you run optirun (or primusrun). Maybe you have a script somewhere that runs echo ON > /proc/acpi/bbswitch?

@cybercyst

No scripts, that's actually a bbswitch.conf file in /etc/modprobe.d/ telling bbswitch to enable the card when loaded. I've tried enabling the card on loading, disabling the card on loading and leaving the whole dern thing alone, and none of it works. If I disable bbswitch power management everything works fine.

@Lekensteyn
Owner

You mean options bbswitch load_state=0? That takes effect while the module is loading, before the following is printed:

bbswitch: Succesfully loaded. Discrete card ... is ...
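
To see which module options are actually in effect, it can help to list every `options bbswitch` line under the modprobe configuration directory. The following is a minimal sketch, not part of bbswitch or Bumblebee: the function name `bbswitch_options` is made up for illustration, and `/etc/modprobe.d` is assumed as the default directory.

```shell
# List every "options bbswitch ..." line in a modprobe.d-style directory.
# $1 = directory to scan (assumed default: /etc/modprobe.d).
# Commented-out lines are skipped because their first field is not "options".
bbswitch_options() {
    dir="${1:-/etc/modprobe.d}"
    for f in "$dir"/*.conf; do
        [ -f "$f" ] || continue
        awk -v file="$f" \
            '$1 == "options" && $2 == "bbswitch" { print file ": " $0 }' "$f"
    done
}
```

Running `bbswitch_options` with no argument prints nothing when no file sets a bbswitch option such as load_state, which is what the blacklist-only bbswitch.conf quoted later in this thread would produce.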
@cybercyst

Well, it would appear that something was wrong and that everything is OK now. I wish I could say more than that... the only thing I've changed is my bbswitch.conf file in /etc/modprobe.d/, and again, this exact same configuration wasn't working before.

cat /etc/modprobe.d/bbswitch.conf 
blacklist nvidia
blacklist nouveau

my bumblebee.conf:
http://codepad.org/O7tztBUx

and my dmesg now:
http://codepad.org/XsMnrAcI

While I'm happy it is working... I'm completely confused as to why! I've reviewed all my services and can't find anything that would have been trying to turn on the card before bbswitch was loaded.
shrug It works, it doesn't have anything to do with linux 3.6... I guess we can close this one!

@cybercyst cybercyst closed this
@cybercyst cybercyst reopened this
@cybercyst

OK, well, that was my mistake: I hadn't upgraded back to 3.6.1 from 3.5.4. So THAT's why it worked! Consider the above a baseline. I'll post the Linux 3.6.1 results soon.

@cybercyst

Alright, I upgraded to linux 3.6.1 and didn't change ANYTHING from the working 3.5.4 installation. bbswitch won't power the card. Here's my dmesg: http://codepad.org/bgQi273g

Interestingly, my laptop has a little indicator light that shows the status of the Nvidia GPU.

On 3.5.4 it is on at POST up until X launches, at which point it would turn off.
On 3.6.1 it is on at POST up through X launching, never turning off. bbswitch reports it as already off.

@Lekensteyn
Owner

Please check journalctl -ba instead of dmesg so you can get daemon messages. Who is writing ON to bbswitch?
I see you are using 3.6.1, which is in the testing repo. Does that also mean that you have upgraded your Xorg? In that case you may need to create /etc/X11/xorg.conf.d/disable-auto-add-gpu.conf containing the below and reboot to test.

Section "ServerLayout"
    Option         "AutoAddGPU" "false"
EndSection

(untested in this setting, you might need to add an Identifier and Screen keyword to it, see /etc/bumblebee/xorg.conf.nouveau for an example).

This is necessary because Xorg 1.13 supports multiple drivers (intel, nouveau, ...) for PRIME. You should try the above xorg.conf addition only if your /var/log/Xorg.0.log shows Xorg trying to bind the nvidia card. I do not see the nouveau/nvidia driver being loaded (blacklisted?), so that is a good thing.
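
A quick way to look for signs of Xorg touching the discrete card is to filter the X log for the two driver names. This is only a rough sketch: the helper name `xorg_discrete_lines` is invented here, the default log path is the one mentioned above, and a match is a hint to read the surrounding lines rather than proof that the card was bound.

```shell
# Print the lines of an Xorg log that mention nvidia or nouveau
# (case-insensitive). Exit status is grep's: non-zero when nothing matches.
# $1 = log file (assumed default: /var/log/Xorg.0.log).
xorg_discrete_lines() {
    grep -iE 'nvidia|nouveau' "${1:-/var/log/Xorg.0.log}"
}
```

For example, `xorg_discrete_lines || echo "no nvidia/nouveau lines in the X log"` gives a one-line answer after a reboot.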

@cybercyst

Alright, I've been working on this some more. I added the above to my xorg.conf and made it so that nvidia and nouveau are no longer blacklisted. The computer boots and powers off the nvidia card. After this, the card won't turn back on!
http://codepad.org/n7dG50dF
I did hear that linux 3.6 supports PCIe D3cold power states, something that is like a deeper sleep state for the card. Could this be causing what I'm seeing?
Once the card powers off now, I can't get nvidia to load manually, it says no device present.

@cybercyst

Also, any attempts to change bbswitch's power state through the command line give me a Permission Denied error. Thanks again for all the helpful tips! I was sure the AutoAddGPU was going to do it! I was excited to see my card power off after booting up!

@Lekensteyn
Owner

@cybercyst You have to keep nouveau and nvidia blacklisted, otherwise bbswitch will refuse to power off the GPU. Instead of attaching dmesg, can you please attach the output of journalctl -ba > journal.txt?

C'mon... permission denied, what do you think that the issue is then? ls -l /proc/acpi/bbswitch .... guess what ;)

I have seen that the linked bug (see the comments of that bug) was already fixed before the final 3.4 version.

@cybercyst

[cybercyst@optimus ~]$ cat /etc/modprobe.d/bbswitch.conf 
blacklist nvidia
blacklist nouveau

[cybercyst@optimus ~]$ cat /etc/X11/xorg.conf.d/60-disable-gpu.conf 
Section "ServerLayout"
Identifier "DisableAutoAddGPU"
Option "AutoAddGPU" "false"
EndSection

journalctl -ba > journal.txt
http://codepad.org/9mJmdbp3
Edit: the paste above was truncated on upload. For the complete journalctl -ba output see:
http://pastebin.com/pvW9ryC0

@Lekensteyn Thanks again for your patience AND help man. ;) Just one reminder, if I take this exact same system and setup and downgrade to linux 3.5.4 everything works, and with linux 3.6.1 it doesn't.

@Lekensteyn
Owner

The relevant lines are:

Oct 13 13:05:06 optimus kernel: bbswitch: enabling discrete graphics
Oct 13 13:05:06 optimus kernel: pci 0000:01:00.0: power state changed by ACPI to D0
Oct 13 13:05:06 optimus kernel: pci 0000:01:00.0: Refused to change power state, currently in D3
Oct 13 13:05:06 optimus kernel: pci 0000:01:00.0: power state changed by ACPI to D0
Oct 13 13:05:06 optimus bumblebeed[165]: [  159.610255] [ERROR]Could not enable discrete graphics card
Oct 13 13:05:06 optimus kernel: pci 0000:01:00.0: Refused to change power state, currently in D3

This means that something asked bumblebeed (and thus bbswitch) to power on the nvidia card (optirun ...). Something is not really clear to me in your story: can you manually power the card on and off by writing ON or OFF to /proc/acpi/bbswitch as root? Then check the state by reading that file.

I think that something has become asynchronous (or takes longer) in 3.6, causing bbswitch to return before the card is actually on. bumblebeed then reads the file immediately after the write returns and thus reports the old state (which is off).

If the above does not make things clear, please disable bumblebeed, reboot and try writing OFF/ON to bbswitch manually.
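
One way to test the "state changes late" hypothesis by hand is to poll the status file for a few seconds after writing to it, instead of reading it once. The sketch below assumes the status format shown in this thread (`0000:01:00.0 OFF`); the function names `bbswitch_state` and `wait_for_state` are made up for illustration and are not part of bbswitch.

```shell
# Print the ON/OFF field of a bbswitch-style status file.
# $1 = status file (assumed default: /proc/acpi/bbswitch).
bbswitch_state() {
    awk '{ print $2; exit }' "${1:-/proc/acpi/bbswitch}"
}

# Poll until the status file reports the wanted state, or give up.
# $1 = wanted state (ON or OFF)
# $2 = number of attempts, one second apart (default 5)
# $3 = status file (assumed default: /proc/acpi/bbswitch)
wait_for_state() {
    want="$1"
    tries="${2:-5}"
    file="${3:-/proc/acpi/bbswitch}"
    while [ "$tries" -gt 0 ]; do
        if [ "$(bbswitch_state "$file")" = "$want" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

As root, `echo ON > /proc/acpi/bbswitch; wait_for_state ON 5 || echo "still not ON"` would tell a slow transition apart from one that never happens.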

@cybercyst

That was actually me running optirun glxspheres after booting into X. As you've seen, the nvidia card doesn't power on and so the whole thing doesn't work. I've tried manually writing ON and OFF to bbswitch, with the same result. The card just won't respond, and the same errors show up in my syslog. I also tried disabling bumblebeed.service, rebooting, manually loading the bbswitch module and toggling the card through /proc/acpi/bbswitch, and that didn't work either.

@Lekensteyn
Owner

Please try bbswitch from the develop branch, I have changed D3hot to D3cold. Not sure if that actually fixes things, but it is worth testing.

If that fails, try bisecting the kernel.

@cybercyst

@Lekensteyn Hrm, well that also didn't work. /me rolls up his sleeves. Git bisection here we come!

@Lekensteyn Lekensteyn referenced this issue in Bumblebee-Project/Bumblebee
Closed

GT650M: Failed to initialize NVIDIA GPU #172

@cybercyst

After bisecting the kernel, here's the kernel commit that breaks this for me:

71a83bd727cc31c5fe960c3758cb396267ff710e is the first bad commit
commit 71a83bd727cc31c5fe960c3758cb396267ff710e
Author: Zheng Yan <zheng.z.yan@intel.com>
Date:   Sat Jun 23 10:23:49 2012 +0800

    PCI/PM: add runtime PM support to PCIe port

    This patch adds runtime PM support to PCIe port.  This is needed by
    PCIe D3cold support, where PCIe device without ACPI node may be
    powered on/off by PCIe port.

    Because runtime suspend is broken for some chipsets, a black list is
    used to disable runtime PM support for these chipsets.

    Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
    Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
    Signed-off-by: Huang Ying <ying.huang@intel.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

:040000 040000 944ceff6044404c594b78ebfb69db88b96e0003a f3415c164e5c2c0e44a5a8f4252a98b3da03f351 M  drivers
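
For anyone repeating the bisection, the workflow can be sketched roughly as follows. The revisions passed to `start_bisect` are assumptions for illustration (for example the tags v3.5 and v3.6, matching the working and broken series reported above), and both function names are invented here; only the `git bisect` subcommands themselves are real.

```shell
# Start a bisection in an existing kernel git checkout.
# $1 = path to the checkout
# $2 = known-good revision (assumption, e.g. v3.5)
# $3 = known-bad revision  (assumption, e.g. v3.6)
start_bisect() {
    git -C "$1" bisect start "$3" "$2"
}

# After building and booting each step, mark the result with
#   git -C <checkout> bisect good   (bbswitch can power the card on)
#   git -C <checkout> bisect bad    (the card stays off)
# until git prints "<hash> is the first bad commit".

# Extract that hash from saved bisect output read on stdin.
first_bad_commit() {
    sed -n 's/^\([0-9a-f]\{40\}\) is the first bad commit$/\1/p'
}
```

Piping the output quoted above through `first_bad_commit` would print just the 40-character hash, which is convenient when filing the report upstream.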
@uKev

Hi cybercyst and Lekensteyn,

thanks for diving into the issue. I just want to report that I can reproduce the issue after updating to 3.6.2 (became archlinux stable).

With disabled bumblebeed:

# echo ON > /proc/acpi/bbswitch 
(wait a few seconds)
# cat /proc/acpi/bbswitch 
0000:01:00.0 OFF

# dmesg|tail -n 3 (reduced)
bbswitch: enabling discrete graphics
power state changed by ACPI to D0
Refused to change power state, currently in D3

# dmesg|grep bbswitch (reduced)
bbswitch: version 0.4.2
bbswitch: Found integrated VGA device 0000:00:02.0: \_SB_.PCI0.GFX0
bbswitch: Found discrete VGA device 0000:01:00.0: \_SB_.PCI0.PEG0.GFX0
bbswitch: detected an Optimus _DSM function
bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is on
bbswitch: enabling discrete graphics
bbswitch: disabling discrete graphics
bbswitch: Result of Optimus _DSM call: 01000059

nvidia/nouveau is blacklisted.

@Lekensteyn Lekensteyn referenced this issue in Bumblebee-Project/Bumblebee
Closed

Could not enable discrete graphics card on Gentoo #265

@cybercyst

So just a question... does bbswitch need to change to work with the newer kernel, or would a bug report upstream to the kernel devs be appropriate?

@Lekensteyn
Owner

I think you need to file a bug at the kernel bugzilla. I also cannot see why the commit you bisected causes issues with this, but let upstream help figure it out.

@cybercyst

Here's the bug report over there, which links back to here. INFINITE LOOP!
https://bugzilla.kernel.org/show_bug.cgi?id=48981

@cybercyst

FIXED:
I edited /etc/laptop-mode/conf.d/runtime-pm.conf and changed the line

# Control Runtime Power Management ?
CONTROL_RUNTIME_PM="auto"

to:

# Control Runtime Power Management ?
CONTROL_RUNTIME_PM="0"

Everything works as expected now.

@cybercyst

On further testing, the above does let the card turn on and off, but the battery still drains quickly, I guess because the Linux kernel can no longer power other devices down at runtime?

Anyway, here's what a kernel developer suggested: "I still think this maybe a bumblebee / bbswitch issue. It need to resume device before operating on the device. Can you suggest that?"
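
Instead of switching runtime PM off globally, the kernel's per-device sysfs attributes can be inspected (and, as root, changed) for just the PCIe port above the card. The sketch below is hedged accordingly: the helper name `runtime_pm_report` is invented, and the example address 0000:00:01.0 is an assumption about which port sits above 0000:01:00.0, so check `lspci -t` on the actual machine.

```shell
# Print the runtime-PM attributes of one PCI device directory in sysfs.
# $1 = device directory, e.g. /sys/bus/pci/devices/0000:00:01.0
#      (assumed address of the PCIe port above the discrete card).
# power/control is "auto" (runtime suspend allowed) or "on" (kept powered);
# power/runtime_status reports the current state, e.g. active or suspended.
runtime_pm_report() {
    dev="$1"
    for attr in control runtime_status; do
        if [ -r "$dev/power/$attr" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$dev/power/$attr")"
        fi
    done
}
```

Writing `on` to that port's `power/control` as root keeps only that one device out of runtime suspend, which may be a narrower workaround than CONTROL_RUNTIME_PM="0" while a proper fix is pending.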

@stephencox

I am on Fedora 18 and the 3.6 kernel is compiled with CONFIG_VGA_SWITCHEROO=y
Maybe switcheroo is loading nouveau?
Check with "grep -i switcheroo /boot/config-3.6.*"

@stephencox

Switching off the card with "echo OFF > /sys/kernel/debug/vgaswitcheroo/switch" works for me

@Lekensteyn
Owner

switcheroo works because nouveau claims the device. I am working on a solution right now.

@Lekensteyn Lekensteyn referenced this issue from a commit
@Lekensteyn Lekensteyn Power on the PCIe port before accessing PCI config space (#35)
Fixes regression in Linux 3.6 when run-time power management is enabled. See
https://bugzilla.kernel.org/show_bug.cgi?id=48981 for a discussion.
7f9fb7a
@Lekensteyn
Owner

bbswitch should be fixed now. Please test bbswitch.

I think that Bumblebee is still broken though because /proc/pci/01/00.0 still reports the wrong values. I am waiting for upstream to report whether this is intended or not.

@cybercyst

The new bbswitch in the develop branch does indeed work, even with CONTROL_RUNTIME_PM="auto". I can launch stuff on my card and it turns off afterwards. Thanks!

@Jodell88

bbswitch from the develop branch does not seem to work for me.

@Lekensteyn
Owner

@Jodell88 You might have to reboot for the new version. Does manually writing ON/OFF work? Logs or it did not happen.

@stephencox

ON/OFF works for me

@Jodell88

I've rebooted numerous times.

Dmesg http://pastebin.com/SrDjm5hA

If I use bbswitch as my PMMethod I can't turn my card on. If I don't use bbswitch, I can't turn it off.

@Dreamer4135

Works great for me (Arch 3.6.2) but I have this in dmesg every time I run applications with optirun:

[  369.235748] bbswitch: disabling discrete graphics
[  369.246521] pci 0000:01:00.0: Refused to change power state, currently in D0
[  369.247295] pci 0000:01:00.0: power state changed by ACPI to D3cold

It's not actually refusing to change state:

$ cat /proc/acpi/bbswitch 
0000:01:00.0 OFF
$ optirun  cat /proc/acpi/bbswitch
0000:01:00.0 ON
$ cat /proc/acpi/bbswitch 
0000:01:00.0 OFF

Great job anyway, thanks!

@dapolinario

@Jodell88, how did you install bbswitch?

@Jodell88

@dapolinario I modified the dkms-bbswitch-git PKGBUILD from the AUR to use the develop branch. You made the same changes I did based on your comment in the AUR.

@dapolinario

What is your laptop? And your video card? My dmesg output matches @Dreamer4135's (D3cold), but yours shows only D3.

@Jodell88

Dell XPS L502X, Nvidia GT 525M, Intel Core i7-2630QM

@dapolinario

Same as mine, except the processor. Remove everything (bumblebee and bbswitch, including configuration files) and reinstall to see if that resolves it.

@Jodell88

@dapolinario I did as you suggested and I can report that bbswitch works! It also mentions D3cold and not D3 in dmesg. Thanks to everyone for helping to get this problem fixed.

@Lekensteyn
Owner

Fixed with bbswitch 0.5 - closing.

@Lekensteyn Lekensteyn closed this
@Lekensteyn
Owner

Any issues in Bumblebee related to PCI config space that could occur in 3.6 should be fixed in future 3.6 versions:
http://www.spinics.net/lists/linux-pci/msg18282.html
