This repository has been archived by the owner on Oct 17, 2021. It is now read-only.

Generate aliases in bumblebee.conf on Ubuntu #34

Merged: 1 commit merged on Aug 13, 2015


bluca
Member

@bluca bluca commented Aug 9, 2015

On Ubuntu, the Nvidia driver packages at the moment do not define
a remove rule for modprobe, which causes bumblebee to fail when
the patch to use modprobe -r is applied.
Add aliases and remove rules to bumblebee.conf during postinst phase
if building on Ubuntu as a workaround.

@bluca
Member Author

bluca commented Aug 9, 2015

Note that this adds an "nvidia" alias for all supported drivers at the same time, but that's OK with modprobe. The drawback is spurious error messages in the logs.

Tested this on Ubuntu 15.04, with drivers 340 (from the Vivid repos) and 352 (from xorg-edgers). Tested with both glxgears and a CUDA application; in both cases all modules are removed correctly and the card is turned off at exit.

One of the problems is that, as far as I can see on my installation, the binary driver packages do not install nvidia.conf in modprobe.d unless they are the main libGL provider (this is handled by update-alternatives), which is obviously not the case when using Bumblebee on Optimus hardware. And even if they did, they would only provide an alias, not a "remove" rule.

So ideally this should be fixed in the drivers package. But since modprobe allows multiple rules and aliases with the same name to exist at the same time, it's probably safer, for the sake of older versions, to have this done in the Bumblebee PPA as well.

Finally, I'm afraid CUDA users will have to define their own alias for nvidia-uvm (same problem as above: the driver's own alias is not active). Adding an alias for each driver, as I did for nvidia, has the side effect of making optirun print a lot of errors, one for each unavailable nvidia-uvm-XXX version. They are harmless false alarms, but since they are printed to STDOUT I think they would be far too confusing for users. Again, this should be fixed by the drivers package.

@Lekensteyn, @Vincent-C, @amonakov - opinions on this?

@bluca
Member Author

bluca commented Aug 9, 2015

One more comment: /etc/modprobe.d/bumblebee.conf really ought to be generated from a template rather than hardcoded. I'll try to work on this on the Debian side of the world once the higher-priority stuff is dealt with.

This is what the mangled modprobe file looks like:

# This file is installed by bumblebee, do NOT edit!
# to be used by kmod / module-init-tools, and installed in /etc/modprobe.d/
# or equivalent

# do not automatically load nouveau as it may prevent nvidia from loading
blacklist nouveau
# do not automatically load nvidia as it's unloaded anyway when bumblebeed
# starts and may fail bumblebeed to disable the card in a race condition.
blacklist nvidia
blacklist nvidia-current
blacklist nvidia-current-updates
# 304
alias nvidia nvidia-304
remove nvidia-304 rmmod nvidia-uvm nvidia
blacklist nvidia-304
blacklist nvidia-304-updates
blacklist nvidia-experimental-304
# 310
alias nvidia nvidia-310
remove nvidia-310 rmmod nvidia-uvm nvidia
blacklist nvidia-310
blacklist nvidia-310-updates
blacklist nvidia-experimental-310
# 313
alias nvidia nvidia-313
remove nvidia-313 rmmod nvidia-uvm nvidia
blacklist nvidia-313
blacklist nvidia-313-updates
blacklist nvidia-experimental-313
# 319
alias nvidia nvidia-319
remove nvidia-319 rmmod nvidia-uvm nvidia
blacklist nvidia-319
blacklist nvidia-319-updates
blacklist nvidia-experimental-319
# 325
alias nvidia nvidia-325
remove nvidia-325 rmmod nvidia-uvm nvidia
blacklist nvidia-325
blacklist nvidia-325-updates
blacklist nvidia-experimental-325
# 331
alias nvidia nvidia-331
remove nvidia-331 rmmod nvidia-uvm nvidia
blacklist nvidia-331
blacklist nvidia-331-updates
blacklist nvidia-experimental-331
# 334
alias nvidia nvidia-334
remove nvidia-334 rmmod nvidia-uvm nvidia
blacklist nvidia-334
blacklist nvidia-334-updates
blacklist nvidia-experimental-334
# 337
alias nvidia nvidia-337
remove nvidia-337 rmmod nvidia-uvm nvidia
blacklist nvidia-337
blacklist nvidia-337-updates
blacklist nvidia-experimental-337
# 340
alias nvidia nvidia-340
remove nvidia-340 rmmod nvidia-uvm nvidia
blacklist nvidia-340
blacklist nvidia-340-updates
blacklist nvidia-experimental-340
# 343
alias nvidia nvidia-343
remove nvidia-343 rmmod nvidia-uvm nvidia
blacklist nvidia-343
blacklist nvidia-343-updates
blacklist nvidia-experimental-343
# 346
alias nvidia nvidia-346
remove nvidia-346 rmmod nvidia-uvm nvidia
blacklist nvidia-346
blacklist nvidia-346-updates
blacklist nvidia-experimental-346
# 349
alias nvidia nvidia-349
remove nvidia-349 rmmod nvidia-uvm nvidia
blacklist nvidia-349
blacklist nvidia-349-updates
blacklist nvidia-experimental-349
# 352
alias nvidia nvidia-352
remove nvidia-352 rmmod nvidia-uvm nvidia
blacklist nvidia-352
blacklist nvidia-352-updates
blacklist nvidia-experimental-352
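
The stanzas above follow one fixed pattern per driver series, so they lend themselves to being generated from a version list, in the spirit of the template remark. The following is a minimal sketch and not the PR's actual postinst code: the function name `gen_nvidia_stanzas` is hypothetical, and the version list is supplied by the caller rather than discovered from installed packages.

```shell
#!/bin/sh
# Hypothetical helper: print one stanza per driver series, in the same
# layout as the file shown above. Versions are passed as arguments.
gen_nvidia_stanzas() {
    for ver in "$@"; do
        printf '# %s\n' "$ver"
        printf 'alias nvidia nvidia-%s\n' "$ver"
        printf 'remove nvidia-%s rmmod nvidia-uvm nvidia\n' "$ver"
        printf 'blacklist nvidia-%s\n' "$ver"
        printf 'blacklist nvidia-%s-updates\n' "$ver"
        printf 'blacklist nvidia-experimental-%s\n' "$ver"
    done
}
```

For example, `gen_nvidia_stanzas 340 352` prints the stanzas for the two driver series tested in this thread; the output could be appended to the static part of the file by whatever step builds it.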

@Lekensteyn
Member

Hmm, just discovered that modprobe can actually use aliases like you do. In this list post I only tested the lsmod name (nvidia), not the file name (nvidia-352).

Note that Bumblebee uses modprobe -r on the lsmod name (nvidia), see https://github.com/Bumblebee-Project/Bumblebee/blob/develop/src/module.c#L87. Thus, this configuration should work now and in the future without modification:

remove nvidia rmmod nvidia-uvm nvidia

Did another test and it appears that modprobe -vn nvidia tries to insmod each nvidia-XXX.ko file (-v = verbose, -n = dry run). This is probably not what you want. What about dropping the aliases and relying on the fact that modprobe -r operates directly on the module name (nvidia)?
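
To see which rules modprobe has actually picked up for a given name, the effective configuration can be dumped with `modprobe -c` and filtered. This is a small sketch under stated assumptions: `filter_rules` is a hypothetical helper name, it reads the configuration dump on stdin, and it only matches the first argument of each directive.

```shell
#!/bin/sh
# Hypothetical helper: from a modprobe configuration dump on stdin, keep the
# alias/remove/blacklist lines whose first argument is exactly the given name.
filter_rules() {
    grep -E "^(alias|remove|blacklist)[[:space:]]+$1([[:space:]]|\$)"
}
```

Typical use would be `modprobe -c | filter_rules nvidia`, alongside the dry run `modprobe -v -n nvidia` mentioned above. More than one `alias nvidia ...` line in the result corresponds to the multiple insmod attempts seen in the dry run.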

@bluca
Member Author

bluca commented Aug 10, 2015

I had tried with only the lsmod name, but it didn't work the first time, which is why I added the aliases. Let me try again; I'll report back later or tomorrow.

@bluca
Member Author

bluca commented Aug 11, 2015

@Lekensteyn - tried with only:

remove nvidia rmmod nvidia-uvm nvidia

But the module is not removed and the card is still active:

$ lsmod | grep nvidia
$ cat /proc/acpi/bbswitch 
0000:01:00.0 OFF
$ optirun glxgears
primus: warning: dropping a frame to avoid deadlock
XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":0.0"
      after 31 requests (31 known processed) with 0 events remaining.
primus: warning: dropping a frame to avoid deadlock
primus: warning: timeout waiting for display worker
$ cat /proc/acpi/bbswitch 
0000:01:00.0 ON
$ lsmod | grep nvidia
nvidia               8593408  0 
drm                   344064  5 i915,drm_kms_helper,nvidia

From the journal:

Aug 11 23:35:12 localhost acpid[1050]: client 6563[0:1001] has disconnected
Aug 11 23:35:15 localhost bumblebeed[823]: [ 4826.958481] [ERROR]Unloading nvidia driver timed out.

Do you have any alias or other rule defined, besides the remove one?
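
The manual check in the transcript above (card state from /proc/acpi/bbswitch, module presence from lsmod) can be scripted for repeated test runs. A minimal sketch follows; the helper names `bbswitch_state` and `module_listed` are hypothetical, and the parsing assumes the two-column bbswitch line and the lsmod layout shown in the transcript.

```shell
#!/bin/sh
# Hypothetical helper: print the power state (second field, e.g. ON or OFF)
# from a bbswitch status file; defaults to /proc/acpi/bbswitch.
bbswitch_state() {
    awk '{ print $2; exit }' "${1:-/proc/acpi/bbswitch}"
}

# Hypothetical helper: succeed if lsmod-style text on stdin lists the named
# module in its first column (a mention in the "used by" column is ignored).
module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}
```

After `optirun glxgears` exits, a passing run would have `bbswitch_state` print OFF and `lsmod | module_listed nvidia` fail.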

@Vincent-C
Member

@bluca I had no idea you could specify multiple aliases like that, but if that works and has no other drawbacks other than log spam, I'm ok with it. I'd like to see these alias directives installed by the nvidia driver packages in Ubuntu as well, but I agree that until that happens, it's best that we do so ourselves in bumblebee.

@Lekensteyn, ditto what Luca said above. remove nvidia rmmod nvidia-uvm nvidia doesn't seem to work without an alias?

Unless anyone has any objections, I'll pull this PR instead of reverting 2d3983e.

@Vincent-C
Member

Err, ignore what I said above, remove nvidia rmmod nvidia-uvm nvidia works and causes the module to be unloaded, but only after I remove an additional alias (alias nvidia nvidia-current) that I happened to have defined in a separate modprobe.d conf file. I think that means we can simplify this PR to something along the lines of echo "remove nvidia rmmod nvidia-uvm nvidia" >> /usr/share/bumblebee/default-conf/bumblebee.conf during postinst on Ubuntu systems?
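
A plain `echo ... >>` adds another copy of the rule every time the script runs, for example on each package upgrade. A minimal sketch of an idempotent variant is shown below; `append_once` is a hypothetical helper name and is not taken from the PR. Note that the thread later raises a debsums concern about editing this file from postinst at all.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if an identical line
# is not already present (fixed-string, whole-line match).
append_once() {
    line=$1
    file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Usage would look like `append_once "remove nvidia rmmod nvidia-uvm nvidia" /usr/share/bumblebee/default-conf/bumblebee.conf`.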

@bluca
Member Author

bluca commented Aug 12, 2015

@Vincent-C - the alias you removed is the one in blacklist.conf, I assume? I completely missed that it was there, and indeed once removed the simple remove line is enough.

Any clue where that alias comes from?
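
One way to track down a stray alias like this is to search the modprobe configuration directory for alias lines on the module name. The sketch below is an illustration only: `find_aliases` is a hypothetical helper name, and it defaults to /etc/modprobe.d, although modprobe may also read other directories such as /lib/modprobe.d and /run/modprobe.d.

```shell
#!/bin/sh
# Hypothetical helper: list every "alias <name> ..." line, with file name and
# line number, under a modprobe.d-style directory (default /etc/modprobe.d).
find_aliases() {
    name=$1
    dir=${2:-/etc/modprobe.d}
    grep -rnE "^[[:space:]]*alias[[:space:]]+$name[[:space:]]" "$dir"
}
```

Running `find_aliases nvidia` prints each match prefixed with its file and line number; on Debian-based systems, `dpkg -S` on the reported path may then tell which package ships that file.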

Anyway, I can confirm that after removing it, the simple remove line is enough, and that includes Cuda!

$ sudo optirun --no-xorg /media/a1/cuda-examples/0_Simple/clock/clock
CUDA Clock sample
GPU Device 0: "GeForce GT 550M" with compute capability 2.1

Total clocks = 41878
$ lsmod | grep nvidia
$ cat /proc/acpi/bbswitch 
0000:01:00.0 OFF

(Note that making CUDA work requires defining an additional alias: alias nvidia-uvm nvidia_352-uvm)

In a few hours I'll update the PR as you requested.

Later tonight or tomorrow I'll also open a bug against the drivers, asking for these aliases and rules to be included there, and also asking them to always install the modprobe rules, even when the driver is not the main GL provider.

@bluca
Member Author

bluca commented Aug 12, 2015

@Vincent-C - pushed, could you please give it a run as well to make sure it's ok? If it is, I'll double-commit to Alioth as well. Thanks!

On Ubuntu, the Nvidia driver packages at the moment do not define
a remove rule for modprobe, which causes bumblebee to fail when
the patch to use modprobe -r is applied.
Add remove rules to bumblebee.conf during postinst phase if
building on Ubuntu as a workaround.
@Vincent-C
Member

@bluca thumbs up from me; please push to Alioth. I'll handle the PPA uploads, as well as in Debian after you've pushed your changes to Alioth. Thanks!

Vincent-C added a commit that referenced this pull request Aug 13, 2015
Generate aliases in bumblebee.conf on Ubuntu
@Vincent-C Vincent-C merged commit f6a0a17 into Bumblebee-Project:master Aug 13, 2015
@bluca
Member Author

bluca commented Aug 13, 2015

@Vincent-C - Thanks! Pushed to Alioth.

Is it worth uploading to Debian, given there's no difference? I'm fine with leaving it aside until there's something else to push.

@Vincent-C
Member

@bluca, yes, because otherwise Ubuntu 15.10/wily users (who don't use the PPA) will hit this regression.

@bluca
Member Author

bluca commented Aug 13, 2015

Ah, of course, didn't think about that, sorry :-)

@Lekensteyn
Member

Shouldn't this be done in the rules file rather than the postinst file? By doing it there you would break debsums if I am not mistaken.

@bluca bluca deleted the add-modprobe-aliases branch August 13, 2015 21:15
@bluca
Member Author

bluca commented Aug 13, 2015

Unfortunately this is not the only bit of sed hackery in the postinst, there's already one more I can see. We should really clean it up and generate those files at build time, via the rules as you pointed out. As a band-aid to solve the un-loading problem it should do for now, though :-)

@Vincent-C
Member

@Lekensteyn you're right, I didn't think about debsums. Fixed in 46d7e22, thanks!

@bluca dh_md5sums doesn't generate hashsums for conffiles in /etc (by default, at least), so the existing sed hackery in bumblebee/bumblebee-nvidia's postinst won't break debsums. Your latest change will, however, so I think it's worth fixing.
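
The debsums concern comes down to a checksum recorded at build time no longer matching a file that a maintainer script edits after installation. The sketch below reproduces that effect with plain `md5sum` on temporary files; it is an illustration of the mechanism rather than how debsums itself is implemented, and the helper names `record_sum` and `matches_sum` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: record a file's checksum in a manifest, roughly as a
# package's md5sums list would at build time.
record_sum() {
    md5sum "$1" > "$2"
}

# Hypothetical helper: succeed only if every file named in the manifest
# still matches its recorded checksum.
matches_sum() {
    md5sum -c "$1" >/dev/null 2>&1
}
```

If a file under /usr is recorded this way and a postinst script then appends a line to it, `matches_sum` fails, which is the kind of mismatch a checksum verifier would report. Generating the final file at build time avoids this, because the checksum is computed over the content that actually gets installed.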

@bluca
Member Author

bluca commented Aug 15, 2015

Ah, absolutely true, sorry for missing that.

@Lekensteyn - sorry, you were right, I answered your comment in a hurry and forgot the sed hackery I added touched the file in /usr rather than the copy in /etc
