Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for nvidia_modeset #709

Closed
wants to merge 3 commits into from
Closed

Conversation

arafey
Copy link

@arafey arafey commented Dec 8, 2015

This should fix issue #699 without causing any problems for non-affected users. It should correctly detect if nvidia and nvidia_modeset are loaded, unload nvidia_modeset using modprobe -r nvidia_modeset (this also unloads nvidia), and log errors (as before). If it doesn't detect nvidia_modeset but detects "driver", then it will unload the detected driver just like before, except it does so using modprobe -r "driver" for consistency.

@ghost
Copy link

ghost commented Dec 17, 2015

Hi, on my almost fresh Ubuntu 15.10 unloading only "nvidia_modeset" doesn't unload "nvidia" module. I replaced "modprobe -r" with "rmmod" in module.c and it's working now. Any ideas why?

@Lekensteyn
Copy link
Member

Doesn't the modprobe trick from #699 (comment) work?

@invidian Probably because there is a modprobe alias rule somewhere, can you check /etc/modprobe.d/ or {/usr,}/lib/modprobe.d/?

@ghost
Copy link

ghost commented Dec 17, 2015

Well, I have /etc/modprobe.d/nvidia-graphics-drivers.conf which is a symlink to /etc/alternatives/x86_64-linux-gnu_nvidia_modconf and it contains

This file was installed by nvidia-358

Do not edit this file manually

blacklist nouveau
blacklist lbm-nouveau
blacklist nvidia-current
blacklist nvidia-173
blacklist nvidia-96
blacklist nvidia-current-updates
blacklist nvidia-173-updates
blacklist nvidia-96-updates
blacklist nvidia-358-updates
alias nvidia nvidia_358
alias nvidia-modeset nvidia_358-modeset
alias nvidia-uvm nvidia_358-uvm
alias nouveau off
alias lbm-nouveau off

"modprobe",
"-r",
"nvidia_uvm",
"nvidia_modeset",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just came across this line of code. You need to add driver as well, otherwise you will unload only modeset module, but not nvidia. Alternatively, you can change the logic removing else if statements and unload modules one by one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my case, the driver line was unnecessary, since modprobe -r nvidia_modeset also unloaded nvidia. This may be specific to my system (Lenovo ThinkPad W540 running Arch Linux).

I agree with @invidian below that the whole thing needs a rewrite, not only because it's an inefficient fix, but a superior function would be modular, i.e. if nVidia starts throwing in nvidia_x with a new driver, we could simply add nvidia_x alongside nvidia, nvidia_modeset, and nvidia_uvm. Auto-detection would be ideal by parsing the output of lsmod | grep nvidia and automagically generating a list of modules to feed to module_unload based on that particular system.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arafey I have Arch Linux also, but I use nvidia-dkms package, so it might be somehow introduce a different behaviour, though, I haven't seen a case where one module would unload another unless you write so in modprobe.conf.

@ghost
Copy link

ghost commented Dec 24, 2015

  else if (module_is_loaded("nvidia_modeset") == 1) {
    int retries = 30;
    bb_log(LOG_INFO, "Unloading nvidia_modeset driver\n");
    char *mod_argv[] = {
      "modprobe",
      "-r",
      "nvidia_modeset",
      driver,
      NULL
    };

I made it like that and now it's working well. But in my opinion this function needs rewrite. Function unload_module should only unload one module. And we should create function unload_modules which checks module dependencies and unloading it. That way we also avoid many line duplications. I will try to make it in next days.

@frol
Copy link

frol commented Dec 26, 2015

@arafey Have you thought about reading lsmod and modprobe code to avoid running binaries in the first place, but make a library call instead? Binaries are less predictable, you know.

@ghost
Copy link

ghost commented Dec 26, 2015

In <linux/module.h> there are init_module and delete_module functions. I will try to use them.

@mudler
Copy link

mudler commented Dec 26, 2015

just my 2cents: like @arafey stated, on Sabayon (Gentoo based), the driver line was unnecessary, since modprobe -r nvidia_modeset also unloaded nvidia as well. i'll ping testers and see what we have in different systems

@frol
Copy link

frol commented Dec 26, 2015

@mudler Even if it is like you describe, it is generally a bad idea to rely on implicit behaviour. The code unloads nvidia-uvm and nvidia-modeset, but checks if nvidia is still loaded. Do you see the flaw?

@mudler
Copy link

mudler commented Dec 26, 2015

@frol absolutely, don't get me wrong here, what you reported is right and i don't see flaws in your logic.
Indeed, i consider the patches addressing to this issue as a workaround, not actually a fix.

that needs to be solved and handled properly, but meanwhile it is good to provide at least a quick workaround that could work on some hosts instead having a totally bugged package that could also cause damage to people. ( As for example, on my PC i have Optimus, but no lights at all indicating that my GPU is in use. This is a disaster when you are not paying attention to it and the discrete is permanently ON)

The only thing that worries me is that on the Official bumblebee repository (on the main branch) i do not see a commit since 2 years. I never looked at it's code since now, and to be honest, this portion of code seemed just creepy to me.

@frol
Copy link

frol commented Dec 26, 2015

@mudler The code is creepy indeed, but nobody seems to have enough time to maintain this quite popular and important project. Hopefully, @invidian will have a proper fix for this.

@mudler
Copy link

mudler commented Dec 26, 2015

@frol yeah, it is quite sad to see that. In theory who is more familiar with the codebase should at least give an "eye" on those kinda of issues, but then there are a lot of reasons why this doesn't happen (Life comes first, job, health, ecc.) and i totally respect that. Unfortunately my free time is so small that i can't work on a proper fix, but i'm glad to see that there are volunteers working on that 👍

@Mayurifag
Copy link

Can someone tell me how to implement all of this in my bumblebee in archlinux? Nvidia dont unload due to nvidia_modeset in my case, so i want to fix it, thanks!

@ghost
Copy link

ghost commented Dec 26, 2015

@Mayurifag Here you have quick workaround which I am using on my ubuntu.

 #   Setting ProbeAllGpus to false prevents the new proprietary driver
 #   instance spawned to try to control the integrated graphics card,
diff --git a/src/module.c b/src/module.c
index b5a4d79..5c44123 100644
--- a/src/module.c
+++ b/src/module.c
@@ -120,6 +120,7 @@ int module_unload(char *driver) {
       "modprobe",
       "-r",
       "nvidia_modeset",
+      driver,
       NULL
     };
     bb_run_fork_wait(mod_argv, 10);

@frol I am amateur and young developer, but I will try to make it better in my free time. IMO it's the best way to learn some new things.

@Mayurifag
Copy link

@invidian well, now i have package bumblebee-git installed and it seems to already have your workaround, but videocard doesnt switch off after i close optirun.

@ghost
Copy link

ghost commented Dec 27, 2015

Take a look at bumblebeed.service logs, check which nvidia modules are still loaded, so probably lsmod | grep nvidia and may be try to compile bumblebee from source with my patch. As far as I see, PKGBUILD from bumblebee-git doesn't have any patch for this issue.

@Mayurifag
Copy link

@invidian yeah, i saw my issue, i lied you, sorry. It was a hard day for me even except this issue. Now im going to compile from source, thank you.

@Mayurifag
Copy link

It seems, I finally solved my problem.
Let me write a guide. I'm new at such things like change source before building and Markdown's syntax, so don't argue at me, please, ha-ha.

My distro is Archlinux.
Firstly, I wrote $ yaourt -S bumblebee-git.
I confirmed everything and waited until git downloaded repository to my /tmp.
Next moment I did Ctrl+Z. You have to do that before build, but after download sources.
I opened /tmp/yaourt-tmp.../aur-bumblebee-git/src/bumblebee/src/module.c.
I have no idea what I had to add, so I added nvidia_uvm and nvidia_modeset from this topic.
So, the most important part of file now looks like that:

...
int module_unload(char *driver) {
  if (module_is_loaded(driver) == 1) {
    int retries = 30;
    bb_log(LOG_INFO, "Unloading %s driver\n", driver);
    char *mod_argv[] = {
      "modprobe",
      "-r",
      "nvidia_uvm",
      "nvidia_modeset",
      driver,
      NULL
    };
...

I saved the file, got to the terminal, typed $ jobs, saw that I had only one job for now, typed $ fg %1 to resume compilation. I also checked that my user is still in group bumblebee, etc.
Now everything works like a charm. Thanks to all of you, guys, really love you.

P.S. I also had problem with dkms and nvidia, that was loaded during boot. Setting LOAD=no in /etc/default/dkms solved this.

@ghost
Copy link

ghost commented Dec 27, 2015

You can also install bumblebee normally (via package manager), download the source, apply patch, make and sudo make install to patch bumblebee in system.

BTW now I am reading what is the proper way to load and remove kernel modules from C code. So I am digging inside kernel source, stack overflow etc.

@Mayurifag
Copy link

@invidian one my friend told me one day that I mustn't use make and make install when I have yaourt in my system.

@ghost
Copy link

ghost commented Dec 27, 2015

I created another pull request, because I think, that solution is much simpler, than we thought. #717

@frol
Copy link

frol commented Dec 27, 2015

@Mayurifag Your friend was completely right! Never do make install unless you want to make your system a rubbish dump.

@invidian I have already seen this kind of "fix" somewhere here. I don't remember why, but it didn't work well. BTW, implementing a proper fix will require not only proper unloading of the modules, but also loading back all those helping modules.

@ghost
Copy link

ghost commented Dec 27, 2015

@frol You are right, I should think twice before I write/do something. How about something like this? I know it is not perfect, but it should work without modprobe aliases workaround.

diff --git a/src/bbsecondary.c b/src/bbsecondary.c
index 71a6b73..5b01b91 100644
--- a/src/bbsecondary.c
+++ b/src/bbsecondary.c
@@ -104,7 +104,7 @@ static bool switch_and_load(void)
   if (pci_get_driver(driver, pci_bus_id_discrete, sizeof driver)) {
     /* if the loaded driver does not equal the driver from config, unload it */
     if (strcasecmp(bb_config.driver, driver)) {
-      if (!module_unload(driver)) {
+      if (!modules_unload(driver)) {
         /* driver failed to unload, aborting */
         return false;
       }
@@ -232,7 +232,7 @@ static void switch_and_unload(void)
       }
       /* unload the driver loaded by the graphica card */
       if (pci_get_driver(driver, pci_bus_id_discrete, sizeof driver)) {
-        module_unload(driver);
+        modules_unload(driver);
       }

       //only turn card off if no drivers are loaded
diff --git a/src/module.c b/src/module.c
index f7b99fa..a9a380e 100644
--- a/src/module.c
+++ b/src/module.c
@@ -112,6 +112,13 @@ int module_unload(char *driver) {
   return 1;
 }

+int modules_unload(char *driver){
+  if (strcmp(driver, "nvidia") == 0){
+    module_unload("nvidia_modeset");
+    module_unload("nvidia_uvm");
+  }
+  return module_unload(driver);
+}
 /**
  * Checks whether a kernel module is available for loading
  *
diff --git a/src/module.h b/src/module.h
index b1c5e75..905e16e 100644
--- a/src/module.h
+++ b/src/module.h
@@ -23,4 +23,5 @@
 int module_is_loaded(char *driver);
 int module_load(char *module_name, char *driver);
 int module_unload(char *driver);
+int modules_unload(char *driver);
 int module_is_available(char *module_name);

@frol
Copy link

frol commented Dec 27, 2015

@invidian It looks better then this PR, but it is still just a hack. However, the original module_unload is not ideal at all, so your fix at least doesn't make it completely broken. Nevertheless, if I were a maintainer of a project I won't merge such a fix...

@ArchangeGabriel
Copy link
Member

We’re not going to merge this, as @Lekensteyn stated we’re not going to change our code for every module nvidia will invent. Instead, we will just ensure there is proper aliasing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants