
nixos/nvidia : added nvidia-persistenced #77054

Merged: 1 commit merged into NixOS:master on May 10, 2020

Conversation

@Rakesh4G (Contributor) commented Jan 6, 2020

Motivation for this change

Fixes: #75099

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.
Notify maintainers

cc @CMCDragonkai @mdedetrich

@mdedetrich commented Jan 6, 2020

I believe this should be optional, with the default turned off. The reason is that nvidia-persistenced matters when you have NVIDIA GPUs used for machine learning: you want those GPUs to stay online even though your display runs through a completely different GPU (e.g. an integrated Intel GPU), or there is no display at all (a server accessed over the network).

@CMCDragonkai Can you confirm this?

@CMCDragonkai (Member) commented Jan 7, 2020

Yes by default it should be off.

@Rakesh4G (Contributor, Author) commented Jan 7, 2020

OK.

@CMCDragonkai (Member) commented Jan 7, 2020

This requires a module option and a mkIf that checks the enable option before writing the config.
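The pattern described above can be sketched as a NixOS module: an `enable`-style boolean option, plus a `mkIf` guard so the systemd unit is only generated when the option is set. This is an illustrative sketch, not the merged code; the option path and the `persistenced` package attribute are assumptions based on common nixpkgs naming.

```nix
# Sketch only: option path and package attribute are assumptions.
{ config, lib, ... }:

with lib;

let
  cfg = config.hardware.nvidia.nvidiaPersistenced;  # hypothetical option path
in
{
  options.hardware.nvidia.nvidiaPersistenced = mkOption {
    type = types.bool;
    default = false;  # off by default, as agreed above
    description = "Run nvidia-persistenced to keep GPU state initialized on headless machines.";
  };

  # mkIf ensures no config is emitted unless the option is enabled.
  config = mkIf cfg {
    systemd.services.nvidia-persistenced = {
      description = "NVIDIA Persistence Daemon";
      wantedBy = [ "multi-user.target" ];
      serviceConfig = {
        Type = "forking";
        ExecStart = "${config.boot.kernelPackages.nvidia_x11.persistenced}/bin/nvidia-persistenced --verbose";
      };
    };
  };
}
```

With `mkIf`, the service definition evaluates lazily, so systems that leave the option at its `false` default pay no cost and gain no extra unit.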

@Rakesh4G Rakesh4G marked this pull request as ready for review Jan 8, 2020
@Rakesh4G (Contributor, Author) commented Jan 9, 2020

> This requires a module option and a mkIf that checks the enable option before writing the config.

Done. I have changed this to be optional, defaulting to false.
@CMCDragonkai @mdedetrich
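For reference, consuming such an option from a system configuration might look like the following; the exact option path is an assumption (check the merged module for the real name), not something confirmed in this thread.

```nix
{
  # Hypothetical option path; defaults to false, so this line opts in.
  hardware.nvidia.nvidiaPersistenced = true;
}
```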

@Rakesh4G Rakesh4G changed the title added nvidia-persistenced nixos/modules/hardware/video/nvidia : added nvidia-persistenced Jan 9, 2020
@Rakesh4G Rakesh4G changed the title nixos/modules/hardware/video/nvidia : added nvidia-persistenced nixos/hardware/nvidia : added nvidia-persistenced Jan 9, 2020
@Rakesh4G Rakesh4G changed the title nixos/hardware/nvidia : added nvidia-persistenced nixos/nvidia : added nvidia-persistenced Jan 9, 2020
@mdedetrich commented Jan 9, 2020

Looks fine to me, @CMCDragonkai ?

@coreyoconnor (Contributor) commented Jan 22, 2020

Works for me: nvidia-smi consistently reports both GPUs available

Tue Jan 21 17:46:14 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.44       Driver Version: 440.44       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K10.G1.8GB    On   | 00000000:05:00.0 Off |                  Off |
| N/A   29C    P8    17W / 117W |      0MiB /  4037MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K10.G1.8GB    On   | 00000000:06:00.0 Off |                  Off |
| N/A   37C    P8    17W / 117W |      0MiB /  4037MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

@CMCDragonkai (Member) commented Jan 22, 2020

Your machine is different. This is needed on headless servers with an embedded Intel GPU and 4 or more GPUs.
@coreyoconnor (Contributor) commented Jan 23, 2020

> Your machine is different. This is needed on headless servers with an embedded Intel GPU and 4 or more GPUs.


I think you are confused. This is useful on any system running an NVIDIA GPU in headless mode. That is my system, and this definitely resolves issues there, so the requirement of "embedded intel gpu and 4 or more gpus" is incorrect. It would also be inconsistent with the nvidia-persistenced documentation.

@CMCDragonkai (Member) commented Jan 23, 2020

Great, then this works for any headless system.

@CMCDragonkai (Member) commented Feb 25, 2020

This is ready to be merged. Anybody?

@CMCDragonkai (Member) commented Feb 25, 2020

@Rakesh4G Can you clean up the commits, squash them into one, and follow the commit message convention?
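One common way to squash a feature branch into a single commit is `git reset --soft` against the base branch followed by a fresh commit. The sketch below demonstrates the technique in a throwaway repository (branch names and the commit message are illustrative, not the actual state of this PR):

```shell
set -e
# Throwaway repo so the demo is self-contained.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email demo@example.com
git config user.name Demo
git commit -q --allow-empty -m "base"
git branch -M master              # normalize the base branch name
git checkout -q -b feature
git commit -q --allow-empty -m "wip 1"
git commit -q --allow-empty -m "wip 2"
# Squash: move the branch tip back to master, keeping the work staged,
# then record it all as one commit.
git reset --soft master
git commit -q --allow-empty -m "nixos/nvidia: added nvidia-persistenced"
git rev-list --count master..feature   # prints 1: the branch is now a single commit
```

After squashing a branch that was already pushed, the remote must be updated with `git push --force-with-lease`, which is why the force-pushes below appear in the PR timeline.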

@Rakesh4G Rakesh4G force-pushed the formbay:nvidia-persistenced branch from 3acc886 to bdff4d6 Feb 25, 2020
@Rakesh4G (Contributor, Author) commented Feb 25, 2020

Hi @CMCDragonkai, I have cleaned up the commits and squashed them into one. Thanks.

@CMCDragonkai (Member) commented Mar 9, 2020

You have to resolve the conflicts in the PR right now.

But it's unfortunate that this PR has sat here without anyone attempting to push it through.

@Rakesh4G Rakesh4G force-pushed the formbay:nvidia-persistenced branch from bdff4d6 to 4fda14a Mar 9, 2020
@Rakesh4G Rakesh4G force-pushed the formbay:nvidia-persistenced branch from 4fda14a to 89a8a31 Mar 9, 2020
@CMCDragonkai (Member) commented May 8, 2020

This is ready to be merged.

Anyone?

Someone please merge this before this falls out of scope for maintenance!

@CMCDragonkai (Member) left a review

@mdedetrich commented May 8, 2020

I have no permission to merge; I think you will have to ping a maintainer.

@CMCDragonkai (Member) commented May 9, 2020

Who's the maintainer for this module?

@mdedetrich commented May 10, 2020

Judging by #66601, @romildo seems to merge NVIDIA-related PRs.

@romildo romildo merged commit be03474 into NixOS:master May 10, 2020
13 checks passed
  • Evaluation Performance Report
  • grahamcofborg-eval: ^.^!
  • grahamcofborg-eval-check-maintainers: matching changed paths to changed attrs...
  • grahamcofborg-eval-check-meta: config.nix: checkMeta = true
  • grahamcofborg-eval-darwin: nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./pkgs/top-level/release.nix -A darwin-tested
  • grahamcofborg-eval-nixos: nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./nixos/release-combined.nix -A tested
  • grahamcofborg-eval-nixos-manual: nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./nixos/release.nix -A manual
  • grahamcofborg-eval-nixos-options: nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./nixos/release.nix -A options
  • grahamcofborg-eval-nixpkgs-manual: nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./pkgs/top-level/release.nix -A manual
  • grahamcofborg-eval-nixpkgs-tarball: nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./pkgs/top-level/release.nix -A tarball
  • grahamcofborg-eval-nixpkgs-unstable-jobset: nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./pkgs/top-level/release.nix -A unstable
  • grahamcofborg-eval-package-list: nix-env -qa --json --file .
  • grahamcofborg-eval-package-list-no-aliases: nix-env -qa --json --file . --arg config { allowAliases = false; }
Details