nvidia: Preliminary nVidia/AMD PRIME and dynamic power management support #100519

Merged
merged 2 commits into from Jan 29, 2021
80 changes: 71 additions & 9 deletions nixos/modules/hardware/video/nvidia.nix
@@ -63,6 +63,15 @@ in
'';
};

hardware.nvidia.powerManagement.finegrained = mkOption {
Member

RTD3 power management is experimental, so I would probably not include it. It also isn't as much of a pain to set up by hand as editing the Xorg configuration, so it might be better to exclude it.

Contributor Author

I've seen several people ask about it. It's supposed to work with Intel iGPUs, and it's also supposed to work for APUs "shortly", so I thought I'd get a head start on supporting it.

It's still marked as experimental in the documentation, but the code here shouldn't need any change as it stabilizes, except perhaps to remove the udev exclusions.
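
For anyone following along, here is a minimal sketch of how a user might turn the new option on, assuming an Intel iGPU and PRIME offload. The bus IDs are placeholders and not values from this PR; look yours up with lspci.

```nix
{
  # Loads the NVIDIA module; the iGPU driver is selected by the PRIME logic.
  services.xserver.videoDrivers = [ "nvidia" ];

  hardware.nvidia = {
    # Experimental RTD3 power management added by this PR.
    powerManagement.finegrained = true;

    prime = {
      # Fine-grained power management is asserted to require offload mode.
      offload.enable = true;
      nvidiaBusId = "PCI:1:0:0"; # placeholder
      intelBusId = "PCI:0:2:0";  # placeholder
    };
  };
}
```

Sync mode has to stay disabled here, since the module asserts that Sync precludes powering down the NVIDIA GPU.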

type = types.bool;
default = false;
description = ''
Experimental power management of PRIME offload. For more information, see
the NVIDIA docs, chapter 22. PCI-Express runtime power management.
'';
};

hardware.nvidia.modesetting.enable = mkOption {
type = types.bool;
default = false;
@@ -96,6 +105,16 @@ in
'';
};

hardware.nvidia.prime.amdgpuBusId = mkOption {
type = types.str;
default = "";
example = "PCI:4:0:0";
description = ''
Bus ID of the AMD APU. You can find it using lspci; for example if lspci
shows the AMD APU at "04:00.0", set this option to "PCI:4:0:0".
'';
};

hardware.nvidia.prime.sync.enable = mkOption {
type = types.bool;
default = false;
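
A minimal sketch of the AMD variant from the user's side, assuming PRIME offload on a laptop with an AMD APU. The NVIDIA bus ID is a placeholder. As far as I know, lspci prints bus numbers in hexadecimal while the Xorg BusID is decimal, so a device at "0a:00.0" would become "PCI:10:0:0".

```nix
{
  # Only "nvidia" is listed here; per the review discussion, adding
  # "amdgpu" to videoDrivers is expected to break PRIME.
  services.xserver.videoDrivers = [ "nvidia" ];

  hardware.nvidia.prime = {
    offload.enable = true;
    nvidiaBusId = "PCI:1:0:0"; # placeholder
    amdgpuBusId = "PCI:4:0:0"; # lspci shows the APU at "04:00.0"
  };
}
```

Setting intelBusId as well would trip the new assertion that only one integrated GPU may be configured.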
@@ -153,15 +172,24 @@ in
};
};

config = mkIf enabled {
config = let
igpuDriver = if pCfg.intelBusId != "" then "modesetting" else "amdgpu";
igpuBusId = if pCfg.intelBusId != "" then pCfg.intelBusId else pCfg.amdgpuBusId;
in mkIf enabled {
assertions = [
{
assertion = with config.services.xserver.displayManager; gdm.nvidiaWayland -> cfg.modesetting.enable;
message = "You cannot use wayland with GDM without modesetting enabled for NVIDIA drivers, set `hardware.nvidia.modesetting.enable = true`";
}

{
assertion = primeEnabled -> pCfg.nvidiaBusId != "" && pCfg.intelBusId != "";
assertion = primeEnabled -> pCfg.intelBusId == "" || pCfg.amdgpuBusId == "";
message = ''
You cannot configure both an Intel iGPU and an AMD APU. Pick the one corresponding to your processor.
'';
}
{
assertion = primeEnabled -> pCfg.nvidiaBusId != "" && (pCfg.intelBusId != "" || pCfg.amdgpuBusId != "");
message = ''
When NVIDIA PRIME is enabled, the GPU bus IDs must be configured.
'';
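
To see how the two bus ID assertions behave, here is a small standalone sketch that can be evaluated with `nix-instantiate --eval --strict`. The `check` helper and the sample bus IDs are hypothetical; only the two boolean expressions mirror the hunk above. In Nix, `->` binds more loosely than `&&` and `||`, so each right-hand side is grouped as intended.

```nix
let
  # Hypothetical helper: evaluates both assertions for one configuration.
  check = { primeEnabled, nvidiaBusId ? "", intelBusId ? "", amdgpuBusId ? "" }: {
    # At most one of the two integrated GPUs may be configured.
    notBoth = primeEnabled -> intelBusId == "" || amdgpuBusId == "";
    # The NVIDIA bus ID plus exactly one integrated GPU bus ID is needed.
    haveIds = primeEnabled -> nvidiaBusId != "" && (intelBusId != "" || amdgpuBusId != "");
  };
in {
  # Passes both assertions.
  amdOnly = check { primeEnabled = true; nvidiaBusId = "PCI:1:0:0"; amdgpuBusId = "PCI:4:0:0"; };
  # Fails notBoth because both integrated GPUs are set.
  both = check { primeEnabled = true; nvidiaBusId = "PCI:1:0:0"; intelBusId = "PCI:0:2:0"; amdgpuBusId = "PCI:4:0:0"; };
  # With PRIME disabled, both implications hold trivially.
  disabled = check { primeEnabled = false; };
}
```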
@@ -174,6 +202,14 @@ in
assertion = !(syncCfg.enable && offloadCfg.enable);
message = "Only one NVIDIA PRIME solution may be used at a time.";
}
{
assertion = !(syncCfg.enable && cfg.powerManagement.finegrained);
message = "Sync precludes powering down the NVIDIA GPU.";
}
{
assertion = cfg.powerManagement.enable -> offloadCfg.enable;

Should this be cfg.powerManagement.finegrained instead?
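
A quick sketch of why the choice of option matters here, using hypothetical stand-in values rather than the real module config. It shows a user who enables plain `powerManagement.enable` without offload: the assertion as written would reject that configuration, while the suggested one would not.

```nix
let
  # Hypothetical stand-ins for cfg and offloadCfg in the module.
  cfg = { powerManagement = { enable = true; finegrained = false; }; };
  offloadCfg = { enable = false; };
in {
  # As written in the diff: false, so the assertion fires.
  asWritten = cfg.powerManagement.enable -> offloadCfg.enable;
  # As suggested in the comment: true, so the assertion passes.
  asSuggested = cfg.powerManagement.finegrained -> offloadCfg.enable;
}
```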

message = "Fine-grained power management requires offload to be enabled.";
}
];

# If Optimus/PRIME is enabled, we:
@@ -183,18 +219,22 @@ in
# "nvidia" driver, in order to allow the X server to start without any outputs.
# - Add a separate Device section for the Intel GPU, using the "modesetting"
# driver and with the configured BusID.
# - OR add a separate Device section for the AMD APU, using the "amdgpu"
# driver and with the configured BusID.
# - Reference that Device section from the ServerLayout section as an inactive
# device.
# - Configure the display manager to run specific `xrandr` commands which will
# configure/enable displays connected to the Intel GPU.
# configure/enable displays connected to the Intel iGPU / AMD APU.

services.xserver.useGlamor = mkDefault offloadCfg.enable;

services.xserver.drivers = optional primeEnabled {
name = "modesetting";
services.xserver.drivers = let
in optional primeEnabled {
name = igpuDriver;
display = offloadCfg.enable;
modules = optional (igpuDriver == "amdgpu") [ pkgs.xorg.xf86videoamdgpu ];
Member

Why is this needed? If you have an AMD GPU, this should be a prerequisite. I'm not familiar with this, but isn't there both an open-source and a closed-source driver? Is it compatible with both, and are conflicts resolved if both are included?

Contributor Author

@Baughn Baughn Oct 14, 2020

The proprietary driver is amdgpu-pro. It doesn't work basically at all, for anyone -- perhaps a slight exaggeration, but I'd be astonished to see it in use.

The modesetting driver is bundled with xorg; amdgpu isn't. The usual user interface for fixing that is adding it to videoDrivers, but as I've explained, doing so will break PRIME. It has to be added to modules, or the AMD GPU won't work at all.

Adding driver modules this way does nothing by itself, so a user who explicitly wants to install amdgpu-pro should be able to do so. Though I don't think that would work in a PRIME configuration, and this explicitly selects the amdgpu driver elsewhere. Note that the module doesn't let Intel users choose the intel driver instead of the modesetting one.

Member

Oh, alright I guess, but that doesn't stop anyone from adding "amdgpu" to videoDrivers, which would still break it based on what I've understood. That is probably what most people would do first just to get the display working.

If I've understood correctly, an assert to make sure amdgpu isn't in videoDrivers should be added if someone uses PRIME.

It's not that Intel users can't choose the intel xf86video driver, but the docs explicitly state that it's for modesetting, so it probably wouldn't work anyway if they chose to use the old driver. Though the Arch wiki states AMD GPUs are supported based on the docs, the link doesn't seem to show any proof of that (unless amdgpu implements modesetting as its own driver, in which case it would be implied).

Contributor Author

amdgpu implements modesetting, yes.
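
A sketch of the assertion suggested above, written as a standalone expression with hypothetical stand-in values. In the module itself, `videoDrivers` would presumably be `config.services.xserver.videoDrivers`; the message text is illustrative only.

```nix
let
  # Hypothetical stand-ins for values the module already has in scope.
  primeEnabled = true;
  videoDrivers = [ "nvidia" "amdgpu" ];
in {
  # Evaluates to false for this sample, so the assertion would fire.
  assertion = primeEnabled -> !(builtins.elem "amdgpu" videoDrivers);
  message = ''
    Do not add "amdgpu" to services.xserver.videoDrivers when NVIDIA PRIME is enabled.
  '';
}
```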

deviceSection = ''
BusID "${pCfg.intelBusId}"
BusID "${igpuBusId}"
${optionalString syncCfg.enable ''Option "AccelMethod" "none"''}
'';
} ++ singleton {
@@ -205,6 +245,7 @@ in
''
BusID "${pCfg.nvidiaBusId}"
${optionalString syncCfg.allowExternalGpu "Option \"AllowExternalGpus\""}
${optionalString cfg.powerManagement.finegrained "Option \"NVreg_DynamicPowerManagement=0x02\""}

per the docs here, this is in the wrong spot. this option should be set on the nvidia kernel module during its initialization. it's not an Xorg option for initializing the device:

[this feature] can be enabled or disabled via the NVreg_DynamicPowerManagement nvidia.ko kernel module parameter.

this setting also defaults to on for Ampere and newer cards as of this driver version.

Contributor

see #174057
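
For context, a sketch of the two usual ways to pass a parameter to nvidia.ko from a NixOS configuration, assuming the value 0x02 from the line under review. Only one of them would be needed, and neither is claimed to be what the linked issue settled on.

```nix
{
  # Option A: modprobe configuration, applied when the module is loaded.
  boot.extraModprobeConfig = ''
    options nvidia NVreg_DynamicPowerManagement=0x02
  '';

  # Option B (alternative): kernel command line in module.parameter form,
  # the same style this module uses for NVreg_PreserveVideoMemoryAllocations.
  # boot.kernelParams = [ "nvidia.NVreg_DynamicPowerManagement=0x02" ];
}
```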

'';
screenSection =
''
@@ -214,14 +255,14 @@ in
};

services.xserver.serverLayoutSection = optionalString syncCfg.enable ''
Inactive "Device-modesetting[0]"
Inactive "Device-${igpuDriver}[0]"
'' + optionalString offloadCfg.enable ''
Option "AllowNVIDIAGPUScreens"
'';

services.xserver.displayManager.setupCommands = optionalString syncCfg.enable ''
# Added by nvidia configuration module for Optimus/PRIME.
${pkgs.xorg.xrandr}/bin/xrandr --setprovideroutputsource modesetting NVIDIA-0
${pkgs.xorg.xrandr}/bin/xrandr --setprovideroutputsource ${igpuDriver} NVIDIA-0
Member

The configuration looks to be basically the same for Intel and AMD, so not sure why you're adding logic?

Contributor Author

@Baughn Baughn Oct 14, 2020

The AMD config doesn't use the modesetting driver, and calling it that would be misleading.

Granted, Sync doesn't work on my hardware at all. I have no way of testing this, so if you believe it should unconditionally be "modesetting", I'll remove the change.

Member

@eadwu eadwu Oct 15, 2020

The Arch wiki uses radeon instead of amdgpu in its example, though I'm not sure what it is for AMD. This should just be a projection from different sinks from xrandr --listproviders, i.e., Provider 0 -> 1

Provider 0: id: 0x46 cap: 0xf, Source Output, Sink Output, Source Offload, Sink Offload crtcs: 3 outputs: 4 associated providers: 0 name:modesetting
Provider 1: id: 0x258 cap: 0x0 crtcs: 0 outputs: 0 associated providers: 0 name:NVIDIA-G0

I'm not sure about naming. If they both use modesetting internally, then I would leave it as modesetting; otherwise it's fine to leave it.
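
A small sketch of how the provider name ends up in the generated command, using a hypothetical stand-in for the PRIME config. It assumes the xrandr provider name matches the Xorg driver name, which is exactly the open question in this thread, so the real name should be confirmed with xrandr --listproviders on AMD hardware.

```nix
let
  # Hypothetical stand-in for config.hardware.nvidia.prime.
  pCfg = { intelBusId = ""; amdgpuBusId = "PCI:4:0:0"; };
  # Same selection logic as the let-binding introduced in this PR.
  igpuDriver = if pCfg.intelBusId != "" then "modesetting" else "amdgpu";
in
  # Command string that would be placed in displayManager.setupCommands.
  "xrandr --setprovideroutputsource ${igpuDriver} NVIDIA-0"
```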

${pkgs.xorg.xrandr}/bin/xrandr --auto
'';

@@ -292,16 +333,37 @@ in
boot.kernelParams = optional (offloadCfg.enable || cfg.modesetting.enable) "nvidia-drm.modeset=1"
++ optional cfg.powerManagement.enable "nvidia.NVreg_PreserveVideoMemoryAllocations=1";

# Create /dev/nvidia-uvm when the nvidia-uvm module is loaded.
services.udev.extraRules =
''
# Create /dev/nvidia-uvm when the nvidia-uvm module is loaded.
KERNEL=="nvidia", RUN+="${pkgs.runtimeShell} -c 'mknod -m 666 /dev/nvidiactl c $$(grep nvidia-frontend /proc/devices | cut -d \ -f 1) 255'"
KERNEL=="nvidia_modeset", RUN+="${pkgs.runtimeShell} -c 'mknod -m 666 /dev/nvidia-modeset c $$(grep nvidia-frontend /proc/devices | cut -d \ -f 1) 254'"
KERNEL=="card*", SUBSYSTEM=="drm", DRIVERS=="nvidia", RUN+="${pkgs.runtimeShell} -c 'mknod -m 666 /dev/nvidia%n c $$(grep nvidia-frontend /proc/devices | cut -d \ -f 1) %n'"
KERNEL=="nvidia_uvm", RUN+="${pkgs.runtimeShell} -c 'mknod -m 666 /dev/nvidia-uvm c $$(grep nvidia-uvm /proc/devices | cut -d \ -f 1) 0'"
KERNEL=="nvidia_uvm", RUN+="${pkgs.runtimeShell} -c 'mknod -m 666 /dev/nvidia-uvm-tools c $$(grep nvidia-uvm /proc/devices | cut -d \ -f 1) 0'"
'' + optionalString cfg.powerManagement.finegrained ''
# Remove NVIDIA USB xHCI Host Controller devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c0330", ATTR{remove}="1"

# Remove NVIDIA USB Type-C UCSI devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c8000", ATTR{remove}="1"

# Remove NVIDIA Audio devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x040300", ATTR{remove}="1"

# Enable runtime PM for NVIDIA VGA/3D controller devices on driver bind
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="auto"
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="auto"

# Disable runtime PM for NVIDIA VGA/3D controller devices on driver unbind
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="on"
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="on"
'';

boot.extraModprobeConfig = mkIf cfg.powerManagement.finegrained ''
options nvidia "NVreg_DynamicPowerManagement=0x02"
'';

boot.blacklistedKernelModules = [ "nouveau" "nvidiafb" ];

services.acpid.enable = true;