Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/nvidia: move TOPOLOGY_FILE_PATH and DATABASE_PATH into nvidia-fabricmanager service definition #320830

Merged

Conversation

philiptaron
Copy link
Contributor

Description of changes

Move the TOPOLOGY_FILE_PATH and DATABASE_PATH keys from hardware.nvidia.datacenter.settings default into the service file derivation. As can be seen on line 9, nvidia_x11 can be null and thus is unsuitable for use in this venue.

I discovered this while testing out #313497.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

… hardware.nvidia.datacenter.settings default into the service file
Copy link
Member

@FireyFly FireyFly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me, reducing the scope to one where nvidia_x11 is known to be non-null

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-already-reviewed/2617/1787

Comment on lines -49 to -50
TOPOLOGY_FILE_PATH = "${nvidia_x11.fabricmanager}/share/nvidia-fabricmanager/nvidia/nvswitch";
DATABASE_PATH = "${nvidia_x11.fabricmanager}/share/nvidia-fabricmanager/nvidia/nvswitch";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side-note: I find these passthru packages in nvidia_x11 rather annoying; we recently found that e.g. libXNctrl could be moved out of nvidia_x11 entirely (#318092); do you know if is fabricmanager driver-locked?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.nvidia.com/datacenter/tesla/pdf/fabric-manager-user-guide.pdf

That diagram suggests that it is, sadly, but I don't know for sure.

# unsuitable to be mentioned in the configuration defaults, but they _can_
# be overridden in `cfg.datacenter.settings` if needed.
fabricManagerConfDefaults = {
TOPOLOGY_FILE_PATH = "${nvidia_x11.fabricmanager}/share/nvidia-fabricmanager/nvidia/nvswitch";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sanity-check: here nvidia_x11 != null because it's guarded by cfg.enable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct.

@SomeoneSerge SomeoneSerge merged commit 35a472d into NixOS:master Jun 27, 2024
24 checks passed
@philiptaron philiptaron deleted the nixos-nvidia-change-datacenter-defaults branch June 27, 2024 16:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants