@zvonkok (Collaborator) commented Jan 15, 2026

  • Replace manual PCI device enumeration with go-nvlib/nvpci library
  • Add support for discovering NVSwitches bound to vfio-pci driver
  • NVSwitches always use their actual device name (ignore PGPUAlias)
  • Remove redundant helper functions (readIDFromFileFunc, readLinkFunc, getDeviceName, locateVendor, readVFIODev)
  • Simplify Allocate() by using stored IommuFD instead of re-reading sysfs
  • Remove unused constants (basePath, pciIdsFilePath)
  • Fix context leak in grpc connect()
  • Update tests to use nvpci.InterfaceMock
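
The NVSwitch naming rule above can be sketched as follows. This is only an illustration of the stated behavior, not the plugin's actual code: `deviceKind`, `resourceName`, and the sample device names are hypothetical, and the only assumption taken from the description is that a configured PGPUAlias applies to GPUs but is ignored for NVSwitches.

```go
package main

import "fmt"

// deviceKind is a hypothetical classification used only for this sketch.
type deviceKind int

const (
	kindGPU deviceKind = iota
	kindNVSwitch
)

// resourceName is a hypothetical helper that picks the advertised name.
// GPUs use the alias when one is configured; NVSwitches always keep
// their actual device name, regardless of the alias.
func resourceName(kind deviceKind, deviceName, alias string) string {
	if kind == kindGPU && alias != "" {
		return alias
	}
	return deviceName
}

func main() {
	// Sample names are made up for illustration.
	fmt.Println(resourceName(kindGPU, "EXAMPLE_GPU", "MY_ALIAS"))
	fmt.Println(resourceName(kindNVSwitch, "EXAMPLE_NVSWITCH", "MY_ALIAS"))
}
```

Keeping the rule in one small function makes it easy to cover both device kinds with a table-driven test.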

Copilot AI review requested due to automatic review settings January 15, 2026 21:29

Copilot AI left a comment

Pull request overview

This PR simplifies device discovery by replacing manual PCI enumeration with the go-nvlib/nvpci library and adds support for discovering NVSwitches. The refactoring also removes redundant helper functions, fixes a context leak in the gRPC connect() function, and updates the tests to use nvpci.InterfaceMock.

Changes:

  • Replaced manual PCI device enumeration with go-nvlib/nvpci library
  • Added NVSwitch device discovery support (with special handling to ignore PGPUAlias)
  • Fixed context leak in grpc connect() function by properly deferring cancel
  • Removed unused constants and redundant helper functions
  • Updated tests to use nvpci.InterfaceMock

Reviewed changes

Copilot reviewed 6 out of 44 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| pkg/device_plugin/device_plugin.go | Core refactoring using go-nvlib for device discovery and NVSwitch support |
| pkg/device_plugin/generic_device_plugin.go | Fixed context leak in connect() and simplified Allocate() |
| pkg/device_plugin/constants.go | Removed unused constants (basePath, pciIdsFilePath) |
| pkg/device_plugin/device_plugin_test.go | Refactored tests to use nvpci.InterfaceMock |
| pkg/device_plugin/generic_device_plugin_test.go | Updated test data structures and improved test cases |
| go.mod | Added go-nvlib dependency and updated Go version to 1.25 |
| vendor/modules.txt | Added golangci-lint and moq dependencies |
| vendor/k8s.io/klog/v2/* | Removed klog/v2 dependency |

@rajatchopra (Collaborator) left a comment

LGTM.
Will test to verify whether two device plugin servers are created.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
@zvonkok merged commit fd5baf3 into NVIDIA:main on Jan 23, 2026
8 checks passed