chore: prune unused vendored sdk packages #161

Merged
jingxiang-z merged 4 commits into main from prune-vendored-sdk-unused on Apr 10, 2026

jingxiang-z (Collaborator) commented Apr 10, 2026

What changed

  • pruned unused vendored SDK command trees under third_party/fleet-intelligence-sdk/cmd
  • removed vendored SDK client, docs, component, and package directories outside the agent's reachable dependency graph
  • kept SDK test-support packages still required by remaining vendored tests
  • updated vendored and top-level Go module metadata after the prune
  • updated NOTICE to remove references to deleted embedded third-party files
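
The prune described above amounts to a set difference: every package in the vendored tree, minus the packages the agent can still reach. The following is a minimal sketch of that step only, not the tooling used in this PR. It assumes the reachable list comes from something like `go list -deps -test ./cmd/fleetint ./internal/...` (the `-test` flag matters here because test-support packages were kept), and all import paths under `example.com/sdk` are hypothetical placeholders.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// unreachablePackages returns the vendored import paths that are absent from
// the reachable set, sorted for stable output. Both inputs are plain lists of
// import paths, one per element, as a tool such as `go list` would print them.
func unreachablePackages(reachable, vendored []string) []string {
	seen := make(map[string]struct{}, len(reachable))
	for _, p := range reachable {
		seen[strings.TrimSpace(p)] = struct{}{}
	}

	var out []string
	for _, p := range vendored {
		p = strings.TrimSpace(p)
		if p == "" {
			continue // ignore blank lines from command output
		}
		if _, ok := seen[p]; !ok {
			out = append(out, p)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical import paths, for illustration only.
	reachable := []string{
		"example.com/sdk/pkg/machine-info",
		"example.com/sdk/pkg/nvidia-query/dcgm",
	}
	vendored := []string{
		"example.com/sdk/client/v1",
		"example.com/sdk/cmd/gpud/up",
		"example.com/sdk/pkg/machine-info",
		"example.com/sdk/pkg/nvidia-query/dcgm",
	}

	// Each printed path is a candidate for deletion.
	for _, p := range unreachablePackages(reachable, vendored) {
		fmt.Println(p)
	}
}
```

A list produced this way is only a list of candidates. Files selected by build tags or by a different GOOS may not appear in a single `go list` run, so each deletion still needs a build and test pass afterwards.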

Why

Customers raised concerns about logic present in the vendored SDK. This change reduces the vendored surface area to the subset still used by the fleet intelligence agent, removing unused commands and packages while preserving current agent behavior.

Impact

  • smaller vendored SDK footprint
  • fewer unused command implementations shipped in-tree
  • no intended runtime behavior change for the agent

Validation

  • env HOME=/tmp/fleetint-home GOCACHE=/tmp/fleetint-gocache GOPATH=/tmp/fleetint-gopath go test -mod=mod ./cmd/fleetint ./internal/...

Summary by CodeRabbit

Release Notes

  • Refactor

    • Removed NVIDIA accelerator monitoring components (clock-speed, ECC, fabric-manager, GPM, GPU counts)
    • Removed gpud command-line interface and daemon utilities
    • Removed Fleet Intelligence SDK third-party integration
  • Dependencies

    • Updated indirect Go module dependencies

coderabbitai Bot commented Apr 10, 2026

Important

Review skipped

Review was skipped as selected files did not have any reviewable changes.

💤 Files selected but had no reviewable changes (3)
  • third_party/fleet-intelligence-sdk/pkg/machine-info/machine_info.go
  • third_party/fleet-intelligence-sdk/pkg/machine-info/machine_info_test.go
  • third_party/fleet-intelligence-sdk/pkg/nvidia-query/dcgm/instance.go
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5620aacc-5c2b-43aa-b0e7-d1962570a61c

📥 Commits

Reviewing files that changed from the base of the PR and between 7449240 and a371269.

📒 Files selected for processing (3)
  • third_party/fleet-intelligence-sdk/pkg/machine-info/machine_info.go
  • third_party/fleet-intelligence-sdk/pkg/machine-info/machine_info_test.go
  • third_party/fleet-intelligence-sdk/pkg/nvidia-query/dcgm/instance.go

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This pull request removes the fleet-intelligence-sdk third-party codebase entirely, including all v1 client implementations, CLI command handlers, NVIDIA GPU monitoring components, and supporting utilities. Minor changes include dependency updates in go.mod and removal of parallel test execution flags.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Dependency Updates / `NOTICE`, `go.mod` | Updated NOTICE to remove entries for six gpud command files and two Linux-specific source files; bumped indirect dependency `github.com/rogpeppe/go-internal` from v1.13.1 to v1.14.1; removed OpenTelemetry modules `go.opentelemetry.io/otel` and `go.opentelemetry.io/otel/sdk/metric`. |
| Test Configuration / `internal/precheck/precheck_test.go` | Removed `t.Parallel()` calls from two test functions (`TestCollectInputCallsDCGMInit`, `TestCollectInputPreservesGPUProbeError`). |
| Fleet Intelligence SDK - Client Package / `third_party/fleet-intelligence-sdk/client/v1/*` | Deleted entire v1 client implementation: `doc.go`, `healthz.go`, `http_client.go`, `machine_info.go`, `options.go`, `package_status.go`, `v1_plugins.go`, `v1_set_healthy.go`, `v1_trigger.go` and associated test files. |
| Fleet Intelligence SDK - Command Package / `third_party/fleet-intelligence-sdk/cmd/common/*`, `third_party/fleet-intelligence-sdk/cmd/gpud/command/*` | Deleted command infrastructure: `common/doc.go`, `common/terminal.go`, `command/command.go`, `command/command_test.go`, and data directory/parsing utilities (`common/data_dir.go`, `common/parse.go` and test files). |
| Fleet Intelligence SDK - GPUd Subcommands / `third_party/fleet-intelligence-sdk/cmd/gpud/{up,down,run,scan,status,compact,list-plugins,custom-plugins,inject-fault,set-healthy,metadata,notify,machine-info,run-plugin-group}/command.go*`, `third_party/fleet-intelligence-sdk/cmd/gpud/main.go` | Removed all GPUd CLI command implementations including up, down, run, scan, status, compact, list-plugins, custom-plugins, inject-fault, set-healthy, metadata, notify, machine-info, and run-plugin-group; also deleted `main.go` entrypoint. |
| Fleet Intelligence SDK - Release Commands / `third_party/fleet-intelligence-sdk/cmd/gpud/release/command_*.go` | Deleted GPUd release subcommand implementations: `command_gen_key.go`, `command_sign_key.go`, `command_sign_package.go`, `command_verify_key_signature.go`, `command_verify_package_signature.go`. |
| Fleet Intelligence SDK - Run Command Support / `third_party/fleet-intelligence-sdk/cmd/gpud/run/session.go`, `third_party/fleet-intelligence-sdk/cmd/gpud/run/session_test.go` | Deleted session management logic for login state persistence and credential seeding from persistent storage. |
| Fleet Intelligence SDK - Status Command Support / `third_party/fleet-intelligence-sdk/cmd/gpud/status/utils.go`, `third_party/fleet-intelligence-sdk/cmd/gpud/status/utils_test.go` | Removed login status display helpers and associated tests. |
| Fleet Intelligence SDK - Swagger Service / `third_party/fleet-intelligence-sdk/cmd/swagger/main.go` | Deleted standalone swagger/OpenAPI documentation service entrypoint. |
| Fleet Intelligence SDK - Accelerator Components / `third_party/fleet-intelligence-sdk/components/accelerator/nvidia/{clock-speed,ecc,fabric-manager,gpm,gpu-counts}/*` | Deleted five complete NVIDIA GPU monitoring components: clock-speed (graphics/memory MHz), ECC error tracking, fabric-manager (NVSwitch/fabric state), GPM metrics collection, and GPU count validation; includes all implementations, tests, metrics, and support files. |
| Fleet Intelligence SDK - Component Utilities / `third_party/fleet-intelligence-sdk/components/accelerator/{doc.go,nvidia/doc.go}` | Deleted package documentation files. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A Farewell to Code

The commands and clients take their leave,
Components packed, no need to grieve,
GPU monitors once stood so tall,
Now gone, removed—we've cleared it all!
Fleet SDK bids the codebase goodbye,
Clean slate remains beneath the sky! 🌟


Signed-off-by: Jingxiang Zhang <jingzhang@nvidia.com>
Signed-off-by: Jingxiang Zhang <jingzhang@nvidia.com>
jingxiang-z force-pushed the prune-vendored-sdk-unused branch from d2c93d7 to 7449240 on April 10, 2026 17:26
jingxiang-z marked this pull request as ready for review on April 10, 2026 17:26
Signed-off-by: Jingxiang Zhang <jingzhang@nvidia.com>
Signed-off-by: Jingxiang Zhang <jingzhang@nvidia.com>
jingxiang-z requested review from rsampaio and vinodba on April 10, 2026 17:49
jingxiang-z self-assigned this on Apr 10, 2026
jingxiang-z merged commit 6209f76 into main on Apr 10, 2026
9 checks passed
jingxiang-z deleted the prune-vendored-sdk-unused branch on April 10, 2026 19:44
jingxiang-z added a commit that referenced this pull request Apr 14, 2026
Signed-off-by: Jingxiang Zhang <jingzhang@nvidia.com>
jingxiang-z added a commit that referenced this pull request Apr 14, 2026
Signed-off-by: Jingxiang Zhang <jingzhang@nvidia.com>
jingxiang-z added a commit that referenced this pull request Apr 14, 2026
Signed-off-by: Jingxiang Zhang <jingzhang@nvidia.com>
2 participants