fix: restore gpu info gpus OTLP attribute #201
Signed-off-by: Jingxiang Zhang <jingzhang@nvidia.com>
📝 Walkthrough
This PR adds optional cached machine-info collection to the health exporter. The collector gains an options system and a conditional NVML-backed machine-info provider. Runtime collection reads from the cache and refreshes it asynchronously. OTLP export serializes GPU info into resource attributes. The NVML instance is injected through exporter options and threaded through server initialization.
Changes
Cached Machine-Info Collection
OTLP GPU Info Export
NVML Instance Dependency Injection
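The cache-then-refresh flow and the option-based injection described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the method names `Get`, `RefreshAsync`, and `WaitForInitialRefresh` come from the review comments, while their signatures and the names `MachineInfo`, `cachedProvider`, `Collector`, `Option`, and `WithMachineInfoProvider` are assumptions made for the example. The loader function stands in for whatever NVML-backed query the real provider performs.

```go
package main

import (
	"fmt"
	"sync"
)

// MachineInfo is a hypothetical stand-in for the cached machine info.
type MachineInfo struct {
	GPUs []string
}

// MachineInfoProvider uses the method names mentioned in the review;
// the signatures are assumed for this sketch.
type MachineInfoProvider interface {
	Get() (MachineInfo, bool)
	RefreshAsync()
	WaitForInitialRefresh()
}

// cachedProvider is a hypothetical cache in front of a slow loader.
type cachedProvider struct {
	mu    sync.RWMutex
	info  MachineInfo
	ok    bool
	load  func() MachineInfo // stands in for an NVML-backed query
	once  sync.Once
	ready chan struct{}
}

func newCachedProvider(load func() MachineInfo) *cachedProvider {
	return &cachedProvider{load: load, ready: make(chan struct{})}
}

// Get returns the cached value and whether the cache has been filled.
func (p *cachedProvider) Get() (MachineInfo, bool) {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return p.info, p.ok
}

// RefreshAsync reloads the cache in the background and signals the
// first completed refresh exactly once.
func (p *cachedProvider) RefreshAsync() {
	go func() {
		info := p.load()
		p.mu.Lock()
		p.info, p.ok = info, true
		p.mu.Unlock()
		p.once.Do(func() { close(p.ready) })
	}()
}

// WaitForInitialRefresh blocks until the first refresh has completed.
func (p *cachedProvider) WaitForInitialRefresh() { <-p.ready }

// Collector holds an optional provider injected through options.
type Collector struct {
	provider MachineInfoProvider
}

// Option configures a Collector (functional options pattern).
type Option func(*Collector)

// WithMachineInfoProvider injects the provider; without it, machine-info
// collection is skipped.
func WithMachineInfoProvider(p MachineInfoProvider) Option {
	return func(c *Collector) { c.provider = p }
}

func NewCollector(opts ...Option) *Collector {
	c := &Collector{}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

// Collect reads from the cache. On a cold cache it triggers a refresh and
// waits for the first one; on a warm cache it refreshes in the background.
func (c *Collector) Collect() (MachineInfo, bool) {
	if c.provider == nil {
		return MachineInfo{}, false
	}
	info, ok := c.provider.Get()
	if !ok {
		c.provider.RefreshAsync()
		c.provider.WaitForInitialRefresh()
		return c.provider.Get()
	}
	c.provider.RefreshAsync()
	return info, true
}

func main() {
	p := newCachedProvider(func() MachineInfo {
		return MachineInfo{GPUs: []string{"gpu-0"}}
	})
	c := NewCollector(WithMachineInfoProvider(p))
	info, ok := c.Collect()
	fmt.Println(ok, info.GPUs)
}
```

Keeping the provider optional means a collector built without the option still works and simply reports no machine info, which matches the "optional" wording of the walkthrough.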
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
🧹 Nitpick comments (1)
internal/exporter/collector/collector_test.go (1)
628-646: 💤 Low value
Consider adding a test for the cache-not-ready branch.
The current test covers the happy path (`ok: true`). For completeness, consider adding a test where `ok` starts as `false` to verify `WaitForInitialRefresh` is invoked, then transitions to `true` after waiting. This would validate the full cache-warming flow.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/exporter/collector/collector_test.go` around lines 628 - 646, Add a test that exercises the cache-not-ready branch by instantiating fakeMachineInfoProvider with ok=false, passing it into the collector setup used in the existing happy-path test, and verifying that collector calls RefreshAsync and WaitForInitialRefresh on the provider; use a goroutine or timer to flip f.ok to true after a short delay so the collector proceeds and ensure f.waited and f.refreshed become true and the final collection succeeds. Reference the fakeMachineInfoProvider type and its methods Get, RefreshAsync, and WaitForInitialRefresh when locating where to add the test and assertions.
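A test for the cache-not-ready branch might look roughly like the sketch below. It is written as a standalone program so that it runs on its own; in `collector_test.go` the check would more naturally be a `func Test...(t *testing.T)`. The fake's field names (`ok`, `waited`, `refreshed`) and method names follow the review comment, but the method signatures, the `ready` channel, and the `collect` helper standing in for the real collector are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// machineInfoProvider uses the method names from the review comment;
// the signatures are assumptions.
type machineInfoProvider interface {
	Get() (string, bool)
	RefreshAsync()
	WaitForInitialRefresh()
}

// fakeMachineInfoProvider starts with a cold cache (ok=false) and
// records which methods were called.
type fakeMachineInfoProvider struct {
	mu        sync.Mutex
	ok        bool
	refreshed bool
	waited    bool
	once      sync.Once
	ready     chan struct{}
}

func (f *fakeMachineInfoProvider) Get() (string, bool) {
	f.mu.Lock()
	defer f.mu.Unlock()
	return "gpu-0", f.ok
}

// RefreshAsync records the call and flips ok to true after a short delay.
func (f *fakeMachineInfoProvider) RefreshAsync() {
	f.mu.Lock()
	f.refreshed = true
	f.mu.Unlock()
	go func() {
		time.Sleep(10 * time.Millisecond)
		f.mu.Lock()
		f.ok = true
		f.mu.Unlock()
		f.once.Do(func() { close(f.ready) })
	}()
}

// WaitForInitialRefresh records the call and blocks until the cache is warm.
func (f *fakeMachineInfoProvider) WaitForInitialRefresh() {
	f.mu.Lock()
	f.waited = true
	f.mu.Unlock()
	<-f.ready
}

// collect is a hypothetical stand-in for the collector's cache-miss handling.
func collect(p machineInfoProvider) (string, bool) {
	if info, ok := p.Get(); ok {
		return info, true
	}
	p.RefreshAsync()
	p.WaitForInitialRefresh()
	return p.Get()
}

// checkCacheNotReady exercises the cold-cache path and reports the first
// expectation that does not hold.
func checkCacheNotReady() error {
	f := &fakeMachineInfoProvider{ready: make(chan struct{})}
	info, ok := collect(f)

	f.mu.Lock()
	defer f.mu.Unlock()
	switch {
	case !f.refreshed:
		return errors.New("RefreshAsync was not called")
	case !f.waited:
		return errors.New("WaitForInitialRefresh was not called")
	case !ok || info != "gpu-0":
		return errors.New("collection did not succeed after the cache warmed")
	}
	return nil
}

func main() {
	if err := checkCacheNotReady(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("PASS")
}
```

Blocking on a channel rather than polling `ok` keeps the check deterministic: the wait returns only after the delayed flip has happened, so the final `Get` cannot race with it.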
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: d3a33a40-cdc6-4855-b505-6705735d76f8
📒 Files selected for processing (7)
internal/exporter/collector/collector.go
internal/exporter/collector/collector_test.go
internal/exporter/converter/otlp.go
internal/exporter/converter/otlp_test.go
internal/exporter/exporter.go
internal/exporter/options.go
internal/server/server.go
Description
Checklist
Summary by CodeRabbit