
Add comprehensive AMD GPU advanced metrics #3

Merged
simonCatBot merged 1 commit into master from feature/amd-gpu-advanced-metrics
Apr 11, 2026

@simonCatBot
Owner

This PR adds comprehensive AMD GPU metrics support via amd-smi.

Features Added

Backend

  • GPU Engine Utilization (GFX, MEM, MM/Video)
  • Thermal Sensors (Edge, Junction, Memory, Throttling)
  • Power Delivery (Instant, Average, Voltage)
  • Clock Metrics (SCLK, MCLK)
  • PCIe Link Status (Width, Speed, Bandwidth, Replay Errors)
  • XGMI Bandwidth (Multi-GPU interconnect)
  • Media Engines (Encoder, Decoder)
  • ECC Error Counters (Correctable, Uncorrectable)

UI Components

  • GpuEnginesPanel
  • GpuThermalPanel
  • GpuPowerPanel
  • GpuMediaPanel
  • GpuPciePanel
  • GpuEccPanel

Test Results

  • ✅ All 62 tests passing
  • ✅ Lint clean
  • ✅ Build successful

Implement advanced AMD GPU metrics collection via amd-smi:
- GPU Engine Utilization (GFX, MEM, MM/Video)
- Thermal Sensors (Edge, Junction, Memory, Throttling)
- Power Delivery (Instant, Average, Voltage)
- Clock Metrics (SCLK, MCLK)
- PCIe Link Status (Width, Speed, Bandwidth, Replay Errors)
- XGMI Bandwidth (Multi-GPU interconnect)
- Media Engines (Encoder, Decoder)
- ECC Error Counters (Correctable, Uncorrectable)
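
As a rough illustration of how these metric groups might be modelled, the sketch below defines a hypothetical `AdvancedGpuMetrics` type in which every group is optional, plus a helper that reports which groups are present. The type name, field names, and units are assumptions made for illustration; they are not the types used in this PR or the amd-smi output schema.

```typescript
// Hypothetical data shape: field names and units are illustrative assumptions,
// not the actual types from this PR or the amd-smi output format.
interface AdvancedGpuMetrics {
  engines?: { gfxPct: number; memPct: number; mmPct: number };
  thermal?: { edgeC: number; junctionC: number; memoryC: number; throttling: boolean };
  power?: { instantW: number; averageW: number; voltageMv: number };
  clocks?: { sclkMhz: number; mclkMhz: number };
  pcie?: { width: number; speedGtps: number; bandwidthMbps: number; replayErrors: number };
  xgmi?: { bandwidthMbps: number };
  media?: { encoderPct: number; decoderPct: number };
  ecc?: { correctable: number; uncorrectable: number };
}

type MetricGroup = keyof AdvancedGpuMetrics;

// Returns the names of the metric groups that actually carry data.
function availableGroups(metrics: AdvancedGpuMetrics): MetricGroup[] {
  return (Object.keys(metrics) as MetricGroup[]).filter(
    (group) => metrics[group] !== undefined,
  );
}
```

Keeping every group optional lets a GPU that lacks, say, XGMI or ECC reporting simply omit that group instead of sending placeholder zeros.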

Add new UI components:
- GpuEnginesPanel: Visualize engine-level workload breakdown
- GpuThermalPanel: Multi-zone thermal monitoring
- GpuPowerPanel: Power delivery metrics
- GpuMediaPanel: Video encode/decode utilization
- GpuPciePanel: PCIe and XGMI link status
- GpuEccPanel: ECC error monitoring

Update the API route to fetch advanced metrics only when
amd-smi is available. Each panel renders only when its
data is present.
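
The conditional behaviour described above could look roughly like the following sketch. It assumes the availability probe and the collector are passed in as functions, so it does not depend on any particular amd-smi invocation. `buildGpuResponse`, `panelsToRender`, the response shape, and the group-to-panel mapping are all hypothetical and are not taken from the PR's code.

```typescript
// Hypothetical sketch: the probe and collector are injected, so no real
// amd-smi command line is assumed here.
type Advanced = Record<string, unknown>;

interface GpuResponse {
  basic: { name: string };
  advanced: Advanced | null; // null when amd-smi is unavailable
}

async function buildGpuResponse(
  basic: { name: string },
  isAmdSmiAvailable: () => Promise<boolean>,
  fetchAdvanced: () => Promise<Advanced>,
): Promise<GpuResponse> {
  if (!(await isAmdSmiAvailable())) {
    return { basic, advanced: null };
  }
  try {
    return { basic, advanced: await fetchAdvanced() };
  } catch {
    // Treat a failed collection the same as an unavailable tool.
    return { basic, advanced: null };
  }
}

// Assumed mapping from metric group to the panel names listed in this PR.
const PANEL_FOR_GROUP: Record<string, string> = {
  engines: "GpuEnginesPanel",
  thermal: "GpuThermalPanel",
  power: "GpuPowerPanel",
  media: "GpuMediaPanel",
  pcie: "GpuPciePanel",
  ecc: "GpuEccPanel",
};

// Lists the panels whose backing data is present (neither null nor undefined).
function panelsToRender(advanced: Advanced | null): string[] {
  if (advanced === null) return [];
  return Object.entries(PANEL_FOR_GROUP)
    .filter(([group]) => advanced[group] != null)
    .map(([, panel]) => panel);
}
```

With this shape, a host without amd-smi still gets the basic response, and the UI shows no advanced panels rather than empty ones.
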

Lint: Clean
Tests: 62 passing
Build: Successful
@simonCatBot merged commit 235c35b into master on Apr 11, 2026
2 checks passed
@simonCatBot deleted the feature/amd-gpu-advanced-metrics branch on April 11, 2026 20:51