Conversation

@clubanderson
Contributor

Summary

  • Fix GPU count displaying 0 on cluster dashboard cards when fetch fails
  • Add retry logic and localStorage cache recovery for resilience
  • Remove premature cache clearing that caused data loss

Problem

GPU counts showed 0 on all cluster cards while other metrics (nodes, CPUs, pods) displayed correctly. Root cause: the localStorage cache was cleared before the fetch attempt, so when the fetch failed or was aborted, no cached data remained to display.
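
The ordering problem can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `refreshClearingFirst`, `refreshAfterSuccess`, `createMemoryStore`, and the `gpu-cache` key are hypothetical and do not come from the PR's diff, and the fetch is modelled as a synchronous function that may throw so that the example stays short.

```typescript
// Illustrative sketch only: all names below are hypothetical, not the console's code.
type GpuCounts = { [cluster: string]: number };

interface StringStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// In-memory stand-in for window.localStorage so the sketch runs outside a browser.
function createMemoryStore(): StringStore {
  const data: { [key: string]: string } = {};
  return {
    getItem: (key) => (key in data ? data[key] : null),
    setItem: (key, value) => { data[key] = value; },
    removeItem: (key) => { delete data[key]; },
  };
}

const GPU_CACHE_KEY = "gpu-cache"; // assumed key name

function readCached(store: StringStore): GpuCounts {
  const raw = store.getItem(GPU_CACHE_KEY);
  return raw === null ? {} : (JSON.parse(raw) as GpuCounts);
}

// Old ordering: the cache is cleared before the fetch, so a failure leaves nothing.
function refreshClearingFirst(store: StringStore, fetchCounts: () => GpuCounts): GpuCounts {
  store.removeItem(GPU_CACHE_KEY);
  try {
    const fresh = fetchCounts();
    store.setItem(GPU_CACHE_KEY, JSON.stringify(fresh));
  } catch {
    // Fetch failed: the previous value is already gone.
  }
  return readCached(store);
}

// New ordering: the cache is written only after the fetch succeeds.
function refreshAfterSuccess(store: StringStore, fetchCounts: () => GpuCounts): GpuCounts {
  try {
    const fresh = fetchCounts();
    store.setItem(GPU_CACHE_KEY, JSON.stringify(fresh));
  } catch {
    // Fetch failed: keep whatever was cached before.
  }
  return readCached(store);
}
```

With a previously cached count and a fetch that throws, the first function returns an empty object (which a card would presumably render as 0), while the second returns the cached count unchanged.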

Changes

  1. Remove premature cache clearing - The cache is now updated only after a successful fetch
  2. Add retry logic - 2 retries with 2s/5s delays on fetch failure
  3. Add localStorage recovery - Restore from localStorage when the memory cache is empty
  4. Remove demo mode cache clearing - Prevents GPU data from being lost when demo mode is toggled
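
Changes 2 and 3 can be sketched roughly as follows. This is not the code from the diff: `fetchWithRetry`, `GpuCache`, `loadGpuCounts`, the injected `sleep` parameter, and the `gpu-cache` key are hypothetical names chosen for illustration. Only the retry count and the 2s/5s delays come from the description above.

```typescript
// Illustrative sketch only: all names are hypothetical, not the console's code.
type GpuCounts = { [cluster: string]: number };
type Sleep = (ms: number) => Promise<void>;

interface StringStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const RETRY_DELAYS_MS: number[] = [2000, 5000]; // 2 retries with 2s/5s delays

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => { setTimeout(() => resolve(), ms); });

// Try once, then retry after each configured delay; rethrow the last error.
async function fetchWithRetry<T>(
  fetchOnce: () => Promise<T>,
  sleep: Sleep = realSleep,
  delays: number[] = RETRY_DELAYS_MS
): Promise<T> {
  let lastError: unknown = new Error("fetch was never attempted");
  for (let attempt = 0; attempt <= delays.length; attempt++) {
    try {
      return await fetchOnce();
    } catch (err) {
      lastError = err;
      if (attempt < delays.length) {
        await sleep(delays[attempt]);
      }
    }
  }
  throw lastError;
}

class GpuCache {
  private memory: GpuCounts | null = null;
  private store: StringStore;
  private key: string;

  constructor(store: StringStore, key: string = "gpu-cache") {
    this.store = store;
    this.key = key;
  }

  // Return in-memory data, restoring from the persistent store when memory is empty.
  get(): GpuCounts | null {
    if (this.memory === null) {
      const raw = this.store.getItem(this.key);
      if (raw !== null) {
        try {
          this.memory = JSON.parse(raw) as GpuCounts;
        } catch {
          this.memory = null; // corrupt entry: treat as missing
        }
      }
    }
    return this.memory;
  }

  // Called only after a successful fetch.
  set(counts: GpuCounts): void {
    this.memory = counts;
    this.store.setItem(this.key, JSON.stringify(counts));
  }
}

// Fetch with retries; if every attempt fails, fall back to cached data.
async function loadGpuCounts(
  cache: GpuCache,
  fetchOnce: () => Promise<GpuCounts>,
  sleep: Sleep = realSleep
): Promise<GpuCounts> {
  try {
    const fresh = await fetchWithRetry(fetchOnce, sleep);
    cache.set(fresh);
    return fresh;
  } catch {
    const cached = cache.get();
    return cached !== null ? cached : {};
  }
}
```

Passing `sleep` as a parameter is a design choice of this sketch: it lets a test substitute a no-op delay instead of waiting the full 7 seconds. In a browser, `window.localStorage` satisfies the `StringStore` shape and could be passed to `GpuCache` directly.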

Test plan

  • Toggle demo mode OFF → GPU counts persist (not reset to 0)
  • Navigate away during fetch → Navigate back → GPUs display from cache
  • Verify that platform-eval shows ~10 GPUs and vllm-d shows ~42 GPUs (58 expected, but 2 nodes have broken NVIDIA drivers)

🤖 Generated with Claude Code

The GPU count was showing 0 on all cluster cards because:
1. localStorage cache was cleared BEFORE the fetch attempt
2. When the fetch failed/aborted, there was no data to display
3. Demo mode toggle was also clearing the GPU cache

Changes:
- Remove premature localStorage cache clearing before GPU fetch
- Add retry logic (2 retries with 2s/5s delays) on fetch failure
- Restore from localStorage cache when memory cache is empty
- Remove GPU cache clearing from demo mode toggle

The cache is now only updated after a successful fetch, ensuring
GPU data persists through transient fetch failures.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Andrew Anderson <andy@clubanderson.com>
@kubestellar-prow kubestellar-prow bot added the "dco-signoff: yes" label (Indicates the PR's author has signed the DCO.) Jan 27, 2026
@kubestellar-prow
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign clubanderson for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@netlify

netlify bot commented Jan 27, 2026

Deploy Preview for kubestellarklaudeconsole ready!

Name Link
🔨 Latest commit 6e47fe0
🔍 Latest deploy log https://app.netlify.com/projects/kubestellarklaudeconsole/deploys/6978e61f98a2f00008ec4a78
😎 Deploy Preview https://deploy-preview-112.console-deploy-preview.kubestellar.io

@kubestellar-prow kubestellar-prow bot added the "size/M" label (Denotes a PR that changes 30-99 lines, ignoring generated files.) Jan 27, 2026
@github-actions
Contributor

Welcome to KubeStellar! 🚀 Thank you for submitting this Pull Request.

Before your PR can be merged, please ensure:

DCO Sign-off - All commits must be signed off with git commit -s to certify the Developer Certificate of Origin

PR Title - Must start with an emoji: ✨ (feature), 🐛 (bug fix), 📖 (docs), 🌱 (infra/tests), ⚠️ (breaking change)


🌟 Help KubeStellar Grow - We Need Adopters!

Our roadmap is driven entirely by adopter feedback. Whether you're using KubeStellar yourself or know someone who could benefit from multi-cluster Kubernetes:

📋 Take our Multi-Cluster Survey - Share your use cases and help shape our direction!


A maintainer will review your PR soon. Feel free to ask questions in the comments or on Slack!

@clubanderson clubanderson merged commit 042ac58 into main Jan 27, 2026
22 of 28 checks passed
@kubestellar-prow kubestellar-prow bot deleted the fix/gpu-count-display branch January 27, 2026 16:31
@github-actions
Contributor

🎉 Thank you for your contribution! Your PR has been successfully merged.



Labels

dco-signoff: yes Indicates the PR's author has signed the DCO. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.


2 participants