Merged
Introduce a per-org miner_state_snapshots hypertable written every 60s by fleetStateSnapshotRoutine, and route the Uptime chart through it. Chart and live legend now share one classifier (CountMinersByState), and the chart renders three segments: Hashing / Needs attention / Not hashing, with drill-throughs to the corresponding miner-list filters. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Collapses generated protobuf, sqlc, and mockgen output in GitHub diffs and excludes them from language stats. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Matches the canonical formatter output (protobuf-es + prettier + goimports). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Force-pushed from 7c1eca5 to c6b9e02
🔐 Codex Security Review
Review Summary — Overall Risk: HIGH

Findings:
- [HIGH] Per-device minute snapshots will create unsustainable TimescaleDB growth
- [MEDIUM] Switching uptime reads to the new snapshot table drops all pre-deploy history
- [MEDIUM] Historical
Switch miner_state_snapshots from per-org aggregate rows to per-device state rows, and aggregate at read time. This lets uptime_status_counts honor the full device_selector (fleet, group, rack, arbitrary list) with the same CountMinersByState classifier the live legend uses.

Other simplifications that fall out:
- Replace the per-org Go loop in the writer with a single INSERT...SELECT that materializes state for the whole paired fleet in one round-trip.
- Drop OrgIDLister, SQLOrganizationStore, and MinerStateCountsRow — no longer needed now that the snapshot query itself produces the rows.
- GetMinerStateSnapshots aggregates with DISTINCT ON per (bucket, device) so bucket sums stay truthful regardless of snapshot alignment, and applies the device-identifier filter at the CTE level.

Requires a local `just db-reset` since the unmerged migration 000033 is edited in place (schema change, not a new migration).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tighten prose across the snapshot code so comments explain non-obvious why only. Cross-link the classifier between InsertMinerStateSnapshot and CountMinersByState so future edits don't drift silently. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
UI labels only; underlying buckets (hashing / broken / not-hashing) unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Hourly and daily metric paths normalize start/end to complete buckets
before querying aggregates. Thread the normalized range into
uptimeCountsForQuery so the uptime series stops at the same edge instead
of leaking a partial current hour/day.
- InsertMinerStateSnapshot's hashing branch now requires ACTIVE + not
auth-needed + no open errors (matching CountMinersByState exactly). A new
state code 4 ("unknown") catches any status that doesn't fit the four
named buckets so historical rows don't misclassify those devices as
healthy. The read query already sums only 0..3, so unknown rows are
excluded from every bucket — same as CountMinersByState.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
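The classifier rules this commit describes might look roughly like the sketch below. The status strings, field names, and check order are assumptions; from the text itself come only the bucket semantics: hashing requires ACTIVE with no auth prompt and no open errors, MAINTENANCE and INACTIVE both count as sleeping (per a later commit), and code 4 catches anything that fits none of the four named buckets.

```go
package main

// State codes as described in the commit messages; numbering is assumed.
const (
	stateOffline  = 0
	stateSleeping = 1
	stateBroken   = 2
	stateHashing  = 3
	stateUnknown  = 4 // any status outside the four named buckets
)

// minerStatus is an illustrative input shape, not the real type.
type minerStatus struct {
	Status     string // e.g. "ACTIVE", "INACTIVE", "MAINTENANCE", "OFFLINE"
	AuthNeeded bool
	OpenErrors int
}

// classifyMiner sketches the CountMinersByState rules: an ACTIVE miner
// with an auth prompt or open errors needs attention ("broken"); only a
// clean ACTIVE miner is hashing; unrecognized statuses are unknown rather
// than silently counted as healthy.
func classifyMiner(m minerStatus) int {
	switch m.Status {
	case "OFFLINE":
		return stateOffline
	case "MAINTENANCE", "INACTIVE":
		return stateSleeping
	case "ACTIVE":
		if m.AuthNeeded || m.OpenErrors > 0 {
			return stateBroken
		}
		return stateHashing
	default:
		return stateUnknown
	}
}
```

Because the read query sums only codes 0..3, routing unrecognized statuses to `stateUnknown` drops them from every bucket, matching the live classifier instead of inflating "hashing".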
Append a synthetic UptimeStatusCount at time.Now() to GetCombinedMetrics (keeping parity with the streaming path) using a live CountMinersByState call. The chart's right-most "live" bar now reflects current fleet state instead of lagging up to one snapshot interval behind the FleetHealth legend. Factor the filter-build + counts-fetch into appendLiveUptimeBar so the unary and streaming paths share one code path (previously only streaming populated MinerStateCounts). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
flesher (Contributor) approved these changes on Apr 22, 2026, and left a comment:

One small thing I noticed, otherwise looks good
The uptime chart's Not-Hashing bucket (and FleetHealth's Sleeping segment)
count miners in either MAINTENANCE or INACTIVE as sleeping — matching the
CountMinersByState classifier. The URL filter plumbing only mapped
sleeping -> INACTIVE, so miners in MAINTENANCE counted toward the dashboard
bars but vanished from the filtered list page when users drilled through.
Fix symmetrically in three translators so sleeping <-> {INACTIVE, MAINTENANCE}
everywhere:
- encodeFilterToURL: either status now maps to status=sleeping.
- parseFilterFromURL: sleeping expands to both device statuses.
- MinerList dropdown: Sleeping chip selects both device statuses.
Call sites updated to pass both statuses for self-documentation:
- UptimePanel "Not hashing" drill-through.
- FleetHealth Sleeping segment link.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
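The symmetric mapping can be sketched as a pair of translators. The real code is TypeScript in the client (`encodeFilterToURL` / `parseFilterFromURL`); this Go sketch, with hypothetical helper names, only illustrates the sleeping <-> {INACTIVE, MAINTENANCE} round-trip.

```go
package main

import "strings"

// sleepingStatuses is the two-way mapping described above: the single UI
// bucket "sleeping" covers both device statuses.
var sleepingStatuses = []string{"INACTIVE", "MAINTENANCE"}

// encodeStatusFilter collapses either sleeping-bucket status to the one
// URL value "sleeping"; other statuses pass through lowercased.
func encodeStatusFilter(deviceStatuses []string) []string {
	out := []string{}
	emitted := false
	for _, s := range deviceStatuses {
		if s == "INACTIVE" || s == "MAINTENANCE" {
			if !emitted {
				out = append(out, "sleeping")
				emitted = true
			}
			continue
		}
		out = append(out, strings.ToLower(s))
	}
	return out
}

// parseStatusFilter expands "sleeping" back to both device statuses, so a
// drill-through from the dashboard lists MAINTENANCE miners too.
func parseStatusFilter(urlValues []string) []string {
	out := []string{}
	for _, v := range urlValues {
		if v == "sleeping" {
			out = append(out, sleepingStatuses...)
			continue
		}
		out = append(out, strings.ToUpper(v))
	}
	return out
}
```

The asymmetry the commit fixes is visible here: if parse expanded to only INACTIVE, a MAINTENANCE miner counted by the dashboard bar would vanish from the filtered list page.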
Summary
- `miner_state_snapshots`: TimescaleDB hypertable storing one row per paired device per tick with a 4-state code (offline/sleeping/broken/hashing). Classifier CASE mirrors `CountMinersByState`, so the chart and the live FleetHealth legend agree.
- `encodeFilterToURL` helper.
- One `INSERT ... SELECT` per 60 s tick — one round-trip, no per-org loop, no Go-side array packing.
- The read path (`GetMinerStateSnapshots`) does `DISTINCT ON (bucket, device_identifier)` + SUM-by-state and applies the device filter at the CTE level, so `uptime_status_counts` honors the full `device_selector` (fleet, group, rack, arbitrary list). Drops in cleanly for future group/rack overview pages without server changes.

fixes #12
Test plan
- `go build`, `go vet`, `golangci-lint run` clean on `server/internal/...` and `server/cmd/...`
- `go test ./internal/domain/telemetry/... ./internal/handlers/telemetry/...` green
- `writeFleetStateSnapshot` tests (happy path + log-on-error) and lifecycle tests (Start/Stop) cover that the tick fires
- `cd client && npx tsc --noEmit`, `npm run lint`, `vitest run src/protoFleet/features/dashboard` green
- Manual: confirm `miner_state_snapshots` grows by ~fleet-size rows/min, open the dashboard, confirm all 3 segments render, confirm drill-throughs land on the right filter URLs, confirm totals match the FleetHealth legend

🤖 Generated with Claude Code