feat(gastown): add manual container token refresh in town settings#1102
feat(gastown): add manual container token refresh in town settings#1102
Conversation
Code Review SummaryStatus: 2 Issues Found | Recommendation: Address before merge Overview
Fix these issues in Kilo Cloud Issue Details (click to expand)WARNING
Other Observations (not in diff)N/A Files Reviewed (5 files)
Reviewed by gpt-5.4-20260305 · 823,449 tokens |
The $timeFilter macro already bounds the result set sufficiently. Hardcoded LIMITs were silently truncating data in 9 panels.
| "nullifySparse": false, | ||
| "query": "SELECT blob5 AS error_message, blob1 AS event, SUM(_sample_interval) AS count FROM gastown_events WHERE $timeFilter AND blob5 != '' GROUP BY error_message, event ORDER BY count DESC LIMIT 30", | ||
| "rawSql": "SELECT blob5 AS error_message, blob1 AS event, SUM(_sample_interval) AS count FROM gastown_events WHERE $timeFilter AND blob5 != '' GROUP BY error_message, event ORDER BY count DESC LIMIT 30", | ||
| "query": "SELECT blob5 AS error_message, blob1 AS event, SUM(_sample_interval) AS count FROM gastown_events WHERE $timeFilter AND blob5 != '' GROUP BY error_message, event ORDER BY count DESC", |
There was a problem hiding this comment.
WARNING: Keep a row cap on this high-cardinality panel
Removing the LIMIT turns the "Top Error Messages" table into an unbounded GROUP BY error_message, event over the whole time window. On production data that can return thousands of rows, which is likely to slow the ClickHouse query and make the dashboard much heavier to load.
| "nullifySparse": false, | ||
| "query": "SELECT blob2 AS user_id, SUM(_sample_interval) AS total_events, SUM(IF(blob5 != '', _sample_interval, 0)) AS error_count, SUM(IF(blob5 != '', _sample_interval, 0)) / SUM(_sample_interval) AS error_rate, SUM(_sample_interval * double1) / SUM(_sample_interval) AS avg_latency_ms, COUNT(DISTINCT blob6) AS town_count FROM gastown_events WHERE $timeFilter AND blob2 != '' GROUP BY user_id ORDER BY total_events DESC LIMIT 25", | ||
| "rawSql": "SELECT blob2 AS user_id, SUM(_sample_interval) AS total_events, SUM(IF(blob5 != '', _sample_interval, 0)) AS error_count, SUM(IF(blob5 != '', _sample_interval, 0)) / SUM(_sample_interval) AS error_rate, SUM(_sample_interval * double1) / SUM(_sample_interval) AS avg_latency_ms, COUNT(DISTINCT blob6) AS town_count FROM gastown_events WHERE $timeFilter AND blob2 != '' GROUP BY user_id ORDER BY total_events DESC LIMIT 25", | ||
| "query": "SELECT blob2 AS user_id, SUM(_sample_interval) AS total_events, SUM(IF(blob5 != '', _sample_interval, 0)) AS error_count, SUM(IF(blob5 != '', _sample_interval, 0)) / SUM(_sample_interval) AS error_rate, SUM(_sample_interval * double1) / SUM(_sample_interval) AS avg_latency_ms, COUNT(DISTINCT blob6) AS town_count FROM gastown_events WHERE $timeFilter AND blob2 != '' GROUP BY user_id ORDER BY total_events DESC", |
There was a problem hiding this comment.
WARNING: Bound this per-user query before shipping
This panel now groups every distinct user_id in the selected window with no LIMIT. On busy towns that can explode to thousands of rows and materially increase both query time and Grafana render time; the previous top-N cap kept the panel useful without making it unbounded.
Summary
forceRefreshContainerToken()public RPC method on TownDO andrefreshContainerTokentRPC mutation with ownership verificationCloses #1101
Verification
pnpm typecheck— all 28 workspace projects pass cleanlyVisual Changes
New "Container" section at the bottom of the town settings page with a "Refresh Token" button (spinning icon while pending, success/error toasts).
Reviewer Notes
refreshContainerTokenin Town.do is private and throttled to 1h. The newforceRefreshContainerTokenbypasses the throttle but still updateslastContainerTokenRefreshAtso the next alarm-driven refresh doesn't fire unnecessarily.router.d.tstype bridge file was updated manually to include the new procedure in both the innergastownRouterand outerwrappedGastownRoutertypes.