Problem
The PeerMetrics dashboard (`/apps/<id>`) downloads all raw data into the browser and aggregates it with JavaScript. On production:

- `GET /v1/conferences?appId=...&created_at_gte=...` returns 42,177 conferences in one response
- Response size: 21.7 MB of JSON
- Response time: ~13 seconds
- Gunicorn worker timeout is 30s — one more doubling of data means every dashboard load returns 504
Even after the PRs we merged (pagination #19, `expand_fields` removal, annotations #22), the graphs tab still pulls the full dataset because charts aggregate client-side.
Context — Scale
Sessions Health has ~3 apps total (dev / staging / prod). This changes the optimal strategy significantly:
- Cache hit rate will be near 100% once warm (all users hit the same small key set)
- Pre-warming every app is trivially cheap (~180 KB Redis, ~1 qps on DB)
- Cold-start latency can be fully eliminated
Goal
Three compounding layers that together make the dashboard load in ~50ms regardless of data volume, with zero 504s.
```
┌────────────────────────────────────────────────────────────────┐
│ LAYER 3: Vue Query (browser-side cache + dedup + SWR)          │
│ → Repeat navigations hit in-memory cache instantly             │
│ → Multiple charts requesting same data = one HTTP call         │
└──────────────────────┬─────────────────────────────────────────┘
                       ▼
┌────────────────────────────────────────────────────────────────┐
│ LAYER 2: Redis cache (server-side, 30-60s TTL)                 │
│ → With ~3 apps, a pre-warming job keeps cache always hot       │
│ → 2 KB responses cache cheaply (vs 21 MB today)                │
└──────────────────────┬─────────────────────────────────────────┘
                       ▼
┌────────────────────────────────────────────────────────────────┐
│ LAYER 1: Server-side aggregation (PostgreSQL GROUP BY)         │
│ → 21 MB → 2 KB responses                                       │
│ → Query runs in <100ms with proper indexes                     │
│ → This is the foundational fix; nothing above works without it │
└────────────────────────────────────────────────────────────────┘
```
Expected results after all three layers ship (with pre-warming):
| Scenario | Response time | Response size |
|---|---|---|
| Any dashboard load, any user | ~50ms | 2 KB |
| Same user revisiting (Vue Query cache) | <10ms (no network) | 0 bytes |
With only ~3 apps total, pre-warming keeps all Redis caches always hot. No user ever pays the cold-start Postgres cost.
Layer 1 — Server-Side Aggregation (foundational)
Charts that need aggregation (use the big conferences array — priority)
| Chart | Current behavior | Summary endpoint needed |
|---|---|---|
| Conferences (stacked bar) | `.reduce()` groups by day, counts success/warning/error/ongoing | `GET /v1/conferences/summary?group_by=day` |
| Most common issues | Groups issues by code, counts per conference | `GET /v1/issues/summary?group_by=code` |
| Conference duration | Buckets conferences into duration ranges | `GET /v1/conferences/duration-summary` |
| Call setup time | Computes from connections, buckets in JS | `GET /v1/connections/setup-time-summary` |
| Number of participants | Counts participants per conference | `GET /v1/conferences/participant-count-summary` |
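The duration-summary bucketing can live in SQL or in the view. A minimal sketch of the view-side version, with hypothetical bucket edges (the real ranges would match whatever the chart uses today):

```python
import bisect

# Hypothetical bucket edges in seconds; the chart's actual ranges may differ.
DURATION_EDGES = [60, 300, 900, 3600]
DURATION_LABELS = ["<1m", "1-5m", "5-15m", "15-60m", ">60m"]

def bucket_durations(durations_seconds):
    """Count conferences per duration bucket for a duration-summary response."""
    counts = {label: 0 for label in DURATION_LABELS}
    for d in durations_seconds:
        counts[DURATION_LABELS[bisect.bisect_right(DURATION_EDGES, d)]] += 1
    return counts
```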
Charts using sessions/connections (smaller, lower priority)
| Chart | Data | Current size | Priority |
|---|---|---|---|
| Browsers | sessions | ~3 MB | Medium |
| OS | sessions | ~3 MB | Medium |
| Map | sessions (geo_ip) | ~3 MB | Medium |
| GUM errors | issues | <1 KB | Low |
| Relayed connections | connections | <1 KB | Low |
Click-to-Detail Pattern
When a user clicks a bar/segment, a modal shows matching conferences. Today this filters the in-memory array. With aggregation, we fetch just that slice via the paginated list endpoint:
```js
async onChartClick(e) {
  const dayStart = new Date(e.xValue);
  const dayEnd = new Date(dayStart);
  dayEnd.setDate(dayEnd.getDate() + 1);
  const resp = await peermetrics.get('/v1/conferences', {
    appId: peermetrics.app.id,
    created_at_gte: dayStart.toISOString(),
    created_at_lte: dayEnd.toISOString(),
    limit: 50,
  });
  this.modalConferences = resp.results;
}
```
API filters needed for modals
| Chart | Filter | Status |
|---|---|---|
| Conferences by day | `created_at_gte`, `created_at_lte` | ✓ supported |
| Most common issues | `issue_code=xxx` | add |
| Conference duration | `duration_gte`, `duration_lt` | add |
| Call setup time | `setup_time_gte`, `setup_time_lt` | complex (setup time is per-connection) |
Example aggregation query
```sql
SELECT DATE(created_at) AS date,
       COUNT(*) FILTER (WHERE ongoing) AS ongoing,
       COUNT(*) FILTER (WHERE has_errors AND NOT ongoing) AS error,
       COUNT(*) FILTER (WHERE has_warnings AND NOT has_errors AND NOT ongoing) AS warning,
       COUNT(*) FILTER (WHERE NOT has_errors AND NOT has_warnings AND NOT ongoing) AS success
FROM conference
WHERE app_id = ? AND created_at >= ?
GROUP BY DATE(created_at);
```
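One subtlety with `GROUP BY DATE(created_at)`: days with zero conferences simply don't appear in the result, so the stacked bar chart would skip them. A sketch of zero-filling the series in the view before returning it (function and field names are illustrative):

```python
from datetime import date, timedelta

def zero_fill_days(rows, start, end):
    """Fill missing days with all-zero counts so the chart has no gaps.

    rows: list of dicts like {"date": date, "ongoing": int, "error": int,
    "warning": int, "success": int}, as produced by the GROUP BY query.
    """
    by_date = {r["date"]: r for r in rows}
    empty = {"ongoing": 0, "error": 0, "warning": 0, "success": 0}
    out = []
    day = start
    while day <= end:
        out.append(by_date.get(day, {"date": day, **empty}))
        day += timedelta(days=1)
    return out
```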
Layer 2 — Redis Caching (server-side)
Redis is already deployed (rate limiter uses it). Add a thin cache wrapper around summary endpoints.
Memory footprint
- 3 apps × ~10 summary endpoints × 3 date ranges = ~180 KB total
- Completely negligible
TTL strategy
| Endpoint | TTL |
|---|---|
| Conference daily summary | 60s |
| Duration/participant distributions | 120s |
| Session browser/OS aggregates | 300s |
| Click-to-detail paginated lists | no cache |
Pattern
```python
def cached_summary(cache_key, ttl, compute_fn):
    cached = redis.get(cache_key)
    if cached is not None:  # distinguish "missing" from a cached falsy value
        return json.loads(cached)
    result = compute_fn()
    redis.setex(cache_key, ttl, json.dumps(result))
    return result
```
Invalidation
TTL-based. No active invalidation. Add a `?no_cache=true` param for admin tools that need fresh data.
Cache stampede mitigation
Add randomized TTL jitter (60s ± 10s) so cache keys don't all expire simultaneously.
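A minimal sketch of computing the jittered TTL before the `setex` call:

```python
import random

def jittered_ttl(base_ttl=60, jitter=10):
    """Randomize TTL within +/- jitter seconds so keys don't all expire together."""
    return base_ttl + random.randint(-jitter, jitter)
```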
Layer 3 — Vue Query (browser-side)
Add `vue-query@1.x` (Vue 2 compatible) to the dashboard.
Benefits
- Instant cache hits on repeat navigation — dashboard → conference detail → back: renders in <10ms from memory
- Automatic deduplication — 3 charts requesting the same summary = 1 HTTP call
- Stale-while-revalidate — show cached data instantly, refetch in background
- Automatic retries on network errors
Pattern
```js
const { data, isLoading } = useQuery({
  queryKey: ['conferences-summary', appId, dateRange],
  queryFn: () => peermetrics.get('/v1/conferences/summary', { appId, created_at_gte: dateRange }),
  staleTime: 60_000,
  cacheTime: 300_000,
});
```
Implementation Phases
Phase 0 — Prerequisites
- DB: indexes on `conference(app_id, created_at)` and `app_genericevent(app_id, created_at)` (tied to Performance: Dashboard takes 56s to load — DB optimization needed #15)
Phase 1 — Conferences chart (MVP)
- API: `GET /v1/conferences/summary?appId=...&created_at_gte=...`
- API: Add `created_at_lte` support to paginated `/v1/conferences`
- Web: `conferencesChart.vue` uses summary + paginated click handler
- Web: Remove big `/v1/conferences` call from `app.vue` graphs tab
Success metric: Conferences chart renders in <200ms.
Phase 2 — Remaining conference/issue/connection charts
- API: `/conferences/duration-summary`, `/conferences/participant-count-summary`
- API: `/issues/summary?group_by=code`
- API: `/connections/setup-time-summary`, `/connections/summary?group_by=type`
- API: Add `issue_code`, `duration_gte`/`duration_lt` filters on `/conferences`
- Web: Migrate `conferenceDurationChart`, `noParticipantsChart`, `mostCommongIssuesChart`, `callSetupTimeChart`, `connectionTypeChart`
Phase 3 — Session charts
- API: `/sessions/summary?group_by=browser|os|country`
- API: Geo data endpoint for `mapChart`
- Web: Migrate `browsersChart`, `osChart`, `mapChart`
- Web: Remove the full `/v1/sessions` call (currently 3 MB)
Phase 4 — Redis caching
- Add `cached_summary()` helper
- Wrap each summary endpoint with appropriate TTL
- Cache keys include app_id, date range, filters
- Monitor cache hit rate in CloudWatch
Phase 5 — Vue Query
- Install `vue-query@1.x` (Vue 2 compat)
- Wrap dashboard with query client provider
- Refactor `app.vue` data fetching to use `useQuery`
- Migrate chart components to consume queries instead of props
- Remove ad-hoc state (`this.data.conferences`, `this.paginatedConferences`)
Phase 6 (recommended) — Cache pre-warming
Sessions Health has ~3 apps total. Pre-warming every app is cheap and worthwhile.
Implementation:
```python
# management/commands/warm_dashboard_cache.py
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    def handle(self, *args, **options):
        for app in App.objects.filter(is_active=True):
            for endpoint in SUMMARY_ENDPOINTS:
                key, ttl = cache_key_for(app, endpoint)
                result = endpoint.compute(app)
                redis.setex(key, ttl, json.dumps(result))
```
- Run every 30 seconds (TTL is 60s, so cache always fresh)
- Total DB load: ~30 queries per 30s cycle = 1 qps
- Effect: no user ever hits a cold cache, not even first load of the day
Edge Cases & Decisions
Dashboard filters (browser/OS/country/app version)
Filters run client-side today. With aggregation, they must be passed as query params to summary endpoints. Decision: include filter params from Phase 1 onward. Cache keys include filters so filtered/unfiltered queries are cached separately.
Redis key schema
```
conf_summary:{app_id}:{date_range_hash}:{filter_hash}
issues_summary:{app_id}:{date_range_hash}:{filter_hash}
```
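A sketch of building these keys deterministically; hashing a sorted JSON dump means the same filters, in any dict order, map to the same key (helper names are illustrative):

```python
import hashlib
import json

def _short_hash(obj):
    """Stable short hash of any JSON-serializable value."""
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(payload.encode()).hexdigest()[:12]

def summary_cache_key(prefix, app_id, date_range, filters):
    return f"{prefix}:{app_id}:{_short_hash(date_range)}:{_short_hash(filters)}"
```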
Vue Query + Vue 2 compatibility
Use `vue-query@1.x` (Vue 2 compat). `@tanstack/vue-query@5.x` is Vue 3 only.
Timezone
Aggregate in UTC for Phase 1. Frontend shows in browser timezone. Add timezone_offset param later if users complain.
Ongoing + mutually exclusive status
Conferences chart shows success | warning | error | ongoing as mutually exclusive. SQL uses FILTER with explicit AND NOT clauses (see query example above).
Rate limiting
Summary endpoints hit GenericView (2000/min per IP). Vue Query's dedup helps — multiple charts asking for the same data fire one request, not several.
Backward compatibility
Keep the old /v1/conferences list working. Summary endpoints are additive. Migrate chart-by-chart.
Cache stampede
Randomized TTL jitter (60s ± 10s) so keys don't expire together.
Pre-warming scope
With only ~3 apps, pre-warm every app every 30 seconds. No complex "which apps to warm" logic needed.
What to Measure
| Metric | Before | Target |
|---|---|---|
| Dashboard load time (warm Redis) | 13s | ~50ms |
| Dashboard load time (Vue Query hit) | N/A | <10ms |
| Largest API response size | 21 MB | <50 KB |
| 504 error rate | growing | 0 |
| Postgres dashboard query time | 3-5s | <100ms |
| Redis cache hit rate | N/A | ~100% with pre-warming |
Dependencies