fix(signals): batch scout findings lookups to speed up inbox load #66823
Opening the Findings panel fanned out one HTTP request per emitted run across two endpoints (up to ~240 requests, capped at 120 runs), and each per-run emission-reports request ran its own ClickHouse query over document_embeddings. That made the panel slow to open. Replace the per-run fan-out with two batched POST endpoints that take a run_ids list: emissions_batch and emission_reports_batch. The batched reports lookup resolves every run's findings in a single fetch_report_ids_for_source_ids call (one ClickHouse round-trip). findingsLogic now issues two batched requests instead of the per-run map.
🕸️ Eager graph
How much code each root forces the browser to download and decode through static imports: the regression class that total bundle size can't see.
✅ Largest files eagerly reachable from
| Size | File |
|---|---|
| 906.9 KiB | src/styles/global.scss |
| 609.0 KiB | public/hedgehog/burning-money-hog.png |
| 541.9 KiB | public/hedgehog/waving-hog.png |
| 448.2 KiB | public/hedgehog/stop-sign-hog.png |
| 362.0 KiB | public/hedgehog/phone-pair-hogs.png |
| 357.8 KiB | ../node_modules/.pnpm/@posthog+icons@0.37.4_react-dom@18.3.1_react@18.3.1__react@18.3.1/node_modules/@posthog/icons/dist/posthog-icons.es.js |
| 343.3 KiB | src/taxonomy/core-filter-definitions-by-group.json |
| 335.6 KiB | public/hedgehog/desk-hog.png |
| 323.2 KiB | public/hedgehog/3-bears-hogs.png |
| 301.3 KiB | src/lib/api.ts |
Largest files eagerly reachable from src/scenes/AuthenticatedShell.tsx
| Size | File |
|---|---|
| 906.9 KiB | src/styles/global.scss |
| 771.7 KiB | src/queries/validators.js |
| 609.0 KiB | public/hedgehog/burning-money-hog.png |
| 541.9 KiB | public/hedgehog/waving-hog.png |
| 448.2 KiB | public/hedgehog/stop-sign-hog.png |
| 362.0 KiB | public/hedgehog/phone-pair-hogs.png |
| 357.8 KiB | ../node_modules/.pnpm/@posthog+icons@0.37.4_react-dom@18.3.1_react@18.3.1__react@18.3.1/node_modules/@posthog/icons/dist/posthog-icons.es.js |
| 343.3 KiB | src/taxonomy/core-filter-definitions-by-group.json |
| 335.6 KiB | public/hedgehog/desk-hog.png |
| 323.2 KiB | public/hedgehog/3-bears-hogs.png |
Posted automatically by check-eager-graph · sizes are input-source bytes from the esbuild metafile · part of #32479
Reviews (1): Last reviewed commit: "chore: update OpenAPI generated types"
MCP UI Apps size report
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8ed336f0cb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Size Change: 0 B · Total Size: 64.5 MB
The report-link batch (emissions/reports/batch) requires task:read on top of signal_scout:read, while the emissions batch needs only signal_scout:read. A token with the latter but not the former loads findings cleanly but 403s on the report-link endpoint, and the retry listener re-polls it while any recent finding is unlinked — so the throw would hit the global kea-loaders error handler on a loop. Report chips are optional enrichment, so swallow the failure and keep the prior links (restoring the old allSettled intent); the emissions loader keeps throwing since that is the page's actual content.
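The failure handling described above can be sketched as follows. The real change lives in the frontend findingsLogic (kea loaders); this Python sketch only mirrors the control flow, and every name in it (Forbidden, load_emissions, load_report_links, forbidden_fetch) is hypothetical rather than taken from the PR.

```python
class Forbidden(Exception):
    """Stand-in for an HTTP 403 response (hypothetical)."""


def load_emissions(fetch):
    # Core page content: let any failure propagate to the caller.
    return fetch()


def load_report_links(fetch, prior_links):
    # Optional enrichment: on any failure, keep the links we already had
    # instead of raising, so a repeating poll cannot trigger an error loop.
    try:
        return fetch()
    except Exception:
        return prior_links


def forbidden_fetch():
    # Simulates a token that has signal_scout:read but lacks task:read.
    raise Forbidden("task:read scope missing")


prior = {"finding-1": "report-9"}
links_after_403 = load_report_links(forbidden_fetch, prior)
links_after_ok = load_report_links(lambda: {"finding-2": "report-3"}, prior)
```

With this shape, a 403 on the report-link request leaves the previously known chips in place, while a failure in the emissions request still surfaces as an error.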
👋 Visual changes detected for this PR. Review and approve in PostHog Visual Review. If these changes are unexpected, they may be caused by a flaky test or a broken snapshot on master. Don't approve; rerun the job or wait for a fix.
2 updated
Run: 4a0ca4b7-f3b4-4815-a2e2-547834b2bc47
Co-authored-by: andrewm4894 <2178292+andrewm4894@users.noreply.github.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0f51040901
hash: v1.k794b7964.dbe806f797399ea2451bdf08b2a0bc1a413ea4d65e1a244e859a6f17a330005d.i8IfnQ9ONhYyodW5HPR_2GSN7q1dcVA3cPUtXyNi_ro
scenes-app-customer-analytics-accounts--row-expanded-links-disabled--light:
hash: v1.k794b7964.b6d4fa4f4fe7f87fe5fea5c702fc4db8558f8b9f0904cb9659ae9aece845b3ac.QfMAWNJmI8DCp90AGyh4Ena5eXJqEtCc3OFG2onVXnI
hash: v1.k794b7964.f82e5d3d3f153d31ba1ef5b668ebb5ab8aab000fe4acb42252f89e9f56a838bc.GkNSIp0tMmvNDlRasel6bRQBgice_VcLKHISDWHGnnU
Revert unrelated visual snapshot baselines
This change only updates Signals scout APIs/UI, but these new hashes are for the unrelated Customer analytics accounts row-expanded-links-disabled Storybook story. Once this lands, visual CI will treat the new dark/light hashes as expected and can mask a regression in that story that this PR does not otherwise justify, so please remove these snapshot updates or include the actual Customer analytics change that requires them.
emissions = SignalScoutEmission.objects.filter(scout_run_id__in=run_ids, team_id=team_id).order_by(
    "-emitted_at", "-id"
)[:MAX_EMISSIONS_PER_BATCH]
Preserve coverage across all requested runs
With the new global [:MAX_EMISSIONS_PER_BATCH] applied after sorting all requested runs together, one noisy/retry-heavy run can consume the entire 5,000-row window and silently drop older findings from the other runs in the Findings panel. The previous per-run fetch capped each run independently, so a single pathological run could not starve the rest of the fleet; keep a per-run cap/window or otherwise signal truncation so triage lists are not incomplete without warning.
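One way to address this concern is to cap each run independently and report which runs were cut. The sketch below does that in memory on plain Python objects; it is not the PR's code. The Emission dataclass, cap_per_run, and the integer emitted_at are assumptions made for illustration, and a real implementation would more likely push the cap into the database query.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Emission:
    # Hypothetical stand-in for a SignalScoutEmission row.
    id: int
    run_id: str
    emitted_at: int  # epoch seconds, standing in for a datetime


def cap_per_run(emissions, per_run_cap):
    """Keep the newest `per_run_cap` rows of each run; return (rows, truncated run ids)."""
    by_run = defaultdict(list)
    for emission in emissions:
        by_run[emission.run_id].append(emission)

    kept, truncated_runs = [], []
    for run_id, rows in by_run.items():
        # Newest first, with id as the tie-breaker (mirrors "-emitted_at", "-id").
        rows.sort(key=lambda e: (e.emitted_at, e.id), reverse=True)
        if len(rows) > per_run_cap:
            truncated_runs.append(run_id)
        kept.extend(rows[:per_run_cap])

    # Flatten across runs, newest first.
    kept.sort(key=lambda e: (e.emitted_at, e.id), reverse=True)
    return kept, sorted(truncated_runs)


noisy = [Emission(id=i, run_id="noisy", emitted_at=1000 + i) for i in range(10)]
quiet = [Emission(id=100, run_id="quiet", emitted_at=5)]

kept, truncated = cap_per_run(noisy + quiet, per_run_cap=3)

# For contrast: a single global cap over the same rows drops the quiet run entirely.
global_window = sorted(noisy + quiet, key=lambda e: (e.emitted_at, e.id), reverse=True)[:3]
```

Here the noisy run is trimmed to three rows and flagged as truncated, while the quiet run's single older finding survives; the global window contains only rows from the noisy run.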
🎭 Playwright report
These issues are not necessarily caused by your changes.
Problem
Opening the Findings panel in the inbox scouts surface was slow.
findingsLogic took the recent emitted-run window (capped at 120 runs) and fanned out one HTTP request per run, across two endpoints: up to ~240 parallel requests on open. The browser throttles to ~6 connections per host, so they serialized into ~40 waves. Worse, each per-run emissions/reports request ran its own ClickHouse query doing unmaterialized JSONExtract over document_embeddings, so up to 120 separate CH scans per panel open. That's what made the button feel like it hung.
Changes
Replace the per-run fan-out with two batched POST endpoints on SignalScoutRunViewSet that accept a run_ids list:
- emissions/batch: every run's emitted findings in one Postgres query, flattened newest-first (each row keeps its run_id).
- emissions/reports/batch: resolves every run's findings to their inbox report in a single fetch_report_ids_for_source_ids call, i.e. one ClickHouse round-trip instead of one per run.
The shared report-link resolution is factored into a _resolve_emission_report_links helper that both the per-run and batched actions use. Foreign-team run ids contribute no rows (no per-run 404, so one stale id can't blank the page); the reports endpoint keeps its task:read scope requirement.
findingsLogic now issues two batched requests instead of mapping over runs, which collapses the old partial-failure machinery into simple all-or-nothing loaders (kea-loaders keeps prior report chips on a failed poll). Net effect: ~240 requests / up to 120 CH scans → 2 requests / 1 CH scan.
Generated OpenAPI types / zod / MCP client were regenerated; these endpoints are intentionally not exposed as MCP tools (UI-internal).
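The difference between the per-run and batched report-link lookups can be illustrated with a small sketch. This is not the PR's implementation: resolve_links_per_run, resolve_links_batched, fake_fetch, and the REPORTS mapping are hypothetical, and the lookup is assumed to take a list of source ids and return a dict of source id to report id, which may not match the real signature of fetch_report_ids_for_source_ids.

```python
from unittest.mock import Mock


def resolve_links_per_run(source_ids_by_run, fetch_report_ids):
    """Old shape: one lookup (one ClickHouse round-trip) per run."""
    links = {}
    for source_ids in source_ids_by_run.values():
        links.update(fetch_report_ids(source_ids))
    return links


def resolve_links_batched(source_ids_by_run, fetch_report_ids):
    """New shape: gather every run's source ids, then do a single lookup."""
    all_ids = [sid for ids in source_ids_by_run.values() for sid in ids]
    if not all_ids:
        return {}
    return fetch_report_ids(all_ids)


# Hypothetical stand-in for the ClickHouse-backed lookup.
REPORTS = {"s1": "r1", "s3": "r2"}


def fake_fetch(source_ids):
    return {sid: REPORTS[sid] for sid in source_ids if sid in REPORTS}


runs = {"run-a": ["s1", "s2"], "run-b": ["s3"]}

per_run_fetch = Mock(side_effect=fake_fetch)
batched_fetch = Mock(side_effect=fake_fetch)

old_links = resolve_links_per_run(runs, per_run_fetch)
new_links = resolve_links_batched(runs, batched_fetch)
```

Both shapes produce the same mapping, but the batched one calls the lookup once regardless of how many runs are requested. Asserting on the mock's call count is one way to guard against a silent return to per-run fan-out.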
How did you test this code?
I'm an agent (Claude Code), human-directed. Automated tests only — no manual UI testing.
- test_scout_harness_api.py: TestScoutHarnessEmissionsBatchAPI covers cross-run flattening + per-row run_id, foreign-team scoping (200 with only own rows, not 404), and empty run_ids rejected (400).
- TestScoutHarnessEmissionReportsBatchAPI is the core regression guard: fetch_report_ids_for_source_ids is called exactly once with the source ids from all runs (prevents a silent revert to per-run fan-out), plus cross-run link mapping and foreign-team scoping.
- test_scout_harness_api.py passes (108/108), so the per-run refactor didn't regress.
- hogli build:openapi; ran ruff, oxlint, and the frontend typecheck clean on all changed files.
🤖 Agent context
Autonomy: Human-driven (agent-assisted)
- /improving-drf-endpoints (new DRF actions + request serializer), /phs make-pr (this PR).
- emissions/reports fan-out (each request its own JSONExtract ClickHouse scan). Since fetch_report_ids_for_source_ids already accepts an arbitrary source_ids list, batching was the natural, lowest-risk fix, chosen over a deeper ClickHouse change.
- metadata JSON fields on document_embeddings so even the single remaining query stops paying JSON-extract scan cost; that's a ClickHouse migration on a shared table.