fix/stream alert dedup memory leak by duguwanglong · Pull Request #273 · AgentFlocks/flocks

duguwanglong · 2026-05-14T11:30:58Z

fix(workflow): compact inputParams before persisting execution record

inputParams was written verbatim into every workflow_execution SQLite row.
For HTTP /run batch calls that pass a large alert list as an input
parameter (e.g. {"raw_alerts": [...10k items...]}) this inflates each row
by the same order of magnitude as the uncompacted outputResults bug fixed
in the previous commit.

Changes:
- execution_store.py: create_execution_record() now passes input_params
  through compact_outputs_for_storage() before building the record, so
  known large-list keys (raw_alerts, enriched_alerts, etc.) are replaced
  with lightweight _<key>_count integers.  Scalar fields and unknown keys
  are preserved unchanged, so audit / replay use-cases are not affected.
- tests/workflow/test_execution_store_compact.py: two new tests pin the
  expected behaviour — "alerts" (not in DEFAULT_LARGE_LIST_KEYS) passes
  through unchanged, while "raw_alerts" (in the default set) is compacted.

…lert_dedup Root causes identified: 1. Large alert lists (enriched_alerts, unique_alerts, …) were written verbatim into every workflow_execution SQLite row — both in outputResults and in each executionLog snapshot emitted by _on_step_complete — duplicating megabytes of data that is already persisted to JSONL on disk. 2. The problem affected both the syslog trigger path (manager.py) and the HTTP background execution path (routes/workflow.py), but only the former had a local fix applied; the HTTP path was left unprotected. Changes: - execution_store.py: add compact_outputs_for_storage() and compact_history_for_storage() as the canonical compact API. Lists whose key belongs to DEFAULT_LARGE_LIST_KEYS and whose length exceeds DEFAULT_COMPACT_SIZE_THRESHOLD (100) are replaced with a lightweight _<key>_count integer; small lists and unknown keys pass through unchanged, so ordinary metadata arrays are never silently dropped. - syslog/manager.py: remove the local _compact_outputs/_compact_history implementation and import the shared helpers from execution_store. - routes/workflow.py: apply compact_outputs_for_storage to outputResults and compact_history_for_storage to executionLog in both the background execution path and the workflow-center invoke path. Also compact each step dict inside _on_step_complete before appending to step_history so that every intermediate _write_progress snapshot stays bounded. - tests/workflow/test_execution_store_compact.py: 14 new unit tests covering the size-threshold guard, non-mutation contract, defensive pass-through for non-dict inputs, and the end-to-end size reduction (>1 000x on a 10K-alert payload). Co-authored-by: Cursor <cursoragent@cursor.com>

inputParams was written verbatim into every workflow_execution SQLite row. For HTTP /run batch calls that pass a large alert list as an input parameter (e.g. {"raw_alerts": [...10k items...]}) this inflates each row by the same order of magnitude as the uncompacted outputResults bug fixed in the previous commit. Changes: - execution_store.py: create_execution_record() now passes input_params through compact_outputs_for_storage() before building the record, so known large-list keys (raw_alerts, enriched_alerts, etc.) are replaced with lightweight _<key>_count integers. Scalar fields and unknown keys are preserved unchanged, so audit / replay use-cases are not affected. - tests/workflow/test_execution_store_compact.py: two new tests pin the expected behaviour — "alerts" (not in DEFAULT_LARGE_LIST_KEYS) passes through unchanged, while "raw_alerts" (in the default set) is compacted. Co-authored-by: Cursor <cursoragent@cursor.com>

Five issues raised in code review, all fixed in this commit: 1. Docstring/behaviour mismatch — alerts key create_execution_record()'s docstring previously used {"alerts":[…10k]} as the example of what gets compacted. "alerts" is NOT in DEFAULT_LARGE_LIST_KEYS, so the row would silently stay large. The docstring now cites "raw_alerts" (which IS in the set) and adds an explicit note that keys outside the default set are stored verbatim. compact_outputs_for_storage() gains the same clarification so callers can easily see the full list of compacted keys. 2. workflow_center_invoke — executionLog path The invoke path proxies to an external service and never populates executionLog locally, so no data was at risk. Added an explanatory comment and an explicit compact_history_for_storage() call on the executionLog field as a forward-compatible guard against future code that might populate it before the final write. 3. tuple sequences not compacted compact_outputs_for_storage() checked isinstance(v, list), missing tuple values that some serialisation paths may produce. Changed to isinstance(v, (list, tuple)). New test test_compact_outputs_compacts_ tuple_sequences() pins this behaviour. 4. Unused import pytest Removed the bare "import pytest" from test_execution_store_compact.py; no pytest.raises / @pytest.mark usages exist in the file. 5. Observability note compact_outputs_for_storage() docstring now explicitly states that compacted keys are replaced with _<key>_count and that callers who need the full list contents must read from the JSONL files written by the workflow. Co-authored-by: Cursor <cursoragent@cursor.com>

…dedup-memory-leak fix/stream alert dedup memory leak

duguwanglong and others added 2 commits May 14, 2026 19:08

duguwanglong requested a review from xiami762 May 14, 2026 11:31

xiami762 approved these changes May 15, 2026

View reviewed changes

xiami762 merged commit 9ced911 into dev May 15, 2026

duguwanglong pushed a commit to DearEmma/flocks that referenced this pull request May 18, 2026

Merge pull request AgentFlocks#273 from AgentFlocks/fix/stream-alert-…

5599acf

…dedup-memory-leak fix/stream alert dedup memory leak

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix/stream alert dedup memory leak#273

fix/stream alert dedup memory leak#273
xiami762 merged 3 commits into
devfrom
fix/stream-alert-dedup-memory-leak

duguwanglong commented May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

duguwanglong commented May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants