perf(parser,db): replace lifetime cml_stats refresh with windowed stats #51
Merged
The background stats timer previously called update_cml_stats() for every
known CML every 60 s. That function scans the full history of cml_data,
causing 20-30 s of ~100% PostgreSQL CPU usage every minute regardless of
how much new data arrived.
This commit replaces that hot path with a cheap windowed variant:
Database (migration 009 / init.sql)
- Add 8 new columns to cml_stats:
completeness_percent_6h, total_records_6h, valid_records_6h,
mean_rsl_6h, stddev_rsl_6h,
completeness_percent_1h, mean_rsl_1h, stddev_rsl_1h
- Add update_cml_stats_windowed(cml_id, user_id) which computes all
windowed aggregates in one pass using FILTER clauses. TimescaleDB
chunk exclusion limits the scan to the current uncompressed chunk
(~6 h of data) irrespective of total dataset size.
- GRANT EXECUTE on the new function to demo_openmrg and
demo_orange_cameroun.
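The single-pass idea behind update_cml_stats_windowed can be sketched in plain Python. This is only an analogue of the SQL FILTER approach, not the migration's code: the completeness definition (valid records divided by expected records) and the expected_per_hour parameter are assumptions made for illustration. Sample standard deviation is used because PostgreSQL's STDDEV is the sample form.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean, stdev


def windowed_stats(rows, now, expected_per_hour):
    """Compute 6 h and 1 h aggregates in one pass over (timestamp, rsl) rows.

    rsl may be None for an invalid record. expected_per_hour is a
    hypothetical parameter: the number of records a healthy CML would send.
    """
    cut_6h = now - timedelta(hours=6)
    cut_1h = now - timedelta(hours=1)
    total_6h = 0
    valid_6h, valid_1h = [], []

    for ts, rsl in rows:  # one scan; each window acts like a FILTER clause
        if ts < cut_6h or ts > now:
            continue  # outside the 6 h window
        total_6h += 1
        if rsl is None:
            continue  # counted in the total, but not a valid record
        valid_6h.append(rsl)
        if ts >= cut_1h:
            valid_1h.append(rsl)

    def _mean(values):
        return mean(values) if values else None

    def _stddev(values):
        # Sample standard deviation needs at least two points.
        return stdev(values) if len(values) >= 2 else None

    return {
        "total_records_6h": total_6h,
        "valid_records_6h": len(valid_6h),
        "completeness_percent_6h": 100.0 * len(valid_6h) / (6 * expected_per_hour),
        "mean_rsl_6h": _mean(valid_6h),
        "stddev_rsl_6h": _stddev(valid_6h),
        "completeness_percent_1h": 100.0 * len(valid_1h) / expected_per_hour,
        "mean_rsl_1h": _mean(valid_1h),
        "stddev_rsl_1h": _stddev(valid_1h),
    }


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    sample = [(now - timedelta(minutes=m), -50.0 - m / 60) for m in (10, 30, 300)]
    print(windowed_stats(sample, now, expected_per_hour=2))
```

The cost of the loop depends only on how many rows fall inside the window, which is the property the database gains from chunk exclusion.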
Parser (db_writer.py / main.py)
- Add DBWriter.refresh_windowed_stats() mirroring refresh_stats() but
calling update_cml_stats_windowed.
- Replace both refresh_stats() call sites in the stats background thread
with refresh_windowed_stats().
- Wire _update_stats_for_cmls() into write_rawdata() so lifetime columns
(total_records, min_rsl, max_rsl, …) stay current when new data
arrives; the stats update and the data insert share one commit so
they are atomic.
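A minimal sketch of the two parser code paths, assuming a DB-API style connection. The class name DBWriterSketch, the cml_data column names, and the (cml_id, user_id) signature of update_cml_stats are assumptions for illustration; only update_cml_stats_windowed(cml_id, user_id) is taken from the description above.

```python
import logging

logger = logging.getLogger(__name__)


class DBWriterSketch:
    """Illustrative stand-in for DBWriter; not the project's actual class."""

    def __init__(self, conn, user_id):
        self.conn = conn
        self.user_id = user_id

    def refresh_windowed_stats(self, cml_ids):
        """Refresh windowed columns; swallow DB errors so the timer keeps running."""
        cur = None
        try:
            cur = self.conn.cursor()
            for cml_id in cml_ids:
                cur.execute(
                    "SELECT update_cml_stats_windowed(%s, %s)",
                    (cml_id, self.user_id),
                )
            self.conn.commit()
            return True
        except Exception:
            logger.exception("windowed stats refresh failed")
            self.conn.rollback()
            return False
        finally:
            if cur is not None:
                cur.close()

    def write_rawdata(self, rows):
        """Insert rows and update lifetime stats under a single commit."""
        cur = self.conn.cursor()
        try:
            # Column names are hypothetical.
            cur.executemany(
                "INSERT INTO cml_data (time, cml_id, rsl) VALUES (%s, %s, %s)",
                rows,
            )
            self._update_stats_for_cmls(cur, sorted({row[1] for row in rows}))
            self.conn.commit()  # insert and stats update land together
        except Exception:
            self.conn.rollback()  # neither the rows nor the stats persist
            raise
        finally:
            cur.close()

    def _update_stats_for_cmls(self, cur, cml_ids):
        # Assumed signature for the lifetime stats function.
        for cml_id in cml_ids:
            cur.execute("SELECT update_cml_stats(%s, %s)", (cml_id, self.user_id))
```

Because the stats update reuses the cursor of the insert and only one commit is issued, a failure in either step rolls back both.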
Webserver (main.py)
- /api/cml-stats: replace the complex LEFT JOIN + live STDDEV recompute
with a simple SELECT from cml_stats reading the pre-computed windowed
columns. Response now carries completeness_percent_6h as
completeness_percent, plus completeness_percent_1h and
stddev_last_60min (pre-computed 1 h stddev).
- get_archive_statistics(): replace COUNT(*) FROM cml_data_secure (full
hypertable scan) with SUM(total_records) FROM cml_stats (O(users)).
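The archive-statistics change can be illustrated with an in-memory SQLite table standing in for cml_stats. The table layout and the helper name archive_total_records are assumptions; the COALESCE(SUM(total_records), 0) expression is the one named in the regression test for this endpoint.

```python
import sqlite3


def archive_total_records(conn):
    """Read the record total from pre-computed per-CML stats rows."""
    row = conn.execute(
        "SELECT COALESCE(SUM(total_records), 0) FROM cml_stats"
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
# Hypothetical minimal layout; the real cml_stats table has more columns.
conn.execute(
    "CREATE TABLE cml_stats (cml_id TEXT, user_id TEXT, total_records INTEGER)"
)

# COALESCE turns the NULL that SUM yields on an empty table into 0.
empty_total = archive_total_records(conn)

conn.executemany(
    "INSERT INTO cml_stats VALUES (?, ?, ?)",
    [("cml_1", "demo_openmrg", 1200), ("cml_2", "demo_openmrg", 800)],
)
total = archive_total_records(conn)
print(empty_total, total)
```

The query touches one small row per stats entry instead of every row in the raw data table, so its cost no longer grows with the archive.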
Frontend (realtime.html)
- Update dropdown labels: "Data Completeness" -> "Data Completeness (6h)"
and "RSL Std Dev (60min)" -> "RSL Std Dev (1h)".
- Update popup text to match new time-window labels.
Scripts / onboarding (generate_config.py)
- Add GRANT EXECUTE on update_cml_stats_windowed to per-user SQL
template so new users get access to both stats functions.
Tests
- test_api_cml_stats.py: rewrite fixture for the new 9-column schema
and add assertions for completeness_percent_1h and stddev_last_60min.
- All 63 parser unit tests and 67 webserver unit tests pass.
Codecov Report
@@            Coverage Diff             @@
##             main      #51      +/-   ##
==========================================
+ Coverage   84.31%   85.42%   +1.10%
==========================================
  Files          28       28
  Lines        2940     3012      +72
==========================================
+ Hits         2479     2573      +94
+ Misses        461      439      -22
Add unit tests for the new code paths introduced in the previous commit:
parser/tests/test_db_writer.py
- test_refresh_windowed_stats_commits_on_success: verifies
update_cml_stats_windowed is called and the transaction is committed.
- test_refresh_windowed_stats_rollback_on_error: verifies the exception
is swallowed and the connection is rolled back on DB failure.
parser/tests/test_main.py
- _capture_stats_loop helper: runs main() with a CapturingThread so the
stats_loop closure can be extracted and called synchronously.
- test_stats_loop_calls_refresh_windowed_stats_on_startup
- test_stats_loop_initial_refresh_error_is_swallowed
- test_stats_loop_calls_refresh_windowed_stats_in_timer_loop
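One way a thread target can be captured for synchronous testing is sketched below. The CapturingThread shown here and the toy main() are assumptions written for illustration; they are not the helper from test_main.py.

```python
import threading
from unittest.mock import patch


class CapturingThread:
    """Stand-in for threading.Thread that records targets and never runs them."""

    captured = []

    def __init__(self, target=None, args=(), kwargs=None, daemon=None, name=None):
        self.target = target
        CapturingThread.captured.append(target)

    def start(self):
        pass  # do not spawn a real thread

    def join(self, timeout=None):
        pass


def main():
    """Toy entry point that starts a background stats closure."""
    calls = []

    def stats_loop():
        calls.append("refresh")
        return calls

    threading.Thread(target=stats_loop, daemon=True).start()


with patch("threading.Thread", CapturingThread):
    main()
    stats_loop = CapturingThread.captured[0]
    result = stats_loop()  # run the closure synchronously, patches still active

print(result)
```

Running the closure on the test's own thread makes assertions deterministic and avoids sleeping or joining on a background thread.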
webserver/tests/test_api_routes.py
- test_get_archive_statistics_reads_total_records_from_cml_stats:
regression guard ensuring the archive stats endpoint uses
COALESCE(SUM(total_records), 0) FROM cml_stats and never
COUNT(*) FROM cml_data (full hypertable scan).
_capture_stats_loop called stats_loop() after the with-patch block exited. With all patches removed, DBWriter.connect() attempted a real DB connection, failed, and the retry loop spun forever because mock_event.is_set() was configured to return False. Replace _capture_stats_loop with _run_stats_loop, which invokes stats_loop() inside the patch context so DBWriter stays mocked throughout the call.
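The scoping problem can be reproduced in a few lines. The DBWriter class and make_stats_loop below are toy stand-ins, assumed for illustration; the point is only that a closure resolves patched names when it is called, not when it is created.

```python
from unittest.mock import patch


class DBWriter:
    """Toy stand-in whose real connect() always fails, as in a unit-test run."""

    def connect(self):
        raise ConnectionError("no database in unit tests")


def make_stats_loop():
    def stats_loop():
        # DBWriter.connect is looked up at call time.
        writer = DBWriter()
        writer.connect()
        return "refreshed"

    return stats_loop


# Pitfall: the closure is created under the patch but called after it exits.
with patch.object(DBWriter, "connect", return_value=None):
    loop = make_stats_loop()
try:
    loop()
    outside = "mock still active"
except ConnectionError:
    outside = "real connect attempted"

# Fix: call the closure while the patch is still in effect.
with patch.object(DBWriter, "connect", return_value=None):
    inside = make_stats_loop()()

print(outside, "|", inside)
```

In the real test the failure surfaced as an endless retry loop rather than an exception, which is why keeping the call inside the patch context matters.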