perf: add canary patterns to skip statistics during bulk operations#150
perf: add canary patterns to skip statistics during bulk operations#150kneckinator wants to merge 3 commits into19.0from
Conversation
Replace per-record ORM creates and Command.create() tuples with raw SQL INSERT ... ON CONFLICT (unique_cols) DO NOTHING for bulk membership creation. Duplicates are silently skipped and the inserted count is returned via cursor.rowcount. Updates _import_registrants and _add_beneficiaries to use the new skip_duplicates path, with ORM cache invalidation after raw SQL inserts.
Set enrollment_date to current timestamp when state is 'enrolled' in the program membership SQL insert (computed field not triggered by raw SQL). Hoist fields.Date.today() outside the loop in cycle membership to avoid repeated calls per record.
Add context flags (skip_registrant_statistics, skip_program_statistics) that allow bulk operation callers to suppress expensive computed field recomputation. Add refresh_beneficiary_counts() on spp.program and refresh_statistics() on spp.cycle to recompute once at completion. Also replace bool(rec.program_membership_ids) with SQL EXISTS in _compute_has_members to avoid loading the full membership recordset.
There was a problem hiding this comment.
Code Review
This pull request implements bulk membership creation for programs and cycles using raw SQL with ON CONFLICT DO NOTHING to optimize performance and handle duplicate entries. It introduces 'canary' context flags to bypass expensive field recomputations during bulk processes, along with refresh methods to synchronize statistics upon completion. Review feedback highlights a missing refresh method for registrant statistics which could lead to stale data, and questions the redundancy of the cycle statistics refresh method for non-stored fields.
| if self.env.context.get("skip_registrant_statistics"): | ||
| return |
There was a problem hiding this comment.
The skip_registrant_statistics context flag allows bypassing the recomputation of store=True fields such as program_membership_count. However, unlike the spp.program and spp.cycle models, no corresponding refresh method is provided for res.partner to recompute these statistics after bulk operations. This will result in stale data remaining in the database if the flag is used or if raw SQL is employed.
Please consider adding a refresh_registrant_statistics() method to the res.partner inheritance and calling it in the appropriate completion handlers (e.g., in mark_import_as_done within eligibility_manager.py).
| def refresh_statistics(self): | ||
| """Refresh all cycle statistics after bulk operations. | ||
|
|
||
| Call this after raw SQL inserts that bypass ORM dependency tracking | ||
| (e.g. bulk_create_memberships with skip_duplicates=True). | ||
| """ | ||
| self._compute_members_count() | ||
| self._compute_entitlements_count() | ||
| self._compute_total_entitlements_count() |
There was a problem hiding this comment.
The refresh_statistics method appears to be redundant in its current implementation. The fields it attempts to refresh (members_count, entitlements_count, and total_entitlements_count) are all store=False and do not implement the "canary" skip logic found in other models.
Since these fields are computed on-demand and the underlying relation caches are correctly invalidated in the managers (e.g., via cycle.invalidate_recordset(['cycle_membership_ids'])), they will naturally reflect the correct values upon the next access without an explicit refresh call. If the intention was to optimize these fields for bulk operations, they should be made store=True and implement the skip logic, similar to has_members in spp.program.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## 19.0 #150 +/- ##
==========================================
+ Coverage 71.06% 71.12% +0.05%
==========================================
Files 925 925
Lines 54704 54797 +93
==========================================
+ Hits 38876 38973 +97
+ Misses 15828 15824 -4
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Summary
refresh_beneficiary_counts()onspp.programandrefresh_statistics()onspp.cycleto recompute all statistics once at completionbool(rec.program_membership_ids)with SQL EXISTS in_compute_has_membersto avoid loading the full membership recordsetmark_import_as_done()completion handlersNote: This PR is based on Phase 7 (#149) and will need rebasing after that merges.
Changes
registrant.py: Addskip_registrant_statisticscontext flag early return to all 4 compute methodsprograms.py: Addskip_program_statisticsflag to_compute_has_members, replace ORM with SQL EXISTS, addrefresh_beneficiary_counts()cycle.py: Addrefresh_statistics()methodeligibility_manager.py: Userefresh_beneficiary_counts()inmark_import_as_done()cycle_manager_base.py: Usecycle.refresh_statistics()inmark_import_as_done()test_canary_patterns.py(new): 9 tests covering context flags, SQL EXISTS, and refresh methodsTest plan
./scripts/test_single_module.sh spp_programs— 600 tests, 0 failurespre-commit run --files <changed_files>— all checks pass