Skip to content

fix(sync): bulk-load synthetics_tests for full-scan lookups in minimize-reads mode#547

Merged
michael-richey merged 2 commits intomainfrom
hamr/minimize-reads-5-synthetics-bypass
Apr 27, 2026
Merged

fix(sync): bulk-load synthetics_tests for full-scan lookups in minimize-reads mode#547
michael-richey merged 2 commits intomainfrom
hamr/minimize-reads-5-synthetics-bypass

Conversation

@michael-richey
Copy link
Copy Markdown
Collaborator

Summary

  • Adds State.ensure_resource_type_loaded(resource_type) — a generic lazy bulk-load method that loads an entire resource type from storage on first call, using insert-if-absent merge to protect mid-sync destination entries
  • Fixes monitors.py and service_level_objectives.py: removes eager top-level access to state.destination["synthetics_tests"]; moves access into the composite/fallback branches with a ensure_resource_type_loaded call before each scan
  • Fixes synthetics_test_suites.py and synthetics_global_variables.py: adds ensure_resource_type_loaded before the startswith scan that requires all synthetics_tests compound keys to be present
  • Adds "synthetics_tests": [] to monitors resource_connections (documents implicit dependency; empty attr list creates no topological edges)

Root cause: endswith/startswith scans in connect_id overrides require ALL synthetics_tests entries to be loaded. In --minimize-reads mode only the requested resource types are loaded up front. Per-resource lazy loading can't resolve compound keys (public_id#monitor_id) from a bare monitor ID.

Stacked on: hamr/minimize-reads-3-id-targeted

Test plan

  • GREEN/GREEN regression tests written first and verified passing on pre-fix code (8 tests across 4 classes)
  • RED/GREEN tests verified failing on pre-fix code, passing after (12 tests across 5 classes: TestEnsureResourceTypeLoaded, TestMonitorsCompositeMinimizeReads, TestSLOConnectIdMinimizeReads, TestSyntheticsTestSuitesMinimizeReads, TestSyntheticsGlobalVariablesMinimizeReads)
  • Full unit test suite: 403 passed
  • tox -e ruff,black: clean

🤖 Generated with Claude Code

Copy link
Copy Markdown
Collaborator Author

@michael-richey michael-richey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

Overall: Approve. The root cause is correctly identified, the fix is targeted, and the test coverage is thorough. A few nits below.


state.pyensure_resource_type_loaded

Solid implementation. Specific things I noticed that are worth calling out as intentional:

  • Marking the type in _bulk_loaded_types before the get() call correctly prevents an infinite retry on persistent storage failure. Good.
  • The early return when not src_loaded and not dst_loaded does not re-raise or retry on the next call — by design (type is already in the set). Worth confirming this is the desired behavior when a resource type has no data at all (e.g. synthetics_tests genuinely not exported), since a second caller would assume it's been loaded.
  • The insert-if-absent merge and _ensure_attempted backfill are exactly right — no clobbering mid-sync state, and per-resource ensure_resource_loaded calls for already-bulk-loaded keys become no-ops.

Nit — docstring asyncio safety claim:

"Python GIL protects set/dict operations"

This is technically true for individual ops but slightly misleading. The actual safety guarantee here is that ensure_resource_type_loaded has no await points, so asyncio cooperative scheduling can't interleave another coroutine mid-execution. The GIL note is correct but may confuse readers into thinking this is threadsafe in the general multi-thread sense. Could read: "Fully synchronous with no await points — safe to call from concurrent asyncio workers without locking."


monitors.py

Moving synthetics_tests and slos lookups out of the function preamble and into the branches that need them is the right call. It eliminates the eager defaultdict access that would silently return {} in minimize-reads mode and cause the scan to find nothing.

The "synthetics_tests": [] entry in resource_connections is confirmed correct: the inner loop in _resource_connections / connect_resources iterates over attribute paths, so an empty list produces no find_attr calls, no topological edges, and no ensure_resource_loaded triggers. It's purely documentary, which is exactly what the comment says.


synthetics_global_variables.py

Nit: The call is ensure_resource_type_loaded(resource_to_connect) rather than ensure_resource_type_loaded("synthetics_tests"). Today resource_to_connect is always synthetics_tests for this model (per resource_connections in line 19), so it's equivalent. But it means if a new resource_to_connect is ever added to this model's config, it would get bulk-loaded unconditionally on every connect_id call. Compare with synthetics_test_suites.py which guards inside if resource_to_connect == "synthetics_tests":. Not a correctness issue now, just slightly over-broad.


Tests

  • Green/green regression tests are present and cover the right cases (composite remap, SLO alert, non-composite no-op, principals).
  • Red/green tests use real LocalFile + State rather than mocked storage — this is the right call for validating the lazy-load path end-to-end.
  • test_handles_empty_storage_gracefully asserting state._data.source.get("synthetics_tests") is None is a good guard against accidentally polluting the defaultdict with empty entries when storage returns nothing.
  • test_non_composite_does_not_trigger_bulk_load and test_slo_regular_monitor_does_not_trigger_bulk_load correctly assert that the happy/fast path doesn't pay the bulk-load cost — important perf property to lock in.

All CI checks green. The two nits above are minor and don't block merging.

riyazsh
riyazsh previously approved these changes Apr 27, 2026
@michael-richey michael-richey force-pushed the hamr/minimize-reads-3-id-targeted branch from e808545 to 4ddaf6e Compare April 27, 2026 17:36
Base automatically changed from hamr/minimize-reads-3-id-targeted to main April 27, 2026 18:31
@michael-richey michael-richey dismissed riyazsh’s stale review April 27, 2026 18:31

The base branch was changed.

riyazsh
riyazsh previously approved these changes Apr 27, 2026
michael-richey and others added 2 commits April 27, 2026 14:35
…n minimize-reads mode

In --minimize-reads mode, connect_id overrides in monitors, SLOs,
synthetics_test_suites, and synthetics_global_variables perform
endswith/startswith scans over all synthetics_tests entries. These scans
require the full type to be present, but minimize-reads only loads the
requested resource types up front.

Root cause: two variants —
1. monitors.py and service_level_objectives.py accessed
   state.destination["synthetics_tests"] eagerly at the top of connect_id,
   getting an empty defaultdict when synthetics_tests was not in scope.
2. synthetics_test_suites.py and synthetics_global_variables.py went
   through resource_connections but ensure_resource_loaded() cannot find
   compound keys (public_id#monitor_id) from a bare public_id.

Fix: add State.ensure_resource_type_loaded(resource_type) — a generic
lazy bulk-load method that loads an entire resource type from storage on
demand. Uses insert-if-absent merge to protect mid-sync destination
entries. Idempotent via _bulk_loaded_types set.

Call sites updated to invoke ensure_resource_type_loaded just before the
full-scan loop, keeping non-affected code paths untouched.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ans in minimize-reads mode

SyntheticsTests.connect_id() for the synthetics_tests -> synthetics_tests
path performs a startswith scan over compound keys ({public_id}#{monitor_id})
to remap subtestPublicId references. In minimize-reads mode only explicitly
requested test IDs are loaded, so referenced subtests are absent and the scan
finds nothing.

The framework-level ensure_resource_loaded() cannot compensate because it does
exact-key lookup and the state key is compound while the subtestPublicId is a
bare public_id.

Fix: add ensure_resource_type_loaded("synthetics_tests") before the scan,
mirroring the identical pattern already in synthetics_test_suites.py:101,
monitors.py:145, service_level_objectives.py:89, and
synthetics_global_variables.py:117.

RED/GREEN: TestSyntheticsTestsSelfRefMinimizeReads (3 tests) + GREEN/GREEN
regression: TestSyntheticsTestsConnectIdRegression (3 tests). Full suite: 409
passed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@michael-richey michael-richey force-pushed the hamr/minimize-reads-5-synthetics-bypass branch from 2100e48 to cd59981 Compare April 27, 2026 18:35
@michael-richey michael-richey merged commit 6f97c5e into main Apr 27, 2026
11 of 12 checks passed
@michael-richey michael-richey deleted the hamr/minimize-reads-5-synthetics-bypass branch April 27, 2026 19:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants