perf: add mtime/lru_cache to uncached registry loaders #2153
Merged
Trecek merged 8 commits on May 7, 2026
Add caching to four registry-loading functions that re-read and re-parse YAML files on every call:

- load_ml_sub_area_folding(): @lru_cache(maxsize=1)
- list_migrations(): mtime-keyed dict cache
- load_all_experiment_types(): mtime-keyed dict cache
- load_all_methodology_traditions(): mtime-keyed dict cache

Cache keys use directory mtimes so stale entries are invalidated when YAML files change. Tests added for all four caching behaviors.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
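A minimal sketch of what an mtime-keyed dict cache for a directory-scanning loader could look like. The names `_CACHE`, `_parse_dir`, `load_all`, and `demo` are hypothetical and are not taken from the PR; the parsing step is a stand-in that only collects file stems instead of reading YAML.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path

# Hypothetical cache: directory -> (directory mtime at load time, parsed entries).
_CACHE: dict[Path, tuple[float, list[str]]] = {}


def _parse_dir(directory: Path) -> list[str]:
    """Stand-in for the real YAML parsing: returns the sorted file stems."""
    return sorted(p.stem for p in directory.glob("*.yaml"))


def load_all(directory: Path) -> list[str]:
    """Return parsed entries, re-scanning only when the directory mtime changed."""
    mtime = directory.stat().st_mtime
    cached = _CACHE.get(directory)
    if cached is not None and cached[0] == mtime:
        # Cache hit. Note this hands out the cached list itself; a later
        # commit in this PR switches the loaders to returning copies.
        return cached[1]
    entries = _parse_dir(directory)
    _CACHE[directory] = (mtime, entries)
    return entries


def demo() -> tuple[list[str], list[str]]:
    """Load once, add a file, force a newer directory mtime, load again."""
    with tempfile.TemporaryDirectory() as tmp:
        d = Path(tmp)
        (d / "a.yaml").write_text("x: 1\n")
        before = load_all(d)
        (d / "b.yaml").write_text("y: 2\n")
        st = d.stat()
        # Push the mtime forward explicitly so the change is always visible.
        os.utime(d, (st.st_atime, st.st_mtime + 10))
        after = load_all(d)
        return before, after
```

Keying on the directory mtime catches added, removed, or renamed files. Whether in-place edits to an existing file also bump the directory mtime depends on the filesystem and on how the file is written, so this sketch assumes assets are replaced rather than edited in place.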
Addresses three review findings:

- Duplicate _dir_mtime in experiment_type_registry.py and methodology_tradition_registry.py extracted to _registry_utils.py
- OSError sentinel changed from 0.0 to -1.0 to avoid epoch-mtime collision with missing directories

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All four registry loaders now return copies of cached lists to prevent caller mutation (e.g. in-place sort) from corrupting the cache. load_ml_sub_area_folding returns an immutable tuple instead. Test assertions updated: identity (is) → equality (==) + non-identity. test_loader mtime invalidation uses os.utime instead of time.sleep. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
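The mutation hazard described above can be illustrated with a small sketch. The function names `load_unsafe`, `load_safe`, and `load_folding` and the sample data are hypothetical; only the technique (return a copy of a cached list, or return an immutable tuple from an `lru_cache`d function) comes from the commit message.

```python
from __future__ import annotations

from functools import lru_cache

# Pretend this list was parsed from YAML once and then cached.
_CACHED: list[str] = ["b", "a", "c"]


def load_unsafe() -> list[str]:
    """Returns the cached list itself: a caller's in-place sort changes the cache."""
    return _CACHED


def load_safe() -> list[str]:
    """Returns a shallow copy, so in-place list operations stay with the caller."""
    return list(_CACHED)


@lru_cache(maxsize=1)
def load_folding() -> tuple[str, ...]:
    """Single cached value; a tuple cannot be sorted or appended to in place."""
    return ("x", "y")


def demo() -> tuple[list[str], list[str]]:
    """Sort a safe copy, then show that the cached order is unchanged."""
    mine = load_safe()
    mine.sort()
    return mine, load_safe()
```

The copy is shallow: it protects the list's order and membership, not the elements themselves, so this approach assumes the cached entries are treated as read-only.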
…er dirs 0.0 is a valid mtime (epoch) that could collide with a real directory. Use _MISSING_MTIME (-1.0) from _registry_utils for consistency and safety. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Consistent with experiment_type_registry and methodology_tradition_registry. Adds OSError safety via the shared utility from _registry_utils. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
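A sketch of the shared helper these commits describe, assuming it simply wraps `stat()` and falls back to the sentinel on `OSError`. The name `_MISSING_MTIME` and its value come from the commit messages; the body of `dir_mtime` and the `demo` function are assumptions, not the PR's actual code.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path

# Sentinel for "directory could not be stat'ed" (value from the commit message).
_MISSING_MTIME = -1.0


def dir_mtime(directory: Path) -> float:
    """Return the directory's mtime, or the sentinel if it cannot be read."""
    try:
        return directory.stat().st_mtime
    except OSError:
        return _MISSING_MTIME


def demo() -> tuple[float, float]:
    """Compare a directory whose mtime is the epoch with a missing directory."""
    with tempfile.TemporaryDirectory() as tmp:
        d = Path(tmp)
        os.utime(d, (0, 0))  # set atime and mtime to the Unix epoch
        epoch_mtime = dir_mtime(d)
        missing_mtime = dir_mtime(d / "does-not-exist")
        return epoch_mtime, missing_mtime
```

With a sentinel of 0.0 the two results would compare equal, so a cache keyed on the mtime could not tell an epoch-dated directory from a missing one; a negative sentinel keeps the two cases distinct.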
…ime tests Move `import os` and `import time` from inside test method bodies to module-level imports. Replace unreliable `time.sleep(0.05)` with explicit `os.utime()` for deterministic mtime advancement in cache invalidation tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
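The `os.utime()` approach mentioned above might look like the following helper. The name `advance_mtime` and the ten-second default are hypothetical; the point is that the test sets the new mtime explicitly instead of waiting for the clock to tick.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path


def advance_mtime(path: Path, seconds: float = 10.0) -> float:
    """Move path's mtime forward by `seconds` and return the new mtime."""
    st = path.stat()
    os.utime(path, (st.st_atime, st.st_mtime + seconds))
    return path.stat().st_mtime


def demo() -> tuple[float, float]:
    """Return a directory's mtime before and after advancing it."""
    with tempfile.TemporaryDirectory() as tmp:
        d = Path(tmp)
        before = d.stat().st_mtime
        after = advance_mtime(d)
        return before, after
```

A short `time.sleep()` only changes the mtime if the filesystem timestamp granularity is finer than the sleep, which is why it can fail intermittently; an explicit offset does not depend on timing.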
Use autouse fixture with yield to clear load_ml_sub_area_folding cache in both setup and teardown, preventing cache state leakage to other tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
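A sketch of such a fixture, assuming pytest is available. The loader here is a stand-in that reuses the real function's name for readability; the fixture name `_clear_folding_cache` is hypothetical.

```python
from __future__ import annotations

from functools import lru_cache
from typing import Iterator

import pytest


@lru_cache(maxsize=1)
def load_ml_sub_area_folding() -> tuple[str, ...]:
    """Stand-in for the real loader; returns fixed data instead of reading YAML."""
    return ("a", "b")


@pytest.fixture(autouse=True)
def _clear_folding_cache() -> Iterator[None]:
    """Clear the lru_cache before and after every test in this module."""
    load_ml_sub_area_folding.cache_clear()  # setup: start from an empty cache
    yield
    load_ml_sub_area_folding.cache_clear()  # teardown: leave nothing behind
```

Clearing on both sides of the `yield` means a test neither inherits a cached value from an earlier test nor leaves one for a later test, regardless of execution order.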
…pe file count Replace cross-package import of dir_mtime with inline try/except to satisfy REQ-IMP-001 cross-package submodule import rule. Bump recipe/ file count exemption from 29 to 30 for _registry_utils.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Trecek added a commit that referenced this pull request on May 8, 2026
## Summary

Four registry-loading functions re-read and re-parse YAML files from disk on every call with no caching. All four load static bundled assets that change only on package updates. This PR adds mtime-keyed dict caches to the three directory-scanning functions and `@lru_cache(maxsize=1)` to the single-file function, consistent with existing caching patterns in the codebase (`DefaultRecipeRepository._get_list()` in `repository.py`, `_block_budgets()` in `rules_blocks.py`).

| Function | Module | Cache Strategy |
|----------|--------|----------------|
| `load_all_experiment_types()` | `recipe/experiment_type_registry.py` | mtime-keyed dict cache |
| `load_all_methodology_traditions()` | `recipe/methodology_tradition_registry.py` | mtime-keyed dict cache |
| `list_migrations()` | `migration/loader.py` | mtime-keyed dict cache |
| `load_ml_sub_area_folding()` | `recipe/methodology_venue_appendix.py` | `@lru_cache(maxsize=1)` |

Closes #2138

## Implementation Plan

Plan file: `/home/talon/projects/autoskillit-runs/impl-20260507-014202-057975/.autoskillit/temp/make-plan/perf_add_caching_to_uncached_registry_loaders_plan_2026-05-07_015049.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via AutoSkillit

<!-- autoskillit:pipeline-signature steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr -->

## Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-sonnet-4-6 | 1 | 77 | 18.3k | 521.7k | 55.6k | 73 | 47.9k | 12m 2s |
| verify | claude-sonnet-4-6 | 1 | 30 | 5.3k | 455.4k | 61.9k | 99 | 35.0k | 5m 16s |
| implement* | MiniMax-M2.7-highspeed | 1 | 1.4M | 16.3k | 1.2M | 70.4k | 96 | 80.3k | 5m 18s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 88.9k | 3.1k | 235.3k | 29.8k | 20 | 15.3k | 1m 16s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 116.3k | 2.5k | 294.9k | 29.8k | 25 | 15.0k | 1m 1s |
| **Total** | | | 1.6M | 45.5k | 2.7M | 70.4k | | 193.5k | 24m 55s |

\* *Step used a non-Anthropic provider; caching behavior may differ.*

## Token Efficiency

| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | — | — | — |
| verify | 0 | — | — | — |
| implement | 254 | 4712.8 | 316.1 | 64.0 |
| prepare_pr | 0 | — | — | — |
| compose_pr | 0 | — | — | — |
| **Total** | **254** | 10646.7 | 761.8 | 179.0 |

## Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-sonnet-4-6 | 2 | 107 | 23.6k | 977.0k | 82.9k | 17m 19s |
| MiniMax-M2.7-highspeed | 3 | 1.6M | 21.8k | 1.7M | 110.6k | 7m 35s |

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>