perf: switch YAML loading to CSafeLoader (11.1x speedup) #2150
Merged
Trecek merged 4 commits intoMay 7, 2026
Replace yaml.safe_load() with yaml.load(Loader=_Loader) where _Loader is CSafeLoader (LibYAML C backend) with automatic fallback to SafeLoader when LibYAML is not compiled in. All YAML loading flows through load_yaml() in core/io.py, so the change propagates to all consumers. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… import The CSafeLoader import block added 5 lines to core/io.py, shifting write_versioned_json's atomic_write call from line 118 to line 123. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The except ImportError branch was using bare `pass`, leaving _Loader unbound and causing NameError on systems without LibYAML. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rized test Merge test_load_yaml_uses_c_loader_when_available and test_load_yaml_path_uses_c_loader into a single parametrized test (str-input vs path-input). Add assert for unexpected kwargs in the spy function. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Trecek added a commit that referenced this pull request May 8, 2026
## Summary
Replace `yaml.safe_load()` calls in
`src/autoskillit/core/io.py:load_yaml()` with `yaml.load(source,
Loader=CSafeLoader)`, using LibYAML's C backend for an 11.1x speedup.
All YAML loading in the codebase flows through this single function —
recipe validation, config loading, experiment type registries, migration
loading, and CLI validation all benefit from the improvement.
## Requirements
### Problem
`load_yaml()` in `src/autoskillit/core/io.py:181` uses
`yaml.safe_load()`, which always uses the pure-Python `SafeLoader`, even
when LibYAML's C backend is available. LibYAML is compiled in here
(PyYAML 6.0.3, `__with_libyaml__ = True`), but `yaml.safe_load()` never
selects the C loader on its own.
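
To check whether the C backend is available in a given environment, something like the following sketch may help. `describe_yaml_backend` is a hypothetical helper name, not part of the PR; it only reads attributes that PyYAML itself exposes (`__version__`, `__with_libyaml__`, `CSafeLoader`).

```python
import yaml


def describe_yaml_backend() -> dict:
    """Report whether the LibYAML-backed loader is usable in this interpreter.

    Hypothetical helper for illustration; not part of autoskillit.
    """
    return {
        "pyyaml_version": yaml.__version__,
        # Set by PyYAML at import time: True when the C extension was importable.
        "with_libyaml": bool(getattr(yaml, "__with_libyaml__", False)),
        # CSafeLoader is only exported when the C extension is present.
        "c_loader_importable": hasattr(yaml, "CSafeLoader"),
    }


info = describe_yaml_backend()
print(info)
```

If `with_libyaml` is `False`, the fallback path described under Fix is the one that would be exercised.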
### Measured Impact
Benchmark on 5 real recipe files, 100 iterations:
```
SafeLoader: 8.557s
CSafeLoader: 0.770s
Speedup: 11.1x
```
Because all YAML loading in the codebase flows through this single
function, recipe validation, config loading, experiment type/methodology
tradition loading, and migration loading all benefit.
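
A comparison along these lines can be reproduced with `timeit`. The sketch below is not the benchmark script used for the numbers above: `SAMPLE` is a synthetic stand-in for a recipe file, and the iteration count is kept small so it runs quickly. Absolute timings and the ratio will vary by machine and input.

```python
import timeit

import yaml

# Assumption: a synthetic document standing in for the 5 real recipe files.
SAMPLE = "\n".join(
    f"step_{i}:\n  tool: run_cmd\n  args: [a, b, c]\n  retries: {i}"
    for i in range(200)
)


def bench(loader, iterations: int = 10) -> float:
    """Return total seconds spent parsing SAMPLE `iterations` times."""
    return timeit.timeit(lambda: yaml.load(SAMPLE, Loader=loader), number=iterations)


py_time = bench(yaml.SafeLoader)
print(f"SafeLoader:  {py_time:.3f}s")

# CSafeLoader only exists when PyYAML was built with LibYAML.
c_loader = getattr(yaml, "CSafeLoader", None)
if c_loader is not None:
    c_time = bench(c_loader)
    print(f"CSafeLoader: {c_time:.3f}s")
    print(f"Speedup:     {py_time / c_time:.1f}x")
else:
    print("CSafeLoader unavailable; nothing to compare against.")
```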
### Fix
In `src/autoskillit/core/io.py`, change `load_yaml()` to use
`CSafeLoader` with fallback:
```python
import yaml

# Prefer the LibYAML-backed loader; fall back to pure Python if it is missing.
try:
    from yaml import CSafeLoader as _Loader
except ImportError:
    from yaml import SafeLoader as _Loader

# Then in load_yaml():
data = yaml.load(text, Loader=_Loader)
```
```
### Risk
Low. CSafeLoader has minor behavioral differences from SafeLoader
(stricter timestamp handling, different inf/nan behavior), but these are
unlikely to affect recipe/config YAML. Run the full test suite to
confirm.
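
Alongside the test suite, a quick spot check can compare both loaders on representative documents. This is a minimal sketch: `loaders_agree` is a hypothetical helper, and the samples are made-up snippets rather than real recipe or config files.

```python
import yaml


def loaders_agree(text: str) -> bool:
    """Return True when SafeLoader and CSafeLoader parse `text` identically."""
    c_loader = getattr(yaml, "CSafeLoader", None)
    if c_loader is None:
        # Without LibYAML there is only one loader, so nothing can differ.
        return True
    return yaml.load(text, Loader=yaml.SafeLoader) == yaml.load(text, Loader=c_loader)


# Assumption: illustrative inputs, not files from the repository.
samples = [
    "name: demo\nretries: 3\nenabled: true",
    "steps:\n  - plan\n  - implement\n",
    "ratio: 0.25\nnote: 'quoted text'",
]
results = [loaders_agree(s) for s in samples]
print(results)
```

Note that equality comparison is not meaningful for documents containing `.nan`, since NaN never compares equal to itself; such values would need a separate check.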
### Files
- `src/autoskillit/core/io.py` — single fix point
## Conflict Resolution Decisions
The following files had merge conflicts that were automatically
resolved.
## Changed Files
### Modified (●):
- `src/autoskillit/core/io.py`
- `tests/core/test_io.py`
- `tests/infra/test_schema_version_convention.py`
Closes #2133
## Implementation Plan
Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260507-000427-400052/.autoskillit/temp/make-plan/perf_switch_yaml_loading_to_csafeloader_plan_2026-05-07_000427.md`
🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->
## Token Usage Summary
| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-opus-4-6 | 1 | 69 | 5.8k | 506.1k | 45.1k | 40 | 35.1k | 4m 4s |
| verify | claude-sonnet-4-6 | 1 | 35 | 6.3k | 276.6k | 49.7k | 53 | 36.6k | 3m 43s |
| implement* | MiniMax-M2.7-highspeed | 1 | 173.7k | 3.6k | 450.1k | 42.9k | 35 | 53.7k | 2m 4s |
| fix | claude-opus-4-6 | 1 | 47 | 4.2k | 714.1k | 47.1k | 33 | 34.2k | 10m 5s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 48.4k | 2.9k | 175.7k | 29.8k | 19 | 15.1k | 1m 11s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 47.2k | 1.5k | 205.5k | 29.8k | 15 | 15.0k | 42s |
| **Total** | | | 269.5k | 24.3k | 2.3M | 49.7k | | 189.8k | 21m 51s |
\* *Step used a non-Anthropic provider; caching behavior may differ.*
## Token Efficiency
| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | — | — | — |
| verify | 0 | — | — | — |
| implement | 58 | 7759.8 | 925.6 | 62.6 |
| fix | 2 | 357047.5 | 17119.5 | 2124.5 |
| prepare_pr | 0 | — | — | — |
| compose_pr | 0 | — | — | — |
| **Total** | **60** | 38801.3 | 3162.7 | 404.8 |
## Model Usage Breakdown
| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-opus-4-6 | 2 | 116 | 10.0k | 1.2M | 69.3k | 14m 9s |
| claude-sonnet-4-6 | 1 | 35 | 6.3k | 276.6k | 36.6k | 3m 43s |
| MiniMax-M2.7-highspeed | 3 | 269.3k | 8.0k | 831.3k | 83.8k | 3m 58s |
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>