perf: switch YAML loading to CSafeLoader (11.1x speedup) by Trecek · Pull Request #2150 · TalonT-Org/AutoSkillit

Trecek · 2026-05-07T07:49:30Z

Summary

Replace yaml.safe_load() calls in src/autoskillit/core/io.py:load_yaml() with yaml.load(source, Loader=CSafeLoader), using LibYAML's C backend for an 11.1x speedup. All YAML loading in the codebase flows through this single function — recipe validation, config loading, experiment type registries, migration loading, and CLI validation all benefit from the improvement.

Requirements

Problem

load_yaml() in src/autoskillit/core/io.py:181 uses yaml.safe_load(), which defaults to the pure-Python SafeLoader even when LibYAML's C backend is available. LibYAML IS compiled in (PyYAML 6.0.3, __with_libyaml__ = True), but yaml.safe_load() does not use it automatically.
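A quick way to confirm the precondition described above is to inspect PyYAML's `__with_libyaml__` flag. The snippet below is a minimal sketch; `libyaml_available` is a hypothetical helper name and is not part of this PR. It matters because `yaml.safe_load(stream)` is defined as `yaml.load(stream, SafeLoader)`, so the C loader is never chosen unless it is requested explicitly.

```python
import importlib


def libyaml_available() -> bool:
    """Return True when PyYAML is installed and was built against LibYAML.

    Hypothetical helper for illustration; it is not part of the PR.
    """
    try:
        yaml = importlib.import_module("yaml")
    except ImportError:
        return False  # PyYAML itself is not installed
    # PyYAML sets this flag to True only when its C extension imported cleanly.
    return bool(getattr(yaml, "__with_libyaml__", False))


if __name__ == "__main__":
    print("LibYAML backend available:", libyaml_available())
```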

Measured Impact

Benchmark on 5 real recipe files, 100 iterations:

SafeLoader:  8.557s
CSafeLoader: 0.770s
Speedup:     11.1x

All YAML loading in the codebase flows through this single function. Recipe validation, config loading, experiment type/methodology tradition loading, migration loading — all benefit.
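The PR does not include the benchmark script, so the following is only a sketch of how a comparison like the one above could be reproduced. The helper names `time_loader` and `compare_loaders` are assumptions, and the caller has to supply the recipe texts, since the PR does not say which five files were used.

```python
import time
from typing import Callable, Dict, Iterable, List


def time_loader(
    load: Callable[[str], object],
    documents: Iterable[str],
    iterations: int = 100,
) -> float:
    """Return wall-clock seconds spent parsing every document `iterations` times."""
    docs: List[str] = list(documents)
    start = time.perf_counter()
    for _ in range(iterations):
        for text in docs:
            load(text)
    return time.perf_counter() - start


def compare_loaders(documents: Iterable[str], iterations: int = 100) -> Dict[str, float]:
    """Time SafeLoader and, when LibYAML is available, CSafeLoader."""
    import yaml  # PyYAML; imported here so time_loader stays dependency-free

    docs = list(documents)
    timings = {
        "SafeLoader": time_loader(
            lambda text: yaml.load(text, Loader=yaml.SafeLoader), docs, iterations
        )
    }
    if yaml.__with_libyaml__:
        timings["CSafeLoader"] = time_loader(
            lambda text: yaml.load(text, Loader=yaml.CSafeLoader), docs, iterations
        )
    return timings
```

Calling `compare_loaders` with the text of a few recipe files returns a dict of timings whose ratio gives the speedup. Absolute numbers will differ by machine and by file size.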

Fix

In src/autoskillit/core/io.py, change load_yaml() to use CSafeLoader with fallback:

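import yaml

# Prefer the LibYAML-backed C loader; fall back to the pure-Python
# SafeLoader when PyYAML was built without LibYAML.
try:
    from yaml import CSafeLoader as _Loader
except ImportError:
    from yaml import SafeLoader as _Loader

# Then in load_yaml():
data = yaml.load(text, Loader=_Loader)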

Risk

Low. CSafeLoader has minor behavioral differences from SafeLoader (stricter timestamp handling, different inf/nan behavior), but these are unlikely to affect recipe/config YAML. Run the full test suite to confirm.
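Beyond the test suite, the risk above could be spot-checked by parsing the same inputs with both loaders and reporting the first input on which they disagree. This is a sketch under assumptions: `first_disagreement` and `first_pyyaml_disagreement` are hypothetical names, and results are compared by `repr` so that two NaN values count as equal, which plain `==` would not do.

```python
from typing import Callable, Iterable, Optional


def _outcome(load: Callable[[str], object], text: str) -> str:
    """Describe what a loader did with `text`: the parsed value or the error type."""
    try:
        return "value: " + repr(load(text))
    except Exception as exc:  # broad on purpose: any failure is an outcome here
        return "raised: " + type(exc).__name__


def first_disagreement(
    samples: Iterable[str],
    load_a: Callable[[str], object],
    load_b: Callable[[str], object],
) -> Optional[str]:
    """Return the first sample the two loaders treat differently, else None."""
    for text in samples:
        if _outcome(load_a, text) != _outcome(load_b, text):
            return text
    return None


def first_pyyaml_disagreement(samples: Iterable[str]) -> Optional[str]:
    """Compare PyYAML's SafeLoader and CSafeLoader (requires a LibYAML build)."""
    import yaml  # PyYAML

    return first_disagreement(
        samples,
        lambda text: yaml.load(text, Loader=yaml.SafeLoader),
        lambda text: yaml.load(text, Loader=yaml.CSafeLoader),
    )
```

Feeding `first_pyyaml_disagreement` the repository's recipe and config files, plus a few hand-written timestamp and `.inf`/`.nan` cases, would show whether the differences named in the Risk note are reachable in practice.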

Files

src/autoskillit/core/io.py — single fix point

Conflict Resolution Decisions

The following files had merge conflicts that were automatically resolved.

Changed Files

Modified (●):

src/autoskillit/core/io.py
tests/core/test_io.py
tests/infra/test_schema_version_convention.py

Closes #2133

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/impl-20260507-000427-400052/.autoskillit/temp/make-plan/perf_switch_yaml_loading_to_csafeloader_plan_2026-05-07_000427.md

🤖 Generated with Claude Code via AutoSkillit

Token Usage Summary

Step	Model	count	uncached	output	cache_read	peak_ctx	turns	cache_write	time
plan	claude-opus-4-6	1	69	5.8k	506.1k	45.1k	40	35.1k	4m 4s
verify	claude-sonnet-4-6	1	35	6.3k	276.6k	49.7k	53	36.6k	3m 43s
implement*	MiniMax-M2.7-highspeed	1	173.7k	3.6k	450.1k	42.9k	35	53.7k	2m 4s
prepare_pr*	MiniMax-M2.7-highspeed	1	48.4k	2.9k	175.7k	29.8k	19	15.1k	1m 11s
compose_pr*	MiniMax-M2.7-highspeed	1	47.2k	1.5k	205.5k	29.8k	15	15.0k	42s
review_pr	claude-sonnet-4-6	2	176	27.9k	713.9k	48.7k	65	73.5k	7m 31s
resolve_review	claude-opus-4-6	2	94	14.3k	1.4M	57.2k	67	83.3k	12m 44s
Total			269.7k	62.2k	3.8M	57.2k		312.3k	32m 1s

* Step used a non-Anthropic provider; caching behavior may differ.

Token Efficiency

Step	LoC Changed	cache_read/LoC	cache_write/LoC	output/LoC
plan	0	—	—	—
verify	0	—	—	—
implement	58	7759.8	925.6	62.6
prepare_pr	0	—	—	—
compose_pr	0	—	—	—
review_pr	0	—	—	—
resolve_review	43	33229.8	1938.2	333.4
Total	101	37195.9	3092.5	616.3

Model Usage Breakdown

Model	steps	uncached	output	cache_read	cache_write	time
claude-opus-4-6	2	116	10.0k	1.2M	69.3k	14m 9s
claude-sonnet-4-6	1	35	6.3k	276.6k	36.6k	3m 43s
MiniMax-M2.7-highspeed	3	269.3k	8.0k	831.3k	83.8k	3m 58s

Replace yaml.safe_load() with yaml.load(Loader=_Loader) where _Loader is CSafeLoader (LibYAML C backend) with automatic fallback to SafeLoader when LibYAML is not compiled in. All YAML loading flows through load_yaml() in core/io.py, so the change propagates to all consumers.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…rized test

Merge test_load_yaml_uses_c_loader_when_available and test_load_yaml_path_uses_c_loader into a single parametrized test (str-input vs path-input). Add an assert for unexpected kwargs in the spy function.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Trecek and others added 4 commits May 7, 2026 00:14

fix: update JSON write site allowlist for line shift from CSafeLoader import

1d98759

The CSafeLoader import block added 5 lines to core/io.py, shifting write_versioned_json's atomic_write call from line 118 to line 123.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(review): add SafeLoader fallback when CSafeLoader import fails

11598b4

The except ImportError branch used a bare `pass`, which left _Loader unbound and caused a NameError on systems without LibYAML.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
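The failure mode in this commit can be reproduced without touching the real PyYAML install. The sketch below registers a hypothetical stand-in module, `fake_yaml`, that exposes `SafeLoader` but not `CSafeLoader`, mimicking a build without LibYAML. All names here are illustrative and none come from the repository. Inside a function the unbound name surfaces as `UnboundLocalError`, which is a subclass of `NameError`; at module level, as in core/io.py, it would be a plain `NameError`.

```python
import sys
import types

# Hypothetical stand-in for a PyYAML build without LibYAML: it exposes
# SafeLoader but not CSafeLoader. This is not the real yaml package.
fake_yaml = types.ModuleType("fake_yaml")
fake_yaml.SafeLoader = type("SafeLoader", (), {})
sys.modules["fake_yaml"] = fake_yaml


def pick_loader_buggy():
    """Pre-fix pattern: the except branch binds nothing."""
    try:
        from fake_yaml import CSafeLoader as _Loader
    except ImportError:
        pass
    return _Loader  # _Loader was never bound, so this raises


def pick_loader_fixed():
    """Post-fix pattern: the except branch binds the pure-Python loader."""
    try:
        from fake_yaml import CSafeLoader as _Loader
    except ImportError:
        from fake_yaml import SafeLoader as _Loader
    return _Loader


try:
    pick_loader_buggy()
    buggy_outcome = "ok"
except NameError as exc:
    buggy_outcome = type(exc).__name__

fixed_loader_name = pick_loader_fixed().__name__
```

The buggy variant only fails when the name is first used, which is why the problem stayed hidden on machines where LibYAML was present.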

Trecek added this pull request to the merge queue May 7, 2026

Merged via the queue into develop with commit bc7fc0e May 7, 2026
2 checks passed

Trecek deleted the perf-switch-yaml-loading-to-csafeloader-11-1x-speedup/2133 branch May 7, 2026 08:41

Trecek mentioned this pull request May 8, 2026

Promote develop to main (200 PRs, 160+ issues, 179 fixes, 480 features, 27 refactors, 22 infra) #2213

Merged

Trecek merged 4 commits into develop from perf-switch-yaml-loading-to-csafeloader-11-1x-speedup/2133
