Fix TOML compliance and pin TOML/YAML test suites to specific SHAs #576
Add ValidateInput pre-scan to the TOML parser that strips a UTF-8 BOM at file start and rejects forbidden control characters (U+0000–U+0008, U+000B–U+000C, U+000E–U+001F, U+007F) before parsing begins. This fixes 6 upstream toml-test failures (4 false accepts for vertical tab in bare values, 2 false rejects for BOM-prefixed files) without affecting ECMAScript engine conformance.

Pin both run_toml_test_suite.py and run_yaml_test_suite.py to specific upstream SHAs instead of cloning branch HEAD, matching the existing test262 pattern. Add weekly cron workflows and bump scripts for both suites. All three bump workflows now use a fixed branch name so an unmerged PR is updated in place rather than replaced.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
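For readers who want to see the pre-scan rule concretely, here is a minimal Python sketch. It is not the parser's actual code: the function name `prescan_toml` and the error type are hypothetical, and only the BOM handling and the forbidden code point ranges are taken from the commit message above.

```python
BOM = "\ufeff"  # U+FEFF, what a UTF-8 BOM decodes to


def is_forbidden_control(cp: int) -> bool:
    """Ranges from the commit message: tab, LF and CR stay allowed."""
    return (
        0x00 <= cp <= 0x08
        or 0x0B <= cp <= 0x0C
        or 0x0E <= cp <= 0x1F
        or cp == 0x7F
    )


def prescan_toml(text: str) -> str:
    """Strip a leading BOM, then reject forbidden control characters.

    Hypothetical helper: returns the text the parser should consume,
    or raises ValueError before any parsing starts.
    """
    if text.startswith(BOM):
        text = text[len(BOM):]
    for offset, ch in enumerate(text):
        if is_forbidden_control(ord(ch)):
            raise ValueError(
                f"forbidden control character U+{ord(ch):04X} at offset {offset}"
            )
    return text
```

Under these assumptions, a BOM-prefixed document is accepted with the BOM removed, while a vertical tab (U+000B) anywhere in the input is rejected up front.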
Walkthrough
This PR establishes automated weekly pinning of test suite SHAs across test262, toml-test, and yaml-test-suite. It introduces a shared Bun utility script, updates Python test runners to use SHA-based checkouts, adds three CI workflows (test262-bump, toml-test-bump, yaml-test-bump), enhances TOML parser input validation, and documents the new automation. Configuration and documentation updates reflect these infrastructure changes.
Changes: Test Suite Automation and TOML Parser Enhancement
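A SHA-based checkout can be sketched as follows. This is an illustration only, not the code in `run_toml_test_suite.py` or `run_yaml_test_suite.py`: the helper names are hypothetical, and fetching a single commit with `git fetch --depth 1 origin <sha>` is an assumption about how such a runner might avoid cloning branch HEAD.

```python
import re
import subprocess
from pathlib import Path

FULL_SHA = re.compile(r"[0-9a-f]{40}")


def pinned_checkout_commands(repo_url: str, sha: str) -> list[list[str]]:
    """Build the git commands that fetch exactly one pinned commit."""
    if not FULL_SHA.fullmatch(sha):
        # Refuse branch names or short SHAs so the pin stays unambiguous.
        raise ValueError(f"expected a full 40-character commit SHA, got {sha!r}")
    return [
        ["git", "init", "--quiet"],
        ["git", "remote", "add", "origin", repo_url],
        ["git", "fetch", "--quiet", "--depth", "1", "origin", sha],
        ["git", "checkout", "--quiet", "--detach", "FETCH_HEAD"],
    ]


def checkout_pinned(repo_url: str, sha: str, dest: Path) -> None:
    """Run the commands inside dest; raises CalledProcessError on failure."""
    dest.mkdir(parents=True, exist_ok=True)
    for cmd in pinned_checkout_commands(repo_url, sha):
        subprocess.run(cmd, cwd=dest, check=True)
```

Because the SHA is validated before any command runs, a stray branch name such as `main` fails fast instead of silently tracking a moving target.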
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Suite Timing
Test Runner (interpreted: 8,914 passed; bytecode: 8,914 passed)
Memory
GC rows aggregate the main thread plus all worker thread-local GCs. Test runner worker shutdown frees thread-local heaps in bulk; that shutdown reclamation is not counted as GC collections or collected objects.
Benchmarks (interpreted: 407; bytecode: 407)
Memory
GC rows aggregate the main thread plus all worker thread-local GCs. Benchmark runner performs explicit between-file collections, so collection and collected-object counts can be much higher than the test runner.
Measured on ubuntu-latest x64.
Benchmark Results
407 benchmarks
Interpreted: 🟢 67 improved · 🔴 36 regressed · 304 unchanged · avg +0.8%
arraybuffer.js — Interp: 🟢 1, 13 unch. · avg +2.1% · Bytecode: 🟢 6, 🔴 3, 5 unch. · avg +2.5%
arrays.js — Interp: 🟢 2, 17 unch. · avg +0.6% · Bytecode: 🟢 10, 9 unch. · avg +4.4%
async-await.js — Interp: 6 unch. · avg +1.0% · Bytecode: 🟢 3, 3 unch. · avg +2.7%
async-generators.js — Interp: 2 unch. · avg +0.4% · Bytecode: 2 unch. · avg +0.4%
base64.js — Interp: 🔴 1, 9 unch. · avg -0.6% · Bytecode: 🟢 6, 🔴 1, 3 unch. · avg +3.0%
classes.js — Interp: 🟢 4, 🔴 1, 26 unch. · avg +1.1% · Bytecode: 🟢 15, 16 unch. · avg +2.9%
closures.js — Interp: 🔴 1, 10 unch. · avg -1.7% · Bytecode: 🟢 10, 1 unch. · avg +8.6%
collections.js — Interp: 🟢 1, 11 unch. · avg +0.4% · Bytecode: 🟢 10, 2 unch. · avg +5.8%
csv.js — Interp: 13 unch. · avg -0.2% · Bytecode: 🟢 10, 3 unch. · avg +6.9%
destructuring.js — Interp: 🟢 1, 🔴 1, 20 unch. · avg -0.0% · Bytecode: 🟢 9, 🔴 1, 12 unch. · avg +2.9%
fibonacci.js — Interp: 8 unch. · avg -0.2% · Bytecode: 🟢 5, 3 unch. · avg +5.1%
float16array.js — Interp: 🟢 7, 🔴 2, 23 unch. · avg +0.7% · Bytecode: 🟢 15, 🔴 5, 12 unch. · avg +0.1%
for-of.js — Interp: 🟢 1, 6 unch. · avg +0.2% · Bytecode: 🟢 4, 3 unch. · avg +5.5%
generators.js — Interp: 4 unch. · avg +1.7% · Bytecode: 🟢 1, 🔴 1, 2 unch. · avg +0.5%
iterators.js — Interp: 🟢 11, 🔴 1, 30 unch. · avg +1.1% · Bytecode: 🟢 35, 7 unch. · avg +6.2%
json.js — Interp: 🟢 7, 🔴 1, 12 unch. · avg +1.8% · Bytecode: 🟢 11, 🔴 1, 8 unch. · avg +4.0%
jsx.jsx — Interp: 🔴 6, 15 unch. · avg -1.7% · Bytecode: 🟢 5, 16 unch. · avg +2.0%
modules.js — Interp: 🔴 1, 8 unch. · avg -0.2% · Bytecode: 🟢 8, 1 unch. · avg +8.1%
numbers.js — Interp: 🟢 3, 8 unch. · avg +1.5% · Bytecode: 🟢 10, 1 unch. · avg +9.4%
objects.js — Interp: 🔴 4, 3 unch. · avg -2.7% · Bytecode: 🟢 3, 4 unch. · avg +4.1%
promises.js — Interp: 🟢 3, 9 unch. · avg +2.0% · Bytecode: 🟢 8, 4 unch. · avg +4.8%
regexp.js — Interp: 🟢 2, 9 unch. · avg +1.0% · Bytecode: 🟢 10, 1 unch. · avg +7.2%
strings.js — Interp: 🔴 3, 16 unch. · avg -1.9% · Bytecode: 🟢 15, 4 unch. · avg +5.1%
tsv.js — Interp: 🟢 3, 6 unch. · avg +2.7% · Bytecode: 🟢 7, 2 unch. · avg +10.1%
typed-arrays.js — Interp: 🟢 6, 🔴 5, 11 unch. · avg -3.3% · Bytecode: 🟢 6, 🔴 9, 7 unch. · avg +1.8%
uint8array-encoding.js — Interp: 🟢 9, 🔴 7, 2 unch. · avg -7.9% · Bytecode: 🟢 13, 🔴 3, 2 unch. · avg +13.3%
weak-collections.js — Interp: 🟢 6, 🔴 2, 7 unch. · avg +24.6% · Bytecode: 🟢 9, 🔴 4, 2 unch. · avg -8.4%
Deterministic profile diff: no significant changes.
Measured on ubuntu-latest x64. Benchmark ranges compare cached main-branch min/max ops/sec with the PR run; overlapping ranges are treated as unchanged noise. Percentage deltas are secondary context.
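The range comparison described above can be sketched in a few lines of Python. This is a hedged illustration, not the benchmark tool's code: the function name `classify` is hypothetical, and it assumes the PR run also yields a min/max ops/sec pair and that higher ops/sec is better.

```python
def classify(main_min: float, main_max: float,
             pr_min: float, pr_max: float) -> str:
    """Compare two ops/sec ranges; overlapping ranges count as noise."""
    if pr_min > main_max:
        return "improved"   # PR range lies entirely above the baseline
    if pr_max < main_min:
        return "regressed"  # PR range lies entirely below the baseline
    return "unchanged"      # ranges overlap, so treat the delta as noise
```

With this rule a benchmark can show a nonzero percentage delta and still be reported as unchanged, which is why the percentages are only secondary context.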
test262 Conformance
Per-test deltas (+1 / -0): 1 newly passing test.
Steady-state failures are non-blocking; regressions vs the cached main baseline (lower total pass count, or any PASS → non-PASS transition) fail the conformance gate. Measured on ubuntu-latest x64, bytecode mode. Areas grouped by the first two test262 path components; minimum 25 attempted tests, areas already at 100% excluded. Δ vs main compares against the most recent cached |
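The gate rule stated above (a lower total pass count, or any PASS → non-PASS transition, fails the gate) can be written as a small Python sketch. The function name and the dictionary shape are hypothetical, and treating a test that is missing from the current run as non-PASS is an assumption, not something the report states.

```python
def conformance_gate_fails(baseline: dict[str, str],
                           current: dict[str, str]) -> bool:
    """Return True when the PR run regresses against the cached baseline."""
    baseline_passes = sum(1 for status in baseline.values() if status == "PASS")
    current_passes = sum(1 for status in current.values() if status == "PASS")
    if current_passes < baseline_passes:
        return True  # lower total pass count
    # Any test that passed on main but does not pass now is a regression,
    # even if other newly passing tests keep the total the same.
    return any(
        status == "PASS" and current.get(test) != "PASS"
        for test, status in baseline.items()
    )
```

Steady-state failures (tests that fail in both runs) never trip this check, which matches the statement that they are non-blocking.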
Collapse toml-test-bump-pin.ts and yaml-test-bump-pin.ts into a single suite-bump-pin.ts that accepts the target file as its first argument.

Rename ValidateInput to PrepareInput since it both strips the UTF-8 BOM (mutating FIndex) and validates control characters.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
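The idea of one bump script parameterised by target file can be sketched in Python. The real suite-bump-pin.ts is a Bun/TypeScript script whose contents are not shown here, so everything below is assumed for illustration: the `PINNED_SHA = "..."` line format, the function names, and passing the new SHA as a second argument.

```python
import re
from pathlib import Path

# Hypothetical pin format: a single line such as PINNED_SHA = "<40 hex chars>"
PIN_LINE = re.compile(r'^(PINNED_SHA\s*=\s*")[0-9a-f]{40}(")', re.MULTILINE)
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def bump_pin(source: str, new_sha: str) -> str:
    """Return source with its single pinned SHA replaced by new_sha."""
    if not FULL_SHA.fullmatch(new_sha):
        raise ValueError(f"expected a full 40-character SHA, got {new_sha!r}")
    # A callable replacement avoids backreference ambiguity with hex digits.
    updated, count = PIN_LINE.subn(
        lambda m: m.group(1) + new_sha + m.group(2), source
    )
    if count != 1:
        raise ValueError(f"expected exactly one pin line, found {count}")
    return updated


def main(argv: list[str]) -> None:
    """argv[1] is the target file (as in the commit); argv[2] is assumed."""
    target, new_sha = Path(argv[1]), argv[2]
    text = target.read_text(encoding="utf-8")
    target.write_text(bump_pin(text, new_sha), encoding="utf-8")
```

Taking the target file as an argument is what lets one script serve both runners; the replacement logic itself does not need to know which suite it is bumping.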
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- Add a `ValidateInput` pre-scan to the TOML parser that strips a UTF-8 BOM at file start and rejects forbidden control characters before parsing begins (Option C: input-level validation so every downstream consumer is protected).
- Pin `run_toml_test_suite.py` and `run_yaml_test_suite.py` to specific upstream SHAs instead of cloning branch HEAD, matching the existing test262 pattern.
- Add weekly cron workflows (`toml-test-bump.yml`, `yaml-test-bump.yml`) and bump scripts (`toml-test-bump-pin.ts`, `yaml-test-bump-pin.ts`) for both suites.
- All three bump workflows now use a fixed branch name (`chore/<suite>-bump`) so an unmerged PR is updated in place rather than replaced.
- Add `*.or` and `build/GocciaTOMLCheck` to `.gitignore` to prevent build artifacts from being staged.

Testing