fix(runtime): built-ins/String test262 parity #4754
Merged
Argument-coercion and whitespace fixes across String.prototype methods, lifting built-ins/String from 752 to 814 passing (zero regressions).

Root causes:
- split(separator, limit) skipped ToString(separator)/ToUint32(limit) and treated undefined separator as empty-string (per-char split). New js_string_split_value takes boxed args: ToUint32(limit) before ToString(separator) (spec order; either may throw), undefined -> [S], limit 0 -> [], RegExp-separator delegation.
- indexOf/lastIndexOf bit-cast an object searchString as a string handle and raw-fptosi'd the position (skipping valueOf). Now ToString(searchString) via js_string_coerce before ToNumber(position) via js_string_index_to_i32 / js_number_coerce.
- search/match required a RegExp arg (returned -1/null otherwise). New js_string_search_value/js_string_match_value coerce any non-RegExp arg via RegExpCreate(ToString(arg)); undefined -> empty /(?:)/. match/search now accept 0 args. (match(undefined) previously SIGSEGV'd: a null pattern pointer was deref'd by lookup_fancy_regex; now an empty StringHeader.)
- trim/trimStart/trimEnd used Rust's Unicode White_Space, which excludes U+FEFF (BOM, a JS WhiteSpace) and includes U+0085 (NEL, not JS). Added an is_js_whitespace predicate matching ECMA-262 §22.1.3.32.

Files: lower_string_method.rs (codegen), runtime_decls/strings_part2.rs, native_call_method.rs, regex.rs, string/slice_ops.rs, string/split.rs.

built-ins/String: 752 -> 814 pass; 160 -> 102 runtime-fail; 20 -> 16 compile-fail; 0 regressions.
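The split and indexOf coercion rules described above can be observed from plain JavaScript. The following is a minimal sketch of the behaviour the spec requires, not a test taken from the PR; the `tracked` helper and the `log` array are hypothetical names introduced here to record the order in which the coercion hooks run.

```javascript
// Records the order in which user-visible coercion hooks are invoked.
const log = [];

// Hypothetical helper: builds an object whose toString/valueOf log their calls.
function tracked(label, value) {
  return {
    toString() { log.push(label + ".toString"); return String(value); },
    valueOf() { log.push(label + ".valueOf"); return value; },
  };
}

// An undefined separator yields the whole string as the only element.
const whole = "a,b,c".split(undefined);

// A limit of 0 yields an empty array.
const none = "a,b,c".split(",", 0);

// ToUint32(limit) runs before ToString(separator).
log.length = 0;
const parts = "a,b,c".split(tracked("separator", ","), tracked("limit", 2));
const splitOrder = log.slice();

// indexOf: ToString(searchString) runs before ToNumber(position).
log.length = 0;
const idx = "abcabc".indexOf(tracked("search", "c"), tracked("position", 3));
const indexOfOrder = log.slice();

console.log(whole, none, parts, splitOrder);
console.log(idx, indexOfOrder);
```

Logging the hook order rather than only the results matters here because an implementation can return the right array while still coercing in the wrong order, which only shows up when one of the hooks throws.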
proggeramlug pushed a commit that referenced this pull request on Jun 7, 2026
Follow-up to #4754. built-ins/String 814 -> 840 passing, 0 regressions.

- String.prototype.search dispatch arm returned a NaN-boxed INT32 instead of a raw f64, so `new String(s).search(x) === N` (and generic-this `__o.search = String.prototype.search` forms) strict-compared unequal to a number literal even though the index was correct. Return `i32 as f64`, matching the indexOf arm. (search 19/31 -> 31/31)
- new String(...) wrapper (type Named("String")) mis-routed the Array/String shared methods indexOf/includes/slice/lastIndexOf to ArrayIndexOf/etc. (read the wrapper as an array -> -1/false). Treat the boxed String wrapper as a string in try_local_array_methods so they route to the ToString-coercing string dispatch.
- replace/replaceAll did not ToString-coerce a non-RegExp searchValue or a non-function replaceValue (object args with throwing toString didn't throw, per ECMA-262 §22.1.3.19 order: searchValue before replaceValue).
- concat bit-cast its args as string handles instead of ToString-coercing them, dropping `undefined`/booleans ("lego".concat(undefined) -> "lego"). Now ToString-coerces each non-static-string arg (§22.1.3.5).
- padStart/padEnd bit-cast a non-string fillString, dropping it. New js_string_pad_fill ToString-coerces the fill (undefined -> default " ").

Files: native_call_method.rs, string/pad.rs (runtime), lower_string_method.rs, runtime_decls/strings_part2.rs (codegen), lower/expr_call/local_array_methods.rs (HIR).
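The follow-up fixes correspond to behaviour that can be checked directly in any conforming engine. The snippet below is a hedged sketch of those expectations rather than code from the commit; `order`, `searchObj`, `replaceObj`, and `generic` are illustrative names assumed here.

```javascript
// concat ToString-coerces every argument, including undefined and booleans.
const joined = "lego".concat(undefined, true, 1);

// padStart: an undefined fill falls back to " ", other values are ToString-coerced.
const padDefault = "7".padStart(3, undefined);
const padCoerced = "7".padStart(3, 0);

// A String wrapper object must behave as a string, not as an array.
const boxed = new String("hello");
const wrapperResults = [
  boxed.indexOf("l"),
  boxed.lastIndexOf("l"),
  boxed.includes("ell"),
  boxed.slice(1, 3),
];

// search returns an ordinary number, so strict equality with a literal holds.
const strictEqual = boxed.search("l") === 2;

// Generic-this form: search borrowed onto a plain object.
const generic = { search: String.prototype.search, toString() { return "abc"; } };
const genericIndex = generic.search("c");

// replace coerces searchValue before replaceValue.
const order = [];
const searchObj = { toString() { order.push("searchValue"); return "b"; } };
const replaceObj = { toString() { order.push("replaceValue"); return "X"; } };
const replaced = "abc".replace(searchObj, replaceObj);

console.log(joined, JSON.stringify(padDefault), padCoerced);
console.log(wrapperResults, strictEqual, genericIndex, replaced, order);
```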
proggeramlug added a commit that referenced this pull request on Jun 7, 2026
…4755) Follow-up to #4754. built-ins/String 814 -> 840 passing, 0 regressions.

Co-authored-by: Ralph Küpper <ralph@skelpo.com>
Lifts built-ins/String test262 parity from 752 → 814 passing (runtime-fail 160 → 102, compile-fail 20 → 16), zero regressions (no test that passed before now fails). Differential vs Node on the pinned corpus (tc39/test262 @ 4249661388).
Root causes fixed
1. `String.prototype.split(separator, limit)` argument coercion. Split skipped `ToString(separator)`/`ToUint32(limit)` and treated an `undefined` separator as an empty-string separator (per-character split) instead of returning `[wholeString]`. New `js_string_split_value` takes the boxed separator + limit and follows §22.1.3.21 exactly: `ToUint32(limit)` runs before `ToString(separator)` (spec order; either may run user `valueOf`/`toString` and throw), `undefined` separator → `[S]`, `limit === 0` → `[]`, and a RegExp separator delegates to the regex splitter. (~22 tests)
2. `indexOf`/`lastIndexOf` argument coercion. An object `searchString` was bit-cast as a string handle (→ garbage → -1) and the `position` was raw-`fptosi`'d (skipping `valueOf`). Now `ToString(searchString)` via `js_string_coerce` runs before `ToNumber(position)` via `js_string_index_to_i32`/`js_number_coerce`. (~18 tests)
3. `search`/`match` regex coercion. Both required an actual RegExp argument (returned -1/null for a string/undefined/`{toString}` arg). New `js_string_search_value`/`js_string_match_value` coerce any non-RegExp arg via `RegExpCreate(ToString(arg))`, with `undefined` → the empty `/(?:)/` regex; `match`/`search` now also accept 0 args. `match(undefined)` previously SIGSEGV'd: a null pattern pointer was dereferenced by `lookup_fancy_regex`; now an empty `StringHeader` is built. (~30 tests)
4. `trim`/`trimStart`/`trimEnd` whitespace set. Used Rust's Unicode `White_Space`, which excludes U+FEFF (BOM, a JS `WhiteSpace`) and includes U+0085 (NEL, not JS whitespace). Added an `is_js_whitespace` predicate matching ECMA-262 §22.1.3.32 (`WhiteSpace` + `LineTerminator`). (~7 tests)

Files
- `crates/perry-codegen/src/lower_string_method.rs`: needle/position/separator/regex-arg coercion + 0-arg `match`/`search`
- `crates/perry-codegen/src/runtime_decls/strings_part2.rs`: declare `js_string_search_value`/`js_string_match_value` (the `js_string_split_value` decl already landed on main)
- `crates/perry-runtime/src/object/native_call_method.rs`: generic-`this` dispatch arms route through the coercing entries
- `crates/perry-runtime/src/regex.rs`: `coerce_search_arg_to_regex` + `js_string_search_value`/`js_string_match_value`
- `crates/perry-runtime/src/string/slice_ops.rs`: `is_js_whitespace` trim
- `crates/perry-runtime/src/string/split.rs`: `js_string_split_value`

Validation
Default compile path (no env flags).
built-ins/String: 752 → 814 pass, 0 regressions. Verified manually against Node for split/indexOf/lastIndexOf/search/match/trim, incl. the previously-crashing `"abc".match(undefined)`.
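For the search/match and trim cases, a manual check of this kind might look like the sketch below. It is an illustration of the expected ECMAScript results, not the script used for the PR; all variable names are assumptions made for this example.

```javascript
// match with undefined (or no argument) uses the empty regex, which matches "" at index 0.
const m1 = "abc".match(undefined);
const m2 = "abc".match();

// search with undefined or no argument also matches at index 0.
const s1 = "abc".search(undefined);
const s2 = "abc".search();

// A non-RegExp argument is turned into a regex from its string form.
const s3 = "abc".search({ toString() { return "c"; } });
const s4 = "a.c".search("."); // "." is a regex wildcard here, so it matches at 0
const m3 = "abc".match("b");

// trim uses the JS whitespace set: U+FEFF is stripped, U+0085 is kept.
const bomTrimmed = "\uFEFFabc\uFEFF".trim();
const nelKept = "\u0085abc\u0085".trim();

// Line terminators such as U+2028/U+2029 are stripped as well.
const lineTermTrimmed = "\u2028abc\u2029".trim();

console.log(m1.index, m2.index, s1, s2, s3, s4, m3.index);
console.log(bomTrimmed.length, nelKept.length, lineTermTrimmed.length);
```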