Commit 089d6c7

joyeecheung authored and RafaelGSS committed
build,test: test array index hash collision
This enables v8_enable_seeded_array_index_hash and adds a test for it.

deps: V8: backport 0a8b1cdcc8b2

Original commit message:

    implement rapidhash secret generation

    Bug: 409717082
    Change-Id: I471f33d66de32002f744aeba534c1d34f71e27d2
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/6733490
    Reviewed-by: Leszek Swirski <leszeks@chromium.org>
    Commit-Queue: snek <snek@chromium.org>
    Cr-Commit-Position: refs/heads/main@{#101499}

deps: V8: backport 185f0fe09b72

Original commit message:

    [numbers] Refactor HashSeed as a lightweight view over ByteArray

    Instead of copying the seed and secrets into a struct with value
    fields, HashSeed now stores a pointer pointing either into the
    read-only ByteArray, or the static default seed for off-heap
    HashSeed::Default() calls. The underlying storage is always
    8-byte aligned so we can cast it directly into a struct.

    Change-Id: I5896a7f2ae24296eb4c80b757a5d90ac70a34866
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/7609720
    Reviewed-by: Leszek Swirski <leszeks@chromium.org>
    Commit-Queue: Joyee Cheung <joyee@igalia.com>
    Cr-Commit-Position: refs/heads/main@{#105531}

deps: V8: backport 1361b2a49d02

Original commit message:

    [strings] improve array index hash distribution

    Previously, the hashes stored in a Name's raw_hash_field for decimal
    numeric strings (potential array indices) consisted of the literal
    integer value along with the length of the string. This means
    consecutive numeric strings can have consecutive hash values, which
    can lead to O(n^2) probing for insertion in the worst case when e.g.
    a non-numeric string happens to land in these buckets.

    This patch adds a build-time flag v8_enable_seeded_array_index_hash
    that scrambles the 24-bit array-index value stored in a Name's
    raw_hash_field to improve the distribution.

      x ^= x >> kShift;
      x = (x * m1) & kMask;  // round 1
      x ^= x >> kShift;
      x = (x * m2) & kMask;  // round 2
      x ^= x >> kShift;      // finalize

    To decode, apply the same steps with the modular inverses of m1 and
    m2 in reverse order.

      x ^= x >> kShift;
      x = (x * m2_inv) & kMask;  // round 1
      x ^= x >> kShift;
      x = (x * m1_inv) & kMask;  // round 2
      x ^= x >> kShift;          // finalize

    where
      kShift = kArrayIndexValueBits / 2,
      kMask = kArrayIndexValueMask,
      m1, m2 (both odd) are the lower bits of the rapidhash secrets,
      m1_inv, m2_inv are the precomputed modular inverses of m1 and m2.

    The pre-computed values are appended to the hash_seed ByteArray in
    ReadOnlyRoots and accessed in generated code to reduce overhead.
    In call sites that don't already have access to the seeds, we read
    them from the current isolate group/isolate's read-only roots.

    To consolidate the code that encodes/decodes these hashes, this
    patch adds MakeArrayIndexHash/DecodeArrayIndexFromHashField in C++
    and CSA that perform seeding/unseeding if enabled, and updates
    places where encoding/decoding of array index is needed to use them.

    Bug: 477515021
    Change-Id: I350afe511951a54c4378396538152cc56565fd55
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/7564330
    Reviewed-by: Leszek Swirski <leszeks@chromium.org>
    Commit-Queue: Joyee Cheung <joyee@igalia.com>
    Cr-Commit-Position: refs/heads/main@{#105596}

deps: V8: cherry-pick aac14dd95e5b

Original commit message:

    [string] add 3rd round to seeded array index hash

    Since we already have 3 derived secrets, and arithmetic is
    relatively cheap, add a 3rd round to the xorshift-multiply seeding
    scheme. This brings the bias from ~3.4 to ~0.4.

    Bug: 477515021
    Change-Id: I1ef48954bcee8768d8c90db06ac8adb02f06cebf
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/7655117
    Reviewed-by: Chengzhong Wu <cwu631@bloomberg.net>
    Commit-Queue: Joyee Cheung <joyee@igalia.com>
    Reviewed-by: Leszek Swirski <leszeks@chromium.org>
    Cr-Commit-Position: refs/heads/main@{#105824}

PR-URL: nodejs-private/node-private#834
CVE-ID: CVE-2026-21717
Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: https://hackerone.com/reports/3511792
Refs: v8/v8@aac14dd
PR-URL: #61898
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Filip Skokan <panva.ip@gmail.com>
Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com>
Reviewed-By: Chengzhong Wu <legendecas@gmail.com>
(cherry picked from commit fff9a8a)
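The three-round xorshift-multiply scheme described in the commit message can be sketched in standalone C++ roughly as follows. This is an illustration only, not the V8 implementation: the multipliers `kM1`, `kM2`, `kM3` are arbitrary odd placeholders (the real ones are derived from the hash seed), the function names are hypothetical, and the Newton-iteration inverse is an assumption, since the message only says the inverses are precomputed.

```cpp
#include <cassert>
#include <cstdint>

// Assumed for illustration: a 24-bit index value, as the message describes.
constexpr uint32_t kValueBits = 24;
constexpr uint32_t kMask = (1u << kValueBits) - 1;
constexpr uint32_t kShift = kValueBits / 2;

// Placeholder odd multipliers; the real ones are derived from the hash seed.
constexpr uint32_t kM1 = 0x9E3779B1u;
constexpr uint32_t kM2 = 0x85EBCA6Bu;
constexpr uint32_t kM3 = 0xC2B2AE35u;

// Inverse of an odd `a` modulo 2^24 via Newton iteration (an assumption:
// the source does not say how the inverses are computed).
constexpr uint32_t ModInverse(uint32_t a) {
  uint32_t x = a;  // a * a == 1 (mod 8), so x is already correct to 3 bits
  for (int i = 0; i < 4; ++i) x *= 2u - a * x;  // each step doubles the bits
  return x & kMask;
}

// Scramble a value below 2^24 with three xorshift-multiply rounds.
constexpr uint32_t SeedIndex(uint32_t x) {
  x ^= x >> kShift; x = (x * kM1) & kMask;  // round 1
  x ^= x >> kShift; x = (x * kM2) & kMask;  // round 2
  x ^= x >> kShift; x = (x * kM3) & kMask;  // round 3
  x ^= x >> kShift;                         // finalize
  return x;
}

// Undo SeedIndex: same steps, inverse multipliers in reverse order.
constexpr uint32_t UnseedIndex(uint32_t x) {
  x ^= x >> kShift; x = (x * ModInverse(kM3)) & kMask;
  x ^= x >> kShift; x = (x * ModInverse(kM2)) & kMask;
  x ^= x >> kShift; x = (x * ModInverse(kM1)) & kMask;
  x ^= x >> kShift;
  return x;
}

// True if decoding the encoded index gives back the original value.
constexpr bool RoundTrips(uint32_t index) {
  return UnseedIndex(SeedIndex(index)) == index;
}

static_assert(RoundTrips(123456), "encode/decode must be inverse operations");
```

Each step is a bijection on 24-bit values (an odd multiplier is invertible modulo 2^24, and the xorshift undoes itself), which is why decoding can recover the exact index.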
1 parent 15d406c commit 089d6c7

31 files changed

Lines changed: 563 additions & 128 deletions

common.gypi

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@

     # Reset this number to 0 on major V8 upgrades.
     # Increment by one for each non-official patch applied to deps/v8.
-    'v8_embedder_string': '-node.17',
+    'v8_embedder_string': '-node.18',

     ##### V8 defaults for Node.js #####

deps/v8/BUILD.bazel

Lines changed: 7 additions & 0 deletions
@@ -240,6 +240,11 @@ v8_flag(
     default = False,
 )

+v8_flag(
+    name = "v8_enable_seeded_array_index_hash",
+    default = False,
+)
+
 selects.config_setting_group(
     name = "enable_drumbrake_x64",
     match_all = [
@@ -512,6 +517,7 @@ v8_config(
         "v8_enable_webassembly": "V8_ENABLE_WEBASSEMBLY",
         "v8_enable_drumbrake": "V8_ENABLE_DRUMBRAKE",
         "v8_enable_drumbrake_tracing": "V8_ENABLE_DRUMBRAKE_TRACING",
+        "v8_enable_seeded_array_index_hash": "V8_ENABLE_SEEDED_ARRAY_INDEX_HASH",
         "v8_jitless": "V8_JITLESS",
         "v8_enable_vtunejit": "ENABLE_VTUNE_JIT_INTERFACE",
         "v8_enable_undefined_double": "V8_ENABLE_UNDEFINED_DOUBLE",
@@ -2028,6 +2034,7 @@ filegroup(
         "src/numbers/conversions.h",
         "src/numbers/conversions-inl.h",
         "src/numbers/hash-seed.h",
+        "src/numbers/hash-seed.cc",
         "src/numbers/hash-seed-inl.h",
         "src/numbers/ieee754.cc",
         "src/numbers/ieee754.h",

deps/v8/BUILD.gn

Lines changed: 7 additions & 0 deletions
@@ -499,6 +499,9 @@ declare_args() {
   # Use a hard-coded secret value when hashing.
   v8_use_default_hasher_secret = true

+  # Enable seeded array index hash.
+  v8_enable_seeded_array_index_hash = false
+
   # add instrumentation for Dumpling differential fuzzing
   v8_dumpling = false

@@ -1241,6 +1244,9 @@ config("features") {
   if (v8_enable_lite_mode) {
     defines += [ "V8_LITE_MODE" ]
   }
+  if (v8_enable_seeded_array_index_hash) {
+    defines += [ "V8_ENABLE_SEEDED_ARRAY_INDEX_HASH" ]
+  }
   if (v8_enable_gdbjit) {
     defines += [ "ENABLE_GDB_JIT_INTERFACE" ]
   }
@@ -5981,6 +5987,7 @@ v8_source_set("v8_base_without_compiler") {
     "src/logging/runtime-call-stats.cc",
     "src/logging/tracing-flags.cc",
     "src/numbers/conversions.cc",
+    "src/numbers/hash-seed.cc",
     "src/numbers/ieee754.cc",
     "src/numbers/math-random.cc",
     "src/objects/abstract-code.cc",

deps/v8/src/DEPS

Lines changed: 1 addition & 1 deletion
@@ -138,7 +138,7 @@ specific_include_rules = {
   "heap\.cc": [
     "+third_party/rapidhash-v8/secret.h",
   ],
-  "hash-seed-inl\.h": [
+  "hash-seed\.cc": [
     "+third_party/rapidhash-v8/secret.h",
   ],
 }

deps/v8/src/ast/ast-value-factory.cc

Lines changed: 2 additions & 1 deletion
@@ -83,7 +83,8 @@ bool AstRawString::AsArrayIndex(uint32_t* index) const {
   // can't be convertible to an array index.
   if (!IsIntegerIndex()) return false;
   if (length() <= Name::kMaxCachedArrayIndexLength) {
-    *index = Name::ArrayIndexValueBits::decode(raw_hash_field_);
+    *index = StringHasher::DecodeArrayIndexFromHashField(
+        raw_hash_field_, HashSeed(GetReadOnlyRoots()));
     return true;
   }
   // Might be an index, but too big to cache it. Do the slow conversion. This
// Might be an index, but too big to cache it. Do the slow conversion. This

deps/v8/src/builtins/number.tq

Lines changed: 2 additions & 2 deletions
@@ -300,7 +300,7 @@ transitioning javascript builtin NumberParseFloat(
     const hash: NameHash = s.raw_hash_field;
     if (IsIntegerIndex(hash) &&
         hash.array_index_length < kMaxCachedArrayIndexLength) {
-      const arrayIndex: uint32 = hash.array_index_value;
+      const arrayIndex: uint32 = DecodeArrayIndexFromHashField(hash);
       return SmiFromUint32(arrayIndex);
     }
     // Fall back to the runtime to convert string to a number.
@@ -351,7 +351,7 @@ transitioning builtin ParseInt(
     const hash: NameHash = s.raw_hash_field;
     if (IsIntegerIndex(hash) &&
         hash.array_index_length < kMaxCachedArrayIndexLength) {
-      const arrayIndex: uint32 = hash.array_index_value;
+      const arrayIndex: uint32 = DecodeArrayIndexFromHashField(hash);
       return SmiFromUint32(arrayIndex);
     }
     // Fall back to the runtime.

deps/v8/src/builtins/wasm.tq

Lines changed: 2 additions & 2 deletions
@@ -1611,8 +1611,8 @@ builtin WasmStringToDouble(s: String): float64 {
   const hash: NameHash = s.raw_hash_field;
   if (IsIntegerIndex(hash) &&
       hash.array_index_length < kMaxCachedArrayIndexLength) {
-    const arrayIndex: int32 = Signed(hash.array_index_value);
-    return Convert<float64>(arrayIndex);
+    const arrayIndex: uint32 = DecodeArrayIndexFromHashField(hash);
+    return Convert<float64>(Signed(arrayIndex));
   }
   return StringToFloat64(Flatten(s));
 }

deps/v8/src/codegen/code-stub-assembler.cc

Lines changed: 63 additions & 5 deletions
@@ -2711,6 +2711,66 @@ TNode<Uint32T> CodeStubAssembler::LoadJSReceiverIdentityHash(
   return var_hash.value();
 }

+#ifdef V8_ENABLE_SEEDED_ARRAY_INDEX_HASH
+// Mirror C++ StringHasher::SeedArrayIndexValue.
+TNode<Uint32T> CodeStubAssembler::SeedArrayIndexValue(TNode<Uint32T> value) {
+  // Load m1, m2 and m3 from the hash seed byte array. In the compiled code
+  // these will always come from the read-only roots.
+  TNode<ByteArray> hash_seed = CAST(LoadRoot(RootIndex::kHashSeed));
+  intptr_t base_offset = OFFSET_OF_DATA_START(ByteArray) - kHeapObjectTag;
+  TNode<Uint32T> m1 = Load<Uint32T>(
+      hash_seed, IntPtrConstant(base_offset + HashSeed::kDerivedM1Offset));
+  TNode<Uint32T> m2 = Load<Uint32T>(
+      hash_seed, IntPtrConstant(base_offset + HashSeed::kDerivedM2Offset));
+  TNode<Uint32T> m3 = Load<Uint32T>(
+      hash_seed, IntPtrConstant(base_offset + HashSeed::kDerivedM3Offset));
+
+  TNode<Word32T> x = value;
+  // 3-round xorshift-multiply.
+  x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
+  x = Word32And(Uint32Mul(Unsigned(x), m1),
+                Uint32Constant(Name::kArrayIndexValueMask));
+  x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
+  x = Word32And(Uint32Mul(Unsigned(x), m2),
+                Uint32Constant(Name::kArrayIndexValueMask));
+  x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
+  x = Word32And(Uint32Mul(Unsigned(x), m3),
+                Uint32Constant(Name::kArrayIndexValueMask));
+  x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
+
+  return Unsigned(x);
+}
+
+// Mirror C++ StringHasher::UnseedArrayIndexValue.
+TNode<Uint32T> CodeStubAssembler::UnseedArrayIndexValue(TNode<Uint32T> value) {
+  // Load m1_inv, m2_inv and m3_inv from the hash seed byte array. In the
+  // compiled code these will always come from the read-only roots.
+  TNode<ByteArray> hash_seed = CAST(LoadRoot(RootIndex::kHashSeed));
+  intptr_t base_offset = OFFSET_OF_DATA_START(ByteArray) - kHeapObjectTag;
+  TNode<Uint32T> m1_inv = Load<Uint32T>(
+      hash_seed, IntPtrConstant(base_offset + HashSeed::kDerivedM1InvOffset));
+  TNode<Uint32T> m2_inv = Load<Uint32T>(
+      hash_seed, IntPtrConstant(base_offset + HashSeed::kDerivedM2InvOffset));
+  TNode<Uint32T> m3_inv = Load<Uint32T>(
+      hash_seed, IntPtrConstant(base_offset + HashSeed::kDerivedM3InvOffset));
+
+  TNode<Word32T> x = value;
+  // 3-round xorshift-multiply (inverse).
+  // Xorshift is an involution when kShift is at least half of the value width.
+  x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
+  x = Word32And(Uint32Mul(Unsigned(x), m3_inv),
+                Uint32Constant(Name::kArrayIndexValueMask));
+  x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
+  x = Word32And(Uint32Mul(Unsigned(x), m2_inv),
+                Uint32Constant(Name::kArrayIndexValueMask));
+  x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
+  x = Word32And(Uint32Mul(Unsigned(x), m1_inv),
+                Uint32Constant(Name::kArrayIndexValueMask));
+  x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
+  return Unsigned(x);
+}
+#endif // V8_ENABLE_SEEDED_ARRAY_INDEX_HASH
+
 TNode<Uint32T> CodeStubAssembler::LoadNameHashAssumeComputed(TNode<Name> name) {
   TNode<Uint32T> hash_field = LoadNameRawHash(name);
   CSA_DCHECK(this, IsClearWord32(hash_field, Name::kHashNotComputedMask));
@@ -9404,8 +9464,7 @@ TNode<Number> CodeStubAssembler::StringToNumber(TNode<String> input) {
   GotoIf(IsSetWord32(raw_hash_field, Name::kDoesNotContainCachedArrayIndexMask),
          &runtime);

-  var_result = SmiTag(Signed(
-      DecodeWordFromWord32<String::ArrayIndexValueBits>(raw_hash_field)));
+  var_result = SmiFromUint32(DecodeArrayIndexFromHashField(raw_hash_field));
   Goto(&end);

   BIND(&runtime);
@@ -10535,9 +10594,8 @@ void CodeStubAssembler::TryToName(TNode<Object> key, Label* if_keyisindex,

   BIND(&if_has_cached_index);
   {
-    TNode<IntPtrT> index =
-        Signed(DecodeWordFromWord32<String::ArrayIndexValueBits>(
-            raw_hash_field));
+    TNode<IntPtrT> index = Signed(ChangeUint32ToWord(
+        DecodeArrayIndexFromHashField(raw_hash_field)));
     CSA_DCHECK(this, IntPtrLessThan(index, IntPtrConstant(INT_MAX)));
     *var_index = index;
     Goto(if_keyisindex);
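The comment in `UnseedArrayIndexValue` that xorshift is an involution when the shift is at least half of the value width can be checked exhaustively for a 24-bit value. The sketch below is a standalone illustration with hypothetical names, assuming the 24-bit width mentioned in the commit message; it is not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumed value width for illustration (the commit message mentions 24 bits).
constexpr uint32_t kBits = 24;
constexpr uint32_t kMaxValue = (1u << kBits) - 1;

// One xorshift step on a value that fits in kBits bits.
constexpr uint32_t XorShift(uint32_t x, uint32_t shift) {
  return x ^ (x >> shift);
}

// True if applying XorShift twice returns every kBits-wide value unchanged.
// Applying it twice gives x ^ (x >> 2 * shift), which equals x exactly when
// 2 * shift >= kBits, because the shifted term is then always zero.
bool IsInvolution(uint32_t shift) {
  for (uint32_t x = 0; x <= kMaxValue; ++x) {
    if (XorShift(XorShift(x, shift), shift) != x) return false;
  }
  return true;
}
```

Calling `IsInvolution(12)` exercises the half-width case used by the patch, while a smaller shift such as 8 fails because the high bits leak back in on the second application.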

deps/v8/src/codegen/code-stub-assembler.h

Lines changed: 6 additions & 0 deletions
@@ -4835,6 +4835,12 @@ class V8_EXPORT_PRIVATE CodeStubAssembler
     return WordEqual(WordAnd(flags, IntPtrConstant(mask)), IntPtrConstant(0));
   }

+#ifdef V8_ENABLE_SEEDED_ARRAY_INDEX_HASH
+  // Mirror C++ StringHasher::SeedArrayIndexValue and UnseedArrayIndexValue.
+  TNode<Uint32T> SeedArrayIndexValue(TNode<Uint32T> value);
+  TNode<Uint32T> UnseedArrayIndexValue(TNode<Uint32T> value);
+#endif // V8_ENABLE_SEEDED_ARRAY_INDEX_HASH
+
  private:
   friend class CodeStubArguments;

deps/v8/src/heap/factory-base.cc

Lines changed: 8 additions & 7 deletions
@@ -299,9 +299,9 @@ Handle<ProtectedWeakFixedArray> FactoryBase<Impl>::NewProtectedWeakFixedArray(
 }

 template <typename Impl>
-Handle<ByteArray> FactoryBase<Impl>::NewByteArray(int length,
-                                                  AllocationType allocation) {
-  return ByteArray::New(isolate(), length, allocation);
+Handle<ByteArray> FactoryBase<Impl>::NewByteArray(
+    int length, AllocationType allocation, AllocationAlignment alignment) {
+  return ByteArray::New(isolate(), length, allocation, alignment);
 }

 template <typename Impl>
@@ -1173,7 +1173,8 @@ inline Handle<String> FactoryBase<Impl>::SmiToString(Tagged<Smi> number,
     if (raw->raw_hash_field() == String::kEmptyHashField &&
         number.value() >= 0) {
       uint32_t raw_hash_field = StringHasher::MakeArrayIndexHash(
-          static_cast<uint32_t>(number.value()), raw->length());
+          static_cast<uint32_t>(number.value()), raw->length(),
+          HashSeed(read_only_roots()));
       raw->set_raw_hash_field(raw_hash_field);
     }
   }
@@ -1333,9 +1334,9 @@ FactoryBase<Impl>::AllocateRawTwoByteInternalizedString(

 template <typename Impl>
 Tagged<HeapObject> FactoryBase<Impl>::AllocateRawArray(
-    int size, AllocationType allocation, AllocationHint hint) {
-  Tagged<HeapObject> result =
-      AllocateRaw(size, allocation, AllocationAlignment::kTaggedAligned, hint);
+    int size, AllocationType allocation, AllocationHint hint,
+    AllocationAlignment alignment) {
+  Tagged<HeapObject> result = AllocateRaw(size, allocation, alignment, hint);
   if ((size >
        isolate()->heap()->AsHeap()->MaxRegularHeapObjectSize(allocation)) &&
       v8_flags.use_marking_progress_bar) {
