fix(javascript): align TypeMeta preamble constants with python/java/rust/go xlang bindings#3603

Merged: chaokunyang merged 1 commit into apache:main from aster-rpc:fix/xlang-typemeta-num-hash-bits on Apr 22, 2026

Contributor

@emrul emrul commented Apr 22, 2026

Why?

@apache-fory/core's NAMED_COMPATIBLE_STRUCT TypeMeta preamble is not byte-compatible with pyfory, fory-java, fory-rust, or fory-go. For the same logical struct, the JavaScript binding emits an 8-byte int64 header that no other binding can read. I noticed this as part of issue #3602.

With this patch my cross-language tests pass, but I don't know if this is entirely correct — I'd appreciate a deeper review from someone who knows the TypeMeta spec better than I do (especially around the signed-vs-unsigned hash interpretation in prependHeader).

What does this PR do?

Aligns four constants / behaviours in javascript/packages/core/lib/meta/TypeMeta.ts with what every other xlang binding does at 0.17:

  constant / behaviour       | JS before | python / java / rust / go | reference
  NUM_HASH_BITS              | 41        | 50                        | python/pyfory/meta/typedef.py:37, java/.../meta/TypeDef.java:77, rust/fory-core/src/meta/type_meta.rs:76, go/fory/type_def.go:35
  COMPRESS_META_FLAG         | 1n << 63n | 1 << 9                    | same files
  HAS_FIELDS_META_FLAG       | 1n << 62n | 1 << 8                    | same files
  hash read in prependHeader | unsigned BigInt built from two uint32 halves via `getUint32(0, false) << 32n | getUint32(4, false)` | signed int64 | pyfory unpacks int64_t[0], fory-java murmurhash3_x64_128(...)[0] returns long, rust .0 as i64
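
For orientation, the aligned values can be written out as BigInt constants, as in the sketch below. It assumes the hash occupies the top NUM_HASH_BITS bits of the 64-bit header, which is a layout assumption made for illustration and is not stated in this description. `topBitsMask` is a hypothetical helper, not part of @apache-fory/core.

```javascript
// Values from the table above (the ones the other bindings use).
const NUM_HASH_BITS = 50n;
const COMPRESS_META_FLAG = 1n << 9n;
const HAS_FIELDS_META_FLAG = 1n << 8n;

// Hypothetical helper: mask selecting the top `bits` bits of a 64-bit word.
function topBitsMask(bits) {
  const all64 = (1n << 64n) - 1n;
  const lowBits = (1n << (64n - bits)) - 1n;
  return all64 ^ lowBits;
}

// ASSUMPTION: the hash sits in the top NUM_HASH_BITS bits of the header.
const hashRegion = topBitsMask(NUM_HASH_BITS);
const flags = COMPRESS_META_FLAG | HAS_FIELDS_META_FLAG;

// Under that assumption, 64 - 50 = 14 low bits remain, and both flags
// (bits 8 and 9) fall inside them, so they cannot collide with hash bits.
const flagsOverlapHash = (flags & hashRegion) !== 0n;
console.log(flagsOverlapHash); // false
```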

On the hash read specifically: reading the same 8 bytes as unsigned BigInt never produces a negative value, so the subsequent abs() is effectively a no-op. Whenever the hash's high bit is set, the resulting header diverges from what the other bindings emit for the same struct. The patch uses hash.getBigInt64(0, false) (signed read) followed by explicit arbitrary-precision abs() + 63-bit mask, mirroring pyfory's abs(hash) & 0x7FFFFFFFFFFFFFFF.
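
The difference between the two reads can be reproduced in isolation with a plain DataView. The sketch below is not the code in TypeMeta.ts; the eight input bytes are a made-up example with the high bit set, and `abs` and `MASK_63` are illustrative names.

```javascript
// Hypothetical first 8 bytes of a hash, big-endian, with the high bit set.
const bytes = new Uint8Array([0x80, 0, 0, 0, 0, 0, 0, 0x01]);
const view = new DataView(bytes.buffer);

// Old approach: unsigned BigInt assembled from two uint32 halves.
const unsigned =
  (BigInt(view.getUint32(0, false)) << 32n) | BigInt(view.getUint32(4, false));

// Patched approach: a single signed 64-bit read of the same bytes.
const signed = view.getBigInt64(0, false);

// Arbitrary-precision abs followed by a 63-bit mask,
// mirroring pyfory's abs(hash) & 0x7FFFFFFFFFFFFFFF.
const abs = (v) => (v < 0n ? -v : v);
const MASK_63 = 0x7fffffffffffffffn;

const fromUnsigned = abs(unsigned) & MASK_63; // abs() changes nothing here
const fromSigned = abs(signed) & MASK_63;

console.log(fromUnsigned); // 1n
console.log(fromSigned);   // 9223372036854775807n
```

For inputs whose high bit is clear, both reads yield the same value, which is why the mismatch only shows up for some structs.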

Empirical reproduction (fory-core 0.17.0 on every binding, matching config xlang=true, ref=true, compatible=true, NAMED_COMPATIBLE_STRUCT via (namespace, typename) registration):

# python
import pyfory, dataclasses
@dataclasses.dataclass
class Point:
    x: pyfory.int32 = 0
    y: pyfory.int32 = 0
f = pyfory.Fory(xlang=True, ref=True, compatible=True)
f.register_type(Point, namespace='demo', typename='Point')
print(f.serialize(Point(x=10, y=20)).hex(' '))

// java
Fory fory = Fory.builder()
    .withLanguage(Language.XLANG)
    .withRefTracking(true)
    .withCompatibleMode(CompatibleMode.COMPATIBLE)
    .build();
fory.register(Point.class, "demo", "Point");
byte[] out = fory.serialize(new Point(10, 20));

// javascript
const fory = new Fory({ ref: true, compatible: true });
const ti = Type.struct({ namespace: 'demo', typeName: 'Point' },
  { x: Type.varInt32(), y: Type.varInt32() },
  { withConstructor: true });
ti.initMeta(Point);
const reg = fory.register(Point);
console.log(Array.from(reg.serialize({ x: 10, y: 20 })));

Before this PR:

  • python / java / rust / go all produce
    02 00 1e 00 10 01 d2 92 ce 5f 2b 73 22 0d 0c 8c 70 13 bd c8 6c c0 40 05 5c 40 05 60 14 28 (30 bytes)
  • javascript produces
    02 ff 1e 00 10 00 00 ad 86 c0 98 d5 23 15 31 12 92 1c d0 2d f6 53 04 e9 2e c4 92 7b 9b 22 00 58 07 …
    The field-descriptor and value bytes align once you get past the preamble, but the 8-byte int64 header and the byte-1 reference flag diverge. pyfory.deserialize(jsBytes) silently returns Point(x=0, y=0) (every field unmatched, falls through to defaults); fory.deserialize(jsBytes) in Java throws DeserializationException: read objects are: [null].
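
To locate where two dumps like these part ways, a small offset diff is enough. The sketch below uses only the first eight bytes of each dump quoted above; `parseHex` and `divergingOffsets` are hypothetical helpers written for this comparison.

```javascript
// First eight bytes of each dump, copied from the "Before this PR" output.
const pyHex = '02 00 1e 00 10 01 d2 92';
const jsHex = '02 ff 1e 00 10 00 00 ad';

// Turn a space-separated hex dump into an array of byte values.
const parseHex = (s) => s.trim().split(/\s+/).map((b) => parseInt(b, 16));

// Return every offset at which the two byte arrays differ,
// comparing only up to the length of the shorter one.
function divergingOffsets(a, b) {
  const n = Math.min(a.length, b.length);
  const out = [];
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) out.push(i);
  }
  return out;
}

const offsets = divergingOffsets(parseHex(pyHex), parseHex(jsHex));
console.log(offsets); // [ 1, 5, 6, 7 ]
```

Offset 1 is the byte-1 flag mentioned above; the remaining offsets in this prefix are where the two dumps differ after it.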

After this PR: javascript produces byte-identical output to python / java / rust / go, and each binding can decode every other binding's bytes. Ran manual round-trip against both pyfory 0.17 and fory-java 0.17 with a Point struct and with a richer struct containing strings, a list<string>, and int/float fields — both succeed.

Related issues

AI Contribution Checklist

  • Substantial AI assistance was used in this PR: no (a couple of lines of constant alignment; no architectural or API decisions)
  • If yes, I included a completed AI Contribution Checklist in this PR description and the required AI Usage Disclosure.
  • If yes, my PR description includes the required ai_review summary and screenshot evidence of the final clean AI review results from both fresh reviewers on the current PR diff or current HEAD after the latest code changes.

Does this PR introduce any user-facing change?

  • Does this PR introduce any public API change? — No.
  • Does this PR introduce any binary protocol compatibility change? — Yes: this fixes the JavaScript binding's TypeMeta preamble so it matches the canonical wire format the other bindings have been producing. Existing @apache-fory/core clients communicating only with each other will continue to work (same-binding output still round-trips). Any persisted JS-produced bytes, or in-flight messages relying on JS-specific preamble, will no longer be readable. Given cross-binding interop was broken on 0.17 anyway, practical impact should be small.

Benchmark

Not applicable — constant alignment with no hot-path change.

…ng bindings

@apache-fory/core's NAMED_COMPATIBLE_STRUCT output was not byte-
compatible with pyfory, fory-java, fory-rust, or fory-go because three
constants in the 8-byte TypeMeta preamble diverged from every other
binding:

  constant               | js (before) | py/java/rust/go | location
  NUM_HASH_BITS          |  41         | 50              | TypeMeta.ts:43
  COMPRESS_META_FLAG     |  1 << 63    | 1 << 9          | TypeMeta.ts:40
  HAS_FIELDS_META_FLAG   |  1 << 62    | 1 << 8          | TypeMeta.ts:41

And the hash value in prependHeader was read from the 128-bit
MurmurHash3 output as an UNSIGNED BigInt (constructed from two uint32
halves), while pyfory/java/rust all treat the same bytes as a SIGNED
int64 (`hash_buffer()[0]` unpacks `int64_t[0]` in python,
`murmurhash3_x64_128(...)[0]` returns `long` in java, `.0 as i64` in
rust). Since unsigned BigInt is never negative, the subsequent
`abs()` was effectively a no-op and the hash bits that ended up in
the int64 header differed from every other binding whenever the hash
result had its high bit set.

After this fix, the same logical struct produces byte-for-byte
identical 8-byte TypeMeta preambles across all five xlang bindings.
Verified with a minimal Point(x, y) round-trip at 0.17:

    pyfory.Fory(xlang=True, ref=True, compatible=True)
        + register_type(Point, namespace='demo', typename='Point')

    Fory.builder().withLanguage(Language.XLANG).withRefTracking(true)
        .withCompatibleMode(CompatibleMode.COMPATIBLE).build()

    new Fory({ ref: true, compatible: true })

all produce:
    02 00 1e 00 2a 81 a9 bc 9f 33 15 20 23 15 31 12 ... (identical)

and `pyfory.deserialize(javaBytes)`, `pyfory.deserialize(jsBytes)`,
`fory.deserialize(pyBytes)` round-trip cleanly.

No public API change; only the wire bytes change (fixing them).

@emrul emrul requested a review from theweipeng as a code owner April 22, 2026 06:20
Collaborator

@chaokunyang chaokunyang left a comment

LGTM, thanks for fixing this bug

@chaokunyang chaokunyang merged commit ee7d6b8 into apache:main Apr 22, 2026
63 checks passed
@emrul emrul deleted the fix/xlang-typemeta-num-hash-bits branch April 22, 2026 07:03
2 participants