withNumberCompressed(true) silently corrupts boxed Long fields when serialize and deserialize go through different ThreadSafeForyPool instances #3624

@drse

Description

Search before asking

  • I had searched in the issues and found no similar issues.

Version

  • Apache Fory 0.17.0
  • JDK 21
  • Linux / macOS

Component(s)

Java

Minimal reproduce step

When all of the following are true, Fory either throws IndexOutOfBoundsException from the generated codec or, worse, silently decodes a different long value with no exception:

  • pooled Fory (buildThreadSafeForyPool)
  • codegen on (withCodegen(true) — the default)
  • number compression on (withNumberCompressed(true))
  • a field of type boxed Long (not primitive long)

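The bug requires two pool instances; a round trip through a single Fory instance works. Changing any one of these conditions eliminates it: turning off withNumberCompressed, turning off withCodegen, using a primitive long instead of a boxed Long, or dropping the second pool.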

import org.apache.fory.Fory;
import org.apache.fory.ThreadSafeFory;
import org.apache.fory.config.CompatibleMode;
import org.apache.fory.config.ForyBuilder;
import org.apache.fory.config.Language;

public class ForyPoolNumberCompressedBug {

    public record Payload(Long longValue, String stringValue) {}

    static ForyBuilder builder() {
        return Fory.builder()
                .withLanguage(Language.JAVA)
                .withCodegen(true)
                .withAsyncCompilation(true)
                .requireClassRegistration(false)
                .suppressClassRegistrationWarnings(true)
                .withDeserializeUnknownClass(true)
                .withRefTracking(true)
                .withCompatibleMode(CompatibleMode.COMPATIBLE)
                .withStringCompressed(true)
                .withNumberCompressed(true) // <-- removing this fixes the bug
                .withRefCopy(true);
    }

    public static void main(String[] args) {
        // Two pools built from the same config — e.g. producer/consumer on
        // different nodes, or process-restart snapshot restore.
        ThreadSafeFory writer = builder().buildThreadSafeForyPool(4);
        ThreadSafeFory reader = builder().buildThreadSafeForyPool(4);

        Payload original = new Payload(
                123_456_789L,
                "longer string with multibyte: \u00ff\u00fe");

        byte[] bytes = writer.serialize(original);
        Payload roundTrip = (Payload) reader.deserialize(bytes);

        System.out.println("original  = " + original);
        System.out.println("roundTrip = " + roundTrip);

        if (!original.equals(roundTrip)) {
            throw new AssertionError("CORRUPTION: " + original + " != " + roundTrip);
        }
    }
}

What did you expect to see?

A round-trip across two pools built from the same ForyBuilder should produce a value equal to the original.

What did you see instead?

original  = Payload[longValue=123456789, stringValue=longer string with multibyte: ÿþ]
roundTrip = Payload[longValue=988764757, stringValue=longer string with multibyte: ÿþ]
Exception in thread "main" java.lang.AssertionError: CORRUPTION: ...

Alternate failure mode (thrown from generated codec)

With other payload shapes (concurrent calls on a single pool, multiple records in sequence) the same configuration throws:

java.lang.IndexOutOfBoundsException: readerIndex(78) + length(18) exceeds size(84)
    at org.apache.fory.memory.MemoryBuffer$BoundChecker.fillBuffer(MemoryBuffer.java:189)
    at org.apache.fory.serializer.StringSerializer.readBytesUnCompressedUTF16(StringSerializer.java:565)
    at org.apache.fory.serializer.StringSerializer.readCompressedBytesString(StringSerializer.java:259)
    at <Pkg>$PayloadForyRefCodecMetaShared0_0.read(<Pkg>$PayloadForyRefCodecMetaShared0_0.java:47)
    at org.apache.fory.context.ReadContext.readDataInternal(ReadContext.java:666)
    at org.apache.fory.context.ReadContext.readNonRef(ReadContext.java:580)
    at org.apache.fory.context.ReadContext.readRef(ReadContext.java:518)
    at org.apache.fory.Fory.deserialize(Fory.java:476)

The generated codec name is *ForyRefCodecMetaShared0_0, suggesting the writer and reader codecs disagree on the encoded layout for boxed numeric fields.
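
To illustrate how a writer/reader layout disagreement could produce both symptoms above (a silently wrong long, or an out-of-bounds read on the following string), here is a minimal, self-contained sketch. It does not use Fory and does not reproduce Fory's wire format: the zigzag varint encoding, the 4-byte length prefix, and all names (LayoutMismatchDemo, readLongMismatched, and so on) are assumptions made up for this example. It only shows the general failure class, not the confirmed root cause.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class LayoutMismatchDemo {

    // Hypothetical "compressed" layout: zigzag + 7-bit varint (not Fory's actual format).
    static void writeVarLong(ByteBuffer buf, long value) {
        long zz = (value << 1) ^ (value >> 63);
        while ((zz & ~0x7FL) != 0) {
            buf.put((byte) ((zz & 0x7F) | 0x80));
            zz >>>= 7;
        }
        buf.put((byte) zz);
    }

    static long readVarLong(ByteBuffer buf) {
        long zz = 0;
        int shift = 0;
        byte b;
        do {
            b = buf.get();
            zz |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (zz >>> 1) ^ -(zz & 1);
    }

    // Writer: varint long, then a 4-byte length prefix, then UTF-8 bytes.
    static byte[] encode(long longValue, String stringValue) {
        byte[] utf8 = stringValue.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(10 + 4 + utf8.length).order(ByteOrder.LITTLE_ENDIAN);
        writeVarLong(buf, longValue);
        buf.putInt(utf8.length);
        buf.put(utf8);
        byte[] out = new byte[buf.position()];
        buf.flip();
        buf.get(out);
        return out;
    }

    // Reader that agrees with the writer on the long layout.
    static long readLongMatching(byte[] bytes) {
        return readVarLong(ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN));
    }

    // Reader that wrongly assumes a fixed 8-byte long: no exception, wrong value.
    static long readLongMismatched(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    // After the wrong long read, the cursor is misaligned, so the "length" is garbage.
    static String readStringAfterMismatchedLong(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        buf.getLong();
        int length = buf.getInt();
        if (length < 0 || length > buf.remaining()) {
            throw new IndexOutOfBoundsException(
                    "length(" + length + ") exceeds remaining(" + buf.remaining() + ")");
        }
        byte[] utf8 = new byte[length];
        buf.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = encode(123_456_789L, "longer string with multibyte: \u00ff\u00fe");
        System.out.println("matching   = " + readLongMatching(bytes));
        System.out.println("mismatched = " + readLongMismatched(bytes));
        try {
            readStringAfterMismatchedLong(bytes);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("string read failed: " + e.getMessage());
        }
    }
}
```

Whether a mismatch of this kind surfaces as a wrong value or as an exception depends only on which bytes happen to follow the misread field, which would be consistent with the two failure modes seen for different payload shapes.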

Bisection (each toggle applied in isolation against the failing baseline)

Toggle                                                        Result
withNumberCompressed(false)                                   works
withCodegen(false) (interpreted path)                         works
Payload(long longValue, String stringValue) (primitive long)  works
single Fory (no pool)                                         works
withStringCompressed(false)                                   still broken
withRefTracking(false) + withRefCopy(false)                   still broken
CompatibleMode.SCHEMA_CONSISTENT                              still broken

So the trigger is codegen + number compression + boxed numeric field + cross-pool deserialization. The bug exists in both COMPATIBLE and SCHEMA_CONSISTENT modes.
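
For anyone who wants to repeat the bisection, the "one toggle at a time against the failing baseline" procedure can be sketched as below. This is a harness outline only, with no Fory dependency: the Toggle enum, the ToggleBisection class, and the REPORTED predicate are hypothetical names, the enum flattens the table into boolean switches (so the CompatibleMode row is left out and the two ref options are merged into one), and REPORTED simply encodes the outcomes listed above rather than running a real round trip. To bisect for real, replace it with a predicate that builds the pools for the given toggle set and checks original.equals(roundTrip).

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

final class ToggleBisection {

    // Hypothetical boolean view of the conditions in the failing baseline.
    enum Toggle { NUMBER_COMPRESSED, CODEGEN, BOXED_LONG, TWO_POOLS, STRING_COMPRESSED, REF_TRACKING }

    // Switches off exactly one toggle per run and records whether the round trip then works.
    static Map<Toggle, Boolean> bisect(Predicate<Set<Toggle>> roundTripOk) {
        Map<Toggle, Boolean> worksWithout = new LinkedHashMap<>();
        for (Toggle toggle : Toggle.values()) {
            EnumSet<Toggle> variant = EnumSet.allOf(Toggle.class); // failing baseline
            variant.remove(toggle);                                // single change
            worksWithout.put(toggle, roundTripOk.test(variant));
        }
        return worksWithout;
    }

    // Stand-in for a real round trip: fails only when all four reported triggers are on.
    static final Predicate<Set<Toggle>> REPORTED = enabled ->
            !enabled.containsAll(EnumSet.of(
                    Toggle.NUMBER_COMPRESSED, Toggle.CODEGEN, Toggle.BOXED_LONG, Toggle.TWO_POOLS));

    public static void main(String[] args) {
        bisect(REPORTED).forEach((toggle, works) ->
                System.out.println(toggle + " off -> " + (works ? "works" : "still broken")));
    }
}
```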

Anything Else?

As a workaround, I am currently using withNumberCompressed(false).
