
feat(cpp): add float16 to c++ #3486

Closed

UninspiredCarrot wants to merge 25 commits into apache:main from UninspiredCarrot:float16


Conversation


@UninspiredCarrot commented Mar 16, 2026

Why?

Implement float16_t (IEEE 754 binary16 / half-precision) as a primitive type in the C++ runtime, as required by issue #3208. No C++ standard type represents float16, so the framework needs its own strong type with correct IEEE 754 semantics and serializer integration.

What does this PR do?

cpp/fory/util/float16.h / float16.cc — new fory::float16_t strong type:

  • Trivial, standard-layout, exactly 2 bytes; internal storage is uint16_t bits accessed only via to_bits()/from_bits()
  • from_float / to_float — IEEE 754 compliant conversion with round-to-nearest ties-to-even, correct handling of ±0, ±Inf, NaN (payload preserved, signaling→quiet), subnormals, overflow→Inf, underflow→subnormal/±0
  • Classification: is_nan, is_inf (two overloads), is_zero, signbit, is_subnormal, is_normal, is_finite
  • Arithmetic: add, sub, mul, div, neg, abs; optional math: sqrt, min, max, copysign, floor, ceil, trunc, round, round_to_even; compound assignment and binary operator overloads
  • Comparisons: equal, less, less_eq, greater, greater_eq, compare (NaN unordered, +0 == −0); comparison operator overloads (==, !=, <, <=, >, >=)

cpp/fory/serialization/struct_serializer.h — serializer integration:

  • Serializer<float16_t> specialization wired to TypeId::FLOAT16 (type ID 17)

cpp/fory/util/float16_test.cc — exhaustive tests (1300+ lines, 61 test cases):

  • Stress-tests all 65,536 bit patterns for round-trip correctness
  • Ties-to-even rounding, subnormal gradual underflow, overflow→Inf, NaN payload preservation
  • Buffer wire-format goldens (little-endian), serializer round-trips (scalar, vector, map, optional), type ID check
  • Full comparison test suite including NaN unordered and ±0 equality edge cases

Related issues

AI Contribution Checklist

  • Substantial AI assistance was used in this PR: no
  • If yes, I included a completed AI Contribution Checklist in this PR description and the required AI Usage Disclosure.

Does this PR introduce any user-facing change?

  • Does this PR introduce any public API change?
  • Does this PR introduce any binary protocol compatibility change?

UninspiredCarrot and others added 25 commits March 7, 2026 23:55
…oinline (apache#3456)

## Why?

//go:inline directives were introduced in the Go runtime in the expectation that the gc compiler would inline those functions. But //go:inline is not a directive the gc compiler recognizes; gc automatically inlines any function whose cost falls below its inlining budget of 80.

## What does this PR do?

1. Removes all occurrences of //go:inline.
2. Marks cold paths as //go:noinline.

## Related issues
Closes apache#3446 


## Does this PR introduce any user-facing change?



- [ ] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?

## Benchmark
main branch -

<img width="1114" height="811" alt="image" src="https://github.com/user-attachments/assets/75f1dbea-7110-484b-8233-ac5599137cf6" />

this branch -

<img width="1114" height="811" alt="image" src="https://github.com/user-attachments/assets/50d8b9cd-f4e9-48a0-a81d-36e47ac32947" />

## Why?

We currently don't have any size limits for incoming payloads in the C++
implementation. This is a security risk because a malicious or malformed
payload can claim to have a massive collection or binary length, forcing
the system to pre-allocate gigabytes of memory (via `.reserve()` or
constructors) before actually reading the data. This makes the system
vulnerable to simple Out-of-Memory (OOM) Denial-of-Service attacks.

## What does this PR do?

This PR adds two essential security guardrails to the deserialization
path: `max_binary_size` and `max_collection_size`.

**Changes included:**

* **Config & API**: Added the two new limits to `serialization::Config`
and updated `ForyBuilder` so users can easily set these at runtime.
Defaults are 64MB for binary and 1M entries for collections.
* **Security Enforcement**:
* Integrated checks into all sensitive pre-allocation paths, including
`std::vector`, `std::list`, `std::deque`, `std::set`, and
`std::unordered_set`.
* Added entry-count validation for Maps (both fast and slow paths).
* Specifically handled arithmetic vectors by converting byte-lengths to
element counts to ensure `max_collection_size` is respected.


* **Context Access**: Exposed a public `config()` accessor in
`ReadContext` and `WriteContext` so internal serializers can reach these
settings.
* **Tests**: Added new test cases in `collection_serializer_test.cc` and
`map_serializer_test.cc` to verify that deserialization fails
immediately with a descriptive error when limits are exceeded.

## Related issues

Fixes apache#3408

## Does this PR introduce any user-facing change?

Yes, it adds two new methods (`max_binary_size` and
`max_collection_size`) to the `ForyBuilder`.

* [x] Does this PR introduce any public API change?
* [ ] Does this PR introduce any binary protocol compatibility change?

## Benchmark

The performance impact is negligible. The checks are simple integer
comparisons performed once per collection/binary read, occurring right
before the expensive allocation phase. All 30 existing C++ test targets
pass with no measurable change in execution time.
## Why?



## What does this PR do?


## Related issues

apache#3355


## Does this PR introduce any user-facing change?



- [ ] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?

## Benchmark

| Datatype | Operation | Fory TPS | Protobuf TPS | Msgpack TPS | Fastest |
| --- | --- | ---: | ---: | ---: | --- |
| Struct | Serialize | 9,727,950 | 6,572,406 | 141,248 | fory (1.48x) |
| Struct | Deserialize | 11,889,570 | 8,584,510 | 99,792 | fory (1.39x) |
| Sample | Serialize | 3,496,305 | 1,281,983 | 17,188 | fory (2.73x) |
| Sample | Deserialize | 1,045,018 | 765,706 | 12,767 | fory (1.36x) |
| MediaContent | Serialize | 1,425,354 | 678,542 | 29,048 | fory (2.10x) |
| MediaContent | Deserialize | 614,447 | 478,298 | 12,711 | fory (1.28x) |
| StructList | Serialize | 3,307,962 | 1,028,210 | 24,781 | fory (3.22x) |
| StructList | Deserialize | 2,788,200 | 708,596 | 8,160 | fory (3.93x) |
| SampleList | Serialize | 715,734 | 205,380 | 3,361 | fory (3.48x) |
| SampleList | Deserialize | 199,317 | 133,425 | 1,498 | fory (1.49x) |
| MediaContentList | Serialize | 364,097 | 103,721 | 5,538 | fory (3.51x) |
| MediaContentList | Deserialize | 103,421 | 86,331 | 1,529 | fory (1.20x) |
## Why?



## What does this PR do?



## Related issues

apache#3349 

## Does this PR introduce any user-facing change?



- [ ] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?

## Benchmark
## Why?



## What does this PR do?

add fory perf optimization skill for ai agent

## Related issues

apache#3397 apache#3355 apache#3012 apache#1993 apache#2982
apache#1017 

## Does this PR introduce any user-facing change?



- [ ] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?

## Benchmark
## Why?



## What does this PR do?

apache#3387 

## Related issues



## Does this PR introduce any user-facing change?



- [ ] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?

## Benchmark
## Why?



## What does this PR do?



## Related issues

apache#1017 

## Does this PR introduce any user-facing change?



- [ ] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?

## Benchmark
…che#3468)

## Summary
- support typed `tuple[...]` dataclass fields in Python native-mode
field inference by routing them to `TupleSerializer`
- serialize instances without `__dict__` as zero-field objects so bare
`object()` round-trips cleanly
- add focused regressions for both issues, including empty-object ref
tracking behavior

## Testing
- cd python && ruff format --check pyfory/type_util.py pyfory/struct.py pyfory/serializer.py pyfory/tests/test_struct.py pyfory/tests/test_serializer.py
- cd python && ruff check pyfory/type_util.py pyfory/struct.py pyfory/serializer.py pyfory/tests/test_struct.py pyfory/tests/test_serializer.py
- cd python && ENABLE_FORY_CYTHON_SERIALIZATION=0 pytest -q .
- cd python && ENABLE_FORY_CYTHON_SERIALIZATION=1 pytest -q .

Closes apache#3466
Closes apache#3467
## Why?

Apache Fory is a performance-critical foundational serialization framework.
As AI-assisted contributions increase, we need a project policy that
keeps review quality high, protects legal/provenance safety, and
preserves long-term maintainability.

This policy is intended to:
- keep human accountability as the core rule
- reduce low-signal or unverified AI-generated changes
- require verifiable testing/performance evidence for sensitive paths
- align contribution practice with ASF legal and governance expectations

## What does this PR do?

- Adds a new top-level policy document: `AI_CONTRIBUTION_POLICY.md`.
- Defines requirements for AI-assisted contributions, including:
  - contributor responsibility and line-by-line self review
  - privacy-safe disclosure expectations
  - verification requirements (tests/spec/perf evidence where applicable)
  - licensing/provenance requirements
  - quality gate and maintainer enforcement process
- Updates `.github/pull_request_template.md` to add an AI contribution
checklist for PR authors.

## Related issues

N/A

## Does this PR introduce any user-facing change?

- [ ] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?

## Benchmark

N/A (documentation and PR template change only)
## Why?



## What does this PR do?



## Related issues



## AI Contribution Checklist (required when AI assistance = `yes`)




- [ ] Substantial AI assistance was used in this PR: `yes` / `no`
- [ ] If `yes`, I included the standardized AI Usage Disclosure block
below.
- [ ] If `yes`, I can explain and defend all important changes without
AI help.
- [ ] If `yes`, I reviewed AI-assisted code changes line by line before
submission.
- [ ] If `yes`, I ran adequate human verification and recorded evidence
(checks run locally or in CI, pass/fail summary, and confirmation I
reviewed results).
- [ ] If `yes`, I added/updated tests and specs where required.
- [ ] If `yes`, I validated protocol/performance impacts with evidence
when applicable.
- [ ] If `yes`, I verified licensing and provenance compliance.

AI Usage Disclosure (only when substantial AI assistance = `yes`):



```text
AI Usage Disclosure
- substantial_ai_assistance: yes
- scope: <design drafting | code drafting | refactor suggestions | tests | docs | other>
- affected_files_or_subsystems: <high-level paths/modules>
- human_verification: <checks run locally or in CI + pass/fail summary + contributor reviewed results>
- performance_verification: <N/A or benchmark/regression evidence summary>
- provenance_license_confirmation: <Apache-2.0-compatible provenance confirmed; no incompatible third-party code introduced>
```

## Does this PR introduce any user-facing change?



- [ ] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?

## Benchmark
…3382)

## What does this PR do?

Adds depth limiting for deserialization to prevent stack overflow and
denial-of-service attacks from maliciously crafted deeply nested data
structures.

## Why is this needed?

Without depth limits, an attacker could send deeply nested serialized
data that causes stack overflow during deserialization, crashing the
application or causing resource exhaustion.

## Implementation

- Added `maxDepth` config option (default: 50, minimum: 2)
- Depth tracked only during deserialization (security-focused)
- Integrated into code generator with try/finally for proper cleanup
- Comprehensive test coverage (29 tests)

## Usage

```typescript
const fory = new Fory({ maxDepth: 100 });
```

## Consistency
Follows the same pattern as Java and Python implementations for
cross-language alignment.

Fixes apache#3335

---------

Co-authored-by: chaokunyang <shawn.ck.yang@gmail.com>
…pache#3472)

## What does this PR do?
- Adds a no-progress guard to `DeflaterMetaCompressor.decompress` so
invalid/truncated deflate streams fail fast instead of spinning forever.
- Introduces `InvalidDataException` (subclass of `ForyException`) and
raises it for malformed/truncated meta-compression input.
- Adds `DeflaterMetaCompressorTest` coverage for roundtrip +
truncated/corrupt payload failures (with timeout guard).
- Registers `InvalidDataException` in `fory-core`
`native-image.properties`.

## Why is this needed?
Issue apache#3471 reports an infinite loop (`inflate() == 0` while `finished()
== false`) that can peg CPU at 100% for corrupt or truncated streams.

## Verification
- `cd java && mvn -T16 -pl fory-core test -Dtest=org.apache.fory.meta.DeflaterMetaCompressorTest`

Closes apache#3471
…K25 (apache#3476)

## Why?

Spotless fails when run in JDK25:
```
Execution default-cli of goal com.diffplug.spotless:spotless-maven-plugin:2.41.1:apply failed: An API incompatibility was encountered while executing com.diffplug.spotless:spotless-maven-plugin:2.41.1:apply: java.lang.NoSuchMethodError: 'java.util.Queue com.sun.tools.javac.util.Log$DeferredDiagnosticHandler.getDiagnostics()'
```

## What does this PR do?

Updates build plugins to newer versions with JDK25 support.
## Why?

Restores compact codec fixed width binary optimizations

## Related issues

Fixes apache#3117


Development

Successfully merging this pull request may close these issues.

[C++] add float16 to c++
