feat(rust): add configurable size guardrails by ayush00git · Pull Request #3579 · apache/fory

ayush00git · 2026-04-16T18:01:20Z

Why?

To prevent excessive allocation from malicious untrusted payloads in the Rust runtime.

What does this PR do?

This brings the Rust implementation into parity with the C++ runtime by introducing configurable guardrails for binary sizes and collection counts.
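For readers new to the idea, here is a minimal sketch of what a size guardrail does. It is not the code from this PR. `Limits`, `SizeError`, `read_binary`, and `read_u32_list` are hypothetical stand-ins, and the length-prefixed wire layout is invented for illustration. Only the limit names `max_collection_size` and `max_binary_size` come from this PR.

```rust
/// Hypothetical stand-in for the runtime's guardrail settings.
#[derive(Debug, Clone, Copy)]
pub struct Limits {
    pub max_collection_size: usize,
    pub max_binary_size: usize,
}

#[derive(Debug, PartialEq)]
pub enum SizeError {
    /// The length declared in the payload exceeds the configured limit.
    SizeLimitExceeded { declared: usize, limit: usize },
    /// The payload ended before the declared data could be read.
    UnexpectedEof,
}

/// Rejects a declared length that is above the configured limit.
fn check_len(declared: usize, limit: usize) -> Result<(), SizeError> {
    if declared > limit {
        Err(SizeError::SizeLimitExceeded { declared, limit })
    } else {
        Ok(())
    }
}

/// Reads a little-endian u32 length prefix (an invented layout for this sketch).
fn read_len_prefix(payload: &[u8]) -> Result<(usize, &[u8]), SizeError> {
    if payload.len() < 4 {
        return Err(SizeError::UnexpectedEof);
    }
    let (head, rest) = payload.split_at(4);
    let declared = u32::from_le_bytes([head[0], head[1], head[2], head[3]]) as usize;
    Ok((declared, rest))
}

/// Reads a length-prefixed byte blob, guarded by `max_binary_size`.
pub fn read_binary(payload: &[u8], limits: &Limits) -> Result<Vec<u8>, SizeError> {
    let (declared, rest) = read_len_prefix(payload)?;
    check_len(declared, limits.max_binary_size)?;
    let bytes = rest.get(..declared).ok_or(SizeError::UnexpectedEof)?;
    Ok(bytes.to_vec())
}

/// Reads a count-prefixed list of u32 values, guarded by `max_collection_size`.
/// The count is validated before `Vec::with_capacity` is called.
pub fn read_u32_list(payload: &[u8], limits: &Limits) -> Result<Vec<u32>, SizeError> {
    let (declared, rest) = read_len_prefix(payload)?;
    check_len(declared, limits.max_collection_size)?;
    let byte_len = declared.checked_mul(4).ok_or(SizeError::UnexpectedEof)?;
    let body = rest.get(..byte_len).ok_or(SizeError::UnexpectedEof)?;
    let mut out = Vec::with_capacity(declared);
    for chunk in body.chunks_exact(4) {
        out.push(u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]));
    }
    Ok(out)
}
```

In this sketch, a four-byte payload that declares `u32::MAX` elements is rejected with `SizeLimitExceeded` before any allocation is attempted.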

Related issues

#3409

AI Contribution Checklist

AI Usage Disclosure (only when substantial AI assistance = yes):

If yes, my PR description includes the required ai_review summary and screenshot evidence of the final clean AI review results from both fresh reviewers on the current PR diff or current HEAD after the latest code changes.

The implementation looks solid, matches the C++ reference implementation accurately, and is ready for the main branch. I've reviewed your code changes across config.rs, context.rs, and the serializer module, verifying edge cases and consistency.

Here are the key aspects that verify the correctness of the Rust implementation:

Guardrail Logic Parity with C++:

- max_collection_size correctly intercepts untrusted payload lengths before large iterations or memory allocations occur in Vec, HashMap, BTreeMap, and other collection variants. This perfectly encapsulates the C++ read_collection_data_slow patterns.
- max_binary_size accurately targets raw-byte slice operations across primitive_list.rs without inappropriately restricting types like String, mirroring how C++ omits this check inside string_serializer.h.

Accurate Error Propagation:

- The addition of Error::SizeLimitExceeded works brilliantly with the new Fory configuration model. You're properly halting deserialization and passing clear payload dimensions to Error::size_limit_exceeded, ensuring graceful failure without panics or memory exhaustion.

Validation and Pre-Allocation Consistency:

- Checking max_collection_size just prior to enforcing buffer limits inside check_collection_len and check_map_len is well-ordered. By prioritizing the runtime limit checks, you avoid false OOMs or deep iterations if Fory receives heavily corrupted payload headers.
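The ordering described in the review summary, together with the later commit that moves size check errors to a cold helper, can be sketched roughly as follows. This is an illustration under assumptions, not the merged code. The names `check_collection_len`, `Error::SizeLimitExceeded`, and `Error::size_limit_exceeded` appear in this PR, but the signatures, the `BufferTooShort` variant, and the "one byte per element" lower bound are invented here.

```rust
/// Local stand-in for the runtime's error type (fields are assumptions).
#[derive(Debug, PartialEq)]
pub enum Error {
    SizeLimitExceeded { kind: &'static str, declared: usize, limit: usize },
    BufferTooShort { needed: usize, remaining: usize },
}

impl Error {
    /// Error construction lives in a cold, never-inlined helper so that the
    /// hot path of the check stays a compare and a branch.
    #[cold]
    #[inline(never)]
    fn size_limit_exceeded(kind: &'static str, declared: usize, limit: usize) -> Error {
        Error::SizeLimitExceeded { kind, declared, limit }
    }
}

/// Validates a declared element count: first against the configured limit,
/// then against the bytes that actually remain in the buffer.
#[inline]
pub fn check_collection_len(
    declared: usize,
    max_collection_size: usize,
    remaining: usize,
) -> Result<(), Error> {
    // 1. Configured guardrail first, so an absurd header is reported as a
    //    limit violation rather than as a truncated buffer.
    if declared > max_collection_size {
        return Err(Error::size_limit_exceeded(
            "collection",
            declared,
            max_collection_size,
        ));
    }
    // 2. Sanity check against the buffer. This sketch assumes every element
    //    occupies at least one byte.
    if declared > remaining {
        return Err(Error::BufferTooShort { needed: declared, remaining });
    }
    Ok(())
}
```

With a limit of 10 and 5 bytes remaining, a declared count of 1000 fails the first check, while a declared count of 8 passes it and fails the second.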

Does this PR introduce any user-facing change?

Does this PR introduce any public API change? (Yes, adds Fory configuration methods max_binary_size() and max_collection_size(), as well as Error::SizeLimitExceeded)
Does this PR introduce any binary protocol compatibility change?

Benchmark

ayush00git · 2026-04-16T18:12:13Z

Hey @chaokunyang
Have a look at the implementation and let me know the changes.
I don't write Rust, but these issues (including streaming deserialization support) have been pending for a long time, so I used AI to write the code, and I have reviewed it.
Duplicate PR - #3421

chaokunyang · 2026-04-17T02:41:55Z

@ayush00git Could you run benchmarks/rust and compare with main branch?

ayush00git · 2026-04-17T05:44:17Z

main branch -

feat/rust-sizeguards branch -

ayush00git · 2026-04-17T05:47:45Z

Some areas, such as MediaContentList serialization/deserialization, Sample serialization, and StructList deserialization, show regressions averaging around 20%. I'll investigate these; most likely this is due to field type validations.

ayush00git · 2026-04-17T08:14:17Z

StructList and MediaContentList serialize calls still show around 10% regression.

feat/rust-sizeguards

## Benchmark Results

### Timing Results (nanoseconds)

| Datatype         | Operation   | fory (ns) | protobuf (ns) | Fastest |
| ---------------- | ----------- | --------- | ------------- | ------- |
| Struct           | Serialize   | 68.2      | 122.5         | fory    |
| Struct           | Deserialize | 37.9      | 64.8          | fory    |
| Sample           | Serialize   | 102.9     | 566.3         | fory    |
| Sample           | Deserialize | 162.6     | 868.7         | fory    |
| MediaContent     | Serialize   | 219.4     | 332.2         | fory    |
| MediaContent     | Deserialize | 280.4     | 599.8         | fory    |
| StructList       | Serialize   | 192.0     | 606.2         | fory    |
| StructList       | Deserialize | 143.2     | 444.3         | fory    |
| SampleList       | Serialize   | 391.3     | 4002.1        | fory    |
| SampleList       | Deserialize | 1279.0    | 4939.9        | fory    |
| MediaContentList | Serialize   | 856.0     | 2501.9        | fory    |
| MediaContentList | Deserialize | 1676.1    | 3206.9        | fory    |

### Throughput Results (ops/sec)

| Datatype         | Operation   | fory TPS   | protobuf TPS | Fastest |
| ---------------- | ----------- | ---------- | ------------ | ------- |
| Struct           | Serialize   | 14,665,552 | 8,161,267    | fory    |
| Struct           | Deserialize | 26,369,222 | 15,434,242   | fory    |
| Sample           | Serialize   | 9,721,007  | 1,765,880    | fory    |
| Sample           | Deserialize | 6,151,575  | 1,151,198    | fory    |
| MediaContent     | Serialize   | 4,558,716  | 3,010,144    | fory    |
| MediaContent     | Deserialize | 3,565,952  | 1,667,167    | fory    |
| StructList       | Serialize   | 5,208,605  | 1,649,702    | fory    |
| StructList       | Deserialize | 6,985,191  | 2,250,883    | fory    |
| SampleList       | Serialize   | 2,555,323  | 249,869      | fory    |
| SampleList       | Deserialize | 781,861    | 202,433      | fory    |
| MediaContentList | Serialize   | 1,168,170  | 399,696      | fory    |
| MediaContentList | Deserialize | 596,623    | 311,828      | fory    |

main

## Benchmark Results

### Timing Results (nanoseconds)

| Datatype         | Operation   | fory (ns) | protobuf (ns) | Fastest |
| ---------------- | ----------- | --------- | ------------- | ------- |
| Struct           | Serialize   | 67.5      | 123.3         | fory    |
| Struct           | Deserialize | 38.3      | 63.4          | fory    |
| Sample           | Serialize   | 101.4     | 561.7         | fory    |
| Sample           | Deserialize | 165.6     | 919.2         | fory    |
| MediaContent     | Serialize   | 213.0     | 332.2         | fory    |
| MediaContent     | Deserialize | 281.9     | 568.0         | fory    |
| StructList       | Serialize   | 175.2     | 678.8         | fory    |
| StructList       | Deserialize | 141.8     | 453.0         | fory    |
| SampleList       | Serialize   | 448.6     | 3831.5        | fory    |
| SampleList       | Deserialize | 1347.9    | 4977.6        | fory    |
| MediaContentList | Serialize   | 759.1     | 2429.7        | fory    |
| MediaContentList | Deserialize | 1665.3    | 3674.4        | fory    |

### Throughput Results (ops/sec)

| Datatype         | Operation   | fory TPS   | protobuf TPS | Fastest |
| ---------------- | ----------- | ---------- | ------------ | ------- |
| Struct           | Serialize   | 14,815,693 | 8,109,642    | fory    |
| Struct           | Deserialize | 26,132,177 | 15,766,902   | fory    |
| Sample           | Serialize   | 9,864,852  | 1,780,215    | fory    |
| Sample           | Deserialize | 6,040,471  | 1,087,903    | fory    |
| MediaContent     | Serialize   | 4,695,056  | 3,009,782    | fory    |
| MediaContent     | Deserialize | 3,547,861  | 1,760,563    | fory    |
| StructList       | Serialize   | 5,707,437  | 1,473,231    | fory    |
| StructList       | Deserialize | 7,052,684  | 2,207,652    | fory    |
| SampleList       | Serialize   | 2,229,008  | 260,994      | fory    |
| SampleList       | Deserialize | 741,895    | 200,900      | fory    |
| MediaContentList | Serialize   | 1,317,402  | 411,573      | fory    |
| MediaContentList | Deserialize | 600,492    | 272,153      | fory    |
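The "around 10%" figure can be checked directly from the two timing tables above. The helper below is a small sketch written for this thread (`percent_change` is not part of the benchmark tooling). It computes the relative change of the branch timing against the `main` timing.

```rust
/// Relative change of `candidate_ns` versus `baseline_ns`, in percent.
/// Positive values mean the candidate is slower.
pub fn percent_change(baseline_ns: f64, candidate_ns: f64) -> f64 {
    (candidate_ns - baseline_ns) / baseline_ns * 100.0
}

/// (datatype, main ns, feat/rust-sizeguards ns) for the Serialize rows
/// discussed in the comment, copied from the tables above.
pub const SERIALIZE_ROWS: [(&str, f64, f64); 2] = [
    ("StructList", 175.2, 192.0),
    ("MediaContentList", 759.1, 856.0),
];

/// Formats one report line per row, e.g. for printing in a review comment.
pub fn report() -> Vec<String> {
    SERIALIZE_ROWS
        .iter()
        .map(|(name, main_ns, branch_ns)| {
            format!("{name} Serialize: {:+.1}%", percent_change(*main_ns, *branch_ns))
        })
        .collect()
}
```

For these rows the change works out to roughly +9.6% for StructList and +12.8% for MediaContentList. Single runs like these do not separate real regressions from machine noise.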

ayush00git · 2026-04-17T14:02:06Z

@chaokunyang
The benchmarks are now fine. Only MediaContentList serialization shows a regression of around 6%; every other measurement is either within noise or improved.

updated feat/rust-sizeguards bench -

### Timing Results (nanoseconds)

| Datatype         | Operation   | fory (ns) | protobuf (ns) | Fastest |
| ---------------- | ----------- | --------- | ------------- | ------- |
| Struct           | Serialize   | 74.2      | 138.3         | fory    |
| Struct           | Deserialize | 38.4      | 69.8          | fory    |
| Sample           | Serialize   | 102.5     | 578.6         | fory    |
| Sample           | Deserialize | 164.8     | 881.1         | fory    |
| MediaContent     | Serialize   | 229.7     | 330.2         | fory    |
| MediaContent     | Deserialize | 287.9     | 523.5         | fory    |
| StructList       | Serialize   | 172.4     | 652.9         | fory    |
| StructList       | Deserialize | 144.6     | 413.2         | fory    |
| SampleList       | Serialize   | 399.5     | 3725.5        | fory    |
| SampleList       | Deserialize | 1269.1    | 5017.5        | fory    |
| MediaContentList | Serialize   | 808.5     | 2440.5        | fory    |
| MediaContentList | Deserialize | 1646.0    | 3546.2        | fory    |

### Throughput Results (ops/sec)

| Datatype         | Operation   | fory TPS   | protobuf TPS | Fastest |
| ---------------- | ----------- | ---------- | ------------ | ------- |
| Struct           | Serialize   | 13,468,195 | 7,231,181    | fory    |
| Struct           | Deserialize | 26,035,565 | 14,319,262   | fory    |
| Sample           | Serialize   | 9,754,194  | 1,728,250    | fory    |
| Sample           | Deserialize | 6,069,066  | 1,134,996    | fory    |
| MediaContent     | Serialize   | 4,352,936  | 3,028,376    | fory    |
| MediaContent     | Deserialize | 3,473,790  | 1,910,220    | fory    |
| StructList       | Serialize   | 5,801,474  | 1,531,581    | fory    |
| StructList       | Deserialize | 6,917,064  | 2,419,960    | fory    |
| SampleList       | Serialize   | 2,503,317  | 268,420      | fory    |
| SampleList       | Deserialize | 787,960    | 199,302      | fory    |
| MediaContentList | Serialize   | 1,236,889  | 409,752      | fory    |
| MediaContentList | Deserialize | 607,533    | 281,992      | fory    |

ayush00git · 2026-04-17T14:10:23Z

@chaokunyang
Do we plan to roll out this PR and the streaming deserialization support (which is still pending) in the v0.17.0 release?

chaokunyang · 2026-04-18T08:49:17Z

This PR still introduces some performance regression, so we can't merge it.

chaokunyang · 2026-04-18T09:02:46Z

We will support stream mode in Fory Rust only if it does not introduce any performance regression.

ayush00git · 2026-04-19T08:53:05Z

> We will support stream mode in Fory Rust only if it does not introduce any performance regression.

@chaokunyang could you please share the terminal output, or an example of the operations that show regressions? I did a lot of debugging: regressions appear in the f32/f64 write path and sometimes in a few other operations, and I think some of these are just noise. The size guardrails should have introduced regressions in the read path, but they didn't. The most likely cause of the regression in the write path is the buffer pre-reservation that I made in this commit.

I'm investigating further. If you find a likely cause when running the benchmarks on your machine, please guide me on which regressions were real and which were just noise.
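To make the "buffer pre-reservation" trade-off concrete, here is a hedged sketch. It is not the code from the commit. `Writer` and `write_values` are hypothetical, and the sketch only contrasts reserving capacity once up front with letting a `Vec<u8>` grow on demand.

```rust
/// Hypothetical byte writer that counts how often its buffer is reallocated.
pub struct Writer {
    buf: Vec<u8>,
    reallocations: usize,
}

impl Writer {
    /// Starts with an empty buffer that grows on demand.
    pub fn new() -> Self {
        Writer { buf: Vec::new(), reallocations: 0 }
    }

    /// Starts with at least `bytes` of capacity reserved up front.
    pub fn with_reserved(bytes: usize) -> Self {
        Writer { buf: Vec::with_capacity(bytes), reallocations: 0 }
    }

    /// Appends one f64 in little-endian order and records capacity changes.
    pub fn write_f64(&mut self, value: f64) {
        let before = self.buf.capacity();
        self.buf.extend_from_slice(&value.to_le_bytes());
        if self.buf.capacity() != before {
            self.reallocations += 1;
        }
    }

    pub fn len(&self) -> usize {
        self.buf.len()
    }

    pub fn reallocations(&self) -> usize {
        self.reallocations
    }
}

/// Writes all values, optionally reserving the exact byte count first.
pub fn write_values(values: &[f64], reserve_up_front: bool) -> Writer {
    let mut writer = if reserve_up_front {
        Writer::with_reserved(values.len() * 8)
    } else {
        Writer::new()
    };
    for &v in values {
        writer.write_f64(v);
    }
    writer
}
```

Reserving avoids reallocations during the writes, but the reservation itself is work done on every call. Whether that nets out as a win for small messages is exactly what a benchmark has to decide.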

chaokunyang · 2026-04-19T12:58:50Z

-                Ok(Box::new(WriteContext::new(type_resolver.clone(), config)))
+                Ok(Box::new(WriteContext::new(
+                    type_resolver.clone(),
+                    self.config.clone(),


Why do you clone config?

It was cloned previously as well. I just moved the clone inside the creation closure, which avoids accessing it on every serialize call and uses the cached context instead.
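The difference described in this reply can be sketched as follows. This is an illustration only. `Config`, `WriteContext`, and `Serializer` here are simplified stand-ins, not the types from the Fory crate, and the clone counter exists purely to make the effect observable.

```rust
use std::cell::{Cell, OnceCell};
use std::rc::Rc;

/// Stand-in config whose `clone` bumps a shared counter.
pub struct Config {
    pub max_collection_size: usize,
    clones: Rc<Cell<usize>>,
}

impl Clone for Config {
    fn clone(&self) -> Self {
        self.clones.set(self.clones.get() + 1);
        Config {
            max_collection_size: self.max_collection_size,
            clones: Rc::clone(&self.clones),
        }
    }
}

/// Stand-in for a cached per-serializer write context.
pub struct WriteContext {
    config: Config,
}

impl WriteContext {
    pub fn max_collection_size(&self) -> usize {
        self.config.max_collection_size
    }
}

pub struct Serializer {
    config: Config,
    cached: OnceCell<WriteContext>,
}

impl Serializer {
    pub fn new(max_collection_size: usize) -> Self {
        let config = Config { max_collection_size, clones: Rc::new(Cell::new(0)) };
        Serializer { config, cached: OnceCell::new() }
    }

    /// Clones the config before the lookup, so the clone runs on every call
    /// even when the context is already cached.
    pub fn context_clone_per_call(&self) -> &WriteContext {
        let config = self.config.clone();
        self.cached.get_or_init(move || WriteContext { config })
    }

    /// Clones the config inside the creation closure, so the clone runs only
    /// when the context is first created.
    pub fn context_clone_in_closure(&self) -> &WriteContext {
        self.cached
            .get_or_init(|| WriteContext { config: self.config.clone() })
    }

    pub fn clone_count(&self) -> usize {
        self.config.clones.get()
    }
}
```

Calling `context_clone_per_call` three times performs three clones, while three calls to `context_clone_in_closure` perform one.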

ayush00git · 2026-04-19T14:36:57Z

@chaokunyang have a look at the terminal output now. I fixed the CPU performance setting, and the run no longer shows any regression.

fory/benchmarks/rust on main [$✘?] is v0.17.0-alpha.0 via v3.14.3 via v1.95.0 on (us-east-1) took 18s
❯ cargo bench --bench buffer_write_bench -- --save-baseline main
    Finished `bench` profile [optimized] target(s) in 0.06s
     Running benches/buffer_write_bench.rs (target/release/deps/buffer_write_bench-011c6e0ee322c976)
Gnuplot not found, using plotters backend
write_u8/current        time:   [6.9713 µs 6.9928 µs 7.0194 µs]
                        thrpt:  [142.46 Melem/s 143.00 Melem/s 143.44 Melem/s]
Found 15 outliers among 100 measurements (15.00%)
  8 (8.00%) high mild
  7 (7.00%) high severe

write_i32/current       time:   [741.79 ns 743.48 ns 745.28 ns]
                        thrpt:  [1.3418 Gelem/s 1.3450 Gelem/s 1.3481 Gelem/s]
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  7 (7.00%) high mild
  1 (1.00%) high severe

write_i64/current       time:   [749.40 ns 751.35 ns 753.40 ns]
                        thrpt:  [1.3273 Gelem/s 1.3309 Gelem/s 1.3344 Gelem/s]
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

write_f32/current       time:   [827.00 ns 834.11 ns 841.56 ns]
                        thrpt:  [1.1883 Gelem/s 1.1989 Gelem/s 1.2092 Gelem/s]
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

write_f64/current       time:   [860.26 ns 866.46 ns 872.89 ns]
                        thrpt:  [1.1456 Gelem/s 1.1541 Gelem/s 1.1624 Gelem/s]
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe

write_varint32_small/current
                        time:   [894.10 ns 898.41 ns 902.69 ns]
                        thrpt:  [1.1078 Gelem/s 1.1131 Gelem/s 1.1184 Gelem/s]
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  6 (6.00%) high mild
  3 (3.00%) high severe

write_varint32_medium/current
                        time:   [1.4174 µs 1.4189 µs 1.4207 µs]
                        thrpt:  [703.90 Melem/s 704.77 Melem/s 705.54 Melem/s]
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low mild
  7 (7.00%) high mild
  3 (3.00%) high severe

write_varint32_large/current
                        time:   [1.6701 µs 1.6717 µs 1.6734 µs]
                        thrpt:  [597.59 Melem/s 598.20 Melem/s 598.77 Melem/s]
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

write_varint64_small/current
                        time:   [1.1521 µs 1.1594 µs 1.1669 µs]
                        thrpt:  [856.95 Melem/s 862.49 Melem/s 868.01 Melem/s]
Found 14 outliers among 100 measurements (14.00%)
  1 (1.00%) low severe
  3 (3.00%) low mild
  10 (10.00%) high mild

write_varint64_medium/current
                        time:   [1.7693 µs 1.7748 µs 1.7802 µs]
                        thrpt:  [561.73 Melem/s 563.45 Melem/s 565.18 Melem/s]

write_varint64_large/current
                        time:   [2.8024 µs 2.8496 µs 2.9050 µs]
                        thrpt:  [344.24 Melem/s 350.93 Melem/s 356.84 Melem/s]
Found 38 outliers among 100 measurements (38.00%)
  16 (16.00%) low severe
  6 (6.00%) low mild
  1 (1.00%) high mild
  15 (15.00%) high severe
  
fory/benchmarks/rust on feat/rust-sizeguards [$✘?] is v0.17.0-alpha.0 via v3.14.3 via v1.95.0 on (us-east-1) took 5s
❯ cargo bench --bench buffer_write_bench -- --baseline main
    Finished `bench` profile [optimized] target(s) in 0.06s
     Running benches/buffer_write_bench.rs (target/release/deps/buffer_write_bench-011c6e0ee322c976)
Gnuplot not found, using plotters backend
write_u8/current        time:   [6.8933 µs 6.9147 µs 6.9399 µs]
                        thrpt:  [144.09 Melem/s 144.62 Melem/s 145.07 Melem/s]
                 change:
                        time:   [-2.0992% -1.4519% -0.9010%] (p = 0.00 < 0.05)
                        thrpt:  [+0.9091% +1.4733% +2.1442%]
                        Change within noise threshold.
Found 20 outliers among 100 measurements (20.00%)
  6 (6.00%) high mild
  14 (14.00%) high severe

write_i32/current       time:   [729.41 ns 732.87 ns 736.71 ns]
                        thrpt:  [1.3574 Gelem/s 1.3645 Gelem/s 1.3710 Gelem/s]
                 change:
                        time:   [-2.2545% -1.4650% -0.3802%] (p = 0.00 < 0.05)
                        thrpt:  [+0.3817% +1.4868% +2.3065%]
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe

write_i64/current       time:   [741.69 ns 743.49 ns 745.33 ns]
                        thrpt:  [1.3417 Gelem/s 1.3450 Gelem/s 1.3483 Gelem/s]
                 change:
                        time:   [-1.2781% -0.9131% -0.5686%] (p = 0.00 < 0.05)
                        thrpt:  [+0.5718% +0.9215% +1.2947%]
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

write_f32/current       time:   [814.51 ns 818.90 ns 823.47 ns]
                        thrpt:  [1.2144 Gelem/s 1.2211 Gelem/s 1.2277 Gelem/s]
                 change:
                        time:   [-2.0663% -1.1408% -0.1963%] (p = 0.02 < 0.05)
                        thrpt:  [+0.1966% +1.1540% +2.1099%]
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

write_f64/current       time:   [823.12 ns 827.97 ns 833.88 ns]
                        thrpt:  [1.1992 Gelem/s 1.2078 Gelem/s 1.2149 Gelem/s]
                 change:
                        time:   [-6.2183% -4.5369% -2.9123%] (p = 0.00 < 0.05)
                        thrpt:  [+2.9997% +4.7525% +6.6306%]
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

write_varint32_small/current
                        time:   [899.07 ns 905.35 ns 914.34 ns]
                        thrpt:  [1.0937 Gelem/s 1.1045 Gelem/s 1.1123 Gelem/s]
                 change:
                        time:   [-0.7350% +0.7029% +2.6802%] (p = 0.46 > 0.05)
                        thrpt:  [-2.6103% -0.6980% +0.7405%]
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  2 (2.00%) high mild
  3 (3.00%) high severe

write_varint32_medium/current
                        time:   [1.4128 µs 1.4142 µs 1.4159 µs]
                        thrpt:  [706.29 Melem/s 707.09 Melem/s 707.83 Melem/s]
                 change:
                        time:   [-0.6195% -0.3812% -0.1441%] (p = 0.00 < 0.05)
                        thrpt:  [+0.1443% +0.3827% +0.6233%]
                        Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  6 (6.00%) high mild
  3 (3.00%) high severe

write_varint32_large/current
                        time:   [1.6586 µs 1.6692 µs 1.6780 µs]
                        thrpt:  [595.95 Melem/s 599.07 Melem/s 602.92 Melem/s]
                 change:
                        time:   [-0.5057% -0.1241% +0.1752%] (p = 0.49 > 0.05)
                        thrpt:  [-0.1749% +0.1243% +0.5083%]
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  3 (3.00%) high mild
  4 (4.00%) high severe

write_varint64_small/current
                        time:   [1.1234 µs 1.1247 µs 1.1260 µs]
                        thrpt:  [888.12 Melem/s 889.16 Melem/s 890.17 Melem/s]
                 change:
                        time:   [-2.1981% -1.4802% -0.7546%] (p = 0.00 < 0.05)
                        thrpt:  [+0.7604% +1.5024% +2.2475%]
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) low severe
  3 (3.00%) high mild
  2 (2.00%) high severe

write_varint64_medium/current
                        time:   [1.7863 µs 1.7907 µs 1.7945 µs]
                        thrpt:  [557.25 Melem/s 558.43 Melem/s 559.83 Melem/s]
                 change:
                        time:   [-0.1626% +0.2047% +0.5850%] (p = 0.27 > 0.05)
                        thrpt:  [-0.5816% -0.2042% +0.1629%]
                        No change in performance detected.

write_varint64_large/current
                        time:   [2.7535 µs 2.7624 µs 2.7721 µs]
                        thrpt:  [360.74 Melem/s 362.00 Melem/s 363.17 Melem/s]
                 change:
                        time:   [-3.0492% -1.9645% -1.0085%] (p = 0.00 < 0.05)
                        thrpt:  [+1.0188% +2.0038% +3.1451%]
                        Performance has improved.

chaokunyang

LGTM

feat(rust): add configurable size guardrails

7e31257

ayush00git requested review from chaokunyang and theweipeng as code owners April 16, 2026 18:01

feat: added size guard tests

8e058cf

fix: moved size check errors to a cold helper

a4f38ad

fix: move config.clone() to creation closure and defined a pre-reserved buffer space

91c120d

chaokunyang reviewed Apr 19, 2026

chaokunyang approved these changes Apr 19, 2026

chaokunyang merged commit 3e94a45 into apache:main Apr 19, 2026
62 checks passed

This was referenced Apr 19, 2026

feat(rust): add configurable size guardrails #3421

Open

[Rust] configurable size guardrails for untrusted payloads #3409

Open
