Skip to content

Set the default preallocation mode of binary IonWriter to 1.#481

Merged
linlin-s merged 1 commit intomasterfrom
preallocation1
Apr 13, 2023
Merged

Set the default preallocation mode of binary IonWriter to 1.#481
linlin-s merged 1 commit intomasterfrom
preallocation1

Conversation

@linlin-s
Copy link
Copy Markdown
Contributor

Issue #, if available:
According to issue #421

Length preallocation is a setting for the binary IonWriter that trades data size for write speed. When enabled, the writer will write N placeholder bytes in the buffer to eventually store the encoded container's length. This causes the container to be less compact, but simplifies the buffer management and the writer must perform enough to offer a speed boost.

Description of changes:

Motivation:

Now that 1-byte preallocation does not result in a data size penalty (#401), changing the default preallocation mode from 0 to 1 can improve binary write performance without inflating the encoding.

When the container length is:

  • < 14:
    The body of this container/wrapper is small enough that its length can fit in the lower nibble of the type descriptor byte; we don't need the extra 1 byte that was preallocated. Shift the encoded body of the container/wrapper backwards in the buffer to reclaim the extra 1 byte. The action of shift back will usually be a very fast memcpy.
  • >= 14 and < 128:
    The container's encoded body is too long to fit the length in the type descriptor byte, but it will fit in the preallocated length byte that was added to the buffer when the container was started. Update that bytes with the VarUInt encoding of the length value. In this case, Ion binary writing will enjoy the speed boost from preallocation.
  • >= 128:
    The container's encoded body is too long to fit in the length byte that was preallocated. We will use PatchPoint to write the VarUInt encoding of the length in a secondary buffer and include that data when we flush the primary buffer to the output stream.

Here are the benchmark results which indicate the speed boost:

  • The binary IonWriter using preallocation mode 1 shows 16.58% faster for dataset log.10n comparing to binary IonWriter with default preallocation mode 0.
    Full results here.

  • The binary IonWriter using preallocation mode 1 shows 6.28% faster for dataset item.10n comparing to binary IonWriter with default preallocation mode 0.
    Full results here.

  • The binary IonWriter using preallocation mode 1 shows 11.11% faster for dataset event.10n comparing to binary IonWriter with default preallocation mode 0.
    Full results here.

  • The binary IonWriter using preallocation mode 1 shows 16.27% faster for dataset complex.10n comparing to binary IonWriter with default preallocation mode 0.
    Full results here.

Why set the default to 1 instead of 2?

According to the performance benchmark results, when we set the preallocation mode to 2, performance is even better. However, extra preallocated space is not reclaimed in this case when containers are >= 14 and < 128 bytes, which might causes the serialized size to inflate. Setting the default to 1 for now avoids that tradeoff.

Next step:

  • Set the default preallocation mode to 2, reclaim the extra preallocated byte when the container length is >= 14 and < 128 bytes, then benchmark the new change to see if there’s further performance improvements.
  • Explore this option from the comment: Compacts preallocated headers when possible #401 (review)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 12, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.02 ⚠️

Comparison is base (16a4956) 66.84% compared to head (d7cc7a2) 66.82%.

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #481      +/-   ##
============================================
- Coverage     66.84%   66.82%   -0.02%     
- Complexity     5417     5419       +2     
============================================
  Files           156      156              
  Lines         22728    22728              
  Branches       4087     4087              
============================================
- Hits          15193    15189       -4     
- Misses         6240     6242       +2     
- Partials       1295     1297       +2     
Impacted Files Coverage Δ
...azon/ion/impl/_Private_IonBinaryWriterBuilder.java 83.18% <100.00%> (ø)

... and 2 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Copy Markdown
Contributor

@zslayton zslayton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@linlin-s linlin-s merged commit a3d03cd into master Apr 13, 2023
@linlin-s linlin-s deleted the preallocation1 branch January 16, 2024 19:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants