Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements writing length-prefixed and delimited structs #709

Merged
merged 17 commits into from
Feb 13, 2024
Merged

Conversation

zslayton
Copy link
Contributor

@zslayton zslayton commented Feb 12, 2024

This PR diverges from the specification-as-written to align with ion-java's implementation.

Rather than having two separate opcodes for structs with and without inline field names, we only use 0xC_ and start each struct using FlexUInt symbol ID field names. If/when a field name needs to be written with inline text, the write emits a 0x00 byte to communicate to the reader that we are switching from FlexUInt field names to FlexSym field names for all field names that follow.

This allows Ion 1.1 structs that do not leverage inline field names to be as compact as Ion 1.0 structs while preserving more opcode space for other uses. It also obviates the need for writers to guess which struct encoding they'll need at the outset of the struct since they'll be able to "upgrade" the struct's encoding on demand.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@zslayton zslayton merged commit aa2a29b into main Feb 13, 2024
20 checks passed
@zslayton zslayton deleted the write-structs branch February 13, 2024 14:17
if self.flex_uint_encoding {
// Write a zero byte out to signal to future readers that we are switching from
// FlexUInt to FlexSym.
self.container_writer.buffer().push(0x00);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Java we ended up using FlexInt zero (0x01) instead of 0x00. See https://github.com/amazon-ion/ion-java/pull/673/files

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Java, if nothing had been written yet we swapped out the opcode (to 0xFD: variable length, FlexSym fields) instead of writing the switch byte. I think the idea is that if you're using FlexSyms you're more likely to exceed the lower nibble anyway, so you can use the byte for the length instead and potentially save a byte. We can discuss whether this is worthwhile though. If it's not, then maybe 0xFD can be reclaimed. See https://github.com/amazon-ion/ion-java/pull/665/files#diff-1903a8a68f333503cde63de27c671d623ff68da4accf35aa1db835d2c3e6f783R339

Comment on lines +2680 to +2683
// Delimited struct
0xF3,
// FlexUInt symbol ID 4
0x09,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Matt and I were still operating under the assumption that delimited structs always had FlexSym field names: https://github.com/amazon-ion/ion-java/pull/665/files#diff-c491f83b80fd84cfed9e62a411f61789c10d9731d4afbb7fefe538738dd3cd27R407

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants