Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Backward compatibility (default value for fields / ignoring extra data) #373

Open
kdy1 opened this issue Mar 30, 2023 · 10 comments
Milestone

Comments

@kdy1
Copy link
Sponsor

kdy1 commented Mar 30, 2023

I'm the author of the swc project.

I wonder if rkyv can support some sort of backward compatibility, as we pass the whole AST to the Wasm plugin after serializing it using rkyv.

Most of AST changes are about new fields, and a default value is enough in most cases because the changes are mostly about TypeScript, which is stripped out before invoking Wasm plugins.
But we still have TypeScript data in AST, so the layout of archived value changes even if the value is always e.g. false.

@djkoloski
Copy link
Collaborator

Thanks for the context. There has been a lot of interest in this and there are a few ways to maintain backwards compatibility.

Backwards compatibility can be maintained manually as demonstrated in the backwards_compat example. This requires some special care to, for example, box types that may evolve over time. All of that should be configurable via rkyv's wrapper type system (e.g. #[with(AsBox)] to box types in the archived format).

I'm also familiar with protoss by @fu5ha, which aims to solve backward- and forward-compatibility for rkyv. IIUC it's still relatively early on in development but has a lot of promise. We're both pretty busy in general though; some spare hands would be greatly appreciated to get it over the finish line.

@kdy1
Copy link
Sponsor Author

kdy1 commented Mar 31, 2023

Thank you for the pointer!

I think I can create a proc-macro that can be used to duplicate AST type without new fields.
I'm a bit worried about the binary size, but I'll try when I have some time.

@kdy1
Copy link
Sponsor Author

kdy1 commented Mar 31, 2023

@djkoloski I thought about it more deeply, but I think it's going to be hard because I have to pay attention to the size of the binary.
Is it possible to add some defaults for the fields? If it's possible in the architecture of rkyv, I can work on it

@fu5ha
Copy link
Contributor

fu5ha commented Mar 31, 2023

If you're only using nested structs and no enums, protoss may actually be able to fit the bill already--just mind the un-production-testedness :)

@kdy1
Copy link
Sponsor Author

kdy1 commented Mar 31, 2023

SWC uses lots of enums, so I think I can't use it.
But I have to consider the binary size, too. Because of it, the best solution for us is to support the default value, even if it has a noticeable performance cost.

rmp-serde has to_vec and to_vec_named. The former does not store field names, and the latter does so. I think sort of default requires encoding field names, right?

@kwonoj
Copy link

kwonoj commented May 12, 2023

@djkoloski

Mind if I ask one q regarding how to deal copying / allocate AsBox marked field into another memory space?

SWC's host allocates memory in wasm plugin's memory space, then copy serialized struct into. When VersionedSerializble's field is marked with AsBox, it (looks like) the serialized struct contains pointer to the given field but not the actual value, in result deserialized struct in the guest loses value, i.e VersionedSerializable::new(Some("string")) -> Some("").

@djkoloski
Copy link
Collaborator

To make sure I understand correctly, you're serializing a type that includes a #[with(AsBox)] and then copying the result into a separate memory space. Is that accurate? If so, it should work correctly as long as you copy the entire serialized buffer into the wasm plugin's memory space. Are you only copying part of the serialized buffer into the wasm plugin memory space?

@kwonoj
Copy link

kwonoj commented May 13, 2023

you're serializing a type that includes a #[with(AsBox)] and then copying the result into a separate memory space. Is that accurate?

Yes.

Are you only copying part of the serialized buffer into the wasm plugin memory space?

I don't think I do special to copy part of it. Pseudocodewise, what I'm doing is basically

let versioned = Versisoned(Some("string".to_string()));

let bytes_field = rkyv::to_bytes::<_, 512>(versioned)
let ptr = bytes_field.as_ptr();
let ptr_len = bytes_field.len();

// allocate guest memory with serialized bytes' len, then copy serialized slice
guest_memory_write(bytes_field.as_slice(), ptr_len);

// in guest, read & deserialize

The exact same logic works without AsBox for T. Or should there be additional steps to get full serialized slices with AsBox instead?

@djkoloski
Copy link
Collaborator

Hm, if you're copying all of the bytes into the target memory space then you should have the exact same values as before. Do you have a repro for this? I'd be happy to take a look and see if I can dig up any additional information.

@kwonoj
Copy link

kwonoj commented May 14, 2023

The current PR I'm working on is swc-project/swc#7382, and this https://github.com/swc-project/swc/pull/7382/files#diff-4d01e730818bcc35a17d65dd364390283fd29e3fb1dc0e35050a67034eeb474dR156 is the struct can repro if you enable AsBox.

The offending test case is here https://github.com/swc-project/swc/blob/a2577adffa846a9e9010be7f42378170cfec9338/crates/swc_plugin_runner/tests/fixture/swc_internal_plugin/src/lib.rs#L80-L82

which verifies plugin copied host's value - if enable AsBox, those assertion will fail as string is empty.

@djkoloski djkoloski added this to the Backlog milestone Mar 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants