[hxb] Future compatibility #11505

Closed
Simn opened this issue Jan 23, 2024 · 4 comments

Comments

@Simn
Member

Simn commented Jan 23, 2024

One of the most important aspects we must get right in the design is future compatibility. Ideally, any hxb generated by Haxe version N would work with all Haxe versions M < N. This is quite challenging to design properly, so input is welcome!

For the time being, I have two main ideas:

  1. The reader should ignore unknown chunks. This will allow future versions to add additional chunks without breaking compatibility.
  2. We need some way to "extend/modify" ADT information at a lower level. For instance, consider something like Coroutine Experiments #10128 which would add an additional argument to something as central as TFun.
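As a rough illustration of the first idea, here is a minimal sketch of a reader that skips chunks it does not recognise. The chunk layout (4-byte tag, 4-byte big-endian length, payload), the tag names, and the helper names are assumptions made for this example and are not taken from the actual hxb format.

```python
import io
import struct

# Hypothetical chunk layout (an assumption, not the real hxb format):
# 4-byte ASCII tag, 4-byte big-endian payload length, then the payload.
KNOWN_CHUNKS = {b"STRI", b"TYPF"}  # illustrative tags only


def write_chunk(tag: bytes, payload: bytes) -> bytes:
    """Serialise one chunk in the hypothetical layout."""
    return tag + struct.pack(">I", len(payload)) + payload


def read_chunks(data: bytes) -> dict:
    """Return the known chunks, skipping any tag the reader does not know."""
    stream = io.BytesIO(data)
    chunks = {}
    while True:
        header = stream.read(8)
        if not header:
            break  # clean end of input
        if len(header) < 8:
            raise ValueError("truncated chunk header")
        tag = header[:4]
        length = struct.unpack(">I", header[4:])[0]
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated chunk payload")
        if tag in KNOWN_CHUNKS:
            chunks[tag] = payload
        # Unknown tags are dropped here; the length prefix is what lets
        # the reader step over a payload it cannot interpret.
    return chunks
```

The point of the sketch is only that skipping requires every chunk to carry its own length, independent of its contents.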

One way to handle this would be to reserve byte 255 as an "extension", the semantics of which then depend on what we're extending. For the concrete example of the coroutine TFun, it would look something like this when reading a type instance:

0xFF // extension byte
0x00 // the kind of extension, in this case the semantic meaning would be "coroutine flag"
length // length of the payload
payload // bytes of the payload, interpreted depending on the extension kind. For the coroutine example, this would just be a bool
TFun type // the type instance we're "extending"

The reader can then, for all extension bytes, first read the kind and the payload. If it is aware of the semantics of the kind, it can process that accordingly. Otherwise, it ignores this information.
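The reading rule described above could be sketched roughly as follows. This is an illustration under several assumptions: the length is a single byte, extensions may be stacked, `TAG_TFUN` is a made-up tag value, and the TFun body is reduced to a single arity byte.

```python
import io

EXTENSION = 0xFF      # reserved extension byte from the proposal
EXT_COROUTINE = 0x00  # extension kind used in the coroutine example
TAG_TFUN = 0x0A       # hypothetical tag value for TFun


def read_byte(stream) -> int:
    b = stream.read(1)
    if not b:
        raise ValueError("unexpected end of input")
    return b[0]


def read_type_instance(stream, known_kinds) -> dict:
    """Read one type instance, keeping known extensions and skipping others."""
    extensions = {}
    tag = read_byte(stream)
    while tag == EXTENSION:  # assumption: extensions may be stacked
        kind = read_byte(stream)
        length = read_byte(stream)  # assumption: one-byte length
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated extension payload")
        if kind in known_kinds:
            extensions[kind] = payload
        # Unknown kinds are ignored; the payload has already been consumed.
        tag = read_byte(stream)
    if tag != TAG_TFUN:
        raise ValueError(f"unsupported type tag {tag:#x}")
    arity = read_byte(stream)  # simplified stand-in for the real TFun body
    return {"tag": "TFun", "arity": arity, "extensions": extensions}
```

A reader called with an empty `known_kinds` set behaves like an older compiler: it still arrives at the TFun, but without the coroutine information.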

I'm making this up as I'm writing the issue, so let me know if this makes any sense!

@Simn
Member Author

Simn commented Jan 23, 2024

This extension byte approach would also provide a decent amount of backwards compatibility. Going with the coroutine example, let's say the current hxb reader knows how to read TFun(args, ret). We add the additional argument to it and use the extension byte, so the encoding essentially becomes TExtend(Coroutine, true, TFun(args, ret)). This means that, by design, the reader still has to be able to read the TFun(args, ret) itself, in this case by adding a default false argument to the internal structure. And this in turn means that it is also still able to read this from an old representation, using the same default.
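To make the default-argument point concrete, here is a small sketch in which old and new encodings decode into the same structure. The tag value, the one-byte length, and the `arity` field standing in for `(args, ret)` are all assumptions for illustration.

```python
from dataclasses import dataclass

EXTENSION = 0xFF
EXT_COROUTINE = 0x00
TAG_TFUN = 0x0A  # hypothetical tag value


@dataclass
class TFun:
    arity: int               # simplified stand-in for (args, ret)
    coroutine: bool = False  # default used when no extension is present


def encode(t: TFun) -> bytes:
    body = bytes([TAG_TFUN, t.arity])
    if t.coroutine:
        # TExtend(Coroutine, true, TFun(...)): a prefix, then the unchanged TFun
        return bytes([EXTENSION, EXT_COROUTINE, 1, 1]) + body
    return body  # byte-for-byte identical to the old representation


def decode(data: bytes) -> TFun:
    pos = 0
    coroutine = False  # same default for old data and for skipped extensions
    while data[pos] == EXTENSION:
        kind, length = data[pos + 1], data[pos + 2]
        payload = data[pos + 3 : pos + 3 + length]
        if kind == EXT_COROUTINE:
            coroutine = payload == b"\x01"
        pos += 3 + length
    if data[pos] != TAG_TFUN:
        raise ValueError("expected TFun")
    return TFun(arity=data[pos + 1], coroutine=coroutine)
```

Because a non-coroutine function is written exactly as before, data produced without the extension and data produced with it both go through the same `decode` path.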

This should be really good for all the variant data which already switches on a byte read. The other thing we'll need is a common way to extend record types.

@ncannasse
Member

I've dealt with many file formats in the past, and the easiest approach I've found is the following:

  • make your data writing as simple as possible
  • add a version in the header
  • at read time only, support some (and not all) previous versions data encoding
  • don't deal with forward compatibility, or allow a multi version binary format with automatic version selection
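A minimal sketch of the versioned-header approach listed above might look like this. The magic bytes, version numbers, and the integer payload are hypothetical; only the structure (simple writer, version in the header, reader supporting a subset of older versions) follows the list.

```python
import struct

MAGIC = b"DEMO"      # hypothetical magic bytes
CURRENT_VERSION = 3  # hypothetical version numbers throughout


def write_file(value: int) -> bytes:
    # The writer stays simple: it only ever emits the current encoding.
    return MAGIC + bytes([CURRENT_VERSION]) + struct.pack(">I", value)


def _read_v2(body: bytes) -> int:
    return struct.unpack(">H", body)[0]  # older 16-bit encoding


def _read_v3(body: bytes) -> int:
    return struct.unpack(">I", body)[0]  # current 32-bit encoding


# Only some previous versions are supported; version 1 is deliberately dropped.
READERS = {2: _read_v2, 3: _read_v3}


def read_file(data: bytes) -> int:
    if len(data) < 5 or data[:4] != MAGIC:
        raise ValueError("not a DEMO file")
    version = data[4]
    reader = READERS.get(version)
    if reader is None:
        # Covers dropped old versions and newer ones alike:
        # no attempt at forward compatibility.
        raise ValueError(f"unsupported format version {version}")
    return reader(data[5:])
```

All compatibility logic lives on the read side, and a file from a newer writer is rejected up front rather than partially interpreted.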

@back2dos
Member

I'm not sure future-proofing is practical. Let's assume we have Haxe 5 without coroutines and Haxe 5.1 with coroutines, and an hxb produced by the latter. What good is there in ingesting it from the former if it cannot honor the semantics of what's in there? If all coroutines become plain functions, can we even expect the code to compile? I mean, that's pretty close to designing .hx syntax in a way that older Haxe versions can parse newer ones, even if they contain syntax that cannot be understood.

Personally, I would consider backward compatibility more achievable and also quite useful. I think there's really something to be said for distributing some libraries in precompiled form - be it their run script or their actual code - and ideally something published for some version of Haxe 5 would be compatible with at least all subsequent Haxe 5 versions.

@Simn
Member Author

Simn commented Jan 27, 2024

> and ideally something published for some version of Haxe 5 would be compatible with at least all subsequent Haxe 5 versions.

I am quite worried about that aspect. It's already sometimes a problem with the macro API, where we don't want to break the interface so we have to be creative in order to handle the encoding and decoding of additional information. This will be harder to do in a binary protocol, which is generally less flexible. Maybe I'm overthinking this though.

And I see your point about the problems with forward compatibility. My thinking was that the problem wouldn't necessarily have to occur at reading-time, but rather when a new feature ends up actually being used. But of course this won't be detectable if we don't deal with the semantics, so this doesn't make too much sense as a whole.

In that light, I suppose there's not much to design here after all and I'll close the issue. Thank you for the comments!

Simn closed this as completed Jan 27, 2024