Umarshal - version-independent serialization format #626
We introduce a new module, Umarshal, that provides basic infrastructure for marshaling, along with some combinators, and use that with the types that end up in fingerprint caches. For each such type "typ", we define marshalling functions that are wrapped in a record of type Umarshal.t, named "mtyp" or just "m" when typ is "t". The main substance of this commit is in Umarshal; all the other changes are calls to the combinators defined in Umarshal to build Umarshal.t values, and the switch from Marshal functions to Umarshal ones in Fpcache.
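To illustrate the combinator approach described above, here is a minimal sketch of what such a record of marshaling functions could look like. It is not the actual Umarshal interface: the field names `write` and `read`, the helpers `to_bytes` and `of_bytes`, the `entry` type, and the wire layout (8-byte big-endian integers, length-prefixed strings and lists) are all assumptions made for the example.

```ocaml
(* Hypothetical sketch; names and wire layout are NOT taken from Umarshal. *)
type 'a t = {
  write : Buffer.t -> 'a -> unit;
  read : bytes -> int ref -> 'a;  (* reads at !pos and advances pos *)
}

(* Integers are always written as 8 big-endian bytes. *)
let int : int t = {
  write = (fun buf n -> Buffer.add_int64_be buf (Int64.of_int n));
  read = (fun b pos ->
    let n = Int64.to_int (Bytes.get_int64_be b !pos) in
    pos := !pos + 8;
    n);
}

(* Strings are written as a length followed by the raw bytes. *)
let string : string t = {
  write = (fun buf s ->
    int.write buf (String.length s);
    Buffer.add_string buf s);
  read = (fun b pos ->
    let len = int.read b pos in
    let s = Bytes.sub_string b !pos len in
    pos := !pos + len;
    s);
}

(* Combinator for pairs: write and read the components in order. *)
let prod2 (ma : 'a t) (mb : 'b t) : ('a * 'b) t = {
  write = (fun buf (a, b) -> ma.write buf a; mb.write buf b);
  read = (fun b pos ->
    let x = ma.read b pos in
    let y = mb.read b pos in
    (x, y));
}

(* Combinator for lists: element count followed by the elements. *)
let list (m : 'a t) : 'a list t = {
  write = (fun buf l ->
    int.write buf (List.length l);
    List.iter (m.write buf) l);
  read = (fun b pos ->
    let n = int.read b pos in
    let rec loop i acc =
      if i = 0 then List.rev acc
      else
        let x = m.read b pos in
        loop (i - 1) (x :: acc)
    in
    loop n []);
}

let to_bytes (m : 'a t) (v : 'a) : bytes =
  let buf = Buffer.create 64 in
  m.write buf v;
  Buffer.to_bytes buf

let of_bytes (m : 'a t) (b : bytes) : 'a = m.read b (ref 0)

(* A marshaler "mentry" for a made-up type "entry", built from combinators. *)
type entry = string * int
let mentry : entry t = prod2 string int

let () =
  let v = [("a.txt", 3); ("b.txt", 42)] in
  assert (of_bytes (list mentry) (to_bytes (list mentry) v) = v)
```

Because every marshaler is built from the same few primitives, the byte layout depends only on this code and not on the OCaml runtime's in-memory representation, which is the property the commit is after.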
Umarshal uses Bytes.{get,set}_int*_be, which were introduced in OCaml 4.08.0.
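For readers unfamiliar with these functions, a small sketch of fixed-width big-endian integer encoding with the standard `Bytes` module follows. The helper names `encode_int` and `decode_int` are made up for the example and say nothing about how Umarshal itself lays out integers.

```ocaml
(* Requires OCaml >= 4.08 for Bytes.{get,set}_int64_be and Bytes.get_uint8. *)

(* Encode an OCaml int as 8 big-endian bytes, whatever the host byte order. *)
let encode_int (n : int) : bytes =
  let b = Bytes.create 8 in
  Bytes.set_int64_be b 0 (Int64.of_int n);
  b

(* Decode 8 big-endian bytes starting at [pos] back into an int. *)
let decode_int (b : bytes) (pos : int) : int =
  Int64.to_int (Bytes.get_int64_be b pos)

let () =
  let b = encode_int 258 in
  (* 258 = 0x0102, so the two least significant bytes come last. *)
  assert (Bytes.get_uint8 b 6 = 1 && Bytes.get_uint8 b 7 = 2);
  assert (decode_int b 0 = 258)
```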
This reverts commit 909e092. Bytearray.{marshal,unmarshal} are still needed to maintain compatibility with versions 2.51.x that use the old RPC wire format.
Bring Umarshal up to date.
Preferences data is serialized separately from (and within) RPC layer. Add ability to send and receive preferences with the old (<= 2.51.5) serialization format and select the format based on the negotiated RPC version.
To keep compatibility with versions <= 2.51.5, the types of RPC call arguments and return values must not change. This patch provides conversion functions for any arguments and return values that have been changed, thereby keeping full compatibility with 2.51.
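The idea of converting at the RPC boundary can be sketched as follows. Every type and function name here (`size_251`, `size`, `peer`, `wire`, `send`, `receive`) is hypothetical and chosen only for illustration; none of them mirrors a real Unison definition.

```ocaml
(* All names are hypothetical illustrations, not Unison's definitions. *)

(* Shape of a value as a <= 2.51.5 peer expects it: a magic number. *)
type size_251 = int  (* -1 means "unknown" *)

(* Shape used internally after the type has been changed. *)
type size = Unknown | Known of int

let size_to_251 : size -> size_251 = function
  | Unknown -> -1
  | Known n -> n

let size_of_251 (n : size_251) : size =
  if n < 0 then Unknown else Known n

(* Which format was negotiated with the peer. *)
type peer = Compat251 | Current

(* What actually goes over the wire: the legacy shape or the new one. *)
type wire = Old of size_251 | New of size

(* Convert only when talking to an old peer. *)
let send (p : peer) (v : size) : wire =
  match p with
  | Compat251 -> Old (size_to_251 v)
  | Current -> New v

let receive : wire -> size = function
  | Old n -> size_of_251 n
  | New v -> v

let () =
  assert (send Compat251 (Known 7) = Old 7);
  assert (receive (send Compat251 Unknown) = Unknown)
```

With this shape, the rest of the code only ever sees the new type, and an old peer only ever sees the type it was compiled with.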
Make sure both the optimistic and nonoptimistic archive loading paths work in 2.51-compatibility mode.
Since the archive structure (schema) is not stored in the on-disk archive, any format-changing features are recorded in the file. Use this information when loading the archive to make sure that the types match those used when storing the archive. The marshal type functions (the `m`) consult the set of enabled features.
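A rough sketch of the idea, assuming a made-up text header and a made-up feature name (`subsecond-mtime`); the real archive format and feature names are not shown in this PR text, so treat everything below as illustrative only.

```ocaml
(* Hypothetical sketch: header layout and feature names are invented. *)
module Features = Set.Make (String)

(* The header line lists the features enabled when the file was written. *)
let write_header (fs : Features.t) : string =
  String.concat "," (Features.elements fs) ^ "\n"

let read_header (line : string) : Features.t =
  String.split_on_char ',' (String.trim line)
  |> List.filter (fun s -> s <> "")
  |> Features.of_list

type entry = { name : string; mtime_ns : int }

(* The decoder picks the field layout that matches the recorded features. *)
let decode_entry (fs : Features.t) (fields : string list) : entry option =
  match fields with
  | [name; secs; nanos] when Features.mem "subsecond-mtime" fs ->
      Some { name;
             mtime_ns = int_of_string secs * 1_000_000_000
                        + int_of_string nanos }
  | [name; secs] when not (Features.mem "subsecond-mtime" fs) ->
      Some { name; mtime_ns = int_of_string secs * 1_000_000_000 }
  | _ -> None  (* layout does not match the recorded features *)

let () =
  let fs = read_header (write_header (Features.singleton "subsecond-mtime")) in
  assert (decode_entry fs ["a"; "1"; "5"]
          = Some { name = "a"; mtime_ns = 1_000_000_005 })
```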
Just like with old archive files, remove the old fpcache file after upgrade from 2.51 has succeeded.
Errors while reading the cache are not supposed to be fatal. It is important to ignore such errors to remain backwards and forwards compatible with potential format changes.
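One way to express "cache read errors are not fatal" is to turn any read or decode failure into `None` and let the caller rebuild the cache. The sketch below is an assumption about the general pattern, not the Fpcache code; `load_cache` and its `parse` argument are hypothetical.

```ocaml
(* Hypothetical loader: [parse] stands in for the real decoding function. *)
let load_cache (parse : string -> 'a) (path : string) : 'a option =
  match
    let ic = open_in_bin path in
    Fun.protect
      ~finally:(fun () -> close_in_noerr ic)
      (fun () -> parse (really_input_string ic (in_channel_length ic)))
  with
  | v -> Some v
  (* A missing, truncated, or unparseable cache is treated as "no cache". *)
  | exception (Sys_error _ | End_of_file | Failure _ | Invalid_argument _) ->
      None

let () =
  (* A missing file yields None instead of an exception. *)
  assert (load_cache int_of_string "/nonexistent-dir/fpcache" = None)
```

A caller would then fall back to an empty cache on `None`, which costs a re-fingerprinting pass but never blocks a sync.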
I've tested on netbsd-9 with ocaml 4.11.2. On amd64, tests pass, and umarshal locally interoperates with 2.51.5 remotely (amd64, i386, earmv7hf-el netbsd-9, amd64 macOS 10.13), and upgrades archive files. With umarshal on a remote amd64 netbsd-9, sync works and the remote is updated. umarshal on remote netbsd-9 earmv7hf-el also worked. I hope to test on NetBSD/sparc, but that is going to take longer. I think this is over the line to merge, given that I don't think test reports from anyone else are too likely (but please do test if you are reading!!!). However, I want to prepare a PR that changes the version number, and will merge this and then that in quick succession, so the version will show as something like 2.51.70 and, most importantly, not appear as 2.51.5. That will be a breakpoint from the 2.51 series, but that's the plan at this point.
Don't you think a 3.0 version would be in order? At the very least, I would say that an OCaml-version-independent unison that is still dynamically compatible with some older protocol versions is quite a revolution, deserving a new major number. Sorry I haven't found the time to do any testing so far.
It is significant in terms of fixing what I view as a bug, but it is not a significant change to what unison can do or how it is used. For users upgrading to the new version, it will, we hope, be more or less invisible, except that if they happen to try to sync across ocaml versions, it will work instead of failing. In this way it is less of a big deal than transitions like 2.40 to 2.48, etc.
I might be able to test on linux/ppc64le, but don't wait for me!
@avollmerhaus Thanks! I will not wait for you for merging, but please do test if you are able, even after merging. Until there is a release it is much easier to fix anything that turns out to need fixing.
Added a commit to resolve an old FIXME-comment (archive file names were truncated from 32 chars to 6 chars on Windows). There is no reason to consider 8.3 limitations; even more so as the temp file names have not fit in 8.3 for a very long time (never?).
This commit is a logical continuation of the work started in commit 90dd589. Without actually changing the types, it provides the foundation so that they can be changed while keeping compatibility with versions <= 2.51.5. The types prepared like this are [Fileinfo.stamp] and [Props.t] (continuing from the earlier commit) and other types which include either of these two or [archive] (which itself was prepared in the earlier commit). A bug from commit aa29f5d is fixed.
Added a commit extending and preparing the foundation that makes it possible to keep backwards compatibility when type changes are made in the future. The commit itself does not introduce any type changes.
Fix a couple of long-standing FIXME comments. While not a functional change, it helps clean up the code and captures meaning in the type system instead of relying on magic values.
Added another commit, this time resolving two old FIXME-comments. It does not change functionality; it is to be considered code cleanup. One might wonder why there are so many commits in this PR. Since this PR represents a format change, it is a good opportunity to commit other non-invasive changes that would otherwise require a format change of their own (and which, as it happens, were previously postponed precisely to avoid imposing an unnecessary format change on users). Edit: Ignore the build errors, they are not related to this commit. This is a GHA issue that appears randomly (not sure why).
This PR is ready to be merged.
I have successfully compiled the master branch (at There have been I didn't test the GUI though. Big thanks for all the effort that went into this. unison really has been making my life a lot easier for years, and breaking free from the strict version requirements will make it even greater and a lot easier to recommend to less technically inclined folk.
Description
The long-awaited improvement is finally here!
Umarshal is the new serialization format that will be used for client-server communication and archive files to make them independent of OCaml versions and Unison versions. This code will keep compatibility with 2.51 releases and future releases.
No longer must you match Unison versions and OCaml versions (the latter applies only if both client and server have this PR).
Closes #375
Closes #377
Closes #407
Credits
The author of this work is @glondu.
I am just taking his work, merging with my own previous work (#507 and #509) and adding some tweaks to make sure that it remains compatible with 2.51 releases and a wide range of OCaml versions.
Interoperability
Full compatibility with 2.51.x versions is intended (both as a server and as a client). To interoperate with any 2.51.x version, you still have to match the OCaml versions.
Upgrading
It is possible to upgrade from 2.51.x versions while keeping the archive intact. This means that no re-scan of the replicas is needed and the replicas don't have to be in sync before the upgrade. The user does not have to do anything for the upgrade -- just drop in the new version and run it. It will automatically pick up the existing archive and convert it to the new format.
To upgrade from any 2.51.x version, you still have to match the OCaml versions. Once you have upgraded all the clients and servers that have to work together, you no longer have to match OCaml versions.
Edit: The upgrade also works from 2.48.x and maybe even older versions, if the OCaml versions match (or are similar, at least).
Note: once the upgrade has succeeded, you should not go back and run the old version. To go back to the old version, you have to delete the archive files and start afresh.
Testing
Call for testing! Everyone who would like to contribute is welcome to do testing (there is no script; test any way you like). Even though this code should not cause any data loss or corruption, it is best to test on special replicas or have working backups at hand.
Since this code is expected to be version-independent and backwards-compatible with 2.51 releases, various combinations of client/server versions and OCaml versions must be tested. I don't expect all of the combinations below to be tested; the tables are provided as a quick overview of which combinations have been tested.
A. Umarshal code (this PR) with different OCaml versions
rows = client; columns = server
L = Linux; W = Windows; i = illumos; L2L = Linux-to-Linux; etc.
B. Umarshal code (this PR) against different Unison 2.51.x versions (before this PR)
Client and server OCaml versions match.
C. Umarshal code (this PR) upgrading from different Unison 2.51.x versions (before this PR)
2.51.x and Umarshal OCaml versions match.