Make it easier to bump and duplicate magic numbers #12652

Merged: shindere merged 2 commits into ocaml:trunk from the magic-numbers branch on Oct 24, 2023

Conversation

shindere
Contributor

This stems from #11996 and addresses this comment by @nojb.

The magic numbers (actually strings) used to identify binary files produced by the compiler are the concatenation of three pieces: an 8-byte prefix common to all of them (Caml1999), a 1-byte file type (executable, CMO, CMA, etc.) and a 3-byte version identifier.

The current definitions of these magic numbers carry some redundancy: the prefix and version fragments appear in each magic number definition. This makes bumping them non-trivial, since a change in the version string will need to be applied to each definition, of which only a sub-string needs to be modified.

Also, a few magic numbers are needed in several places in the codebase: the magic number for executables, for instance, is used in both the runtime (C world) and the compiler (OCaml world), and its definition is thus duplicated. The definitions of the magic numbers for the CMO, CMA and CMXS formats are required by both the compiler and the dynamic loader (dynlink). Here, the cost of avoiding the duplication is that dynlink depends on the Config module of compilerlibs, hence the link with #11996.

The present PR is a proposal to address all these limitations by doing three things:

  1. Define the basic blocks from which magic numbers are built in the configure system (namely in build-aux/ocaml_version.m4, close to the version numbers), so that all the pieces can be disseminated wherever they are useful in the codebase during the configure stage.
  2. Make all the fragments of magic numbers available individually in each file that needs them and reconstruct them from these bits either at configure or at compile time. This reduces the need to bump magic numbers individually, as the version is stored in fewer places from which all the magic numbers are derived.
  3. Introduce the bump-magic-numbers tool to both document and automate the bumping process.
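
As a rough illustration of point 2, the shell sketch below rebuilds a few magic numbers from shared fragments. The variable names, the `make_magic` helper and the version `999` are hypothetical and are not the names used in build-aux/ocaml_version.m4; the one-letter type codes are only meant as examples.

```shell
#!/bin/sh
# Hypothetical fragments; in the PR the real ones live in the configure system.
magic_prefix='Caml1999'   # 8-byte prefix shared by all magic numbers
magic_version='999'       # 3-byte version, made up for this sketch

# Build one magic number from the shared fragments and a 1-byte file type.
make_magic() {
  printf '%s%s%s' "$magic_prefix" "$1" "$magic_version"
}

exec_magic=$(make_magic X)   # executable (type letter is illustrative)
cmo_magic=$(make_magic O)    # CMO (type letter is illustrative)
cma_magic=$(make_magic A)    # CMA (type letter is illustrative)

# Every magic number is 8 + 1 + 3 = 12 bytes long.
for m in "$exec_magic" "$cmo_magic" "$cma_magic"; do
  printf '%s (%d bytes)\n' "$m" "${#m}"
done
```

With this layout, bumping the version means changing `magic_version` in one place rather than editing every definition.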

The bump-magic-numbers tool is used in Inria's CI bootstrap job. This means that the tool, which is intended to be used when preparing releases, will be tested regularly.

Moreover, the current PR makes the bootstrap CI job both stricter and more faithful to what happens when preparing a release: it makes sure the compiler bootstrap works even when all the magic numbers are changed (as happens for releases), whereas on current trunk the bootstrap job only changes the magic number for executables.

Makefile Outdated
@@ -717,7 +717,8 @@ runtime_NATIVE_C_SOURCES = \
$(runtime_NATIVE_ONLY_C_SOURCES:%=runtime/%.c)

## Header files generated by configure
-runtime_CONFIGURED_HEADERS = $(addprefix runtime/caml/, m.h s.h version.h)
+runtime_CONFIGURED_HEADERS = \
+$(addprefix runtime/caml/, m.h s.h version.h exec.h)
Contributor

Consider keeping the files list sorted.

Suggested change
-$(addprefix runtime/caml/, m.h s.h version.h exec.h)
+$(addprefix runtime/caml/, exec.h m.h s.h version.h)

@shindere
Contributor Author

shindere commented Oct 12, 2023 via email

@shindere
Contributor Author

shindere commented Oct 13, 2023 via email

configure.ac Outdated
Comment on lines 105 to 113
AC_SUBST([MAGIC_PREFIX], [MAGIC__PREFIX])
AC_DEFINE([MAGIC_PREFIX], ["][MAGIC__PREFIX]["])
AC_SUBST([MAGIC_VERSION], [MAGIC__VERSION])
AC_DEFINE([MAGIC_VERSION], ["][MAGIC__VERSION]["])
AC_SUBST([EXEC_FORMAT], [EXEC__FORMAT])
AC_DEFINE([EXEC_FORMAT], ["][EXEC__FORMAT]["])
AC_SUBST([MAGIC_LENGTH], [MAGIC__LENGTH])
AC_SUBST([CMX_FORMAT])
AC_SUBST([CMXA_FORMAT])
Contributor

Would it make sense to prefix these with OCAML_? I tend to get confused about which variables are relatively standard, say CC or CPP, and which are specific to the project.

Member

I concur, that sounds like a good idea, at the very least for the MAGIC_* variables.


# Bump magic numbers in runtime/caml/exec.h

sed -i -e s/'define MAGIC_VERSION "..."'/"define MAGIC_VERSION \"$new_num\""/ \
Contributor

Suggested change
-sed -i -e s/'define MAGIC_VERSION "..."'/"define MAGIC_VERSION \"$new_num\""/ \
+sed -i.tmp -e s/'define MAGIC_VERSION "..."'/"define MAGIC_VERSION \"$new_num\""/ \

For the rare case where you'd want to use this script from a BSD or macOS system. You need to specify a (possibly empty) extension for in-place editing with BSD sed. I also suggest changing the sed invocations below. This is the pattern that was previously used.

Member

After testing, it seems this is still incompatible with macOS sed. Using an explicit temporary file seems simpler.
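
One way the explicit-temporary-file approach could look is sketched below. The `sed_in_place` helper, the throwaway header file and the versions `033` and `999` are all made up for illustration, and `mktemp` is assumed to be available (it is not part of POSIX, though common on Linux, BSD and macOS). This is not the script that was merged.

```shell
#!/bin/sh
# Hypothetical helper: apply a sed script to a file through an explicit
# temporary file, avoiding the -i option, whose syntax differs between
# GNU sed and BSD/macOS sed.
sed_in_place() {
  script=$1
  file=$2
  tmp=$(mktemp)
  # Write the result back with cat so the original file keeps its
  # permissions, and only do so if sed succeeded.
  sed -e "$script" "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Demo on a throwaway file standing in for runtime/caml/exec.h.
new_num=999   # made-up version
header=$(mktemp)
printf '#define MAGIC_VERSION "033"\n' > "$header"

# "..." matches any three characters, i.e. the old 3-byte version.
sed_in_place "s/define MAGIC_VERSION \"...\"/define MAGIC_VERSION \"$new_num\"/" "$header"
cat "$header"
```

The extra file costs a couple of lines but sidesteps the question of which `-i` form a given sed accepts.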

Comment on lines 55 to 58
dd7927e156b7cb2f9

Beware, though, that PR #12652, which is more recent than the commit
mentionned above, changes the way magic numbers are defined.
Contributor

Suggested change
-dd7927e156b7cb2f9
-Beware, though, that PR #12652, which is more recent than the commit
-mentionned above, changes the way magic numbers are defined.
+https://github.com/ocaml/ocaml/commit/dd7927e156b7cb2f9cb73d2d54a15a9c81921392[dd7927e]
+Beware, though, that https://github.com/ocaml/ocaml/pull/12652/[PR #12652],
+which is more recent than the commit mentioned above, changes the way
+magic numbers are defined.

I suggest adding links. There's also a typo s/mentionned/mentioned.

@shindere
Contributor Author

shindere commented Oct 16, 2023 via email

@Octachron
Member

@shindere, after thinking further, I am fine with computing the magic prefix for the various binary file formats at runtime.

@shindere
Contributor Author

shindere commented Oct 23, 2023 via email

@shindere
Contributor Author

shindere commented Oct 23, 2023 via email

This commit replaces the double quotes that start magic numbers with
{magic| and those that end them with |magic}.

Such quotes make it easier to automate the bumping process for magic numbers.
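
To see why such delimiters help, here is a minimal sketch of a substitution anchored on them. The sample OCaml line, the name `cmo_magic_number` as used here, and the versions `033` and `999` are assumptions made for the example; the real bumping script may proceed differently.

```shell
#!/bin/sh
new_num=999   # made-up version

# Hypothetical OCaml definition using the quoted-string delimiters.
line='let cmo_magic_number = {magic|Caml1999O033|magic}'

# Keep the 8-byte prefix and the 1-byte file type (captured as \1), and
# replace only the 3-byte version that precedes the closing delimiter.
bumped=$(printf '%s\n' "$line" |
  sed -e "s/{magic|\(Caml1999.\)...|magic}/{magic|\1$new_num|magic}/")

printf '%s\n' "$bumped"
```

Because `{magic|` and `|magic}` are unlikely to appear anywhere else, the pattern does not need to guess which ordinary double-quoted strings hold magic numbers.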
@@ -52,4 +52,16 @@ tools to test the new release, and if you update *after* that you risk
breaking them again without them noticing.

For example, the magic numbers for 4.13 were updated in
dd7927e156b7cb2f9
https://github.com/ocaml/ocaml/commit/dd7927e156b7cb2f9cb73d2d54a15a9c8192139\2[dd7927e]
Member

I think it is better to remove the link to this commit, since the commit will point to files that are no longer versioned.

which is more recent than the commit mentioned above, changes the
way magic numbers are defined.

To bump the magic numbers to version xyz simply run:
Member

I would propose to merge the previous paragraph with this sentence, while removing the Beware warning:

Since https://github.com/ocaml/ocaml/pull/12652/[PR #12652], to bump the magic numbers to version xyz simply run:

Member

@Octachron Octachron left a comment

The current state looks nice to me: it improves phase separation (more configuration-time variables are computed at configuration time) and the documentation of the bootstrap process.
I will let you decide whether you want to improve the documentation and the naming of the ambiguous variables before merging.

@shindere
Contributor Author

shindere commented Oct 24, 2023 via email

@Octachron
Member

I would say MAGIC_PREFIX, MAGIC_LENGTH, MAGIC_VERSION and maybe EXEC_MAGIC_LENGTH whose meaning without context is a bit unclear.

Maybe the most natural way to fix that point is to not shorten MAGIC_NUMBER to MAGIC (in other words, keep MAGIC_NUMBER_PREFIX), since MAGIC could refer to some other "magic" constants rather than a file format magic number. Moreover, keeping MAGIC_NUMBER would be consistent with the variable names for the file formats (CMX_MAGIC_NUMBER, ...).

@shindere
Contributor Author

shindere commented Oct 24, 2023 via email

This commit makes sure all the magic numbers are defined in
build-aux/ocaml_version.m4 and duly propagated from there.

It also introduces the tools/bump-magic-numbers script and uses it
in Inria's CI bootstrap job. This script should also make the release
process easier. It is a documented, automated and regularly verified
procedure for bumping magic numbers.
@shindere shindere merged commit 7a0439d into ocaml:trunk Oct 24, 2023
9 checks passed
@shindere shindere deleted the magic-numbers branch October 24, 2023 13:49

4 participants