Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V2: Define __ocaml_freestanding__, fix the endian.h mess #74

Merged
merged 3 commits into from
May 5, 2020

Conversation

mato
Copy link
Contributor

@mato mato commented May 4, 2020

This is a re-work of #73, containing the following changes:

  • nolibc: Provide a sys/param.h with BYTE_ORDER (83fc2b4 3226548)
  • nolibc: Provide a full endian.h (3c251fe 5055052)
  • nolibc: Define __ocaml_freestanding__, add overrides.h (239263b)

See individual commits for details.

/cc mirage/hacl#30 mirage/mirage-crypto#54 Solo5/solo5#453 @hannesm

This change allows for portable C code to determine if it is being built
in an ocaml-freestanding environment by checking if the preprocessor
macro __ocaml_freestanding__ is defined.

In order to be able to do this reliably and in the presence of existing
"#if...#elif...#endif" chains we need to undefine existing preprocessor
macros identifying the host toolchain, including but not limited to,
__FreeBSD__, __OpenBSD__ and __linux__. For a full list, see
nolibc/include/_freestanding/overrides.h.

_freestanding/overrides.h is now included automatically for each C
source file built by adding "-include _freestanding/overrides.h" to
both downstream pkg-config/dune CFLAGS and the Makefile-internal
FREESTANDING_CFLAGS.
@mato
Copy link
Contributor Author

mato commented May 4, 2020

@hannesm:

I have tested this manually on Debian 10 with GCC, including inspecting the config.log from the OCaml runtime and libgmp (gmp-freestanding) to verify that they are doing the right thing. Note that neither of these use an <endian.h> or <sys/endian.h>.

Regarding your comments in #73:

It is also more the truth of ocaml-freestanding -- this is neither linux nor FreeBSD nor anything else (and C code that uses #ifdef chains to select the platform should better fail hard (at compilation) when freestanding is encountered than soft (at link time).

Agreed. This V2 does that via -include _freestanding/overrides.h, IMO cleaner than adding the various undefs to CFLAGS in multiple places. It also applies the same flags to code built as part of ocaml-freestanding itself, notably the OCaml runtime.

Given the current mess of POSIX and endian.h (where to put it, which symbols to export), maybe we should just not provide it in freestanding at all?

We may as well provide an <endian.h> and relevant interfaces. That way, downstreams that want portable code with minimal duplication can do:

#if defined(__ocaml_freestanding__)
#include <endian.h>
#elif defined(...)
/* ... */
#endif     

... I see no harm in that.

The remaining question for me is that from your comments on #73 it's no longer obvious to me if we need the #ifndef _SYS_ENDIAN_H hack in our endian.h at all?

If you look at nolibc/include/_freestanding/byteorder.h, I've arranged things so that were we to remove the _SYS_ENDIAN_H guard, then in the presence of that particular failure case compilation will fail hard due to BYTE_ORDER already being defined. My feeling is it's better to deal with (and report/fix) breakage rather than paper over the problem, but I'm not sure.

What do you think?

@hannesm
Copy link
Member

hannesm commented May 5, 2020

This looks fine to me, I tested it on a FreeBSD 12.1 system.

The remaining question for me is that from your comments on #73 it's no longer obvious to me if we need the #ifndef _SYS_ENDIAN_H hack in our endian.h at all?

I don't think we need it. The undefine of the host platform is sufficient to avoid issues in the common path, and there's no need to support "broken" #ifdef .. #elif .. #endif expressions.

EDIT: I pushed 8c06c0f to your branch that removes the _SYS_ENDIAN_H guard / definition.

After this is merged, I only have two more questions (out of scope for this PR I suspect, but since we're cleaning up headers, I though I'd raise them here):

  • why do we copy only a selected set of .h files from the OCaml runtime? I'd expect a cp -R runtime/caml/ ${DESTINC}/caml/ to be a much more sustainable solution (see install all caml headers #75 for what I mean)
  • the openlibm header file issue - our install.sh copies openlibm/include and openlibm/src/*h (this is the same as openlibm's make install-headers) into ${DESTINC} which pollutes this quite a bit with private headers. My attempts to instead install into ${DESTINC}/openlibm were not too successful (the zarith-freestanding fails since it can't find the openlibm-dependent headers), but I'd appreciate a solution where our math.h symlinks to openlibm/openlibm.h and everything just works (i.e. not using -I${DESTINC}/openlibm).

mato added 2 commits May 5, 2020 15:47
This change provides a full <endian.h> in nolibc, defining BYTE_ORDER,
LITTLE_ENDIAN, BIG_ENDIAN and the beXXtoh(), htobeXX(), htoleXX() and
leXXtoh() interfaces as macros. [1]

Further, <endian.h> is now arch-independent and uses the compiler
built-in preprocessor macro __BYTE_ORDER__ to determine byte order. [2]
This macro is available in all versions of GCC and clang that are recent
enough for us to care about.

[1] As proposed for / in the process of being included into / the newest
POSIX standards: https://www.austingroupbugs.net/view.php?id=162

[2] https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html
Provide a <sys/param.h> in nolibc, and define BYTE_ORDER, LITTLE_ENDIAN
and BIG_ENDIAN in it (if not already defined via inclusion of endian.h
or related elsewhere).

This change is provided for the benefit of autoconf-based build systems
such as the OCaml runtime and libgmp (gmp-freestanding).

Autoconf-based build systems commonly use the AC_C_BIGENDIAN test,
which looks for a definition of BYTE_ORDER in various UNIXy places [1],
falling back to gradually more horrible heuristics if one is not found.

While these heuristics work "by accident" until now, this solution is
cleaner and ensures that a compile-only test is used by autoconf, which
will work in the future presence of a cross compiler.

[1] See lib/autoconf/c.m4 in the GNU autoconf source.
@mato
Copy link
Contributor Author

mato commented May 5, 2020

This looks fine to me, I tested it on a FreeBSD 12.1 system.

Thanks

I don't think we need it. The undefine of the host platform is sufficient to avoid issues in the common path, and there's no need to support "broken" #ifdef .. #elif .. #endif expressions.

EDIT: I pushed 8c06c0f to your branch that removes the _SYS_ENDIAN_H guard / definition.

Ok. I rebased the existing commits, removing the _SYS_ENDIAN_H business entirely. No need to keep the back-and-forth in the history.

After this is merged, I only have two more questions (out of scope for this PR I suspect, but since we're cleaning up headers, I though I'd raise them here):

* why do we copy only a selected set of `.h` files from the OCaml runtime? I'd expect a `cp -R runtime/caml/ ${DESTINC}/caml/` to be a much more sustainable solution (see #75 for what I mean)

No idea. I'm sure there was some reason for why it got done that way. It's ~300kB of headers and "stuff", if you're fine with just installing all of it, then let's do it.

* the `openlibm` header file issue - our install.sh copies openlibm/include and openlibm/src/*h (this is the same as openlibm's `make install-headers`) into `${DESTINC}` which pollutes this quite a bit with private headers. My attempts to instead install into `${DESTINC}/openlibm` were not too successful (the zarith-freestanding fails since it can't find the openlibm-dependent headers), but I'd appreciate a solution where our `math.h` symlinks to `openlibm/openlibm.h` and everything just works (i.e. not using `-I${DESTINC}/openlibm`).

Yeah, this doesn't work because the openlibm public headers want to include the private headers, and do so without using #include "openlibm/...". I checked openlibm master and it's still that way, not sure what to do about it other than various sed` hackery.

@hannesm
Copy link
Member

hannesm commented May 5, 2020

@mato great, let's defer that openlibm to a later point in time (don't fix issues that are not present).

I looked into the history of the caml-header list, and it came with the initial commit. I feel more comfortable to install all of them (they are in caml/ anyways, and don't conflict with other headers).

EDIT:

I rebased the existing commits

that sounds good to me as well, I didn't want to force-push to your branch.

@mato
Copy link
Contributor Author

mato commented May 5, 2020

All green, good to merge!

@mato mato merged commit 0f92755 into mirage:master May 5, 2020
This was referenced May 5, 2020
@mato mato deleted the endian2 branch May 6, 2020 13:37
@dinosaure dinosaure mentioned this pull request Jun 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants