@J0WI J0WI commented Dec 20, 2019

No description provided.

Author

J0WI commented Dec 20, 2019

Are #11 (comment) and #29 (comment) still valid?

Member

tianon commented Dec 20, 2019

Re: #11 (comment) -- definitely even more valid now than it was then; Alpine doesn't (AFAIK) support cross-compiling at all, while Debian has a very wide variety of cross compilers available, which we make active use of here to make it easier to build all these various binaries.

Re: #29 (comment) -- definitely also still relevant, given that Alpine is still going to have less name recognition for new Docker users than Ubuntu does.

Re: Travis, it looks like something about our i386 binary is segfaulting, so we need to do some debugging with Buster's compiler and our hacked up minimal code and see where things have gone haywire. 😞

Author

J0WI commented Dec 21, 2019

> Re: #11 (comment) -- definitely even more valid now than it was then; Alpine doesn't (AFAIK) support cross-compiling at all, while Debian has a very wide variety of cross compilers available, which we make active use of here to make it easier to build all these various binaries.

Alpine supports cross-compiling, but there are no prebuilt binaries available: https://github.com/alpinelinux/aports/blob/master/scripts/bootstrap.sh#L39-L53

Member

tianon commented Dec 23, 2019

I've gone down a rabbit hole around -nostartfiles, and it turns out that what we're doing with gcc was never actually supported and we were very lucky it worked.

The TL;DR is that even syscall(2) is a libc library function, which technically should have the standard libc initialization happen before we use it, but we don't get that when we compile statically, which means it may or may not work (hence our segfault).
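For illustration, here is a minimal sketch of what invoking a syscall through the libc `syscall(2)` wrapper looks like in C. The helper name `raw_write` is hypothetical and not taken from this repository. The point is only that `syscall()` is itself an ordinary libc function (it reports failures through `errno`, for example), so calling it from a binary built with -nostartfiles may depend on libc state that was never initialized.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* expose syscall() in <unistd.h> on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to fd via the libc syscall() wrapper.
 * Returns the number of bytes written, or -1 with errno set on failure. */
static long raw_write(int fd, const char *buf, size_t len) {
    assert(buf != NULL);
    return syscall(SYS_write, fd, buf, len);
}
```

In a normally linked program with `int main()`, a call such as `raw_write(STDOUT_FILENO, "hello\n", 6)` prints the message, because the usual libc startup code has already run by the time `main` is entered.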

There are other wrappers for syscalls, but they're deprecated and long since unimplemented (_syscall(2)).

The actual implementation of syscall in glibc for each architecture is a very small bit of assembly, but I'm definitely not comfortable committing those directly here, and getting them and using the right one for each architecture is going to be a bit of a pain, not to mention how nasty it is to do that much work just to invoke two syscalls. 😅
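To give a sense of what "a very small bit of assembly" per architecture means, the sketch below shows a hypothetical three-argument wrapper, `raw_syscall3`, using GCC-style inline assembly for x86-64 and AArch64 only. It is not glibc's code and not something from this repository. The register assignments follow the Linux syscall calling convention for those two architectures, and every other architecture simply falls back to libc's `syscall()` here, which is exactly the part that would need more hand-written variants.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* expose syscall() in <unistd.h> for the fallback */
#endif
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Pointer arguments are passed as long, so the two must be the same size. */
static_assert(sizeof(long) == sizeof(void *), "long must hold a pointer");

/* Hypothetical wrapper: issue syscall number n with three arguments.
 * The assembly variants return the raw kernel result (a negative errno
 * value on failure); the fallback returns -1 and sets errno instead. */
static long raw_syscall3(long n, long a1, long a2, long a3) {
#if defined(__x86_64__) && defined(__LP64__)
    long ret;
    /* x86-64: number in rax, arguments in rdi, rsi, rdx.
     * The syscall instruction clobbers rcx and r11. */
    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "a"(n), "D"(a1), "S"(a2), "d"(a3)
                     : "rcx", "r11", "memory");
    return ret;
#elif defined(__aarch64__)
    /* AArch64: number in x8, arguments in x0..x2, result in x0. */
    register long x8 __asm__("x8") = n;
    register long x0 __asm__("x0") = a1;
    register long x1 __asm__("x1") = a2;
    register long x2 __asm__("x2") = a3;
    __asm__ volatile("svc 0"
                     : "+r"(x0)
                     : "r"(x8), "r"(x1), "r"(x2)
                     : "memory");
    return x0;
#else
    /* No hand-written variant for this architecture in this sketch. */
    return syscall(n, a1, a2, a3);
#endif
}
```

A call like `raw_syscall3(SYS_write, STDOUT_FILENO, (long)"hello\n", 6)` would cover the write half of the two syscalls mentioned above. Each additional architecture needs its own block with different registers and a different trap instruction, which is the maintenance cost being weighed here.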

I did find that Buster adds explicit support for i386/i686 cross-compilation like other arches via libc6-dev-i386-cross and gcc-i686-linux-gnu, which allows us to drop dpkg --add-architecture i386 and our extra hackery around building i386 executables (CC='i686-linux-gnu-gcc' STRIP='i686-linux-gnu-strip'), but the end result is the same. 😞

If we don't use -nostartfiles and let glibc get included in full, we end up with a ~500x increase in binary size.

The only alternative I'm seeing is to use musl (which is frankly designed for what we're trying to do), but the hurdles there we need to overcome are that Debian's musl package isn't made for cross-compiling like we're doing, and if we use Alpine directly we'll have to set up some semi-complicated/annoying native-build scaffolding like we do for busybox, all for a tiny hello binary for each architecture.

(We could consider building just the bits of musl we need for each architecture here, but I'm not convinced that it's feasible to build just a minimal subset of it, and I don't know what the compilation overhead would actually be, not to mention the final binary size being roughly on par with our current sizes.)

Member

tianon commented Dec 23, 2019

So, I went down the rabbit hole of building musl from source as part of our build, and it wasn't too bad. On my 12-core Coffee Lake machine, it took less than 30 seconds to build for each architecture, and it builds successfully without much fanfare on all architectures except ppc64le.

Doing a standard int main() ... implementation, we were at roughly 10x the file size, which is much better than 500x but still not super awesome given we only really need syscall out of libc.

More interestingly, I did void _start() ... like we currently have and -nostartfiles, and ended up with the same segfault we have currently. 😄

Edit: adding -mlong-double-64 made even ppc64le work just fine

Edit 2: here's the before/after:

-rwxr-xr-x 1 tianon tianon 1.8K Dec 23 13:58 amd64/hello-world/hello*
-rwxr-xr-x 1 tianon tianon 1.7K Dec 23 13:58 arm32v5/hello-world/hello*
-rwxr-xr-x 1 tianon tianon 1.6K Dec 23 13:58 arm32v7/hello-world/hello*
-rwxr-xr-x 1 tianon tianon 4.7K Dec 23 13:58 arm64v8/hello-world/hello*
-rwxr-xr-x 1 tianon tianon 650K Dec 23 13:58 i386/hello-world/hello*
-rwxr-xr-x 1 tianon tianon  65K Dec 23 13:58 ppc64le/hello-world/hello*
-rwxr-xr-x 1 tianon tianon 2.0K Dec 23 13:58 s390x/hello-world/hello*

vs

-rwxr-xr-x 1 tianon tianon  14K Dec 23 13:27 amd64/hello-world/hello*
-rwxr-xr-x 1 tianon tianon 8.8K Dec 23 13:28 arm32v5/hello-world/hello*
-rwxr-xr-x 1 tianon tianon 4.8K Dec 23 13:28 arm32v7/hello-world/hello*
-rwxr-xr-x 1 tianon tianon 9.0K Dec 23 13:28 arm64v8/hello-world/hello*
-rwxr-xr-x 1 tianon tianon  13K Dec 23 13:28 i386/hello-world/hello*
-rwxr-xr-x 1 tianon tianon  65K Dec 23 13:55 ppc64le/hello-world/hello*
-rwxr-xr-x 1 tianon tianon 9.0K Dec 23 13:55 s390x/hello-world/hello*

Author

J0WI commented Dec 23, 2019

With musl-tools you can also build musl-libc binaries on Debian, but I have never tried to cross compile with it.

@tianon tianon mentioned this pull request Dec 27, 2019
tianon added a commit to infosiftr/hello-world that referenced this pull request Dec 30, 2019
The intent of the previous implementation was to avoid libc, but it turns out that just invoking a syscall without libc is complicated (see docker-library#62 (comment) for details).

On the other hand, my personal machine can cross-compile all of musl in ~30s per architecture, which is pretty reasonable, and the resulting binary sizes are only around ~10k each, and I was able to do so successfully for every architecture we currently support.
tianon added a commit to infosiftr/hello-world that referenced this pull request Dec 31, 2019
@yosifkit yosifkit closed this in #67 Dec 31, 2019
@J0WI J0WI deleted the buster branch January 1, 2020 00:22