WIP: add gcc module (help wanted) #3
On 07/03/2018 06:38 AM, Carl Dong wrote:
Trying to add a |gcc| module as described here
<http://www.linuxfromscratch.org/lfs/view/7.5/chapter05/gcc-pass1.html>
Invoking |./mkroot.sh -n gcc| gives the following stderr
<https://pastebin.com/jbYvSxgn> and stdout
<https://www.dropbox.com/s/8mv0l72v6d4menl/mkroot.stdout?dl=0>
Are the LFS instructions sound? Do I need to do this in multiple passes or is
there a simpler alternative?
Sorry I haven't had time to work on this, day job's been taking up all my energy.
The mcm-buildall.sh script builds native compilers for each target, as well as
cross compilers, using Rich Felker's musl-cross-make project. Prebuilt binaries
for them are in the page linked from the README.
The reason I haven't integrated them yet is I'm not building a "make" binary
yet. (Without which a compiler is noticeably less useful.) Fixing that is third
on my mkroot todo list after replacing the two remaining busybox binaries (route
and hush), and integrating mkroot into the toybox build scripts.
Rob
@landley I'm trying to build this using gcc simply because I want to produce a reproducible build for the gitian process, which currently uses gcc. I'm guessing using musl-cross-make will not produce the same binaries as gcc will? Also, I believe we're not using musl, but perhaps I can bring that up as a possible new target for future releases of bitcoin.
On 07/03/2018 11:53 AM, Carl Dong wrote:
@landley <https://github.com/landley> I'm trying to build this using gcc simply
because I want to produce a reproducible build for the gitian process, which
currently uses gcc, I'm guessing using musl-cross-make will not produce the same
binaries as gcc will? Also, I believe we're not using musl, but perhaps I can
bring that up as a possible new target for future releases of bitcoin.
musl-cross-make is a gcc/binutils build script:
https://github.com/richfelker/musl-cross-make
You can select a few different supported gcc/binutils/musl versions in the
Makefile variables at the start of:
https://github.com/richfelker/musl-cross-make/blob/master/Makefile
I've been setting GCC_VER=7.2.0, and sometimes setting MUSL_VER=git-master (a
special target that tells it to clone musl's current git repo instead of
downloading a tarball). But otherwise leaving the other versions alone.
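As an illustration of the version selection described above, here is a minimal sketch that writes a config.mak overriding those variables. It assumes musl-cross-make picks up a config.mak at the top of its checkout; the TARGET value is only an example, not something taken from this thread.

```shell
# Sketch: pin toolchain versions for one musl-cross-make build.
# Assumed to run from the top of a musl-cross-make checkout.
cat > config.mak <<'EOF'
# Target tuple to build a toolchain for (example value).
TARGET = i686-linux-musl
# gcc version mentioned above.
GCC_VER = 7.2.0
# Special value: clone musl's git repo instead of fetching a tarball.
MUSL_VER = git-master
EOF
# Afterwards, build and install with: make && make install
```

Leaving the other version variables out of config.mak keeps whatever defaults the Makefile ships with.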
My mcm-buildall.sh script invokes musl-cross-make repeatedly to build every
currently supported target, both cross and native compilers. I do so the same
way aboriginal linux used to: first I build an i686 compiler, then I use that to
build statically linked i686 binaries for the cross-compiler output, then I use
the cross compiler I just built for a target to build a native compiler for that
target. This way all the binaries should be reproducible and relocatable.
https://github.com/landley/mkroot/blob/master/mcm-buildall.sh
The prebuilt binaries I referred you to earlier (http://b.zv.io/mcm/bin/) are
Zach van Rijn in Philadelphia running my mcm-buildall.sh script and putting the
results online.
That's my existing strategy for reproducible toolchain builds: foisting it off
on a relevant domain expert. (Rich is pretty good at that part.)
(Longer-term I've been looking at Rich Pennington's https://ellcc.org but his
build needs _serious_ cleanup, and doesn't support nearly as many targets yet.)
Rob
@landley I've been experimenting with mcm all day yesterday and have a preliminary module for it (that I can open a PR for as soon as you merge the checkout functionality). I'm quite new to this cross compiling thing, so I want to validate a few of my observations and assumptions on running mcm-buildall.sh so I don't go down the wrong path...
My observations:
My mental model of mcm-buildall.sh is that it works like so:
Is the above correct?
Questions:
Since the github post is public I'm cc-ing my reply to the toybox mailing list,
for reasons explained in the body:
On 07/04/2018 03:38 PM, Carl Dong wrote:
@landley <https://github.com/landley> I've been experimenting with mcm all day
yesterday and have a preliminary module for it (that I can open a PR for as soon
as you merge the checkout functionality).
I've been treating the musl-cross-make toolchains (cross and native) as build
dependencies of mkroot, I.E. already installed prerequisites.
You seem to want to put the toolchain build back under the mkroot build. That's
a design issue we need to work out.
I'm quite new to this cross compiling thing, so I want to validate a few of my
observations and assumptions on running |mcm-buildall.sh| so I don't go down the
wrong path...
Way back when I wrote an "intro to cross compiling" that really should have been
called "why cross compiling sucks", but I was trying to be polite:
http://landley.net/writing/docs/cross-compiling.html
Then I did Aboriginal Linux, with the motto "we cross compile so you don't have
to", and wrote a big page of documentation there explaining what it was trying
to accomplish:
http://landley.net/aboriginal/about.html
(Before that page, I did training sessions based on
https://speakerdeck.com/landley/developing-for-non-x86-targets-using-qemu and if
you _really_ want the full context of what I was trying to do I reminisced at
http://landley.net/aboriginal/history.html .)
tl;dr the point of Aboriginal Linux was "simplest Linux system capable of
rebuilding itself from source code and building Linux From Scratch under the
result". I got it down to 7 packages: busybox, uClibc, linux, gcc, binutils,
make, and bash. But I did so much work extending busybox to replace the 20+ gnu
packages from LFS that I wound up maintaining that project for a bit.
Then I rebased to toybox and musl-libc (and looked for a replacement toolchain
for gcc when it went gplv3), but the main design change between aboriginal and
mkroot is that aboriginal built its own toolchain and mkroot does not.
By moving the toolchain build out to an external project somebody else
maintains, 2/3 of the complexity of aboriginal linux went away, and what was
left could be greatly simplified. (I hadn't done so before because nobody who
produced cross compilers was willing/able to produce _native_ compilers as well,
but Rich Felker was willing to be talked into it when he did mcm.)
Since doing mkroot, I've realized that mkroot doesn't really _need_ to be a
standalone project: I can merge the kernel module into the main mkroot.sh file,
merge it into the toybox repository, have it build the copy of toybox it's part
of, and point to kernel source with a command line argument or an environment
variable, so "kernel source" is an environmental prerequisite just like cross
compiler toolchain is.
Toybox needs a qemu-based bootable test environment to run root tests in its
test suite, automated regression testing on multiple targets is nice, and a
builtin simple root filesystem builder in a single file under 1000 lines of
shell script isn't a bad thing for toybox to have. Plus my 2013 toybox talk
(http://landley.net/talks/celf-2013.txt I.E. http://youtu.be/SGmtP5Lg_t0 ) was
about turning AOSP into a self-hosting development environment, and there's AOSP
build work to do there (breaking it into orthogonal layers, providing it with a
hermetic/reproducible build environment, etc). I designed mkroot with all those
goals in mind.
The resulting usage pattern might look something like:
cd ~/dir
git clone toybox
git clone musl-cross-make
git clone linux
cd musl-cross-make
../toybox/scripts/mcm-buildall.sh
cd ../toybox
ln -s ../musl-cross-make/output mcm
scripts/cross.sh all scripts/mkroot.sh LINUX=~/dir/linux NATIVE=y
(I'm still waffling on how musl-cross-make specific it should be. The "mcm"
symlink isn't an ideal UI. And NATIVE=y implies scripts/mkroot.sh in toybox
would also be aware of the mcm symlink and look for native compilers under it,
which seems wrong. Really that's more a "cross.sh -n" option setting
NATIVE_COMPILER to a path the same way it sets CROSS_COMPILE, and then _only_
cross.sh cares about that symlink. As I said, there's design work to do. :)
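To make the "only cross.sh cares about that symlink" idea concrete, here is a rough sketch of how a wrapper might turn a target tuple into the two variables. The function name, the output format, and the assumed directory layout under the mcm symlink (a bin directory holding tuple-prefixed tools) are hypothetical; only CROSS_COMPILE, NATIVE_COMPILER, and the -cross/-native directory names come from the discussion.

```shell
# Hypothetical helper: map a target tuple to toolchain paths under ./mcm.
toolchain_env() {
    tuple=$1             # e.g. "i686-linux-musl"
    mcm=${2:-mcm}        # symlink to musl-cross-make's output directory
    # Prefix that gets prepended to gcc, ld, etc. when cross compiling.
    echo "CROSS_COMPILE=$mcm/$tuple-cross/bin/$tuple-"
    # Root of the native compiler that would go into the target image.
    echo "NATIVE_COMPILER=$mcm/$tuple-native"
}

toolchain_env i686-linux-musl
```

With this split, mkroot.sh would only ever see the two variables and never need to know the symlink exists.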
However, getting even that far implies that I:
A) add usable versions of the two remaining busybox commands (route and sh) to
toybox, so I can yank the busybox download. (I'm not merging something into
toybox that depends on busybox.)
B) Add a "make" implementation to toybox (or convince musl-cross-make to build
it as part of their build, but android builds with LLVM and will never install
GPL tools into its image, so I need to write a new make anyway if the kernel
build depends on it.)
My limiting factor in all this has been lack of time: $DAYJOB eats all my
energy, no big company's wanted to sponsor me, and "take a year off and live off
my savings" is less compelling in one's 40s with a 6 figure mortgage and maybe
20 years to retirement than in one's 30s with a 5 figure mortgage and 30 years
to retirement.
My observations:
1. When we have a directory that says $ARCH-linux-musl-cross, that means the
gcc under this directory is an executable runnable on whatever architecture
the host compiler was (in mcm-buildall.sh's case, i686), that will in turn
produce executables runnable on $ARCH
Close: mcm-buildall.sh is actually currently hardwired to i686 host for the
cross compilers. (They run faster, it's sort of a poor man's x32.)
It's easy enough to change: two instances of the tuple in the script, plus the
i686-host.txt log name tee writes to, then move the new host arch to the start
of the list in the for loop at the end.
(I'd make it a variable you can set except for the part about moving the
appropriate static/native build to the start of the for loop. Alas the dynamic
-host toolchain has some architecture assumptions that easily confuse it, so we
do a proper static build with it and then use that for the other architectures.
Easy way to do that is to build that target first. :)
I've made puppy eyes at Rich about taking mcm-buildall.sh into his
musl-cross-make repo (it's not really appropriate for mkroot, and full of
_exactly_ the kind of black magic I'm trying to foist off on him anyway), but
haven't done so _loudly_ yet. :)
2. When we have a directory that says $ARCH-linux-musl-native, that means the
gcc under this directory is an executable runnable on |$ARCH| that was
produced using $ARCH-linux-musl-native
It was produced using $ARCH-linux-musl-cross. It runs on target, and produces
binaries for the target. You should be able to extract that tarball on pretty
much any system and use it, just like you can with the cross compilers. (In fact
i686-linux-musl-cross and i686-linux-musl-native should be pretty similar.)
In _practice_:
$ strace -F ./gcc --sysroot $(readlink -f ..) hello.c 2>&1 | grep stdio.h
[pid 29064] read(3, "#include <stdio.h>\n\nint main(int"..., 97) = 97
[pid 29064]
open("/home/landley/musl-cross-make/bin/i686-linux-musl-native/bin/../lib/gcc/i686-linux-musl/7.2.0/include/stdio.h",
O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_CLOEXEC|0x200000) = -1 ENOENT (No such file or
directory)
[pid 29064]
stat64("/home/landley/musl-cross-make/bin/i686-linux-musl-native/bin/../lib/gcc/i686-linux-musl/7.2.0/include/stdio.h.gch",
0xff9f9840) = -1 ENOENT (No such file or directory)
[pid 29064]
open("/home/landley/musl-cross-make/bin/i686-linux-musl-native/bin/../lib/gcc/i686-linux-musl/7.2.0/include/stdio.h",
O_RDONLY|O_NOCTTY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[pid 29064] readv(4, [{"#include <stdio.h>\n\nint main(int"..., 4095},
{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024}],
2) = 97
[pid 29064] writev(2, [{"", 0}, {"hello.c:1:10: fatal error: stdio"..., 102}],
2hello.c:1:10: fatal error: stdio.h: No such file or directory
#include <stdio.h>
Looks like I need to make more puppy eyes at Rich. I'm pretty sure this worked
at one point, and if I add "-I include" it still does. (By default it's only
searching the directory where the compiler headers provided by gcc are
installed, not the directory where the libc headers from musl are installed.
And of course the resulting hello world only runs if I --static link it because
this isn't a musl host.)
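The workaround described above could be wrapped up roughly like this. It is a sketch under the assumption that musl's headers sit in an include directory at the root of the extracted native toolchain; the function name is made up for illustration.

```shell
# Sketch: build gcc arguments that work around the header search glitch.
# $1 is the root of an extracted *-native toolchain (layout assumed).
native_cc_flags() {
    root=$1
    # --sysroot: point the compiler at its own directory tree.
    # -I:        add musl's header directory back to the search path.
    # --static:  avoid needing musl's dynamic linker on a non-musl host.
    echo "--sysroot $root -I $root/include --static"
}

# Usage, from the toolchain's bin directory as in the trace above:
#   ./gcc $(native_cc_flags "$(readlink -f ..)") hello.c -o hello
```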
My mental model of |mcm-buildall.sh| is that it works like so:
* Create i686-linux-musl bootstrap compiler linked against host libc
  o Create i686-linux-musl-cross from parent
    + Create i686-linux-musl-native from parent
    + Create *-linux-musl-cross from parent
      # Create *-linux-musl-native from parent
Is the above correct?
More or less, yes.
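The build order in that mental model can be written down as a small sketch that only prints the plan; it does not invoke musl-cross-make. The function name and the wording of each line are illustrative, not taken from mcm-buildall.sh.

```shell
# Sketch: print the order in which toolchains get built.
# First argument is the host tuple; the rest are other target tuples.
build_plan() {
    host=$1; shift
    echo "bootstrap: $host (linked against host libc)"
    echo "$host-cross (built with bootstrap)"
    echo "$host-native (built with $host-cross)"
    for target in "$@"; do
        echo "$target-cross (built with $host-cross)"
        echo "$target-native (built with $target-cross)"
    done
}

build_plan i686-linux-musl x86_64-linux-musl
```

Calling it with just one tuple gives the three-step bootstrap, cross, native chain for a single architecture.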
Questions:
1. Are both *-cross and *-native compilers portable and statically linked? As
in, can I copy them to a machine with their runnable architecture and just
run them?
Yes, modulo the header search path glitch I just noticed above.
(There's always some weird regression with new gcc versions. This is probably
because I'm building 7.2 instead of 6.4. Back in aboriginal linux I had ccwrap.c
that parsed the gcc command line and rewrote it starting with -nostdinc
-nostdlib and then added back all the search paths manually, because it was the
ONLY WAY to beat gcc into submission. Rich has more faith in the gcc developers.
Or possibly more patience.)
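For a flavor of the command line rewriting technique, here is a much-reduced sketch covering only the header search side. It is not ccwrap.c: the function name and both directory arguments are hypothetical, and the link-time half (-nostdlib plus adding startup files and libraries back) is left out entirely.

```shell
# Sketch: rewrite a compiler command line with explicit header paths.
# $1 = toolchain root (libc headers assumed in $1/include)
# $2 = directory holding the compiler's own headers (stddef.h etc.)
# Remaining arguments are the original command line.
ccwrap_include_args() {
    top=$1 gccinc=$2
    shift 2
    # Drop the built-in header search, then add back the libc headers
    # followed by the compiler's own, then the user's arguments.
    printf '%s\n' -nostdinc -isystem "$top/include" -isystem "$gccinc" "$@"
}

ccwrap_include_args /opt/tc /opt/tc/lib/gcc/include -c hello.c
```

Emitting one argument per line keeps arguments containing spaces intact for whatever consumes the output.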
2. For the "i686-linux-musl bootstrap compiler linked against host libc," does
this mean that this bootstrap compiler produces musl executables, BUT this
compiler itself was compiled using host libc?
Yes.
My old rant about the 6 paths and how a compiler is conceptually no different
from a docbook to pdf converter was recorded at a conference 10 years ago,
starting at almost exactly the 10 minute mark in
http://free-electrons.com/pub/video/2008/ols/ols2008-rob-landley-linux-compiler.ogg
. (There's probably a written version somewhere but I can't find it just now.)
The GCC developers have been insanely self-important forever, and do stuff
terribly. (That's why it's a rant.)
3. Why do we need the "i686-linux-musl bootstrap compiler linked against host
libc"? Why not go straight to "i686-linux-musl-cross"?
There's a reason I refer to it as my "compiler rant". The short answer is "the
gcc developers are insane".
4. If I only wanted one tuple (say x86_64), I could change the script to do:
* Create x86_64-linux-musl bootstrap compiler linked against host libc
  o Create x86_64-linux-musl-cross from parent
    + Create x86_64-linux-musl-native from parent
In theory, yes.
(As long as the cross/native pair for the host is the first one you build, it
should work. If it's the only one you build, that's the first one. :)
Rob