The Grand Bootstrapping Plan #853

andrewrk opened this Issue Mar 21, 2018 · 4 comments



andrewrk commented Mar 21, 2018

Depends on:

  • #89 (self-hosted compiler)
  • #490 (use zig as a c/c++ compiler) (since we depend on libclang, we can expose the functionality)
  • #514 (implement libc in zig)

The idea is to have a single source tarball that, given any C++ compiler which can build for the native machine, can produce a fully operational Zig compiler for any target. The bootstrapping process is O(1) and never gets more complicated than this, because we continue to maintain the C++ Zig implementation well enough that it can build the latest self-hosted compiler.


This tarball contains:

  • Zig source code
  • LLVM source code
  • Clang source code
  • LLD source code
  • Whatever libraries the above 3 depend on. This appears to be:
    • libxml2 source code
    • zlib source code
  • libstdc++ source code from LLVM project

The build process:

  1. Use the supplied C++ compiler to build LLVM, LLD, Clang, and their respective required dependencies for the native machine, and then the Zig Stage 1 compiler from C++ source code, for the native machine.
  2. Use Zig Stage 1 to build Zig Stage 2 for the native machine, and then Zig Stage 2 to build Zig Self-Hosted Compiler, for the native machine. We are not done, because Zig Self-Hosted Compiler, through its LLVM, Clang, and LLD dependencies, depends on native system libraries, for example libc.
  3. Use Zig Self-Hosted Compiler to build zig's libc for the target.
  4. Use Zig's libc and Zig Self-Hosted Compiler - using zig as a C++ compiler - to build libstdc++ from source for the target. Using the same strategy, and libstdc++, build LLVM, LLD, Clang, and the libraries they depend on from source, for the target.
  5. Use Zig Self-Hosted Compiler and all these libraries we just cross compiled, to build Zig Self-Hosted Compiler, for the target.

What we're left with after all this is a fully statically linked Zig binary, cross compiled for the target machine, plus all the standard library files and documentation that come with a release. Bundle this all up into a .tar.xz and we have ourselves a binary ready to distribute to the specified target.
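The build process above can be read as an ordered list of (builder, output, machine) steps. The following is only a rough sketch of that ordering in Python, not part of the plan itself: the names `Step`, `bootstrap_plan`, and `validate` are hypothetical, and the target triples in the usage note are placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Labels for the tools named in the plan (illustrative strings only).
CXX = "system C++ compiler"
STAGE1 = "Zig Stage 1"
STAGE2 = "Zig Stage 2"
SELF_HOSTED = "Zig Self-Hosted Compiler"
TOOLCHAIN = "LLVM, LLD, Clang and their dependencies"


@dataclass(frozen=True)
class Step:
    builder: str    # tool that performs the build (always runs natively)
    output: str     # artifact the step produces
    built_for: str  # machine the artifact is built for


def bootstrap_plan(native: str, target: str) -> List[Step]:
    """Return the five numbered steps of the plan as ordered build steps."""
    return [
        # Step 1: native toolchain libraries, then Stage 1 from C++ source.
        Step(CXX, TOOLCHAIN, native),
        Step(CXX, STAGE1, native),
        # Step 2: Stage 1 builds Stage 2, which builds the self-hosted compiler.
        Step(STAGE1, STAGE2, native),
        Step(STAGE2, SELF_HOSTED, native),
        # Step 3: Zig's libc for the target.
        Step(SELF_HOSTED, "Zig libc", target),
        # Step 4: the C++ standard library, then the toolchain libraries.
        Step(SELF_HOSTED, "libstdc++", target),
        Step(SELF_HOSTED, TOOLCHAIN, target),
        # Step 5: the final compiler for the target.
        Step(SELF_HOSTED, SELF_HOSTED, target),
    ]


def validate(plan: List[Step], available: Iterable[str], native: str) -> None:
    """Check that every builder exists natively before it is used."""
    tools = set(available)
    for step in plan:
        if step.builder not in tools:
            raise ValueError(f"{step.builder!r} is needed before it has been built")
        if step.built_for == native:
            # Only natively built artifacts can act as builders later on.
            tools.add(step.output)
```

For example, `validate(bootstrap_plan("x86_64-linux", "aarch64-linux"), [CXX], "x86_64-linux")` passes, which reflects the claim that a native C++ compiler is the only external tool required.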

andrewrk added this to the 1.0.0 milestone Mar 21, 2018



bnoordhuis commented Mar 30, 2018

Ambitious, I like it!

libxml2 source code / iconv source code or icuuc source code?

LLVM uses libxml2 to merge Windows manifest files for side-by-side applications, and that functionality can be disabled at build time (cmake -DLLVM_ENABLE_LIBXML2=OFF, IIRC).

I don't expect manifest files are relevant to zig but even if they are, libxml2 can be built without icuuc and iconv support.

The icuuc source is ~90 MB. BSD's libiconv is much smaller but it's still a few MB (big lookup tables.)
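To make the build-time switch mentioned above concrete, here is a small Python sketch that assembles a CMake configure command for LLVM with libxml2 turned off. It assumes the option is spelled `LLVM_ENABLE_LIBXML2`, as in LLVM's CMake documentation; the helper name `llvm_configure_args` and the directory names are made up for illustration, and `-S`/`-B` require a reasonably recent CMake.

```python
from typing import List


def llvm_configure_args(source_dir: str, build_dir: str,
                        enable_libxml2: bool = False) -> List[str]:
    """Build the argv for configuring LLVM (hypothetical helper)."""
    flag = "ON" if enable_libxml2 else "OFF"
    return [
        "cmake",
        "-S", source_dir,   # where LLVM's top-level CMakeLists.txt lives
        "-B", build_dir,    # out-of-tree build directory
        "-DCMAKE_BUILD_TYPE=Release",
        # Assumed option name; it controls Windows manifest merging support.
        f"-DLLVM_ENABLE_LIBXML2={flag}",
    ]
```

The resulting list can be handed to `subprocess.run(args, check=True)`; with libxml2 disabled, neither iconv nor icuuc would need to ship in the tarball.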

"we continue to maintain the C++ zig implementation enough to the point that it can bootstrap the latest self-hosted compiler"

You don't want to get rid of it over time? Maintaining two compilers seems a bit of a drag: you have to either be conservative with what you use in the stage 2 compiler or implement new language features twice.

Is compiling the stage 2 compiler to C (or, if that's too restrictive, compiling to WebAssembly and using a wasm interpreter) and using that as the stage 1 compiler an option?



andrewrk commented Mar 30, 2018

One of the big reasons for maintaining the c++ compiler is for the benefit of package maintainers such as Debian. They want to be able to bootstrap the compiler from a trusted source version to avoid the back door problem. Maintaining a quick bootstrapping process from C++ code to final binary makes Zig easier to package and therefore more likely to be picked up by various package managers, and more likely to be kept up to date.

Dependencies in c/c++ are always the enemy of people getting the software built and running, so I really want to keep them to a minimum.

Compiling the stage2 compiler to C or WebAssembly does not solve the problem, because the C code or WebAssembly code would be output, rather than source code. What we want is a tarball containing only source code, which can then be converted to the final output with minimal dependencies.



hcnelson99 commented Sep 29, 2018

What's the plan for ergonomic features? I'm interested in contributing to improve zig's error messages (think …). Would these types of improvements be implemented only in the self-hosted compiler, or in both?



thejoshwolfe commented Sep 30, 2018

@hcnelson99 The stage1 compile errors are already a little bit human friendly, with colors and source printing. We are on par with GCC and Clang for error message formatting, including automatically switching modes depending on whether stderr is a tty.
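As an illustration of the tty-dependent mode switch described above, here is a minimal Python sketch. It is not how stage1 is implemented; the function name `format_error`, the message layout, and the choice of ANSI codes are assumptions made for the example.

```python
import sys

RED_BOLD = "\x1b[31;1m"  # ANSI escape: bold red
RESET = "\x1b[0m"        # ANSI escape: reset attributes


def format_error(path, line, col, message, source_line, stream=None):
    """Format a compile error, using color only if the stream is a terminal."""
    stream = sys.stderr if stream is None else stream
    use_color = stream.isatty()
    label = f"{RED_BOLD}error:{RESET}" if use_color else "error:"
    # Caret under the offending column (columns are 1-based here).
    caret = " " * (col - 1) + "^"
    return f"{path}:{line}:{col}: {label} {message}\n{source_line}\n{caret}\n"
```

When stderr is redirected to a file or a pipe, `isatty()` returns false and the output contains no escape sequences, so logs stay readable.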

There's already been a rejected proposal to add fancy error message features to the stage1 compiler here: #1448. I would not recommend adding anything fancy to the stage1 compiler in this domain, because its destiny is to only build a single Zig project, so comfy features like you're proposing would probably not be worth the maintenance burden.

I don't think the self hosted compiler is ready for the kinds of fancy features described in the document you linked yet. Maybe there is some work that can be done there, but I don't know.

Was there a specific feature you noticed was missing?
