a plan to make stage1 and stage2/3 share translate-c code (migrate translate-c to userland) #1964

Open
andrewrk opened this Issue Feb 15, 2019 · 5 comments

andrewrk commented Feb 15, 2019

In the same way that we can make zig fmt work in stage1 by doing the same thing we do with zig build, we can move the translate-c implementation to userland. Here's the plan:

  1. Separate translate_c.cpp into clang-api-c-wrapper.h/clang-api-c-wrapper.cpp and translate_c.cpp. The wrapper simply creates a C interface on top of the C++ clang API, and translate_c.cpp goes through the wrapper API. This transition can be done piecemeal.
  2. Create clang-api-c-wrapper.zig which is a Zig version of the clang-api-c-wrapper.h file. These will have to be kept in sync in order to prevent a dependency loop. This can be done lazily with step 3.
  3. Port translate_c.cpp to translate_c.zig and then delete translate_c.cpp. zig translate-c in stage1 will then be done by modifying the build of stage1 as described in #1964 (comment). This step can also be done piecemeal, with a separate set of translate-c tests specifically for the Zig port, until finally the Zig port passes all the tests and step 3 is completed.
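
To illustrate step 1, here is a minimal sketch of the wrapper pattern: a C-callable function over a C++ class, where the C side only ever sees an opaque pointer. Everything named here is hypothetical (`fake_clang::Decl` stands in for a real clang class, and `WrapDecl` / `WrapDecl_getNameCStr` are not the actual wrapper API); a real wrapper would include the clang headers instead of defining its own class.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Stand-in for a C++ library class (assumption: not a real clang type).
namespace fake_clang {
class Decl {
public:
    explicit Decl(std::string name) : name_(std::move(name)) {}
    const std::string &getName() const { return name_; }
private:
    std::string name_;
};
} // namespace fake_clang

// ---- header half: only C-compatible declarations ----
extern "C" {
// Opaque type: callers hold pointers to it but never see its layout.
struct WrapDecl;
const char *WrapDecl_getNameCStr(const WrapDecl *self);
}

// ---- implementation half: the only place that touches the C++ API ----
extern "C" const char *WrapDecl_getNameCStr(const WrapDecl *self) {
    assert(self != nullptr);
    // The opaque pointer is really a pointer to the C++ object.
    const auto *decl = reinterpret_cast<const fake_clang::Decl *>(self);
    return decl->getName().c_str();
}
```

Because the header half contains nothing but C declarations, the same declarations can be written out again in a .zig file, which is the kind of mirroring step 2 describes.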

This makes us pay a small stage1 performance price today (going through a cached build process / child process execution / file system for C imports), in order to invest in the future. The price would not have to be paid in stage2/3, and it makes stage1/2/3 all share the same translate-c implementation, which would be userland Zig code.

@andrewrk andrewrk added this to the 0.5.0 milestone Feb 15, 2019

andrewrk commented Feb 16, 2019

I did a sort of proof-of-concept with zig fmt in ba56f36

andrewrk added a commit that referenced this issue Feb 16, 2019

andrewrk commented Feb 16, 2019

I started on step 1 in 356cfa0. The last puzzle piece to fit in here is that, until we get self-hosted and start shipping stage2 (#89), we will have an interesting build process:

  • Build zig/zig.exe - the stage1 compiler with @cImport and translate-c capabilities disabled.
  • One of the artifacts from the previous step is zig_cpp.a/zig_cpp.lib. Using that and .zig translate-c source code, build translate_c.a/translate_c.lib.
  • Build zig/zig.exe - the stage1 compiler again, this time with @cImport and translate-c capabilities enabled. Link with translate_c.a/translate_c.lib.

I will note this negates the "small stage1 performance price today" I mentioned earlier. We can have our 🍰 and eat it too!

andrewrk added a commit that referenced this issue Feb 17, 2019

andrewrk commented Feb 17, 2019

Marking this as contributor friendly since you can follow the pattern I've started in the latest commit (c3c92ca). The goal is to delete these lines from translate_c.cpp:

#include <clang/Frontend/ASTUnit.h>
#include <clang/Frontend/CompilerInstance.h>
#include <clang/AST/Expr.h>

The code should instead rely only on the decls in zig_clang.h. Once that's done we can proceed to steps 2 and 3. If you want to contribute, look for uses of clang::Foo in translate_c.cpp and change the code to use the corresponding decls from zig_clang.h instead.
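
As a sketch of what one such conversion might look like, the block below replaces a direct use of a C++ class with a C-compatible struct plus a wrapper function. The names are hypothetical: `fake_clang::SourceLocation` stands in for a clang type, and `WrapSourceLocation` for a decl that would live in zig_clang.h; the actual decls in that header may differ.

```cpp
#include <cassert>

// Stand-in for a C++ library value type (assumption: not the real clang class).
namespace fake_clang {
class SourceLocation {
public:
    explicit SourceLocation(unsigned raw) : raw_(raw) {}
    bool isValid() const { return raw_ != 0; }
private:
    unsigned raw_;
};
} // namespace fake_clang

// C-compatible mirror of the value type, plus the wrapper function.
extern "C" {
struct WrapSourceLocation {
    unsigned raw;
};
bool WrapSourceLocation_isValid(WrapSourceLocation loc);
}

// Only the wrapper implementation converts back to the C++ type.
extern "C" bool WrapSourceLocation_isValid(WrapSourceLocation loc) {
    return fake_clang::SourceLocation(loc.raw).isValid();
}

// Before the conversion, translation code would name the C++ type directly:
//     bool should_emit(fake_clang::SourceLocation loc) { return loc.isValid(); }
// After the conversion, it only mentions wrapper decls:
bool should_emit(WrapSourceLocation loc) {
    return WrapSourceLocation_isValid(loc);
}
```

Each call site converted this way removes one `clang::` reference from the translation code, so the three includes are no longer needed once none remain.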

andrewrk added a commit that referenced this issue Apr 11, 2019

andrewrk commented Apr 11, 2019

Progress bar (run grep 'clang::' ../src/translate_c.cpp | wc -l, subtract the result from 1137, and divide by 1137):

  • 45%
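
For reference, the arithmetic behind the progress bar as a small sketch. The function name `percent_done` is made up for illustration, and rounding down is an assumption; the issue does not say how the percentage is rounded.

```cpp
#include <cassert>

// Percentage of the original `clang::` references that have been removed.
// total_at_start: number of matching lines when counting began (1137 in the issue).
// remaining: current output of the grep | wc -l pipeline.
int percent_done(int total_at_start, int remaining) {
    assert(total_at_start > 0);
    assert(remaining >= 0 && remaining <= total_at_start);
    // Integer division rounds down (assumed rounding mode).
    return (total_at_start - remaining) * 100 / total_at_start;
}
```

With this definition, `percent_done(1137, 1137)` is 0 and `percent_done(1137, 0)` is 100.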

andrewrk added a commit that referenced this issue Apr 12, 2019

andrewrk added a commit that referenced this issue Apr 15, 2019

andrewrk added a commit that referenced this issue Apr 16, 2019

stage1 is now a hybrid of C++ and Zig

This modifies the build process of Zig to put all of the source files
into libcompiler.a, except main.cpp and userland.cpp.

Next, the build process links main.cpp, userland.cpp, and libcompiler.a
into zig1. userland.cpp is a shim for functions that will later be
replaced with self-hosted implementations.

Next, the build process uses zig1 to build src-self-hosted/stage1.zig
into libuserland.a, which does not depend on any of the things that
are shimmed in userland.cpp, such as translate-c.

Finally, the build process re-links main.cpp and libcompiler.a, except
with libuserland.a instead of userland.cpp. Now the shims are replaced
with .zig code. This provides all of the Zig standard library to the
stage1 C++ compiler, and enables us to move certain things to userland,
such as translate-c.

As a proof of concept I have made the `zig zen` command use text defined
in userland. I added `zig translate-c-2` which is a work-in-progress
reimplementation of translate-c in userland, which currently calls
`std.debug.panic("unimplemented")` and you can see the stack trace makes
it all the way back into the C++ main() function (Thanks LemonBoy for
improving that!).

This could potentially let us move other things into userland, such as
hashing algorithms, the entire cache system, .d file parsing, pretty
much anything that libuserland.a itself doesn't need to depend on.

This can also let us have `zig fmt` in stage1 without the overhead
of child process execution, and without the initial compilation delay
before it gets cached.

See #1964
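
The shim arrangement described in this commit message can be sketched as follows. All names are hypothetical (`userland_zen` is not necessarily the real symbol, and the placeholder text is invented): the C++ side calls a C-ABI function, the first link satisfies it with a stub, and the final link would provide the same symbol from libuserland.a instead.

```cpp
#include <cassert>
#include <stddef.h>

// Interface shared by both links: a plain C-ABI function (hypothetical name).
extern "C" void userland_zen(const char **ptr, size_t *len);

// Shim implementation, playing the role of userland.cpp in the first link.
// In the final link this definition would be left out and the same symbol
// would come from the library built from .zig code.
extern "C" void userland_zen(const char **ptr, size_t *len) {
    static const char msg[] = "(zen text is not available in the bootstrap build)";
    *ptr = msg;
    *len = sizeof(msg) - 1; // exclude the terminating NUL
}

// Caller on the C++ side. It only depends on the declaration above, so it
// does not need to change when the implementation behind it is swapped.
size_t zen_text_length() {
    const char *ptr = nullptr;
    size_t len = 0;
    userland_zen(&ptr, &len);
    assert(ptr != nullptr);
    return len;
}
```

The design point is that the swap happens at link time: the caller is compiled once against the declaration, and which definition it ends up calling depends only on what it is linked with.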

andrewrk commented Apr 17, 2019

Now that #2295 is merged, all the proofs of concept are completed, the plan is working, and the rest of this issue is just porting the C++ translate-c code to Zig. We have zig translate-c-2 coexisting with zig translate-c for now. Once the self-hosted version is on par with the C++ version we'll delete the C++ version.
