self-hosted compiler: ship it! #89

Closed
34 of 49 tasks
andrewrk opened this issue Jan 26, 2016 · 33 comments · Fixed by #12368
Labels

  • backend-llvm: The LLVM backend outputs an LLVM IR Module.
  • enhancement: Solving this issue will likely involve adding new logic or components to the codebase.
  • frontend: Tokenization, parsing, AstGen, Sema, and Liveness.
Comments

@andrewrk
Member

andrewrk commented Jan 26, 2016

This is a tracking issue for when we can completely replace stage1 with self-hosted in official builds of Zig.

andrewrk added the enhancement label Jan 26, 2016
andrewrk added this to the 1.0.0 milestone May 5, 2017
@andrewrk
Member Author

andrewrk commented May 5, 2017

When we release zig 1.0.0, we'll actually have the self-hosted compiler built and ready to go, passing all tests. Both projects will be maintained until 2.0.0.

andrewrk added the standard library label Aug 27, 2017
@andrewrk
Member Author

andrewrk commented Sep 8, 2017

There are some compelling reasons to self-host:

  • Makes f128 work without needing GNU extensions
  • We can write the doc generator in userland
  • Makes builtin overflow stuff work without C compiler extensions
  • Ability to easily cross-compile the compiler

We still want the bootstrapping process to be simple though. So here's another proposal. We get a self-hosted compiler going right now. It's the official zig compiler. However, the C++ implementation must be able to build the official zig compiler. As long as that remains true, bootstrapping is a one-step process.

@andrewrk
Member Author

andrewrk commented Apr 17, 2018

Things I Want to Improve in the Self-Hosted Compiler

Performance and Caching

  • Max out performance of machine with thread pool and async I/O
  • Pipeline all the work. Split the job up into individual functions that each produce .o files. We'll have LLVM spitting out .o files before the last source file has been tokenized.
  • No mutexes. When a coroutine needs a resource that another thread is working on, it yields to another job, getting resumed (in userspace!) when the resource is available (this looks like async/await)
  • Multi-layer caching. Cache files, cache AST, cache individual functions
  • Establish a file system watch on source files, detecting changes, running through the pipeline (taking caching into account), and atomically updating output files in place. The compiler is a long-lived process and some of the caching happens in memory.
  • Handle temporary out-of-memory situations by emitting an event that says "waiting for more memory to be available" and prints how much was needed along with how much the system has available
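
As a rough illustration of the "no mutexes" and in-memory caching ideas in the list above, here is a minimal Python sketch. It is not Zig compiler code: `FileCache`, `compile_all`, and `run_demo` are hypothetical names, and it uses a single-threaded asyncio event loop rather than a thread pool. The point it shows is that a job needing a resource another job is already producing simply awaits it (yielding to other work) instead of taking a lock, and that each file is loaded only once.

```python
import asyncio


class FileCache:
    """In-memory cache where each file is loaded at most once (hypothetical)."""

    def __init__(self):
        self._tasks = {}      # path -> asyncio.Task that produces the parsed result
        self.load_count = {}  # path -> how many times the loader actually ran

    async def _load(self, path):
        self.load_count[path] = self.load_count.get(path, 0) + 1
        await asyncio.sleep(0)  # stand-in for async file I/O; yields to other jobs
        return f"ast({path})"

    async def get(self, path):
        task = self._tasks.get(path)
        if task is None:
            # First requester starts the work; nobody takes a lock.
            task = asyncio.create_task(self._load(path))
            self._tasks[path] = task
        # Later requesters wait on the same task and are resumed when it finishes.
        return await task


async def compile_all(cache, requests):
    return await asyncio.gather(*(cache.get(p) for p in requests))


def run_demo():
    cache = FileCache()
    requests = ["main.zig", "quux.zig", "main.zig", "quux.zig", "main.zig"]
    results = asyncio.run(compile_all(cache, requests))
    return results, cache.load_count
```

Calling `run_demo()` requests `main.zig` three times and `quux.zig` twice, yet each loader runs once; the duplicate requests are served from the cached task.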

Representation of Types and Values

ConstExprValue right now has a lot of footguns built into it,
and it wastes memory. The new data layout should accomplish
these things:

  • Use a minimal amount of memory
  • Have at the very least runtime safety for wrong union field access
    and hopefully more compile errors when adding and removing fields.
  • In Stage 1, the Type tells how to interpret the Value. In self-hosted,
    we should divorce these concepts. This should make comptime casting
    more correctly represented.
  • Introduce lazy values. For example, if you do @sizeOf(error), this
    can create a Value that is backed by a LazyComptimeExpr. We could still
    find out @typeOf(@sizeOf(error)) without causing the lazy expr to
    evaluate. Once we get to the end of the compilation, we start evaluating
    all the lazy expressions. If a lazy expression depends on another lazy
    expression, it gets skipped, and we make a note to start over once done.
    If all lazy expressions must be skipped, then it's a compile error, and
    we show the dependency loop.
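
The lazy-value evaluation order described above can be sketched roughly as follows. This is a toy Python model under stated assumptions, not the compiler's data layout: `evaluate_lazy` and `DependencyLoop` are hypothetical names, each lazy expression is modeled as a list of dependency names plus a function, and every dependency is assumed to name another entry in the same table. On a loop it reports all still-pending names rather than a minimal cycle.

```python
class DependencyLoop(Exception):
    """Raised when every remaining lazy expression had to be skipped."""


def evaluate_lazy(exprs):
    """Evaluate lazy expressions given as name -> (deps, fn).

    deps is a list of names in exprs; fn receives the resolved values of
    those deps as positional arguments.
    """
    resolved = {}
    pending = dict(exprs)
    while pending:
        progressed = False
        for name in list(pending):
            deps, fn = pending[name]
            if any(d in pending for d in deps):
                continue  # skipped: depends on a lazy expression not yet evaluated
            resolved[name] = fn(*(resolved[d] for d in deps))
            del pending[name]
            progressed = True
        if not progressed:
            # Everything left was skipped, so the remaining names form (or depend on) a loop.
            raise DependencyLoop(sorted(pending))
        # Otherwise start over with whatever is still pending.
    return resolved


example = {
    "total": (["array_len", "size_of_error"], lambda a, s: a + s),
    "array_len": (["size_of_error"], lambda s: s * 4),
    "size_of_error": ([], lambda: 2),
}
values = evaluate_lazy(example)
```

Here `total` and `array_len` are skipped until `size_of_error` has a value; a table in which `a` depends on `b` and `b` depends on `a` raises `DependencyLoop` instead.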

@phase
Contributor

phase commented Apr 18, 2018

> We'll have LLVM spitting out .o files before the last source file has been tokenized.

How will this work with name resolution? You can't compile a file that depends on a file that hasn't been parsed yet.

@andrewrk
Member Author

andrewrk commented Apr 18, 2018

Here's an example:

  • we have 2 cores and therefore thread pool size 2
  • thread 1 loads, tokenizes, and parses main.zig, which calls foo(), bar(), baz()
  • thread 1 scans top-level decls and creates jobs to analyze foo(), bar(), baz()
  • thread 1 analyzes foo()
  • thread 2 analyzes bar()
  • thread 1 generates llvm code for foo()
  • thread 2 generates llvm code for bar()
  • thread 1 emits foo.o
  • thread 2 emits bar.o
  • thread 1 analyzes baz(). baz() calls @import("quux.zig")
  • thread 1 loads, tokenizes, and parses quux.zig
  • etc
  • main thread calls LLD linker on all the .o files

You can see from this example we would get better parallelism if we prioritized analysis of functions since that creates jobs for the pipeline - it would make thread 2 have something to do while thread 1 analyzes baz(). But this should illustrate the idea.
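
The job flow in the example above can be modeled with a small Python sketch. It is only an illustration under stated assumptions, not how the Zig compiler is implemented: the `FILES` and `IMPORTS` tables, the stage functions, and `build` are all hypothetical, and the "work" in each stage is a stub. Each finished job returns its follow-up jobs, so name resolution never blocks: quux.zig is parsed only once analysis of baz() discovers the import.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

# Hypothetical toy source tree: file -> top-level functions, function -> imported files.
FILES = {"main.zig": ["foo", "bar", "baz"], "quux.zig": ["quux"]}
IMPORTS = {"baz": ["quux.zig"]}


def parse(path):
    # Load, tokenize, parse: creates one analysis job per top-level function.
    return [("analyze", fn) for fn in FILES[path]]


def analyze(fn):
    # Analysis may discover imports; it then hands the function to codegen.
    return [("parse", p) for p in IMPORTS.get(fn, [])] + [("codegen", fn)]


def codegen(fn):
    # Stand-in for generating code and emitting an object file.
    return [("emit", fn + ".o")]


STAGES = {"parse": parse, "analyze": analyze, "codegen": codegen}


def build(root, workers=2):
    objects, seen = [], {("parse", root)}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = {pool.submit(parse, root)}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                for job in fut.result():
                    kind, arg = job
                    if kind == "emit":
                        objects.append(arg)
                    elif job not in seen:  # never schedule the same job twice
                        seen.add(job)
                        pending.add(pool.submit(STAGES[kind], arg))
    # The main thread "links" once every job has finished.
    return "link(" + ", ".join(sorted(objects)) + ")"
```

With two workers, `build("main.zig")` schedules the analysis of foo(), bar(), and baz() as soon as main.zig is parsed, and the main thread performs the final link step over all four object files. Only the main thread touches the job bookkeeping, so the workers need no locks in this sketch.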

@ghost

ghost commented Aug 21, 2018

0.3.0 ... seemed so close 😃

@andrewrk
Member Author

Yeah. I couldn't make the deadline. 0.3.0 is two weeks away and I think those two weeks can best be spent on:

  • Stack traces for Windows and macOS
  • Documentation
  • Bug fixes

@ghost

ghost commented Aug 21, 2018

building master always anyway 👍

@ghost

ghost commented Nov 12, 2018

Something that I would like to see with a self-hosted compiler is the ability to import the compiler as a library within my application to compile and link new code while running. For example, https://github.com/anael-seghezzi/CToy embeds tcc and provides a creative coding environment that does not require a restart when modifying code.

Is such a thing possible already when importing std.build?

@andrewrk
Member Author

> Something that I would like to see with a self-hosted compiler is the ability to import the compiler as a library within my application to compile and link new code while running.

Unfortunately that's not ever going to be possible, because of the LLVM and clang dependency. The Zig compiler ships with LLVM and clang libraries built into it, and Zig supports cross compiling for many targets. To import the compiler as a library would require that LLVM and clang were available in source form (written in zig) so that they could be cross compiled for the target. To give up LLVM/clang and code our own code generator in zig would be giving up state-of-the-art optimizations and a very active community of people working on it.

However, the parts of the compiler that do not depend on LLVM and clang are available in the standard library, for example the parser and formatter. std.zig.parse and std.zig.render.

As for a coding environment that does not require a restart when modifying code, see #68.

andrewrk modified the milestones: 0.4.0, 0.5.0 Feb 8, 2019
@marler8997
Contributor

Having LLVM/Clang written in C++ shouldn't prevent the compiler from being used as a library. Just like zig programs can link to and use C libraries even though they aren't in Zig.

andrewrk added a commit that referenced this issue Aug 9, 2022
stage1 is available behind the -fstage1 flag.

closes #89
@andrewrk
Member Author

Hello everybody,

The self-hosted compiler is now the default compiler. I have marked the 0.10.0 milestone with the due date of September 21, 2022.

Please take a look at the upgrade guide for help deciding when to upgrade and how to upgrade.

I will personally be working on the existing bug reports that are affecting third party projects, and then continue working with the projects to get them fully upgraded.

@RobertWHurst

RobertWHurst commented Aug 20, 2022

Congratulations @andrewrk! I know you and your team have been working diligently to make this happen for a long time. It's a serious milestone. Nicely done 🎉

@lin72h

lin72h commented Aug 20, 2022

Just witnessed a historic event, pure awesome

@hako

hako commented Aug 20, 2022

major congrats @andrewrk and team for shipping this huge milestone for Zig! ⚡ 🚢

@malcolmstill
Contributor

I have a table of function pointers which stage2 tells me to change to const* (which makes sense). But I then use those function pointers with @call(.{ .modifier = .always_tail } and I get the error error: modifier 'always_tail' requires a comptime-known function. Not sure what I need to do to fix that?

See malcolmstill/zware#139

@kuon
Contributor

kuon commented Aug 20, 2022

Maybe we could create a specific issue template for bugs occurring due to the upgrade? I don't think it will be efficient to make this one longer.

@david4r4
Contributor

david4r4 commented Aug 20, 2022

> I have a table of function pointers which stage2 tells me to change to const* (which makes sense). But I then use those function pointers with @call(.{ .modifier = .always_tail } and I get the error error: modifier 'always_tail' requires a comptime-known function. Not sure what I need to do to fix that?
>
> See malcolmstill/foxwren#139

Yes, I agree with @kuon. You should discuss it in another issue or in any community channel. Just to give you a hint, basically you can't tail a function pointer call because you have no idea what body the function has.

By the way, congrats to the Zig team and the whole community! :)

@malcolmstill
Contributor

@kuon @davidgm94 will do, thanks.

andrewrk unpinned this issue Aug 20, 2022