Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stage2 whole file astgen #8554

Merged
merged 240 commits into from
May 19, 2021
Merged

Stage2 whole file astgen #8554

merged 240 commits into from
May 19, 2021

Conversation

andrewrk
Copy link
Member

This is an implementation of #8516. It is a continuation of #8514. There is much to be done before this branch passes the existing stage2 tests, but it is a major step forward in progress.

The goal of this branch is to get to the point where existing tests are passing, but with the main compilation being kicked off in the same way that it is with stage1: via an implicit _ = @import("std"); which, in turn, imports start.zig, which, in turn, imports the root source file and decides how to handle the entry point.

There are a lot of miscellaneous things standing in the way of this, and this branch aims to tackle all of them! See individual commit messages for more info.

Instead of Module setting up the root_scope with the root source file,
instead, Module relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in term, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope` - the root source file is no
longer special, it's just in the package table mapped to "root".

I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.

The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.

 * AstGen: add "previously declared here" notes for variables shadowing
   decls.
 * Parse imports as structs. Module now calls `AstGen.structDeclInner`,
   which is called by `AstGen.containerDecl`.
   - `importFile` is a bit kludgy with how it handles the top level Decl
     that kinda gets merged into the struct decl at the end of the
     function. Be on the look out for bugs related to that as well as
     possibly cleaner ways to implement this.
 * Module: factor out lookupDeclName into lookupIdentifier and lookupNa
 * Rename `Scope.Container` to `Scope.Namespace`.
 * Delete some dead code.

This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
 * AstGen: emit decl lookup ZIR instructions rather than directly
   looking up decls in AstGen. This is necessary because we want to
   reuse the same immutable ZIR code for multiple generic instantiations
   (and comptime function calls).
 * AstGen: fix using members_len instead of fields_len for struct decls.
 * structs: the struct_decl ZIR instruction is now also a block. This is
   so that the type expressions, default field value expressions, and
   alignment expressions can be evaluated in a scope that contains the
   decls from the struct namespace itself.
 * Add "std" and "builtin" packages to the builtin package.
 * Don't try to build glibc, musl, or mingw-w64 when using `-ofmt=c`.
 * builtin.zig is generated without `usingnamespace`.
 * builtin.zig takes advantage of `std.zig.fmtId` for CPU features.
 * A first pass at implementing `usingnamespace`. It's problematic and
   should either be deleted, or polished, before merging this branch.
 * Sema: allow explicitly specifying the namespace in which to look up
   Decls. This is used by `struct_decl` in order to put the decls from
   the struct namespace itself in scope when evaluating the type
   expressions, default value expressions, and alignment expressions.
 * Module: fix `analyzeNamespace` assuming that it is the top-level root
   declaration node.
 * Sema: implement comptime and runtime cmp operator.
 * Sema: implement peer type resolution for enums and enum literals.
 * Pull in the changes from master branch:
   262e09c.
 * ZIR: complete out simple_ptr_type debug printing
fewer required language features to allow this to work
since it now uses top level fields
See #8516.

 * AstGen is now done on whole files at once rather than per Decl.

 * Introduce a new wait group for AstGen tasks. `performAllTheWork`
   waits for all AstGen tasks to be complete before doing Sema,
   single-threaded.
   - The C object compilation tasks are moved to be spawned after
     AstGen, since they only need to complete by the end of
     the function.

With this commit, the codebase compiles, but much more reworking is
needed to get things back into a useful state.
 * AstGen: represent compile errors in ZIR rather than returning
   `error.AnalysisFail`.
 * ZIR: remove decl_ref and decl_val instructions. These are replaced by
   `decl_ref_named` and `decl_val_named`, respectively, which will
   probably get renamed in the future to the instructions that were just
   deleted.
 * AstGen: implement `@This()`, `@fence()`, `@returnAddress()`, and
   `@src()`.
 * AstGen: struct_decl improved to support fields_len=0 but have decls.
 * AstGen: fix missing null bytes after compile error messages.
 * SrcLoc: no longer depend on `Decl`. Instead have an explicit field
   `parent_decl_node` which is an absolute AST Node index.
 * Module: `failed_files` table can have null value, in which case the
   key, which is a `*Scope.File`, will have ZIR errors in it.
 * ZIR: implement text rendering of struct decls.
 * CLI: introduce debug_usage and `zig astgen` command which is enabled
   when the compiler is built in debug mode.
I've run into this footgun enough times, nearly every time I want
`ensureUnusedCapacity`, not `ensureCapacity`. This commit deprecates
`ensureCapacity` in favor of `ensureTotalCapacity` and introduces
`ensureUnusedCapacity`.
And fix bug with using `ensureCapacity` when I wanted
`ensureUnusedCapacity`.
it was using the wrong scope and other mistakes too
previously, it was incorrectly relative to the package directory
This provides a simpler test runner that exercises fewer language
features so that we can get to the point faster where `zig test` works
for stage2.
Previously, stage2 used a global decl_table for all Decl objects, keyed
by a 16-byte name hash that was hopefully unique. Now, there is a tree
of Namespace objects that own their named Decl objects.
Not sure why it was failing, also not sure why it started passing again
Previously, ZIR was per-function so we could simply allocate a slice for
all ZIR instructions. However now ZIR is whole-file, so we need a sparse
mapping of ZIR to AIR instructions in order to not waste memory.
when the AST for the switch has never been loaded
In order for this test to pass, the host linking/start code needs to
support explicitly setting the stack size. Zig defaults to 16 MiB stack
size, which is enough to pass the test in Debug builds, however, most
operating systems do not honor the stack size we request for and give a
smaller amount.

Eventually the goal is to pass this test on all hosts.
…stgen

Conflicts:
 * src/codegen/spirv.zig
 * src/link/SpirV.zig

We're going to want to improve the stage2 test harness to print
the source file name when a compile error occurs otherwise std lib
contributors are going to see some confusing CI failures when they cause
stage2 AstGen compile errors.
A previous commit from this branch incorrectly changed the usage of
`comptime` keyword, and broke the std lib tests. This commit adds
`comptime` to a few function calls, correcting the behavior.
@andrewrk
Copy link
Member Author

Leaving this for myself, for tomorrow.
The last remaining problem in this branch is this:

When scanDecls happens, we create stub Decl objects that
have not been semantically analyzed. When they get referenced,
they get semantically analyzed.

Right now, when they get unreferenced, they get completely deleted,
including deleted from the containing Namespace.

However, if the update does not cause the containing Namespace to get
deleted, for example, if std.builtin.ExportOptions is no longer
referenced, but std.builtin is still referenced, and then ExportOptions
gets referenced again, the Namespace is incorrectly missing the
Decl, so we get an incorrect "no such member" error.

The solution is to, when dealing with a no longer referenced Decl
objects during an update, clear them to the state they would be in
on a fresh scanDecl, rather than completely deleting them.
Probably the logic needs to diverge because we also have the case
of wanting to completely delete them, such as when deinitializing
the Module.

This will solve this problem:

thread 35803 panic: attempt to use null value
/Users/runner/work/1/s/src/Sema.zig:7387:65: 0x1044f8c9d in Sema.getBuiltinType (test)
    const ty_inst = try sema.analyzeLoad(block, src, opt_ty_inst.?, src);
                                                                ^

I think that's the last issue blocking the merge.

When scanDecls happens, we create stub Decl objects that
have not been semantically analyzed. When they get referenced,
they get semantically analyzed.

Before this commit, when they got unreferenced, they were completely
deleted, including deleted from the containing Namespace.

However, if the update did not cause the containing Namespace to get
deleted, for example, if `std.builtin.ExportOptions` is no longer
referenced, but `std.builtin` is still referenced, and then `ExportOptions`
gets referenced again, the Namespace would be incorrectly missing the
Decl, so we get an incorrect "no such member" error.

The solution is to, when dealing with a no longer referenced Decl
objects during an update, clear them to the state they would be in
on a fresh scanDecl, rather than completely deleting them.
…stgen

I want the updated Drone CI stuff to get the CI green.
@andrewrk
Copy link
Member Author

andrewrk commented May 18, 2021

This is imminently close to merging, so I've extracted out some of the issues I noted while working on this branch, but should not bock the merge: #8817 #8818 #8819 #8820 #8821 #8822 #8823 #8824 #8825 #8826 #8827 #8828 #8829

Another problem I need to solve is that this makes it so @import of a non-existent file name is a compile error. I believe this is an improvement, however is also problematic for a zig installation since we intentionally omit large xyz_test.zig files for compiler-rt, to reduce installation size. My plan to address this is to blank out these files rather than omit them from the installation.

[nix-shell:~/dev/zig/build-release]$ ./zig-out/bin/zig build-exe hello.zig
error: unable to load crypto/test.zig: FileNotFound
error: unable to load net/test.zig: FileNotFound
error: unable to load os/linux/test.zig: FileNotFound
error: unable to load zig/parser_test.zig: FileNotFound
error: unable to load os/test.zig: FileNotFound
error: unable to load json/test.zig: FileNotFound
error: unable to load io/test.zig: FileNotFound
error: unable to load fs/test.zig: FileNotFound
error: unable to load math/big/int_test.zig: FileNotFound

Now that `@import` of a non-existent file name is a compile error, the
Zig installation needs to include files that end with test.zig. However
we still want to avoid bloat in the installation size, so we blank them
out instead of omitting them entirely.

Now AstGen of the full standard library can complete successfully on a
Zig installation rather than erroring out with file not found.
@andrewrk andrewrk merged commit 6435750 into master May 19, 2021
@andrewrk andrewrk deleted the stage2-whole-file-astgen branch May 19, 2021 00:20
@andrewrk andrewrk added the breaking Implementing this issue could cause existing code to no longer compile or have different behavior. label May 19, 2021
@Snektron Snektron mentioned this pull request May 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants