rework and vastly expand the MIR section #67
Changes from 2 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,9 +17,9 @@ A control-flow graph is structured as a set of **basic blocks** | |
connected by edges. The key idea of a basic block is that it is a set | ||
of statements that execute "together" -- that is, whenever you branch | ||
to a basic block, you start at the first statement and then execute | ||
all the remainder. Only at the end of the is there the possibility of | ||
branching to more than one place (in MIR, we call that final statement | ||
the **terminator**): | ||
all the remainder. Only at the end of the block is there the | ||
possibility of branching to more than one place (in MIR, we call that | ||
final statement the **terminator**): | ||
|
||
``` | ||
bb0: { | ||
|
@@ -88,7 +88,8 @@ cycle. | |
|
||
## What is co- and contra-variance? | ||
|
||
*to be written* | ||
Check out the subtyping chapter from the | ||
[Rust Nomicon](https://doc.rust-lang.org/nomicon/subtyping.html). | ||
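Until this section is written, a small sketch may help show what covariance means in practice. It is not taken from the Nomicon, and the function names `longest_len` and `covariance_demo` are made up for illustration. It relies only on the fact that `Vec<T>` is covariant in `T` and that `&'static str` is a subtype of `&'a str` for any shorter lifetime `'a`.

```rust
/// Returns the length of the longest string slice, or 0 for an empty list.
fn longest_len(items: &[&str]) -> usize {
    items.iter().map(|s| s.len()).max().unwrap_or(0)
}

/// Hypothetical demo: a vector of `'static` references is accepted where a
/// vector of shorter-lived references is expected.
fn covariance_demo() -> usize {
    // `local` lives only for the duration of this function call.
    let local = String::from("hello");
    let statics: Vec<&'static str> = vec!["a", "bcd"];

    // Covariance at work: `Vec<&'static str>` is a subtype of `Vec<&'a str>`,
    // where `'a` is the (shorter) lifetime of the borrow of `local` below.
    let mut shorter: Vec<&str> = statics;
    shorter.push(local.as_str());

    longest_len(&shorter)
}
```

If `Vec` were invariant in its parameter, the assignment to `shorter` would be rejected, because the two vector types would have to match exactly.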
|
||
<a name=free-vs-bound> | ||
|
||
|
@@ -97,18 +98,17 @@ cycle. | |
Let's describe the concepts of free vs bound in terms of program | ||
variables, since that's the thing we're most familiar with. | ||
|
||
- Consider this expression: `a + b`. In this expression, `a` and `b` | ||
refer to local variables that are defined *outside* of the | ||
expression. We say that those variables **appear free** in the | ||
expression. To see why this term makes sense, consider the next | ||
example. | ||
- In contrast, consider this expression, which creates a closure: `|a, | ||
- Consider this expression, which creates a closure: `|a, | ||
b| a + b`. Here, the `a` and `b` in `a + b` refer to the arguments | ||
that the closure will be given when it is called. We say that the | ||
`a` and `b` there are **bound** to the closure, and that the closure | ||
signature `|a, b|` is a **binder** for the names `a` and `b` | ||
Review comment: Perhaps add binder to the glossary. I have often wondered about that term. |
||
(because any references to `a` or `b` within refer to the variables | ||
that it introduces). | ||
- Consider this expression: `a + b`. In this expression, `a` and `b` | ||
refer to local variables that are defined *outside* of the | ||
expression. We say that those variables **appear free** in the | ||
expression (i.e., they are **free**, not **bound** (tied up)). | ||
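The two cases can be sketched in Rust as follows. The function names are hypothetical and exist only to give each expression a home; the third function is an extra illustration (not from the text above) of a closure body in which one name is bound and another appears free.

```rust
/// `a` and `b` are defined outside the expression `a + b`,
/// so both names appear free in that expression.
fn free_example() -> i32 {
    let a = 1;
    let b = 2;
    a + b
}

/// The closure signature `|a, b|` is a binder: inside the closure body,
/// `a` and `b` refer to the closure's own arguments, so they are bound.
fn bound_example() -> i32 {
    let add = |a: i32, b: i32| a + b;
    add(10, 20)
}

/// Within the closure body `a + b`, `a` is bound by the binder `|a|`,
/// while `b` appears free and refers to the local variable defined outside.
fn mixed_example() -> i32 {
    let b = 5;
    let add_b = |a: i32| a + b;
    add_b(1)
}
```

Note that "free" is always relative to some expression: `b` is free in the closure of `mixed_example`, but it is bound if you consider the whole function body, which declares it.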
|
||
So there you have it: a variable "appears free" in some | ||
expression/statement/whatever if it refers to something defined | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,19 +6,24 @@ The compiler uses a number of...idiosyncratic abbreviations and things. This glo | |
Term | Meaning | ||
------------------------|-------- | ||
AST | the abstract syntax tree produced by the syntax crate; reflects user syntax very closely. | ||
binder | a "binder" is a place where a variable or type is declared; for example, the `<T>` is a binder for the generic type parameter `T` in `fn foo<T>(..)`, and `|a| ...` is a binder for the parameter `a`. See [the background chapter for more](./background.html#free-vs-bound) | ||
bound variable | a "bound variable" is one that is declared within an expression/term. For example, the variable `a` is bound within the closure expression `|a| a * 2`. See [the background chapter for more](./background.html#free-vs-bound) | ||
codegen unit | when we produce LLVM IR, we group the Rust code into a number of codegen units. Each of these units is processed by LLVM independently from one another, enabling parallelism. They are also the unit of incremental re-use. | ||
completeness | completeness is a technical term in type theory. Completeness means that every type-safe program also type-checks. Having both soundness and completeness is very hard, and usually soundness is more important. (see "soundness"). | ||
control-flow graph | a representation of the control-flow of a program; see [the background chapter for more](./background.html#cfg) | ||
cx | we tend to use "cx" as an abbreviation for context. See also `tcx`, `infcx`, etc. | ||
DAG | a directed acyclic graph is used during compilation to keep track of dependencies between queries. ([see more](incremental-compilation.html)) | ||
data-flow analysis | a static analysis that figures out what properties are true at each point in the control-flow of a program; see [the background chapter for more](./background.html#dataflow) | ||
DefId | an index identifying a definition (see `librustc/hir/def_id.rs`). Uniquely identifies a `DefPath`. | ||
free variable | a "free variable" is one that is not bound within an expression or term; see [the background chapter for more](./background.html#free-vs-bound) | ||
Review comment: Sorry, could you actually move this after DefId? The glossary is a bit out of order at the moment. This is fixed in #56...
Reply: done |
||
'gcx | the lifetime of the global arena ([see more](ty.html)) | ||
generics | the set of generic type parameters defined on a type or item | ||
HIR | the High-level IR, created by lowering and desugaring the AST ([see more](hir.html)) | ||
HirId | identifies a particular node in the HIR by combining a def-id with an "intra-definition offset". | ||
HIR Map | The HIR map, accessible via tcx.hir, allows you to quickly navigate the HIR and convert between various forms of identifiers. | ||
ICE | internal compiler error. When the compiler crashes. | ||
ICH | incremental compilation hash. ICHs are used as fingerprints for things such as HIR and crate metadata, to check if changes have been made. This is useful in incremental compilation to see if part of a crate has changed and should be recompiled. | ||
inference variable | when doing type or region inference, an "inference variable" is a kind of special type/region that represents value you are trying to find. Think of `X` in algebra. | ||
inference variable | when doing type or region inference, an "inference variable" is a kind of special type/region that represents what you are trying to infer. Think of X in algebra. For example, if we are trying to infer the type of a variable in a program, we create an inference variable to represent that unknown type. | ||
infcx | the inference context (see `librustc/infer`) | ||
IR | Intermediate Representation. A general term in compilers. During compilation, the code is transformed from raw source (UTF-8 text) to various IRs. In Rust, these are primarily HIR, MIR, and LLVM IR. Each IR is well-suited for some set of computations. For example, MIR is well-suited for the borrow checker, and LLVM IR is well-suited for codegen because LLVM accepts it. | ||
local crate | the crate currently being compiled. | ||
|
@@ -27,14 +32,18 @@ LTO | Link-Time Optimizations. A set of optimizations offer | |
MIR | the Mid-level IR that is created after type-checking for use by borrowck and trans ([see more](./mir.html)) | ||
miri | an interpreter for MIR used for constant evaluation ([see more](./miri.html)) | ||
newtype | a "newtype" is a wrapper around some other type (e.g., `struct Foo(T)` is a "newtype" for `T`). This is commonly used in Rust to give a stronger type for indices. | ||
NLL | [non-lexical lifetimes](./mir-regionck.html), an extension to Rust's borrowing system that bases it on the control-flow graph. | ||
node-id or NodeId | an index identifying a particular node in the AST or HIR; gradually being phased out and replaced with `HirId`. | ||
obligation | something that must be proven by the trait system ([see more](trait-resolution.html)) | ||
promoted constants | constants extracted from a function and lifted to static scope; see [this section](./mir.html#promoted) for more details. | ||
provider | the function that executes a query ([see more](query.html)) | ||
quantified | in math or logic, existential and universal quantification are used to ask questions like "is there any type T for which this is true?" or "is this true for all types T?"; see [the background chapter for more](./background.html#quantified) | ||
query | perhaps some sub-computation during compilation ([see more](query.html)) | ||
region | another term for "lifetime" often used in the literature and in the borrow checker. | ||
sess | the compiler session, which stores global data used throughout compilation | ||
side tables | because the AST and HIR are immutable once created, we often carry extra information about them in the form of hashtables, indexed by the id of a particular node. | ||
sigil | like a keyword but composed entirely of non-alphanumeric tokens. For example, `&` is a sigil for references. | ||
skolemization | a way of handling subtyping around "for-all" types (e.g., `for<'a> fn(&'a u32)`) as well as solving higher-ranked trait bounds (e.g., `for<'a> T: Trait<'a>`). See [the chapter on skolemization and universes](./mir-regionck.html#skol) for more details. | ||
Review comment: I think you are missing a closing paren after the first example... |
||
soundness | soundness is a technical term in type theory. Roughly, if a type system is sound, then if a program type-checks, it is type-safe; i.e. I can never (in safe Rust) force a value into a variable of the wrong type. (see "completeness"). | ||
span | a location in the user's source code, used for error reporting primarily. These are like a file-name/line-number/column tuple on steroids: they carry a start/end point, and also track macro expansions and compiler desugaring. All while being packed into a few bytes (really, it's an index into a table). See the Span datatype for more. | ||
substs | the substitutions for a given generic type or item (e.g. the `i32`, `u32` in `HashMap<i32, u32>`) | ||
|
@@ -45,6 +54,7 @@ token | the smallest unit of parsing. Tokens are produced aft | |
trans | the code to translate MIR into LLVM IR. | ||
trait reference | a trait and values for its type parameters ([see more](ty.html)). | ||
ty | the internal representation of a type ([see more](ty.html)). | ||
variance | variance determines how changes to a generic type/lifetime parameter affect subtyping; for example, if `T` is a subtype of `U`, then `Vec<T>` is a subtype of `Vec<U>` because `Vec` is *covariant* in its generic parameter. See [the background chapter for more](./background.html#variance). | ||
|
||
[LLVM]: https://llvm.org/ | ||
[lto]: https://llvm.org/docs/LinkTimeOptimization.html | ||
|
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,9 +4,9 @@ If you would like to get the MIR for a function (or constant, etc), | |
you can use the `optimized_mir(def_id)` query. This will give you back | ||
the final, optimized MIR. For foreign def-ids, we simply read the MIR | ||
from the other crate's metadata. But for local def-ids, the query will | ||
construct the MIR and then iteratively optimize it by putting it | ||
through various pipeline stages. This section describes those pipeline | ||
stages and how you can extend them. | ||
construct the MIR and then iteratively optimize it by applying a | ||
series of passes. This section describes how those passes work and how | ||
you can extend them. | ||
|
||
To produce the `optimized_mir(D)` for a given def-id `D`, the MIR | ||
passes through several suites of optimizations, each represented by a | ||
|
@@ -97,18 +97,19 @@ that appeared within the `main` function.) | |
### Implementing and registering a pass | ||
|
||
A `MirPass` is some bit of code that processes the MIR, typically -- | ||
but not always -- transforming it along the way in some way. For | ||
example, it might perform an optimization. The `MirPass` trait itself | ||
is found in in [the `rustc_mir::transform` module][mirtransform], and | ||
it basically consists of one method, `run_pass`, that simply gets an | ||
but not always -- transforming it along the way somehow. For example, | ||
it might perform an optimization. The `MirPass` trait itself is found | ||
in [the `rustc_mir::transform` module][mirtransform], and it | ||
basically consists of one method, `run_pass`, that simply gets an | ||
`&mut Mir` (along with the tcx and some information about where it | ||
Review comment: I'm guessing that you just modify the Mir in-place?
Reply: clarified |
||
came from). | ||
came from). The MIR is therefore modified in place (which helps to | ||
keep things efficient). | ||
|
||
A good example of a basic MIR pass is [`NoLandingPads`], which walks the | ||
MIR and removes all edges that are due to unwinding -- this is used | ||
with when configured with `panic=abort`, which never unwinds. As you can see | ||
from its source, a MIR pass is defined by first defining a dummy type, a struct | ||
with no fields, something like: | ||
A good example of a basic MIR pass is [`NoLandingPads`], which walks | ||
the MIR and removes all edges that are due to unwinding -- this is | ||
used when compiling with `panic=abort`, which never unwinds. As you | ||
can see from its source, a MIR pass is defined by first defining a | ||
dummy type, a struct with no fields, something like: | ||
|
||
```rust | ||
struct MyPass; | ||
|
@@ -120,8 +121,9 @@ this pass into the appropriate list of passes found in a query like | |
should go into the `optimized_mir` list.) | ||
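To make the shape of a pass more concrete, here is a toy model. Everything in it (`ToyMir`, `ToyBlock`, `ToyTerminator`, `ToyPass`, `RemoveUnwindEdges`, `run_passes`, `sample_mir`) is hypothetical and far simpler than the real types in `rustc_mir::transform`. It only mirrors the ideas described here: a struct with no fields implements a trait with a single `run_pass` method, that method receives `&mut` access and edits the MIR in place, and the pass is then added to a list of passes that run in order.

```rust
/// Toy stand-in for a block terminator; NOT rustc's real type.
#[derive(Debug, Clone, PartialEq)]
enum ToyTerminator {
    Goto { target: usize },
    Call { target: usize, unwind: Option<usize> },
    Return,
}

/// Toy basic block: some statements followed by exactly one terminator.
#[derive(Debug, Clone, PartialEq)]
struct ToyBlock {
    statements: Vec<String>,
    terminator: ToyTerminator,
}

/// Toy MIR body: just a list of basic blocks indexed by position.
struct ToyMir {
    blocks: Vec<ToyBlock>,
}

/// Hypothetical pass trait with a single method, modeled on the description
/// of `MirPass` above (the real trait also receives the tcx and a source).
trait ToyPass {
    fn run_pass(&self, mir: &mut ToyMir);
}

/// The "dummy type": a struct with no fields.
struct RemoveUnwindEdges;

impl ToyPass for RemoveUnwindEdges {
    /// Drops every unwind edge, editing the MIR in place.
    fn run_pass(&self, mir: &mut ToyMir) {
        for block in &mut mir.blocks {
            if let ToyTerminator::Call { unwind, .. } = &mut block.terminator {
                *unwind = None;
            }
        }
    }
}

/// "Registering" a pass here just means adding it to the list that is run.
fn run_passes(mir: &mut ToyMir, passes: &[&dyn ToyPass]) {
    for pass in passes {
        pass.run_pass(mir);
    }
}

/// Builds a three-block body: a call with an unwind edge, a return,
/// and a cleanup block that jumps back to the return.
fn sample_mir() -> ToyMir {
    ToyMir {
        blocks: vec![
            ToyBlock {
                statements: vec!["_1 = const 1".to_string()],
                terminator: ToyTerminator::Call { target: 1, unwind: Some(2) },
            },
            ToyBlock { statements: vec![], terminator: ToyTerminator::Return },
            ToyBlock { statements: vec![], terminator: ToyTerminator::Goto { target: 1 } },
        ],
    }
}
```

Calling `run_passes(&mut mir, &[&RemoveUnwindEdges])` on the sample body leaves the call in block 0 without an unwind target. The cleanup block is still present but no longer reachable; removing it would be the job of a separate pass.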
|
||
If you are writing a pass, there's a good chance that you are going to | ||
want to use a [MIR visitor] too -- those are a handy visitor that | ||
walks the MIR for you and lets you make small edits here and there. | ||
want to use a [MIR visitor]. MIR visitors are a handy way to walk all | ||
the parts of the MIR, either to search for something or to make small | ||
edits. | ||
|
||
### Stealing | ||
|
||
|
@@ -149,7 +151,9 @@ be **stolen** by the `mir_validated()` suite. If nothing was done, | |
then `mir_const_qualif(D)` would succeed if it came before | ||
`mir_validated(D)`, but fail otherwise. Therefore, `mir_validated(D)` | ||
will **force** `mir_const_qualif` before it actually steals, thus | ||
ensuring that the reads have already happened: | ||
ensuring that the reads have already happened (remember that | ||
[queries are memoized](./query.html), so executing a query twice | ||
simply loads from a cache the second time): | ||
|
||
``` | ||
mir_const(D) --read-by--> mir_const_qualif(D) | ||
|
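The interaction between forcing, memoization, and stealing can be modeled in a few lines. This is a toy sketch under stated assumptions: `ToySteal` and `ToyQueries` are hypothetical names, the "MIR" is just a list of strings, and the real query system is considerably more involved. The point is only to show why forcing the reader query before stealing makes the order of calls irrelevant.

```rust
use std::cell::{Cell, RefCell};

/// Toy stand-in for a value that can be read until it is stolen.
/// NOT rustc's real implementation.
struct ToySteal<T> {
    value: RefCell<Option<T>>,
}

impl<T> ToySteal<T> {
    fn new(value: T) -> Self {
        ToySteal { value: RefCell::new(Some(value)) }
    }

    /// Reads the value; panics if it has already been stolen.
    fn with_borrowed<R>(&self, f: impl FnOnce(&T) -> R) -> R {
        let guard = self.value.borrow();
        f(guard.as_ref().expect("attempted to read a stolen value"))
    }

    /// Takes ownership of the value; later reads will panic.
    fn steal(&self) -> T {
        self.value.borrow_mut().take().expect("value was already stolen")
    }
}

/// Toy query context for a single def-id.
struct ToyQueries {
    mir_const: ToySteal<Vec<String>>,
    qualif_cache: Cell<Option<usize>>,
    qualif_runs: Cell<u32>,
}

impl ToyQueries {
    fn new(mir: Vec<String>) -> Self {
        ToyQueries {
            mir_const: ToySteal::new(mir),
            qualif_cache: Cell::new(None),
            qualif_runs: Cell::new(0),
        }
    }

    /// Memoized query that only *reads* the result of `mir_const`.
    fn mir_const_qualif(&self) -> usize {
        if let Some(cached) = self.qualif_cache.get() {
            return cached; // second and later calls load from the cache
        }
        self.qualif_runs.set(self.qualif_runs.get() + 1);
        let result = self.mir_const.with_borrowed(|mir| mir.len());
        self.qualif_cache.set(Some(result));
        result
    }

    /// Forces the reader query first, then steals the MIR.
    fn mir_validated(&self) -> Vec<String> {
        let _ = self.mir_const_qualif(); // force: the read happens now
        self.mir_const.steal()
    }
}
```

With this ordering, calling `mir_const_qualif` after `mir_validated` is safe: the answer comes from the cache, so the stolen value is never touched again.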
Review comment: I think there is content from the nomicon that could be borrowed here...