Skip to content

Commit

Permalink
remove typestate from code, tests, and docs
Browse files Browse the repository at this point in the history
  • Loading branch information
nikomatsakis committed Jul 15, 2012
1 parent 1fbb9d0 commit 41a21f0
Show file tree
Hide file tree
Showing 103 changed files with 144 additions and 2,248 deletions.
331 changes: 1 addition & 330 deletions doc/rust.md
Original file line number Diff line number Diff line change
Expand Up @@ -211,7 +211,7 @@ The keywords in [source files](#source-files) are the following strings:
~~~~~~~~ {.keyword}
alt again assert
break
check claim class const copy
check class const copy
drop
else enum export extern
fail false fn for
Expand Down Expand Up @@ -1926,10 +1926,6 @@ an optional reference slot to serve as the function's output, bound to the
`lval` on the right hand side of the call. If the function eventually returns,
then the expression completes.

A call expression statically requires that the precondition declared in the
callee's signature is satisfied by the expression prestate. In this way,
typestates propagate through function boundaries.

An example of a call expression:

~~~~
Expand Down Expand Up @@ -2354,117 +2350,6 @@ library. It is best to use the macro forms of logging (*#error*,
when it is changed.


### Check expressions

~~~~~~~~{.ebnf .gram}
check_expr : "check" call_expr ;
~~~~~~~~

A `check` expression connects dynamic assertions made at run-time to the
static [typestate system](#typestate-system). A `check` expression takes a
constraint to check at run-time. If the constraint holds at run-time, control
passes through the `check` and on to the next expression in the enclosing
block. If the condition fails to hold at run-time, the `check` expression
behaves as a `fail` expression.

The typestate algorithm is built around `check` expressions, and in particular
the fact that control *will not pass* a check expression with a condition that
fails to hold. The typestate algorithm can therefore assume that the (static)
postcondition of a `check` expression includes the checked constraint
itself. From there, the typestate algorithm can perform dataflow calculations
on subsequent expressions, propagating [conditions](#conditions) forward and
statically comparing implied states and their specifications.

~~~~~~~~
# fn print(i: int) { }
pure fn even(x: int) -> bool {
ret x & 1 == 0;
}
fn print_even(x: int) : even(x) {
print(x);
}
fn test() {
let y: int = 8;
// Cannot call print_even(y) here.
check even(y);
// Can call print_even(y) here, since even(y) now holds.
print_even(y);
}
~~~~~~~~

### Prove expressions

**Note: Prove expressions are not yet supported by the compiler.**

~~~~~~~~{.ebnf .gram}
prove_expr : "prove" call_expr ;
~~~~~~~~

A `prove` expression has no run-time effect. Its purpose is to statically
check (and document) that its argument constraint holds at its expression
entry point. If its argument typestate does not hold, under the typestate
algorithm, the program containing it will fail to compile.

### Claim expressions

~~~~~~~~{.ebnf .gram}
claim_expr : "claim" call_expr ;
~~~~~~~~

A `claim` expression is an unsafe variant on a `check` expression that is not
actually checked at runtime. Thus, using a `claim` implies a proof obligation
to ensure---without compiler assistance---that an assertion always holds.

Setting a runtime flag can turn all `claim` expressions into `check`
expressions in a compiled Rust program, but the default is to not check the
assertion contained in a `claim`. The idea behind `claim` is that performance
profiling might identify a few bottlenecks in the code where actually checking
a given callee's predicate is too expensive; `claim` allows the code to
typecheck without removing the predicate check at every other call site.



### If-Check expressions

An `if check` expression combines a `if` expression and a `check`
expression in an indivisible unit that can be used to build more complex
conditional control-flow than the `check` expression affords.

In fact, `if check` is a "more primitive" expression than `check`;
instances of the latter can be rewritten as instances of the former. The
following two examples are equivalent:

Example using `check`:

~~~~
# pure fn even(x: int) -> bool { true }
# fn print_even(x: int) { }
# let x = 0;
check even(x);
print_even(x);
~~~~

Equivalent example using `if check`:

~~~~
# pure fn even(x: int) -> bool { true }
# fn print_even(x: int) { }
# let x = 0;
if check even(x) {
print_even(x);
} else {
fail;
}
~~~~

### Assert expressions

~~~~~~~~{.ebnf .gram}
Expand Down Expand Up @@ -2813,220 +2698,6 @@ Sending operations are not part of the Rust language, but are
implemented in the library. Generic functions that send values bound
the kind of these values to sendable.



## Typestate system


Rust programs have a static semantics that determine the types of values
produced by each expression, as well as the *predicates* that hold over
slots in the environment at each point in time during execution.

The latter semantics -- the dataflow analysis of predicates holding over slots
-- is called the *typestate* system.

### Points

Control flows from statement to statement in a block, and through the
evaluation of each expression, from one sub-expression to another. This
sequential control flow is specified as a set of _points_, each of which
has a set of points before and after it in the implied control flow.

For example, this code:

~~~~~~~~
# let mut s;
s = ~"hello, world";
io::println(s);
~~~~~~~~

Consists of 2 statements, 3 expressions and 12 points:


* the point before the first statement
* the point before evaluating the static initializer `"hello, world"`
* the point after evaluating the static initializer `"hello, world"`
* the point after the first statement
* the point before the second statement
* the point before evaluating the function value `println`
* the point after evaluating the function value `println`
* the point before evaluating the arguments to `println`
* the point before evaluating the symbol `s`
* the point after evaluating the symbol `s`
* the point after evaluating the arguments to `println`
* the point after the second statement


Whereas this code:


~~~~~~~~
# fn x() -> ~str { ~"" }
# fn y() -> ~str { ~"" }
io::println(x() + y());
~~~~~~~~

Consists of 1 statement, 7 expressions and 14 points:


* the point before the statement
* the point before evaluating the function value `println`
* the point after evaluating the function value `println`
* the point before evaluating the arguments to `println`
* the point before evaluating the arguments to `+`
* the point before evaluating the function value `x`
* the point after evaluating the function value `x`
* the point before evaluating the arguments to `x`
* the point after evaluating the arguments to `x`
* the point before evaluating the function value `y`
* the point after evaluating the function value `y`
* the point before evaluating the arguments to `y`
* the point after evaluating the arguments to `y`
* the point after evaluating the arguments to `+`
* the point after evaluating the arguments to `println`


The typestate system reasons over points, rather than statements or
expressions. This may seem counter-intuitive, but points are the more
primitive concept. Another way of thinking about a point is as a set of
*instants in time* at which the state of a task is fixed. By contrast, a
statement or expression represents a *duration in time*, during which the
state of the task changes. The typestate system is concerned with constraining
the possible states of a task's memory at *instants*; it is meaningless to
speak of the state of a task's memory "at" a statement or expression, as each
statement or expression is likely to change the contents of memory.


### Control flow graph

Each *point* can be considered a vertex in a directed *graph*. Each
kind of expression or statement implies a number of points *and edges* in
this graph. The edges connect the points within each statement or expression,
as well as between those points and those of nearby statements and expressions
in the program. The edges between points represent *possible* indivisible
control transfers that might occur during execution.

This implicit graph is called the _control-flow graph_, or _CFG_.


### Constraints

A [_predicate_](#predicate-functions) is a pure boolean function declared with
the keywords `pure fn`.

A _constraint_ is a predicate applied to specific slots.

For example, consider the following code:

~~~~~~~~
pure fn is_less_than(a: int, b: int) -> bool {
ret a < b;
}
fn test() {
let x: int = 10;
let y: int = 20;
check is_less_than(x,y);
}
~~~~~~~~

This example defines the predicate `is_less_than`, and applies it to the slots
`x` and `y`. The constraint being checked on the third line of the function is
`is_less_than(x,y)`.

Predicates can only apply to slots holding immutable values. The slots a
predicate applies to can themselves be mutable, but the types of values held
in those slots must be immutable.

### Conditions

A _condition_ is a set of zero or more constraints.

Each *point* has an associated *condition*:

* The _precondition_ of a statement or expression is the condition required at
in the point before it.
* The _postcondition_ of a statement or expression is the condition enforced
in the point after it.

Any constraint present in the precondition and *absent* in the postcondition
is considered to be *dropped* by the statement or expression.


### Calculated typestates

The typestate checking system *calculates* an additional condition for each
point called its _typestate_. For a given statement or expression, we call the
two typestates associated with its two points the prestate and a poststate.

* The _prestate_ of a statement or expression is the typestate of the
point before it.
* The _poststate_ of a statement or expression is the typestate of the
point after it.

A _typestate_ is a condition that has _been determined by the typestate
algorithm_ to hold at a point. This is a subtle but important point to
understand: preconditions and postconditions are *inputs* to the typestate
algorithm; prestates and poststates are *outputs* from the typestate
algorithm.

The typestate algorithm analyses the preconditions and postconditions of every
statement and expression in a block, and computes a condition for each
typestate. Specifically:


* Initially, every typestate is empty.
* Each statement or expression's poststate is given the union of the its
prestate, precondition, and postcondition.
* Each statement or expression's poststate has the difference between its
precondition and postcondition removed.
* Each statement or expression's prestate is given the intersection of the
poststates of every predecessor point in the CFG.
* The previous three steps are repeated until no typestates in the
block change.

The typestate algorithm is a very conventional dataflow calculation, and can
be performed using bit-set operations, with one bit per predicate and one
bit-set per condition.

After the typestates of a block are computed, the typestate algorithm checks
that every constraint in the precondition of a statement is satisfied by its
prestate. If any preconditions are not satisfied, the mismatch is considered a
static (compile-time) error.


### Typestate checks

The key mechanism that connects run-time semantics and compile-time analysis
of typestates is the use of [`check` expressions](#check-expressions). A
`check` expression guarantees that *if* control were to proceed past it, the
predicate associated with the `check` would have succeeded, so the constraint
being checked *statically* holds in subsequent points.^[A `check` expression
is similar to an `assert` call in a C program, with the significant difference
that the Rust compiler *tracks* the constraint that each `check` expression
enforces. Naturally, `check` expressions cannot be omitted from a "production
build" of a Rust program the same way `asserts` are frequently disabled in
deployed C programs.}

It is important to understand that the typestate system has *no insight* into
the meaning of a particular predicate. Predicates and constraints are not
evaluated in any way at compile time. Predicates are treated as specific (but
unknown) functions applied to specific (also unknown) slots. All the typestate
system does is track which of those predicates -- whatever they calculate --
*must have been checked already* in order for program control to reach a
particular point in the CFG. The fundamental building block, therefore, is the
`check` statement, which tells the typestate system "if control passes this
point, the checked predicate holds".

From this building block, constraints can be propagated to function signatures
and constrained types, and the responsibility to `check` a constraint
pushed further and further away from the site at which the program requires it
to hold in order to execute properly.



# Memory and concurrency models

Rust has a memory model centered around concurrently-executing _tasks_. Thus
Expand Down
1 change: 0 additions & 1 deletion src/cargo/cargo.rs
Original file line number Diff line number Diff line change
Expand Up @@ -763,7 +763,6 @@ fn rustc_sysroot() -> ~str {
alt os::self_exe_path() {
some(path) {
let path = ~[path, ~"..", ~"bin", ~"rustc"];
check vec::is_not_empty(path);
let rustc = path::normalize(path::connect_many(path));
#debug(" rustc: %s", rustc);
rustc
Expand Down
2 changes: 1 addition & 1 deletion src/compiletest/compiletest.rs
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ fn parse_config(args: ~[~str]) -> config {
getopts::optflag(~"verbose"),
getopts::optopt(~"logfile")];

check (vec::is_not_empty(args));
assert (vec::is_not_empty(args));
let args_ = vec::tail(args);
let match =
alt getopts::getopts(args_, opts) {
Expand Down
1 change: 0 additions & 1 deletion src/compiletest/runtest.rs
Original file line number Diff line number Diff line change
Expand Up @@ -121,7 +121,6 @@ fn run_pretty_test(config: config, props: test_props, testfile: ~str) {
if option::is_some(props.pp_exact) {
// Now we have to care about line endings
let cr = ~"\r";
check (str::is_not_empty(cr));
actual = str::replace(actual, cr, ~"");
expected = str::replace(expected, cr, ~"");
}
Expand Down
Loading

0 comments on commit 41a21f0

Please sign in to comment.