New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MIR] A pass to type-check MIR #31474

Merged
merged 13 commits into from Feb 20, 2016

Conversation

Projects
None yet
8 participants
@arielb1
Contributor

arielb1 commented Feb 7, 2016

This should stop broken MIR from annoying us when we try to implement things

@rust-highfive

This comment has been minimized.

Collaborator

rust-highfive commented Feb 7, 2016

r? @jroesch

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive assigned arielb1 and unassigned jroesch Feb 7, 2016

@@ -974,6 +978,7 @@ pub fn build_session_options(matches: &getopts::Matches) -> Options {
let parse_only = debugging_opts.parse_only;
let no_trans = debugging_opts.no_trans;
let treat_err_as_bug = debugging_opts.treat_err_as_bug;
let mir_opt_level = debugging_opts.mir_opt_level.unwrap_or(1);

This comment has been minimized.

@nagisa

nagisa Feb 7, 2016

Contributor

I was considering a default of 1 + -C opt-level. This way the level of MIR “optimisations” would automatically get attached to the -O and -C opt-level flags as well.

This comment has been minimized.

@arielb1

arielb1 Feb 7, 2016

Contributor

First make opt-levels greater than 1 do something, then we can talk.

@bors

This comment has been minimized.

Contributor

bors commented Feb 9, 2016

☔️ The latest upstream changes (presumably #31425) made this pull request unmergeable. Please resolve the merge conflicts.

@Aatch

This comment has been minimized.

Contributor

Aatch commented Feb 10, 2016

How fast do you expect the check to be? It would be nice if we could run the check between each pass, probably based on a -Z flag.

@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Feb 10, 2016

cc me

@arielb1

This comment has been minimized.

Contributor

arielb1 commented Feb 10, 2016

This pass should not take too much time.

@jroesch

This comment has been minimized.

Member

jroesch commented Feb 10, 2016

Let me know when you think this is ready for review, reading over it right now.

@arielb1 arielb1 changed the title from [WIP] [MIR] A pass to type-check MIR to [MIR] A pass to type-check MIR Feb 11, 2016

@arielb1

This comment has been minimized.

Contributor

arielb1 commented Feb 11, 2016

the rvalue-type-checking code can be improved, but I will leave that to a separate PR - this one is growing too big.

r? @nikomatsakis

span_mirbug_and_err!(self, parent, "bad type {:?}", ty)
}
fn sanitize_lvalue(&mut self, lvalue: &Lvalue<'tcx>) -> LvalueTy<'tcx> {

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

This prefix sanitize strikes me as odd. To me, it suggests something which will mutate the lvalue in place, removing artifacts or normalizing it in some way. But in fact sanitize_lvalue seems to compute the type of an lvalue.

There is also an unfortunate amount of overlap with the lvalue_ty routine -- these basically do the same thing, but just handle the error cases differently. Perhaps we could make lvalue_ty return a Result instead to eliminate the overlap?

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

OK, reading a bit further I see that sanitize_projection actually does more consistency checking, so perhaps it is good to keep these two routines as separate. I think I would change sanitize to typeck_ or type_ or just check_.

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

Perhaps verify_ would be the best name, given that this is part of the type verifier.

This comment has been minimized.

@arielb1

arielb1 Feb 12, 2016

Contributor

it returns the sanitized lvalue ty, which will be TyErr if there are are late-bound regions etc. in the type.

}
_ => LvalueTy::Ty {
ty: span_mirbug_and_err!(
self, lvalue, "can't downcast {:?}", base_ty)

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

Nit: if I actually were getting this error, I think I'd prefer to see can't downcast {:?} to {:?} :)

This comment has been minimized.

@arielb1

arielb1 Feb 12, 2016

Contributor

The lvalue is already printed - it is "broken MIR (.. as Adt): can't downcast {:?}", but fine.

ty,
obligations);
let mut fulfill_cx = self.infcx.fulfillment_cx.borrow_mut();

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

It's probably not correct to drain the fulfillment context here -- except for the fact that it should be empty coming in, so it seems harmless. Regardless, @soltanmm's recent PR #31588 ought to clean this up.

}
fn sanitize_type(&mut self, parent: &fmt::Debug, ty: Ty<'tcx>) -> Ty<'tcx> {
if !(ty.needs_infer() || ty.has_escaping_regions() ||

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

Total nit: Maybe it's just me, but I find !x && !y && !z much easier to read than !(x || y || z)

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

But even better would be:

if ty.needs_infer() || has_escaping_regions() || references_error() {
    span_mir_bug_and_err()
}

ty

This comment has been minimized.

@arielb1

arielb1 Feb 12, 2016

Contributor

that would need an else (otherwise it wouldn't be much of a sanitize), but fine.

}
}
impl<'a, 'tcx> MirPass for TypeckMir<'a, 'tcx> {

This comment has been minimized.

@nagisa

nagisa Feb 12, 2016

Contributor

I think it would be for the best if you moved the lifetime back to the MirPass trait. However, it is fine if you’re not interested doing it in this PR, I’ll gladly move the lifetime in one of my PRs myself.

This comment has been minimized.

@arielb1

arielb1 Feb 12, 2016

Contributor

I just wanted to do a minimal-change rebase. Will fix.

Terminator::Goto { .. } |
Terminator::Resume |
Terminator::Return |
Terminator::Drop { .. } => {}

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

Nit: please place an empty newline inside the {} so that these arms are visually distinct from Terminator::If. Or even better something like

Drop { .. } => {
    // nothing to check here
}
}
Terminator::SwitchInt { ref discr, switch_ty, .. } => {
let discr_ty = mir.lvalue_ty(tcx, discr).to_ty(tcx);
if let Err(terr) = infer::can_mk_subty(self.infcx, discr_ty, switch_ty) {

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

Should we check that switch_ty is an integral/char type? (Do we support switching on non-integral/non-char types? I forget, but I think not.)

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

Also, the right way to use inference is not to call can_mk_subty but just mk_subty, since this is something that must be true (or else error). I don't think there are any inference variables floating around, so the difference is not large, but if you use can_mk_subty, then it will just check whether you could require that discr_ty <: switch_ty, versus adding the constraint that indeed discr_ty <: swich_ty ought to hold.

This comment has been minimized.

@arielb1

arielb1 Feb 12, 2016

Contributor

I don't want to pollute the global infcx. I guess that I should just take a param-env and create an infcx myself.

}
(&Some((ref dest, _)), ty::FnConverging(ty)) => {
let dest_ty = mir.lvalue_ty(tcx, dest).to_ty(tcx);
if let Err(terr) = infer::can_mk_subty(self.infcx, ty, dest_ty) {

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

see comment below: use infer::mk_subty if this is something that has to hold or else an error occurs; can_mk_subty is for deciding which constraints to use

let fty = self.sanitize_type(lvalue, fty);
match self.field_ty(lvalue, base, field) {
Ok(ty) => {
if let Err(terr) = infer::can_mk_subty(self.infcx, ty, fty) {

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

another incorrect use of can_mk_subty -- I won't tag them all, there are a bunch

let tcx = self.tcx();
match (destination, sig.output) {
(&Some(..), ty::FnDiverging) => {
span_mirbug!(self, term, "call to diverging function {:?} with dest", sig);

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

I forget -- what did we do to translate things like

let x = foo()

where foo() diverges? We do want to consider x "initialized" I think, but probably not by setting it as the "destination" for the call.

This comment has been minimized.

@arielb1

arielb1 Feb 12, 2016

Contributor

Nah we are smart enough not to generate a dest.

Plus x is not actually initialized, and in fact the assignment is removed by clear_dead_blocks.

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 16, 2016

Contributor

@arielb1

Plus x is not actually initialized, and in fact the assignment is removed by clear_dead_blocks.

Yeah, so I was thinking about this. I am not sure yet it stripping out unreachable code like this is a good strategy long-term -- though I think it MAY be :).

Another alternative would be to have some sort of special rvalue like unreachable: T that we can use as the "initializer" for these sorts of cases. (Ideally, though, we'd insert assignments from this initializer in some clever way; the code typeck uses terrifies me.) The reason is that I think we want to type-check that unreachable code nonetheless.

Right now, of course, we're a bit inconsistent, as you pointed out. On the one hand, I think it would be a big breaking change to get strict about type-checking in unreachable scenarios like this:

let v = if ... { panic!(); } else { 3 }; // note that the "then" branch technically would have type `()`

(Though I know that I myself have been confused about this numerous types, since I expect the ; to cause a type-check error, but it often does not -- though I think I have seen it be required sometimes.)

Moreover, we apparently accept accesses to uninitialized variables in unreachable code, at least sometimes, since this type-checks:

fn main() {
    let x: ();
    return;
    use_it(x);
}

fn use_it(x: ()) { }

However, this program DOES yield an error:

fn main() {
    let x: String = format!("foo");
    if false {
        return;
        use_it(x);
    }
    use_it(x);
}

fn use_it(x: String) { }

Moreover, as part of the (eventual) move to non-lexical lifetimes, I expect we will be deferring all region checking to be done on the MIR, and hence stripping out unreachable code would definitely affect the set of constraints that get generated. OTOH, it would basically just mean that more things type-check, and in particular it may wind up matching programmer intuition better, since of course the programmer (probably) knows that the code in question is unreachable.

This comment has been minimized.

@arielb1

arielb1 Feb 16, 2016

Contributor

I am not sure. I would rather use a special rvalue (BlockEnd) for statement terminators and ignore them while type-checking MIR (and leave that for the new liveness pass).

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 16, 2016

Contributor

After some more thought, I think I am feeling good about stripping out unreachable code -- but we should feed those results into the "unreachable code" lint, so that we still see warnings. But basically it feels a bit silly to go way out of our way to act like we don't understand control flow, particularly since we're moving more and more towards making things flow-sensitive and intelligent. And clearly it'd be annoying if if cond { return; } else { 22 } and if cond { return } else { 22 } were distinct -- I imagine there will be analogous examples.

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 16, 2016

Contributor

(To be clear, I mean we should eventually feed those results into the lints -- presumably the lints today would warn anyhow, it just seems better for them to work on MIR eventually.)

This comment has been minimized.

@nagisa

nagisa Feb 16, 2016

Contributor

After some more thought, I think I am feeling good about stripping out unreachable code -- but we should feed those results into the "unreachable code" lint

Note, that in MIR build we already produce a bunch of dead blocks, regardless of reachability in the original code. Moreover, I’d expect almost every transformation pass result in a considerable amount of dead blocks, which should not be considered for lint purposes, but would get eliminated after the pass to make subsequent analysis faster. The point I’m trying to make here is that MIR is not really suited for reachability analysis of the original code.

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 16, 2016

Contributor

@nagisa

The point I’m trying to make here is that MIR is not really suited for reachability analysis of the original code.

Do you have specific examples in mind? I don't think that optimization or transformation passes matter at all -- we will be type-checking the exact output of build, or at least we will be very careful about what transformations we do before type checking.

In any case, clearly, not every unreachable block should trigger a warning, but it seems like we could design some heuristics to tell the difference. But this may be harder than I am realize -- notably around matches, where I think I've become persuaded by @arielb1's POV that exhaustiveness (and presumably then dead-code) might be better handled by a distinct algorithm from the one we use to lower to MIR.

Still, I think that if we're going to rely on dead-code stripping as part of the typeck/borrowck rules, it's pretty important that it align closely with what the lints will report as well as what typeck can compute, and the easiest and most reliable way to achieve that is to drive it from the MIR.

{
debug!("check_call_inputs({:?}, {:?})", sig, args);
if sig.inputs.len() > args.len() ||
(sig.inputs.len() < args.len() && !sig.variadic) {

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 12, 2016

Contributor

Am I... being silly? This seems wrong. I would think that supplying too few arguments is always a problem, but if the fn is variadic, you can supply extra arguments?

This comment has been minimized.

@jonas-schievink

jonas-schievink Feb 12, 2016

Contributor

The signature is on the left side of the comparison, the actual arguments on the right

@arielb1

This comment has been minimized.

Contributor

arielb1 commented Feb 14, 2016

fixed fulldeps test failure

@arielb1

This comment has been minimized.

Contributor

arielb1 commented Feb 14, 2016

or not. damn DepGraph. how do you make that thing work?

@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Feb 16, 2016

OK, I feel good about this now, seems like a great starting point! Nice work @arielb1.

r=me -- shall we merge it? see comments below.

@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Feb 17, 2016

Had two thoughts:

  1. I forgot to mention that we are not running region inference, which means we are not getting a complete type check here.
  2. Probably makes sense to do a crater run before landing this -- it seems to have high regression potential.
@arielb1

This comment has been minimized.

Contributor

arielb1 commented Feb 17, 2016

Right, we are not running region inference, and we are a bit loose with checking rvalues. I will do that in a future time.

I am not sure we need a crater run - this should only generate warnings, not ICEs, unless something is terribly broken. If we do a crater run, I will prefer to set ICE mode on (replace the span_warns in span_mirbug with span_bug).

@bors

This comment has been minimized.

Contributor

bors commented Feb 18, 2016

💔 Test failed - auto-win-gnu-64-nopt-t

@arielb1

This comment has been minimized.

Contributor

arielb1 commented Feb 18, 2016

what?

0xc0000374 is heap corruption. This does not seem to happen on Linux - I will try to investigate.

@bors

This comment has been minimized.

Contributor

bors commented Feb 18, 2016

☔️ The latest upstream changes (presumably #31600) made this pull request unmergeable. Please resolve the merge conflicts.

arielb1 added some commits Feb 7, 2016

add -Z mir-opt-level to disable MIR optimizations
setting -Z mir-opt-level=0 will disable all MIR optimizations
for easier debugging
introduce an early pass to clear dead blocks
this makes the the MIR assignment pass complete successfully
be more type-safe in panic/panic_bounds_check
TODO: find a correct borrow region

Fixes #31482
tuple arguments to overloaded calls
also fix translation of "rust-call" functions, although that could use
more optimizations
make *mut T -> *const T a coercion
rather than being implicit quasi-subtyping. Nothing good can come out
of quasi-subtyping.
@arielb1

This comment has been minimized.

Contributor

arielb1 commented Feb 20, 2016

@bors r=nikomatsakis

@bors

This comment has been minimized.

Contributor

bors commented Feb 20, 2016

📌 Commit 626dc3b has been approved by nikomatsakis

@bors

This comment has been minimized.

Contributor

bors commented Feb 20, 2016

⌛️ Testing commit 626dc3b with merge e768394...

bors added a commit that referenced this pull request Feb 20, 2016

Auto merge of #31474 - arielb1:mir-typeck, r=nikomatsakis
This should stop broken MIR from annoying us when we try to implement things
@bors

This comment has been minimized.

Contributor

bors commented Feb 20, 2016

💔 Test failed - auto-mac-64-opt

@arielb1

This comment has been minimized.

Contributor

arielb1 commented Feb 20, 2016

@bors r=nikomatsakis

@bors

This comment has been minimized.

Contributor

bors commented Feb 20, 2016

📌 Commit d84658e has been approved by nikomatsakis

@bors

This comment has been minimized.

Contributor

bors commented Feb 20, 2016

⌛️ Testing commit d84658e with merge 6c751e0...

bors added a commit that referenced this pull request Feb 20, 2016

Auto merge of #31474 - arielb1:mir-typeck, r=nikomatsakis
This should stop broken MIR from annoying us when we try to implement things

@bors bors merged commit d84658e into rust-lang:master Feb 20, 2016

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details

@bors bors referenced this pull request Feb 20, 2016

Closed

[WIP] Sanitizers support. #31605

@llogiq llogiq referenced this pull request Feb 22, 2016

Closed

TWIR 119's CotW + PRs #172

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment