New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MIR] Dataflow framework, constant propagation and dead code elimination #35608

Closed
wants to merge 24 commits into
base: master
from

Conversation

Projects
None yet
10 participants
@nagisa
Copy link
Contributor

nagisa commented Aug 11, 2016

This is the 3rd take of the original dataflow framework PR (previous 2 being #34164 and #33628).

I lost a great amount of my motivation to work on this thing when I saw the opts we (me and @gereeter) already had semi-implemented worked on by the interns, so I decided to cut out most of the stuff from the framework not necessary for the other two optimisations I had in mind, so I could get this PR in landable state before 2020 or so.

No notable performance improvements to be seen here (LLVM is more than capable of doing constant propagation quickly and by itself), but peak resident memory rustc uses (~20MB reduction from 1.15GB) and, likely, metadata size receive a small reductions. All these passes are cumulative ~5 seconds for a bootstrap of whole stage (rustc + libstd). Thus, the primary benefit of this PR is a fix for #35316.

@gereeter sorry, I ended up not preserving the commits of your work, as the effort rebasing this work while also preserving commits ended up being greater than I had mood to deal with.

@eddyb ran a crater which found one regression which has been since fixed.

cc @eddyb @nikomatsakis @scottcarr

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented Aug 11, 2016

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

pub fn new(num_bits: usize) -> BitVector {
let num_words = u64s(num_bits);
BitVector { data: vec![0; num_words] }
}

pub fn contains(&self, bit: usize) -> bool {
let (word, mask) = word_mask(bit);
(self.data[word] & mask) != 0
(self.data.get(word).cloned().unwrap_or(0) & mask) != 0

This comment has been minimized.

@scottcarr

scottcarr Aug 11, 2016

Contributor

It seems to me that asking for an index that doesn't exist should panic and that this might mask bugs in other code. Do you use this behavior in your analysis?

This comment has been minimized.

@nagisa

nagisa Aug 11, 2016

Author Contributor

It seems to me that asking for an index that doesn't exist should panic and that this might mask bugs in other code.

It had such a behaviour for sizes which are not 0 mod 64 up to next 0 mod 64 bit already. This makes the behaviour consistent for all sizes.

This comment has been minimized.

@nagisa

nagisa Aug 11, 2016

Author Contributor

Do you use this behavior in your analysis?

I didn’t answer this: yes, I do. The dead code removing pass uses this behaviour.

This comment has been minimized.

@scottcarr

scottcarr Aug 12, 2016

Contributor

Oh, I see now. Yes it should be consistent at least :)

On Aug 11, 2016, at 4:14 PM, Simonas Kazlauskas notifications@github.com wrote:

In src/librustc_data_structures/bitvec.rs #35608 (comment):

 pub fn new(num_bits: usize) -> BitVector {
     let num_words = u64s(num_bits);
     BitVector { data: vec![0; num_words] }
 }

 pub fn contains(&self, bit: usize) -> bool {
     let (word, mask) = word_mask(bit);
  •    (self.data[word] & mask) != 0
    
  •    (self.data.get(word).cloned().unwrap_or(0) & mask) != 0
    
    It seems to me that asking for an index that doesn't exist should panic and that this might mask bugs in other code.

It had such a behaviour for sizes which are not 0 mod 64 up to next 0 mod 64 bit already. This makes the behaviour consistent for all sizes.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub https://github.com/rust-lang/rust/pull/35608/files/0962575160c619b29df676c0fbb0cee3f804caf7#r74520803, or mute the thread https://github.com/notifications/unsubscribe-auth/AAc1nWMtSQhjePDC7v--c3TRfwBd_6YFks5qe6zdgaJpZM4Jinmf.


pub struct CsPropagate;

impl Pass for CsPropagate {}

This comment has been minimized.

@scottcarr

scottcarr Aug 11, 2016

Contributor

Nit: I don't like highly visible names as acronyms. Is "Cs" and acronym for Constant-Simply? Why is the 's' lower case?

This comment has been minimized.

@nagisa

nagisa Aug 11, 2016

Author Contributor

I don't like highly visible names as acronyms.

Highly… visible?

Why is the 's' lower case?

Because of how naming conventions in Rust work.

This comment has been minimized.

@nagisa

nagisa Aug 11, 2016

Author Contributor

I guess I could name it ConstPropagate, I don’t care much either way.

@scottcarr

This comment has been minimized.

Copy link
Contributor

scottcarr commented Aug 12, 2016

I only had time for a quick look, but it looks cool to me. I don't think you should be discouraged in anyway by my work on move up propagation -- for several reasons:

  • Move up propagation still a work in progress
  • I have often thought that we need a way of doing optimizations generically
  • All the optimizations should be additive (hopefully)

Does the dead code optimization ever fire?

You could make SimplifyRewrite and CsPropgate separate, right? SimplifyRewrite does not appear to ever look at the lattice?

We may want to put all the optimizations behind some command line option like mir-opt-level or just a regular optimization flag now?

@nagisa

This comment has been minimized.

Copy link
Contributor Author

nagisa commented Aug 12, 2016

You could make SimplifyRewrite and CsPropgate separate, right? SimplifyRewrite does not appear to ever look at the lattice?

The important part of having them together is the fact that they are interleaved. This is the major feature of this dataflow framework. The interleaving allows optimisation to not miss opportunities. Something that otherwise would be impossible by having these passes completely separate.

For example consider a case like this:

First constant propagation replaces if tmp1 { peach } with if false { peach }. Then simplification pass removes the blocks for if body and by extension a predecessor for the block after this conditional. In case the predecessor were to stay, it could have impeded further constant propagation in certain cases, but simplification pass went ahead and removed the whole conditional before it was even considered for propagation.

Does the dead code optimization ever fire?

It does, indeed. The preceding constant propagation pass does not care about cleaning up after itself, so it falls onto dead code optimisation to clean up after it by removing all the extra assignments which end up getting unused.

We may want to put all the optimizations behind some command line option like mir-opt-level or just a regular optimization flag now?

We indeed might want to. I previously proposed to default mir-opt-level = -C opt-level + 1, so the MIR optimisation level could easily be derived from the optimisation level of the whole compilation unit, and also have an option to run some passes always (optimisation level 0).

All the optimizations should be additive (hopefully)

It does not make sense to dedicate manpower on making two equivalent (in their purpose) passes in two different ways when there’s already so many more, better, things to do.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Aug 12, 2016

@nagisa @scottcarr Something that crossed my mind is that, well, "destination propagation" (aka "move up propagation" aka "backward move propagation") is orthogonal "source propagation" (aka "(forward) value propagation").
That is, one attempts to lift destinations to earlier writes, the other attempts to sink sources to later reads.

IIUC, dataflow being employed for value propagation (specifically to constants, and generally to moves), would be "source propagation", not "destination propagation" (correct me if I'm wrong @nagisa), so it could eliminate the single moves left by "destination propagation" reducing a move chain, e.g.:

var0 = arg0;
tmp0 = var0;
tmp1 = tmp0;
var1 = tmp1;
tmp2 = var1;
return = f(tmp2);

Destination propagation would likely be able to do this:

tmp2 = arg0;
return = f(tmp2);

But only source propagation (using dataflow analysis) could turn that into:

```rust
return = f(arg0);

In fact, the testcase from @scottcarr's #34693 results in this:

var0 = arg0;
return = (*var0);

You'd need source propagation to reduce it to:

return = (*arg0);
let StatementKind::Assign(ref lval, ref rval) = s.kind;
match *rval {
Rvalue::Use(Operand::Consume(ref nlval)) => {
lat.insert(lval, Either::Lvalue(nlval.clone()));

This comment has been minimized.

@scottcarr

scottcarr Aug 12, 2016

Contributor

Here, an Either::Lvalue is put in the lattice, but I don't see any place where an Either::Lvalue could be read out. Am I missing something?

I see that ConstRewrite uses Either::Const, but I don't understand what Either::Lvalue is for.

This comment has been minimized.

@nagisa

nagisa Aug 12, 2016

Author Contributor

It was primarily for move propagation, but otherwise it is still a good idea to keep it Either::Lvalue here, because in case of code like tmp1 = var1; <stuff> tmp1 = 0; it would correctly pick up the fact that the tmp1 is a constant. This wouldn’t work with Top because of how join operation with ⊤ works. Just removing the lval from the map here doesn’t work either, because then, in a block with two incoming edges, there could be a case where ⊤ is not produced, when it really should be. Finally, just making a Either::Lvalue (without the contained Lvalue) is also suboptimal, because you’d then have to produce Either::Top every time two such values are joined.

This comment has been minimized.

@nagisa

nagisa Aug 12, 2016

Author Contributor

Actually, scratch that, I think Either::Lvalue stuff without inner data could be made to work.

This comment has been minimized.

@arielb1

arielb1 Aug 14, 2016

Contributor

If you want to use Either::Lvalue, you ought to invalidate it when the lvalue is re-assigned, because e.g.

let mut x = getit();
y = x; // y <- Either::Lvalue(x)
x = 1;
return y; // y must not be Either::Lvalue(x);

This comment has been minimized.

@gereeter

gereeter Aug 14, 2016

Contributor

I believe that move propagation, implemented in this framework, should not use a plain from from lvalues to lvalues - rather, it should used some some of data structure that holds equivalence classes, exactly for the invalidation reason you bring up.

@gereeter

This comment has been minimized.

Copy link
Contributor

gereeter commented Aug 12, 2016

You could make SimplifyRewrite and CsPropgate separate, right? SimplifyRewrite does not appear to ever look at the lattice?

@nagisa mostly covered this, but it is important to note that the interleaved transformation is more powerful than even an arbitrary number of iterations of the two passes seperately in any order. For example, in the following code, neither pass would be able to make any progress:

let mut x = 0;
loop {
    if x != 0 {
        x = 1;
    }
    println!("{}", x);
}

Constant propagation cannot see that x is always 0, since it will see the x = 1, while there are no trivially simplifiable expressions, so the simplification pass cannot do anything. In contrast, when the two are interleaved, the code can be properly optimized to

let mut x = 0;
loop {
    println!("{}", 0);
}
self.values.remove(key)
}
fn top(&mut self, key: &Lvalue<'tcx>) {
self.insert(key, Either::Top);

This comment has been minimized.

@gereeter

gereeter Aug 12, 2016

Contributor

Why not just remove from the map? With an intersection based join function, not present means Top.

This comment has been minimized.

@arielb1

arielb1 Aug 14, 2016

Contributor

But this is a union-based lattice. No, missing hash table values default to Top. But then, why do you have an fn remove?

This comment has been minimized.

@gereeter

gereeter Aug 14, 2016

Contributor

Yeah, this is essentially my confusion over the lattice - the intersection based join function seems to conflict with all the code that treats removal and setting values to Top differently.

}

impl<'tcx> Lattice for CsLattice<'tcx> {
fn bottom() -> Self { CsLattice { values: FnvHashMap() } }

This comment has been minimized.

@gereeter

gereeter Aug 12, 2016

Contributor

With an intersection based lattice, the empty map is actually the Top element, not the bottom element. join(x, bottom()) should be x, not bottom().

@gereeter

This comment has been minimized.

Copy link
Contributor

gereeter commented Aug 12, 2016

I think I may not be quite understanding what lattice you're trying to use for constant propagation and simplification. Here is the model that I am envisioning.

For every lvalue, we want to track the state of what is stored in that location. There are three classes of value:

  • ⊥, meaning that the location is uninitialized.
  • A conrete value (a constant, since you stripped out move propagation).
  • ⊤, meaning that the location is initialized with an unknown value.

The join function is then performed pointwise, and joining with an uninitialized value does nothing.

To concretely represent this lattice, we wish to use a HashMap, which serves as only a partial mapping from lvalues to states. Therefore, we need to define a default state that is assigned to all lvalues that do not appear in the HashMap. The two obvious choices are ⊥, which corresponds to a union-based join function, and ⊤, which corresponds to an intersection-based join function.

To determine which is appropriate, we look at what lattice element should be used at the entry point of the function. Note that all arguments to the function must be at ⊤, since they are fully initialized, but could be anything. If they were instead ⊥,

fn foo(mut bar: i32, baz: bool) {
    if baz {
        bar = 15;
    }
    println!("{}", bar);
}

would be incorrectly optimized to

fn foo(mut bar: i32, baz: bool) {
    println!("{}", 15);
}

since at the end of the if statement, ⊥ and 15 would be joined to produce 15, incorrectly convincing the optimizer that bar must be 15. Similarly, every static must be ⊤, or

static mut foo: i32 = 0;

unsafe fn bar(flag: bool) {
    if flag {
        foo = 1;
    }
    println!("{}", foo);
}

would reduce to

static mut foo: i32 = 0;

unsafe fn bar(flag: bool) {
    if flag {
        foo = 1;
    }
    println!("{}", 1);
}

Relatedly, every static would have to have its state set to ⊤ after every function calls, which might arbitrarily mess with them. This is a tricky problem, as we would have to scan through the function to make sure that we were tracking every static that might be referenced. Vars and Temps could be ⊥ (after all, they aren't actually initialized on entry), but by Rust's design, I don't think this would actually improve any optimizations - Rust should prevent any case that could involve a usage of an uninitialized value. Overall, using ⊤ as the default value for the map seems like a much safer plan, since beyond obviating the need to track down all statics, it cannot cause miscompilations. If some value is missed, it might cause underoptimization, but this is a much better state to be in than producing incorrect code.

The one problem with ⊤ as the default value is that we can no longer actually represent bottom. This isn't necessarily, as the dataflow framework can be written to never call bottom for forward analyses, but if it is necessary, we can just use WBottom.

Therefore, I'm envisioning

struct CsLattice {
    map: WBottom<FnvHashMap<Lvalue, WBottom<ConstVal>>>
}

with an intersection-based join function.


Unfortunately, I'm a bit confused about what lattice and representation you're using for this PR. You're using an intersection-based join function, which makes "removed from the map" into the most top state that an lvalue can be in, yet you also have a separate Top that you can assign to specific values. Moreover, there doesn't seem to be any way to make an lvalue bottom. In fact, the bottom method actually returns the top of the lattice.

}
if unwind.is_some() {
let mut unwind = lat.clone();
unwind.remove(location);

This comment has been minimized.

@arielb1

arielb1 Aug 14, 2016

Contributor

location ends up being value in both cases - that's the point of DropAndReplace. This could be vec![lat, lat.clone()].

for arg in args {
if let Operand::Consume(ref lval) = *arg {
// FIXME: Probably safe to not remove any non-projection lvals.
lat.remove(lval);

This comment has been minimized.

@arielb1

arielb1 Aug 14, 2016

Contributor

There is no need to "top" things when they are moved out. In fact, uninitialized values should be set to undef (which is similar to bottoming them).

}
}
destination.as_ref().map(|&(ref lval, _)| lat.top(lval));
vec![lat; succ_count]

This comment has been minimized.

@arielb1

arielb1 Aug 14, 2016

Contributor

Unlike DropAndReplace, destiination is set to undef on the unwind branch. Leaving it topped is conservative should be precise tho because the paths won't ever merge.

@nagisa

This comment has been minimized.

Copy link
Contributor Author

nagisa commented Aug 14, 2016

@arielb1 @gereeter thanks for the comments!

Unfortunately, I'm a bit confused about what lattice and representation you're using for this PR. You're using an intersection-based join function, which makes "removed from the map" into the most top state that an lvalue can be in, yet you also have a separate Top that you can assign to specific values. Moreover, there doesn't seem to be any way to make an lvalue bottom. In fact, the bottom method actually returns the top of the lattice.

With an intersection based lattice, the empty map is actually the Top element, not the bottom element. join(x, bottom()) should be x, not bottom().

That’s indeed true, however in case {} = ⊤, becomes a map containing a for each lvalue in the function. That is unfeasible to construct in a fn bottom() {}, therefore I opted for an inverse of some sorts: if an Lvalue does not exist in a map, then it is a , and the becomes a HashMap with a Lvalue = ⊤ for each lvalue in the function. Then, when joining, any value which doesn’t fall within the intersection gets added to the map as a .

Does that make sense?

I think I may not be quite understanding what lattice you're trying to use for constant propagation and simplification. Here is the model that I am envisioning.

I certainly see why it could be confusing, because not only it was a last-minute idea, but I’m sure I forgot/missed some self.tops around the place. I already am aware of at least a few of them, without even re-reviewing the code.


WBottom<FnvHashMap<Lvalue, WBottom<ConstVal>>>

In my experience that doesn’t work quite well for the same reason I described above. It is hard to construct a half-bottom lattice (where most of the lvalues are still ) out from the completely-bottom one (the WBottom::Bottom). Remember, that in your scheme, the absence of an Lvalue in the map already means it is a .

I feel like it could also be somewhat unclear why distinction between a and for Lvalues matter (or why they are necessary at all), so I’ll answer that upfront: in case like “a reference of var” is taken, you want to make the var a , so it does not participate in further optimisation (i.e. I don’t want to tackle problems like aliasing etc. yet). Consider code like this:

var0 = const 0;
tmp0 = &var0;
// ... could be all sorts of stuff, including casts tmp0 to a raw pointer to potentially further inhibit analysis.
var0 = const 1;
// does not mean you can replace var0 with const 1 here, because the pointer taken above could be aliased and thus invalidates the transformation like this by all sorts of operations.
@nagisa

This comment has been minimized.

Copy link
Contributor Author

nagisa commented Aug 14, 2016

I took some quick time to respond to the most important questions, but I haven’t time to rebase or tweak code to the comments; I shall do that on Tuesday. Thanks for the reviews again!

@gereeter

This comment has been minimized.

Copy link
Contributor

gereeter commented Aug 14, 2016

That is unfeasible to construct in a fn bottom() {}, therefore I opted for an inverse of some sorts: if an Lvalue does not exist in a map, then it is a ⊥, and the ⊤ becomes a HashMap with a Lvalue = ⊤ for each lvalue in the function.

This is fine - it is the bottom-by-default proposal I made, and makes perfect sense.

However, it's not actually what you implemented. You wrote an intersection-based join function, which necessarily makes the default ⊤, since join(not present, x) = not present, the law for ⊤. I think this is basically the point that I'm confused about - I'm fine with either bottom-by-default or top-by-default, as long as they are implemented correctly (and I think top-by-default is easier to implement correctly), but you're using a bunch of code that seems to work as if the lattice were bottom-by-default with a lattice that is actually top-by-default.

Then, when joining, any value which doesn’t fall within the intersection gets added to the map as a ⊤.

I don't see any code that actually does this, and it definitely makes the map top-by-default.

In my experience that doesn’t work quite well for the same reason I described above. It is hard to construct a half-bottom lattice (where most of the lvalues are still ⊥) out from the completely-bottom one (the WBottom::Bottom). Remember, that in your scheme, the absence of an Lvalue in the map already means it is a ⊤.

Yes, this is a drawback of having top-by-default. However, I don't see why you want a half-bottom value - while you may want to bottom out individual values, this is as easy in either representation. The entry point to the function must be mostly-top (for reasons I argued before), and so the default will generally want to be top.

That said, after thinking about this some more, I have a third solution which I think is just universally clearer and easier to work with:

/// A `FullMap<K, V>` is a complete mapping from `K`s to `V`s - every `K` corresponds
/// to a `V`.
struct FullMap<K, V> {
    // The value corresponding to every key not in exceptions
    default: V,
    exceptions: HashMap<K, V>
}

impl<K, V> FullMap<K, V> {
    fn get<'a>(&'a self, key: &K) -> &'a V {
        self.exceptions.get(key).unwrap_or(&self.default)
    }

    fn insert(&mut self, key: K, value: V) -> V {
        if value == self.default {
            self.exceptions.remove(&key).unwrap_or(value)
        } else {
            self.exceptions.insert(key, value).unwrap_or_else(|| self.default.clone())
        }
    }
}

impl<K, V: Lattice> Lattice for FullMap<K, V> {
    fn bottom() -> Self {
        FullMap {
            default: bottom(),
            exceptions: HashMap::new()
        }
    }

    fn join(&mut self, other: &FullMap<K, V>) -> bool {
        // This could definitely be optimized when defaults are known to be top or bottom
        let old_default = self.default.clone();
        let mut changed = self.default.join(&other.default);
        for (k, v) in &mut self.exceptions {
            if !other.exceptions.contains_key(&k) {
                changed |= v.join(&other.default);
                // TODO: for efficiency, remove keys here equal to the default
            }
        }

        for (k, v) in &other.exceptions {
            match self.exceptions.entry(k.clone()) {
                Occupied(occupied) => {
                    changed |= occupied.get_mut().join(v);
                    if *occupied.get() == self.default {
                        occupied.remove();
                    }
                },
                Vacant(vacant) => {
                    let mut cur_v = old_default.clone();
                    changed |= cur_v.join(v);
                    if cur_v != self.default {
                        vacant.insert(cur_v)
                    }
                }
            }
        }
        changed
    }
}

struct CsLattice {
    inner: FullMap<Lvalue, WTopBottom<ConstVal>>
}
@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Aug 14, 2016

In my experience that doesn’t work quite well for the same reason I described above. It is hard to construct a half-bottom lattice (where most of the lvalues are still ⊥) out from the completely-bottom one (the WBottom::Bottom). Remember, that in your scheme, the absence of an Lvalue in the map already means it is a ⊤.

The entry-point state of all variables should be ⊤, not ⊥. In fact, your code already basically does that.

@nagisa nagisa force-pushed the nagisa:mir-dflow-3 branch from 0962575 to 77f7d2e Aug 15, 2016

@nagisa

This comment has been minimized.

Copy link
Contributor Author

nagisa commented Aug 15, 2016

Okay, I ripped out any and all hacks I had in place wrt the lattice (most of the junk was a leftover from my experimentation with move propagation etc, which I didn’t clean up before).

@@ -616,12 +616,14 @@ fn gather_moves<'a, 'tcx>(mir: &Mir<'tcx>, tcx: TyCtxt<'a, 'tcx, 'tcx>) -> MoveD
Rvalue::InlineAsm { .. } => {}
}
}
StatementKind::SetDiscriminant{ ref lvalue, .. } => {

This comment has been minimized.

@nagisa

nagisa Aug 15, 2016

Author Contributor

cc @arielb1 does this make sense to you?

This comment has been minimized.

@arielb1

arielb1 Aug 16, 2016

Contributor

SetDiscriminant and borrowck are basically incompatible. Do not do these kinds of optimizations before borrowck runs.

repr::StatementKind::SetDiscriminant{ .. } => {
span_bug!(stmt.source_info.span, "SetDiscrimant should not exist during borrowck");
}
repr::StatementKind::SetDiscriminant{ ref lvalue, .. } |

This comment has been minimized.

@nagisa

nagisa Aug 15, 2016

Author Contributor

@arielb1 also this.

@nagisa

This comment has been minimized.

Copy link
Contributor Author

nagisa commented Aug 15, 2016

Note, this PR still contains a hack wrt unused locals, which are handled in trans. These are better handled by a pass, which @scottcarr already has written and available somewhere, but not yet merged. @scottcarr could you try and split out that pass in a separate PR?

Not anymore.

@nagisa nagisa force-pushed the nagisa:mir-dflow-3 branch from 644d84d to 28128d1 Aug 16, 2016


passes.push_pass(box mir::transform::erase_regions::EraseRegions);

passes.push_pass(box mir::transform::deaggregator::Deaggregator);

This comment has been minimized.

@arielb1

arielb1 Aug 16, 2016

Contributor

This should be after ElaborateDrops, not before.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Aug 16, 2016

☔️ The latest upstream changes (presumably #35162) made this pull request unmergeable. Please resolve the merge conflicts.

@nagisa nagisa force-pushed the nagisa:mir-dflow-3 branch from dbe58ab to ccef2ad Aug 16, 2016

@@ -63,6 +62,45 @@ impl BitVector {
}
}

/// Return and unset first bit set.
pub fn pop(&mut self) -> Option<usize> {

This comment has been minimized.

@nikomatsakis

nikomatsakis Aug 16, 2016

Contributor

some unit tests would be nice :)

This comment has been minimized.

@nagisa

nagisa Aug 16, 2016

Author Contributor

There’s a test for this function introduced by this PR (but not necessarily in the commit you commented on).

@@ -62,12 +62,12 @@ impl BitVector {
}
}

/// Return and unset first bit set.
/// Return and unset last bit set.

This comment has been minimized.

@nikomatsakis

nikomatsakis Aug 16, 2016

Contributor

Nit: it'd be nice to backport these to prior commit. But wevs.

nagisa added some commits Aug 6, 2016

Add Rvalue::is_pure
This method is used by the liveness (dead code elimination) pass, but for some reason I never ended
up staging it for commit
Clean-up the ConstLattice; Propagate global state
Previous version of the lattice was a mess from a number of experimets I’ve previously done with
copy/move propagations and reviewers were sure to notice the mess. This cleans up the mess by
ignoring anything related to non-constants.

In addition, this commit adds propagation of constants through global mutable statics.
Make SwitchInt switch on Operand
We can easily translate a SwitchInt with a Constant Operand, and it also helps us to constant
propagate the thing.
Disallow to optimise out constants in OOB tests
Apparently our constant errors are only reported if there’s a use in MIR for that constant. This
means that in these tests `let _ = B` is seen as a no-op use and thus is optimised out. Explicitly
drop() the B to make sure it is not optimised (yet, until the inliner happens).

NB: the `let _ = B` statement discussed here has no side effects (such as Drop implementation).
Do not be overly aggressive optimising MIR with -g
In debug mode if we optimise stuff out, then debug info also goes away. Do not optimise MIR in
these destructive ways if `-g` flag and -Z mir-opt-level <= 1 (which now defaults to -C opt-level +
1).

@nagisa nagisa force-pushed the nagisa:mir-dflow-3 branch from ccef2ad to 0791d56 Aug 18, 2016

@nagisa

This comment has been minimized.

Copy link
Contributor Author

nagisa commented Aug 19, 2016

Hi all,

Yesterday I thought up of a class of nasty corner cases which, I think, makes raw pointers, MIR and this sort of (backward) dataflow framework incompatible without some sort of way to query aliasing information.

fn main() {
    let mut x = 4;
    let y = &mut x as *mut i32;
    x = 5;
    unsafe ( assert_eq!(5, *y) };
}

Produces MIR like this (read bottom-to-up):

bb0: {
    var0 = const 4i32; 
    tmp1 = &mut var0; // and now finds there was a reference taken (which later gets cast into aliased raw pointer)
    tmp0 = &mut (*tmp1);
    var1 = tmp0 as *mut i32 (Misc); 
    var0 = const 5i32; // looks like a plain unused assignment, so gets removed
    ... // var0 is never used again in any path other than through the pointer in var1
}

Initially, I thought that only propagating locals (not projections) and completely ignoring locals once a reference of it is taken would be all right, but now it seems to me that it only works well with forward analysis, where you actually have a chance to find out about the reference before it may get used (which is what my response was to @arielb1 when he initially inquired about how aliasing gets handled).

The part of the framework which handles speculative rewrites cannot quite handle this case either, because it only kicks in when there’s two conflicting pieces of knowledge from two incoming edges. In this particular case bb0 only ever has a single incoming edge (both in forward and backward analysis). I guess one could go on and simply mark all variables which get references taken before flow begins, but that feels like a nasty hack (crude per-pass version of alias “analysis”).

I feel that without the DeadCode pass, the PR would be too shallow to land, still. You’d have some sort of DF framework and be able to write some very simple forward passes, but nothing that reaches the complexity levels (/s) of the DeadCode pass. And even then there would be little gain: without DeadCode, constant propagation only duplicates the work LLVM is as good at doing it as it gets; and even with DeadCode the gains are very small and only in memory consumption.

Fix aliasing in const-propagate
Something that got lost in the lattice refactor
@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Aug 20, 2016

@nagisa

A more difficult case is

fn main() {
    let mut x = 4;
    let y = &mut x as *mut i32;
    x = 3; // this assignment is not dead
    unsafe ( assert_eq!(5, *y) };
    x = 5; // and neither is this one
    unsafe ( assert_eq!(5, *y) };
}

I suppose that you would first need to forward-propagate which variables get their address taken. Maybe see what LLVM does?

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Aug 20, 2016

@arielb1 I wouldn't suggest looking at LLVM unless you want to implement MemorySSA for MIR.
I agree with propagating borrows and subsequently ignoring accesses as potentially aliased.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Aug 25, 2016

☔️ The latest upstream changes (presumably #35979) made this pull request unmergeable. Please resolve the merge conflicts.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Aug 31, 2016

@nagisa while I agree that we have to be conservative around these examples, I'm not quite sure why it's a problem. I guess I have to look a bit more at how you are handling things, but it seems to me that a deref of (e.g.) a raw pointer needs to be (conseratively) considered a use of (e.g.) any variable whose address has been taken. The same is true for various other operations (e.g., calls) where we can't say for sure what actions they will take. This is annoying -- and we should be clear about the rules we currently enforce -- but seems quite doable.

Eventually we will want to be more precise but this also hinges on the unsafe code guidelines we wind up with, I imagine.

@DemiMarie

This comment has been minimized.

Copy link
Contributor

DemiMarie commented Oct 2, 2016

According to Slava Pestov's notes on the Factor optimizing compiler, the optimistic assumption (starting at bottom) produces better code than the pessimistic assumption (starting at top).

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Oct 31, 2016

ping, what's the next step here?

@nagisa

This comment has been minimized.

Copy link
Contributor Author

nagisa commented Nov 1, 2016

We need alias analysis of some sort before we can implement any of the more complex optimisations. Closing for now.

@nagisa nagisa closed this Nov 1, 2016

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Nov 1, 2016

Alias analysis sounds scary but all you need for most things is "two different non-ZST locals never occupy the same space" and if you want to go further, "a pointer value can't point to a local that was never borrowed". The latter is already part of @pcwalton's code, I believe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment