Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

incr.comp.: Clarify compatibility of red/green evaluation and recovering from cycle errors #42633

Closed
michaelwoerister opened this issue Jun 13, 2017 · 7 comments

Comments

@michaelwoerister
Copy link
Contributor

commented Jun 13, 2017

I originally planned to describe the differences between the current query system and the future version that supports red/green dependency tracking but while thinking about it, there's not too much of a difference when it comes to cycle errors:

  • The in-memory caches of the query system will not change in the future.
  • The try_mark_green step of red/green evaluation might force the evaluation of a query but that's no different from a regular query.
  • There's been talk about having to prevent the caching of intermediate results if those intermediate results arise from a query that turns out to be cyclic, because those intermediate results are "tainted". However, I don't see how one would get ahold of such an intermediate result (both in the current system and the future one) in the first place: If a query A starts a sub-query B and B starts a sub-sub-query C, which would invoke A again, then we would have to make sure that neither the results of B or C are cached. But since we are still in the middle of computing B and C, we don't have any result to cache anyway.

The main difference I see between the current query system and the red/green one is dependency tracking (and there the current system behaves a bit strangely, actually): When a query is started, a new DepNode is allocated and reads to that DepNode are registered as they occur. E.g., we let's say we have the following "query trace":

Queries execute in the order of their label:

            1 <------------+
            |              |
    +-------+-------+      |
    |       |       |      |
    v       v       v      |
    2       4   +-- 5 --+  |
    |           |       |  |
    v           v       v  |
    3           6       7--+

The dependency graph would look exactly like that, cycle and all.

In the red/green system, a DepNode would only be allocated after the successful evaluation of a query. Consequently, we would not end up with the dep-graph shown above. But we still want to record dependencies to those sub-queries because their result influenced what 1 evaluated to after recovering from the cycle. I propose to just merge all reads into the parent query when encountering a cycle error, effectively merging all reads into the root of the cycle. The dep-graph from above would then look like this:

            1
            | 
    +-------+-------+
    |       |       |
    v       v       v
    2       4       6
    |
    v
    3

There are no DepNodes for 5 and 7 (they never completed) but we don't lose the information that 6 has been read, which is important, because the result of 6 might also have caused a trace without cycle.

I still think that recovering from cycle errors is dubious but at least the situation should not get worse with the new system.

Comments? @eddyb @nikomatsakis @rust-lang/compiler

@eddyb

This comment has been minimized.

Copy link
Member

commented Jun 13, 2017

If a query A starts a sub-query B and B starts a sub-sub-query C, which would invoke A again, then we would have to make sure that neither the results of B or C are cached. But since we are still in the middle of computing B and C, we don't have any result to cache anyway.

Yes, but on the way back up to A, both B and C get cached today and they saw the cycle.

@michaelwoerister

This comment has been minimized.

Copy link
Contributor Author

commented Jun 13, 2017

@eddyb Oh, an Err(CycleError) seems to get cached, yeah. We could just... not do that, right?

@eddyb

This comment has been minimized.

Copy link
Member

commented Jun 13, 2017

@michaelwoerister The actual error is uninteresting, but B and C may change based on it.

@michaelwoerister

This comment has been minimized.

Copy link
Contributor Author

commented Jun 14, 2017

Looking at the code again, no Err(CycleError) actually seems to be cached. The semantics of this are so subtle that I keep wanting to get rid of it. Imagine having to explain this to a potential contributor :(

@eddyb

This comment has been minimized.

Copy link
Member

commented Jun 14, 2017

@michaelwoerister AFAIK we'd regress in a few places if we removed cycle recovery. I'd rather have it be enforced correct-by-default and have a warning that if you use try_get your queries may be triggered more than once, so it's a performance/side-effect hazard, but nothing worse.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Jul 14, 2017

Sorry I've been slow to weigh in on this thread. I've been thinking about cycles a lot. At this point, my opinion is that -- at least for now -- we should remove cycle recovery from the query system (that is, cycles should always be considered fatal to compilation -- they don't necessarily have to panic). Things that legitimately want to recover from cycles (which I think is basically just the trait system) should handle it themselves for now.

We don't actually use cycle recovery that much right now, and I think all of the cases can make do with some form of hard error (after a bit of refactoring). We can survey them quite quickly.

  • item_path uses try_get as a kind of "best effort" attempt to peek at the type information and include it in the item path only when it's available. This was a stupid idea on my part. It induces false cycles. We should just deterministically turn off the use of type information for item paths except for in trans, I think, and maybe also there. Or something.
  • MIR inlining uses try_get to handle cycles in the call graph. This was a clever idea but I think ultimately not the right way -- it makes things nondeterministic. Imagine that you have foo() which calls bar() which calls foo(). If we start by asking for the optimized MIR for foo, it can inline bar, but bar will not inline foo. However, if we start by asking for the optimized MIR for bar, you get the opposite result. I think you should get the same result whichever optimized MIR you ask for first.
    • The workaround that I think we should be doing is to have a separate query (let's call it MIR-call-graph or something) and then having the inlining passes all request this call-graph. The call-graph would just walk the MIR for everything. This isn't great for responsiveness, but this call-graph would only be used when we are doing inlining and MIR optimization, so that's probably not an IDE use case, nor is it a prime incremental use-case. (You can make incremental do better by projecting out the SCC for particular def-ids, and then red-green can observe that they have not changed.)
    • We could probably do better here but I think for now this would be fine, and it's not clear that we need to do better ever.
  • sized_constraint and needs_drop_raw is using try_get, but in this case a cycle always indicates that an error should have occurred elsewhere. In particular, it can occur with non-representable types like struct Foo { f: Foo }. We can probably handle this not by recovering from the cycle but by using delay_span_bug to report it, or something like that.

Unless I missed something, those are all the uses of ::try_get in the code-base, and hence (I think) all the uses of cycle recovery that I am aware of.

I was thinking of making trait selection into a query, but I've since thought better of it. I think it will be something "query-like" -- in particular, it will be a function that starts with just a tcx and a key -- but it will want to manage its own caches and it will also need to manage its own dependencies (using anonymous nodes). There is a lot of stuff very particular to the trait system that needs to happen (e.g., for a cycle, we need to check all the traits involved in the cycle and check whether they are auto traits, etc), and I don't see that generalizing very well. Also, I think that ultimately trait matching will not want to be a "DFS" sort of procedure, but rather something more like the ObligationForest, where we build up a dependency graph and kind of "iterate over it" until we are done.

Ultimately my view is that we should aim to keep the query system fairly simple, at least for now. Maybe eventually we will want to expand it into a full-on attribute grammar system that can handle all sorts of data propagation or whatever, but I think that the current setup -- where a query is basically just a memoized function call -- seems about right.

@michaelwoerister

This comment has been minimized.

Copy link
Contributor Author

commented Jul 14, 2017

As a side-note regarding inlining: With #43183, the trans-collector builds up a fairly accurate, global "call graph" between translation items. This could be useful for inlining too. It is post-monomorphization though, which is good for the quality of information it can provide, but which also means that it is a bit late in the pipeline.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 26, 2017
rustc: Start moving toward "try_get is a bug" for incremental

This PR is an effort to burn down some of the work items on rust-lang#42633. The basic change here was to leave the `try_get` function exposed but have it return a `DiagnosticBuilder` instead of a `CycleError`. This means that it should be a compiler bug to *not* handle the error as dropping a diagnostic should result in a complier panic.

After that change it was then necessary to update the compiler's callsites of `try_get` to handle the error coming out. These were handled as:

* The `sized_constraint` and `needs_drop_raw` checks take the diagnostic and defer it as a compiler bug. This was a new piece of functionality added to the error handling infrastructure, and the idea is that for both these checks a "real" compiler error should be emitted elsewhere, so it's only a bug if we don't actually emit the complier error elsewhere.
* MIR inlining was updated to just ignore the diagnostic. This is being tracked by rust-lang#43542 which sounded like it either already had some work underway or was planning to change regardless.
* The final case, `item_path`, is still sort of up for debate. At the time of this writing this PR simply removes the invocations of `try_get` there, assuming that the query will always succeed. This turns out to be true for the test suite anyway! It sounds like, though, that this logic was intended to assist in "weird" situations like `RUST_LOG` where debug implementations can trigger at any time. This PR would therefore, however, break those implementations.

I'm unfortunately sort of out of ideas on how to handle `item_path`, but other thoughts would be welcome!

Closes rust-lang#42633
alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 26, 2017
rustc: Start moving toward "try_get is a bug" for incremental

This PR is an effort to burn down some of the work items on rust-lang#42633. The basic change here was to leave the `try_get` function exposed but have it return a `DiagnosticBuilder` instead of a `CycleError`. This means that it should be a compiler bug to *not* handle the error as dropping a diagnostic should result in a complier panic.

After that change it was then necessary to update the compiler's callsites of `try_get` to handle the error coming out. These were handled as:

* The `sized_constraint` and `needs_drop_raw` checks take the diagnostic and defer it as a compiler bug. This was a new piece of functionality added to the error handling infrastructure, and the idea is that for both these checks a "real" compiler error should be emitted elsewhere, so it's only a bug if we don't actually emit the complier error elsewhere.
* MIR inlining was updated to just ignore the diagnostic. This is being tracked by rust-lang#43542 which sounded like it either already had some work underway or was planning to change regardless.
* The final case, `item_path`, is still sort of up for debate. At the time of this writing this PR simply removes the invocations of `try_get` there, assuming that the query will always succeed. This turns out to be true for the test suite anyway! It sounds like, though, that this logic was intended to assist in "weird" situations like `RUST_LOG` where debug implementations can trigger at any time. This PR would therefore, however, break those implementations.

I'm unfortunately sort of out of ideas on how to handle `item_path`, but other thoughts would be welcome!

Closes rust-lang#42633
bors added a commit that referenced this issue Aug 26, 2017
rustc: Start moving toward "try_get is a bug" for incremental

This PR is an effort to burn down some of the work items on #42633. The basic change here was to leave the `try_get` function exposed but have it return a `DiagnosticBuilder` instead of a `CycleError`. This means that it should be a compiler bug to *not* handle the error as dropping a diagnostic should result in a complier panic.

After that change it was then necessary to update the compiler's callsites of `try_get` to handle the error coming out. These were handled as:

* The `sized_constraint` and `needs_drop_raw` checks take the diagnostic and defer it as a compiler bug. This was a new piece of functionality added to the error handling infrastructure, and the idea is that for both these checks a "real" compiler error should be emitted elsewhere, so it's only a bug if we don't actually emit the complier error elsewhere.
* MIR inlining was updated to just ignore the diagnostic. This is being tracked by #43542 which sounded like it either already had some work underway or was planning to change regardless.
* The final case, `item_path`, is still sort of up for debate. At the time of this writing this PR simply removes the invocations of `try_get` there, assuming that the query will always succeed. This turns out to be true for the test suite anyway! It sounds like, though, that this logic was intended to assist in "weird" situations like `RUST_LOG` where debug implementations can trigger at any time. This PR would therefore, however, break those implementations.

I'm unfortunately sort of out of ideas on how to handle `item_path`, but other thoughts would be welcome!

Closes #42633
@bors bors closed this in #44071 Aug 26, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.