Move the core MIR datastructures to librustc. #29929

michaelwoerister · 2015-11-19T15:42:27Z

This is done mostly so that we can refer to MIR types in csearch and other metadata related area.

Heads up, @rust-lang/compiler!

arielb1 · 2015-11-19T16:00:24Z

I would prefer not to move stuff to librustc because it is already too big. Maybe split csearch into a trait object and put it in another crate instead?

michaelwoerister · 2015-11-19T16:17:02Z

@arielb1 csearch and most of the other metadata stuff could probably be split out of librustc. But anything that can be restored from metadata would still have to be in librustc, so that a csearch trait could refer to it.

michaelwoerister · 2015-11-19T16:29:16Z

Another option would be not to have a centralized facility for loading things from metadata (i.e. csearch) but to separate that into their associated modules: codemap::metadata, hir::metadata, ty::metadata, mir::metadata and what have you. Not the worst option, in my opinion.

nikomatsakis · 2015-11-19T16:45:21Z

@arielb1 the main point of the rustc crate is to house common type definitions, I think. It seems like the logical place for these to go to me.

brson · 2015-11-19T17:19:51Z

I know I don't have much stake in this but I too find this surprising. I think we want code moving the other direction to dismantle the monolith.

(Of course I understand that metadata is a tangled nexus of interdependencies in rustc).

arielb1 · 2015-11-19T20:12:07Z

@michaelwoerister

mir::tcx and mir::repr are small enough I am not worried about adding them. I would prefer to split metadata.

michaelwoerister · 2015-11-19T21:08:36Z

To me this looks as if the whole metadata infrastructure (including automated serialization) is in need of a refactoring. However, I wouldn't want to rush this. It's something that would profit from putting some thought into and maybe producing an RFC that shows some alternatives and takes stock of all the requirements that we actually have here.

Regarding this PR: As I said above, putting all 'common' types into librustc is not the only option. A more staged approach that shields earlier compilation passes (typechecking, etc) from later ones (e.g. involving MIR) would be possible too. It would mean splitting metadata handling into a common core module that doesn't know anything specific and then having each of the successively later modules providing their specialized implementations for loading codemaps, type information, MIR, etc. If everything follows the same scheme, this could be a clean solution. (Just not something we'd want to do in an ad hoc manner).

arielb1 · 2015-11-19T23:03:06Z

automated serialization? I am working on splitting metadata to a trait and a crate that implements it. We can do something like astencode for MIR (but hopefully less horrible).

michaelwoerister · 2015-11-20T09:01:34Z

automated serialization?

Something in the vain of RustcEncodable but that allows to also serialize/deserialize things that need more context (e.g. ty::Ty<>) in order to get rid of as much boilerplate code as possible.

I am working on splitting metadata to a trait and a crate that implements it.

Sounds good :)

arielb1 · 2015-11-20T12:07:58Z

@michaelwoerister

Our serialization includes a non-trivial amount of non-trivial context - the dumb "dump tcx tables" part is relatively small.

nikomatsakis · 2015-11-24T11:01:04Z

On this specific scenario:

The goal was to get MIR serialization working quickly and painlessly. I think everyone agrees that maintaining astencode.rs (and to a lesser extent tyencode.rs) is an awful pain. In general, our strategy with crates has been that code that needs to shared across multiple crates is moved to librustc, and things that do not remain in their own individual crates, but it's certainly possible to imagine a more elaborate structure as well.

@arielb1 I'm not sure precisely what you are working on but it sounds interesting. Moving metadata to its own crate and having it communicate back via a trait object seems reasonable -- though in general I am wary of introducing too many layers of abstraction and traits, I think it's a sign of "over-crate-ification". How far away are you from having a PR ready for this?

My preference is to keep moving forward, basically. I think my preferred option is to land this PR so that @michaelwoerister can keep making progress and then move the MIR definitions back as part of the (proposed) PR that separates metadata out. This seems reasonable to me.

Some more general thoughts:

I see splitting into crates as a means to an end, not an end in itself. The original motivation for splitting up rustc was primarily reducing memory use. But I think the split is good so long as it helps to keep the code base more approachable. In that sense, having multiple crates helps to lay bare the structure of the codebase, which is nice.

But having lots of crates has costs, too. If you have too many, there stops to be a lot of benefit to the crates over modules in terms of offering a quick overview. They impose acyclicity requirements which, while sometimes helpful, can also be limiting -- e.g. the situation we are discussing right here, where metadata needs to intercommunicate with the ty types but the also other, later parts of the compiler. This can be resolved (as @arielb1 proposed) with a trait, and I actually think this will probably work pretty well in this situation, but it doesn't work very well in all situations, particularly if there are lots of types needed.

In general, the way I have understood crates and Rust is that the purpose of a crate is to be "the unit of code that gets shipped". In that sense, having multiple crates in the compiler is confusing, because it suggests that an end-user might "use" librustc_borrowck (or whatever) independently from the rest of the compiler, which we actively do not want to support, at least not today.

Honestly, if memory use were not a concern, and incremental compilation were working, I am not entirely sure whether I would want to continue having multiple crates. But memory use is a concern, and incremental compilation is not yet working, so obviously we should continue with multiple crates for now at least. But I still don't think we have to be dogmatic about it: we can promote shared types as needed.

nikomatsakis · 2015-11-24T11:05:12Z

Sorry, not sure why I felt the need to write that gigantic essay. :)

Let me just post back a few notes on serialization. I think we should definitely not hand-write code like astencode.rs for MIR -- maintaining that code is such a pain. I want to be able to #[derive(SomethingOrOther)]. Using TLS is certainly the easiest way to get the required context, but my preference has always been to extend the derive syntax so that we can add more information (like specifying the "extra context" that needs to be passed down), but that is a larger bit of design that hasn't happened. It might be that we could use something like https://crates.io/crates/custom_derive though, which might be very nice indeed.

michaelwoerister · 2015-11-24T14:40:23Z

I've rebased this and also moved the visitor module to librustc.

arielb1 · 2015-11-24T19:23:21Z

@nikomatsakis

I have a PR almost ready.

bors · 2015-11-25T20:42:55Z

☔ The latest upstream changes (presumably #30044) made this pull request unmergeable. Please resolve the merge conflicts.

nikomatsakis · 2015-11-25T20:57:22Z

OK, I think we should close this in favor of #30043, despite my impassioned defense of monolithic crates. ;) @michaelwoerister do you agree?

michaelwoerister · 2015-11-30T08:44:18Z

Well, splitting out metadata is mostly orthogonal to make MIR type definitions available in rustc, so I wouldn't close this PR, just rebase it, so I can later add MIR to the new CrateStore trait.

This is done mostly so that we can refer to MIR types in csearch and other metadata related area.

nikomatsakis · 2015-11-30T14:17:22Z

So now that @arielb1 has landed his PR, I see two options:

Initially I assumed we'd add a second metadata trait, probably in librustc_mir, that extends the original metadata trait with methods for loading MIR metadata. Since this trait object is not known to the tcx, it would have to be threaded to the later passes that need access to it. If we take this to the extreme, we basically wind up with a layered series of contexts -- but it's possible that we'll never need more than two layers.
We continue to move the data structures to rustc, so that we don't need these layered contexts, and instead can take more a "promise-based" approach. But we move the code out to separate crates where possible.

I am actually not sure which option I prefer. I was basically arguing for option 2 earlier, but option 1 has its appeal, in that the dag structure is more clear. But it also introduces more layers and more indirection, which can make the overall structure harder to discern, particularly for newcomers. I think in the end there is no firm rule, it will depend a lot on the particulars. Luckily, we don't have to decide this permanently now, this is basically something we can refactor over time.

Discussing with @michaelwoerister on IRC, we were thinking it probably makes sense to land this PR after all and push forward with incr. comp. work. If we want to refactor to option 1, we can do that too. Note though that process on incr. comp. also affects this decision, since bootstrap time may be affected (though I think it'll be some time before we get a lot of progress there).

EDIT: to be clear, about incr. comp. I mean it'll be some time before it's effective at helping with rustc and the whole stage0/stage1/stage2 situation, though I've not thought about this super hard.

nikomatsakis · 2015-11-30T14:18:27Z

@bors r+

bors · 2015-11-30T14:18:27Z

📌 Commit f28a4e9 has been approved by nikomatsakis

bors · 2015-11-30T14:18:29Z

⌛ Testing commit f28a4e9 with merge 8bc43ed...

@nikomatsakis

…atsakis This is done mostly so that we can refer to MIR types in csearch and other metadata related area. Heads up, @rust-lang/compiler! r? @nikomatsakis

bors · 2015-11-30T17:02:43Z

☀️ Test successful - auto-linux-32-nopt-t, auto-linux-32-opt, auto-linux-64-debug-opt, auto-linux-64-nopt-t, auto-linux-64-opt, auto-linux-64-x-android-t, auto-linux-cross-opt, auto-linux-musl-64-opt, auto-mac-32-opt, auto-mac-64-nopt-t, auto-mac-64-opt, auto-win-gnu-32-nopt-t, auto-win-gnu-32-opt, auto-win-gnu-64-nopt-t, auto-win-gnu-64-opt, auto-win-msvc-32-opt, auto-win-msvc-64-opt

michaelwoerister added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html A-compiler labels Nov 19, 2015

rust-highfive assigned nikomatsakis Nov 19, 2015

michaelwoerister force-pushed the mir-repr-to-librustc branch from 6db9e6f to 62fdb99 Compare November 24, 2015 13:36

michaelwoerister added 2 commits November 30, 2015 10:03

Move the core MIR datastructures to librustc.

bbe1d28

This is done mostly so that we can refer to MIR types in csearch and other metadata related area.

Also move the MIR visitor to librustc.

f28a4e9

michaelwoerister force-pushed the mir-repr-to-librustc branch from 62fdb99 to f28a4e9 Compare November 30, 2015 10:01

bors merged commit f28a4e9 into rust-lang:master Nov 30, 2015

This was referenced Nov 30, 2015

[MIR] Enhance the SimplifyCfg pass to merge consecutive blocks #29838

Closed

cleaned up trans/consts #30110

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Move the core MIR datastructures to librustc. #29929

Move the core MIR datastructures to librustc. #29929

michaelwoerister commented Nov 19, 2015

arielb1 commented Nov 19, 2015

michaelwoerister commented Nov 19, 2015

michaelwoerister commented Nov 19, 2015

nikomatsakis commented Nov 19, 2015

brson commented Nov 19, 2015

arielb1 commented Nov 19, 2015

michaelwoerister commented Nov 19, 2015

arielb1 commented Nov 19, 2015

michaelwoerister commented Nov 20, 2015

arielb1 commented Nov 20, 2015

nikomatsakis commented Nov 24, 2015

nikomatsakis commented Nov 24, 2015

michaelwoerister commented Nov 24, 2015

arielb1 commented Nov 24, 2015

bors commented Nov 25, 2015

nikomatsakis commented Nov 25, 2015

michaelwoerister commented Nov 30, 2015

nikomatsakis commented Nov 30, 2015

nikomatsakis commented Nov 30, 2015

bors commented Nov 30, 2015

bors commented Nov 30, 2015

bors commented Nov 30, 2015

Move the core MIR datastructures to librustc. #29929

Move the core MIR datastructures to librustc. #29929

Conversation

michaelwoerister commented Nov 19, 2015

arielb1 commented Nov 19, 2015

michaelwoerister commented Nov 19, 2015

michaelwoerister commented Nov 19, 2015

nikomatsakis commented Nov 19, 2015

brson commented Nov 19, 2015

arielb1 commented Nov 19, 2015

michaelwoerister commented Nov 19, 2015

arielb1 commented Nov 19, 2015

michaelwoerister commented Nov 20, 2015

arielb1 commented Nov 20, 2015

nikomatsakis commented Nov 24, 2015

nikomatsakis commented Nov 24, 2015

michaelwoerister commented Nov 24, 2015

arielb1 commented Nov 24, 2015

bors commented Nov 25, 2015

nikomatsakis commented Nov 25, 2015

michaelwoerister commented Nov 30, 2015

nikomatsakis commented Nov 30, 2015

nikomatsakis commented Nov 30, 2015

bors commented Nov 30, 2015

bors commented Nov 30, 2015

bors commented Nov 30, 2015