Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace @T in the AST with a smart pointer. #13316

Merged
merged 14 commits into from Sep 14, 2014
Merged

Conversation

eddyb
Copy link
Member

@eddyb eddyb commented Apr 4, 2014

Replaces Gc in the AST with a custom owned smart pointer, P. Fixes #7929.

Benefits

  • Identity (affinity?): sharing AST nodes is bad for the various analysis passes (e.g. one could bypass borrowck with a shared ExprAddrOf node taking a mutable borrow), the only reason we haven't hit any serious issues with it is because of inefficient folding passes which will always deduplicate any such shared nodes. Even if we were to switch to an arena, this would still hold, i.e. we wouldn't just use &'a T in the AST, but rather an wrapper (P<'a, T>?).
  • Immutability: P<T> disallows mutating its inner T (unless that contains an Unsafe interior, which won't happen in the AST), unlike ~T.
  • Efficiency: folding can reuse allocation space for P<T> and Vec<T>, the latter even when the input and output types differ (as it would be the case with arenas or an AST with type parameters to toggle macro support). Also, various algorithms have been changed from copying Gc<T> to using &T and iterators.
  • Maintainability: there is another reason I didn't just replace Gc<T> with ~T: P<T> provides a fixed interface (Deref, and_then and map) which can remain fully functional even if the implementation changes (using a special thread-local heap, for example). Moreover, switching to, e.g. P<'a, T> (for a contextual arena) is easy and mostly automated.

@eddyb eddyb changed the title [WIP] P P Apr 5, 2014
@eddyb eddyb changed the title P Replace @T in the AST with a smart pointer. Apr 10, 2014
@brson
Copy link
Contributor

brson commented Apr 11, 2014

Why is "sharing AST nodes is bad for the various analysis passes"?

The justification for a smart pointer here seems pretty weak to me. The biggest advantage, that it's immutable, is easily solved by immutably rooting the entire AST.

@brson
Copy link
Contributor

brson commented Apr 11, 2014

That said, I don't strongly oppose P.

@eddyb
Copy link
Member Author

eddyb commented Apr 11, 2014

@brson I need to expand on that - if you have sharing, you can trivially bypass, say, borrock, and create an unsafe program without the unsafe keyword. I can't demo this because of all the folding passes that deduplicate such nodes (macro expansion used to produce them).

@nikomatsakis
Copy link
Contributor

OK, it took me a while to understand your point about sharing, but I think @eddyb you are saying: if the AST is not a tree but a DAG, that'd be bad, and hence @T is bad, because it permits a DAG?

@nikomatsakis
Copy link
Contributor

If that is what you are saying, I certainly agree, we want the AST to be a tree, and I can see that it's beneficial to choose types that enforce that.

@nikomatsakis
Copy link
Contributor

(Though I think the current pass which deduplicates and assigns indices is not a terrible thing.)

@nikomatsakis
Copy link
Contributor

All that said, I think that overall I still prefer the (more radical) idea of changing the AST into an array and making pointers into newtype'd indices into the array. This does not address your concern about enforcing the tree requirements, nor does it necessarily provide the ability for the folder to reuse memory, but it does address a lot of other nice things:

  • AST map goes away
  • It's easy to see how we can pass "pointers" to an AST node wherever we want them
  • We don't need to store indices in the tree or ever do a renumbering, the numbering is implicit

Do you have any kind of measurements suggesting that folding is a significant part of our execution time? I don't recall ever getting that impression from profiles.

(Under your scheme here, one solution to the AST map would be to allocate the P<> nodes in an arena, in which case the AST map could store &'ast P<> pointers.)

@eddyb
Copy link
Member Author

eddyb commented Apr 15, 2014

Folding doesn't take that much time, maybe a couple seconds here (the simplistic in-place implementation halves that time).
In-place folding does, however, reduce temporary heap pressure (and we maybe get 100MB less in the trans peak just from not repeatedly allocating and freeing nodes during folding).

Yes, an arena would work for the AST map, though I would've preferred a "stable vector" of inlined items, alongside the crate. Those form the roots of the AST forest that we map, anything they point to will have the same lifetime.

@alexcrichton
Copy link
Member

Closing due to inactivity, but feel free to reopen with a rebase! This looks like it may have significant wins, and I'd love to see those wins in rustc! (trying to keep the queue size down for now).

@lilyball
Copy link
Contributor

Reopening at @eddyb's request.

@lilyball lilyball reopened this May 18, 2014
@alexcrichton
Copy link
Member

I've canceled the build for this. There appears to not be a lot of discussion on this PR about this, and I'm unaware of other discussion that has happened through other channels. This is a massive change to the AST, and I want to make sure that this it the right direction we want to go in before committing to this.

I don't want to block this longer, I just want to make sure that there's enough consensus that we want to go in this direction. I like the idea here, but I don't see any data supporting this change and why it should be made (although it sounds like in-place folding is more efficient).

@alexcrichton
Copy link
Member

Additionally, this seems pretty radical from what we're already doing, and perhaps different from what one may expect. Is there documentation in the code about the rationale for a design such as this? I looked in ptr.rs but found no documentation.

@emberian
Copy link
Member

To me, this is a simple refactor from @ to a non-@ type (Gc doesn't count, because it's still @). It uses P rather than hard-coding a specific type to make it easier to experiment.

@emberian
Copy link
Member

And this doesn't seem like that big of a change, mostly just fallout. Until we move to a different AST represention entirely (#7929 has some ideas), this seems pretty benign.

let map = self.map.borrow();
if map.len() > id as uint {
Some(*map.get(id as uint))
Some(unsafe{::std::mem::transmute(map.get(id as uint))})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just transmuting lifetimes? If so, could you use copy_lifetime(self, map.get(id as uint), and it would be good to have a comment about why this is safe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I don't remember seeing copy_lifetime before. It is safe because we never remove elements from the map nor do we mutate existing elements, i.e. the "stable vector" abstraction (I will put this in a comment).

bors added a commit that referenced this pull request Sep 12, 2014
Few visitors used the context passing feature and it can be easily emulated.
The added lifetime threading allows a visitor to keep safe references to AST
nodes it visits, making a non-owning ast_map design possible, for #13316.
@eddyb eddyb force-pushed the ast-ptr branch 2 times, most recently from 83403cb to b3f62ee Compare September 13, 2014 16:11
bors added a commit that referenced this pull request Sep 14, 2014
Replaces Gc<T> in the AST with a custom owned smart pointer, P<T>. Fixes #7929.

## Benefits
* **Identity** (affinity?): sharing AST nodes is bad for the various analysis passes (e.g. one could bypass borrowck with a shared `ExprAddrOf` node taking a mutable borrow), the only reason we haven't hit any serious issues with it is because of inefficient folding passes which will always deduplicate any such shared nodes. Even if we were to switch to an arena, this would still hold, i.e. we wouldn't just use `&'a T` in the AST, but rather an wrapper (`P<'a, T>`?).

* **Immutability**: `P<T>` disallows mutating its inner `T` (unless that contains an `Unsafe` interior, which won't happen in the AST), unlike `~T`.

* **Efficiency**: folding can reuse allocation space for `P<T>` and `Vec<T>`, the latter even when the input and output types differ (as it would be the case with arenas or an AST with type parameters to toggle macro support). Also, various algorithms have been changed from copying `Gc<T>` to using `&T` and iterators.

* **Maintainability**: there is another reason I didn't just replace `Gc<T>` with `~T`: `P<T>` provides a fixed interface (`Deref`, `and_then` and `map`) which can remain fully functional even if the implementation changes (using a special thread-local heap, for example). Moreover, switching to, e.g. `P<'a, T>` (for a contextual arena) is easy and mostly automated.
@bors bors merged commit 8577343 into rust-lang:master Sep 14, 2014
@eddyb eddyb deleted the ast-ptr branch September 14, 2014 11:15
@eddyb
Copy link
Member Author

eddyb commented Sep 14, 2014

There are only two uses of @T/Gc<T> left in the compiler (and both of them need to be Copy, so just using Rc is unacceptable).
One is in syntax::codemap::Span for expn_info (see #15594 - an integer instead of a pointer should work).
The other one is in rustc::middle::def::Def's DefUpvar variant. I believe that it just wraps DefLocal | DefArg | DefBinding and only the number of "wraps" matters, the extra information being redundant.
I will experiment with Def for now (EDIT: done, see #17259).
If you want to help remove Gc from Span or the hundreds of tests abusing it, please drop by in IRC and ping me (eddyb).


Thanks everyone for your help in making this happen, and @nikomatsakis for #16453 (which arrived just when I finally had time to work on Rust again).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

De-Gc the AST