Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing RefCells and Cells from FunctionContext in trans #25332

Closed
wants to merge 11 commits into from

Conversation

Projects
None yet
7 participants
@pythonesque
Copy link
Contributor

pythonesque commented May 12, 2015

The FunctionContext is now passed around mutably pretty much everywhere.

The RefCell motherlode in trans is in the local crate context, but I think this part can stand on its own. I did check to see if I could make the local crate context mutable really easily, but after a few tests I think it would require enough additional work that it probably isn't worth delaying this PR submission.

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented May 12, 2015

r? @nrc

(rust_highfive has picked a reviewer for you, use r? to override)

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented May 12, 2015

If Block is two pointers, why change Block to &mut Block everywhere?
Also, 'r is not descriptive enough. I would suggest 'fcx but that would exacerbate the lifetime bloat issue we have.
Maybe it's time to move all those free functions to methods on Block.

@pythonesque

This comment has been minimized.

Copy link
Contributor Author

pythonesque commented May 12, 2015

@eddyb

If Block is two pointers, why change Block to &mut Block everywhere?

Since one of them is mutable, the pair would move every time, so we'd lose the function pointer unless we carefully threaded it back through every return value (I believe, anyway). We could work around that by constructing a fresh pair each time, but then you're really not gaining anything compared to just passing two parameters. I did consider trying it anyway though.

Also, 'r is not descriptive enough. I would suggest 'fcx but that would exacerbate the lifetime bloat issue we have.

Yeah, like I commented in the other PR, I think that in this case it may have been better to just get rid of Block entirely. Separating the basic blocks from the function context makes sense conceptually, and we could avoid the nomenclature issue since the lifetime we're talking about would be elided. Then you get "parameter bloat", though.

Maybe it's time to move all those free functions to methods on Block.

Making everything a method on Block will require us to duplicate all the function signatures in useless traits, though. I wish we could still do inherent impls outside the defining module :(

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented May 12, 2015

I wish we could still do inherent impls outside the defining module :(

Well, your wish came true. A few months ago, actually :).

As for creating a fresh pair every time, you could use a reborrow method to do this, though if you're passing bcx everywhere, that would get annoying quickly. A coercion would be nicer.

@pythonesque

This comment has been minimized.

Copy link
Contributor Author

pythonesque commented May 12, 2015

...Whoa, how did I miss that? Okay then, that makes the method solution a lot more attractive.

As for creating a fresh pair every time, you could use a reborrow method to do this, though if you're passing bcx everywhere, that would get annoying quickly. A coercion would be nicer.

Doesn't reborrowing only occur if the function takes a reference in the first place (in which case you wouldn't be passing by value)? Or am I misunderstanding what you're saying?

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented May 12, 2015

@pythonesque Yes, but you can reproduce the semantics of reborrowing with a method that takes &mut Block and returns a new Block.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented May 13, 2015

☔️ The latest upstream changes (presumably #24619) made this pull request unmergeable. Please resolve the merge conflicts.

@nrc

This comment has been minimized.

Copy link
Member

nrc commented May 14, 2015

@pythonesque do you intend to move some of these functions to methods on Block? Should I hold back on reviewing until then?

I don't mind passing &mut Block, seems more convenient than passng pointers or reborrowing.

@pythonesque

This comment has been minimized.

Copy link
Contributor Author

pythonesque commented May 17, 2015

@nrc I actually started doing switching over to making everything methods on Block, and I'm not entirely sure it's that much nicer in practice. So I'd appreciate a review before that in case it's good enough as is (of course, I'll also need to fix the recent breakage, but I don't think it looks like that affects too much).

fn trans<'r, 'blk>(&self,
&mut Block { bl, ref mut fcx }: &mut Block<'r, 'blk, 'tcx>)
-> OptResult<'blk> {
let mut bcx = &mut bl.with(fcx);

This comment has been minimized.

@nrc

nrc May 20, 2015

Member

This is a bit weird - especially to newcomers to trans - we take a Block as parameter, then create a new one, using only the components of the passed Block, then later in the function we use the fields of the same Block again. Either with needs a better name indicating what is going on, or we need a comment here.

This comment has been minimized.

@pythonesque

pythonesque May 20, 2015

Author Contributor

This pattern (well, similar) was actually already used all over the place prior to this PR. But I'll take a closer look at this particular one.

I could rename it to with_fcx, if that helps. Note that the reason we do this is to reborrow it; otherwise borrowck yells at me really, really hard.

{
//let Block { bl, ref mut fcx } = *bcx;

This comment has been minimized.

@nrc

nrc May 20, 2015

Member

should this be removed?

This comment has been minimized.

@pythonesque

pythonesque May 20, 2015

Author Contributor

Whoops, yes. Good catch!

@@ -547,7 +552,7 @@ impl<'a, 'tcx> FunctionContext<'a, 'tcx> {
// code. Each basic block we generate is attached to a function, typically
// with many basic blocks per function. All the basic blocks attached to a
// function are organized as a directed graph.
pub struct BlockS<'blk, 'tcx: 'blk> {
pub struct BlockS {

This comment has been minimized.

@nrc

nrc May 20, 2015

Member

It's pretty bad that we export both Block and BlockS types from this module. Previously we only used Block and it was a typedef. If we need to use both publicly, then we should change BlockS to have a more descriptive name - BlockGraph or BlockSet or something.

This comment has been minimized.

@pythonesque

pythonesque May 20, 2015

Author Contributor

I think we may not need it to be public, actually. I'll double check.

Oh wait, from the module. Yeah, I'll rename it.

This comment has been minimized.

@eddyb

eddyb May 20, 2015

Member

BlockS actually means BlockStruct and it was @nikomatsakis' suggestion.

This comment has been minimized.

@pythonesque

pythonesque May 20, 2015

Author Contributor

@eddyb Yeah, I figured... perhaps something like BlockData would be better? Or, rename it to Block and call the thing we're passing around everywhere BlockContext or something (we even call it bcx everywhere).


// The function context for the function to which this block is
// attached.
pub fcx: &'r mut FunctionContext<'blk, 'tcx>,

This comment has been minimized.

@nrc

nrc May 20, 2015

Member

Since every block in a BlockS is in the same function, why do they each need their own reference to a function context? Seems like leaving it in BlockS is better?

This comment has been minimized.

@pythonesque

pythonesque May 20, 2015

Author Contributor

BlockS is the one that's being allocated in the arena (which is what we were doing before) which stored an extra fcx each time, which didn't make sense (IMO). Block is really just a temporary structure useful for passing along a function context and block reference. As you can see, the fcx is &mut, and we nearly always have a &mut Block, so we never really have more than one at a time anyway.

@nrc

This comment has been minimized.

Copy link
Member

nrc commented May 20, 2015

Wow, that is quite a PR! So, I think it makes sense and I'm in favour of landing. It certainly seems like a win in terms of perf. It still seems a bit messier than I would like (the design of Block, some how, rather than the quality of the code or anything like that), but it doesn't seem that there is a great solution under it all. I'd be keen to know what the Block + methods approach looks like, but I'd be happy to see this land.

r=me with the Block/BlockS and with naming issue addressed, unless the inherant impl thing works out.

pythonesque added some commits May 20, 2015

Addressing comments from review in librustc_trans.
Specifically, changing Block -> BlockContext, BlockS -> Block, and
Block::with to Block::with_fcx.
@pythonesque

This comment has been minimized.

Copy link
Contributor Author

pythonesque commented May 20, 2015

@nrc Addressed your comments and synced to latest master.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented May 20, 2015

Maybe we can refactor BlockS/BlockData/Block away, it's a bit odd that we allocate in an arena when LLVM already has to allocate the block itself. Here's the list of fields, apart from the LLVM basic block:

     terminated: Cell<bool>,
     unreachable: Cell<bool>,

     /// Is this block part of a landing pad?
     is_lpad: bool,

     /// AST node-id associated with this block, if any. Used debugging purposes only.
     opt_node_id: Option<ast::NodeId>,

terminated and unreachable should be easy to replace with calls to LLVMGetBasicBlockTerminator - there might be a performance cost, due to the lack of cross-language inlining.
We could have our own getters for them, written in C++ so there's only the overhead of a single call.

is_lpad should also be accessible through the LLVM API, as it's determined by the presence of a LandingPadInst.

opt_node_id is not even used in debug info, only for a single debug!, in trans::base::invoke. Which nowadays has a debug_loc: DebugLoc argument, so it's likely entirely redundant.

Verdict: we should be able to pass around only the BasicBlockRef and ditch the arena altogether.
Where we previously needed 3 lifetimes ('tcx, 'fcx and 'blk where fcx: &'fcx FunctionContext<'tcx, 'blk>), two would work: Block<'tcx, 'blk>.
Or Block<'tcx, 'fcx> if that's preferred.

@pythonesque

This comment has been minimized.

Copy link
Contributor Author

pythonesque commented May 20, 2015

@eddyb Yes, I also was wondering about the arena. I think I eventually concluded that the references are held onto somewhere in the drop glue code and that trying to refactor that as well was something I could look at after the initial version landed.

I'm not 100% sure that all those fields are just caches of the LLVM ones; I would have to look into it. Either way, it seems to me that so long as we're relying on the information living in LLVM, we would still need some sort of lifetime attached (even if only as PhantomData) for safety purposes (I realize we're not doing it directly in the BasicBlockRef, but we probably should be).

I think I agree that you could get away with not having an explicit lifetime for the &mut outside of Block in general (though I'm not completely certain--borrowck can be really finicky in situations like this). But looking ahead, I think we may have to end up adding another lifetime anyway when we deal with the local crate context, which we do want to be &mut and which I believe needs to have a disjoint lifetime from the FunctionContext's reference. (I started looking into whether I could do it without adding any more lifetimes and almost immediately ran into issues).

In general: I agree there is lots of refactoring work we can do on top of this change, but I think it will be much easier to figure out what we can and can't get away with after we are passing around &muts everywhere, since they are the limiting factor. My experience so far suggests that it can be very hard to predict what difficulties this sort of refactoring is going to hit; as long as we're using RefCells everywhere, local reasoning about what is / should be aliased has a tendency to miss corner cases.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented May 20, 2015

☔️ The latest upstream changes (presumably #25595) made this pull request unmergeable. Please resolve the merge conflicts.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented May 20, 2015

☔️ The latest upstream changes (presumably #25645) made this pull request unmergeable. Please resolve the merge conflicts.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented May 20, 2015

I started a discuss thread to discuss this approach, since it seems to have some pros and cons:

https://internals.rust-lang.org/t/prs-removing-refcells-longer-term-plans/2099

@bors

This comment has been minimized.

Copy link
Contributor

bors commented May 23, 2015

☔️ The latest upstream changes (presumably #25653) made this pull request unmergeable. Please resolve the merge conflicts.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jul 5, 2015

Closing due to inactivity, but feel free to reopen with a rebase!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.