Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement catch expressions #39849

Closed
nikomatsakis opened this issue Feb 15, 2017 · 4 comments
Closed

implement catch expressions #39849

nikomatsakis opened this issue Feb 15, 2017 · 4 comments
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Feb 15, 2017

This is a sub-issue of #31436 corresponding to the addition of catch expressions. What follows is a rough implementation plan. It consists of four PRs. I'll happily expand and add details upon request.

The high-level summary is that it would be nice to implement catch by having the HIR support the ability of a break to target arbitrary enclosing blocks. This would not be something exposed in the main language, which still permits only breaking out of loops (but we might choose to do so in the future, that'd require an RFC).

Here are some high-level notes on how to go about this. It's a "medium-sized" job, I think. It may be best to begin with some refactorings, as you will see.

Refactoring 1: Normalize how labeled and unlabeled breaks work.

  • Currently, the ExprBreak and ExprAgain variants of hir::Expr_ reference a Option<Label> indicating where we should break to
  • During HIR lowering, if the user gave an explicit label (e.g., 'a: loop { ... break 'a }), it will get translated into a specific node-id (the loop_id field of the Label struct)
  • This is done by consulting the name resolution tables in the lower_label() method of HIR lowering
    • Since we don't plan to change the surface language, we don't really have to change anything in name resolution, thankfully.
  • Annoyingly, if you have an anonymous break (e.g., loop { break; }), then we don't store any label or indication
    of where you break to. Instead, each pass must carry a stack of loops along with it. This is annoying and it would be nice to change it, so that every HIR break (and continue) already knows precisely where it is going.
  • In other words, we want to change the type of the break variant to ExprBreak(Label, Option<P<Expr>>) -- the "label", in particular, is no longer optional.
    • Probably we want to change the Label struct so that the "name" field is optional, for pretty-printing if nothing else
  • On the other hand, this refactoring may not be that important since most passes will still have to maintain a stack, but it seems good.
  • To do the refactoring:
    • lowering needs to grow a stack of "in-scope" loops, so that it can check which is on the top
    • the librustc_passes/loops.rs code currently does some sanity check and reports errors for malformed break (e.g., break outside of a loop)
    • that logic basically moves into lowering
    • we will then want to update the passes that consume ExprBreak and ExprAgain:
      • MIR construction, liveness, CFG construction
      • the changes are small, since you basically just want to take the Some
        path that exists (i.e., the code is there to handle the None that we are removing
      • in the case of liveness, the loop_scope stack can just be removed entirely, we won't be using it anymore

Why do this? This refactoring is an improvement because it consolidates the knowledge of where a break goes into one pass: HIR lowering. Right now, that knowledge is repeated in several passes, each of which must maintain a stack of loops. It will still be necessary to keep that stack of loops, but we'll be using it differently. In today's code, we sometimes search the stack for a loop with a given id (labeled break) and sometimes just take the top of the stack (unlabeled break). After this refactoring, we are always searching for the entry with a given id. This is important, because in the next section we are going to add blocks into these stacks, and so if we wanted to keep unlabeled breaks as they are, we'd have to ensure that when we check the top of the stack, we don't find some random block, but only the top-most loop.

Next step: add catch into AST

The goal here is just to do the plumbing of parsing catch and getting it into the AST.

  • Extend the parser to support catch
  • Add a feature-gate
  • Produce corresponding AST nodes
  • In HIR lowering, you can keep a stack of in-scope catch nodes
    • for the first PR, if ? is used inside of a catch node, I would just abort with bug!("unimplemented: ? in catch")
    • but we'll use the stack for more stuff later :P

Next step: allow HIR break to target blocks

The goal here is to make it possible for breaks to target blocks in HIR. We won't need to change the data structures very much for this to work. Probably it's a good idea to rename the field loop_id in hir::Label to something like node_id or target_id, since it won't always be naming a loop anymore.

In principle, this change can be landed as an initial PR. Annoyingly, though, since we are not changing the surface syntax, there isn't a good way to test this code without also adding support for catch, so we may want to do the two changes together in one PR. Though I'd also be ok to land them separately, or perhaps add some sort of internal testing attributes that enable anonymous break to break out of the innermost block (#[rustc_break_from_block]).

Anyway, we basically don't need to change HIR lowering at all. All the existing breaks are still valid breaks. The only change is that (in the next PR...) HIR lowering will produce new breaks that target blocks and not just loops. So we need to prepare the passes that consume breaks to be prepared for that. These are the same set of passes we saw in the first refactoring: MIR construction, liveness, and CFG construction. As we saw in that first refactoring, each of these passes maintains an internal stack of loops which they already use to search and find the appropriate place that a break should branch to. So the gist of the change is to extend these stacks to include breaks too. Here are some details on each pass:

  • liveness:
    • the with_loop_nodes helper is the one that sets up the targets for breaks and so forth. At present it pushes onto a stack (self.loop_scope), but that should have been removed in the first refactoring. It also stores into a map (self.break_ln) with the "place that a break with a given target should go to" (ln stands for "liveness node").
    • in the propagate_through_block(blk, succ_ln) method, we want to therefore record self.break_ln.insert(node_id, succ_ln), as succ_ln is the liveness node for the successor of the block
  • CFG construction:
    • CFG construction keeps a vec of LoopScope entries rather than a map like liveness. We'll need to push onto this for blocks. The relevant function is fn block(&mut self, blk: &hir::Block, pred: CFGIndex) -> CFGIndex; annoyingly, the way that the CFG construction works is that you are given the predecessor node (pred) and you return the successor. This means that whenever we process a block, we'll have to create a dummy node (self.add_dummy_node(&[])) that we can add into the LoopScope vector to use as the target of blocks. Once we're finished processing the contents of the block, it can branch to this dummy node and then we will return the dummy node.
    • (See the efficiency note below.)
  • MIR building:
    • MIR building works similarly; the helper that pushes on the stack is in_loop_scope()
    • one difference here is that we have to have the ExprBreak that targets a block generate an appopriate value; this should work the same way that it does for let x = loop { break 22 };.

Efficiency note. We may not want to create a bunch of dummy basic blocks for every block. Unfortunately, because MIR and CFG construction works forward, this is sort of hard to avoid. To fix this, I think we will want to extend the HIR for a block to include a boolean indicating whether it is targeted by any breaks, so we can only add the dummy node for those cases where it is needed. This will also mean you have to edit the HAIR, which is a kind of simple, HIR-like representation that exists briefly as we lower to MIR (src/librustc_mir/hair/). This change should be fairly straightforward -- in particular, until the next step, NO blocks can be targeted by breaks, so we can just set this value to always be false.

The final change: adding catch to the HIR

With those pieces in place, we can finally put all the various bits together. Basically when building the HIR for catch { ... }, we would do the following:

  • create a block that is eligible to be the target of a break (the boolean is true)
  • push the id for this block onto the internal stack of catch nodes
  • when we see ?, we would construct an ExprBreak targeting this block (with a value), instead of a ExprReturn

That's "it"! Then we can write some tests!

@nikomatsakis nikomatsakis added E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-enhancement Category: An issue proposing an enhancement or a PR with one. labels Feb 15, 2017
@nikomatsakis
Copy link
Contributor Author

@cramertj has expressed interest in making these changes.

@nikomatsakis
Copy link
Contributor Author

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Feb 22, 2017
…tsakis

Normalize labeled and unlabeled breaks

Part of rust-lang#39849.
bors added a commit that referenced this issue Feb 24, 2017
Normalize labeled and unlabeled breaks

Part of #39849.
eddyb added a commit to eddyb/rust that referenced this issue Feb 25, 2017
…tsakis

Normalize labeled and unlabeled breaks

Part of rust-lang#39849.
eddyb added a commit to eddyb/rust that referenced this issue Feb 25, 2017
…tsakis

Normalize labeled and unlabeled breaks

Part of rust-lang#39849.
frewsxcv added a commit to frewsxcv/rust that referenced this issue Mar 8, 2017
frewsxcv added a commit to frewsxcv/rust that referenced this issue Mar 9, 2017
alexcrichton added a commit to alexcrichton/rust that referenced this issue Mar 10, 2017
alexcrichton added a commit to alexcrichton/rust that referenced this issue Mar 10, 2017
bors added a commit that referenced this issue Mar 14, 2017
frewsxcv added a commit to frewsxcv/rust that referenced this issue Mar 21, 2017
…sakis

Implement `?` in catch expressions

Builds on rust-lang#39921. Final part of rust-lang#39849.

r? @nikomatsakis
@cramertj
Copy link
Member

I believe this can be closed now that #40229 has landed.

@nikomatsakis
Copy link
Contributor Author

Agreed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

2 participants