implement `catch` expressions #39849

nikomatsakis · 2017-02-15T16:22:47Z

This is a sub-issue of #31436 corresponding to the addition of catch expressions. What follows is a rough implementation plan. It consists of four PRs. I'll happily expand and add details upon request.

The high-level summary is that it would be nice to implement catch by having the HIR support the ability of a break to target arbitrary enclosing blocks. This would not be something exposed in the main language, which still permits only breaking out of loops (but we might choose to do so in the future, that'd require an RFC).

Here are some high-level notes on how to go about this. It's a "medium-sized" job, I think. It may be best to begin with some refactorings, as you will see.

Refactoring 1: Normalize how labeled and unlabeled breaks work.

Currently, the ExprBreak and ExprAgain variants of hir::Expr_ reference a Option<Label> indicating where we should break to
During HIR lowering, if the user gave an explicit label (e.g., 'a: loop { ... break 'a }), it will get translated into a specific node-id (the loop_id field of the Label struct)
This is done by consulting the name resolution tables in the lower_label() method of HIR lowering
- Since we don't plan to change the surface language, we don't really have to change anything in name resolution, thankfully.
Annoyingly, if you have an anonymous break (e.g., loop { break; }), then we don't store any label or indication
of where you break to. Instead, each pass must carry a stack of loops along with it. This is annoying and it would be nice to change it, so that every HIR break (and continue) already knows precisely where it is going.
In other words, we want to change the type of the break variant to ExprBreak(Label, Option<P<Expr>>) -- the "label", in particular, is no longer optional.
- Probably we want to change the Label struct so that the "name" field is optional, for pretty-printing if nothing else
On the other hand, this refactoring may not be that important since most passes will still have to maintain a stack, but it seems good.
To do the refactoring:
- lowering needs to grow a stack of "in-scope" loops, so that it can check which is on the top
- the librustc_passes/loops.rs code currently does some sanity check and reports errors for malformed break (e.g., break outside of a loop)
- that logic basically moves into lowering
- we will then want to update the passes that consume ExprBreak and ExprAgain:
  - MIR construction, liveness, CFG construction
  - the changes are small, since you basically just want to take the Some
    path that exists (i.e., the code is there to handle the None that we are removing
  - in the case of liveness, the loop_scope stack can just be removed entirely, we won't be using it anymore

Why do this? This refactoring is an improvement because it consolidates the knowledge of where a break goes into one pass: HIR lowering. Right now, that knowledge is repeated in several passes, each of which must maintain a stack of loops. It will still be necessary to keep that stack of loops, but we'll be using it differently. In today's code, we sometimes search the stack for a loop with a given id (labeled break) and sometimes just take the top of the stack (unlabeled break). After this refactoring, we are always searching for the entry with a given id. This is important, because in the next section we are going to add blocks into these stacks, and so if we wanted to keep unlabeled breaks as they are, we'd have to ensure that when we check the top of the stack, we don't find some random block, but only the top-most loop.

Next step: add `catch` into AST

The goal here is just to do the plumbing of parsing catch and getting it into the AST.

Extend the parser to support catch
Add a feature-gate
Produce corresponding AST nodes
In HIR lowering, you can keep a stack of in-scope catch nodes
- for the first PR, if ? is used inside of a catch node, I would just abort with bug!("unimplemented: ? in catch")
- but we'll use the stack for more stuff later :P

Next step: allow HIR `break` to target blocks

The goal here is to make it possible for breaks to target blocks in HIR. We won't need to change the data structures very much for this to work. Probably it's a good idea to rename the field loop_id in hir::Label to something like node_id or target_id, since it won't always be naming a loop anymore.

In principle, this change can be landed as an initial PR. Annoyingly, though, since we are not changing the surface syntax, there isn't a good way to test this code without also adding support for catch, so we may want to do the two changes together in one PR. Though I'd also be ok to land them separately, or perhaps add some sort of internal testing attributes that enable anonymous break to break out of the innermost block (#[rustc_break_from_block]).

Anyway, we basically don't need to change HIR lowering at all. All the existing breaks are still valid breaks. The only change is that (in the next PR...) HIR lowering will produce new breaks that target blocks and not just loops. So we need to prepare the passes that consume breaks to be prepared for that. These are the same set of passes we saw in the first refactoring: MIR construction, liveness, and CFG construction. As we saw in that first refactoring, each of these passes maintains an internal stack of loops which they already use to search and find the appropriate place that a break should branch to. So the gist of the change is to extend these stacks to include breaks too. Here are some details on each pass:

liveness:
- the with_loop_nodes helper is the one that sets up the targets for breaks and so forth. At present it pushes onto a stack (self.loop_scope), but that should have been removed in the first refactoring. It also stores into a map (self.break_ln) with the "place that a break with a given target should go to" (ln stands for "liveness node").
- in the propagate_through_block(blk, succ_ln) method, we want to therefore record self.break_ln.insert(node_id, succ_ln), as succ_ln is the liveness node for the successor of the block
CFG construction:
- CFG construction keeps a vec of LoopScope entries rather than a map like liveness. We'll need to push onto this for blocks. The relevant function is fn block(&mut self, blk: &hir::Block, pred: CFGIndex) -> CFGIndex; annoyingly, the way that the CFG construction works is that you are given the predecessor node (pred) and you return the successor. This means that whenever we process a block, we'll have to create a dummy node (self.add_dummy_node(&[])) that we can add into the LoopScope vector to use as the target of blocks. Once we're finished processing the contents of the block, it can branch to this dummy node and then we will return the dummy node.
- (See the efficiency note below.)
MIR building:
- MIR building works similarly; the helper that pushes on the stack is in_loop_scope()
- one difference here is that we have to have the ExprBreak that targets a block generate an appopriate value; this should work the same way that it does for let x = loop { break 22 };.

Efficiency note. We may not want to create a bunch of dummy basic blocks for every block. Unfortunately, because MIR and CFG construction works forward, this is sort of hard to avoid. To fix this, I think we will want to extend the HIR for a block to include a boolean indicating whether it is targeted by any breaks, so we can only add the dummy node for those cases where it is needed. This will also mean you have to edit the HAIR, which is a kind of simple, HIR-like representation that exists briefly as we lower to MIR (src/librustc_mir/hair/). This change should be fairly straightforward -- in particular, until the next step, NO blocks can be targeted by breaks, so we can just set this value to always be false.

The final change: adding `catch` to the HIR

With those pieces in place, we can finally put all the various bits together. Basically when building the HIR for catch { ... }, we would do the following:

create a block that is eligible to be the target of a break (the boolean is true)
push the id for this block onto the internal stack of catch nodes
when we see ?, we would construct an ExprBreak targeting this block (with a value), instead of a ExprReturn

That's "it"! Then we can write some tests!

The text was updated successfully, but these errors were encountered:

nikomatsakis · 2017-02-15T16:23:21Z

@cramertj has expressed interest in making these changes.

nikomatsakis · 2017-02-17T20:57:36Z

Question that @cramertj encountered: https://internals.rust-lang.org/t/grammatical-ambiguity-around-catch-blocks/4807

…tsakis Normalize labeled and unlabeled breaks Part of rust-lang#39849.

Normalize labeled and unlabeled breaks Part of #39849.

…tsakis Normalize labeled and unlabeled breaks Part of rust-lang#39849.

…tsakis Add catch {} to AST Part of rust-lang#39849. Builds on rust-lang#39864.

Add catch {} to AST Part of #39849. Builds on #39864.

@nikomatsakis

…sakis Implement `?` in catch expressions Builds on rust-lang#39921. Final part of rust-lang#39849. r? @nikomatsakis

cramertj · 2017-03-29T19:22:50Z

I believe this can be closed now that #40229 has landed.

nikomatsakis · 2017-04-11T17:58:26Z

Agreed.

nikomatsakis mentioned this issue Feb 15, 2017

Tracking issue for ? operator and try blocks (RFC 243, question_mark & try_blocks features) #31436

Open

12 tasks

cramertj mentioned this issue Feb 15, 2017

Normalize labeled and unlabeled breaks #39864

Merged

cramertj mentioned this issue Feb 17, 2017

Add catch {} to AST #39921

Merged

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Feb 22, 2017

Rollup merge of rust-lang#39864 - cramertj:normalize-breaks, r=nikoma…

3952bb0

…tsakis Normalize labeled and unlabeled breaks Part of rust-lang#39849.

bors added a commit that referenced this issue Feb 24, 2017

Auto merge of #39864 - cramertj:normalize-breaks, r=nikomatsakis

d84b197

Normalize labeled and unlabeled breaks Part of #39849.

eddyb added a commit to eddyb/rust that referenced this issue Feb 25, 2017

Rollup merge of rust-lang#39864 - cramertj:normalize-breaks, r=nikoma…

2cf3aa7

…tsakis Normalize labeled and unlabeled breaks Part of rust-lang#39849.

eddyb added a commit to eddyb/rust that referenced this issue Feb 25, 2017

Rollup merge of rust-lang#39864 - cramertj:normalize-breaks, r=nikoma…

ad9079b

…tsakis Normalize labeled and unlabeled breaks Part of rust-lang#39849.

cramertj mentioned this issue Mar 3, 2017

Implement ? in catch expressions #40229

Merged

frewsxcv added a commit to frewsxcv/rust that referenced this issue Mar 8, 2017

Rollup merge of rust-lang#39921 - cramertj:add-catch-to-ast, r=nikoma…

cd9e0e7

…tsakis Add catch {} to AST Part of rust-lang#39849. Builds on rust-lang#39864.

frewsxcv added a commit to frewsxcv/rust that referenced this issue Mar 9, 2017

Rollup merge of rust-lang#39921 - cramertj:add-catch-to-ast, r=nikoma…

ded49bc

…tsakis Add catch {} to AST Part of rust-lang#39849. Builds on rust-lang#39864.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Mar 10, 2017

Rollup merge of rust-lang#39921 - cramertj:add-catch-to-ast, r=nikoma…

4b7c16a

…tsakis Add catch {} to AST Part of rust-lang#39849. Builds on rust-lang#39864.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Mar 10, 2017

Rollup merge of rust-lang#39921 - cramertj:add-catch-to-ast, r=nikoma…

e2d3770

…tsakis Add catch {} to AST Part of rust-lang#39849. Builds on rust-lang#39864.

bors added a commit that referenced this issue Mar 14, 2017

Auto merge of #39921 - cramertj:add-catch-to-ast, r=nikomatsakis

6f10e2f

Add catch {} to AST Part of #39849. Builds on #39864.

harpocrates mentioned this issue Mar 15, 2017

Add catch expression harpocrates/language-rust#15

Closed

3 tasks

frewsxcv added a commit to frewsxcv/rust that referenced this issue Mar 21, 2017

Rollup merge of rust-lang#40229 - cramertj:break-to-blocks, r=nikomat…

fcc2d25

…sakis Implement `?` in catch expressions Builds on rust-lang#39921. Final part of rust-lang#39849. r? @nikomatsakis

nikomatsakis closed this as completed Apr 11, 2017

scottmcm mentioned this issue Apr 20, 2017

catch blocks are not Ok-wrapping their value #41414

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

implement `catch` expressions #39849

implement `catch` expressions #39849

nikomatsakis commented Feb 15, 2017 •

edited

Loading

nikomatsakis commented Feb 15, 2017

nikomatsakis commented Feb 17, 2017

cramertj commented Mar 29, 2017

nikomatsakis commented Apr 11, 2017

implement catch expressions #39849

implement catch expressions #39849

Comments

nikomatsakis commented Feb 15, 2017 • edited Loading

Refactoring 1: Normalize how labeled and unlabeled breaks work.

Next step: add catch into AST

Next step: allow HIR break to target blocks

The final change: adding catch to the HIR

nikomatsakis commented Feb 15, 2017

nikomatsakis commented Feb 17, 2017

cramertj commented Mar 29, 2017

nikomatsakis commented Apr 11, 2017

implement `catch` expressions #39849

implement `catch` expressions #39849

nikomatsakis commented Feb 15, 2017 •

edited

Loading

Next step: add `catch` into AST

Next step: allow HIR `break` to target blocks

The final change: adding `catch` to the HIR