New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for RFC 2046, label-break-value #48594

Open
Centril opened this Issue Feb 27, 2018 · 55 comments

Comments

Projects
None yet
@Centril
Contributor

Centril commented Feb 27, 2018

This is a tracking issue for RFC 2046 (rust-lang/rfcs#2046).

Steps:

Unresolved questions:

@SoniEx2

This comment has been minimized.

Show comment
Hide comment
@SoniEx2

SoniEx2 Feb 28, 2018

Using "return" would have interesting implications for labeled ? (tryop questionmark operator thingy).

SoniEx2 commented Feb 28, 2018

Using "return" would have interesting implications for labeled ? (tryop questionmark operator thingy).

@est31

This comment has been minimized.

Show comment
Hide comment
@est31

est31 Mar 1, 2018

Contributor

Use return as the keyword instead of break?

@mark-i-m and @joshtriplett have already spoken out against return, but I'll join in given that it is still apparently an unresolved question.

break (and continue) are only usable with a loop.
If you can break something, you can continue it. (I don't think there's an obvious syntax to pick for continue on a block.)

In C, C++, Java, C#, JavaScript and probably more languages you are usually breaking a switch statement in order to prevent fall-through. Rust has solved this much nicer with | in patterns but people coming from those languages won't really see break as something only for for loops. Especially as Java and JavaScript expose the feature via break as well and not return.

The "rules to remember" argument works into the other direction really well however. From what I can tell it is a commonality of the mentioned languages as well as Rust that return only applies to functions and nothing else. So if you see a return, you know that a function is being left.

Labeling a block doesn't cause errors on existing unlabeled breaks

First, I think this happens very rarely as the labeled break feature is admittedly not something that will be used 10 times per 1000 lines. After all, it will only apply to unlabeled breaks that would cross the boundary of the block, not unlabeled breaks inside the block. Second, users of Rust are accustomed to complaints / error messages by the compiler, they will happily fix them! Third (this is the strongest point I think), if instead of labeling a block you wrap it into a loop, you already need to watch out for unlabeled breaks and there is no error message that conveniently lists the break statements, you've got to hunt for them yourself :).

Contributor

est31 commented Mar 1, 2018

Use return as the keyword instead of break?

@mark-i-m and @joshtriplett have already spoken out against return, but I'll join in given that it is still apparently an unresolved question.

break (and continue) are only usable with a loop.
If you can break something, you can continue it. (I don't think there's an obvious syntax to pick for continue on a block.)

In C, C++, Java, C#, JavaScript and probably more languages you are usually breaking a switch statement in order to prevent fall-through. Rust has solved this much nicer with | in patterns but people coming from those languages won't really see break as something only for for loops. Especially as Java and JavaScript expose the feature via break as well and not return.

The "rules to remember" argument works into the other direction really well however. From what I can tell it is a commonality of the mentioned languages as well as Rust that return only applies to functions and nothing else. So if you see a return, you know that a function is being left.

Labeling a block doesn't cause errors on existing unlabeled breaks

First, I think this happens very rarely as the labeled break feature is admittedly not something that will be used 10 times per 1000 lines. After all, it will only apply to unlabeled breaks that would cross the boundary of the block, not unlabeled breaks inside the block. Second, users of Rust are accustomed to complaints / error messages by the compiler, they will happily fix them! Third (this is the strongest point I think), if instead of labeling a block you wrap it into a loop, you already need to watch out for unlabeled breaks and there is no error message that conveniently lists the break statements, you've got to hunt for them yourself :).

@nikomatsakis

This comment has been minimized.

Show comment
Hide comment
@nikomatsakis

nikomatsakis Mar 1, 2018

Contributor

Especially as Java and JavaScript expose the feature via break as well and not return.

This to me is the killer point. break from blocks is a thing in many languages. return from blocks...not so much.

Contributor

nikomatsakis commented Mar 1, 2018

Especially as Java and JavaScript expose the feature via break as well and not return.

This to me is the killer point. break from blocks is a thing in many languages. return from blocks...not so much.

@Centril

This comment has been minimized.

Show comment
Hide comment
@Centril

Centril Mar 1, 2018

Contributor

Personally, I share @joshtriplett's view on using break instead of return, but it seemed to me like the discussion hadn't been resolved on the RFC... If you believe the question is resolved in the lang team, tick the box with a note =)

Contributor

Centril commented Mar 1, 2018

Personally, I share @joshtriplett's view on using break instead of return, but it seemed to me like the discussion hadn't been resolved on the RFC... If you believe the question is resolved in the lang team, tick the box with a note =)

@est31

This comment has been minimized.

Show comment
Hide comment
@est31

est31 Apr 15, 2018

Contributor

Just saying that I'm working on this. Don't need mentor instructions. Just to not duplicate any efforts. Expect a PR soon.

Contributor

est31 commented Apr 15, 2018

Just saying that I'm working on this. Don't need mentor instructions. Just to not duplicate any efforts. Expect a PR soon.

@scottmcm

This comment has been minimized.

Show comment
Hide comment
@scottmcm

scottmcm Apr 15, 2018

Member

I'm still in favour of return over break, but I can agree to disagree here. Question resolved.

Member

scottmcm commented Apr 15, 2018

I'm still in favour of return over break, but I can agree to disagree here. Question resolved.

bors added a commit that referenced this issue May 16, 2018

Auto merge of #50045 - est31:label_break_value, r=eddyb
Implement label break value (RFC 2046)

Implement label-break-value (#48594).
@topecongiro

This comment has been minimized.

Show comment
Hide comment
@topecongiro

topecongiro May 21, 2018

Contributor

Currently (with rustc 1.28.0-nightly (a1d4a9503 2018-05-20)) rustc does not allow unsafe on a labeled block. Is this expected?

Contributor

topecongiro commented May 21, 2018

Currently (with rustc 1.28.0-nightly (a1d4a9503 2018-05-20)) rustc does not allow unsafe on a labeled block. Is this expected?

@scottmcm

This comment has been minimized.

Show comment
Hide comment
@scottmcm

scottmcm May 21, 2018

Member

@topecongiro Yes, I believe it's intentional that this is currently allowed only on plain blocks. It might change in future, but given that this is such a low-level and unusual feature, I'm inclined towards that being a feature rather than a restriction. (On the extreme, I certainly don't want else 'a: {.)

Member

scottmcm commented May 21, 2018

@topecongiro Yes, I believe it's intentional that this is currently allowed only on plain blocks. It might change in future, but given that this is such a low-level and unusual feature, I'm inclined towards that being a feature rather than a restriction. (On the extreme, I certainly don't want else 'a: {.)

@mark-i-m

This comment has been minimized.

Show comment
Hide comment
@mark-i-m

mark-i-m May 21, 2018

Contributor

Definitely agree. Unsafe + unusual control flow sounds like something to discourage.

In a pinch, though, you could use:

'a: {
unsafe {...}
}

Right?

Contributor

mark-i-m commented May 21, 2018

Definitely agree. Unsafe + unusual control flow sounds like something to discourage.

In a pinch, though, you could use:

'a: {
unsafe {...}
}

Right?

@SoniEx2

This comment has been minimized.

Show comment
Hide comment
@SoniEx2

SoniEx2 May 22, 2018

Actually, altho else does create a new lexical scope, it's not a block. The whole if-else is a block (kinda). So no, you wouldn't get else 'a: { you'd get 'a: if ... else {.

SoniEx2 commented May 22, 2018

Actually, altho else does create a new lexical scope, it's not a block. The whole if-else is a block (kinda). So no, you wouldn't get else 'a: { you'd get 'a: if ... else {.

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb May 22, 2018

Member

else contains a block (expression). there is no "new lexical scope" without blocks.
An even worse surface syntax position than else would be 'a: while foo 'b: {...}.
(interestingly enough, continue 'a is break 'b, we might want to rely on that at least internally)

Member

eddyb commented May 22, 2018

else contains a block (expression). there is no "new lexical scope" without blocks.
An even worse surface syntax position than else would be 'a: while foo 'b: {...}.
(interestingly enough, continue 'a is break 'b, we might want to rely on that at least internally)

@Centril

This comment has been minimized.

Show comment
Hide comment
@Centril

Centril May 22, 2018

Contributor

(interestingly enough, continue 'a is break 'b, we might want to rely on that at least internally)

That's a great observation!

Contributor

Centril commented May 22, 2018

(interestingly enough, continue 'a is break 'b, we might want to rely on that at least internally)

That's a great observation!

@SoniEx2

This comment has been minimized.

Show comment
Hide comment
@SoniEx2

SoniEx2 May 22, 2018

I think labels should be part of block-containing expressions, not blocks themselves. We already have precedent for this with loop. (As it happens, a plain block itself is also a block-containing expression. But things like if and loop are block-containing expressions without being blocks I guess.)

(Things like while or for shouldn't support label-break-value, because they could or could not return a value based on whether they complete normally or with a break.)

SoniEx2 commented May 22, 2018

I think labels should be part of block-containing expressions, not blocks themselves. We already have precedent for this with loop. (As it happens, a plain block itself is also a block-containing expression. But things like if and loop are block-containing expressions without being blocks I guess.)

(Things like while or for shouldn't support label-break-value, because they could or could not return a value based on whether they complete normally or with a break.)

@mark-i-m

This comment has been minimized.

Show comment
Hide comment
@mark-i-m

mark-i-m May 22, 2018

Contributor

@eddyb

(interestingly enough, continue 'a is break 'b, we might want to rely on that at least internally)

Only if break 'b re-checks the loop condition...

Contributor

mark-i-m commented May 22, 2018

@eddyb

(interestingly enough, continue 'a is break 'b, we might want to rely on that at least internally)

Only if break 'b re-checks the loop condition...

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb May 23, 2018

Member

@mark-i-m It's equivalent to 'a: while foo {'b: {...}}, the break wouldn't check the loop condition, the loop itself would, because the loop condition is checked before each iteration of the body block.

Member

eddyb commented May 23, 2018

@mark-i-m It's equivalent to 'a: while foo {'b: {...}}, the break wouldn't check the loop condition, the loop itself would, because the loop condition is checked before each iteration of the body block.

@mark-i-m

This comment has been minimized.

Show comment
Hide comment
@mark-i-m

mark-i-m May 23, 2018

Contributor

Woah, I find that highly unintuitive. I expect break 'b to be basically goto 'b, meaning we never exit the loop body and the condition is not checked again...

Contributor

mark-i-m commented May 23, 2018

Woah, I find that highly unintuitive. I expect break 'b to be basically goto 'b, meaning we never exit the loop body and the condition is not checked again...

@mark-i-m

This comment has been minimized.

Show comment
Hide comment
@mark-i-m

mark-i-m May 23, 2018

Contributor

Oh 🤦‍♂️ I see...

Contributor

mark-i-m commented May 23, 2018

Oh 🤦‍♂️ I see...

@mark-i-m

This comment has been minimized.

Show comment
Hide comment
@mark-i-m

mark-i-m May 23, 2018

Contributor

This is why I don't like labeled break/continue :/

Contributor

mark-i-m commented May 23, 2018

This is why I don't like labeled break/continue :/

@rpjohnst

This comment has been minimized.

Show comment
Hide comment
@rpjohnst

rpjohnst May 23, 2018

Contributor

Well, we specifically don't have the ability to label these weird inner blocks, so I don't see the problem. break always means "leave this block" and, given the above restriction, there's no way for that to mean anything other than "goto the spot after the associated closing brace."

Contributor

rpjohnst commented May 23, 2018

Well, we specifically don't have the ability to label these weird inner blocks, so I don't see the problem. break always means "leave this block" and, given the above restriction, there's no way for that to mean anything other than "goto the spot after the associated closing brace."

@mark-i-m

This comment has been minimized.

Show comment
Hide comment
@mark-i-m

mark-i-m May 23, 2018

Contributor

My confusion was not specific to weird inner blocks, but I don't really want to reopen the discussion. That already happened and the community decided to add it.

Contributor

mark-i-m commented May 23, 2018

My confusion was not specific to weird inner blocks, but I don't really want to reopen the discussion. That already happened and the community decided to add it.

@SoniEx2

This comment has been minimized.

Show comment
Hide comment
@SoniEx2

SoniEx2 May 24, 2018

Okay, I understand accessibility is a big issue with programming languages... however, labeled break is extremely useful if you write code like me.

So, how can we make labeled break more accessible?

SoniEx2 commented May 24, 2018

Okay, I understand accessibility is a big issue with programming languages... however, labeled break is extremely useful if you write code like me.

So, how can we make labeled break more accessible?

@mark-i-m

This comment has been minimized.

Show comment
Hide comment
@mark-i-m

mark-i-m May 24, 2018

Contributor

So, how can we make labeled break more accessible?

That's a great question. Some ideas I had:

  • We should collect some samples of how people use this in the wild. We can look for undesirable patterns or lazy habits.
  • We should have opinionated style around when it is considered poor practice to use labeled break/continue.
  • We may be able to add lints for some patterns that could be turned into loops/iterator combinators mechanically (I can't think of any such patterns off the top of my head, though).

As a first (admittedly biased) sample, my last (and first) encounter with labeled break in real code was not stellar: https://github.com/rust-lang/rust/pull/48456/files#diff-3ac60df36be32d72842bf5351fc2bb1dL51. I respectfully suggest that if the original author had put in slightly more effort they could have avoided using labeled break in that case altogether... This is an example of the sort of practice I would like to discourage if possible.

Contributor

mark-i-m commented May 24, 2018

So, how can we make labeled break more accessible?

That's a great question. Some ideas I had:

  • We should collect some samples of how people use this in the wild. We can look for undesirable patterns or lazy habits.
  • We should have opinionated style around when it is considered poor practice to use labeled break/continue.
  • We may be able to add lints for some patterns that could be turned into loops/iterator combinators mechanically (I can't think of any such patterns off the top of my head, though).

As a first (admittedly biased) sample, my last (and first) encounter with labeled break in real code was not stellar: https://github.com/rust-lang/rust/pull/48456/files#diff-3ac60df36be32d72842bf5351fc2bb1dL51. I respectfully suggest that if the original author had put in slightly more effort they could have avoided using labeled break in that case altogether... This is an example of the sort of practice I would like to discourage if possible.

@rpjohnst

This comment has been minimized.

Show comment
Hide comment
@rpjohnst

rpjohnst May 24, 2018

Contributor

That's... not labeled break?

Contributor

rpjohnst commented May 24, 2018

That's... not labeled break?

@rfcbot

This comment has been minimized.

Show comment
Hide comment
@rfcbot

rfcbot Sep 15, 2018

Team member @Centril has proposed to merge this. The next step is review by the rest of the tagged teams:

Concerns:

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot commented Sep 15, 2018

Team member @Centril has proposed to merge this. The next step is review by the rest of the tagged teams:

Concerns:

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@nrc

This comment has been minimized.

Show comment
Hide comment
@nrc

nrc Sep 18, 2018

Member

@rfcbot concern cost-benefit

From the RFC FCP proposal:

Another group felt that the "cost benefit" of this feature to the language didn't quite make the cut. In other words, the increased complexity that the feature would bring -- put another way, the possibility that people would actually use it and you'd have to read their code, I suppose, as well as the overall size of the language -- wasn't commensurate with its utility.

I still don't think we should have this feature at all. Since it's been implemented has it been used a lot? In the cases that it has been used is it significantly worse to use functions? If there is a benefit here, does it outweigh the cost of making the language larger and more complex?

Member

nrc commented Sep 18, 2018

@rfcbot concern cost-benefit

From the RFC FCP proposal:

Another group felt that the "cost benefit" of this feature to the language didn't quite make the cut. In other words, the increased complexity that the feature would bring -- put another way, the possibility that people would actually use it and you'd have to read their code, I suppose, as well as the overall size of the language -- wasn't commensurate with its utility.

I still don't think we should have this feature at all. Since it's been implemented has it been used a lot? In the cases that it has been used is it significantly worse to use functions? If there is a benefit here, does it outweigh the cost of making the language larger and more complex?

@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Sep 18, 2018

Member

@nrc I share the same concern. I do understand the argument for having it available so that macros can use it, but at the same time, I'd much rather not have this at all.

I don't want to retread arguments from the original RFC thread, but I do think that this is a reasonable point to ask about experiences with this feature. What uses has it seen?

Member

joshtriplett commented Sep 18, 2018

@nrc I share the same concern. I do understand the argument for having it available so that macros can use it, but at the same time, I'd much rather not have this at all.

I don't want to retread arguments from the original RFC thread, but I do think that this is a reasonable point to ask about experiences with this feature. What uses has it seen?

@SoniEx2

This comment has been minimized.

Show comment
Hide comment
@SoniEx2

SoniEx2 Sep 18, 2018

hand-written parsers, mostly... (I like my hand-written parsers >.<)

would be more useful/easier to use with labeled-propagate (try_foo() 'bar?).

SoniEx2 commented Sep 18, 2018

hand-written parsers, mostly... (I like my hand-written parsers >.<)

would be more useful/easier to use with labeled-propagate (try_foo() 'bar?).

@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Sep 18, 2018

Member

@rfcbot concern use-cases

Summarizing some discussion on Discord: We'd like to see concrete use cases for this, from actual code, that don't come across more clearly when rewritten to not use this feature.

Member

joshtriplett commented Sep 18, 2018

@rfcbot concern use-cases

Summarizing some discussion on Discord: We'd like to see concrete use cases for this, from actual code, that don't come across more clearly when rewritten to not use this feature.

@petrochenkov

This comment has been minimized.

Show comment
Hide comment
@petrochenkov

petrochenkov Sep 18, 2018

Contributor

FWIW, my implementation of EXPR is PAT relies on break 'label value being available at least in AST, I don't know how the desugaring could work without it.
Implementation of EXPR? in the compiler relies on break 'label value as well.

It's some kind of a basic building block for larger control flow features, so it may be needed to be implemented in the compiler anyway.
So the "cost" here is probably just making it available in surface syntax.

EDIT: I've totally misinterpreted the issue, I though this is about loop { ... break 'label value ... }, not block { ... break 'label value ... }.
I never had a chance to try this one because I always forget that it's already implemented.

Contributor

petrochenkov commented Sep 18, 2018

FWIW, my implementation of EXPR is PAT relies on break 'label value being available at least in AST, I don't know how the desugaring could work without it.
Implementation of EXPR? in the compiler relies on break 'label value as well.

It's some kind of a basic building block for larger control flow features, so it may be needed to be implemented in the compiler anyway.
So the "cost" here is probably just making it available in surface syntax.

EDIT: I've totally misinterpreted the issue, I though this is about loop { ... break 'label value ... }, not block { ... break 'label value ... }.
I never had a chance to try this one because I always forget that it's already implemented.

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Sep 18, 2018

Member

@petrochenkov Talking to @joshtriplett on Discord, they pointed out they were worried about user-facing complexity, not the implementation of the language.

Member

eddyb commented Sep 18, 2018

@petrochenkov Talking to @joshtriplett on Discord, they pointed out they were worried about user-facing complexity, not the implementation of the language.

@est31

This comment has been minimized.

Show comment
Hide comment
@est31

est31 Sep 18, 2018

Contributor

I think the complexity increase is minimal: for readers, it should be obvious what this means because the concept already exists in loops, etc.
I'd otherwise have to use a loop with an unconditional break statement, which is less clear and in fact there is even a clippy lint (never_loop) about this. So I think that there is a benefit.

As for the use cases, that topic has come up in the RFC already. I pointed out my use case here. Also see the use case listed by @scottmcm direcly below. Maybe there are more in the thread, idk. @joshtriplett does that resolve the use case question?

Contributor

est31 commented Sep 18, 2018

I think the complexity increase is minimal: for readers, it should be obvious what this means because the concept already exists in loops, etc.
I'd otherwise have to use a loop with an unconditional break statement, which is less clear and in fact there is even a clippy lint (never_loop) about this. So I think that there is a benefit.

As for the use cases, that topic has come up in the RFC already. I pointed out my use case here. Also see the use case listed by @scottmcm direcly below. Maybe there are more in the thread, idk. @joshtriplett does that resolve the use case question?

@withoutboats

This comment has been minimized.

Show comment
Hide comment
@withoutboats

withoutboats Sep 18, 2018

Contributor

I agree with @nrc and @joshtriplett & I also want to raise a process concern here: we tentatively accepted this RFC with an explicit caveat that stabilization was blocked on revisiting the questions that @nrc and @joshtriplett have raised, but @Centril's merge proposal doesn't mention this blocking concern at all, and treats this as a very standard "feature has baked" merge. I'm not blaming @Centril for this, but a process breakdown: if we're going to accept features tentatively with unresolved blockers on stabilization, we need to track those blockers.

It was concerning to me in terms of our entire processes to see that this went 2 and a half days without the blocking concern being brought up, and with most team members having already checked their box. Its conceivably, since we no longer require active consensus of all members, that this could have entered FCP without the blocker even being raised. This feels like a subversion of the prior agreement which led me to consense with merging the RFC, and I think its entirely caused by insufficient tracking of information.

Contributor

withoutboats commented Sep 18, 2018

I agree with @nrc and @joshtriplett & I also want to raise a process concern here: we tentatively accepted this RFC with an explicit caveat that stabilization was blocked on revisiting the questions that @nrc and @joshtriplett have raised, but @Centril's merge proposal doesn't mention this blocking concern at all, and treats this as a very standard "feature has baked" merge. I'm not blaming @Centril for this, but a process breakdown: if we're going to accept features tentatively with unresolved blockers on stabilization, we need to track those blockers.

It was concerning to me in terms of our entire processes to see that this went 2 and a half days without the blocking concern being brought up, and with most team members having already checked their box. Its conceivably, since we no longer require active consensus of all members, that this could have entered FCP without the blocker even being raised. This feels like a subversion of the prior agreement which led me to consense with merging the RFC, and I think its entirely caused by insufficient tracking of information.

@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Sep 18, 2018

Member

@withoutboats Yes, exactly. This makes me rather less inclined, in the future, to accept things on a "we'll XYZ during the stabilization process" basis, until we have some process in place that makes it exceedingly unlikely that'll be missed.

Member

joshtriplett commented Sep 18, 2018

@withoutboats Yes, exactly. This makes me rather less inclined, in the future, to accept things on a "we'll XYZ during the stabilization process" basis, until we have some process in place that makes it exceedingly unlikely that'll be missed.

@Centril

This comment has been minimized.

Show comment
Hide comment
@Centril

Centril Sep 18, 2018

Contributor

@withoutboats

I'm not blaming @Centril for this, but a process breakdown: if we're going to accept features tentatively with unresolved blockers on stabilization, we need to track those blockers.

Apologies nonetheless; I was unaware of the caveat, but I should have checked for such things nonetheless when I created the tracking issue.

This feels like a subversion of the prior agreement which led me to consense with merging the RFC, and I think its entirely caused by insufficient tracking of information.

I should have made an unresolved question for it; I believe the process mistake happened at that point.
As for how to improve the process; I think it is important that unresolved questions make it to the text of the RFC; it becomes much harder to find them otherwise ;)


As for the fcp-merge proposal; I personally do think it would be useful for reasons of uniformity and for use by macros. However, if you believe it is too early to propose stabilization; feel free to cancel the proposal :)

Contributor

Centril commented Sep 18, 2018

@withoutboats

I'm not blaming @Centril for this, but a process breakdown: if we're going to accept features tentatively with unresolved blockers on stabilization, we need to track those blockers.

Apologies nonetheless; I was unaware of the caveat, but I should have checked for such things nonetheless when I created the tracking issue.

This feels like a subversion of the prior agreement which led me to consense with merging the RFC, and I think its entirely caused by insufficient tracking of information.

I should have made an unresolved question for it; I believe the process mistake happened at that point.
As for how to improve the process; I think it is important that unresolved questions make it to the text of the RFC; it becomes much harder to find them otherwise ;)


As for the fcp-merge proposal; I personally do think it would be useful for reasons of uniformity and for use by macros. However, if you believe it is too early to propose stabilization; feel free to cancel the proposal :)

@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Sep 18, 2018

Member

@est31

I'd otherwise have to use a loop with an unconditional break statement,

Or restructure the code to avoid either.

I pointed out my use case here.

Why not replace the break 'pseudo_return with return Ok(vectors);?

Member

joshtriplett commented Sep 18, 2018

@est31

I'd otherwise have to use a loop with an unconditional break statement,

Or restructure the code to avoid either.

I pointed out my use case here.

Why not replace the break 'pseudo_return with return Ok(vectors);?

@SoniEx2

This comment has been minimized.

Show comment
Hide comment
@SoniEx2

SoniEx2 Sep 18, 2018

as I've mentioned here, this is useful for hand-written parsers (even without labeled-propagate (try_foo() 'bar?)).

label-break-value allows the easy imperativeization of otherwise functional code. generally, imperative code tends to be more readable than functional code.

SoniEx2 commented Sep 18, 2018

as I've mentioned here, this is useful for hand-written parsers (even without labeled-propagate (try_foo() 'bar?)).

label-break-value allows the easy imperativeization of otherwise functional code. generally, imperative code tends to be more readable than functional code.

@est31

This comment has been minimized.

Show comment
Hide comment
@est31

est31 Sep 18, 2018

Contributor

Or restructure the code to avoid either.

Of course, Rust is turing complete. But restructuring might be a bit difficult. Overall, you can reject almost every sugar feature (and this is sugar feature) on the basis "you can just use the existing ways".

Why not replace the break 'pseudo_return with return Ok(vectors);?

Actually in this instance you are right, one can replace the break with a return Ok. But in other instances you might want to do processing afterwards, etc. The borrow checker works poorly across function boundaries, you can't factor every single such block into a function.

Anyways, I've broken my silence about commenting on language features through official means, and tbh I'm regretting it. All the same points, rehashed over and over again. This shit is a waste of my time, sorry. So don't expect any further comments from me.

Contributor

est31 commented Sep 18, 2018

Or restructure the code to avoid either.

Of course, Rust is turing complete. But restructuring might be a bit difficult. Overall, you can reject almost every sugar feature (and this is sugar feature) on the basis "you can just use the existing ways".

Why not replace the break 'pseudo_return with return Ok(vectors);?

Actually in this instance you are right, one can replace the break with a return Ok. But in other instances you might want to do processing afterwards, etc. The borrow checker works poorly across function boundaries, you can't factor every single such block into a function.

Anyways, I've broken my silence about commenting on language features through official means, and tbh I'm regretting it. All the same points, rehashed over and over again. This shit is a waste of my time, sorry. So don't expect any further comments from me.

@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Sep 18, 2018

Member

@est31 I really appreciate you providing details; thank you.

Member

joshtriplett commented Sep 18, 2018

@est31 I really appreciate you providing details; thank you.

@SoniEx2

This comment has been minimized.

Show comment
Hide comment
@SoniEx2

SoniEx2 Sep 18, 2018

There's an accessibility issue to testing and using these things, due to the requirement of nightly Rust.

I target stable. I hope this can be solved some day.

SoniEx2 commented Sep 18, 2018

There's an accessibility issue to testing and using these things, due to the requirement of nightly Rust.

I target stable. I hope this can be solved some day.

@rpjohnst

This comment has been minimized.

Show comment
Hide comment
@rpjohnst

rpjohnst Sep 18, 2018

Contributor

If we want use cases, here's one I hit a while ago; basically an elaboration of @est31's "do processing afterwards": I wrote a lexer that handles C++ string literal prefixes (in the actual C++ case, a combinatorial explosion of {L, u8, u, U, }{R, }), and multi-byte tokens that with "gaps" in the number of bytes used (not sure that one makes sense without the example). The get_token function currently looks like this:

fn get_token(&mut self) -> Token {
    match decode_byte(self.source) {
        // ...

        // repeat four times with small variations for b'U', b'L', and b'R':
        Some((b'u', rest)) => match decode_byte(rest) {
            Some((b'"', rest)) => self.string_literal(Utf16String, rest),
            Some((b'\'', rest)) => self.char_constant(Utf16Char, rest),
            Some((b'R', rest)) => match decode_byte(rest) {
                Some((b'"', rest)) => self.raw_string_literal(Utf16String, rest),
                _ => self.identifier(rest),
            },
            Some((b'8', rest)) => match decode_byte(rest) {
                Some((b'"', rest)) => self.string_literal(Utf8String, rest),
                Some((b'\'', rest)) => self.char_constant(Utf8Char, rest),
                Some((b'R', rest)) => match decode_byte(rest) {
                    Some((b'"', rest)) => self.raw_string_literal(Utf8String, rest),
                    _ => self.identifier(rest),
                },
                _ => self.identifier(rest),
            },
            _ => self.identifier(rest),
        },

        // ...

        // the "gap" mentioned above is here: single-byte '.' and triple-byte '...' but no double-byte '..':
        Some((b'.', rest)) => match decode_byte(rest) {
            Some((b'0'..=b'9', rest)) => self.number(rest),
            // note the _inner to avoid shadowing the outer `rest` used by the inner `Dot` case:
            Some((b'.', rest_inner)) => match decode_byte(rest_inner) {
                Some((b'.', rest)) => self.make_token(Ellipsis, rest),
                _ => self.make_token(Dot, rest),
            },
            _ => self.make_token(Dot, rest),
        },

        // ...
    }
}

Notice the pyramids of _ => self.identifier(rest)s (repeated four times for u, U, R, and L) and _ => self.make_token(Dot, rest)s, forming a kind of continuation-passing style where identifier, string_literal, etc. all must also call make_token.

I would have liked to consolidate things back to a less continuation-y style using break-from-block, and almost did so via labeled loops, but considered that version too weird to read. To be more specific:

  • I would have moved all the make_token calls to a single location after the main match decode_byte(self.source), and inlined it- it's small and contains unsafe with its invariants upheld by get_token.
  • I would have used break 'label self.string_literal(..) to short-circuit once finding a " or ', and then combined all the self.identifier(..) calls to the end of that match arm.
    • I may have also been able to linearize the combinatorial explosion of prefixes- check for u/u8/U/L, then check for R. This uses fewer break 'labels, but still a handful.
  • I would have used break 'label (Ellipsis, rest) to short-circuit once finding a ..., and then combined both (Dot, rest)s to the end of that match arm.

On the whole, this is basically the "flatten control flow with if + early return," without the requirement of extracting things into a separate function. That's extremely valuable in this case for a few reasons:

  • Most of these functions would be tiny, single-use functions with no good name, serving only to make things even less readable than this continuation-y style.
  • Some of these functions would require a bunch of extra parameters that would otherwise just be straightforward locals with inferred types.
    • Or alternatively, closures, which in some of these cases would cause borrow checker problems.
  • As I mentioned above, there are safety invariants being upheld across functions here. The more straight-line, state-machiney code here, the better, especially as people come back later to make small changes for new token types.

I would write out the whole thing but I guess I never committed that attempt (probably because I ran into all the problems I listed above) and I've spent enough words here already. :)

Contributor

rpjohnst commented Sep 18, 2018

If we want use cases, here's one I hit a while ago; basically an elaboration of @est31's "do processing afterwards": I wrote a lexer that handles C++ string literal prefixes (in the actual C++ case, a combinatorial explosion of {L, u8, u, U, }{R, }), and multi-byte tokens that with "gaps" in the number of bytes used (not sure that one makes sense without the example). The get_token function currently looks like this:

fn get_token(&mut self) -> Token {
    match decode_byte(self.source) {
        // ...

        // repeat four times with small variations for b'U', b'L', and b'R':
        Some((b'u', rest)) => match decode_byte(rest) {
            Some((b'"', rest)) => self.string_literal(Utf16String, rest),
            Some((b'\'', rest)) => self.char_constant(Utf16Char, rest),
            Some((b'R', rest)) => match decode_byte(rest) {
                Some((b'"', rest)) => self.raw_string_literal(Utf16String, rest),
                _ => self.identifier(rest),
            },
            Some((b'8', rest)) => match decode_byte(rest) {
                Some((b'"', rest)) => self.string_literal(Utf8String, rest),
                Some((b'\'', rest)) => self.char_constant(Utf8Char, rest),
                Some((b'R', rest)) => match decode_byte(rest) {
                    Some((b'"', rest)) => self.raw_string_literal(Utf8String, rest),
                    _ => self.identifier(rest),
                },
                _ => self.identifier(rest),
            },
            _ => self.identifier(rest),
        },

        // ...

        // the "gap" mentioned above is here: single-byte '.' and triple-byte '...' but no double-byte '..':
        Some((b'.', rest)) => match decode_byte(rest) {
            Some((b'0'..=b'9', rest)) => self.number(rest),
            // note the _inner to avoid shadowing the outer `rest` used by the inner `Dot` case:
            Some((b'.', rest_inner)) => match decode_byte(rest_inner) {
                Some((b'.', rest)) => self.make_token(Ellipsis, rest),
                _ => self.make_token(Dot, rest),
            },
            _ => self.make_token(Dot, rest),
        },

        // ...
    }
}

Notice the pyramids of _ => self.identifier(rest)s (repeated four times for u, U, R, and L) and _ => self.make_token(Dot, rest)s, forming a kind of continuation-passing style where identifier, string_literal, etc. all must also call make_token.

I would have liked to consolidate things back to a less continuation-y style using break-from-block, and almost did so via labeled loops, but considered that version too weird to read. To be more specific:

  • I would have moved all the make_token calls to a single location after the main match decode_byte(self.source), and inlined it- it's small and contains unsafe with its invariants upheld by get_token.
  • I would have used break 'label self.string_literal(..) to short-circuit once finding a " or ', and then combined all the self.identifier(..) calls to the end of that match arm.
    • I may have also been able to linearize the combinatorial explosion of prefixes- check for u/u8/U/L, then check for R. This uses fewer break 'labels, but still a handful.
  • I would have used break 'label (Ellipsis, rest) to short-circuit once finding a ..., and then combined both (Dot, rest)s to the end of that match arm.

On the whole, this is basically the "flatten control flow with if + early return," without the requirement of extracting things into a separate function. That's extremely valuable in this case for a few reasons:

  • Most of these functions would be tiny, single-use functions with no good name, serving only to make things even less readable than this continuation-y style.
  • Some of these functions would require a bunch of extra parameters that would otherwise just be straightforward locals with inferred types.
    • Or alternatively, closures, which in some of these cases would cause borrow checker problems.
  • As I mentioned above, there are safety invariants being upheld across functions here. The more straight-line, state-machiney code here, the better, especially as people come back later to make small changes for new token types.

I would write out the whole thing but I guess I never committed that attempt (probably because I ran into all the problems I listed above) and I've spent enough words here already. :)

@SergioBenitez SergioBenitez referenced this issue Sep 26, 2018

Open

Compile with stable Rust #19

7 of 12 tasks complete
@Centril

This comment has been minimized.

Show comment
Hide comment
@Centril

Centril Oct 7, 2018

Contributor

@SergioBenitez could you elaborate on http://rocket.rs/ 's usage of label_break_value and your views on stabilization?

Contributor

Centril commented Oct 7, 2018

@SergioBenitez could you elaborate on http://rocket.rs/ 's usage of label_break_value and your views on stabilization?

@SergioBenitez

This comment has been minimized.

Show comment
Hide comment
@SergioBenitez

SergioBenitez Oct 7, 2018

Contributor

@Centril Sure! Here's the gist:

fn transform(request: &Request, data: Data) -> Transform<Outcome<_, _>> {
    let outcome = 'o: {
        if !request.content_type().map_or(false, |ct| ct.is_form()) {
            break 'o Forward(data);
        }

        let mut form_string = String::with_capacity(min(4096, LIMIT) as usize);
        if let Err(e) = data.read_to_string(&mut form_string) {
            break 'o Failure(FormDataError::Io(e));
        }

        Success(form_string)
    };

    Transform::Borrowed(outcome)
}

Using this feature, I avoid:

  • Splitting the block out into a different function for early "returns".
  • Adding Transform::Borrowed to every "return" value in the block.
  • Possible incorrectness if a different Transform is returned in different cases.
    • Note: This is an invariant particular to Rocket and this piece of code.

I was pretty happy to see that this existed. This is exactly what I want to write. That being said, I can clearly write this differently to not depend on this feature.

Contributor

SergioBenitez commented Oct 7, 2018

@Centril Sure! Here's the gist:

fn transform(request: &Request, data: Data) -> Transform<Outcome<_, _>> {
    let outcome = 'o: {
        if !request.content_type().map_or(false, |ct| ct.is_form()) {
            break 'o Forward(data);
        }

        let mut form_string = String::with_capacity(min(4096, LIMIT) as usize);
        if let Err(e) = data.read_to_string(&mut form_string) {
            break 'o Failure(FormDataError::Io(e));
        }

        Success(form_string)
    };

    Transform::Borrowed(outcome)
}

Using this feature, I avoid:

  • Splitting the block out into a different function for early "returns".
  • Adding Transform::Borrowed to every "return" value in the block.
  • Possible incorrectness if a different Transform is returned in different cases.
    • Note: This is an invariant particular to Rocket and this piece of code.

I was pretty happy to see that this existed. This is exactly what I want to write. That being said, I can clearly write this differently to not depend on this feature.

@Centril

This comment has been minimized.

Show comment
Hide comment
@Centril

Centril Oct 7, 2018

Contributor

@SergioBenitez Thanks! I wonder if you could (eventually) rewrite this with try { .. }? Like so:

fn transform(request: &Request, data: Data) -> Transform<Outcome<_, _>> {
    Transform::Borrowed(try {
        if !request.content_type().map_or(false, |ct| ct.is_form()) {
            Forward(data)?;
        }

        let mut form_string = String::with_capacity(min(4096, LIMIT) as usize);
        if let Err(e) = data.read_to_string(&mut form_string) {
            Failure(FormDataError::Io(e))?;
        }

        form_string
    })
}
Contributor

Centril commented Oct 7, 2018

@SergioBenitez Thanks! I wonder if you could (eventually) rewrite this with try { .. }? Like so:

fn transform(request: &Request, data: Data) -> Transform<Outcome<_, _>> {
    Transform::Borrowed(try {
        if !request.content_type().map_or(false, |ct| ct.is_form()) {
            Forward(data)?;
        }

        let mut form_string = String::with_capacity(min(4096, LIMIT) as usize);
        if let Err(e) = data.read_to_string(&mut form_string) {
            Failure(FormDataError::Io(e))?;
        }

        form_string
    })
}
@SergioBenitez

This comment has been minimized.

Show comment
Hide comment
@SergioBenitez

SergioBenitez Oct 7, 2018

Contributor

@Centril Yeah, that would work as long as there's only one success path, which is the case here.

Contributor

SergioBenitez commented Oct 7, 2018

@Centril Yeah, that would work as long as there's only one success path, which is the case here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment