labeled loops, labeled break, labeled continue #346

Closed
thejoshwolfe opened this Issue Apr 27, 2017 · 13 comments

@thejoshwolfe
Member

thejoshwolfe commented Apr 27, 2017

EDIT: scroll to current proposal here: #346 (comment)

Proposal:

Remove the current goto and label constructs. Introduce two kinds of labels:

  • Labeled loops, like in Java. e.g. label: while (..., label: for (...
    • You can break label and continue label, like in Java.
    • You cannot goto label for this kind of label.
  • Labeled blocks. e.g. label: { ... }
    • Control flow must not "fallthrough" into or out of a labeled block.
    • You can goto label to enter the label.
    • We may want to allow label: return foo() or similar without a block, but this may be confusing when labeled loops are a distinct concept. You could always just do label: { return foo(); } anyway, so we probably don't need a blockless form.

I searched the Zig standard library as well as @andrewrk's tetris and clashos projects, and the only 3 uses of goto are effectively continues; one of them would need to be a labeled continue; the other two can be easily converted to status quo continues. I think those 3 pieces of code would be improved by this proposal.
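
To make "effectively a continue" concrete, here is a minimal C sketch of a goto that behaves like a labeled continue. It is illustrative only and is not taken from the standard library, tetris, or clashos; the function name and the flat row-major matrix layout are assumptions.

```c
#include <stddef.h>

/* Hypothetical example: count the rows of a row-major matrix that contain
 * no negative values. The goto abandons the rest of the OUTER iteration from
 * inside the inner loop, which is what `continue outer;` does in Java. */
size_t count_nonnegative_rows(const int *cells, size_t rows, size_t cols) {
    size_t count = 0;
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            if (cells[r * cols + c] < 0) {
                goto next_row; /* effectively a labeled continue */
            }
        }
        count++; /* reached only if no negative value was found */
    next_row:;   /* empty statement; the outer loop's r++ runs next */
    }
    return count;
}
```

With labeled loops, the label would sit on the outer loop and the jump would become a labeled continue, so the target could only ever be a loop header.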

I think there's a legitimate usecase for labeled blocks, and I'd like to see if I can write some Zig code that needs them. My idea so far is some kind of tokenizer (XML parser perhaps?). It's always possible to avoid goto by using labeled break and continue (see Java), but I think there are cases where it would be an abuse of the syntax to do so.

My rationale for proposing that control flow must not fall through into or out of a labeled block is to prevent accidents. My idea of a labeled block is a piece of code that exists outside the normal control flow of the function, one that you want to be able to jump into and out of with special control flow. In this sense, it's more like a function than like a ... whatever an arbitrary label in the middle of code is. An important difference from a function is that a labeled block can return from its function, which every use case for labeled blocks I can think of would do.

The real-world usage of goto is pivotal to this proposal. I don't want to act on this proposal without more real-world data.

@andrewrk

Member

andrewrk commented May 4, 2017

I added the ability for break to give an expression and loops now have an else (See #357). So this steers us in the direction of labeled loops.

@thejoshwolfe

Member

thejoshwolfe commented May 10, 2017

The proposed break label syntax clashes with the current break expression syntax. To disambiguate, we could do one of these:

  1. break :label, break expression, break :label expression
  2. break (expression), break label, break (expression) label
@andrewrk

Member

andrewrk commented May 10, 2017

(1) looks good to me.

@andrewrk andrewrk changed the title from labeled loops and labeled blocks to labeled loops and labeled blocks instead of goto Aug 6, 2017

@PavelVozenilek

PavelVozenilek commented Sep 10, 2017

There are several useful use cases for goto in C:

  1. Clean implementation of finite state machines using macros and gotos. This probably doesn't apply to Zig.

  2. Error handling for which defer would not be enough or would be clumsy (defers have to repeat their code or use a named function). Example:

    ... 
    if (...) goto error1;
    ...
    if (...) goto error2;
    ...
    if (...) goto error1;
    ...
    error1:
      ... lot of code here
    error2:
      ... lot of code
    

    Here all error code
    (a) is put at function's end (one always knows where to look for it),
    (b) stays together (inconsistency is easier to spot) and
    (c) does not clutter the happy path (readability win).

  3. Going from one code block to another inside a large function. Example:

     void nontrivial_function(void) { 
    
        phase1:   
        ...
        if (...) goto phase2;
         ...
         if (...) goto phase3;
        ...
    
        phase2:
        ...    
         if (...) goto phase4;
         ...
         if (...) goto phase3;
        ...
    
       phase3:
        ... etc
     }
    

Here goto is more natural than other constructs. Using several named sub-functions to implement these blocks would decrease readability, especially if they need many parameters.

Large functions are not inherently bad, if they can be read (mostly) linearly.
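
The error-handling pattern from item 2 can be sketched as runnable C. This is a hypothetical example: the `pair` struct, `pair_init`, and `MAX_ITEMS` are invented for illustration. The cleanup labels are grouped at the end of the function and fall through in reverse order of acquisition.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_ITEMS 1024 /* hypothetical limit, used to trigger error paths */

struct pair {
    int *a;
    int *b;
};

/* Allocates both arrays or neither. Returns 0 on success, -1 on failure. */
int pair_init(struct pair *p, size_t n_a, size_t n_b) {
    p->a = NULL;
    p->b = NULL;

    if (n_a == 0 || n_a > MAX_ITEMS) goto err_nothing_held;
    p->a = calloc(n_a, sizeof *p->a);
    if (p->a == NULL) goto err_nothing_held;

    if (n_b == 0 || n_b > MAX_ITEMS) goto err_free_a;
    p->b = calloc(n_b, sizeof *p->b);
    if (p->b == NULL) goto err_free_a;

    return 0; /* happy path ends here, uncluttered by cleanup code */

err_free_a:
    free(p->a); /* undo the first allocation */
    p->a = NULL;
    /* falls through into the next label on purpose */
err_nothing_held:
    return -1;
}
```

All jumps here go downward only, which matches the restriction suggested in this comment.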


IMO goto should stay, probably restricted to "going down only", with style guidelines recommending clean use cases.


PS: GCC also has "computed goto" feature, which is used to implement fast interpreters (faster than those having big switch, thanks to better branch prediction).

@thejoshwolfe

Member

thejoshwolfe commented Sep 11, 2017

I think this discussion would benefit from real examples of goto in actual code. It's easy enough to come up with pseudocode that demonstrates a desire for goto, but it may be the case that every real use case for goto could be better implemented without it.

I'm still very open-minded about this feature, but I think it's time to bring in real examples.

GCC also has "computed goto" feature, which is used to implement fast interpreters

I read some details about this here: http://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables

Looks like the benefits come from two sources: skipping the bounds check on the switch, and doing the jump in each case instead of jumping to a centralized dispatch instruction. Both of these optimizations are theoretically possible without computed goto in Zig, but achieving the same performance may require writing an optimization pass for LLVM. I haven't done any research to see whether this already works.

Skipping the bounds check can be enabled by making the default case unreachable. Then an optimizer can safely let an out-of-bounds value cause undefined behavior, like the computed goto case.

How to put the dispatch jump in each case is less obvious, but it is equivalent to inlining the dispatch part of the loop and switch into each case. It's definitely possible that the right kind of compiler optimization could achieve this.
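
As a rough illustration of the two dispatch styles discussed here, the sketch below implements the same toy interpreter twice. The opcode set and function names are made up. `__builtin_unreachable()` and the `&&label` / `goto *` syntax are GCC extensions (Clang accepts them too), so this is not portable C, and whether the switch version ends up as fast as the computed-goto version depends on the compiler and is not measured here.

```c
#include <stddef.h>

/* Hypothetical opcodes for a one-register toy machine. */
enum op { OP_INC, OP_DEC, OP_DOUBLE, OP_HALT };

/* Variant 1: loop + switch with a single, centralized dispatch point.
 * Marking the default case unreachable tells the optimizer that an
 * out-of-range opcode is undefined behavior, so it may drop the range check. */
int run_switch(const unsigned char *code) {
    int acc = 0;
    for (size_t pc = 0;; pc++) {
        switch (code[pc]) {
        case OP_INC:    acc += 1; break;
        case OP_DEC:    acc -= 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        __builtin_unreachable();
        }
    }
}

/* Variant 2: computed goto. Every handler ends with its own indirect jump,
 * so there is no shared dispatch instruction. The table order must match
 * the enum order above. */
int run_computed_goto(const unsigned char *code) {
    static void *const table[] = { &&op_inc, &&op_dec, &&op_double, &&op_halt };
    int acc = 0;
    size_t pc = 0;

#define DISPATCH() goto *table[code[pc++]]
    DISPATCH();
op_inc:    acc += 1; DISPATCH();
op_dec:    acc -= 1; DISPATCH();
op_double: acc *= 2; DISPATCH();
op_halt:   return acc;
#undef DISPATCH
}
```

Both functions assume the program is well formed and ends with `OP_HALT`; an invalid opcode is undefined behavior in either variant.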

@andrewrk

Member

andrewrk commented Sep 11, 2017

The only use of goto in zig compiler is here: https://github.com/zig-lang/zig/blob/master/src/c_tokenizer.cpp

I think it's just a labeled break from the for loop though.
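
A hypothetical sketch of that shape (not the actual c_tokenizer.cpp code): inside a switch, a plain break only leaves the switch, so C needs goto to leave the enclosing for loop. The function name and the digit-scanning task are assumptions chosen to keep the example small.

```c
#include <stddef.h>

/* Returns the length of the leading run of decimal digits in `s`. */
size_t leading_digits(const char *s) {
    size_t i;
    for (i = 0; s[i] != '\0'; i++) {
        switch (s[i]) {
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            break;     /* leaves only the switch; the loop keeps going */
        default:
            goto done; /* effectively a labeled break out of the for loop */
        }
    }
done:
    return i;
}
```

With labeled loops, the goto would become a break that names the for loop.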

I'll throw in 2 use cases for goto:

  • Leaving goto in makes it easier to port C code to zig
  • goto is useful if you're using zig as a target language for a compiler. This is one current use of C - compilers compile to C.
@andrewrk

Member

andrewrk commented Sep 11, 2017

Here's another one: mixing inline assembly and zig.

If we had better inline assembly integration, we could have labels declared in assembly that we can use goto to jump into.

@andrewrk

Member

andrewrk commented Sep 13, 2017

Here's another idea instead of labeled blocks.

We already have the feature that const foo = this; and now foo refers to a specific block of code.

Here's my proposal:

  • Delete break.
  • Remove the option to leave off semicolons in block statements.
  • return now accepts an optional block argument like this: return :block expr or return :block. So you can return a value from any parent block. If the block argument is left off, it defaults to the block that is the function definition.
  • If you want to break out of a loop, you better alias the loop block so you can return from it:
while (true) {
    const while_loop = this;
    return :while_loop;
}

Now we don't have this awkward choice for functions: do we use return or not? Yes, you have to use return to return a value.

@PavelVozenilek

PavelVozenilek commented Sep 13, 2017

Ad computed goto - the only use I am aware of is in Forth interpreters, where it usually claims large (tens of %) speedups. The speedup is due to a better fit for the branch prediction mechanism, as I understand it. I have personally never used it.

I use goto for error handling at the end of function, occasionally to jump down to a block. So far I avoided jumping up. I also used gotos hidden in macros to implement finite state machines (this results in ideal syntax).
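
A minimal sketch of a goto-based state machine in C, assuming a hypothetical `NEXT` macro and a made-up grammar (digits, optionally followed by '.' and more digits). It only illustrates the general technique and is not the macro set described in this comment.

```c
#include <stdbool.h>

/* Hypothetical macro: consume one character and jump to the given state. */
#define NEXT(state) do { s++; goto state; } while (0)

/* Accepts strings of the form: digits [ '.' digits ], e.g. "12" or "3.14".
 * Each label is one state; the current state is simply the program counter. */
bool is_decimal(const char *s) {
    /* initial state: at least one digit is required */
    if (*s >= '0' && *s <= '9') NEXT(int_part);
    return false;

int_part:
    if (*s >= '0' && *s <= '9') NEXT(int_part);
    if (*s == '.') NEXT(frac_start);
    return *s == '\0';

frac_start: /* a '.' must be followed by at least one digit */
    if (*s >= '0' && *s <= '9') NEXT(frac_part);
    return false;

frac_part:
    if (*s >= '0' && *s <= '9') NEXT(frac_part);
    return *s == '\0';
}
```

Note that the self-transitions (`int_part` to `int_part`) are upward jumps, so a "going down only" restriction would rule out this particular style.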

I never had problems with break and continue. In fact, I see the C syntax for control flow as almost perfect and would not recommend removing anything.

Labeled break, perhaps, if only down. Return is IMO an important visual clue. The proposal above (return instead of break) feels strange.

@raulgrell

Contributor

raulgrell commented Sep 13, 2017

I just want to voice my agreement with @PavelVozenilek - that return syntax feels strange. In terms of readability, we are leveraging some of the fundamental constructs of C to make the language more familiar and accessible to users, so it feels strange to remove break and change return so much.

The points about porting are also very legitimate: in order to replace C, we must first replace it =P

@andrewrk

Member

andrewrk commented Sep 13, 2017

We can also keep break and have it be syntactic sugar for return :immediate_loop_parent_block

@pluto439

pluto439 commented Nov 5, 2017

Goto is used in visual novels all the time. You rarely need to go back when reading a book, you only go forward. Functions return back, goto doesn't.

You guys are making life harder for yourself for no good reason.

Also, goto is good for error handling. It's used in the Linux kernel for that quite often. Defer seems to partially replace that usage, but I'm not sure if it can replace goto completely. It's also sometimes used to give a function two different exits.

Another example here, http://ollydbg.de/ , Disasm.zip , assembl.c . I don't quite understand what exactly this code does, but I think it's useful here.

Goto is a good lightweight replacement for exceptions, but using it for that is complicated. Need to make sure you don't have anything important on stack (no destructors), need to store correct stack pointer somewhere (to free a lot of stack space at once). If you goto inside a function, it's much easier, because the difference in stack before and after goto is always constant.

Exceptions are good for avoiding a lot of conditional branches in the program and annoying checks, it must be affecting both program speed and program complexity. It's especially good for file parsing, when something can go wrong at literally any moment. They are pretty much multi-function gotos.

Also read this #578. I think I need exceptions after all. Just don't need to abuse them.

@thejoshwolfe

Member

thejoshwolfe commented Nov 28, 2017

New proposal:

  • Labeled for and while loops.
  • Labeled break and continue statements.

For the fate of goto, see #629.

For block return expressions, see #630.

@thejoshwolfe thejoshwolfe changed the title from labeled loops and labeled blocks instead of goto to labeled loops, labeled break, labeled continue Nov 28, 2017

@andrewrk andrewrk added the accepted label Nov 28, 2017

@andrewrk andrewrk closed this in 8bc5232 Dec 21, 2017
