New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

break-with-value and for/else implementation #23260

Open
wants to merge 22 commits into
base: master
from

Conversation

@tkluck
Copy link
Contributor

tkluck commented Aug 14, 2017

This commit enables a combination of two language features that dovetail nicely together (as remarked by @StefanKarpinski in [1]): for/else and break-with-value, which together allow something like:

name = for person in people
    if person.id == id
        break person.name
    end
else
    generate_name()
end

The parsing patch is pretty straightforward. As for the code lowering part, I opted to make a second version of 'break-block, called 'break-block-with-value, which passes a target variable for the return value onto the break-labels stack. That's where the 'break operation finds it.

An alternative approach would be something more similar to the existing replace-return function: Instead of having an intermediate node representing 'break-block-with-value,we could traverse the expression tree and replace break-with-value by an assignment followed by a break.

I have no opinion either way; the current commit seemed like the obvious implementation to me, but that was before I saw replace-return's prior art.

This patch still has a few places marked TODO, where scoping issues for the else block need to be resolved. I'll tackle that after awaiting feedback. It also doesn't parse while/else yet, only for/else. That would be a trivial addition.

[1] #22891

edit: re-flow paragraphs.

tkluck added some commits Aug 14, 2017

Parse + emit code for for/else and break-with-value
This commit enables a combination of two language features that dovetail
nicely together (as remarked by @StefanKarpinski in [1]): for/else and
break-with-value, which together allow something like:

    name = for person in people
        if person.id == id
            break person.name
        end
    else
        generate_name()
    end

The parsing patch is pretty straightforward. As for the code lowering
part, I opted to make a second version of `'break-block`, called
`'break-block-with-value`, which passes a target variable for the return
value onto the `break-labels` stack. That's where the `'break` operation
finds it.

An alternative approach would be something more similar to the existing
`replace-return` function: Instead of having an intermediate node
representing `'break-block-with-value`,we could traverse the expression
tree and replace `break-with-value` by an assignment followed by a
break.

I have no opinion either way; the current commit seemed like the obvious
implementation to me, but that was before I saw `replace-return`'s prior
art.

This patch still has a few places marked `TODO`, where scoping issues
for the `else` block need to be resolved. I'll tackle that after
awaiting feedback.

[1] #22891
Implement findnext() in terms of for/else and break-with-value
Because a new language feature should probably be used in
the standard library, to encourage cargo-culting.
@ararslan

This comment has been minimized.

Copy link
Member

ararslan commented Aug 14, 2017

Wow, very nice first contribution! I'm against this change but I want to commend your work here.

@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Aug 14, 2017

Thanks for your remark @ararslan ! I thought it is a nice language feature but I certainly see how opinions can be different. If it turns out to have just been a little learning project for me, that still makes me happy :)

@JeffBezanson

This comment has been minimized.

Copy link
Member

JeffBezanson commented Aug 14, 2017

Thanks, @tkluck, this is great work! I haven't reviewed it in detail yet but it seems basically good. I imagine the main issue here will be whether people want this feature :)

@JeffBezanson

This comment has been minimized.

Copy link
Member

JeffBezanson commented Aug 14, 2017

Noting that this relates to #22659. Interestingly, this feature doesn't provide a way to get the last value of the iteration variable, so that might mean we should add both this and for outer i = ... as orthogonal features.

@@ -1251,10 +1251,11 @@ julia> findnext(A,3)
function findnext(A, start::Integer)
for i = start:length(A)
if A[i] != 0
return i

This comment has been minimized.

@KristofferC

KristofferC Aug 14, 2017

Contributor

Not related to the feature itself, but I find explicit returns easier to reason about.

This comment has been minimized.

@StefanKarpinski

StefanKarpinski Aug 14, 2017

Member

They are, but sometimes you don't want to factor a single loop into a function just so that you can use a return to get a value out. This basically lets you have that without a function.

This comment has been minimized.

@yuyichao

yuyichao Aug 14, 2017

Member

I do like the for else feature but isn't v = for ...; break i; else j end always equivalent to for ...; v = i; break; else v = j end (with the correct scope of v)?
The latter seems much easier to understand than the former. It is similar to the return value of if else but in that case at least the return value is easier to find.

This comment has been minimized.

@tkluck

tkluck Aug 14, 2017

Author Contributor

I agree with @KristofferC that the findnext example isn't the best use-case for this feature because a return suffices; I just included it for the sake of show-casing the working patch.

@yuyichao , yes that is indeed equivalent, and it's probably a matter of taste which is easier to understand.

IMHO, in a language where there is no distinction between statements and expressions, it is a bit wasteful and arguably unexpected (e.g. when compared to your if/else example) not to be able to specify a value for a loop expression. That's why I personally like this feature.

This comment has been minimized.

@tkluck

tkluck Aug 14, 2017

Author Contributor

Afterthought: it's also a bit more 'functional' because it's easier to reason about whether the resulting variable v is mutable or not.

This comment has been minimized.

@yuyichao

yuyichao Aug 14, 2017

Member

That's why I do like the for else and my example above includes the else block.

This comment has been minimized.

@JeffBezanson

JeffBezanson Aug 14, 2017

Member

Oops, I misread that. Got it.

This comment has been minimized.

@StefanKarpinski

StefanKarpinski Aug 14, 2017

Member

I do agree, however, that these cases are clearer with return; but they do serve as an in-situ test of the new syntax, which is useful for a WIP pull request.

This comment has been minimized.

@rfourquet

rfourquet Aug 30, 2017

Contributor

A nice example is intersect, which can be improved into:

function intersect(s::Set, sets...)
    i = similar(s)
    for x in s
        for t in sets
            !in(x, t) && break false
        else true
        end && push!(i, x)
    end
    return i
end

This comment has been minimized.

@tkluck

tkluck Sep 6, 2017

Author Contributor

Great example @rfourquet ! It could also be

for x in s
    for t in sets
        !in(x, t) && break
    else
        push!(i, x)
    end
end

which doesn't use break-with-value, but does use the else feature.

I personally like options with all or with comprehensions best, but that's a matter of taste and may not give the same performance (yet).

@StefanKarpinski

This comment has been minimized.

Copy link
Member

StefanKarpinski commented Aug 14, 2017

I can't build this branch – am I missing something here or is this work in progress?

essentials.jl
ctypes.jl
generator.jl
reflection.jl
options.jl
promotion.jl
tuple.jl
pair.jl
traits.jl
range.jl
expr.jl
error.jl
bool.jl
number.jl
int.jl
operators.jl
pointer.jl
indices.jl
array.jl
error during bootstrap:
ErrorException("syntax: break or continue outside loop")
rec_backtrace at /Users/stefan/projects/julia/src/stackwalk.c:86
record_backtrace at /Users/stefan/projects/julia/src/task.c:246 [inlined]
jl_throw at /Users/stefan/projects/julia/src/task.c:568
jl_errorf at /Users/stefan/projects/julia/src/rtutils.c:77
eval at /Users/stefan/projects/julia/src/interpreter.c:495
jl_interpret_toplevel_expr_in at /Users/stefan/projects/julia/src/interpreter.c:50
jl_parse_eval_all at /Users/stefan/projects/julia/src/ast.c:907
jl_load at /Users/stefan/projects/julia/src/toplevel.c:646 [inlined]
jl_load_ at /Users/stefan/projects/julia/src/toplevel.c:653
unknown function (ip: 0x10b3a1a01)
unknown function (ip: 0x10b3a1951)
do_call at /Users/stefan/projects/julia/src/interpreter.c:70
eval at /Users/stefan/projects/julia/src/interpreter.c:262
jl_interpret_toplevel_expr_in at /Users/stefan/projects/julia/src/interpreter.c:50
jl_toplevel_eval_flex at /Users/stefan/projects/julia/src/toplevel.c:608
jl_eval_module_expr at /Users/stefan/projects/julia/src/toplevel.c:204
jl_toplevel_eval_flex at /Users/stefan/projects/julia/src/toplevel.c:485
jl_toplevel_eval_in at /Users/stefan/projects/julia/src/builtins.c:505
unknown function (ip: 0x10b3a0f21)
do_call at /Users/stefan/projects/julia/src/interpreter.c:70
eval at /Users/stefan/projects/julia/src/interpreter.c:262
jl_interpret_toplevel_expr_in at /Users/stefan/projects/julia/src/interpreter.c:50
jl_toplevel_eval_flex at /Users/stefan/projects/julia/src/toplevel.c:608
jl_parse_eval_all at /Users/stefan/projects/julia/src/ast.c:913
jl_load at /Users/stefan/projects/julia/src/toplevel.c:646
exec_program at /Users/stefan/projects/julia/usr/bin/julia (unknown line)
true_main at /Users/stefan/projects/julia/usr/bin/julia (unknown line)
main at /Users/stefan/projects/julia/usr/bin/julia (unknown line)

make[1]: *** [/Users/stefan/projects/julia/usr/lib/julia/inference.ji] Error 1
make: *** [julia-inference] Error 2
@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Aug 14, 2017

@StefanKarpinski that's odd; I tried a clean build before pushing. Might still be a leftover build artifact somewhere; I'll look into it tomorrow (am in CEST).

Thanks for trying it out!

@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Aug 15, 2017

I understand the build issue now; the change to find_next works when run normally, but not when building the sysimg. I also understand the root cause but the proper fix isn't 100% obvious. I'm sure I'll come up with something.

@StefanKarpinski

This comment has been minimized.

Copy link
Member

StefanKarpinski commented Aug 15, 2017

Bootstrapping is tricky. The system is continually building itself and things you're used to being able to do are at various points not defined yet. So using @eval Base ... to develop is a good idea, but then you have to try bootstrapping as well at the end to make sure you didn't use anything too early.

Fix break-with-value in case the value expression is a symbol
Before 976ff84, a parsed `break` expression `'(break)` was being
augmented to `'(break <label>)` by the function `expand-forms`. Since
the latter is supposed to be idempotent [1], it used to check

    (pair? e)

before deciding on replacing it. In 976ff84 however, the parsed
break can have a value expression, so `(pair? e)` can be true for that
reason as well. Thinking I was being smart, I replaced the check by

    (symbol? (cadr e))

not realizing that a lone symbol can be a valid expression too!

This commit fixes that by *always* compiling the break expression to a
*triple*

    '(break <expression> <label>)

where <expression> may be `'(null)`. This should have no big
implications for the code being generated, because `'(null)` generates
no code when, in the function `compile`, `(and value tail)` is false,
which is likely the typical case.

[1] At least, that's what I'm guessing is the reason.
@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Aug 15, 2017

Just pushed a fix for the issue I found, and confirmed that make clean && make gives a working julia executable. Hope I didn't overlook anything this time!

I think it's possible to get rid of 'break-block-with-value, instead just adding a (possibly '(null)) else expression to every break-block. That would make it all slightly easier to maintain and I don't think it actually makes a difference for code generation.

As for scoping, I'm tempted to keep the else block in outer scope, as I don't see much value in hiding any of its variables from the outer scope. However, for least surprise, it's probably better to follow the behaviour of the finally block in try/finally, which does introduce a scope:

julia> try finally j=4 end; j
ERROR: UndefVarError: j not defined

So unless anyone objects, that's what I'll implement.

Simplify break-with-value implementation: don't distinguish break-blo…
…ck and break-block-with-value

Instead, we give every `break-block` an `else`-block, potentially equal
to `'(null)`. This gives us one less atom to worry about.

This commit also makes a small modification to `compile`, which now
passes the value of `tail` for a `break` onto the `break-labels` stack.
That means a `break` can be compiled to a `return` if the `break-block`
is in the last position of a function.
@StefanKarpinski

This comment has been minimized.

Copy link
Member

StefanKarpinski commented Aug 16, 2017

I get a lot of "invalid AST" now, e.g.:

julia> f(n) = for i = 1:n
           isprime(i) && break i
ERROR: syntax: invalid AST

julia> f(n) = for i = 1:n
           isprime(i) && break(i)
ERROR: syntax: invalid AST

julia> f(n) = for i = 1:n
           isprime(i) && break
ERROR: syntax: invalid AST

julia> for i = 1:n
           isprime(i) && break
ERROR: syntax: invalid AST

julia> for i = 1:n
           isprime(i) && break i
ERROR: syntax: invalid AST

julia>
[ stefan ] ./julia                                                              KARPINSKI-MAC:~/projects/julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.7.0-DEV.1331 (2017-08-15 22:22 UTC)
 _/ |\__'_|_|_|\__'_|  |  tkluck-break-with-value/3abd4d9a8d (fork: 4 commits, 5 days)
|__/                   |  x86_64-apple-darwin16.1.0

julia> for i = 1:n
           isprime(i) && break i
ERROR: syntax: invalid AST

julia> for i = 1:n
           isprime(i) && break
ERROR: syntax: invalid AST

julia> 1 + 2
3

julia> for i = 1:10
           println(i)
ERROR: syntax: invalid AST

tkluck added some commits Aug 16, 2017

Fix error message from incomplete for loop
The following issue:

    Test Failed
      Expression: Base.incomplete_tag(parse(str, raise=false)) == tag
       Evaluated: none == block

was due to returning a different error message string in case of parsing
the incomplete string "for i=1;". This commit fixes it by delegating
error handling to `expect-end`.
Avoid allocating a value variable when break-block is in tail position
This was triggered by

    Error in testset inference:
    Test Failed
      Expression: all(isleaftype, (ast12474[1])[1].slottypes)

which had, as a root cause, the unused new-mutable-var being of type
Any.

Now, that isn't really what this doctest is trying to avoid, but I'm
guessing it's worthwhile avoiding an unneeded mutable var nonetheless.
Remove unnecessary if relating to break-labels contents
This is an artifact from when I wasn't sure whether anything else would
write to the break-labels stack apart from the code point that I had
updated. I'm now confident that doesn't happen, so no need for special
cases.
@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Aug 16, 2017

@StefanKarpinski ai, thanks for finding that!

I can't reproduce that anymore with what I just pushed. It's possible that this commit fixed it. My hypothesis would be that the REPL relies on the incompleteness error from the parser to decide whether to give a new prompt or wait for more instructions. That would also explain why I didn't see it before, because I was testing with code in a source file.

If that sounds believable to you, would you mind trying the latest update? Thanks for your patience!

@fredrikekre fredrikekre removed the needs docs label Aug 16, 2017

@StefanKarpinski

This comment has been minimized.

Copy link
Member

StefanKarpinski commented Aug 16, 2017

My hypothesis would be that the REPL relies on the incompleteness error from the parser to decide whether to give a new prompt or wait for more instructions.

That is correct. I'll give it another try. It's always amazing how different people's workflows are an how that can lead to triggering bugs or completely missing them.

@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Aug 17, 2017

Thinking about scoping:

The current patch initiates a scope-block for the loop body and the else body together. That kind of bothers me, but then I realized that that is because this already bothers me:

julia> for i=1:2; println(@isdefined(j)); j=1; end
false
true
julia> map(1:2) do i; println(@isdefined(j)); j=1; end;
false
false

My scoping expectation would be that every iteration starts a new scope. This is particularly relevant because julia seems to want to move freely between the two constructions I use above, e.g. with the @parallel macro wrapping a for-loop.

This comment here lists a few similar scoping tickets.

Any thoughts on how it should relate to what we're doing here?

(BTW, in addition to this, this pull request still has a few TODO places for collecting variable info from the else body.)

tkluck added some commits Aug 17, 2017

break-with-value: add elsebody to a few recursive expression traversals
I postponed this mostly because the names of the methods sounded scary
and I thought I'd need to think about it; turns out it's just a tree
traversal.
More consistent indentation in julia-parser.scm
(Use `git blame -w` to see the most recent functional changes to this code.)
@rfourquet

This comment has been minimized.

Copy link
Contributor

rfourquet commented Aug 25, 2017

I really do like this feature, which increases the expression-rather-than-statement feel of the language. I wonder if after all that doesn't subsume the new outer annotation of a loop:

i = for i in itr
        cond(i) && break i
    else i
    end

which is more verbose than outer, but still. This becomes a bit more complex if other values have to be returned. We could also imagine a finally which executes unconditionally at the end, which can receive values from the break:

i, result = for i in itr
        cond(i) && break :success
    else :fail
    finally result
        (i, result)
    end

So when no finally is written, an implicit one is created with the return value that it gets as argument. I didn't put much thought in that, but just to get the ball rolling.

@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Sep 8, 2017

I can fix the merge conflict relatively easily, but please let me know first what the decision is going to be :)

I think @rfourquet's example (below with slight modification from me) is a good example in favour of the else block:

function intersect(s, sets...)
    i = similar(s)
    for x in s
        for t in sets
            !in(x, t) && break
        else
            push!(i, x)
        end
    end
    return i
end

In case of a positive decision, I propose to revert my modification to findnext (which was intended to have this feature see use in the standard library) and replace it by this use in intersect.

In addition, break-with-value is, in many cases, an alternative for the (imo slightly cumbersome) nonlocal iteration variables (it's no surprise that these features cause a merge conflict). They can exist side-by-side.

@StefanKarpinski

This comment has been minimized.

Copy link
Member

StefanKarpinski commented Sep 19, 2017

@tkluck, I just wanted to say that this is good work and that the holdup here is indecision about whether we want to add these features or not. The current work toward 1.0 means that we're unusually reluctant to add any features since we're trying to stabilize the ones we have and we can always introduce new ones in 1.x releases as long as they're backwards compatible.

tkluck added some commits Jan 30, 2018

Revert "Implement findnext() in terms of for/else and break-with-value"
This was only used for testing the for/else construct; it's not actually
more readable. As discussed in #23260, `intersect()` is a nicer show
case. Moreover, this prevents a merge conflict with `master`, which has
diverged significantly by now.

This reverts commit d8b2553.
@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Feb 1, 2018

Just did some updates to fix merge conflicts arising from #24075 and #22659. Also added a more explicit unit test that ensures side effects happen in the right order, including finally blocks.

Hope this can make it into Julia proper at some point :)

@StefanKarpinski

This comment has been minimized.

Copy link
Member

StefanKarpinski commented Feb 1, 2018

I do still think this is a nice feature – sorry about letting it languish. Right before 1.0 just seems like a bad time for new language features. I've wanted this a few times in the past couple of months and also multi-level break/continue a few times.

@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Feb 1, 2018

I fully appreciate the 1.0 hesitation. Just keeping the pull request up to date in order not to let it bit-rot too much. Who knows, maybe this can be the headline feature for 1.1? :D

tkluck added some commits Mar 11, 2018

Merge branch 'master' into break-with-value
Just keeping this branch up-to-date to prevent bitrot.
@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Apr 7, 2018

Just to keep this up to date for when we hopefully merge it beyond 1.0: pushed a fix to the merge conflict caused by @JeffBezanson's work on #330 (0fffb36).

tkluck added some commits Jun 12, 2018

Merge branch 'master' into break-with-value
This fixes merge conflicts introduced by 62fbad2 and 3a0c6e2.
neither conflict was fundamental or complicated to solve.
@staticfloat

This comment has been minimized.

Copy link
Member

staticfloat commented Jun 16, 2018

I just want to throw my support in for this PR as well, once we're beyond the 1.0 hurdle, I would support putting this in. Thanks so much for your tenacity on keeping with this PR @tkluck!

@ararslan

This comment has been minimized.

Copy link
Member

ararslan commented Jun 17, 2018

I remain opposed to this change but I would like to commend your impressive work here.

@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Sep 21, 2018

Congrats to everyone on the 1.0 release!

Now that the dust is hopefully settling a bit, maybe it's a good time for me to gently bump this one little pull request just a tiny bit 😇

Merge remote-tracking branch 'origin/master' into break-with-value
This updates this branch to current master by merging it
and by fixing the merge conflicts introduced by 57b46e7.

I don't fully understand the functionality of that commit and
though the fixes I did result in no compilation errors
and in working `make test-core` (where this branch introduces
tests), it may make sense to double-check interactions
between exceptions and break-with-value before merging this.
@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Dec 23, 2018

Pushed an update to fix the merge conflict introduced by #28878.

I don't fully understand the functionality of that PR and though the fixes I did result in no compilation errors and in working make test-core (where this branch introduces tests), it may make sense to double-check interactions between exceptions and break-with-value before merging this.

@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Dec 24, 2018

CI failure seems unrelated, and a quick inspection of https://github.com/JuliaLang/julia/commits/8eca27df422089a8c21301764c6485fb86d0fee8 suggests that is was introduced by #30442 and merged into this pull request when merging master.

@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Jan 18, 2019

Some new conflicts with master branch have just appeared as a result of #30656.

Is now maybe a good time to make a decision on this feature? If we put it in master now it will have maximal time in the 1.2.0-DEV window to be battle tested before general release.

I'll gladly and swiftly fix the merge conflict in case of a positive decision.

@StefanKarpinski

This comment has been minimized.

Copy link
Member

StefanKarpinski commented Jan 18, 2019

I think this is @JeffBezanson's call. He is Julia's Syntax Czar.

@tkluck

This comment has been minimized.

Copy link
Contributor Author

tkluck commented Jan 21, 2019

Hey @JeffBezanson -- anything I can do to help you review this?

@mbauman

This comment has been minimized.

Copy link
Member

mbauman commented Jan 22, 2019

For what it's worth, reading the documentation changes here made me like this feature much more — I had been on the fence before. It's introduced quite well and sensibly. Perhaps we should discuss on the next triage call?

@mbauman mbauman added the triage label Jan 22, 2019

@StefanKarpinski

This comment has been minimized.

Copy link
Member

StefanKarpinski commented Jan 23, 2019

Agree. The documentation is really lovely and motivates this feature quite nicely.

@JeffBezanson

This comment has been minimized.

Copy link
Member

JeffBezanson commented Feb 14, 2019

Triage has mixed reactions to this. Personally I'd say I'm weakly in favor of it; it's a useful pattern and I can't think of a really strong reason not to have it.

Points for:

  • It's good for expressions to have (useful) values whenever possible.
  • It clarifies the purpose of a loop.
  • else is great when computing the default value is expensive.

Points against:

  • It's a bit weird. The meaning of else doesn't seem entirely natural.
  • It's weird that for loops don't normally have values, but using break grants them the ability to yield useful values, when it seems like it's just control flow.
  • Putting the default value at the end is not always clearer.

It was also suggested that we look up python's experience with this and see if they regret it.

@StefanKarpinski

This comment has been minimized.

Copy link
Member

StefanKarpinski commented Feb 15, 2019

It was also suggested that we look up python's experience with this and see if they regret it.

At this point Rust has had break with values for long enough that we might be able to learn from their experience with that part of this feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment