Syntax to break an expression over multiple lines? #819

ajfriend · 2022-07-17T03:50:37Z

PRQL's syntax currently treats newlines as statement terminators in most places (lists being the exception?). (And please do correct me if I'm wrong about this!)

For long expressions (and perhaps also statements?), it would be nice to have a syntax that would allow users to break that expression up over multiple lines.

This idea is motivated by the discussion in #788.

Is there a current way to do this? If not, would parentheses be the natural way to express this?

max-sixty · 2022-07-18T15:24:24Z

This is a good point.

I don't think we should need it that often — one of the benefits of PRQL should be to split up logic into modular pieces, represented by transforms on separate lines.

But it would be presumptuous to assume we can always do this, and #788 is a good case — even if the "split up into multiple filters" suggestion is reasonable, it doesn't cover the case of a long list of or-ed conditions.

Newlines have an important semantic meaning, so we've "used up" that syntax. So a couple of options:

Use \ to allow line breaks
Find the cases that force long lines and offer a solution to them (e.g. an any function to take a list of conditions and string them together with ors)

Any thoughts?

ajfriend · 2022-07-18T17:53:19Z

My comment around parentheses was wondering if you could do something like what Python does to "free up" the newline syntax inside of parentheses:
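A minimal illustration of the Python behaviour in question (the variable names and values are made up for this sketch): inside parentheses a newline does not end the statement, so an expression can break after an operator or before one.

```python
# Illustrative values only; the names are assumptions for this sketch.
a, b, c = 1, 2, 3

# Outside parentheses a newline ends the statement, so this stays on one line.
total_one_line = a + b + c

# Inside parentheses, newlines are treated as plain whitespace.
# Each line ends on an operator:
total_trailing_op = (a +
                     b +
                     c)

# Each continuation line starts with an operator:
total_leading_op = (a
                    + b
                    + c)
```

All three assignments compute the same sum; only the layout differs.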

mklopets · 2022-07-18T21:55:12Z

\ also seems like a decent solution to me

max-sixty · 2022-07-19T04:27:37Z

Yes, interesting. So we could have a rule that finishing the line on an operator (e.g. + / or / etc) causes a line continuation.

Are there any instances where we want to finish on an operator?

One concern is that while editing a line in the middle of a query, the error messages might be quite confusing:

from foo
derive bar = baz or<cursor>
aggregate (
  max fuz
)

...would this have a confusing error message saying "baz or aggregate is invalid"?

ajfriend · 2022-07-19T04:52:47Z

> Yes, interesting. So we could have a rule that finishing the line on an operator (e.g. + / or / etc) causes a line continuation.

But that's maybe valid only inside a set of parens? (I'm not sure what the full Python parsing logic is.) Additionally, it might also be valid if the next line starts with an operator, which is that last Python example (inside parens).

ajfriend · 2022-07-19T04:55:51Z

> ...would this have a confusing error message saying "baz or aggregate is invalid"?

"Unexpected newline"?

max-sixty · 2022-07-19T06:53:28Z

> "Unexpected newline"?

The challenge is that it wouldn't be parsed as a newline; it would be parsed as `derive bar = baz or aggregate (`, unless I'm misunderstanding something?

> But that's maybe valid only inside a set of parens?

Yes, this could be possible. It starts to get a bit complicated — copy something from inside parens to outside parens and it doesn't work any longer.

But Python is good prior art; I think this should be OK.

> Additionally, it might also be valid if the next line starts with an operator, which is that last Python example (inside parens).

Yes, good idea. We'll have to see whether pest supports this; I think it should be possible with positive predicates.

ajfriend · 2022-07-19T06:58:29Z

Cool! And to be clear, I'm just offering my random thoughts as a fan in the peanut gallery. Appreciate all the work you've already done!

max-sixty · 2022-07-19T07:00:14Z

> Cool! And to be clear, I'm just offering my random thoughts as a fan in the peanut gallery. Appreciate all the work you've already done!

Thoughts are generally welcome, and in this specific case they're also great ideas, @ajfriend!

aljazerzen · 2022-08-04T07:43:21Z

I'd argue against trying to parse newlines after operators or inside parentheses. I like \ far more, regardless of the fact that I find it quite ugly.

That's because newline (as a token) is already "used up" on higher levels of parsing, as @max-sixty said. Precedence goes roughly like:

newline, |
and, or
<, >, <=, >=
+, -
*, /
(), .., []

Simplified, tokens are first split by newlines and `|`, then by `and` and `or`, and so on.

Now, if we wanted to ignore newlines in () or after +, we'd have to parse () and + higher in the precedence list, which would add a lot of complexity.

But why can Python support this? Python has two precedence lists: one for expressions and one for statements. Newline is part of the statement precedence list, where it means "end of statement". Because statements cannot be part of expressions, and because ( indicates an unfinished expression, Python can use only the expression precedence list there and resolve newline to mean "whitespace".
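Python's own tokenizer makes this distinction visible. A small sketch using the standard `tokenize` module: a line break that ends a statement is reported as a NEWLINE token, while a line break inside an open parenthesis is reported as NL, which the parser ignores. The helper name `newline_kinds` is made up for this illustration.

```python
import io
import tokenize


def newline_kinds(source: str) -> list[str]:
    """Return the names of the line-break tokens Python's tokenizer emits."""
    kinds = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # NEWLINE ends a logical line (a statement); NL is a line break
        # that does not end one, e.g. inside parentheses.
        if tok.type in (tokenize.NEWLINE, tokenize.NL):
            kinds.append(tokenize.tok_name[tok.type])
    return kinds


inside_parens = newline_kinds("x = (1 +\n     2)\n")
two_statements = newline_kinds("x = 1\ny = 2\n")
print(inside_parens)
print(two_statements)
```

The first source has one line break inside the parentheses and one ending the statement; the second has two statement-ending line breaks.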

Back to PRQL, newlines can be part of expressions - actually the whole pipeline is an expression! That's why even in (), newlines must mean "pipe".

I came up with another idea: newlines meaning "pipe" make sense only in pipelines. Pipelines can be just top-level plain function calls, or function calls wrapped in (). But they cannot be function calls in []! Which means that without much hassle, I think we can make this work:

derive [
  some + long + expr + 
    spans + multiple + lines
]

or

derive [some + long + expr + 
    spans + multiple + lines]

sietse · 2022-12-23T16:41:05Z

I have a question, not about future multiline syntax, but about a potential workaround: perhaps in PRQL 0.3.1, we can use currying to add arguments one at a time. Perhaps not.

This is not meant as a derailing feature request, only as a "can this workaround be made to work" query. If it can't, no worries.

Alright, the would-be workaround:

# This function takes three arguments, so maybe you want to write it on multiple lines
func addthree<column> a b c -> s"{a} + {b} + {c}"

# 'arg x' takes a curried function and adds x to the end of its argument list.
# addthree a | arg b  -> addthree a b
# because newlines are equal to pipes, this allows building up a function's arguments over multiple lines
func arg myarg myfunc -> ( myfunc myarg )

from mytable
derive [
  my_thing = (  # a multiline pipe that builds up a function call
    addthree apples  # addthree apples _ _
    arg bananas  # addthree apples bananas _
    arg citrus  # `addthree apples bananas citrus`, a.k.a. `apples + bananas + citrus`
  )
]

Compiling this with 0.3.1 gives an error (below) that I think I can summarize as "I expected a column expression, but I found a function with signature infer -> infer". Is it possible to write the code differently so it works in 0.3.1? If not, no matter. Thanks for all you've done and are doing on PRQL!

# Linebreaks added. Character span not accurate.
Error {
  span: Some(span-chars-110-176),
  reason: Expected {
    who: Some("function std.derive, param `columns`"),
    expected: "type `column`",
    found: "type `func infer -> infer`"
  },
  help: None
}

aljazerzen · 2022-12-23T16:58:37Z

Woah, a great workaround using currying! Maybe it was worth implementing 😄

Your example should work, and the fact that it doesn't is a bug. I'll look into it. There seems to be a problem with pipelines, because normal function calls do work (but that defeats the whole purpose of the workaround):

func addthree<column> a b c -> s"{a} + {b} + {c}"

func arg myarg myfunc -> ( myfunc myarg )

from mytable
derive [
  my_thing = (
    arg "citrus" (arg "bananas" (addthree "apples"))
  )
]

aljazerzen · 2022-12-23T18:03:10Z

Cancel that, even my example produces a wrong output.

sietse · 2022-12-29T15:31:24Z

Well, thank you very much for taking a look. PRQL is a beautiful thing. Have a good slide into the new year!

sietse · 2022-12-29T16:04:40Z

A thought about backslash \ for line continuations: it can interact badly with line comments.

In languages where \ at the end of a line indicates a line continuation ...
... and '#' comments all characters until the end of the line ...
... in those languages, you can't have line comments on multiline expressions.

For example, this is invalid bash:

tar \
  -x  # extract \
  -z  # gzipped \
  -f  # read from file \
  myfile.tar.gz

and so is this:

mycommand \
  | head --bytes=-2  # remove spurious CR LF from end \
  | tail +2  # remove header

Some potential ways to allow comments AND line continuations, if that is important to you:

Contrived example featuring addthree. Find shows with free places, by adding up how many places each kind of ticket represents, and then comparing that to the theatre's capacity.

Flavour 1: 'backslash EOL' for line continuation. 'backslash space comment EOL' also for line continuation. 'EOL' for pipe.

from tickets_per_show
filter addthree \
    seats \
    couches * 2 \   # couches are double-sized
    standing / 2 \  # standing people take less room
    != max_capacity

Flavour 2: 'EOL space backslash' for line continuation. 'EOL' for pipe.

In other words, line that starts with spaces + backslash gets glued to previous line, after comment removal.

from tickets_per_show
filter addthree
    \ seats
    \ couches * 2   # couches are double-sized
    \ standing / 2  # standing people take less room
    \ != max_capacity
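The gluing rule for Flavour 2 could be sketched as a preprocessing pass, shown here in Python. This is only an illustration of the rule as described, not how the PRQL parser works; the function name `join_leading_backslash` is hypothetical, and the comment stripping naively assumes `#` never occurs inside a string literal.

```python
def join_leading_backslash(source: str) -> list[str]:
    """Glue lines that start with spaces + backslash onto the previous line."""
    logical_lines: list[str] = []
    for raw in source.splitlines():
        # Naive comment removal (assumption: no '#' inside string literals).
        code = raw.split("#", 1)[0].rstrip()
        stripped = code.lstrip()
        if not stripped:
            continue  # blank line, or a line that was entirely a comment
        if stripped.startswith("\\") and logical_lines:
            # Continuation: drop the backslash and append to the previous line.
            logical_lines[-1] += " " + stripped[1:].strip()
        else:
            logical_lines.append(code)
    return logical_lines


sample = "\n".join([
    "from tickets_per_show",
    "filter addthree",
    "    \\ seats",
    "    # \\ couches * 2   # a commented-out continuation line",
    "    \\ couches * 3   # couches are triple-sized",
    "    \\ standing / 2  # standing people take less room",
    "    \\ != max_capacity",
])
result = join_leading_backslash(sample)
print(result)
```

Because comments are removed before gluing, a continuation line that has been commented out simply disappears, and end-of-line comments on continuation lines are harmless.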

Flavour 3: infix comments

from tickets_per_show
filter addthree \
    seats \
    couches * 2   /* couches are double-sized */ \
    standing / 2  /* standing people take less room */ \
    != max_capacity

Flavour 4: line continuations allow separate-line comments, not end-of-line comments,

from tickets_per_show
filter addthree \
    seats \
    # couches are double-sized
    couches * 2   \
    # standing people take less room
    standing / 2  \
    != max_capacity

aljazerzen · 2022-12-30T09:21:04Z

Good breakdown of options for the backslash escaping route. I prefer flavour 4 the most.

But as I said, backslash looks ugly (that's just my preference) and has to be adjusted when adding a new line at the end. It's like a trailing comma. And we cannot allow trailing backslashes...

max-sixty · 2022-12-31T00:30:59Z

I agree on Flavour 4! (I think the example is missing +s; is that right? Or am I misunderstanding the example?)

Overall I'd be fine with this; it doesn't conflict with other syntax. I think we should try and reduce the amount it's needed — but sometimes it will be, as the example shows.

And agree with @aljazerzen on this point; specifically that it's not "composable", in the same way as a trailing comma isn't:

> has to be adjusted when adding a new line at the end. It's like a trailing comma. And we cannot allow trailing backslashes...

If I can suggest something for the design: trailing spaces after the backslash should be ignored. In bash, a backslash followed by a space does not count as a trailing backslash, which can be really confusing.
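Flavour 4 combined with this trailing-space rule could be sketched as a preprocessing pass, shown here in Python. It is an illustration under stated assumptions, not the PRQL implementation: the function name `join_trailing_backslash` is hypothetical, indentation is discarded for simplicity, and a `#` line counts as a comment only when it is on its own line.

```python
def join_trailing_backslash(source: str) -> list[str]:
    """Join lines ending in a backslash, skipping comment-only lines in between."""
    logical_lines: list[str] = []
    parts: list[str] = []  # pieces of the logical line currently being continued
    for raw in source.splitlines():
        # strip() also drops any spaces that follow a trailing backslash.
        text = raw.strip()
        if not text and not parts:
            continue  # blank line outside a continuation
        if parts and text.startswith("#"):
            continue  # separate-line comment inside a continuation
        if text.endswith("\\"):
            parts.append(text[:-1].rstrip())
            continue
        parts.append(text)
        logical_lines.append(" ".join(p for p in parts if p))
        parts = []
    if parts:
        # Dangling backslash on the last line: keep what was collected.
        logical_lines.append(" ".join(p for p in parts if p))
    return logical_lines


sample = "\n".join([
    "from tickets_per_show",
    "filter addthree \\",
    "    seats \\",
    "    # couches are double-sized",
    "    couches * 2   \\  ",  # note the spaces after the backslash
    "    # standing people take less room",
    "    standing / 2  \\",
    "    != max_capacity",
])
result = join_trailing_backslash(sample)
print(result)
```

An end-of-line comment after a backslash would make the line end in the comment rather than the backslash, so under this sketch it would not continue, which matches Flavour 4's "not end-of-line comments" restriction.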

max-sixty · 2023-01-09T07:09:07Z

As discussed on Discord — are we broadly in agreement on this option?

> Flavour 4: line continuations allow separate-line comments, not end-of-line comments,
>
> from tickets_per_show
> filter addthree \
>     seats \
>     # couches are double-sized
>     couches * 2   \
>     # standing people take less room
>     standing / 2  \
>     != max_capacity

Note:

> (I think the example is missing +s; is that right? Or am I misunderstanding the example?)

aljazerzen · 2023-01-09T07:38:19Z

This does seem like the best option. Nevertheless, I'd prefer not to add it, for the reasons above.

I'm -0.5 on it.

max-sixty · 2023-01-09T18:54:35Z

How would you represent a long expression, @aljazerzen? I agree the trailing backslashes are ugly! But is there a better way?

FWIW this is no longer the case:

> But they cannot be function calls in []! Which means that without much hassle, I think we can make this work:

aljazerzen · 2023-01-09T19:17:38Z

Within lists, we could make this work:

from a
select [
  sum
  b,
  avg
  b
]

... and an idea: non-pipelining parentheses!

from a
select ((
  sum
  b
))

max-sixty · 2023-01-09T19:50:14Z

> ... and an idea: non-pipelining parentheses!

The abyss just stared back at me! (niche joke / reference!)

> Within lists, we could make this work:

How about for a long expression with operators? I think that's the most likely long expression, rather than function calls. As above, we could allow ending a line with those (though it may create some odd error messages)?

I think that's possible. I'm not sure whether that's better than the trailing backslashes.

I'm also OK saying "the current state isn't great. But the options we're looking at aren't good either, we're going to leave this open for a while and reflect, rather than implementing something bad", if we think that reflection might pay off.

Or "we're OK with X, but it's not a priority to implement for us, we'll accept a PR", which is an implicit filter without us getting in the way of anyone.

I think both of those are better than leaving something hanging for ages (as you pointed out on Discord / on the call!)

snth · 2023-01-09T20:54:59Z

FWIW, my favourite is probably Flavour 1 and I'm not such a fan of Flavour 4 - it just seems harder to read to me (but I'm just speaking as a user, not an implementer).

Wouldn't Flavour 2 possibly address the "trailing comma" type issue, as only the first line is different and that's special anyway? It wouldn't be my first choice, as you can't tell on each line that there is more coming, but to me it seems acceptable if it deals with the trailing-comma issue. How would this flavour deal with commenting things out, though?

Would the following work?

from tickets_per_show
filter addthree
    \ seats
    # \ couches * 2   # couches are double-sized
    \ couches * 3   # couches are triple-sized
    \ standing / 2  # standing people take less room
    \ != max_capacity

@max-sixty 's suggestion of "leave this open and accept a PR" seems good for now.

max-sixty · 2023-01-09T22:09:44Z

> Would the following work?
>
> from tickets_per_show
> filter addthree
>     \ seats
>     # \ couches * 2   # couches are double-sized
>     \ couches * 3   # couches are triple-sized
>     \ standing / 2  # standing people take less room
>     \ != max_capacity

This actually might work! It makes the parser ambiguous for one item, but that's OK I think.

Or are there things I'm missing?

~~(I'm going to take the executive decision of filling in the +, which I'm fairly convinced are missing in error and make the examples wrong, since no one corrected me above)~~

snth · 2023-01-09T22:18:33Z

@max-sixty The reason there are no +s is the addthree function that was defined higher up.

sietse · 2023-01-10T10:56:46Z

snth is right. I felt that a long expression like seats + couches + standing risked a sidetrack into "a line ending in + is obviously incomplete" territory, which is why I made up addthree seats couches standing for the examples.

ajfriend mentioned this issue Jul 17, 2022

Syntax to filter a list of boolean expressions to be ANDed #788

Closed

max-sixty added the language-design Changes to PRQL-the-language label Jul 18, 2022

aljazerzen added the needs-discussion Undecided dilemma label Jan 8, 2023

aljazerzen removed the needs-discussion Undecided dilemma label Jan 17, 2023

max-sixty mentioned this issue Feb 22, 2023

Readability of switch statement with long lines #1922

Closed

max-sixty mentioned this issue May 11, 2023

Changing func syntax #2504

Closed

max-sixty mentioned this issue Jul 23, 2023

feat: multiline function definition syntax #2937

Merged

eitsupi mentioned this issue Aug 23, 2023

Line continuation inside parentheses #3380

Closed

max-sixty mentioned this issue Aug 25, 2023

feat: Add a line continuation character #3408

Merged

max-sixty closed this as completed in #3408 Aug 27, 2023
