Can we allow compound statements (loops and conditionals) as the first pipeline segment and as sub-expressions too? #6817

Closed
mklement0 opened this issue May 3, 2018 · 14 comments
Labels
Issue-Discussion the issue may not have a clear classification yet. The issue may generate an RFC or may be reclassif

Comments

@mklement0
Contributor

mklement0 commented May 3, 2018

In the world of assignments, compound statements (loops and conditionals) can be used as value-returning expressions too:

$str = 'hi'  # simple expression
$flag = if ($str -eq 'hi') { 1 } else { 0 }  # flow-control statement works too; returns 1
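Loops and switch statements behave the same way on the right-hand side of an assignment. A minimal sketch (the variable names are illustrative only):

```powershell
# A foreach loop as an assignment value: all loop output is collected in an array.
$squares = foreach ($i in 1..3) { $i * $i }

# A while loop works too; `$n--` as a statement produces no output.
$n = 3
$countdown = while ($n -gt 0) { $n; $n-- }

# A switch statement returns the output of the first matching branch here.
$label = switch (5) { { $_ -lt 10 } { 'small' } default { 'large' } }

$squares -join ','     # 1,4,9
$countdown -join ','   # 3,2,1
$label                 # small
```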

By contrast, in the context of a pipeline or as sub-expressions, flow-control statements do not work as-is:

# Simple expressions: OK as the 1st pipeline segment / with redirections.
'hi' > out.txt
'hi' | Out-File out.txt

# Flow-control expressions: do NOT work (directly) as the 1st pipeline segment.
# What follows the flow-control expression is treated as a *new statement* (and these 
# cause a syntax error here):
if ($str -eq 'hi') { 1 } else { 0 } > out.txt
if ($str -eq 'hi') { 1 } else { 0 } | Out-File out.txt

# Similarly, trying to use a flow-control statement as a *sub-expression* 
# (part of a larger expression), fails:
(foreach ($i in 1..3) { $i }) -join ', '   # !! Only works with $(...), not just (...)

You can work around this:

  • for use in the pipeline: by enclosing the flow-control statement in either & { ... } or . { ... }, which preserves streaming output.
  • for use in expressions: by enclosing the flow-control statement in $(...) or @(...), which collects all output up front.
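As a sketch of both workarounds, reusing $str from the examples above (the other variable names are made up for illustration):

```powershell
$str = 'hi'

# Pipeline use: wrap the statement in a script block and invoke it with &
# (child scope) or . (current scope); output streams to the next segment.
$viaCall = & { if ($str -eq 'hi') { 1 } else { 0 } } | ForEach-Object { $_ * 10 }
$viaDot  = . { foreach ($i in 1..3) { $i } } | ForEach-Object { $_ + 1 }

# Expression use: $(...) or @(...) collects all output first, then the
# result takes part in the larger expression.
$joined = $(foreach ($i in 1..3) { $i }) -join ', '
$count  = @(foreach ($i in 1..3) { $i }).Count

$viaCall             # 10
$viaDot -join ', '   # 2, 3, 4
$joined              # 1, 2, 3
$count               # 3
```

Note that & runs the block in a child scope, so variables assigned inside it do not survive, whereas . runs it directly in the caller's scope.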

However, the need for that is not obvious, and it is somewhat cumbersome, and in the case of sub-expressions carries a performance penalty.

There may be parsing challenges and ambiguities I'm not considering, but perhaps compound statements can be treated the same as simple expressions in these contexts, allowing their direct use as the first pipeline segment / redirection source.

In other words: If compound statements were bona fide expressions, the above problems would go away (though streaming behavior should be retained when used as the 1st pipeline segment).

P.S.:

  • Is there a better or more established term for compound statements?
  • This issue was inspired by this SO question.

Environment data

Written as of:

PowerShell Core v6.0.2
@BrucePay BrucePay added the Issue-Discussion the issue may not have a clear classification yet. The issue may generate an RFC or may be reclassif label May 3, 2018
@vexx32
Collaborator

vexx32 commented May 4, 2018

This kind of raises the question: why can't we use these anywhere in the pipeline directly? It would effectively remove the need for things like Where-Object entirely.

@mklement0
Contributor Author

mklement0 commented May 4, 2018

@vexx32:

It makes sense to limit use of expressions to the first pipeline segment, because while expressions create output, they are not prepared to handle pipeline input.
In other words: they can start a pipeline, but don't fit in the middle or at the end of a pipeline.

In the world of expressions you already have ForEach-Object and Where-Object analogs: the .ForEach() and .Where() collection "operators" (methods); e.g.:

PS> (1..3).ForEach({ $_ + 1 })
2
3
4
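The .Where() counterpart works the same way. A short sketch (variable names are illustrative), which also shows that the two methods can be chained:

```powershell
# .Where() filters like Where-Object, but as a method on an in-memory collection.
$big = (1..5).Where({ $_ -gt 3 })

# The two methods chain: keep the odd numbers, then double each one.
$doubledOdds = (1..5).Where({ $_ % 2 -eq 1 }).ForEach({ $_ * 2 })

$big -join ', '           # 4, 5
$doubledOdds -join ', '   # 2, 6, 10
```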

An RFC proposal (of mine) suggests surfacing these methods as bona fide PowerShell operators, which would allow you to write:

PS> 1..3 -foreach { $_ + 1 }
2
3
4

@mklement0 mklement0 changed the title Can we allow flow-control expressions (loops and conditionals) as the first pipeline segment too? Can we allow flow-control statements (loops and conditionals) as the first pipeline segment and as sub-expressions too? Dec 2, 2018
@mklement0 mklement0 changed the title Can we allow flow-control statements (loops and conditionals) as the first pipeline segment and as sub-expressions too? Can we allow compound statements (loops and conditionals) as the first pipeline segment and as sub-expressions too? Oct 5, 2019
@mklement0
Contributor Author

In #10967 (comment), @rjmholt has explained why what the initial post asks for is not possible with the current grammar:

PowerShell has pipelines contained by statements, not statements contained by pipelines. That's always been the case, [...]

In short, if I understand correctly:

  • What I call a compound statement in the initial post is one form of a statement.
  • Pipelines are another (which includes expressions by themselves and expressions as the 1st pipeline segment).

Currently, never the twain shall meet, sadly.
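One way to see this split is to inspect the AST that the parser produces. The following is only a sketch using the public System.Management.Automation.Language types; nothing in it is executed beyond the parsing itself, and the variable names are illustrative:

```powershell
# Each script-block literal exposes its parsed AST; take the first statement of each.
$ifStmt   = { if ($true) { 1 } }.Ast.EndBlock.Statements[0]
$pipeStmt = { 'hi' | Write-Output }.Ast.EndBlock.Statements[0]

# Both are statements, but of different kinds.
$ifStmt.GetType().Name     # IfStatementAst
$pipeStmt.GetType().Name   # PipelineAst

# The pipeline is contained by the statement: the if condition is itself a pipeline.
$ifStmt.Clauses[0].Item1 -is [System.Management.Automation.Language.PipelineBaseAst]   # True

# The first pipeline segment is an expression, the second a command.
$pipeStmt.PipelineElements[0].GetType().Name   # CommandExpressionAst
$pipeStmt.PipelineElements[1].GetType().Name   # CommandAst
```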

Essentially, with such a syntactic change you're asking for a new language, with a different treatment of syntactic and semantic constructs like expressions, pipelines and statements.

Asking largely hypothetically, @rjmholt - I do recognize how such a change would be the mother of all changes:

  • If we could start from scratch, would treating what I've called compound statements the same as expressions be feasible (that is, also allow them in a pipeline, but only as the first segment)?

  • Would any current code break, if we did?

In other words: Would a grammar such as the following work (adapted from #10967 (comment))?

pipeline:
 | ‘return’ pipeline [‘&’]
 | pipeline [‘&’]
 | pipeline ‘|’ command_expression
 | expression

expression:
 | ...             # true expressions such as `1+2` or `'1,2' -split ','`
 | command_expression
 | compound_expression

compound_expression:
 | `foreach` ...
 | `while` ...
 | `if` ...
 # ...

command_expression: 
 | ‘Get-Item /’   # e.g.

@SeeminglyScience
Collaborator

Asking largely hypothetically, @rjmholt - I do recognize how such a change would be the mother of all changes:

  • If we could start from scratch, would treating what I've called compound statements the same as expressions be feasible (that is, also allow them in a pipeline, but only as the first segment)?

Yeah if you're starting from scratch, pretty much anything is feasible.

  • Would any current code break, if we did?

All third party AST based tooling for one. The shape of all of those API's would be dramatically different.

As for PowerShell scripts specifically, also yeah. I mean it's probably possible to redesign the language and rewrite the parser without breaking anything, but it'd be one hell of a trick.

@mklement0
Contributor Author

Thanks, @SeeminglyScience.

By contrast, in the context of introducing && and || the grammar had to be modified without breaking anything(?); what are your thoughts on the impact of the change I've proposed in #10967 (comment)? (Please comment there, if you're up for it.)

@SeeminglyScience
Collaborator

@mklement0 I'm not sure what you're asking for in that issue, from a technical standpoint. How do you propose that be done without making return and exit something other than statements?

I think you'd have to make them pipelines or expressions, which would also be a breaking change for tooling. Maybe you could introduce new ASTs that are the expression/pipeline version of those statements but that makes it very confusing imo.

On a more subjective note, I also just don't like how hard it would be to read. I know PowerShell has historically been all about letting users shoot themselves in the foot, and that can be great. However, after the whole question mark in variable names situation I'm not keen on the idea of adding more features like that.

@mklement0
Contributor Author

mklement0 commented Dec 24, 2019

return wouldn't have to change (from what it was before the current && / || implementation), only exit would be the exception - it would be allowed to moonlight as a non-initial link in a pipeline chain, which is what users will - sensibly - expect.

Note that shooting yourself in the foot is already possible; that is, exit works as the first link of a pipeline chain:

# This happily exits your session, irrespective of what comes after the `&&`
exit 0 && Get-Item /foo

That the - perfectly sensible and common - reversal does not work is why I'm proposing the exception:

# Doesn't work - requires $(...) around `exit 0`; ditto with `return`
Get-Item /foo && exit 0 

I think you'd have to make them pipelines or expressions, which would also be a breaking change for tooling. Maybe you could introduce new ASTs that are the expression/pipeline version of those statements but

It is the latter I was thinking of, but I'm definitely out of my depth here.

that makes it very confusing imo.

It may make the implementation more confusing, of necessity (too late to change the fundamentals), but to me it definitely lessens the confusion for the users.

letting users shoot themselves in the foot
I also just don't like how hard it would be to read.

I think that my proposal helps with both aspects, because in the current implementation:

  • That the return in return ls /foo || Write-Host 'Continuing?' applies to the whole chain, and not just to a chain link is unexpected and confusing.

  • That the very common idioms ls /foo || exit 1 and ls /foo || return must be written as ls /foo || $(exit 1) and ls /foo || $(return) is unexpected, confusing, and cumbersome (a hat-trick).
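The difference between the two spellings can be checked without running (or exiting) anything, by asking the parser for errors only. This is a sketch that assumes PowerShell 7 or later, where && and || exist; Get-ParseErrorCount is a hypothetical helper name, not a built-in command:

```powershell
# Hypothetical helper: returns the number of parse errors for a snippet.
# The snippet is only parsed, never executed.
function Get-ParseErrorCount {
    param([string] $Source)
    $tokens = $null
    $errors = $null
    $null = [System.Management.Automation.Language.Parser]::ParseInput(
        $Source, [ref] $tokens, [ref] $errors)
    $errors.Count
}

# Per the discussion above, the bare form should report at least one parse error ...
Get-ParseErrorCount 'ls /foo || exit 1'

# ... whereas the $(...) form should parse cleanly.
Get-ParseErrorCount 'ls /foo || $(exit 1)'
Get-ParseErrorCount 'ls /foo || $(return)'
```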

Note that in the context of PowerShell there is no behavioral precedent to adhere to in these situations: These are new features, and the behavior I'm proposing will not only make it easier for Bash users to use them, but, I believe, generally makes more sense and is easier to conceptualize than the current implementation.

@SeeminglyScience
Collaborator

return wouldn't have to change (from what it was before the current && / || implementation), only exit would be the exception - it would be allowed to moonlight as a non-initial link in a pipeline chain, which is what users will - sensibly - expect.

I'm not following, why only exit? That makes it even harder to understand the language rules.

Note that shooting yourself in the foot is already possible; that is, exit works as the first link of a pipeline chain:

# This happily exits your session, irrespective of what comes after the `&&`
exit 0 && Get-Item /foo

Okay but how do you expect that to work? Because if you change 0 to 10 the result doesn't change. The exit statement is taking the pipeline 0 && Get-Item /foo, executing it, and then using the results (an array of @(0; Get-Item /foo)) as the "exit code" (which it doesn't know how to interpret and uses 0 instead). Don't think of it as the first link, think of it as a modifier of the whole chain. It's basically like doing this:

$myInvalidExitCode = 0 && Get-Item /foo
exit $myInvalidExitCode
# or
exit $(0 && Get-Item /foo)
# also, this exits with exit code 30
exit 0 && $(exit 30)

And that's kind of the point, the problem isn't "how would you make it work for later portions of the chain" it's how would you fit it into an actual link, changing the behavior significantly.
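The "modifier of the whole chain" reading can be made visible in the AST. A sketch, again assuming PowerShell 7 or later and only parsing the text, so the exit never runs:

```powershell
$tokens = $null
$errors = $null
$ast = [System.Management.Automation.Language.Parser]::ParseInput(
    'exit 0 && Get-Item /foo', [ref] $tokens, [ref] $errors)

# The single top-level statement should be the exit statement ...
$exitStmt = $ast.EndBlock.Statements[0]
$exitStmt.GetType().Name

# ... and its operand should be the entire chain, not just the literal 0.
$exitStmt.Pipeline.GetType().Name
$exitStmt.Pipeline.Extent.Text
```

If the description above holds, the three lines print ExitStatementAst, PipelineChainAst, and the text `0 && Get-Item /foo`.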

It may make the implementation more confusing, of necessity (too late to change the fundamentals), but to me it definitely lessens the confusion for the users.

I hear ya, but I don't agree. I think it makes the language a lot more confusing past the surface level, and any surface level benefit doesn't outweigh the cost of the complexity.

  • That the return in return ls /foo || Write-Host 'Continuing?' applies to the whole chain, and not just to a chain link is unexpected and confusing.
  • That the very common idioms ls /foo || exit 1 and ls /foo || return must be written as ls /foo || $(exit 1) and ls /foo || $(return) is unexpected, confusing, and cumbersome (a hat-trick).

Yeah I agree that it's unfortunate. I still don't think it makes sense for PowerShell.

Note that in the context of PowerShell there is no behavioral precedent to adhere to in these situations:

They are extensions of pipelines which currently follow the same rules.

These are new features, and the behavior I'm proposing will not only make it easier for Bash users to use them, but, I believe, generally makes more sense and is easier to conceptualize than the current implementation.

Probably right in regards to bash users, which is very regrettable. I don't agree with the rest though. My personal opinion is that it should stay how it is. I understand the arguments on both sides fully, I just don't agree on this one.

@mklement0
Contributor Author

mklement0 commented Dec 24, 2019

I'm not following, why only exit?

My thought was: If we change the && / || implementation to how return used to work - i.e., returning a single pipeline's result - nothing needs to change in pipeline chains - but I now realize that return is not itself part of a pipeline and therefore indeed requires an exception too.

With the exit exception also in place, you then get consistent behavior for return and exit: both only apply to their chain link, not to the rest of the chain.

Okay but how do you expect that to work?

In that vein: I expect exit to only apply to the first link, and, by the nature of exit, for the second link to be ignored.

That exit 0 && Get-Item /foo would evaluate 0 && Get-Item /foo as a whole and pass the output to exit is, frankly, baffling to me.
(With return it's debatable, but there too only passing 0 - i.e. not crossing chain-link boundaries - makes much more sense to me).

As an aside: that exit quietly ignores an invalid exit code and defaults to 0(!) is a problem in itself.
Even the fact that you can pass a (single) whole pipeline to both return and exit is probably not widely known, if I were to guess - most real-world uses I've seen pass a variable / literal.

Scoping return and exit to a chain link is what most users will likely expect - not just coming from Bash - and it allows for the natural foo || exit / foo || return syntax.

As stated, conditionally exiting (returning from) the current scope based on individual pipelines' outcomes is a primary use case for && / || chains; passing entire chains to exit / return contravenes that goal.

My hunch is that most PowerShell users will not only naturally assume that foo || exit works, but will also be oblivious to the fact that this use of exit doesn't fit the fundamentals of the grammar.

any surface level benefit

To me, ensuring that a feature makes sense to users is much more than a surface-level benefit.

They are extensions of pipelines which currently follow the same rules.

I don't think of them as extensions to pipelines, but as a conditional sequencing of them; that is, independent pipelines are combined.

Other than being forced by historical design decisions ("The current grammar just substitutes pipeline chains for pipelines" - #10967 (comment)), I see no justification for the current behavior.

That said, it may well be that what I'm asking for is still too much to shoehorn into / breaks the current grammar - a question I cannot answer by myself:

statement:
 | ...
 | ‘return’ pipeline [‘&’]
 | ‘exit’ pipeline
 | pipeline [‘&’]
 | pipeline_chain

pipeline_chain:
 | pipeline_chain ‘&&’ pipeline [‘&’]
 | pipeline_chain ‘&&’ ‘return’ pipeline [‘&’]
 | pipeline_chain ‘&&’ ‘exit’ pipeline
 | pipeline_chain ‘||’ pipeline [‘&’]
 | pipeline_chain ‘||’ ‘return’ pipeline [‘&’]
 | pipeline_chain ‘||’ ‘exit’ pipeline
 | pipeline

@SeeminglyScience
Collaborator

I'm not following, why only exit?

My thought was: If we change the && / || implementation to how return used to work - i.e., returning a single pipeline's result - nothing needs to change in pipeline chains - but I now realize that return is not itself part of a pipeline and therefore indeed requires an exception too.

With the exit exception also in place, you then get consistent behavior for return and exit: both only apply to their chain link, not to the rest of the chain.

And inconsistent behavior with all other keywords, and with how all other language elements work with return and exit. From an implementation standpoint, the only thing I can think of that would be within the realm of feasibility would be allowing any statement to be chained (but that has its own problems).

Okay but how do you expect that to work?

In that vein: I expect exit to only apply to the first link, and, by the nature of exit, for the second link to be ignored.

That exit 0 && Get-Item /foo would evaluate 0 && Get-Item /foo as a whole and pass the output to exit is, frankly, baffling to me.
(With return it's debatable, but there too only passing 0 - i.e. not crossing chain-link boundaries - makes much more sense to me).

🤷‍♂ that's how everything else works. I get the confusion in reference to other languages, but it just doesn't make sense for PowerShell imo.

As an aside: that exit quietly ignores an invalid exit code and defaults to 0(!) is a problem in itself.
Even the fact that you can pass a (single) whole pipeline to both return and exit is probably not widely known, if I were to guess - most real-world uses I've seen pass a variable / literal.

return working that way is pretty widely known from what I've seen. exit is a bit surprising, though tbh I'm not really sure what I'd rather it do. Too late to change now anyway.

Scoping return and exit to a chain link is what most users will likely expect - not just coming from Bash - and it allows for the natural foo || exit / foo || return syntax.

As stated, conditionally exiting (returning from) the current scope based on individual pipelines' outcomes is a primary use case for && / || chains; passing entire chains to exit / return contravenes that goal.

Yeah I'm not disputing that. When I say that I don't think it makes sense for PowerShell, and that it is not feasible, I'm saying that with this in mind. I understand all of the reasons why on the surface this seems like the obvious right move, but I strongly disagree that it is.

@mklement0
Contributor Author

🤷‍♂ that's how everything else works

Before && and || there was no everything else. There was just a single pipeline you could pass to return (which you say is well-known; as an inconsequential aside: I hadn't heard of it until this discussion, and don't recall seeing it on Stack Overflow) and to exit (which we agree is uncommon).

And that wouldn't go away with what I'm proposing.

No end user previously had to think about the fact that sticking a return or exit in front of a pipeline technically made it a statement and that that means you can't use the whole thing as a pipeline - the question simply didn't arise in the absence of pipeline chains.

From an end user's perspective, conceiving of a pipeline chain as a sequence of pipelines each optionally preceded by return or exit - or made up of just those keywords - makes much more sense than conceiving of return or exit as something you stick in front of an entire chain: the whole point of a chain is conditional (exit) behavior, depending on what links execute.

And I don't think the proposed behavior contravenes the spirit of PowerShell in any way.

It does contravene the current implementation, however, and I get that how chaining was implemented fits in best with that.

And inconsistent behavior with all other keywords

Not if you think of exit and return as something you can stick in front of a pipeline - which you always could do - whereas you could never do that with any of the other keywords.

From an implementation standpoint, the only thing I can think of that would be within the realm of feasibility would be allowing any statement to be chained (but that has its own problems).

While I definitely wish we could also use compound statements such as foreach, while, ... in a pipeline - as the first segment only, like expressions (the original topic of this thread) - and therefore also in a chain link, my understanding is that this is what would constitute "the mother of all changes" and is therefore off the table.

I can also see how implementing just the return-and-exit-per-chain-link proposal based on the current implementation without breaking anything may turn out to be too challenging and too much of a maintenance burden (I can't personally assess that).

But it is clear to me that the current chain implementation was dictated by the limitations of the original grammar - whose subtleties most users are probably unaware of - not by what would make the feature most useful to end users.

@mklement0
Contributor Author

To bring closure to the question asked in the OP, based on @rjmholt's feedback in #10967 (comment):
In the context of the current grammar, the suggested change isn't possible.

@mklement0
Contributor Author

I've summarized my conclusions from this exchange with respect to && and || in #10967 (comment).

P.S.: Just noticed the thumbs-down on my previous comment, @SeeminglyScience:

I get not wanting to expend more energy on a discussion at a certain point (when you feel like the disagreement is understood but no shared understanding can be reached, when you feel like not being heard, when the conversation is going around in circles, ...), but a thumbs-down as the sole feedback on a comment where multiple points were argued in detail just tells us "I don't like this" - and nothing else; it is a gesture of opposition without content.

I value your expertise, especially in areas where my knowledge is superficial, but the overall tone of this exchange left a sour aftertaste.

@SeeminglyScience
Collaborator

Sorry, I didn't mean any disrespect. Thumbs down is less common here than in other repos, so it might have a connotation that I didn't intend.

That said, it's often stated that decisions are largely made based on community consensus. My intent was to indicate that I disagree without bumping the thread or paraphrasing the reasons why. You could argue that it was clear I disagreed from context, but to be honest it seems like context is often lost in committee meetings.
