Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The chain operators don't work with flow-control statements exit, return, and Throw #10967

Closed
mklement0 opened this issue Nov 1, 2019 · 38 comments
Labels
Committee-Reviewed PS-Committee has reviewed this and made a decision Issue-Question ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a Resolution-No Activity Issue has had no activity for 6 months or more WG-Engine core PowerShell engine, interpreter, and runtime

Comments

@mklement0
Copy link
Contributor

mklement0 commented Nov 1, 2019

A common idiom (in the Bash world, which inspired PowerShell's && and || operators) is to conditionally exit a script when invocation of a command fails, along the lines of:

# Assume existence of /somepath and exit, if it doesn't exist.
ls /somepath || exit 1

Currently, neither exit nor return nor throw can be used on the RHS of && / ||

Steps to reproduce

{ Get-Item /nosuch || return  } | Should -Not -Throw
{ Get-Item / && return  } | Should -Not -Throw

Expected behavior

The tests should pass.

Actual behavior

The tests fail, because return is not recognized

Expected no exception to be thrown, but an exception 
"The term 'return' is not recognized as the name of a 
cmdlet, function, script file, or operable program. ..."...

Environment data

PowerShell Core 7.0.0-preview.6
@mklement0 mklement0 added the Issue-Question ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a label Nov 1, 2019
@iSazonov iSazonov added the WG-Engine core PowerShell engine, interpreter, and runtime label Nov 1, 2019
@vexx32
Copy link
Collaborator

vexx32 commented Nov 2, 2019

cc @rjmholt

@rjmholt
Copy link
Collaborator

rjmholt commented Nov 2, 2019

Yes the PowerShell committee discussed this and it was decided against -- the RFC was updated to reflect that and it was taken out.

Given the syntactic complexity of it, I think it was actually the right decision. There was a lot of inconsistency in the edge cases of this proposal. Instead, you can use a subexpression pretty easily: Get-Item ./doesntexist || $(exit 1). It's not beautiful, but it also doesn't set up weirdness like Get-Item ./doesntexist || throw "More things" && "other stuff" &. At that point it was starting to get very jagged between correct usage and bad. So simplicity won out.

@rjmholt
Copy link
Collaborator

rjmholt commented Nov 2, 2019

Also see:

# Control flow statements
@{ Statement = 'foreach ($v in 0,1,2) { testexe -returncode $v || $(break) }'; Output = @('0', '1') }
@{ Statement = 'foreach ($v in 0,1,2) { testexe -returncode $v || $(continue); $v + 1 }'; Output = @('0', 1, '1', '2') }

@rjmholt
Copy link
Collaborator

rjmholt commented Nov 2, 2019

I did advocate fairly strongly for being able to use control flow statements in pipelines; I liked the idea and I think the concept is a nice one. Even though it didn't go in, I think it got a fair run.

But syntactically it doesn't work because the big difference between PowerShell and bash is that in PowerShell statements contain pipelines, whereas the opposite it the case in bash. Trying to shoehorn certain kinds of statement where a pipeline fits just made the syntax really unhelpful, and you'd either have to make dodgy syntax breaking changes or implement very fiddly syntax rules to massage it in.

The (badly named) subexpression syntax is designed for embedding a statement within a pipeline, so it offers this neatly.

@rjmholt rjmholt added the Review - Committee The PR/Issue needs a review from the PowerShell Committee label Nov 2, 2019
@mi-hol
Copy link

mi-hol commented Nov 3, 2019

I wonder how the fact "syntactically it doesn't work because the big difference between PowerShell and bash" can we made clear to users. I had assumed that this new feature would basically work the same in PowerShell as in bash, hence many others will very likely start with the same assumption. At least this conclusion seems 'another nail in the coffin' for the goal 'pwsh is a replacement to bash' :(

@rjmholt
Copy link
Collaborator

rjmholt commented Nov 3, 2019

But it does work:

{ Get-Item /nosuch || $(return)  } | Should -Not -Throw
{ Get-Item / && $(return)  } | Should -Not -Throw

@vexx32
Copy link
Collaborator

vexx32 commented Nov 3, 2019

@rjmholt if that already works, I don't see it being especially sensible to mandate the awkward syntax.

@rjmholt
Copy link
Collaborator

rjmholt commented Nov 3, 2019

Prior to pipeline chains, the grammar was (I’m using a simplified shorthand here):

statement:
 | ...
 | ‘return’ pipeline [‘&’]
 | pipeline [‘&’]

pipeline:
 | pipeline ‘|’ command_expression
 | expression

expression:
 | command_expression
 | ‘$(‘ statement ‘)’

command_expression: ‘Get-Item /‘

The current grammar just substitutes pipeline chains for pipelines, so that a pipeline is just a pipeline chain of length 1:

statement:
 | ...
 | ’return’ pipeline_chain [‘&’]
 | pipeline_chain [‘&’]

pipeline_chain:
 | pipeline_chain ‘&&’ pipeline
 | pipeline_chain ‘||’ pipeline
 | pipeline

pipeline: <as above>

If we allow control flow statements, then you might have this:

pipeline_chain:
 | pipeline_chain ‘&&’ chain_element
 | pipeline_chain ‘||’ chain_element
 | chain_element

chain_element:
 | ‘return’ pipeline_chain [‘&’]
 | pipeline

That gives us back gi / && return, but now we have:

  • gi / && return “Hi” && return “what?”
  • gi / && return “Job” &
  • gi / && return “Job” & &
  • if (gi / && return “Hi”) { throw “Never thrown” }

The biggest question is “what do I expect those to do” and if there’s no clear answer it’s not good syntax. So let’s fix it.

Putting control flow statements only at the end of chains fixes nothing because they already enforce themselves as the ends of chains; once you use one, the chain to the right is subordinate to return because they have a pipeline_chain under them in the grammar.

So let’s fix it instead by only allowing unchained pipelines under return. But now you can’t do return gi file && cat file.

Ok we can fix that if we allow chains when return is the start of the statement but not when we’re in a chain. But now we have two kinds of return statement and pipelines aren’t just degenerate pipeline chains, they have all these caveats.

And we still haven’t addressed how to deal with return 1 && 2 & vs 1 && return 2 &.

And we also still haven’t addressed how to deal with control flow statements in places where only pipelines working as expressions used to be expected.

In my original implementation that last was addressed by having two kinds of AST, so that return by itself wouldn’t just start working I’m an if or a while. But that meant tool implementors overriding VisitPipelineChainAst wouldn’t hit all the things that looked like &&/||, they also needed VisitStatementChain. Which was badly named since it didn’t apply to all statements. But then to make it apply to more statements we’d need to make even bigger changes in the syntax around job control and statement separation. And with each change we risk breaking compatibility with PS 5.1 and becoming more like Raku.

So it’s a bit like the conversation Roy Batty has with Eldon Tyrell; every attempt to save the situation has some ugly consequence. To save that one tree we have to move the forest around it, and we’ve introduced more problems. A programming language’s grammar is a human interface that must be above all else consistent and composable. Inserting chaotic corner cases into the grammar makes it impossible to reason about or teach. And PowerShell needs to be defensible in 5 or 10 years time.

@vexx32
Copy link
Collaborator

vexx32 commented Nov 4, 2019

Right, but if it's already possible to make it "work" by using subexpression we should probably make an effort to formalize it in a way that is reasonably intuitive. Otherwise, the behaviour of such subexpressions will always be pretty ambiguous, unless you intend to completely disable keyword use in a pipeline chain.

@SteveL-MSFT
Copy link
Member

An alternative is to make sure we include flow-control subexpressions in the examples for the documentation of this operator. Sub expressions is an existing concept and maybe users should use it more.

@mi-hol
Copy link

mi-hol commented Nov 4, 2019

@SteveL-MSFT, adding this to the documentation would be great! A search for the term gives 0 results currently!

image

or unrelated hits
image

@SteveL-MSFT
Copy link
Member

@PowerShell/powershell-committee reviewed this and we agree to that we will stay with what we agreed in the RFC which is that supporting flow-control statements complicates the implementation and using sub-expressions is the right way to use this. We should improve our documentation on sub-expressions if PowerShell users are not using them.

@SteveL-MSFT SteveL-MSFT added Committee-Reviewed PS-Committee has reviewed this and made a decision and removed Review - Committee The PR/Issue needs a review from the PowerShell Committee labels Nov 6, 2019
@mklement0 mklement0 changed the title The chain operators don't work with flow-control statements exit and return The chain operators don't work with flow-control statements exit, return, and Throw Nov 24, 2019
@mklement0
Copy link
Contributor Author

mklement0 commented Nov 24, 2019

@SteveL-MSFT, a quick tangent, though I believe it reinforces the point that we shouldn't force the awkward $(...) syntax (sorry for being late here).

Sub expressions is an existing concept and maybe users should use it more.

If anything, users should be using the (unfortunately named, as @rjmholt pointed out) sub-expression operator ($(...)) less:

@rkeithhill
Copy link
Collaborator

It is (...) that deserves promoting, but for that it needs a name

This is from the early days of PS but IIRC (...) was referred to as a grouping expression, $(...) as a subexpression and @(...) as an array subexpression.

@mklement0
Copy link
Contributor Author

Thanks, @rkeithhill.

The latter two definitely officially have that name in about_Operators - with suffix operator - but (...) isn't covered there and I wasn't aware of a name for it - is that name in any existing documentation? A quick search revealed nothing.

By coincidence, grouping operator is what I'm proposing in MicrosoftDocs/PowerShell-Docs#5154.

@rkeithhill
Copy link
Collaborator

rkeithhill commented Nov 24, 2019

It was called that in the first edition of Bruce Payette's "Windows PowerShell in Action" section 5.3 (page 119). And it's the same in the second edition (page 156).

@rkeithhill
Copy link
Collaborator

Hmm, in the second edition, it looks like "grouping expressions" refer to the set of three operators. It goes on to say:

Parentheses group expression operations and may contain either a simple expression or a simple pipeline. They may not contain more than one statement or things like while loops

For subexpressions it says:

Subexpressions group collections of statements as opposed to being limited to a single expression.

So maybe it is just parentheses operator? You might want to read through section 5.3 to see what you think.

@mklement0
Copy link
Contributor Author

@rjmholt

Thank you for putting up a good fight and for laying out the rationale.

Before I even try to understand the intricacies of the grammar, let me make a plea purely from a user's perspective:

If we force the use of $(...):

  • We burden what is probably the primary use case for && and ||.

  • New users will not expect the need for $(...) and won't necessarily know that it is what is required when foo || exit 1 fails (though documentation will help there).

  • Even once users do know about the need for $(...):

    • They will forget to use it on occasion, because of the counter-intuitive need for it.
    • When they do remember, this seemingly artificial requirement will be an ongoing source of frustration, especially since $(...) is hard to type.

As an aside: The same arguments apply to the unfortunate decision to require {...} around identifiers in order to be able to use the null-conditional member-access operator, ?..

So, for the sake of the end-users' experience, can we look for a solution that works as expected, is easy to explain, and technically not too tortu[r]ous?

Personally, I'm not worried about cases such as Get-Item ./doesntexist || throw "More things" && "other stuff" &, because I wouldn't expect many people to attempt such acrobatics.

@mklement0
Copy link
Contributor Author

I appreciate the pointers, @rkeithhill: I found "simple parenthetical notation" in that chapter for (...) specifically, and "expression and statement grouping operators" indeed seems to be the umbrella term.

In other words: (...) currently has no name, so we can and should pick one going forward: parentheses operator is problematic, because it doesn't reflect the operator's purpose, only its syntactic form.

I encourage you to join the conversation at MicrosoftDocs/PowerShell-Docs#5154 to either endorse the suggested grouping operator name or to suggest alternatives.

@rjmholt
Copy link
Collaborator

rjmholt commented Dec 4, 2019

Before I even try to understand the intricacies of the grammar, let me make a plea purely from a user's perspective

I very much understand and even sympathise, but the problem is that the grammar is the language. It's our most fine-grained API and also our primary user interface. PowerShell has pipelines contained by statements, not statements contained by pipelines. That's always been the case, and the way to put a statement within a pipeline is to use the $(...) syntax — that's its only purpose to exist outside of strings in the language. In addition to that being a huge change in PowerShell's grammar (effectively creating a fork in the language), there's no semantic support in PowerShell for general statement success; $? after if or while has no meaning.

Essentially, with such a syntactic change you're asking for a new language, with a different treatment of syntactic and semantic constructs like expressions, pipelines and statements.

this seemingly artificial requirement will be an ongoing source of frustration

Just to reiterate, it's not artificial. This isn't a case of us setting the rules because of an opinion. Programming language syntaxes are formal languages with specific requirements.

In our case, PowerShell is (mostly...) a traditionally LL(k)-parseable grammar (like most interpreted languages, since this is a big asset when you're reading the program on the fly). As the parser proceeds from left to right in the program, it decides what syntactic element it sees based on the state it's currently in (captured in our recursive descent parser mostly by the method we're in) and the element it sees right now. As the parser proceeds within each state, it has some way of deciding when to move to the next state or when to return to the state it came from. For example, in if (...) { ... }, it's expecting a statement at the outset and sees if, which in the "expect statement" state makes it transition to the "if statement state". Just as important is the final } at the end. It signals that we have finished seeing the if statement, and can now look for the next statement at that level. Every syntactic element in PowerShell (and in every formal language) has some way to declare "end" (think ), }, ;, <newline>, <end-of-file> -- just like </div> in HTML).

The problem arises because pipelines, being the default form of statement, terminate the same way as arbitrary statements. So when you embed a statement within a pipeline, it's not clear what you're terminating, unless you have a syntax to delineate the extent of the statement. That's precisely the purpose of $(...); when ) is seen, we go back to expression parsing. Without that (or some other syntax to wrap statements within pipelines), there's no terminator to say "ok statement's done, back to the pipeline state". So instead, you get a deepening syntactic hole into embedded statements (e.g. action1 && return action2 && return action3, which is grouped as action1 && return (action2 && return action3)); there's no way to indicate syntactically that the top level pipeline chain should be continued.

That's true for any implementation that removes a container syntax for statements in pipelines or pipeline chains, so it's not an implementation detail. In fact, if PowerShell used an LR parser, it would be a major issue in the parser (a reduce-reduce conflict). To quote that link from the GNU Bison docs:

A reduce/reduce conflict occurs if there are two or more rules that apply to the same sequence of input. This usually indicates a serious error in the grammar.

This is particularly important because syntaxes aren't a concept required by computers (those do fine with finite-block input like byte code and machine code). Syntaxes are a human interaction requirement, and a flaw in the syntax is as much a usability issue as it is a technical one. We could write the parser to make a call on the grammatical ambiguity (like C does), but it would still be confusing to users to pick apart write-error "bad" && return 0 || exit 1.

The same arguments apply to the unfortunate decision to require {...} around identifiers in order to be able to use the null-conditional member-access operator, ?..

I agree that the scenario is similar, but also think there that not breaking the way we parse existing valid syntax was the right way to go. Again, a change there isn't just a case of us making a decision to support a small number of wild programs out there — we've made numerous breaking changes in cases where we thought the pros outweighed the cons (not that I personally agree with all of them, or that those do or don't justify others). The issue is that changing some aspect of the tokenizer or parser means that two PowerShell versions will no longer generate the same AST for some scripts, so two PowerShells will "see" the same script differently. That means we can't even read the syntax the same way, so there's no PSScriptAnalyzer rule you can write to pick up possible issues around it; PSScriptAnalyzer will see a different syntax from one PowerShell version to the next.

Unlike C#, PowerShell can't pre-compile a different syntax to a compatible format for later execution, meaning that a syntactic change is tethered to the runtime. Once PowerShell makes a syntactic break, we reduce the number of scripts that can work against different versions, meaning users are forced to write two scripts, which is a serious issue for a shell and a scripting language. PowerShell is supposed to be dynamic enough to bend past all the differences with logic inside one script, but the syntax is the one non-negotiably static thing we have. And making a syntax change where both before and after are valid is especially pernicious, since there's no simple way to detect it, there's no warning for it and even after executing it in both environments, users might not know that a different behaviour occurred.

@mklement0
Copy link
Contributor Author

I appreciate the thoughtful, in-depth answer, @rjmholt.

Re the ?. syntax issue, I suggest we continue the conversation at #3240 (comment), where I've quoted the relevant part of your answer in my response.

@yecril71pl
Copy link
Contributor

0 && return 1

RETURN: The term 'RETURN' is not recognized as the name of a cmdlet, function, script file, or operable program.

0 && 1

0
1

Should I understand that 1 is not a term? I would rather say that RETURN is not.

@SeeminglyScience
Copy link
Collaborator

@yecril71pl I'm not sure I understand the question. Can you elaborate?

@yecril71pl
Copy link
Contributor

Why is 0 && 1 not in error?
If 1 is a term, then, according to the error message reported by 0 && return 1, it must be "the?? a name of a cmdlet, function, script file, or operable program" (note that the definite article is definitely out of place there). But 1 is none of them, at least at my place, so I conclude it must not be a term, or otherwise the error message would equally apply to it..
Also, what does an "operable program" mean? An operable application, that one I know: is an event-driven application that responds to input from the operator. If it stops responding, it becomes inoperable, a.k.a. hangs. I still do not know what an "operable program" is supposed to mean.

@SeeminglyScience
Copy link
Collaborator

SeeminglyScience commented Jul 26, 2020

Why is 0 && 1 not in error?

So in that parsing mode any CommandBaseAst is allowed. That includes:

  1. A command, like Get-ChildItem -Recurse
  2. A command expression, which is basically a wrapper for any expression.

The term expression here refers to a specific subset of language elements. In this case, it's a constant expression (the literal integer 1). Not included in this subset are statements. Statements are another subset that include things like if, return, foreach etc.

So basically because statements cannot be parsed, the snippet return 1 is parsed as a command with the literal 1 as an argument.

Also, what does an "operable program" mean? An operable application, that one I know: is an event-driven application that responds to input from the operator. If it stops responding, it becomes inoperable, a.k.a. hangs. I still do not know what an "operable program" is supposed to mean.

It just means an executable file found by command discovery. In Windows it means a file in the Portable Executable format, or just anything that can be invoked with the shell verb "Invoke" (which is pretty much any file system item).

@yecril71pl
Copy link
Contributor

The term expression here refers to a specific subset of language elements.

The error message I get gives me a specific subset of that subset, and that subset does not cover 1.

So basically because statements cannot be parsed, the snippet return 1 is parsed as a command with the literal 1 as an argument.

I do not think it should be parsed as a command because RETURN is not a bareword, it is a keyword, and it cannot begin a command.

Also, what does an "operable program" mean? An operable application, that one I know: is an event-driven application that responds to input from the operator. If it stops responding, it becomes inoperable, a.k.a. hangs. I still do not know what an "operable program" is supposed to mean.

It just means a command found by command discovery. Usually an executable application in %PATH%, function, binary cmdlet, alias, or script file.

So PowerShell has its own definition what "operable" means. We should definitely change that.

@SeeminglyScience
Copy link
Collaborator

The error message I get gives me a specific subset of that subset, and that subset does not cover 1.

Not sure what you mean. If you'd like clarification on something I've said feel free to ask.

I do not think it should be parsed as a command because RETURN is not a bareword, it is a keyword, and it cannot begin a command.

Anywhere where statements are not allowed it will be parsed as a command. If you'd like to know more about why statements are not allowed in this syntax, please read through the rest of the thread. After that if you have questions about a specific part feel free to ask.

So PowerShell has its own definition what "operable" means. We should definitely change that.

That's out of scope for this issue. If you'd like to discuss that I'd recommend opening a new one.

@yecril71pl
Copy link
Contributor

yecril71pl commented Jul 26, 2020

I do not think it should be parsed as a command because RETURN is not a bareword, it is a keyword, and it cannot begin a command.

Anywhere where statements are not allowed it will be parsed as a command. If you'd like to know more about why statements are not allowed in this syntax, please read through the rest of the thread. After that if you have questions about a specific part feel free to ask.

I think it is a wrong thing to do, UX-wise. I would expect 0 && return 1 to fail even if there is some return.exe in the executable search path.

@SeeminglyScience
Copy link
Collaborator

I think it is a wrong thing to do. I would expect 0 && return 1 to fail even if there is some return.exe in the executable search path.

Yeah I agree. Problem is that logic is very heavily relied on with the foreach alias pointing to ForEach-Object. That logic can't be changed without breaking a lot of folks.

@yecril71pl
Copy link
Contributor

So, returning to the problem at hand, if we allow && return then we should also allow && foreach, and then we rely on the fact that nobody would use Foreach-Object as the first pipline element, amirite? But what about nutjobs that would write 0 && foreach { $_ } -I 0 , and then they are broken?

@SeeminglyScience
Copy link
Collaborator

SeeminglyScience commented Jul 26, 2020

So, returning to the problem at hand, if we allow && return then we should also allow && foreach, and then we rely on the fact that nobody would use Foreach-Object as the first pipline element, amirite?

Believe it or not plenty of folks do. Especially if they are taking a ScriptBlock from an external source and want to emulate a process block. There are better ways to do it, but using ForEach-Object in the first pipeline slot is a pretty common way to do it.

0 && foreach { $_ } -I 0 , and then they are broken?

You got it. Not only would that no longer work, it would be a parse time error :/

Copy link
Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

2 similar comments
Copy link
Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

Copy link
Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

@microsoft-github-policy-service microsoft-github-policy-service bot added Resolution-No Activity Issue has had no activity for 6 months or more labels Nov 16, 2023
Copy link
Contributor

This issue has been marked as "No Activity" as there has been no activity for 6 months. It has been closed for housekeeping purposes.

@amis92
Copy link

amis92 commented Nov 23, 2023

It's definitely not completed...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Committee-Reviewed PS-Committee has reviewed this and made a decision Issue-Question ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a Resolution-No Activity Issue has had no activity for 6 months or more WG-Engine core PowerShell engine, interpreter, and runtime
Projects
None yet
Development

No branches or pull requests

10 participants