Suggestion: implement null-coalescing, null-conditional access (null-soaking), null-conditional assignment #3240

Open
mklement0 opened this issue Mar 2, 2017 · 66 comments

Comments

@mklement0
Contributor

commented Mar 2, 2017

Null-coalescing and null-conditional access (null-soaking) would be handy additions to the language.

Update: @BrucePay additionally suggests null-conditional assignment, with ?= - see comment below.

For instance, instead of writing:

if ($null -ne $varThatMayBeNull) { $varThatMayBeNull } else { $fallbackValue }
#
if ($null -ne $varThatMayBeNull) { $varThatMayBeNull.Name } else { $null }

one might be able to write:

$varThatMayBeNull ?? $fallbackValue  # null-coalescing
# 
$varThatMayBeNull?.Name   # null-conditional access, for Set-StrictMode -Version 2+
$varThatMayBeNull?[42]  # ditto, with indexing

Re null-conditional access: With Set-StrictMode being OFF (the default), you can just use $varThatMayBeNull.Name - no special syntax needed; however, if Set-StrictMode -Version 2 or higher is in effect, $varThatMayBeNull.Name would break and that's where the null-conditional operator (?.) is helpful, to signal the explicit intent to ignore the $null in a concise manner.
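To make the strict-mode difference concrete, here is a minimal sketch using only existing syntax. The variable names `$relaxed` and `$strictFailed` are made up for illustration.

```powershell
$varThatMayBeNull = $null

# Strict mode off (the default): property access on $null quietly yields $null.
Set-StrictMode -Off
$relaxed = $varThatMayBeNull.Name

# Strict mode version 2 or higher: the same access raises an error,
# which is the case the proposed ?. operator is meant to cover.
Set-StrictMode -Version 2
$strictFailed = $false
try {
    $null = $varThatMayBeNull.Name
} catch {
    $strictFailed = $true
}

# Restore the default so later code is unaffected.
Set-StrictMode -Off
```

After this runs, `$relaxed` is `$null` and `$strictFailed` is `$true`.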


Open question:

$varThatMayBeNull?[42] handles the case where the variable is $null, but if it isn't, an array element with the specified index must exist.

It would therefore also be helpful to make indexing null-conditional - something that C# does not support, incidentally (you have to use .ElementAtOrDefault()).

The two basic choices are:

  • Come up with additional syntax that explicitly signals the intent to ignore a non-existent index:

    • The question is what syntax to choose for this, given that ?[...] and [...]? are not an option due to ambiguity.
    • Perhaps [...?], but that seems awkward.
  • Rely on the existing behavior with respect to accessing non-existent indices, as implied by the Set-StrictMode setting - see table below.
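Until something like this exists, a helper function can approximate "index if present, otherwise a default". The sketch below is only an illustration: `Get-ElementOrDefault` and its parameter names are hypothetical, not an existing cmdlet, and it is loosely modeled on the idea behind C#'s .ElementAtOrDefault().

```powershell
# Hypothetical helper: returns the element at $Index, or $Default when the
# collection is $null or the index is out of range.
function Get-ElementOrDefault {
    param(
        [object[]] $Collection,
        [int] $Index,
        $Default = $null
    )
    if ($null -eq $Collection -or $Index -lt 0 -or $Index -ge $Collection.Count) {
        return $Default
    }
    return $Collection[$Index]
}

$inRange    = Get-ElementOrDefault -Collection @(10, 20, 30) -Index 1
$outOfRange = Get-ElementOrDefault -Collection @(10, 20, 30) -Index 42 -Default 'none'
$nullInput  = Get-ElementOrDefault -Collection $null -Index 0
```

Because the bounds are checked explicitly, the helper behaves the same regardless of the Set-StrictMode setting.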


Related: implement ternary conditionals

@iSazonov

Collaborator

commented Oct 8, 2017

@mklement0 It seems the second sample works by design without "?".
Maybe change to name()?

@mklement0

Contributor Author

commented Oct 8, 2017

@iSazonov: Good point, but it only works without the ? if Set-StrictMode -Version 2 or higher is not in effect - I've updated the initial post to make that clear.

@TheIncorrigible1

commented Jul 24, 2018

I think syntactic sugar is something powershell could use that other languages enjoy. +1 on the proposal.

@kfsone

commented Jul 25, 2018

Consider the following samples from other languages:

a="${b:-$c}"
a = b || c;
a = b or c
a := b ? c
a = b ? b : c;

even

a = b if b else c

is better than

if (b) { a = b } else { a = c }

and often times you need to clarify with the additionally verbose

a=(if (b -ne $null) { b } else { c })

Makes one feel all dirty and bashed.

@thezim

Contributor

commented Aug 17, 2018

Found myself doing this today as a workaround.

$word = ($null, "two", "three").Where({$_ -ne $null}, "First")

@bgshacklett

commented Aug 18, 2018

I use a similar pattern:

$word = ($null, "two", "three" -ne $null)[0]
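For anyone puzzling over why these two workarounds produce "two", here is a small sketch that spells out the mechanics. The variable names are illustrative only.

```powershell
$candidates = $null, 'two', 'three'

# .Where() with the 'First' mode stops at the first element passing the
# test and returns a one-element collection, hence the trailing [0].
$viaWhere = $candidates.Where({ $null -ne $_ }, 'First')[0]

# The comma operator binds more tightly than -ne, so the original one-liner
# parses as (($null, 'two', 'three') -ne $null). With an array on the left,
# -ne acts as a filter and returns the elements that are not $null.
$viaFilter = ($candidates -ne $null)[0]

# Only $null is skipped; other "falsy" values such as 0 survive the filter.
$keepsZero = (($null, 0, 'x') -ne $null)[0]
```

Both patterns rely on array semantics rather than a dedicated operator, which is part of why they are hard to read at a glance.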

@BrucePay

Collaborator

commented Aug 19, 2018

@kfsone There is a small error in your example $a=(if ($b -ne $null) { $b } else { $c }). It should be $a=$(if ($b -ne $null) { $b } else { $c }); however, the only version of PowerShell where $( ) was required was version 1. From v2 on, you can simply do:

$a = if ($b) { $b } else { $c }
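One caveat worth sketching: `if ($b)` tests truthiness, not null-ness, so the short form and the explicit `$null` comparison can disagree for values such as 0 or an empty string. The variable names below are illustrative.

```powershell
$b = 0
$c = 'fallback'

# Truthiness test: 0 converts to $false, so the fallback is chosen.
$truthy = if ($b) { $b } else { $c }

# Explicit null test: 0 is not $null, so $b is kept.
$nullChecked = if ($null -ne $b) { $b } else { $c }
```

Here `$truthy` ends up as 'fallback' while `$nullChecked` stays 0, which is the distinction a dedicated null-coalescing operator would make explicit.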

@TheIncorrigible1

commented Aug 21, 2018

@bgshacklett Don't you mean $word=($null,'two')[$null -ne 'three']?

It's a bit unfortunate this has gone from 6.1 to 6.2 to "Future".

@bgshacklett

commented Aug 21, 2018

@TheIncorrigible1 No, if you copy and paste what I added above, you should see that the value of $word is set to "two". There's more detail in this Stack Overflow answer where I first saw the pattern:
https://stackoverflow.com/a/17647824/180813

@kfsone

commented Aug 21, 2018

While perlesque hack-arounds appeal to the 20-year ago me that wrote a shebang that registered and/or queried a RIPE-DB user or organization record, what I'm hoping for from powershell is something that encourages my colleagues to use the language rather than instill fear in them.

My litmus test is this: Would I want to read this via my tablet at 3am New Year's morning while hung over with the CEO on the phone crying at me how many millions of dollars we are losing a second?

(Aside: this was only a loose exaggeration of an actual experience until I worked at Facebook and came back from an urgent quick leak to be told that, in the 2 minutes I was gone, more people than the population of Holland had gotten an empty news feed. It wasn't my code and the mistake was a minute change in the semantics of return x vs return (x) in the c++ standard for very specific template cases, and "would you want to read this code with a 2 minute deadline, a full bladder and the fate of every clog wearing Dutch person's cat pictures on your shoulder???" didn't sound as cool)

@vexx32

Contributor

commented Aug 21, 2018

Yup. There are some neat tricks possible in PS that are great in a pinch or if you need a one-time shorthand.

For maintainable code, ideally we should have explicit null-coalescing operators like C#. The main trouble there for me is -- what do we use for those? ? is already aliased to Where-Object (much as I'd love for that to be erased, it's in very common use). Mind you, % is aliased to ForEach-Object but that doesn't hinder modulus operations, so in theory at least having ? be a null-coalescing operator as well would potentially be fine; it would only be interpreted as such in expressions, where Where-Object isn't really valid anyway.
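As a quick illustration of the double duty mentioned above, the sketch below uses `?` and `%` as command aliases and `%` as the modulus operator in the same pipeline. The variable name is illustrative.

```powershell
# '?' (Where-Object) and '%' (ForEach-Object) appear in command position,
# while '%' inside the script block is parsed as the modulus operator.
$result = 1..6 | ? { $_ % 2 -eq 0 } | % { $_ * 10 }
```

This yields 20, 40, 60; the parser tells the alias and the operator apart purely by position.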

@mklement0

Contributor Author

commented Aug 21, 2018

@vexx32:

At least syntactically ?? should be fine, because we're talking about expression mode, whereas ?, as a command [alias], is only recognized in argument mode.
Not sure if reuse of the symbol causes confusion, but it wouldn't be the first time that a symbol does double duty in different contexts.

Surprisingly, using ?. and ?[] for null-soaking would technically be a breaking change, because PowerShell currently allows ? as a non-initial character in a variable name.

PS> $foo? = @{ bar = 1 }; $foo?.bar   # !! $foo? is a legal variable name
1

However, I hope this would be considered a Bucket 3: Unlikely Grey Area change.
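For completeness, a minimal sketch showing that the brace syntax names the same variable, which is the escape hatch that would remain for such names:

```powershell
# '?' is accepted as a non-initial character in a variable name.
$foo? = @{ bar = 1 }

# The braced form refers to the very same variable, with the name
# boundary stated explicitly.
$viaBraces = ${foo?}.bar
$viaPlain  = $foo?.bar
```

As of this discussion, both expressions evaluate to 1.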

@vexx32

Contributor

commented Aug 21, 2018

I can't say I've ever seen anyone use ? in a variable name... Nor would I, because chances are it would be misread. But yeah, hopefully that should be perfectly useable.

@bgshacklett

commented Aug 22, 2018

I suspect it's a bit late to bring this up, but Bash has handled this (and some other cases) with its Parameter Substitution features: https://www.tldp.org/LDP/abs/html/parameter-substitution.html.

While it can be a bear to learn due to the sheer number of things that can be done with it, it's incredibly powerful. I understand that it would not be possible to use this exact notation due to the way PowerShell uses braces with variables, nor would it necessarily fit with the general feel of the language, but it seems like a useful data point.

@mklement0

Contributor Author

commented Aug 23, 2018

@bgshacklett:

Yes, parameter substitution is powerful, but, unfortunately, it's not only a bear to learn, but also to remember.

So while Bash's features are often interesting, their syntactic form is often arcane, hard to remember, and not a good fit for PowerShell.

Brace expansion (e.g., a{1,2,3} in bash expanding to a1 a2 a3) is an example of an interesting feature whose expressiveness I'd love to see in PowerShell, but with PowerShell-appropriate syntax - see #4286

@bgshacklett

commented Aug 23, 2018

I completely agree. I brought it up more as an example of how this issue has been solved elsewhere than an exact solution.

@BrucePay

Collaborator

commented Aug 30, 2018

There's one other operator to possibly consider:

$x ?= 12

which would set $x if it's not set (doesn't exist). This is part of the "initializer pattern" which is not common in conventional languages but for (dynamically scoped) languages like shells, make tools, etc. it's pretty common to have a script that sets a default if the user hasn't specified it. (Though in fact parameter initializers are pretty widespread.)

Extending this to properties:

$obj.SomeProperty ?= 13

would add and initialize a note property SomeProperty on the object if it didn't exist.

And - for fun - one more variation on initializing a variable using -or:

$x -or ($x = 3.14) > $null
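For comparison, here is a sketch of the initializer pattern written with existing syntax, assuming strict mode is off (the default). Note that the -or variation tests truthiness, so it also overwrites values like 0. The variable names are illustrative, and the -or line is written with `$null =` instead of a redirection.

```powershell
# Explicit emulation of the proposed `$x ?= 12`: assign only when $x is
# unset or $null.
Remove-Variable -Name x -ErrorAction SilentlyContinue
if ($null -eq $x) { $x = 12 }

# An existing non-null value, even a "falsy" one, is left alone.
$y = 0
if ($null -eq $y) { $y = 12 }

# The -or variation treats 0 as "not set" and replaces it.
$z = 0
$null = $z -or ($z = 3.14)
```

Afterwards `$x` is 12, `$y` is still 0, and `$z` has become 3.14.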

@mklement0

Contributor Author

commented Sep 1, 2018

$x ?= 12 sounds like a great idea.

It occurred to me that we probably should apply all of these operators not only if the LHS doesn't exist, but also if it does exist but happens to contain $null (with [System.Management.Automation.Internal.AutomationNull]::Value treated like $null in this context).

> add and initialize a note property SomeProperty

In that vein, $obj.SomeProperty ?= 13 makes sense to me only if .SomeProperty exists and contains $null, given that you cannot implicitly create properties even with regular assignments (by contrast, for hashtables the implicit entry creation makes sense).

All operators discussed will need to exempt the LHS from strict-mode existence checking.
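A short sketch of the asymmetry between hashtables and objects described above, plus an emulation of the proposed ?= for a property that exists but holds $null. The names `Existing` and `Missing` are made up for illustration.

```powershell
# Hashtables create the entry implicitly on assignment.
$table = @{}
$table.NewKey = 13

# Custom objects do not: assigning to a non-existent property fails.
$obj = [pscustomobject]@{ Existing = $null }
$assignFailed = $false
try {
    $obj.Missing = 13
} catch {
    $assignFailed = $true
}

# Emulating `$obj.Existing ?= 13` for a property that exists but is $null.
if ($null -eq $obj.Existing) { $obj.Existing = 13 }
```

After this runs, `$table.NewKey` and `$obj.Existing` are both 13, and `$assignFailed` is `$true`.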

@mklement0 changed the title from "Suggestion: implement null-coalescing and null-soaking" to "Suggestion: implement null-coalescing, null-soaking, null-conditional assignment" on Oct 17, 2018

@KirkMunro

Contributor

commented Aug 23, 2019

That ship has sailed long ago, and I wouldn't be surprised to see variables ending with ? in use in scripts today.

At this point if we simply prioritize parsing ? as part of an operator rather than as part of a variable name when it comes to these new operators, folks who use variable names ending in ? would need to use {} or spaces (where spaces are allowed) to use those variables with these operators.

@vexx32

Contributor

commented Aug 23, 2019

It's strange to hear the phrase "that ship has sailed" in this context. This is a major version change. It's not unreasonable for this to change here, I think.

@TheIncorrigible1

commented Aug 23, 2019

@rkeithhill I've used it in personal stuff, but thought it would be unclear to collaborative work since it's such an anti-intuitive thing to programmers to have symbols as part of variable names (similar to using emojis as variables)

@KirkMunro having "prioritized parsing" sounds like an open door for bugs.

@KirkMunro

Contributor

commented Aug 23, 2019

@vexx32: It's not unreasonable for a breaking change since this is a major version change; however, the bar for such breaking changes should remain very high, and I don't think this comes close to passing that bar, since users could use variables ending in ? just fine as long as they use the {} enclosures to identify the variable name.

Note that you can have a property name that ends in ? as well. Currently if you try to view such a property in PowerShell without wrapping the name in quotes, you'll get a parser error.

For example:

PS C:\> $o = [pscustomobject]@{
    'DoesItBlend?' = $true
}
PS C:\> $o.DoesItBlend?
At line:1 char:15
+ $o.DoesItBlend?
+               ~
Unexpected token '?' in expression or statement.
+ CategoryInfo          : ParserError: (:) [], ParentContainsErrorRecordException
+ FullyQualifiedErrorId : UnexpectedToken

PS C:\> $o.'DoesItBlend?'
True

I'm a little surprised that doesn't parse today (why doesn't that parse?), but regardless, for such properties you need to enclose their name in quotes, in which case you could follow the quote with a ternary, null-coalescing, etc. operator without spaces and it would work just fine. I find this syntax very similar to ${x?}?.name, and I'm ok with the stance that you can use such variable/property names if you want, but such names may require extra syntax or spacing to work with ternary or null-* operators.

@TheIncorrigible1

commented Aug 23, 2019

@KirkMunro Nothing stops people from using variable-bounds going forward if they want esoteric variable names. I do agree with the others in that I wonder how much that behavior is currently in use.

People on PowerShell 7 are likely already enthusiasts and will be aware of the breaking changes. People who are not, are still using <=v5.1 and will continue to for a long time; likely until msft removes it from Windows 10 (never).

@vexx32

Contributor

commented Aug 23, 2019

@KirkMunro Sure, but removing it from the standard permissible variable characters doesn't prevent users from just doing ${Valid?} as the variable name anyway. Since they'd have to do that with these operators regardless, I think it'd be better to just have it consistent, rather than have ? become a character that's only sometimes considered part of a variable name.

That already is going to be a breaking change, and I'd think it best to at least be consistent about it and go all the way rather than introduce another level of ambiguity. 🙂

@TheIncorrigible1

commented Aug 23, 2019

@adityapatwardhan I think it would be better for the language to remove ? as a valid variable character. It would easily enable both null-soak/-coalesce and ternary operators in a familiar syntax which add a lot of ergonomics to the script authoring process.

@KirkMunro

Contributor

commented Aug 23, 2019

> @KirkMunro having "prioritized parsing" sounds like an open door for bugs.

@TheIncorrigible1: Any code can open the door for bugs if it's not implemented properly. I'm just talking about a simple single-character lookahead to identify if PowerShell runs into a ternary or null-* operator when it parses a variable name that is not enclosed in variable bounds and encounters a ? character. That's not complicated, and doesn't open the door for more bugs than any other code change does.

> People on PowerShell 7 are likely already enthusiasts and will be aware of the breaking changes. People who are not, are still using <=v5.1 and will continue to for a long time; likely until msft removes it from Windows 10 (never).

@TheIncorrigible1: What basis/evidence do you have for that statement? PowerShell 7 is in preview, so today it's only used by enthusiasts. That's a given. But beyond preview, if PowerShell 7 or later offers compelling features that companies need, while supporting the functionality they need, they'll use those versions. That is especially true if PowerShell 7 gets installed with the OS. The only point where enthusiasts come into play is in organizations that don't have a business need for what PowerShell 7 brings to the table.

> That already is going to be a breaking change, and I'd think it best to at least be consistent about it and go all the way rather than introduce another level of ambiguity. 🙂

@vexx32 That's stretching it. It would be a breaking change if you had ?? in a variable name, but the likelihood of that is much more remote than having a variable whose name ends in a single ?. Other than that, how would the introduction of null-* operators while still supporting ? as a standard permissible variable character break scripts today?

In my experience breaking changes for nice-to-haves (which is what this seems to be) are by far more often than not rejected, and the discussion/arguments around them only serve to slow down the process of getting things done dramatically, to the point where things just stall or miss getting into a release, etc. The slow down is often necessary, because if you're proposing a breaking change evidence is needed to be able to assess the impact and justify such a change. It's hard to gather that evidence today. I'm just saying this because I'll choose getting features done now over arguing about a nice-to-have breaking change any day.

I never use ? in variable names, nor would I. I expect some folks do, though, because it can read very well in a script, and putting up unnecessary barriers to entry for PowerShell 7 just slows down adoption, especially when many people working with PowerShell aren't developers who are more accustomed to working through breaking changes.

Anyway, it is not my intent to slow down this process -- rather the opposite. But I've shared my thoughts and experience, so I won't comment further on whether or not we should push for a breaking change here.

@vexx32

Contributor

commented Aug 23, 2019

Consider this contrived example:

$ValueIsValid? = @( $true, $false, $false, $true )

$ValueIsValid?[0]
# old behaviour? gets `$true`
# new behaviour? gets nothing, because the `?[0]` is interpreted as a null-conditional access.

This behaviour would already break with the proposed changes. I would prefer a consistent, clean break to a confusing break that needs a half-page explanation to list all the possible exceptions and when and where ? suddenly isn't valid, and where it still is.

@TheIncorrigible1

commented Aug 23, 2019

@KirkMunro if PowerShell 7 or later offer compelling features that companies need, while supporting the functionality they need, they'll use those versions. That is especially true if PowerShell 7 gets installed with the OS.

Having worked in a few Fortune 50s with some employee bases being in the hundreds of thousands, even getting away from what was default on the OS was a challenge (i.e., moving to v5.x). I have yet to see any place adopt Core; they'd rather just move to Python for cross-compatibility. Enabling Windows optional features was also a pretty seldom task.

I think companies work with what they have and the knowledge base of their employees over seeking out new technology or language versions to solve their problems. Some would be perfectly content staying on v2 forever and never touching the Win10/Server16 cmdlets that make life dramatically easier.

My point with all of this, is that these features are not in a vacuum. If you make a language more ergonomic, it will see greater adoption by people being interested in the tool to solve the problems they have faster. (See: C# and the growth in popularity with more/better language features)

@lzybkr

Member

commented Aug 23, 2019

Regarding variables ending in ? - that could be a significant breaking change because of the automatic variable $?.

@TheIncorrigible1

commented Aug 23, 2019

@lzybkr I suspect the best case for dealing with that is a special case like what already exists for $^ and $$.

@vexx32

Contributor

commented Aug 23, 2019

@lzybkr @TheIncorrigible1 As far as I remember, all of those variables are explicitly special-cased in the tokenizer. $^ already isn't a valid variable name. The tokenizer has special cases for all of those before it starts looking for standard variable names.

@SteveL-MSFT

Member

commented Aug 23, 2019

The only solution here is to use ¿:

$silly?¿=1

@vexx32

Contributor

commented Aug 23, 2019

I dunno Steve, I'm firmly in the interrobang crowd on this one.

$silly?‽=1

@adityapatwardhan

Member

commented Aug 23, 2019

@lzybkr - to my knowledge $? $$ and $^ are treated in a special way.

// $$, $?, and $^ can only be single character variables. Otherwise keep scanning.
if (!(c == '$' || c == '?' || c == '^'))
{
    bool scanning = true;

To summarize, we have these four options:

  • Big breaking change - not allow ? in the variable names.
  • Prefer ?. as an operator. This means variable names ending with ? must use ${variablename} syntax. Still a breaking change.
  • Prefer old behavior. This means to use ?. and ?[], ${variablename}?.property must be used. No breaking change, but makes using the new operators clumsy.
  • Do not implement the new operators.

Personally, I do not prefer the first or the last one.

@rjmholt

Member

commented Aug 23, 2019

> This is a major version change. It's not unreasonable for this to change here, I think.

I think an important point to make is that the major version is being incremented to signal readiness to replace Windows PowerShell, which is to say it indicates compatibility, not breakage. From the original announcement:

> Note that the major version does not imply that we will be making significant breaking changes.

@vexx32

Contributor

commented Aug 23, 2019

Are options 1 and 2 not the same? Or would variables still be permitted to use ? in the case that there is no null-coalescing or ternary operator following them?

If so, I think we may have a lot of trouble handling the tokenizing / parsing logic without a lot of backtracking. This might lead to performance degradations when variables end with ?.

Given that from what I can see, option 2 seems to be the general preference, I'm not really sure I understand the reluctance to make the breaking change here. Doing it that way will actively discourage use of ? in variable names anyway, simply by introducing scenarios where they aren't usable without enclosing the variable name.

I think this is a fairly minor change as breaking changes go, and having a break in the consistency of what can and can't be used in a variable name is probably the worse option, in my opinion.

These things should have clear rules that apply in pretty much all situations. Currently, this is the case. We're proposing to muddy the waters here and make the behaviour less clear. I can't see anything good coming from it, except perhaps that variable names containing ? become less used simply because they're harder to use in some scenarios -- which (effectively) brings us right back to option 1 by default, almost... so I don't see any particular reason not to take the break now and just avoid the ambiguous behaviour.

@adityapatwardhan

Member

commented Aug 23, 2019

@vexx32 There is a slight difference between 1 and 2.

For # 1 we disallow ? to be used in variable names altogether. This means $x? = 1, $x?? = 1, and $x?x?x = 1 will not parse.

For # 2, $x? = 1 is still valid but $x?.name is equivalent to ${x}?.name. This only breaks variable names with ? at the end, which are accessing members. So, $x? = 1, $x?? = 1, and $x?x?x = 1 would still be valid.

@vexx32

Contributor

commented Aug 23, 2019

Yeah, I'd be a little concerned about tokenizing overhead there. Might be worth implementing both possibilities and doing some benchmarks on a script that uses similar variables names reasonably heavily.

My preference is definitely option 1... a one-time breaking change is, to me at least, way more preferable to having to deal with the inconsistencies in how that can be parsed way into the future.

@rjmholt

Member

commented Aug 23, 2019

> If so, I think we may have a lot of trouble handling the tokenizing / parsing logic without a lot of backtracking. This might lead to performance degradations when variables end with ?.

I share this concern about that possibility. Having ? as a sometimes-token-separator feels like a jagged line in the language to me.

@KirkMunro

Contributor

commented Aug 24, 2019

> If so, I think we may have a lot of trouble handling the tokenizing / parsing logic without a lot of backtracking. This might lead to performance degradations when variables end with ?.

It depends on how you implement it. As long as you take a lookahead approach rather than a tokenize and then back-up approach, you should be fine. You can look ahead at what characters are next when you encounter a ? in a variable name, and make a decision on how you want to "wrap up" the variable token based on what's next. That isn't a lot of trouble and shouldn't require backtracking or result in noticeable performance degradations.
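To illustrate the lookahead idea, here is a toy sketch written in PowerShell itself. It is not the actual tokenizer: `Split-VariableToken` is a hypothetical function, it only handles letters, digits, `_` and `?`, and it only looks for `?.` and `?[`, ignoring `??`, ternaries and everything else a real implementation would need.

```powershell
# Hypothetical toy scanner: given the text that follows the '$' sigil,
# decide where the variable name ends using a one-character lookahead.
function Split-VariableToken {
    param([string] $Text)

    $i = 0
    while ($i -lt $Text.Length) {
        $c = $Text[$i]
        if ([char]::IsLetterOrDigit($c) -or $c -eq '_') {
            $i++
            continue
        }
        if ($c -eq '?') {
            # Peek at the next character without consuming anything.
            $next = if ($i + 1 -lt $Text.Length) { $Text[$i + 1] } else { [char]0 }
            # '?.' or '?[' starts an operator, so the name ends here.
            if ($next -eq '.' -or $next -eq '[') { break }
            $i++
            continue
        }
        break
    }
    [pscustomobject]@{
        Name = $Text.Substring(0, $i)
        Rest = $Text.Substring($i)
    }
}

$member = Split-VariableToken 'x?.name'    # Name 'x',     Rest '?.name'
$assign = Split-VariableToken 'x? = 1'     # Name 'x?',    Rest ' = 1'
$inner  = Split-VariableToken 'x?x?x = 1'  # Name 'x?x?x', Rest ' = 1'
```

The scan only moves forward and never re-reads a character, which is the "no backtracking" property being argued for.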

@joeyaiello

Member

commented Aug 26, 2019

@PowerShell/powershell-committee reviewed this one today, we have a couple thoughts:

  • No matter what we do, we're going to do some analysis of our corpus of scripts to see how often folks use ? in variable names
  • Some of us have a hypothesis (that others would like to validate) that the users who are using ? in their variable names may be less advanced users (as we agree in this room we'd stay away from it because of the potential problems that could arise). On the other hand, anyone using the functionality described here will be able to understand a slightly more complicated syntax (like ${foo}?.bar). Therefore, we prefer option 3 because it avoids breaking changes on these less experienced users. (But again, we will validate it per the first bullet.)
  • @daxian-dbw raised a good point about making it harder to find all variable references when scripters are mixing usage of $foo and ${foo}. However, we agreed that this can be fixed in tooling (like the VS Code PowerShell extension), and users like @JamesWTruher who use editors like Vim can easily match on both with their search syntax.

@rjmholt

Member

commented Aug 29, 2019

> harder to find all variable references when scripters are mixing usage of $foo and ${foo}

That depends on whether the AST differentiates the variable name based on whether it saw the braces. I suspect that any tool that uses the PowerShell AST will have no trouble with braces here.
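A quick way to check this is to parse a snippet and list the variable names the AST reports. This is a minimal sketch using the public parser API; the sample input string is made up for illustration.

```powershell
$tokens = $null
$errors = $null
$ast = [System.Management.Automation.Language.Parser]::ParseInput(
    '$foo; ${foo}', [ref] $tokens, [ref] $errors)

# Collect every variable expression and read the name the parser recorded.
$names = $ast.FindAll(
    { param($node) $node -is [System.Management.Automation.Language.VariableExpressionAst] },
    $true
) | ForEach-Object { $_.VariablePath.UserPath }
```

If both entries in `$names` come back as `foo`, AST-based tools see the two spellings as the same variable.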

@rjmholt

Member

commented Aug 29, 2019

But for regex, it certainly means you have to be more aware: varname -> \{?varname\}?
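As a sketch of what that looks like in practice: the loose pattern also hits longer names that merely start with the same text, so a slightly stricter alternation may be preferable. The sample script text and variable names are illustrative.

```powershell
# Single quotes keep the sample script text literal (no variable expansion).
$script = 'if ($foo) { ${foo}?.bar; $foobar }'

# Loose pattern with optional braces: also matches the prefix of $foobar.
$loose = [regex]::Matches($script, '\$\{?foo\}?').Count

# Stricter pattern: either the braced form, or the bare name followed by
# a word boundary.
$strict = [regex]::Matches($script, '\$(?:\{foo\}|foo\b)').Count
```

Here `$loose` is 3 and `$strict` is 2.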

@TheIncorrigible1

commented Aug 29, 2019

@rjmholt I did an analysis over here. A quick breakdown: Out of nearly 400,000 scripts with 22,000,000+ variable names (not unique), there were only ~70 unique variables that ended with ?.
