Enhance foreach loops to allow iterating over multiple, parallel collections in tandem (pairs) #14732

Closed
mklement0 opened this issue Feb 8, 2021 · 11 comments
Labels
Issue-Enhancement the issue is more of a feature request than a bug Resolution-Declined The proposed feature is declined. WG-Language parser, language semantics

Comments

@mklement0
Contributor

mklement0 commented Feb 8, 2021

Summary of the new feature/enhancement

Iterating in tandem over multiple collections that have corresponding elements is a common scenario.
This is currently fairly cumbersome, so letting the foreach statement provide syntactic sugar would be a nice simplification (and would amount to a generalization of the pairwise Linq.Enumerable.Zip() method):

# 3 collections with corresponding elements ('foo' relates to 'baz' and 'and', 'bar' relates to 'qux' and 'so on')
$a = 'foo', 'bar'
$b = 'baz', 'qux'
$c = 'and', 'so on'

# WISHFUL THINKING
# Iterate over the corresponding elements from each RHS collection.
# Note: Number of iterator variables must match the number of collections.
foreach ($aElem, $bElem, $cElem in $a, $b, $c) { 
   "$aElem, $bElem, $cElem"
}

The above would yield:

foo, baz, and
bar, qux, so on

and would be the equivalent of the (far more cumbersome) following:

foreach ($i in 0..($a.count-1)) {
  $aElem, $bElem, $cElem = $a[$i], $b[$i], $c[$i]
  "$aElem, $bElem, $cElem"
}

Due to its resemblance to destructuring assignments, I think the purpose of the syntax is easy to infer.

Note:

  • With multiple iterator variables on the LHS (which activates the new feature being introduced), their number would be required to match the number of explicitly listed collections (RHS); e.g.:

    • foreach ($aElem, $bElem, $cElem in $a, $b, $c) { ... } ... VALID: 3 iterator variables, 3 collections

    • foreach ($aElem, $bElem in $a, $b, $c) { ... } ... INVALID: only 2 iterator variables vs. 3 collections

    • The single-iterator variable (the currently supported syntax) would continue to function as-is; e.g.:

      • foreach ($elem in $a, $b, $c) { ... } ... RHS is conceived as a single collection to iterate over directly, resulting in 3 iterations with $elem bound to the values of $a, $b, $c in sequence (irrespective of whether these variables contain scalars or collections).
  • The largest among the RHS collections would drive the number of iterations, and the iterator variables would contain $null for those collections that have run out of elements.

Backward-compatibility considerations: N/A, because the proposed new syntax currently results in a syntax error.
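
For comparison, the proposed semantics can be approximated today with a helper function. The following is only a sketch: `Get-TandemElement` is a hypothetical name, not an existing cmdlet, and it assumes that emitting one array per iteration (to be destructured in the loop body) is an acceptable stand-in for multiple iterator variables. As in the note above, the largest collection drives the number of iterations, and exhausted collections contribute $null.

```powershell
# Hypothetical helper, not part of PowerShell: emits one array ("tuple") per position.
function Get-TandemElement {
    param(
        [Parameter(Mandatory)]
        [object[]] $Collection   # the collections to walk in tandem, e.g. $a, $b, $c
    )

    # The largest collection drives the number of iterations.
    $max = 0
    foreach ($c in $Collection) { $max = [Math]::Max($max, @($c).Count) }

    for ($i = 0; $i -lt $max; $i++) {
        # Slots default to $null, which covers collections that have run out of elements.
        $tuple = [object[]]::new($Collection.Count)
        for ($j = 0; $j -lt $Collection.Count; $j++) {
            $items = @($Collection[$j])
            if ($i -lt $items.Count) { $tuple[$j] = $items[$i] }
        }
        , $tuple   # unary comma: emit the tuple as a single object
    }
}

$a = 'foo', 'bar'
$b = 'baz', 'qux'
$c = 'and', 'so on'

$lines = foreach ($tuple in Get-TandemElement -Collection $a, $b, $c) {
    $aElem, $bElem, $cElem = $tuple
    "$aElem, $bElem, $cElem"
}
$lines
# foo, baz, and
# bar, qux, so on
```

Unlike the proposed syntax, this materializes each input with @(...) and therefore does not stream.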

@mklement0 mklement0 added Issue-Enhancement the issue is more of a feature request than a bug Needs-Triage The issue is new and needs to be triaged by a work group. labels Feb 8, 2021
@mklement0 mklement0 changed the title Enhance foreach loops to allow iterating over multiple collections in tandem Enhance foreach loops to allow iterating over multiple collections in tandem (pairs) Feb 8, 2021
@daxian-dbw daxian-dbw added the WG-Language parser, language semantics label Feb 8, 2021
@iRon7

iRon7 commented Feb 18, 2021

As this is already a valid syntax:

$aElem, $bElem = 'One', 1

I would expect that the iterator assignment part ("$aElem, $bElem" as in the example below) should actually work (but unfortunately it doesn't), which makes this part more of a bug than a feature request:

$pairs = @(@('one', 1), @('two', 2), @('Three', 3))

foreach ($aElem, $bElem in $Pairs) {
  '{0}: {1}' -f $aElem, $bElem
}

If the iterator is indeed a single item:

foreach ($aElem in $Pairs) {
  '{0}: {1}' -f $aElem[0], $aElem[1]
}

It should return an array, similar to (as it currently does):

$aElem = 'One', 1

Meaning the last item in the iterator list should contain the rest of the array, similar to:

$aElem, $bElem = 'One', 1, 'A'

Where $aElem holds 'One' and $bElem holds 1, 'A'
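
As a point of reference, the same effect is available today by destructuring inside the loop body rather than in the loop header. This is a minimal sketch using only currently valid syntax; the variable names are illustrative.

```powershell
$pairs = @(@('one', 1), @('two', 2), @('three', 3))

# Single iterator variable, destructured by a regular assignment in the body.
$lines = foreach ($pair in $pairs) {
    $name, $number = $pair
    '{0}: {1}' -f $name, $number
}
$lines
# one: 1
# two: 2
# three: 3

# With more values than targets, the last variable receives the remainder as an array.
$first, $rest = 'One', 1, 'A'
$rest.Count   # 2
```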

@mklement0
Contributor Author

@iRon7, the in in a foreach statement is not an assignment, it is a distinct syntax construct that can be thought of as a higher-level assignment:

It embodies an instruction for a (behind-the-scenes) assignment to the iterator variable in every loop iteration, namely to the current element of the enumeration.

As such, there's no inherent requirement to support multiple iterator variables, the way true assignments support multiple target variables in destructuring assignments.

What I meant to express in the OP is that allowing multiple iterator variables is inspired by regular destructuring assignments, in that they would function analogously, albeit with enforced symmetry:

The number of iterator variables on the LHS would have to match the number of explicitly listed collections on the RHS:

That is, the following would be syntactically valid:

$a = 'foo', 'bar'
$b = 'baz', 'qux'

# OK: two iterator variables, 2 collections.
foreach ($aElem, $bElem in $a, $b) { 
   "$aElem, $bElem"
}

# Ditto
foreach ($aElem, $bElem in (Get-ChildItem foo/), (Get-ChildItem bar/)) { 
   "$aElem, $bElem"
}

By contrast, foreach ($aElem, $bElem in $Pairs) would not be valid, because there's only one RHS collection for the 2 LHS iterator variables.

@bpayette
Contributor

@mklement0 Quick note: your proposed semantic for the LHS of in would collide with the currently defined behaviour:

PS> foreach ($i in 1,2,3) {$i}
1
2
3

(@iRon7 's suggestion would work and it's what I would have implemented if I'd had time since it naturally falls out of destructuring.)

That said, the foreach statement could accept additional tokens besides in such as over or from. Or something after the foreach e.g. foreach pair ($x, $y in $list1,$list2) { ... }. Basically the syntax is up for grabs - there is no need to try to retrofit new semantics over old syntax.

@mklement0
Contributor Author

mklement0 commented Feb 27, 2021

Thanks, @BrucePay, but I don't think there would be a collision, at least not technically:

  • foreach ($i in 1,2,3) { $i } - i.e. the single-iterator case - would continue to function as before: the RHS operand is a single collection to whose elements the iterator variable is bound in sequence (in the first iteration, $i is 1, then 2 in the second, ...).

  • Only multiple iterator variables trigger the new behavior, in which case the number of iterator variables must then match the number of explicitly listed multiple collections on the RHS; to use a simplified example with array literals (in the real world I would expect the collections to be stored in variables or to come from commands):

    • foreach ( $i, $j in (1 ,2, 3), ('a', 'b', 'c') ) { $i, $j } (in the first iteration, $i is 1 and $j is 'a', ...)
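
To make the single-iterator case concrete, here is a small sketch of what the currently supported syntax does when the RHS lists several collections: the loop runs once per listed collection, and the iterator variable is bound to each whole collection in turn.

```powershell
$a = 'foo', 'bar'
$b = 'baz', 'qux'
$c = 'and', 'so on'

# Current behavior: 3 iterations; $elem is $a, then $b, then $c (each a 2-element array).
$seen = foreach ($elem in $a, $b, $c) {
    "$($elem.Count): $elem"
}
$seen
# 2: foo bar
# 2: baz qux
# 2: and so on
```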

I've updated the OP to make that clearer.

If the consensus ends up being that this distinction is too subtle, a new syntax could be pondered, but personally I don't think it's necessary.

@iRon7

iRon7 commented Mar 11, 2021

I needed some time to think this through and enhance my Join-Object cmdlet (without losing its existing features), but I think that something similar is possible with a "Join-Object" cmdlet by building a list of collections ([Collections.ObjectModel.Collection[psobject]]) that don't immediately change to an Object[] (but will when they are, e.g., assigned to a single variable).
Meaning that, instead of investing too much in traditional operators, you might also look into a standard Join-Object cmdlet:

$a = 'foo', 'bar'
$b = 'baz', @(1,2)
$c = 'and', 'so on'

$a |Join $b |Join $c |% {
    $aElem, $bElem, $cElem = $_
    "$aElem | $bElem | $cElem"
}

foo | baz | and
bar | 1 2 | so on

See also: Join-Object issue: #14 Support non-object arrays for more examples.
And my proposal: Add a Join-Object cmdlet to the standard PowerShell equipment.

@mklement0
Contributor Author

mklement0 commented Mar 11, 2021

Re not "investing too much in traditional operators" (read: statements).
This strikes me as a false dichotomy, as I've argued in #14724 (comment):

There is no reason to pit cmdlet-based solutions against expression / statement-based solutions: both are necessary, and in certain cases use of one over the other is the only option. Instead, we should strive for feature parity, to the extent that is feasible.

Therefore, this is again not an either-or scenario, so I suggest you add your previous example to #14994

@iRon7

iRon7 commented Mar 11, 2021

This strikes me as a false dichotomy

Agreed, I have rephrased it in the comment on this issue and removed it from the Join-Object proposal.

@mklement0 mklement0 changed the title Enhance foreach loops to allow iterating over multiple collections in tandem (pairs) Enhance foreach loops to allow iterating over multiple, parallel collections in tandem (pairs) Mar 30, 2021
@iRon7

iRon7 commented Apr 10, 2021

Thinking outside the box and following PowerShell's streaming concept...
It would also be nice if we could have parallel input streams, rather than only pushing for multiple in-memory arrays.

This:

$Count = (1..3 |)

Currently causes an error:

An empty pipe element is not allowed.

Instead, it could possibly create a kind of "deferred pipeline object": each time the object is used/invoked, it processes and returns the next item in the deferred pipeline (until it is empty, at which point it returns an AutomationNull).

Wishful thinking:

$Count = (1..3 |) # Initialize the deferred pipeline object
$Count
1
$Count
2
$Count
3
$Count # Returns nothing (`AutomationNull`)

More specific:

$a = (Get-Content .\MyHugeFile.txt |) # or any other long stream
$b = (1..1e9 |) # Ideally, the size of the range shouldn't affect the memory used for $b
$c = (Import-Csv .\Large.csv |)
$a | ForEach-Object {
    Write-Host '$a item:' $_ 
    Write-Host '$b item:' $b # Every time $b is used, it processes the next item in the deferred $b pipeline
    Write-Host '$c item property:' $c.myProperty # Ditto for $c
}

(I am happy to write a separate proposal for this, but I just want to check whether this idea makes any sense at all.)
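
The closest existing analogue to "return the next item on each use" is probably an explicit .NET enumerator, advanced by hand. The sketch below is only an approximation of the idea: it uses an in-memory array rather than a deferred pipeline, and the advance step is an explicit MoveNext() call instead of happening implicitly whenever the variable is referenced.

```powershell
# An enumerator over a second collection, pulled from on demand.
$b = (10, 20, 30).GetEnumerator()

$lines = 'x', 'y', 'z' | ForEach-Object {
    $null = $b.MoveNext()            # advance $b by one item (returns $false when exhausted)
    '{0} -> {1}' -f $_, $b.Current   # pair the pipeline item with $b's current item
}
$lines
# x -> 10
# y -> 20
# z -> 30
```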

@mklement0
Contributor Author

mklement0 commented Apr 11, 2021

@iRon7, implementing an analogous feature for streaming use, in the pipeline, makes sense to me (as in the pairing of $PSIndex with the index variable in foreach).

I do suggest creating a separate issue for that.

As a thought up front: I'm not sure we need new syntax for that, perhaps even a ForEach-Object enhancement along the following lines is doable, using script blocks as input that are then executed and stepped through in parallel:

# WISHFUL THINKING
{ 0..2 }, { 'a'..'c' } | ForEach-Object -InvokeScriptBlocks { '{0}: {1}' -f $_[0], $_[1] }
0: a
1: b
2: c

@rjmholt
Collaborator

rjmholt commented May 13, 2021

Sorry, I had the wrong tab open before.

I discussed this with the Engine working group and we don't think this should be implemented at the language/syntax level. Instead it's something that LINQ, Python and other languages provide as a method or function. A function for that in PowerShell could be implemented in an external module first before we evaluate whether it should be included in PowerShell itself.
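
For illustration, the LINQ method mentioned above can already be called from PowerShell for the two-collection case. This is a sketch, assuming the call binds to the `Zip` overload that takes a result-selector delegate; the explicit `[string[]]` casts and the `$selector` variable are there to help PowerShell infer the generic type arguments. Note that `Zip` stops at the end of the shorter input, whereas the proposal in this issue pads with $null.

```powershell
$a = 'foo', 'bar'
$b = 'baz', 'qux', 'extra'

# Delegate that combines one element from each input into a result string.
$selector = [Func[string, string, string]] { param($x, $y) "$x, $y" }

# Zip pairs elements by position; enumeration is deferred until the result is consumed.
$zipped = [System.Linq.Enumerable]::Zip([string[]] $a, [string[]] $b, $selector)

$result = @($zipped)   # force enumeration; 'extra' has no partner and is dropped
$result
```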

@rjmholt rjmholt closed this as completed May 13, 2021
@rjmholt rjmholt added Resolution-Declined The proposed feature is declined. and removed Needs-Triage The issue is new and needs to be triaged by a work group. labels May 13, 2021
@rjmholt rjmholt removed their assignment May 13, 2021
@mklement0
Contributor Author

A function for that in PowerShell could be implemented in an external module first

5 participants