Passing variables by reference to thread jobs is not correctly documented #7378

Open
martinprikryl opened this issue Mar 25, 2021 · 4 comments
Labels
area-about Area - About_ topics area-core Area - Microsoft.PowerShell.Core module area-parallelism Area - Parallel processing (ForEach-Object, Jobs, etc)

Comments

@martinprikryl

martinprikryl commented Mar 25, 2021

Documentation Issue

The PowerShell Scopes documentation says this about using variables in "Thread jobs":

The Using scope modifier is supported in the following contexts:

  • ...
  • Thread jobs, started via Start-ThreadJob or ForEach-Object -Parallel (separate thread session)

Depending on the context, embedded variable values are either independent copies of the data in the caller's scope or references to it.
...
In thread sessions, they are passed by reference. This means it is possible to modify call scope variables in a different thread. To safely modify variables requires thread synchronization.

To me, coming from a C#/C++ background, passing by reference means that you can assign to these variables and have the assigned value be available in the calling code.

Yet the following fails to run:

$foo = 1

Start-ThreadJob {
    Write-Host $using:foo
    $using:foo = 2
} | Wait-Job | Out-Null

Write-Host $foo

It errors on $using:foo = 2 with:

The assignment expression is not valid. The input to an assignment operator must be an object that is able to accept assignments, such as a variable or a property.

I assume it's not a bug in PowerShell; rather, the documentation does not correctly describe how the variable can be modified. The variable itself cannot be modified, but if one passes something like an object or a hash table, one can modify its fields or contents. That is, it is conceptually more like passing a pointer to an object by value, rather than passing a variable by reference.

Context of the issue

https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_scopes#the-using-scope-modifier

Originally posted on Stack Overflow:
Modifying PowerShell $using scope variables in thread started with Start-ThreadJob

@martinprikryl martinprikryl added the issue-question Issue - support question label Mar 25, 2021
@chasewilson chasewilson self-assigned this Mar 25, 2021
@chasewilson chasewilson added area-about Area - About_ topics area-core Area - Microsoft.PowerShell.Core module and removed issue-question Issue - support question labels Mar 25, 2021
@chasewilson
Contributor

Hey @martinprikryl thanks for the feedback here.
This is a fair point.
We don't have an article about concurrency specifically related to PowerShell, but we have an issue open here to address that.

A good solution here will be to link to that article once it's created.

@chasewilson chasewilson removed their assignment Mar 25, 2021
@sdwheeler sdwheeler added the area-parallelism Area - Parallel processing (ForEach-Object, Jobs, etc) label Jun 11, 2021
@mklement0
Contributor

The problem is that https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_scopes#the-using-scope-modifier confuses variables with their values.

This means it is possible to modify call scope variables in a different thread.

A $using: reference only ever expands to a variable's value - there is no way to refer to the caller's variables themselves from out-of-runspace code.

Therefore, you can never update a caller's variable.

What you can do - in thread-based parallelism only - is to modify an object that a caller's variable references, which only applies if the variable value happens to be an instance of a .NET reference type, such as a hash table.
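
The distinction can be sketched as follows. This is a minimal illustration, not text from the documentation: it assumes PowerShell 7+ (or the ThreadJob module on Windows PowerShell), and the names `$state` and `$h` are hypothetical. Changing an entry of the hash table is visible to the caller, while assigning a new value to the job-local variable is not.

```powershell
# Assumption: Start-ThreadJob is available (PowerShell 7+ or the ThreadJob module).
$state = @{ Value = 1 }   # hypothetical name; a hash table is a .NET reference type

Start-ThreadJob {
    # $using:state expands to the value of the caller's variable,
    # which here is a reference to the same hash table instance.
    $h = $using:state

    # Modifies the shared object, so the caller sees the change.
    $h.Value = 2

    # Rebinds only the job-local variable; the caller's $state is unaffected.
    $h = @{ Value = 3 }
} | Receive-Job -Wait -AutoRemoveJob

Write-Host $state.Value   # 2
```

Only one job touches the hash table here, so no synchronization is needed; with several threads writing concurrently, the update would have to be guarded.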

@santisq
Contributor

santisq commented Apr 16, 2023

You can update the PSVariable instance, the same way you can update any other reference type:

$foo = 1
$refOfFoo = Get-Variable foo

Start-ThreadJob {
    ($using:refOfFoo).Value = 2
} | Receive-Job -Wait -AutoRemoveJob

Write-Host $foo

Note that this is clearly not a thread-safe operation. Regarding the answer you got on Stack Overflow, it's also worth noting that a synchronized hash table does not make updating the same key from multiple threads thread-safe; that claim is incorrect. The update must be guarded by a locking mechanism. A simple way to demonstrate it:

$attempts = 0

do {
    $attempts++

    $foo = [hashtable]::Synchronized(@{
        Value = 0
    })

    0..10 | ForEach-Object -Parallel {
        Start-Sleep -Milliseconds 200
        ($using:foo).Value++
    } -ThrottleLimit 11
}
until($foo['Value'] -ne 11)

"It took $attempts attempts to make this fail."

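One possible way to add the locking mentioned above is to hold a monitor lock around the increment. This is a sketch under assumptions rather than a recommended pattern: it assumes PowerShell 7+ for `ForEach-Object -Parallel`, uses the hash table's `SyncRoot` as the lock object, and introduces the hypothetical local name `$h`.

```powershell
# Assumption: PowerShell 7+ (ForEach-Object -Parallel).
$foo = [hashtable]::Synchronized(@{
    Value = 0
})

0..10 | ForEach-Object -Parallel {
    $h = $using:foo   # hypothetical local name for the shared hash table

    # Value++ is a read followed by a write, so the whole step is guarded,
    # not just the individual get and set calls.
    [System.Threading.Monitor]::Enter($h.SyncRoot)
    try {
        $h.Value++
    }
    finally {
        # Always release the lock, even if the increment throws.
        [System.Threading.Monitor]::Exit($h.SyncRoot)
    }
} -ThrottleLimit 11

$foo.Value   # 11
```

The lock makes each read-and-write pair atomic with respect to the other threads, which is what the synchronized wrapper alone does not provide.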
@DennisL68

DennisL68 commented Jun 22, 2023

I don't get this demonstration. Why do you expect $foo['Value'] to be updated in sequence?
The only thing you need to care about is that two different threads don't try to update the value at the exact same moment, right?

And shouldn't you be using one of the thread-safe methods?

From what I can tell, what we care about when talking about thread safety is adding and removing objects?
MS Learn - Thread-safe collections
