Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ForEach-Object -Parallel disallows script blocks via $using:, unlike Start-ThreadJob #12378

Closed
mklement0 opened this issue Apr 18, 2020 · 13 comments
Labels
Issue-Question ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a Resolution-Fixed The issue is fixed.

Comments

@mklement0
Copy link
Contributor

mklement0 commented Apr 18, 2020

Start-ThreadJob allows passing a script block by way of a $using: reference, which is a convenient way to make a function from the caller's scope available to the job:

PS> function foo { 'bar' }; Start-ThreadJob { & $using:function:foo } | Receive-Job -Wait -AutoRemoveJob
bar

That is, the foo function's script block ([scriptblock] instance containing the function body) reported by $function:foo (namespace variable notation) was referenced via $using:, allowing it to be called with &.

Unexpectedly, ForEach-Object -Parallel, whose behavior is generally very similar to Start-ThreadJob, explicitly disallows referencing [scriptblock]-typed values from the caller's scope via $using:

Note:

Steps to reproduce

function foo { 'bar' }; ForEach-Object -Parallel { & $using:function:foo } | Should -Be 'bar'

Expected behavior

The test should succeed.

Actual behavior

The test fails, because a statement-terminating error occurs:

A ForEach-Object -Parallel using variable cannot be a script block. 
Passed-in script block variables are not supported with ForEach-Object -Parallel, 
and can result in undefined behavior.

Environment data

PowerShell Core 7.1.0-preview.1
@mklement0 mklement0 added the Issue-Question ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a label Apr 18, 2020
@bpayette
Copy link
Contributor

@mklement0 ScriptBlocks are bound to the runspace where they were created so you can't reliably invoke them in a different runspace without regenerating the scriptblock in the new runspace. Fortunately regenerating the script block is pretty cheap using the existing (internal) APIs. I'm guessing @PaulHigin is doing this for thread jobs but not for foreach -parallel. Paul?

BTW - another thing to check out are classes. An instance of a PowerShell class is also affinitized to the runspace where it was created, so passing that instance to another runspace won't work reliably.

@PaulHigin
Copy link
Contributor

As @bpayette states, the concern was script block affinity to the runspace in which it was created. As @mklement0 mentions, this should also be disallowed for Start-ThreadJob. And I agree with @mklement0 that with the new upcoming optional "replicate runspace state" feature, it may make sense to allow a scriptblock variable by re-creating it in the new runspace. I'll include that as part of the feature.

Thanks!

@vexx32
Copy link
Collaborator

vexx32 commented Apr 20, 2020

Users consistently seem to come to these kinds of features wanting to be able to transfer a scriptblock to the new runspace or process and have it run. This isn't possible to do directly due to those concerns, but perhaps there's something we can do around automatically serializing/deserializing the scriptblock as it crosses the runspace boundary when the session state isn't replicated.

Even if it's an opt in via a new [SerializableScriptblock] sort of type that automatically re-hydrates (completely re-parsing if necessary, but if I recall correctly the AST can be passed around without these issues?) the scriptblock and binds itself to the new runspace it finds itself in, I think users will find that experience much more intuitive than simply preventing scriptblocks to be passed around at all.

@PaulHigin
Copy link
Contributor

I don't think it makes sense to re-create a new scriptblock unless the runspace in which it is being created is replicated from the client script block, because it (the scriptblock) may be referencing variables, functions, modules that don't exist. But I see no reason not to do it into a replicated runspace, other than known variable mulit-threading issues.

/cc @daxian-dbw

@vexx32
Copy link
Collaborator

vexx32 commented Apr 20, 2020

It might be, sure. But just as often as I see that, I'll see someone trying to throw the same self-contained script into a handful of different threads to parallelize a task.

For a case like that, I fail to see why permitting such a thing meets so much resistance. 😕 It just means you don't have to waste your time thinking about how you're going to get the script over to that runspace and recreate it.

@mklement0
Copy link
Contributor Author

I agree with @vexx32: simply transparently recreating the script block in the target runspace - whether it is a replicated one or not - is sensible default behavior that will cover a lot of use cases, notably the case where the script block is a function body or a similarly self-contained script block, as in the example in the OP.

All that is then needed is to document the behavior and limitations, in the context of the $using: specifier in the about_Remote_Variables topic.

I also think it is the sensible default behavior in remoting / background job scenarios: see #11698

@bpayette
Copy link
Contributor

@PaulHigin IIRC you can quickly create a new script block by using the AST of the existing script block. There is a non-public API (or constructor?) on scriptblock to do this. I measured the performance and it was quit good. This will be especially important if you're implementing a Runspce.Clone(), as you'll have to do this for every function being copied to the new runspace.

@vexx32
Copy link
Collaborator

vexx32 commented Apr 21, 2020

Not sure about scriptblock itself, but ScriptblockAst has a GetScriptBlock() method, at least. 🙂

@Viajaz
Copy link

Viajaz commented Oct 14, 2021

Maybe a different approach but perhaps ForEach-Object -Parallel with -AsThreadJob similar to that described in #10841 ?

MicrosoftDocs/PowerShell-Docs#4977 implies that ForEach-Object -Parallel -AsJob is a ThreadJob but it's a PSTaskJob isn't it?

Docs imply something similar: https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_thread_jobs?view=powershell-7.1#thread-jobs-and-variables

Provides information about PowerShell thread-based jobs. A thread job is a type of background job that runs a command or expression in a separate thread within the current session process.

@segraef
Copy link

segraef commented Feb 17, 2022

Hi there, #16564 fixed #16445 but is it part of v7.3.0-preview.1 or 7.2.1?

@PaulHigin
Copy link
Contributor

#16564 fix went into v7.3.0-preview branch. However, it is marked for consideration to backport to v7.2x.

@kasini3000
Copy link

hi who add backport label?
from i tested,ps 7.2 ps6.2 has this problem.
but ps 7.0,ps7.1 normal

@ImportTaste
Copy link
Contributor

ImportTaste commented Aug 4, 2023

For a case like that, I fail to see why permitting such a thing meets so much resistance. 😕 It just means you don't have to waste your time thinking about how you're going to get the script over to that runspace and recreate it.

Yes, very much so.

Here's a way to still do it:

function test { 'asdf' }
$foo = { 'bar' }

$pVars = (Get-Variable).Where{ -not ($_.Attributes -or $_.Description) }
$pFuncs = (Get-Command -Type Function).Where{ -not $_.Module }

1..20 | ForEach-Object -Parallel {
    $pVars = $using:pVars
    $pFuncs= $using:pFuncs

    $foo = $pVars.Where{ $_.Name -eq 'foo' }[0].Value
    $function:test = $pFuncs.Where{ $_.Name -eq 'test' }[0].Definition

    &$foo
    test
}

@PaulHigin IIRC you can quickly create a new script block by using the AST of the existing script block. There is a non-public API (or constructor?) on scriptblock to do this. I measured the performance and it was quit good. This will be especially important if you're implementing a Runspce.Clone(), as you'll have to do this for every function being copied to the new runspace.

Not sure about scriptblock itself, but ScriptblockAst has a GetScriptBlock() method, at least. 🙂

Could either of you give me an example of how to use them in a PowerShell script with ForEach-Object -Parallel?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue-Question ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a Resolution-Fixed The issue is fixed.
Projects
None yet
Development

No branches or pull requests

9 participants