Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

$using variable should support method call on it #10876

Closed
daxian-dbw opened this issue Oct 23, 2019 · 26 comments
Closed

$using variable should support method call on it #10876

daxian-dbw opened this issue Oct 23, 2019 · 26 comments
Assignees
Labels
Issue-Enhancement the issue is more of a feature request than a bug Resolution-No Activity Issue has had no activity for 6 months or more WG-Engine core PowerShell engine, interpreter, and runtime

Comments

@daxian-dbw
Copy link
Member

$using: prefixed variables are used in Foreach-Object -Parallel and Start-ThreadJob to pass value of reference types. Given the value is passed by reference, it's natural to call method on it directly in a thread job or script running in another Runspace, and thus we should remove the semantics check and allow expression to be used on the $using: variable.

Steps to reproduce

{ $using:blah.Add() }

Expected behavior

No error thrown.

Actual behavior

PS> { $using:blah.Add() }
At line:1 char:3
+ { $using:blah.Add() }
+   ~~~~~~~~~~~~~~~~~
Expression is not allowed in a Using expression.
+ CategoryInfo          : ParserError: (:) [], ParentContainsErrorRecordException
+ FullyQualifiedErrorId : InvalidUsingExpression

Environment data

Name                           Value
----                           -----
PSVersion                      7.0.0-preview.4
PSEdition                      Core
GitCommitId                    7.0.0-preview.4
OS                             Microsoft Windows 10.0.18363
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0
@daxian-dbw daxian-dbw added the Issue-Question ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a label Oct 23, 2019
@daxian-dbw daxian-dbw self-assigned this Oct 23, 2019
@daxian-dbw daxian-dbw added WG-Engine core PowerShell engine, interpreter, and runtime and removed Issue-Question ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a labels Oct 23, 2019
@vexx32
Copy link
Collaborator

vexx32 commented Oct 23, 2019

Related: #10499

Depending on exactly what's preventing the current use of methods with $using: (is it a deliberate constraint applied manually? Plumbing not entirely hooked up?) We may want to implement this alongside changing $using in threadjob/foreach -parallel to $ref:

That is, if the committee agrees with that change when it is reviewed 😄

Edit: ah, I see it has been. Nevermind then :)

@jhoneill
Copy link

There are places where Using can't provide methods (remoting, DSC)
But this is silly

$hash = @{a=1;b=2;c=3}
"a".."z" | foreach -Parallel {$using:hash[[string]$_]}
ParserError:
Line |
   1 | "a".."z" | foreach -Parallel {$using:hash[[string]$_]}
     |                                           ^ Expression is not allowed in a Using expression.

especially as
"a".."z" | foreach -Parallel {$h=$using:hash; $h[[string]$_]}
works.

This would be a good change but it should be made clear somewhere what works and what does not. This fails
$x=1; 1..5 | foreach -Parallel {$using:x += $_}

because $using:x gives a read only variable. So a method which changes X should only change the copy in the parallel runspace. If users expect to change the copy in the calling workspace that will cause problems.

@iSazonov iSazonov added the Issue-Enhancement the issue is more of a feature request than a bug label Oct 24, 2019
@vexx32
Copy link
Collaborator

vexx32 commented Oct 24, 2019

Honestly given the current strictures of $using, I would think in makes more sense for #10499 to be reconsidered and a different prefix chosen, both to avoid complicating the code paths in the parser for $using (how is the parser meant to differentiate $using restrictions in the different scenarios? That makes little sense to me), and to have it be clear to users.

If they get a parse error for applying indexing or property access to $using now, I don't think making that parse error somehow conditional based on what command it's being passed to is a good user experience at all. What about creating and storing scriptblocks ahead of time? There's no meaningful way to make the distinction in the parser.

Different prefixes kind of have to be used here, or we have to offload the error states to the individual commands, which leaves us much more open to even more variant, confusing, and inconsistent behaviours.

Not great UX at all in my opinion.

@jhoneill
Copy link

jhoneill commented Oct 24, 2019

I have struggled to get my head round this, but this may help others.
#10499 basically says "using in foreach-object -parallel is dangerous"

What we have here is the parser blocking those dangerous operations - where they are can be seen.

$x = 1 ; 1..5 | foreach -Parallel {$using:x += $_}
Gives an error " The assignment expression is not valid. The input to an assignment operator must be an object that is able to accept assignments"

$hash = @{a=1;b=2;c=3}; "a".."z" | foreach -Parallel {$using:hash[[string]$_]}
Gives an error "Expression is not allowed in a Using expression" as does
$hash = @{a=1;b=2;c=3}; "a".."z" | foreach -Parallel {$using:hash.ContainsKey([string]$_)}

However one can hide the unsafe behaviour

$hash = @{a=1;b=2;c=3}; "d".."f" | foreach -Parallel {
    $p = $using:hash.psobject ;
    $p.baseobject.add($_,"99") 
}
$hash
Name                           Value
----                           -----
e                              99
d                              99
a                              1
b                              2
c                              3
f                              99

This doesn't work for simple types like strings and integers.

So we have two requests, one here which says make it easier to do things which are not thread-safe, and #10499 - I'm not sure if it is saying block non-thread safe things completely or ensure that people can't do something unsafe by accident - the parser is doing the latter.
Having a different prefix, say $ref: , would make it clearer that it is a thread-unsafe reference; but you can be sure than some users will just accept "$ref" as "what you use to do this" and will get themselves into trouble....
With $using: you can call baseobject.add() or use baseobject["b"] , it is cumbersome but can't be done by accident
I think every argument for one way or the other has a roughly equal counter argument

@vexx32
Copy link
Collaborator

vexx32 commented Oct 24, 2019

@jhoneill the latter parts of the discussion in #10499 make it more clear that a $ref: keyword rather than $using: was indeed the desired outcome, my apologies for not making that clear.

$ref: isn't a perfect solution, but frankly I don't think one really exists. It makes it clear that "hey, this is a different thing you're working with here! watch out!", it calls back to the existing [ref] that is occasionally useful, and if $using: is ignored or creates errors in threadjobs / foreach -parallel that point to using $ref: instead users know exactly what to search for to better understand what's going on.

There will always be users who don't fully comprehend what they're doing; I think the onus is on us to make it clear that these two usages are different in a pretty important way, and whether users seek to understand it before using it is then up to them. Hiding the difference by simply modifying the existing behaviour of $using: is, in my opinion, doing users a disservice. It doesn't give them a good avenue to explore and further understand.

@mklement0
Copy link
Contributor

mklement0 commented Oct 24, 2019

The quickest workaround for calling methods is to enclose the reference in (...):

PS> $ht = @{ one=1 }; % -parallel { ($using:ht).Add('two', 2) }; $ht

Name                           Value
----                           -----
one                            1
two                            2

I think the confusion stems from thinking of $using:var as a variable reference, when it reality is a variable's value.

PowerShell - unlike Bash, for instance - doesn't normally make that distinction: in $var = ..., $var is a variable (object) whose value is being modified, whereas in Write-Output $var $var refers to the variable's value.

$using:var is only ever a value, methinks - that's why you can't assign to it or increment it - even though its syntactic form may make you think you can.

Whether the value $using:var is something that is mutable therefore entirely depends on whether $var in the caller's context happens to contain an instance of a reference type.

These are the same by-[reference]-value passing semantics as when you call a script or function (except that nothing prevents you from assigning to a parameter variable later): if the parameter value is a value-type instance, you get a copy, if it's a reference-type instance, you get a copy of the reference - and therefore the potential for modifying the referenced object.

Therefore, I don't think $ref: is appropriate (potential confusion with [ref], which is a fundamentally different thing) or necessary.

I can't think of a prefix that could capture the complexity of the above.

However, we do have to properly document the issue.

And, back to @daxian-dbw's original issue, there's no reason not to allow method calls directly on $using: "variables", especially given that property access already works (and that member access generally works on values (literals) too; e.g., 'foo'.ToUpper()):

PS> $ht = @{ one=1 }; % -parallel { $using:ht.count }

1

@vexx32
Copy link
Collaborator

vexx32 commented Oct 24, 2019

@mklement0 $using: is currently a reference, not simply a value, when used in Start-ThreadJob or ForEach-Object -Parallel scriptblocks.

😕

@mklement0
Copy link
Contributor

mklement0 commented Oct 24, 2019

@vexx32:

These terms are not mutually exclusive:

  • $using:var is the value of variable $var from the caller's scope
  • That value can be a reference - namely if the caller's variable contains an instance of a reference type.

This applies to any function/script call in PowerShell - even though the syntax is different here, the concept is the same:

# Same example as above, only with a local script-block call:
PS> $ht = @{ one=1 }; & { param($htParam) $htParam.Add('two', 2) } $ht; $ht


Name                           Value
----                           -----
one                            1
two                            2

Again, this is unrelated to [ref], where you essentially pass a value holder (variable-like object) around.


Yes, in the context of Start-ThreadJob and ForEach-Object -Parallel this presents a thread-safety challenge that by definition cannot exist in the different-process based contexts where $using: is also supported (Start-Job, remoting).

That challenge is already documented in the ForEach-Object help topic (for version 7).

Despite the context-specific differences, in all these contexts $using: refers to a value from the caller's scope, so it makes sense to use the same syntax form, and, as stated, I couldn't think of a prefix that captures the subtleties of the above without obscuring this commonality and causing further confusion.

@jhoneill
Copy link

jhoneill commented Oct 25, 2019

@mklement0

  • $using:var is the value of variable $var from the caller's scope
  • That value can be a reference - namely if the caller's variable contains an instance of a reference type.

That's not what happens in foreach-object -parallel

Quick demo

$hash = @{"a"=1}
Invoke-Command -ScriptBlock {$using:hash} causes an error
A Using variable cannot be retrieved. ... When it is used with Invoke-Command, the Using variable is valid only if the script block is invoked on a remote computer.

This works

Invoke-Command -ScriptBlock {$using:hash} -ComputerName localhost -EnableNetworkAccess

Name                           Value
----                           -----     
a                              1           

Can we change the contents ? And does that change the content of source variable. ?

Invoke-Command -ScriptBlock {($using:hash).add("B",2) ; $using:hash } -ComputerName localhost -EnableNetworkAccess

Name                           Value 
----                           -----  
a                              1        
B                              2                

$hash

Name                           Value  
----                           -----       
a                              1 

The "remote" machine can't write back to the original - it's Call-by-value behavior.
Now take the same script block and run it with foreach-object -parallel

1,2 | ForEach-Object -Parallel {($using:hash).add("B",2) ; $using:hash }

Name                           Value
----                           -----
B                              2
a                              1
MethodInvocationException: Exception calling "Add" with "2" argument(s): 
"Item has already been added. Key in dictionary: 'B'  Key being added: 'B'"
B                              2
a                              1

We got an error because one thread job sees the modification made by the other.

$hash

Name                           Value
----                           -----
B                              2
a                              1

The value in the callers scope has changed. This is call-by-reference behavior.

@PaulHigin explained it in post here.
https://devblogs.microsoft.com/powershell/powershell-foreach-object-parallel-feature/

there is a big difference when using the $using: keyword in ForEach-Object -Parallel. And that is for remoting, the variable being passed is a copy sent over the remoting connection. But with ForEach-Object -Parallel, the actual object reference is being passed from one script to another, violating normal isolation restrictions.

That's what we see here : the parallel runspaces have read/write access to a variable in the calling runspace. There might be some implementation oddity which means it is not truly call-by-reference, but using the "if it walks like a duck and quacks like a duck then just call it a duck" logic, that's how I'd describe it.

@vexx32 Agree. I think as things are, there is a bit of "jumping through hoops" with using: as it is, that means you don't accidentally get unsafe behaviour. I keep meaning to look at synchronized hash tables and other things which designed for thread safety

PowerShell
PowerShell ForEach-Object Parallel Feature PowerShell 7.0 Preview 3 is now available with a new ForEach-Object Parallel Experimental feature. This feature is a great new tool for parallelizing work, but like any tool, it has its uses and drawbacks. This article describes this new feature,

@mklement0
Copy link
Contributor

@jhoneill:

None of what you state contradicts my explanation, except for fuzzy terminology that may explain why you think there is a disagreement (see below).

Yes, there is a thread-safety issue that is unique to ForEach-Object -Parallel and Start-ThreadJob (though given that you need to start each thread individually, it's probably less likely to surface there in practice).

To recap my argument:

  • $using:, across all uses cases, has the commonality of referring to the value of a variable from the caller's scope (whether that value is a value-type instance or a reference-type instance), which is why it makes sense to use the same syntax form in all cases:

    • In the remoting / background scenarios, that value is is always a fully independent copy of the object that the value refers to, so there is no concern about concurrent access; note that this copy is an imperfect copy in many scenarios, because serialization / deserialization is involved, where only a handful of types are deserialized with type fidelity.

    • In the ForEach-Object -Parallel / Start-ThreadJob scenarios, the very same object reference is passed to the threads if the caller's variable contains a reference-type instance - and that's where you need to ensure thread safety if you're modifying the object being referenced.

  • The thread-safety pitfall requires documenting - it already is documented in the ForEach-Object topic; it wouldn't hurt to mention it in Start-ThreadJob too.


As for your examples:

$using: variables only apply in remoting / job contexts, that's why Invoke-Command without -ComputerName doesn't support them - just as it isn't supported in directly invoked script blocks (& { $using:HOME } breaks, for instance)

And does that change the content of source variable. ?

Here's where it gets fuzzy: It isn't the content of the source variable that is changed, at least not in the sense that a different value is now stored in it - the latter is impossible in this scenario.

What changes is the object that the unchanged variable-content references.

This is call-by-reference behavior.

As argued before, it is by-reference behavior in the same sense as passing any reference-type instance to a script/function passes a copy of the reference.

This is fundamental PowerShell / .NET behavior that is to be expected, and needs to be understood generally, in all contexts where you pass parameters in-process.

To recap my previous example:

PS> $ht = @{one=1}; & { param($htParam) $htParam.Add('two', 2) } $ht; $ht

Name                           Value
----                           -----
one                            1
two                            2

$htParam received a reference to the same hashtable instance that $ht references.

Technically, this is still passing by value in that a copy of the $ht variable's value is passed.

  • If $ht contained a value-type instance, $htParam would indeed receive a copy of its data, because the variable value is the data.
  • If $ht contains a reference-type instance, as in this case, $htParam receives a copy of its reference, which, of course, points to the very same object.

So, in effect, what is technically by-value passing is situationally "object-pointer passing" - as in C/C++, except that the fact is more obvious there.

The type of by-value semantics that you have in mind - always passing an independent copy of the data - are de facto impossible to achieve.

  • It would require a generic deep-cloning mechanism that works with any data type - and there is no such thing.

    • Hence the imperfect approximation of deep-cloning in cross-process scenarios (remoting, background jobs), where passing a copy of the data is a necessity.
  • Even if that were an option, It would be impractical - at least as the default behavior - due to the potentially large performance penalty incurred by cloning.

@vexx32
Copy link
Collaborator

vexx32 commented Oct 25, 2019

All this back and forth only seems to perfectly illustrate the exact reason that we should separate the two types of $using: behaviour, to me.

Confusion and lengthy discussions are already present, and a few notes in documentation are unlikely to be sufficient to properly illustrate the differences, in my view. 🙂

As you say, truly deep-cloning is impossible. By and large, most $using usage is currently working with/around the serialization and job interfaces. In comparison, the way it behaves in Start-ThreadJob or ForEach-Object -Parallel is going to be very different, even if on a technicality it's sort of the same kind of thing. 🙂

@mklement0
Copy link
Contributor

My conclusion is different:

  • If you want to use remoting / background jobs / thread jobs / parallel foreach-object properly, there is no way around understanding the underlying concepts - documentation should do its best to explain them.

    • The way this discussion unfolded suggests to me that needing to understand the fundamentals of parameter passing is the primary problem, which goes deeper than the ForEach-Object -Parallel / Start-ThreadJob issue at hand.

    • Given how fundamentally the value-type/reference-type dichotomy impacts by-value parameter passing (which $using: in essence is), it is more than a technicality. I get that it's conceptually tricky, but there's no way around needing to understand it.

  • No different prefix name such as $ref: can compensate for that:

    • Generally, it obscures the underlying commonality that $using: represents: passing a value from the caller's scope and requires you to memorize another syntax.

      • Hypothetically, if $using: also worked, with copy-data-always behavior, there would be a useful distinction, but the only way to get that would be to employ serialization / deserialization (as in the cross-process scenarios), which, apart from the loss of type fidelity, would defeat the purpose of using thread-based concurrency due to the overhead involved.
    • Specifically, $ref: may mistakenly suggest that you can pass value-type instances by reference too; that is, it may more explicitly promote the misconception that $ref:var is a variable rather than a value.

Pragmatically speaking, the issue only arises:

  • if a $using: reference expands to a reference-type instance

  • and you're modifying that instance in-place.

It is only then that you need to be aware of the thread-safety issue, and an explanation of the issue in the ForEach-Object topic (already present, along with showing how to solve the problem with System.Collections.Concurrent data types) and the Start-ThreadJob topic (to be added) should suffice.

I suppose we should also document the complementary solution: wanting to work with thread-local clones of $using: values; e.g.:

# OK, because thread-local clones of the hashtable are being modified.
PS> $ht = @{one=1}; 1..2 | % -parallel { $ht = ($using:ht).Clone(); $ht.Add('two', 2); $ht }

Name                           Value
----                           -----
one                            1
two                            2
one                            1
two                            2

[hashtable] conveniently offers - shallow - cloning, but in general you'll face the usual cloning challenges.

@vexx32
Copy link
Collaborator

vexx32 commented Oct 25, 2019

🤔 Fair points.

I dislike the confusion that is more or less inevitable here, but you're right in that $ref: will create its own misconceptions. 🤷‍♂

Appreciate you taking the time to lay it all out! 💖

@jhoneill
Copy link

@mklement0

I'll let you argue with @BrucePay , see #10499 (comment)

while the rest of PowerShell passes variables by value.

This is not actually correct. In fact, PowerShell always passes by reference both with reference types and boxed value types.

I understand what you are saying; after executing $ht = @{one=1}; leaves $ht as a pointer to a hash table, not the hash table itself; Passing a copy of that pointer (call-by-value) gives behavior which looks like call-by-reference because the original pointer and the copy both point to the same data (it's like being back in 1st year computer science, 35 years ago :-) ) . It's just that the guy who implemented it says it doesn't do that.

Using: is restricted to certain places. In remoting or DSC "pointed to" data is expanded, because sending a pointer to a remote machine or writing it to a MOF file won't be helpful.
In foreach-object -parallel it's a "live" pointer. It's a moot point whether that live pointer is the original one or a copy...
The point being made in #10499 a writer not realizing this using giving live pointers in one place could be dangerous and the point here this variables from outside the script block should act like any other scriptblock referring to its parent scope. One solution, which doesn't tie the parser in knots is to use another label. The other is to say that making use of the live pointer is not onerous but sufficient to make it clear that you are doing it.

@mklement0
Copy link
Contributor

@jhoneill:

variables from outside the script block should act like any other scriptblock referring to its parent scope

They behave the same as in the parameter-passing / thread-based $using: scenario: if the parent scope's variable value happens to be an instance of a reference type, you can modify the object being referenced (whereas if you assigned to $ht you'd create an independent local variable):

$ht = @{ one = 1 }; & { $ht.one = 2 }; $ht

Name                           Value
----                           -----
one                            2

making use of the live pointer is not onerous

  • It's only onerous (treacherous, counterintuitive) if the underpinnings aren't understood - hence the need for documenting the potential pitfalls.

  • On the flip side, the thread-based behavior is a feature if you actively take advantage of it: it enables collaborative cross-thread population of a data structure - if you understand that it is your responsibility to ensure thread safety (synchronize access to the data structure).
    This option is not available in cross-process $using: scenarios (remoting, background jobs, DSC).

Also note that users who are accustomed to $using: in the context of remoting / background jobs / DSC probably don't expect - and certainly shouldn't - to be able to directly modify values in the caller's scope.

It is only if they happen to pass $using: values that (a) are reference-type instances and (b) are mutable and (c) in-place mutation methods are actually called that the behavior would change in the thread-based scenarios.

Again: A different namespace prefix is incapable of conveying these subtle differences while obscuring the commonalities and increasing developers' memory burden.

@mklement0
Copy link
Contributor

mklement0 commented Oct 25, 2019

@jhoneill: As for the quote from #10499 (comment):

That PowerShell uses boxing (wrapping value-type instances in a reference-type instance) behind the scenes is irrelevant to the effective behavior in PowerShell code:

PS> $i = 1; & { param($intParam) ++$intParam } $i; $i

1  # The value-type instance stored in the caller's $i variable was NOT modified.

Even in C#, where you can make boxing explicit, boxing doesn't change the semantics of parameters that are ultimately value-type instances (you can paste the code directly at a dotnet-script prompt):

> class Foo { public static void Bar(object i) { i = 2; } }; object i = 1; Foo.Bar(i); i

1  // The value-type instance stored in the caller's i variable was NOT modified.

Even though local i parameter variable originally contains a reference to the boxed value, assigning a different value to the local i variable has no effect on the previously referenced boxed value.

@jhoneill
Copy link

@jhoneill: As for the quote from #10499 (comment):

That PowerShell uses boxing (wrapping value-type instances in a reference-type instance) behind the scenes is irrelevant to the effective behavior in PowerShell code:

Look, you can argue with Bruce if you want to :-) The OP talks about it as by reference, and @PaulHigin 's post talks about parallel being different from other using, being by reference.

PS> $i = 1; & { param($intParam) ++$intParam } $i; $i

1  # The value-type instance stored in the caller's $i variable was NOT modified.

++ $x does $x=$x+1 ,it doesn't call an increment() method. You said yourself in the previous post

you can modify the object being referenced (whereas if you assigned to $ht you'd create an independent local variable

then

[requestor wants] variables from outside the script block should act like any other scriptblock referring to its parent scope

They behave the same as in the parameter-passing / thread-based $using: scenario:

I thought we had established using with parallel can change the variable in the calling scope (unlike remoting) but method calls don't work unless the passed variable is wrapped in () unlike a script block -from the initial post.

$using: prefixed variables are used in Foreach-Object -Parallel and Start-ThreadJob to pass value of reference types. Given the value is passed by reference, it's natural to call method on it directly in a thread job or script running in another Runspace, and thus we should remove the semantics check and allow expression to be used on the $using: variable.

Steps to reproduce

{ $using:blah.Add() }

i.e he doesn't want to write {($using:blah).add() }

making use of the live pointer is not onerous

  • It's only onerous (treacherous, counterintuitive)

Sorry I was using it in the sense of "excessively work"

Also note that users who are accustomed to $using: in the context of remoting / background jobs / DSC probably don't expect - and certainly shouldn't - to be able to directly modify values in the caller's scope.

This is what was being said in #10499 . And you say users shouldn't be able to modify values in the caller's scope.. Most people would say after this :
>$ht = @{"Value"=999} ; $ht.value is a value in the current scope . So then ... > 1..3 | foreach -Parallel {start-sleep -Milliseconds (get-random 1000); ($using:ht).value = $_}`

ensures the last thread to try to change $ht.value is random.
Back in the parent scope it has changed

$ht

Name                           Value
----                           -----
Value                          2

@mklement0
Copy link
Contributor

mklement0 commented Oct 26, 2019

The OP talks about it as by reference, and @PaulHigin 's post talks about parallel being different from other using, being by reference.

Yes. Everything that needs to be said about this has been said.

++ $x does $x=$x+1 ,it doesn't call an increment() method. You said yourself in the previous post

The point of the example was that you can't modify value-type instances in this scenario - whether or not they have mutating methods (which is rare to begin with):

# Define a value type with a mutating method.
Add-Type 'public struct Foo {  public int Value { get; private set; } public void Increment() { ++Value; } }'

$foo = [Foo]::new()
$foo.Increment()  # $foo.Value is now 1

# Passing the value type as a parameter value passes a *data copy* to the
# script block, so any mutating methods invoked on that copy only affect *that copy*.
& { param($fooParam) $fooParam.Increment() } $foo

$foo.Value # still 1

I thought we had established using with parallel can change the variable in the calling scope (unlike remoting) but method calls don't work unless the passed variable is wrapped in ()
i.e he doesn't want to write {($using:blah).add() }

Yes. And I agree that (...) shouldn't be required. Why do you mention that?

Sorry I was using it in the sense of "excessively work"

I'm confused:
Using a "live pointer" is the opposite of excessive work - no cloning / serialization+deserialization overhead required.

And you say users shouldn't be able to modify values in the caller's scope...

I didn't say that, and I didn't mean to (though I certainly think it should be a deliberate action).
Why do you think I did?

What I did say was that modifying objects in the caller's scope is simply not an option in the cross-process scenarios, so the in-process, cross-thread scenario of being able to modify the very same object is simply not an issue.

Back in the parent scope it has changed

Indeed, as has been amply discussed - including the scenario where that can be used as a feature.
Why do you mention that?

@jhoneill
Copy link

OK.... We have two competing requests. This one which says accessing a method of a using variable should impose less work on the author. And 10499 which says using is already too easy for an author to change something in a different runspace, meaning that accidental modification can occur. Their wishes for using are mutually exclusive. IF one wants to satisfy both, changing the label from using to something else (ref has been mooted), might allay the fear of someone writing using with the wrong expectation and could make life easier for authors while avoiding the need for the parser to identify different contextual behaviors allowed for using

I thought we had established using with parallel can change the variable in the calling scope (unlike remoting) but method calls don't work unless the passed variable is wrapped in ()
i.e he doesn't want to write {($using:blah).add() }

Yes. And I agree that (...) shouldn't be required. Why do you mention that?

Right; we all know what can be observed with PowerShell as it is now. And I don't think I'm misrepresenting anything to say in a "Make it easier" vs "Already dangerously easy" debate you and the OP in this issue would be inclined towards the former. I'm not strongly for or against either of them. Nor am I convinced that the label should be changed in an attempt to satisfy both. I think the present position is OK, but in an uncommitted and agnostic way :-)

Sorry I was using it in the sense of "excessively work"

I'm confused:
Using a "live pointer" is the opposite of excessive work - no cloning / serialization+deserialization overhead required.

What I tried to say was this; making a script author write ($using:something).method does impose requirements (knowing the needs for () and inserting them) but not excessively so [in my view]. The extra work makes it deliberate and reduces the risk of accidental modification to [what I think is] an acceptable level but not to zero.

And you say users shouldn't be able to modify values in the caller's scope...

I didn't say that, and I didn't mean to (though I certainly think it should be a deliberate action).
Why do you think I did?

I may have misunderstood / lost the context in quoting what you said 3 paragraphs from the bottom of #10876 (comment)
It doesn't matter whether you misspoke or I misunderstood, it's clear now (I think)

Back in the parent scope it has changed

Indeed, as has been amply discussed - including the scenario where that can be used as a feature.
Why do you mention that?

Because I lost track of what you were saying should or should not be possible. Can we park that since I we agree that it is indeed a feature ? The question is

  1. Is this feature, as is, a trap, and something which should be made harder and/or blocked (issue Address Dangerous $using: in Foreach-Object -Parallel #10499) OR
  2. Is this feature a benefit, but too hard too use as is, and should be made accessible (this issue) OR
  3. Does this feature, as-is impose, enough extra work to avoid being a trap; without becoming a burden on those wanting to use it.

I'm inclined towards 3 which is the status quo; nothing said for either of the others (so far) has convinced me of their case - in my view both have equal and opposite merit.

@mklement0
Copy link
Contributor

I see this issue and #10499 as unrelated:

  • Address Dangerous $using: in Foreach-Object -Parallel #10499: For all the reasons discussed here, I don't think any action is required to address Address Dangerous $using: in Foreach-Object -Parallel #10499: $using: in thread-based contexts is a potential feature as much as a potential pitfall; you need to understand the ramifications, as documented.

    • Introducing a separate prefix doesn't replace the need to understand and cannot inspire that understanding, while, conversely, obscuring common functionality and increasing developers' memory burden.

    • So: yes to 2.

  • There was never a good reason syntactic reason to require (...) around a $using: reference; $using:var.property already works, making $using:var.method() work amounts to simply removing an inconsistency.

    • That said, assigning to $using:var.property does not work, and should be allowed as well; the following currently breaks:
      [pscustomobject] $o = @{ one = 1 }; % -parallel { $using:o.one = 2 }; $o

    • The - non-obvious - workaround is again to use (...)
      [pscustomobject] $o = @{ one = 1 }; % -parallel { ($using:o).one = 2 }; $o


Re 3.:

I generally think it's not a good idea to encumber the syntax in order to prevent pitfalls (see #6551, for instance); it is of necessity arbitrary and therefore:

  • makes it non-obvious - I had to experiment to find the (...) workarounds.

  • ends up burdening proper use of a given feature as well.

While it ultimately doesn't matter, I suspect that the current inability to call methods directly on $using: references was not a deliberate feature; after all:

  • It was implemented before thread-based jobs / parallel foreach-object, and in the older scenarios the pitfall never presented itself, so there wouldn't have been a need to erect a syntactic barrier.

  • The current barrier is inconsistent in that (get-only) property access works as-is, without (...)

The only required barrier here is one of necessity: attempts to assign directly to a $using: "variable" (which isn't one, despite its syntactic form - but this ambiguity is at the heart of PowerShell) - $using:var = ... - must be prevented (which already happens), but there is certainly room for improvement in the current error message, which is too general for this specific case:

Current, general error message:

The assignment expression is not valid.
The input to an assignment operator must be an object that is able to accept assignments, such as a variable or a property.

In the case of attempting to assign to a $using: reference, I suggest emitting a specific error, along the lines of:

The assignment expression is not valid.
$using: references are values, not variables, so they cannot be assigned to.

@jhoneill
Copy link

@mklement0
I'm starting to come round to your view :-) I think #10499 is connected but the danger is overstated... so no action required.

Using in a remoting scenario serializes and de-serializes objects, so in a lot of cases the method is lost. In something like this

$p = start notepad -PassThru
Invoke-Command -ComputerName localhost -EnableNetworkAccess -ScriptBlock {$using:p.GetHashCode() | gm }

The parser error is stopping me from trying to call a method which won't exist (and wouldn't make sense).
But if I pass a hash table (for example) the deserialized object does have an add() method and brackets let me say "I know what I'm doing , it's OK to call this method" get-help about_remote_variables doesn't give this information, so there is a discovery problem

Using in DSC does writes $VariableName = <<present value of variable>> to the MOF file and converts all instances of $using: to $ after that. Some roll-your-own serialization is needed in my experience.

Using in a threadjob scenario (with Start-threadjob or foreach-object parallel has the parser behavior for remoting but is accessing the original content without serialization.
Start-threadjob -ScriptBlock {$using:p.kill() }
gives the parser error and again brackets let me say "I know what I'm doing , it's OK to call this method".
In this instance $using:p.kill() would work, the parser is just requiring authors to demonstrate that they know what they are doing. I still don't think that's excessive, but (as with remoting) there is a discovery problem.
How easy it is to have one parser behaviour in one place and another somewhere else I don't know. That's not a reason for opt not to do it, but might mean it's thought to be a good idea but not done.

@mklement0
Copy link
Contributor

mklement0 commented Oct 29, 2019

Glad to hear it, @jhoneill.

Yes, only a handful of known types deserialize with type fidelity in remoting / background jobs; all others are emulated with method-less [pscustomobject] instances.

(#10916 seems to want to change that selectively, which sounds quite challenging.)

As an aside: While that fact is specified in the MS-PRSP protocol specification, there's virtually no information in the end-user documentation (about_Remote_Output, about_Remote_Variables).

For all the reasons previously discussed, my preference is not to erect syntax barriers, especially not selectively - documentation should suffice.

$using:var should be treated like any other expression, on which members can be directly accessed ($using:var.Prop, $using:var.Method()).

As for how likely it is that users will encounter the pitfall: Since methods are often lost in cross-process parallelism due to deserialization, there may not even be an expectation of being able to call methods - anecdotally, the vast majority of remoting / background-job commands I've seen on Stack Overflow simply pass values that are accessed as-is (no method calls, rarely property access).

So that means that only the following users are affected: Those who actively use not only methods on deserialized objects, but specifically mutating methods and who expect such mutations to be confined to an independent copy of the caller's value.


Finally, let me offer a way through the conceptual fog of by-value vs. by-reference argument passing vs. data that happens to be a reference:

  • The terms by-value vs. by-reference are data holder (container) concepts:

    • They indicate whether it is a data holder's data that is passed to a function/command or, loosely speaking, the data holder itself, the latter enabling direct manipulation of what data is stored in the data holder.

    • In that sense, PowerShell (and C#) always pass by value - unless you explicitly opt into a by-reference approach with [ref] (ref, out).

  • Separately, data in .NET comes in two flavors:

    • value-type instances, whose content itself is directly stored in data holders; a given value-type instance can never be indirectly referenced: accessing a data holder's data that is a value-type instance means invariably getting a copy of that data.

    • reference-type instances, only a reference ("pointer") to which is stored in data holders; the data itself is stored elsewhere, and multiple references can exist to it.

The two concepts are independent of one another, but it gets confusing, because something passed by-value (in the data-holder sense) can still be a reference (in the data sense) that the callee can potentially modify.

Again, this behavior is fundamental to all in-process parameter passing.

Copy link
Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

1 similar comment
Copy link
Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

Copy link
Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

@microsoft-github-policy-service microsoft-github-policy-service bot added Resolution-No Activity Issue has had no activity for 6 months or more labels Nov 16, 2023
Copy link
Contributor

This issue has been marked as "No Activity" as there has been no activity for 6 months. It has been closed for housekeeping purposes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue-Enhancement the issue is more of a feature request than a bug Resolution-No Activity Issue has had no activity for 6 months or more WG-Engine core PowerShell engine, interpreter, and runtime
Projects
None yet
Development

No branches or pull requests

5 participants