Use stricter rule to avoid unnecessary unwrapping of PSObject when operating on a COM object #4614

daxian-dbw · 2017-08-18T17:40:11Z

Summary

GetMember/SetMember/InvokeMember operations on a COM object will generate code to unwrap the COM object if it's wrapped in PSObject. Due to a loose restriction, if you have a string wrapped to a PSObject with a ETS member of the the same name, then accessing that member will fail.

Fix

Use a more restricted rule by checking the base object type when it's necessary.

iSazonov · 2017-08-21T08:18:52Z

src/System.Management.Automation/engine/runtime/Binding/Binders.cs

-                var baseValue = PSObject.Base(args[i].Value);
-                if (baseValue != args[i].Value)
+                bool checkBaseTypeForRestriction = false;
+                if (i == 0) { checkBaseTypeForRestriction = moreRestrictionForArg0; }


We could exclude this by call overload DeferForPSObject if args.Length == 1 before the cycle.

@iSazonov - that is a good suggestion. Alternatively, I prefer:

bool checkBaseTypeForRestriction = i == 0 && moreRestrictionForArg0;

My thoughts were - if args.Length usually much more 1 it is better to move the "initialization" out of cycle.

I considered bool checkBaseTypeForRestriction = i == 0 && moreRestrictionForArg0; when working on the changes, but that menas "1 'comparison' and 1 'and' operations" for every arg. I felt that maybe a 'i==0' check is less overhead ... But I might overthink about it 😄 Will change it.

@iSazonov We only want to use moreRestrictionForArg0 for the first element of args. For the rest of elements, we just pass in false.

lzybkr · 2017-08-21T21:32:20Z

Are you sure this is the right fix?

The original ideal behind DeferForPSObject was to create a site that unwrapped the PSObject and did nothing else - we would have a new nested dynamic site that would add tests like you've added.

daxian-dbw · 2017-08-21T22:36:19Z

@lzybkr The restrictions originally used in DeferForPSObject will cause PSObject to be unwrapped when it shouldn't. For example, when it comes to GetMember, the PSObject should be unwrapped only if the base is a COM object, but the restriction used in DeferForPSObject is 'arg is PSObject and arg.Base != arg', which means upcoming PSObject will be unwrapped even if the base is not a COM object. This fix doesn't change the behavior of DeferForPSObject -- it still just creates a site that unwraps PSObject, but with a stricter restriction if needed.

lzybkr · 2017-08-21T22:53:06Z

In that case, the fix is not correct, consider something like:

foreach ($o in & { New-Object string "a",5; New-Object -ComObject Something }) {
    $o.DoSomething()
}

The first time hitting the site invoking DoSomething, we'll have a PSObject wrapping a string and generate a rule without your additional restriction.

The second time hitting the site, the previous rule will succeed and you won't generate a new rule.

I don't see how avoiding unwrapping fixes a bug - isn't it possible that a site might receive the COM object unwrapped - and assuming so, shouldn't it behave the same as if the same instance is wrapped? To be concrete, the following should work the same, right?

$o = New-Object -COM Something
($o, $o.psobject.BaseObject) | % { $_.DoSomething() }

daxian-dbw · 2017-08-22T00:15:13Z

Example 1 from Jason's comment

foreach ($o in & { New-Object string "a",5; New-Object -ComObject Something }) {
    $o.DoSomething()
}

For the string wrapped in PSObject, the code path won't fall in DeferForPSObject, because we defer only if the base of target is a COM object. The code is as follows in FallbackInvokeMember:

if ((baseObject != null) && (baseObject.GetType().FullName.Equals("System.__ComObject")))
{
    return this.DeferForPSObject(args.Prepend(target).ToArray()).WriteToDebugLog(this);
}

So when it's the second time hitting the site (the COM object), the previous rule won't succeed. This is what I get with the fix of this PR:

PS:1> foreach ($o in & { New-Object string "a",5; New-Object -ComObject "Shell.Application" }) {
>> $o.Windows()
>> }
Method invocation failed because [System.String] does not contain a method named 'Windows'.
At line:2 char:1
+ $o.Windows()
+ ~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : MethodNotFound


Application          : System.__ComObject
Parent               : System.__ComObject
Container            :
Document             : System.__ComObject
TopLevelContainer    : True
Type                 :
Left                 : 398
Top                  : 316
Width                : 2034
Height               : 1158
LocationName         : Quick access
LocationURL          :
Busy                 : False
Name                 : File Explorer
HWND                 : 656452
FullName             : C:\windows\Explorer.EXE
Path                 : C:\windows\
...

Example 2 from Jason's comment

To be concrete, the following should work the same, right?

$o = New-Object -COM Something
($o, $o.psobject.BaseObject) | % { $_.DoSomething() }

Well, this example may not be ideal because object will be wrapped into PSObject when passed to % {}, so in this example it ends up with calling DoSomething on PSObject for both pipeline objects.

$o = New-Object -COM Something
$o.psobject.BaseObject.DoSomething
$o | % { $_.DoSomething }

This example might be what you meant to have, the first call happens on a COM object directly, and the the second happens on a PSObject. In the first time, rule A was generated with a check of '$arg TypeEqual __COMobject'. In the second time, rule A will fail due to this 'TypeEqual' check, and thus it will fall back to FallbackInvokeMember, where the PSObject will be unwrapped and return a nested dynamic site, and when the nested dynamic site gets called, rule A will succeed.

My additional comments

The additional restriction is not added to all cases where we need to unwrap PSObject, but only when it's needed. More specifically, for GetMember/SetMember/InvokeMember binders, PSObject should be unwrapped only when the target's base is a COM object, not any other PSObject. Take the following code as an instance:

$o = New-Object -COM "Shell.Application"
$str = Add-Member -InputObject $str -MemberType ScriptMethod -Name Windows -Value { "Windows" } -PassThru
$o | % { $_.Windows() }
$str.Windows()

The first call will fall into DeferForPSObject -- unwrap the PSObject and generate nested dynamic site with rule 'arg TypeEqual PSObject andalso arg.Base != arg'. Then the second call comes and the previous rule will succeed, so the str will be unwrapped and we will lose the ETS instance method Windows(). This is the issue this PR tries to fix.

lzybkr · 2017-08-22T01:52:01Z

I think I understand the issue now.

We unwrap PSObject wrapped com objects only - because the COM binding code from C# does not expect PSObject and we did not update that code to generate correct restrictions and unwrap as needed when seeing PSObject.

The restrictions created by DeferForPS were intentionally not checking the type because that would create too many dynamic methods.

I believe the restrictions should mirror the test that sends us down the path of deferring - namely that the base object is a COM object. This is more general than what you have in the fix right now which is checking the exact type. So basically - call PSEnumerableBinder.IsComObject (though maybe move the method to somewhere more general.)

lzybkr · 2017-08-22T02:00:10Z

test/powershell/engine/COM/COM.Basic.Tests.ps1

+        It "GetMember binder should differentiate PSObject that wraps COM object from other PSObjects" {
+            ## GetMember on the member name 'Name'.
+            ## '$_' here is a PSObject that wraps a COM object
+            $item | ForEach-Object { $_.Name } > $null


I think it's useful to test with a single dynamic site, something like this:

$t1 = ($item, "?") $t2 = ($str, "Hello") for ($pair in ($t1, $t2, $t2, $t1, $t1, $t2)) { $pair[0].Name | Should Be $pair[1] }

It's not much different from what you have, but this way you know it's the site's cache that influences your test and not the process wide cache, plus you can test all the rule orderings explicitly.

lzybkr · 2017-08-22T02:04:19Z

src/System.Management.Automation/engine/runtime/Binding/Binders.cs

@@ -338,32 +338,82 @@ internal static string ToDebugString(this BindingRestrictions restrictions)

    internal static class DynamicMetaObjectBinderExtensions
    {
-        internal static DynamicMetaObject DeferForPSObject(this DynamicMetaObjectBinder binder,
-                                                           params DynamicMetaObject[] args)
+        internal static DynamicMetaObject DeferForPSObject(this DynamicMetaObjectBinder binder, DynamicMetaObject arg0, bool moreRestrictionForArg0 = false)


For consistency, arg0 should be renamed target because we only care about the target.

I also think restrictions are specific to COM so the optional parameter could be named targetIsComObject.

daxian-dbw · 2017-08-22T17:12:47Z

I think checking IsComObject in the restriction is too specific. COM is not the only special condition that we need to unwrap a PSObject -- it also happens in PSBinaryOperationBinder in FallbackBinaryOperation (and might as well happen in future changes):

if ((target.Value is PSObject && PSObject.Base(target.Value) != target.Value) ||
     (arg.Value is PSObject && PSObject.Base(arg.Value) != arg.Value))
 {
     // When adding to an array, we don't want to unwrap the RHS - it's unnecessary,
     // and in the case of strings, we actually lose instance members on the PSObject.
     if (!(Operation == ExpressionType.Add && PSEnumerableBinder.IsEnumerable(target) != null))
     {
         // Defer when the object is wrapped, but not for empty objects.
         return this.DeferForPSObject(target, arg).WriteToDebugLog(this);
     }
}

The Add operation has its own call site, and we unwrap PSObject only if the target is not enumerable (IsEnumerable(target) == null). So potentially this will cause a problem -- target in first call is a PSObject of a scalar object; target in second call is a PSObject of a collection, and the previous rule will succeed. The reason I didn't change this in the PR is that I cannot reproduce the issue in real code (plus this PR focuses on the issue related to COM).

As for GetMember/SetMember/InvokeMember, since we will defer only if the base of target is COM object, checking the base object type in restriction won't result in multiple dynamic methods because the rule will be 'arg0.base TypeEquals __ComObject', and thus all PSObject's that wrap COM objects will fall in this rule.

lzybkr · 2017-08-22T17:43:40Z

Strictly speaking, DeferForPSObject is never needed, but it is a convenient way to avoid writing a lot more code.

It is used in many of the binders because those binders see fewer instances of PSObject and it was a trade off between generating the best possible code and just getting things to work correctly.

In the case of Get/Set/Invoke Member - instances of PSObject are very common, so DeferForPSObject was undesirable, so I spent the extra effort to avoid the extra call site and generate more efficient code.

In hindsight, it might not be a ton of work to eliminate DeferForPSObject entirely, I think the tricky part is minimizing calls to PSObject.Base(target) when constructing a new DynamicMetaObject for target.

In general, there should be no difference between a wrapped object and an unwrapped object - we need to generate different code, but unwrapping shouldn't gives us a different result. I believe strings with instance members are the only common exception to this statement (it can be an issue for value types as well, but because of boxing and unboxing, it is not a common problem.)

So barring an example that proves otherwise, I think the other sites that call DeferForPSObject are not affected by this problem - they do not access instance members of a string.

As for the generated restrictions for COM objects - the code you have now will unnecessary test PSObject.Base(obj) != obj - that test could go away with the type test.

And from my limited understanding of COM - I thought strongly typed wrappers are possible (commonly generated with tlbexp) - so not every COM object is an instance of System.__ComObject, but my recollection in this area is fuzzy.

daxian-dbw · 2017-08-22T19:15:21Z

So barring an example that proves otherwise, I think the other sites that call DeferForPSObject are not affected by this problem - they do not access instance members of a string.

Use the same example about FallbackBinaryOperation in PSBinaryOperationBinder:

target in the first call is a PSObject of a scalar object; target in the second call is a PSObject of a collection, and the previous rule will succeed.

Since the previous rule succeeds for the second call, target and all args will be unwrapped. If one of the args is a PSObject that wraps a string with ETS instance members, then the members will be stripped after adding to the collection. So theoretically, this is affected, though I don't have real code to show this issue. Do you think we should consider this case?

daxian-dbw · 2017-08-22T19:15:24Z

I thought strongly typed wrappers are possible (commonly generated with tlbexp) - so not every COM object is an instance of System.__ComObject, but my recollection in this area is fuzzy.

You are absolutely right about this! This is what I got when using the excel COM interop assembly:

object type: Microsoft.Office.Interop.Excel.ApplicationClass
Is __ComObject assignable from? True

So yes, a simple TypeEqual is not sufficient. I will use IsComObject. And the checks for COM object in GetMember/SetMember/InvokeMember are not sufficient too and need to be changed to use IsComObject.

lzybkr · 2017-08-22T19:31:05Z

Since the previous rule succeeds for the second call, target and all args will be unwrapped. If one of the args is a PSObject that wraps a string with ETS instance members, then the members will be stripped after adding to the collection. So theoretically, this is affected, though I don't have real code to show this issue. Do you think we should consider this case?

Like I said - I think it's a non-issue because the instance members don't matter when adding strings - we'll never attempt to access the instance members that we wouldn't find, and the unwrapped instance is not saved anywhere, so I think there is no bug there.

lzybkr · 2017-08-22T19:39:26Z

So yes, a simple TypeEqual is not sufficient.

Actually it probably is sufficient, but we would generate dynamic methods with the identical expressions and differing constraints - maybe fine if the constraint was expensive, but with the added cost of extra jit time and extra memory.

daxian-dbw · 2017-08-22T19:45:40Z

So yes, a simple TypeEqual is not sufficient.

It would be sufficient if we don't change the current COM checking code in GetMember/SetMember/InvokeMember, because currently we don't even consider the strongly types RWCs.

If we update the COM checking code to use IsComObject, then it's sufficient to make the code correct, but it's not sufficient to avoid generating unnecessary dynamic methods. After updating the constraint to use IsComObject(), one dynamic method will cover all COM cases.

daxian-dbw · 2017-08-23T16:36:06Z

@lzybkr I think your comments are all addressed. Can you please take another look?

lzybkr · 2017-08-23T20:28:36Z

src/System.Management.Automation/engine/runtime/Binding/Binders.cs

-                        .Merge(args[i].GetSimpleTypeRestriction())
-                        .Merge(BindingRestrictions.GetExpressionRestriction(Expression.NotEqual(exprs[i], args[i].Expression)));
+                        .Merge(arg.GetSimpleTypeRestriction())
+                        .Merge(BindingRestrictions.GetExpressionRestriction(Expression.Call(CachedReflectionInfo.Utils_IsComObject, expr)));
                }


Here expr is not a PSObject, and IsComObject calls PSObject.BaseObject.

A clean fix is to introduce an overload on IsComObject, one accepting PSObject and one that accepts object, calling the later here.

The IsComObject method is also used in FallbackConvert, and that target.Expression there could represent either
a COM object or a PSObject wrapping a COM object. Are you suggesting to make the expression restriction a bit complex by changing it to '(target.expression is PSObject AND IsComObject(PSObject target.Expression)) OR (IsComObject(Object target.Expression))'?

I think it's better to avoid the more complex version of 'expression restriction' for FallbackConvert. We can do one of the following two:

internal static bool IsComObject(object obj) { if (obj is PSObject) { obj = PSObject.Base(obj); } return obj != null && ComObjectType.IsAssignableFrom(obj.GetType()); }

OR

internal static bool IsComObject(object obj, bool mayBeWrapped) { if (mayBeWrapped) { obj = PSObject.Base(obj); } return obj != null && ComObjectType.IsAssignableFrom(obj.GetType()); }

I vote for the first one.

In that case, I vote for neither - I'd leave it or introduce another helper with a different name.

I tend to err on the side of less general code to get a faster test, but it probably doesn't matter much anyway.

OK, I will leave the PR as is.

Actually, we can have IsComObject(PSObject) and IsComObject(object), and make the 'expression restriction' in FallbackConvert look like this: IsComObject(PSObject.Base(target.Expression)), it makes the restriction a little bit more complex, but not much. I will go with it, let me know what you think.

lzybkr · 2017-08-23T20:32:13Z

src/System.Management.Automation/engine/runtime/Binding/Binders.cs

-                var baseValue = PSObject.Base(args[i].Value);
-                if (baseValue != args[i].Value)
+                // Target maps to arg[0] of the binder.
+                bool argIsComObject = (i == 0) && targetIsComObject;


FWIW, I believe this is what @iSazonov had in mind (and I don't care much one way or the other to be honest):

exprs[0] = ProcessOnePSObject(args[0], ref restrictions, targetIsComObject); for (int i = 1; i < args.Length; i++) { exprs[i] =ProcessOnePSObject(args[i], ref restrictions, argIsComObject: false); }

I see. It actually will need a bit more code -- check if args.Length > 0 because args[0] could fail otherwise. Since the perf is not a concern here, I will keep it as is.

When would args.Length == 0? That seems like an impossible condition, we wouldn't have anything to defer.

We already assumed args is not null in this method, so I guess it's OK to assume 'args.Length > 0'.

daxian-dbw · 2017-08-24T19:13:44Z

@lzybkr @iSazonov Your comments are addressed. Please take another look. Thanks!

daxian-dbw · 2017-08-26T01:11:16Z

@iSazonov Thanks for the review!
@lzybkr I updated the code to make IsComObject a faster test (see #4614 (comment)). Could you please take another look? Thanks!

daxian-dbw added 4 commits August 17, 2017 17:27

Use more restricted rule in general for DeferForPSObject

4d48d06

Use stricter rule only when needed

8e5392a

Add tests and touch up the comments

665e439

[Feature] Change '%' to 'ForEach-Object'

0ef2048

daxian-dbw self-assigned this Aug 18, 2017

daxian-dbw requested review from lzybkr and iSazonov August 18, 2017 17:40

daxian-dbw requested review from BrucePay and vors as code owners August 18, 2017 17:40

msftclas added the cla-not-required label Aug 18, 2017

daxian-dbw removed request for vors and BrucePay August 18, 2017 17:40

iSazonov reviewed Aug 21, 2017

View reviewed changes

lzybkr reviewed Aug 22, 2017

View reviewed changes

daxian-dbw added 3 commits August 22, 2017 14:54

Address comments

f6ccde0

correct parameter names

2a90178

Fix test -- setup test only if on windows desktop

772b3c8

lzybkr reviewed Aug 23, 2017

View reviewed changes

Address another comment

b2482a0

iSazonov approved these changes Aug 25, 2017

View reviewed changes

daxian-dbw added 2 commits August 25, 2017 08:52

Make IsComObject a fater test

027c1d2

Remove unnecessary parenthesis

598e0b4

lzybkr approved these changes Aug 28, 2017

View reviewed changes

daxian-dbw merged commit 12002e7 into PowerShell:master Aug 28, 2017

daxian-dbw deleted the binder branch August 28, 2017 20:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use stricter rule to avoid unnecessary unwrapping of PSObject when operating on a COM object #4614

Use stricter rule to avoid unnecessary unwrapping of PSObject when operating on a COM object #4614

daxian-dbw commented Aug 18, 2017

iSazonov Aug 21, 2017

lzybkr Aug 22, 2017

iSazonov Aug 22, 2017

daxian-dbw Aug 22, 2017

lzybkr commented Aug 21, 2017

daxian-dbw commented Aug 21, 2017

lzybkr commented Aug 21, 2017

daxian-dbw commented Aug 22, 2017 •

edited

lzybkr commented Aug 22, 2017

lzybkr Aug 22, 2017

daxian-dbw Aug 22, 2017

lzybkr Aug 22, 2017

daxian-dbw Aug 22, 2017

daxian-dbw commented Aug 22, 2017 •

edited

lzybkr commented Aug 22, 2017

daxian-dbw commented Aug 22, 2017

daxian-dbw commented Aug 22, 2017

lzybkr commented Aug 22, 2017

lzybkr commented Aug 22, 2017

daxian-dbw commented Aug 22, 2017

daxian-dbw commented Aug 23, 2017

lzybkr Aug 23, 2017

daxian-dbw Aug 24, 2017

daxian-dbw Aug 24, 2017

lzybkr Aug 24, 2017

daxian-dbw Aug 24, 2017

daxian-dbw Aug 25, 2017 •

edited

lzybkr Aug 23, 2017

daxian-dbw Aug 24, 2017

lzybkr Aug 24, 2017

daxian-dbw Aug 24, 2017 •

edited

daxian-dbw Aug 24, 2017

daxian-dbw commented Aug 24, 2017

daxian-dbw commented Aug 26, 2017

Navigation Menu

Use stricter rule to avoid unnecessary unwrapping of PSObject when operating on a COM object #4614

Use stricter rule to avoid unnecessary unwrapping of PSObject when operating on a COM object #4614

Conversation

daxian-dbw commented Aug 18, 2017

Summary

Fix

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lzybkr commented Aug 21, 2017

daxian-dbw commented Aug 21, 2017

lzybkr commented Aug 21, 2017

daxian-dbw commented Aug 22, 2017 • edited

Example 1 from Jason's comment

Example 2 from Jason's comment

My additional comments

lzybkr commented Aug 22, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

daxian-dbw commented Aug 22, 2017 • edited

lzybkr commented Aug 22, 2017

daxian-dbw commented Aug 22, 2017

daxian-dbw commented Aug 22, 2017

lzybkr commented Aug 22, 2017

lzybkr commented Aug 22, 2017

daxian-dbw commented Aug 22, 2017

daxian-dbw commented Aug 23, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

daxian-dbw Aug 25, 2017 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

daxian-dbw Aug 24, 2017 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

daxian-dbw commented Aug 24, 2017

daxian-dbw commented Aug 26, 2017

daxian-dbw commented Aug 22, 2017 •

edited

daxian-dbw commented Aug 22, 2017 •

edited

daxian-dbw Aug 25, 2017 •

edited

daxian-dbw Aug 24, 2017 •

edited