Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use stricter rule to avoid unnecessary unwrapping of PSObject when operating on a COM object #4614

Merged
merged 10 commits into from Aug 28, 2017

Conversation

daxian-dbw
Copy link
Member

Fix #4607

Summary

GetMember/SetMember/InvokeMember operations on a COM object will generate code to unwrap the COM object if it's wrapped in PSObject. Due to a loose restriction, if you have a string wrapped to a PSObject with a ETS member of the the same name, then accessing that member will fail.

Fix

Use a more restricted rule by checking the base object type when it's necessary.

var baseValue = PSObject.Base(args[i].Value);
if (baseValue != args[i].Value)
bool checkBaseTypeForRestriction = false;
if (i == 0) { checkBaseTypeForRestriction = moreRestrictionForArg0; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could exclude this by call overload DeferForPSObject if args.Length == 1 before the cycle.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iSazonov - that is a good suggestion. Alternatively, I prefer:

bool checkBaseTypeForRestriction = i == 0 && moreRestrictionForArg0;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thoughts were - if args.Length usually much more 1 it is better to move the "initialization" out of cycle.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered bool checkBaseTypeForRestriction = i == 0 && moreRestrictionForArg0; when working on the changes, but that menas "1 'comparison' and 1 'and' operations" for every arg. I felt that maybe a 'i==0' check is less overhead ... But I might overthink about it 😄 Will change it.

@iSazonov We only want to use moreRestrictionForArg0 for the first element of args. For the rest of elements, we just pass in false.

@lzybkr
Copy link
Member

lzybkr commented Aug 21, 2017

Are you sure this is the right fix?

The original ideal behind DeferForPSObject was to create a site that unwrapped the PSObject and did nothing else - we would have a new nested dynamic site that would add tests like you've added.

@daxian-dbw
Copy link
Member Author

@lzybkr The restrictions originally used in DeferForPSObject will cause PSObject to be unwrapped when it shouldn't. For example, when it comes to GetMember, the PSObject should be unwrapped only if the base is a COM object, but the restriction used in DeferForPSObject is 'arg is PSObject and arg.Base != arg', which means upcoming PSObject will be unwrapped even if the base is not a COM object. This fix doesn't change the behavior of DeferForPSObject -- it still just creates a site that unwraps PSObject, but with a stricter restriction if needed.

@lzybkr
Copy link
Member

lzybkr commented Aug 21, 2017

In that case, the fix is not correct, consider something like:

foreach ($o in & { New-Object string "a",5; New-Object -ComObject Something }) {
    $o.DoSomething()
}

The first time hitting the site invoking DoSomething, we'll have a PSObject wrapping a string and generate a rule without your additional restriction.

The second time hitting the site, the previous rule will succeed and you won't generate a new rule.

I don't see how avoiding unwrapping fixes a bug - isn't it possible that a site might receive the COM object unwrapped - and assuming so, shouldn't it behave the same as if the same instance is wrapped? To be concrete, the following should work the same, right?

$o = New-Object -COM Something
($o, $o.psobject.BaseObject) | % { $_.DoSomething() }

@daxian-dbw
Copy link
Member Author

daxian-dbw commented Aug 22, 2017

Example 1 from Jason's comment

foreach ($o in & { New-Object string "a",5; New-Object -ComObject Something }) {
    $o.DoSomething()
}

For the string wrapped in PSObject, the code path won't fall in DeferForPSObject, because we defer only if the base of target is a COM object. The code is as follows in FallbackInvokeMember:

if ((baseObject != null) && (baseObject.GetType().FullName.Equals("System.__ComObject")))
{
    return this.DeferForPSObject(args.Prepend(target).ToArray()).WriteToDebugLog(this);
}

So when it's the second time hitting the site (the COM object), the previous rule won't succeed. This is what I get with the fix of this PR:

PS:1> foreach ($o in & { New-Object string "a",5; New-Object -ComObject "Shell.Application" }) {
>> $o.Windows()
>> }
Method invocation failed because [System.String] does not contain a method named 'Windows'.
At line:2 char:1
+ $o.Windows()
+ ~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : MethodNotFound


Application          : System.__ComObject
Parent               : System.__ComObject
Container            :
Document             : System.__ComObject
TopLevelContainer    : True
Type                 :
Left                 : 398
Top                  : 316
Width                : 2034
Height               : 1158
LocationName         : Quick access
LocationURL          :
Busy                 : False
Name                 : File Explorer
HWND                 : 656452
FullName             : C:\windows\Explorer.EXE
Path                 : C:\windows\
...

Example 2 from Jason's comment

To be concrete, the following should work the same, right?

$o = New-Object -COM Something
($o, $o.psobject.BaseObject) | % { $_.DoSomething() }

Well, this example may not be ideal because object will be wrapped into PSObject when passed to % {}, so in this example it ends up with calling DoSomething on PSObject for both pipeline objects.

$o = New-Object -COM Something
$o.psobject.BaseObject.DoSomething
$o | % { $_.DoSomething }

This example might be what you meant to have, the first call happens on a COM object directly, and the the second happens on a PSObject. In the first time, rule A was generated with a check of '$arg TypeEqual __COMobject'. In the second time, rule A will fail due to this 'TypeEqual' check, and thus it will fall back to FallbackInvokeMember, where the PSObject will be unwrapped and return a nested dynamic site, and when the nested dynamic site gets called, rule A will succeed.

My additional comments

The additional restriction is not added to all cases where we need to unwrap PSObject, but only when it's needed. More specifically, for GetMember/SetMember/InvokeMember binders, PSObject should be unwrapped only when the target's base is a COM object, not any other PSObject. Take the following code as an instance:

$o = New-Object -COM "Shell.Application"
$str = Add-Member -InputObject $str -MemberType ScriptMethod -Name Windows -Value { "Windows" } -PassThru
$o | % { $_.Windows() }
$str.Windows()

The first call will fall into DeferForPSObject -- unwrap the PSObject and generate nested dynamic site with rule 'arg TypeEqual PSObject andalso arg.Base != arg'. Then the second call comes and the previous rule will succeed, so the str will be unwrapped and we will lose the ETS instance method Windows(). This is the issue this PR tries to fix.

@lzybkr
Copy link
Member

lzybkr commented Aug 22, 2017

I think I understand the issue now.

We unwrap PSObject wrapped com objects only - because the COM binding code from C# does not expect PSObject and we did not update that code to generate correct restrictions and unwrap as needed when seeing PSObject.

The restrictions created by DeferForPS were intentionally not checking the type because that would create too many dynamic methods.

I believe the restrictions should mirror the test that sends us down the path of deferring - namely that the base object is a COM object. This is more general than what you have in the fix right now which is checking the exact type. So basically - call PSEnumerableBinder.IsComObject (though maybe move the method to somewhere more general.)

It "GetMember binder should differentiate PSObject that wraps COM object from other PSObjects" {
## GetMember on the member name 'Name'.
## '$_' here is a PSObject that wraps a COM object
$item | ForEach-Object { $_.Name } > $null
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's useful to test with a single dynamic site, something like this:

$t1 = ($item, "?")
$t2 = ($str, "Hello")
for ($pair in ($t1, $t2, $t2, $t1, $t1, $t2)) {
    $pair[0].Name | Should Be $pair[1]
}

It's not much different from what you have, but this way you know it's the site's cache that influences your test and not the process wide cache, plus you can test all the rule orderings explicitly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -338,32 +338,82 @@ internal static string ToDebugString(this BindingRestrictions restrictions)

internal static class DynamicMetaObjectBinderExtensions
{
internal static DynamicMetaObject DeferForPSObject(this DynamicMetaObjectBinder binder,
params DynamicMetaObject[] args)
internal static DynamicMetaObject DeferForPSObject(this DynamicMetaObjectBinder binder, DynamicMetaObject arg0, bool moreRestrictionForArg0 = false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, arg0 should be renamed target because we only care about the target.

I also think restrictions are specific to COM so the optional parameter could be named targetIsComObject.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@daxian-dbw
Copy link
Member Author

daxian-dbw commented Aug 22, 2017

I think checking IsComObject in the restriction is too specific. COM is not the only special condition that we need to unwrap a PSObject -- it also happens in PSBinaryOperationBinder in FallbackBinaryOperation (and might as well happen in future changes):

if ((target.Value is PSObject && PSObject.Base(target.Value) != target.Value) ||
     (arg.Value is PSObject && PSObject.Base(arg.Value) != arg.Value))
 {
     // When adding to an array, we don't want to unwrap the RHS - it's unnecessary,
     // and in the case of strings, we actually lose instance members on the PSObject.
     if (!(Operation == ExpressionType.Add && PSEnumerableBinder.IsEnumerable(target) != null))
     {
         // Defer when the object is wrapped, but not for empty objects.
         return this.DeferForPSObject(target, arg).WriteToDebugLog(this);
     }
}

The Add operation has its own call site, and we unwrap PSObject only if the target is not enumerable (IsEnumerable(target) == null). So potentially this will cause a problem -- target in first call is a PSObject of a scalar object; target in second call is a PSObject of a collection, and the previous rule will succeed. The reason I didn't change this in the PR is that I cannot reproduce the issue in real code (plus this PR focuses on the issue related to COM).

As for GetMember/SetMember/InvokeMember, since we will defer only if the base of target is COM object, checking the base object type in restriction won't result in multiple dynamic methods because the rule will be 'arg0.base TypeEquals __ComObject', and thus all PSObject's that wrap COM objects will fall in this rule.

@lzybkr
Copy link
Member

lzybkr commented Aug 22, 2017

Strictly speaking, DeferForPSObject is never needed, but it is a convenient way to avoid writing a lot more code.

It is used in many of the binders because those binders see fewer instances of PSObject and it was a trade off between generating the best possible code and just getting things to work correctly.

In the case of Get/Set/Invoke Member - instances of PSObject are very common, so DeferForPSObject was undesirable, so I spent the extra effort to avoid the extra call site and generate more efficient code.

In hindsight, it might not be a ton of work to eliminate DeferForPSObject entirely, I think the tricky part is minimizing calls to PSObject.Base(target) when constructing a new DynamicMetaObject for target.

In general, there should be no difference between a wrapped object and an unwrapped object - we need to generate different code, but unwrapping shouldn't gives us a different result. I believe strings with instance members are the only common exception to this statement (it can be an issue for value types as well, but because of boxing and unboxing, it is not a common problem.)

So barring an example that proves otherwise, I think the other sites that call DeferForPSObject are not affected by this problem - they do not access instance members of a string.

As for the generated restrictions for COM objects - the code you have now will unnecessary test PSObject.Base(obj) != obj - that test could go away with the type test.

And from my limited understanding of COM - I thought strongly typed wrappers are possible (commonly generated with tlbexp) - so not every COM object is an instance of System.__ComObject, but my recollection in this area is fuzzy.

@daxian-dbw
Copy link
Member Author

So barring an example that proves otherwise, I think the other sites that call DeferForPSObject are not affected by this problem - they do not access instance members of a string.

Use the same example about FallbackBinaryOperation in PSBinaryOperationBinder:

target in the first call is a PSObject of a scalar object; target in the second call is a PSObject of a collection, and the previous rule will succeed.

Since the previous rule succeeds for the second call, target and all args will be unwrapped. If one of the args is a PSObject that wraps a string with ETS instance members, then the members will be stripped after adding to the collection. So theoretically, this is affected, though I don't have real code to show this issue. Do you think we should consider this case?

@daxian-dbw
Copy link
Member Author

I thought strongly typed wrappers are possible (commonly generated with tlbexp) - so not every COM object is an instance of System.__ComObject, but my recollection in this area is fuzzy.

You are absolutely right about this! This is what I got when using the excel COM interop assembly:

object type: Microsoft.Office.Interop.Excel.ApplicationClass
Is __ComObject assignable from? True

So yes, a simple TypeEqual is not sufficient. I will use IsComObject. And the checks for COM object in GetMember/SetMember/InvokeMember are not sufficient too and need to be changed to use IsComObject.

@lzybkr
Copy link
Member

lzybkr commented Aug 22, 2017

Since the previous rule succeeds for the second call, target and all args will be unwrapped. If one of the args is a PSObject that wraps a string with ETS instance members, then the members will be stripped after adding to the collection. So theoretically, this is affected, though I don't have real code to show this issue. Do you think we should consider this case?

Like I said - I think it's a non-issue because the instance members don't matter when adding strings - we'll never attempt to access the instance members that we wouldn't find, and the unwrapped instance is not saved anywhere, so I think there is no bug there.

@lzybkr
Copy link
Member

lzybkr commented Aug 22, 2017

So yes, a simple TypeEqual is not sufficient.

Actually it probably is sufficient, but we would generate dynamic methods with the identical expressions and differing constraints - maybe fine if the constraint was expensive, but with the added cost of extra jit time and extra memory.

@daxian-dbw
Copy link
Member Author

So yes, a simple TypeEqual is not sufficient.

It would be sufficient if we don't change the current COM checking code in GetMember/SetMember/InvokeMember, because currently we don't even consider the strongly types RWCs.

If we update the COM checking code to use IsComObject, then it's sufficient to make the code correct, but it's not sufficient to avoid generating unnecessary dynamic methods. After updating the constraint to use IsComObject(), one dynamic method will cover all COM cases.

@daxian-dbw
Copy link
Member Author

@lzybkr I think your comments are all addressed. Can you please take another look?

.Merge(args[i].GetSimpleTypeRestriction())
.Merge(BindingRestrictions.GetExpressionRestriction(Expression.NotEqual(exprs[i], args[i].Expression)));
.Merge(arg.GetSimpleTypeRestriction())
.Merge(BindingRestrictions.GetExpressionRestriction(Expression.Call(CachedReflectionInfo.Utils_IsComObject, expr)));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here expr is not a PSObject, and IsComObject calls PSObject.BaseObject.

A clean fix is to introduce an overload on IsComObject, one accepting PSObject and one that accepts object, calling the later here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IsComObject method is also used in FallbackConvert, and that target.Expression there could represent either
a COM object or a PSObject wrapping a COM object. Are you suggesting to make the expression restriction a bit complex by changing it to '(target.expression is PSObject AND IsComObject(PSObject target.Expression)) OR (IsComObject(Object target.Expression))'?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to avoid the more complex version of 'expression restriction' for FallbackConvert. We can do one of the following two:

internal static bool IsComObject(object obj)
{
    if (obj is PSObject) { obj = PSObject.Base(obj); }
    return obj != null && ComObjectType.IsAssignableFrom(obj.GetType());
}

OR

internal static bool IsComObject(object obj, bool mayBeWrapped)
{
    if (mayBeWrapped) { obj = PSObject.Base(obj); }
    return obj != null && ComObjectType.IsAssignableFrom(obj.GetType());
}

I vote for the first one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I vote for neither - I'd leave it or introduce another helper with a different name.

I tend to err on the side of less general code to get a faster test, but it probably doesn't matter much anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will leave the PR as is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we can have IsComObject(PSObject) and IsComObject(object), and make the 'expression restriction' in FallbackConvert look like this: IsComObject(PSObject.Base(target.Expression)), it makes the restriction a little bit more complex, but not much. I will go with it, let me know what you think.

var baseValue = PSObject.Base(args[i].Value);
if (baseValue != args[i].Value)
// Target maps to arg[0] of the binder.
bool argIsComObject = (i == 0) && targetIsComObject;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I believe this is what @iSazonov had in mind (and I don't care much one way or the other to be honest):

exprs[0] = ProcessOnePSObject(args[0], ref restrictions, targetIsComObject);
for (int i = 1; i < args.Length; i++)
{
    exprs[i] =ProcessOnePSObject(args[i], ref restrictions, argIsComObject: false);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. It actually will need a bit more code -- check if args.Length > 0 because args[0] could fail otherwise. Since the perf is not a concern here, I will keep it as is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would args.Length == 0? That seems like an impossible condition, we wouldn't have anything to defer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already assumed args is not null in this method, so I guess it's OK to assume 'args.Length > 0'.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@daxian-dbw
Copy link
Member Author

@lzybkr @iSazonov Your comments are addressed. Please take another look. Thanks!

@daxian-dbw
Copy link
Member Author

@iSazonov Thanks for the review!
@lzybkr I updated the code to make IsComObject a faster test (see #4614 (comment)). Could you please take another look? Thanks!

@daxian-dbw daxian-dbw merged commit 12002e7 into PowerShell:master Aug 28, 2017
@daxian-dbw daxian-dbw deleted the binder branch August 28, 2017 20:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants