Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn on mismatch for ref params #2769

Merged
merged 20 commits into from
May 25, 2022
Merged

Conversation

jtschuster
Copy link
Member

This is a first attempt at adding tracking for ref params in the linker.

The expected behavior for by-reference parameters is:

  • ref parameters
    • The linker should enforce annotations match as if the values are flowing both from the argument to the parameter, as well as parameter to the argument. For references to variables that are explicitly annotated (fields, parameters), this means that DAM annotations must match exactly, but local variables and array elements can have extra DAMT
    • After the method call, the value of the variable passed by ref should contain the result of Meet-ing the previous value with a ByRefParameterValue
  • out parameters
    • The linker should enforce annotations match as if the values are flowing from the parameter to the argument only (the opposite of normal). For references to variables that can be explicitly annotated (fields, parameters), this means that the parameter must have at least the same annotations.
    • After the method call, the value of the variable passed to the out parameter should only contain the ByRefParameterValue
  • in parameters
    • These should be treated exactly the same as normal parameters

This change enforces DAM annotations match for ref values in all cases where the ref argument might be an annotated variable. However, this causes excess warnings when a local variable has a FieldValue or MethodParameterValue.

// Warns correctly
MethodWithRefParameter (ref s_typeWithPublicParameterlessConstructor);
// Local variable gets FieldValue
var x = s_typeWithPublicParameterlessConstructor;
// Warns because there is a FieldValue, but shouldn't because the variable referenced is a local which can't be annotated
MethodWithRefParameter (ref x);

This gets to the main issue / design decision for implementing by-reference parameters. There is no way to tell what kind of variable is passed as a ref parameter, but we want different behavior for locals (which cannot be annotated) and fields or parameters (which have immutable annotations). We also should keep in mind how ref locals would fit into the picture when we design this.

One way we could go about this is adding a flag with what the source of the values in MultiValue. I think anytime we Meet values to create a new MultiValue, it must be placed in a local variable, where the implicit conversion from a SingleValue should mean that the value comes directly from that field / parameter etc. If we add a flag that gets set only if the MultiValue comes from a Meet, we could differentiate between locals and directly annotated values. However, this means we'd have to enforce that Meet is used if and only if the target is a local variable, and it might be easy to accidentally assign a value to a local variable directly and lose that invariant.

To update the values in the local state after the method call, we could flow the new values for the by-ref params back up to the local state and update the values there. However, it might make sense to make a type to make a representation of references that enable mutability of the referenced variable's value.

@sbomer
Copy link
Member

sbomer commented Apr 27, 2022

There is no way to tell what kind of variable is passed as a ref parameter, but we want different behavior for locals (which cannot be annotated) and fields or parameters (which have immutable annotations).

Not only do we need to tell whether a ref parameter comes from a local/field/parameter, we also need to track the side-effect of a local being assigned through a ref param. This feels like a concern which may be better handled by the part of the code that's already responsible for tracking locals/assignments, so the visitor on the analyzer side and the scanner on the linker side.

On the analyzer side, I think this will show up as an assignment to a FlowCaptureReference, and we can tell from CapturedReferenceValue whether it was to a local/field/property. For the linker maybe we need some extra tracking - along the lines of your idea to track an extra flag to tell if something was a local, but rather than being part of MultiValue/Meet, I think it makes sense to handle it in the scanner, probably when we call ScanLdloc.

it might be easy to accidentally assign a value to a local variable directly and lose that invariant.

Agreed, I would not handle this in Meet - I think it makes sense to keep a separation between the values we track vs what kind of storage they have.

@sbomer
Copy link
Member

sbomer commented Apr 27, 2022

Awesome write-up by the way!

@vitek-karas
Copy link
Member

After the method call, the value of the variable passed by ref should contain the result of Meet-ing the previous value with a ByRefParameterValue

Doing this is not going to break anything, but it's not strictly speaking correct. After the call the caller has no idea what is the value of the ref - it is basically an out parameter at that point. It's true that some methods will not modify the value and so there's a possibility it will have the previous value, but there's no way to tell and more importantly there are methods which will always overwrite in which case it's actually wrong to remember the previous values.

I think there might be ways to expose this in such a way that the code will warn in places where it should not (but maybe there's no such way).

For ref parameters I think a good way to model them from a data flow point of view is that they're in parameters before and during the call and they are out parameters after the call. (assuming we "unwrap" the indirection and we're always treat them as the underlying values, not some special ref value).

@vitek-karas
Copy link
Member

but we want different behavior for locals (which cannot be annotated) and fields or parameters (which have immutable annotations).

I must admit I don't understand why it has to be this way. Currently if you observe the behavior of the analysis from the outside (the developer of the app) there are no locals. All the warnings we report are in terms of explicitly annotatable entities (params, fields, methods, ...). This has a good reason behind it - the developer can't do anything about the local since it can't be annotated by hand.

So in your example above:

// Warns correctly
MethodWithRefParameter (ref s_typeWithPublicParameterlessConstructor);
// Local variable gets FieldValue
var x = s_typeWithPublicParameterlessConstructor;
// Warns because there is a FieldValue, but shouldn't because the variable referenced is a local which can't be annotated
MethodWithRefParameter (ref x);

I think it's perfectly OK that we warn in both cases. In fact it would be a bug if we didn't warn in the second case. Take this for example:

void PrintProperties([DAM(PublicProperties)] ref Type t) { ... t.GetProperties() ... }

[DAM(PublicMethods)] Type s_typeWithMethods;

void Test()
{
    var type = s_typeWithMethods;
    PrintProperties(ref type);  // MUST WARN
}

The call to PrintProperties MUST warn because the type passed in doesn't have all properties preserved and it WILL break at runtime.

@vitek-karas
Copy link
Member

I haven't spend much time thinking about this, but right now I think ref/out parameters are relatively the easier part. Ref return values are also interesting but not too much (they act a lot like out parameters in a way). Ref locals are a different matter and they will be tricky.

We also should keep in mind how ref locals would fit into the picture when we design this.

I agree, but maybe we can start simple and only solve parameters on their own for now as they're much more common.

@vitek-karas
Copy link
Member

I don't understand the need for a ByRefParameterValue.

I think this is one of the core problems to solve here:

To update the values in the local state after the method call, we could flow the new values for the by-ref params back up to the local state and update the values there. However, it might make sense to make a type to make a representation of references that enable mutability of the referenced variable's value.

I ran into it with the ArrayValue but those have relatively easy workaround since the ArrayValue is already mutable. So the solution was to mutate the value itself and leave the state dictionary as-is.

For the ref params maybe we will need to make the state dictionary mutable - I didn't look in detail what that would mean.

@jtschuster
Copy link
Member Author

I think it's perfectly OK that we warn in both cases. In fact it would be a bug if we didn't warn in the second case. Take this for example:

void PrintProperties([DAM(PublicProperties)] ref Type t) { ... t.GetProperties() ... }

[DAM(PublicMethods)] Type s_typeWithMethods;

void Test()
{
    var type = s_typeWithMethods;
    PrintProperties(ref type);  // MUST WARN
}

The call to PrintProperties MUST warn because the type passed in doesn't have all properties preserved and it WILL break at runtime.

Just realized I didn't annotate things right in my example. I agree with your example that both should warn. For the example I'm thinking of it would look like this:

[DAM (DAMT.PublicFields | DAMT.PublicMethods)]
Type s_field;

void MethodWithRefParameter ([DAM (DAMT.PublicFields)] ref Type type) { }


// Warns correctly
MethodWithRefParameter (ref s_field);
// Local variable gets FieldValue
var x = s_field;
// Warns because there is a FieldValue, but shouldn't because the variable referenced is a local which can't be annotated
MethodWithRefParameter (ref x);

Since the field is annotated with PublicMethods and PublicFields (and that value can't change), we cannot pass it as the ref parameter since the method could assign a value without PublicMethods to the field. However, the local x isn't annotated in code and just takes the value assigned to it, so it can be "less annotated" after the method call, so we don't have to warn here. We just need to track that after the call, we can only guarantee that x will have at least PublicFields. However, from the view of HandleMethodCall there's no way to differentiate the two calls, it just sees the same FieldValue.

@jtschuster
Copy link
Member Author

I don't understand the need for a ByRefParameterValue.

My understanding is that SingleValues are meant to represent sources of DAM annotations that can't change. I thought it would make for better warning messages and easier debugging to have a distinct type to represent the value assigned to a variable after it has been passed as a ref parameter to differentiate it from the parameters of the containing method.

For example, I think it would be less confusing to have a different warning message for this:

[DAM (DAMT.PublicFields | DAMT.PublicMethods)]
Type s_field;

void MethodWithRefParameter ([DAM (DAMT.PublicFields)] out Type type) { }

void Method (Type type)
{
    // Warning: by-reference parameter 'type' of method 'MethodWithRefParameter' does not meet DAM requirements of field 's_field'
    MethodWithRefParameter (out s_field);

    MethodWithRefParameter (out var x);
    // Warning: by-reference parameter 'type' of method 'MethodWithRefParameter' does not meet DAM requirements of field 's_field'
    s_field = x;

    // Warning:  parameter 'type' of method 'Method' does not meet DAM requirements of field 's_field'
    s_field = type;
}

I was thinking ByRefParameter would only be used as a source for the values passed to the calling method (never a target for a value) and checking that dataflow values match going from parameter -> argument. Then we would continue to use MethodParameterValue when checking dataflow from argument -> parameter. If we think it complicates things more than it's worth it isn't strictly necessary, though.

@vitek-karas
Copy link
Member

Re field versus local: That's a good example, I didn't think of that. I think you're right that we will have to add something here. One thing we had previously but removed it for now is values which are "see through" for diagnostics. So basically something like a LocalVariableValue which stores a SingleValue which is the underlying real value. So it looks like the underlying SingleValue and it never participates in warning construction, but is there to help data flow tracking. So in this case when we load the local variable we would represent it as LocalVariableValue with the FieldValue in it. And when there's an assignment to this, we rewrite the inner value to the new value (so its mutable).

@vitek-karas
Copy link
Member

Re ByRefParameterValue: Personally I don't see much of a difference between the two warning messages, but his is subjective, so don't trust me on it :-).
As for the change, I would probably start without this, make it work (warns when it should). We can improve the messages afterwards.

@jtschuster
Copy link
Member Author

@vitek-karas @sbomer and I spoke today about the path forward for ref params and ref locals.

The concept of references should be contained in as little code as possible (preferably only in MethodBodyScanner in linker and TrimAnalysisVisitor in analyzer) to avoid leaking into shared code and requiring special casing everywhere there.

In the linker, we can create a new ReferenceValue with variants for locals, fields parameters, and array elements, that contains the info necessary to retrieve the value when needed. We should avoid having an actual reference to a location in memory because values are copied and Meet'ed at every basic block. This reference should look up the value contained in the variable and pass that value to HandleCallAction. We will also need to handle checking that annotations match for ref parameters outside of HandleCallAction as well.

Ref locals and more complex reference type / array value handling is not a likely scenario and should be a lower priority. For now we'll just solve ref parameters as simply as possible.

Vitek and Sven, did I miss anything else from our discussion?

@sbomer
Copy link
Member

sbomer commented May 4, 2022

I think that captures it!

@jtschuster jtschuster requested a review from sbomer May 16, 2022 20:31
src/linker/Linker.Dataflow/MethodBodyScanner.cs Outdated Show resolved Hide resolved
src/ILLink.Shared/SharedStrings.resx Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/FieldReferenceValue.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/FieldReferenceValue.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/MethodBodyScanner.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/MethodBodyScanner.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/ReflectionMethodBodyScanner.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/ReflectionMethodBodyScanner.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/ReflectionMethodBodyScanner.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/MethodBodyScanner.cs Outdated Show resolved Hide resolved
src/ILLink.Shared/SharedStrings.resx Outdated Show resolved Hide resolved
src/ILLink.Shared/TrimAnalysis/ByRefParameterValue.cs Outdated Show resolved Hide resolved
src/ILLink.Shared/TrimAnalysis/ByRefParameterValue.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/MethodBodyScanner.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/MethodBodyScanner.cs Outdated Show resolved Hide resolved
{
VariableDefinition localDef = GetLocalDef (operation, methodBody.Variables);
if (localDef == null) {
PushUnknownAndWarnAboutInvalidIL (currentStack, methodBody, operation.Offset);
return;
}

bool isByRef = operation.OpCode.Code == Code.Ldloca || operation.OpCode.Code == Code.Ldloca_S
|| localDef.VariableType.IsByRefOrPointer ();
bool isByRef = operation.OpCode.Code == Code.Ldloca || operation.OpCode.Code == Code.Ldloca_S;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment that this doesn't yet support ref types (so a local ref variable for example)?
Since the code before the change sort of tried to support that.
Another interesting thing to add to notes/todo/issues - pointer types - I don't know how they should be handled - just that it's VERY low priority.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should still handle ref locals fine. It's just if VariableType.IsByRefOrPointer returns true, the value held should already be a ReferenceValue, and we wouldn't want to wrap it in another LocalVariableReferenceValue.

src/linker/Linker.Dataflow/ReferenceValue.cs Show resolved Hide resolved
src/linker/Linker.Dataflow/MethodBodyScanner.cs Outdated Show resolved Hide resolved
// after the call with out parameter.
[ExpectedWarning ("IL2072", nameof (DataFlowTypeExtensions.RequiresPublicMethods))]
[ExpectedWarning ("IL2072", nameof (DataFlowTypeExtensions.RequiresPublicMethods), ProducedBy = ProducedBy.Analyzer)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an issue tracking the fact that analyzer doesn't seem to handle ref parameters correctly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have https://github.com/dotnet/linker/issues/2632 which generally is about ref and out parameters, but I don't think we have one specifically for the analyzers.

Share switch when storing a value by ref
Remove separate errors for by-ref parameters
- Warn on unhandled StoreReference
- Update warning codes for reference tracking failures
- Add HandleStoreThisParameter and HandleStoreReturnValue for assigning
  to refs of `this` or ref return values
- Add ArrayElementReferenceValue and reset elements when a reference is
  taken.
Copy link
Member

@vitek-karas vitek-karas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Great work!

src/linker/Linker.Dataflow/MethodBodyScanner.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/MethodBodyScanner.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/LocalVariableReferenceValue.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/MethodBodyScanner.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/ParameterReferenceValue.cs Outdated Show resolved Hide resolved
src/linker/Linker.Dataflow/MethodBodyScanner.cs Outdated Show resolved Hide resolved
Copy link
Member

@sbomer sbomer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks a lot!

@jtschuster jtschuster merged commit 87539d4 into dotnet:main May 25, 2022
agocke pushed a commit to dotnet/runtime that referenced this pull request Nov 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants