C#: Extract and use ambiguous type information for call target resolution #14891

tamasvajk · 2023-11-23T13:34:47Z

This PR

- extracts
  - candidate type symbols reported by Roslyn when facing ambiguous types
  - invoked method names
- modifies MethodCall::getTarget to look up candidate call targets when the qualifier of the call has an ambiguous type.

tamasvajk · 2023-12-04T11:13:44Z

@hvitved, @michaelnebel Could you have a look at this PR? It's not exactly ready for review. DCA is showing significant slowdown. But I'd like to get some early feedback on the approach, which significantly changes how we come up with call targets.

michaelnebel · 2023-12-05T09:44:31Z

csharp/extractor/Semmle.Extraction.CSharp/Entities/Expressions/Invocation.cs

+            {
+                if (memberName is not null)
+                {
+                    trapFile.invocation_member_name(this, memberName);


Why are all dynamic calls excluded?
As far as I understand, a call can be considered dynamic if just one of the arguments is.

That's a good point. We could include dynamic_member_name in MethodCall::getACandidateTarget, where we use invocation_member_name. But there's no need to store dynamic member names twice in the DB.

I've done this in 82c675a.
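For readers following this thread, here is a minimal sketch of the C# rule being discussed: an invocation is bound dynamically as soon as one argument has the compile-time type dynamic. The class and method names (DynamicCallDemo, Describe) are hypothetical and do not come from the PR or the extractor.

```csharp
using System;

// Hypothetical demo class; none of these names come from the PR.
public static class DynamicCallDemo
{
    public static string Describe(object value) => "object";
    public static string Describe(int value) => "int";

    // Statically bound: the argument's compile-time type is object,
    // so the compiler picks Describe(object).
    public static string StaticCall()
    {
        object boxed = 42;
        return Describe(boxed);
    }

    // Dynamically bound: one argument is dynamic, so overload resolution
    // is deferred to run time and uses the run-time type (int).
    public static string DynamicCall()
    {
        dynamic boxed = 42;
        return Describe(boxed);
    }
}
```

Calling DynamicCallDemo.StaticCall() yields "object", while DynamicCallDemo.DynamicCall() yields "int", which is why the compiler cannot report a single static target for the second call.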

michaelnebel · 2023-12-05T10:04:14Z

csharp/ql/lib/semmle/code/csharp/exprs/Call.qll

-          not this.hasMultipleParamsArguments()
+          not this.hasMultipleParamsArguments(c)
+          or
+          result.getType().isImplicitlyConvertibleTo(p.getType().(ArrayType).getElementType())


Why is this needed?

I think getArgumentForParameter was not handling params correctly before. The corresponding tests also report more argument-parameter pairs now. See here.

In case of params, we were already checking the types with isValidExplicitParamsType, which I think checks whether an array type is convertible to another array type, such as string[] to object[]. For the expanded form, we'd need to check the types against the element type of the array.

void M1(string s, params int[] a) {}
void M2(string s, params double[] a) {}

M1("", 1, 2, 3); // Viable
M1("", 1.1, 2.2, 3.3); // This is not a viable call to M1 due to `1.1` having `double` as a type, which is not convertible to the element type `int` of array type `int[]`.
M2("", 1, 2, 3); // Viable, `int` is implicitly convertible to `double`.
M2("", 1.1, 2.2, 3.3); // Viable

@hvitved raised the point below that isImplicitlyConvertibleTo might not work for generics. This is something that we should check and fix here. final override Expr getArgumentForParameter should work for traced DBs.
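As a side note on the two params forms mentioned in this thread, the sketch below shows the normal form (an array passed directly, including through array covariance) next to the expanded form (individual arguments checked against the element type). It only illustrates C# language behaviour; ParamsForms and Count are hypothetical names, and nothing here reflects how the QL predicates are implemented.

```csharp
// Hypothetical demo class; the names are illustrative and not taken from the PR.
public static class ParamsForms
{
    public static int Count(string s, params object[] a) => a.Length;

    // Normal form: an array is passed directly for the params parameter.
    public static int NormalForm() => Count("", new object[] { "x", "y" });

    // Normal form through array covariance: string[] is implicitly
    // convertible to object[], so the array itself becomes `a`.
    public static int CovariantNormalForm() => Count("", new string[] { "x", "y" });

    // Expanded form: each argument is checked against the element type, object.
    public static int ExpandedForm() => Count("", "x", "y", "z");
}
```

The first two calls see an array of length 2 and the third sees an array of length 3, so a viability check has to consider the array type in one case and the element type in the other.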

michaelnebel · 2023-12-05T10:13:05Z

csharp/ql/lib/semmle/code/csharp/exprs/Call.qll

+        this.getQualifier()
+            .getType()
+            .(ValueOrRefType)
+            .getABaseType*()


Including base types here somehow seems related to dispatching. Could this be a potential performance killer? Would it make sense to initially only look at type alternatives without taking base types into account?

The test cases show that including the base classes is needed. At the same time, it can easily be a performance killer.
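A small illustration of why base types matter here, under the assumption that candidate lookup starts from the qualifier's type: the invoked method may be declared only on a base class. Base, Derived and BaseTypeLookup are hypothetical names, not code from the test suite.

```csharp
// Hypothetical types; not taken from the PR's test cases.
public class Base
{
    public string Name() => "Base.Name";
}

public class Derived : Base { }

public static class BaseTypeLookup
{
    // The qualifier has type Derived, but Name is declared on Base.
    // A lookup restricted to members declared on Derived would find no target.
    public static string Call() => new Derived().Name();
}
```

BaseTypeLookup.Call() returns "Base.Name", so a name-based search has to walk up the hierarchy to find the declaration.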

michaelnebel · 2023-12-05T10:13:47Z

csharp/ql/lib/semmle/code/csharp/exprs/Call.qll

+            .getType()
+            .(ValueOrRefType)
+            .getABaseType*()
+            .getAnAmbiguousAlternativeType*() and


What happens, if we only look at "one" level of alternatives? Does that cover most cases?

I've done this in a4eb47f

michaelnebel · 2023-12-05T10:15:47Z

I think this looks really interesting.
Have you found the source of the performance issues?

tamasvajk · 2023-12-12T08:07:21Z

@hvitved Have you had a chance to do an early review of this PR?

hvitved

Some initial comments.

hvitved · 2023-12-12T14:05:19Z

csharp/ql/lib/semmle/code/csharp/Type.qll

@@ -45,6 +45,23 @@ class Type extends DotNet::Type, Member, TypeContainer, @type {

  /** Holds if this type is a value type, or a type parameter that is a value type. */
  predicate isValueType() { none() }
+
+  /** Gets an alternative type that is reported by the compiler as being ambiguous with this type. */
+  Type getAnAmbiguousAlternativeType() {


Modeling it in this way means that we cannot distinguish when a type is used unambiguously and when it is used ambiguously; that may be too simplistic.

Wouldn't not exists(this.getAnAmbiguousAlternativeType()) mean that the type is used unambiguously?

I may have misunderstood how it works, but consider this example:

namespace N1 { class C {} }
namespace N2 { class C {} }
namespace N3 { class C {} }

namespace N4 {
  using N1;
  using N2;
  class C3 {
    C Field; // either `N1.C` or `N2.C`, not `N3.C`
  }
}

namespace N5 {
  using N1;
  using N3;
  class C3 {
    C Field; // either `N1.C` or `N3.C`, not `N2.C`
  }
}

AFAICT, N1.C, N2.C, and N3.C will be considered equal modulo getAnAmbiguousAlternativeType, but that is an over-approximation.

No, your understanding is correct, we over approximate. What would be your suggestion? (Let's continue this discussion on Slack)
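The over-approximation in the example above can be sketched with a toy model. It assumes that alternatives are recorded per type rather than per use site, and that the relation is followed symmetrically and transitively; AmbiguityModel, Pairs and Alternatives are hypothetical names and this is not the PR's implementation.

```csharp
using System.Collections.Generic;

// Hypothetical model of per-type ambiguity information; not the PR's code.
public static class AmbiguityModel
{
    // Each pair records two types reported as candidates for the same
    // ambiguous reference: N4 mixes N1.C and N2.C, N5 mixes N1.C and N3.C.
    static readonly (string, string)[] Pairs =
    {
        ("N1.C", "N2.C"),
        ("N1.C", "N3.C"),
    };

    // Collects every type reachable from `start` by following the pairs
    // in either direction, any number of times.
    public static SortedSet<string> Alternatives(string start)
    {
        var seen = new SortedSet<string> { start };
        var work = new Queue<string>();
        work.Enqueue(start);
        while (work.Count > 0)
        {
            var current = work.Dequeue();
            foreach (var (a, b) in Pairs)
            {
                if (a == current && seen.Add(b)) work.Enqueue(b);
                if (b == current && seen.Add(a)) work.Enqueue(a);
            }
        }
        return seen;
    }
}
```

In this model, AmbiguityModel.Alternatives("N2.C") contains N3.C even though no single use site is ambiguous between N2.C and N3.C, which is the imprecision described in this thread.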

hvitved · 2023-12-12T14:10:18Z

csharp/ql/lib/semmle/code/csharp/exprs/Call.qll

+      not p.isParams() and
+      arg.getType().isImplicitlyConvertibleTo(p.getType())
+      or
+      p.isParams() and
+      arg.getType().isImplicitlyConvertibleTo(p.getType().(ArrayType).getElementType())


I don't think this works if either the parameter type or the argument type contains type parameters.

We can definitely improve this further. At the same time, I don't think it's critical. It means that there are cases when we don't find a candidate call target, doesn't it? The opposite can also happen: we may find too many candidate call targets. These are expected as we haven't implemented the entire resolution logic.
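To make the type-parameter concern concrete, here is a minimal C# case where the parameter type is a type parameter. It only shows the language behaviour; whether isImplicitlyConvertibleTo handles it is exactly the open question in this thread. GenericTarget and Show are hypothetical names.

```csharp
// Hypothetical demo class; not taken from the PR.
public static class GenericTarget
{
    // The declared parameter type is the type parameter T.
    public static string Show<T>(T value) => typeof(T).Name;

    // The argument has type int; the call is valid because T is inferred as int.
    // A check that compares int against the unbound T directly, without
    // inference, would not see a conversion.
    public static string Call() => Show(42);
}
```

GenericTarget.Call() returns "Int32", so a convertibility check against the unbound declaration may miss a target that the compiler accepts.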

tamasvajk · 2024-01-18T08:24:18Z

DCA performance looks good. The removed/added alert count doesn't look as good as I expected. We do find 9 new cs/hardcoded-credentials alerts in aspnetboilerplate/aspnetboilerplate, which is what prompted this change in the first place. At the same time, we only find 3 additional alerts and we lose 51 in the test suite. I'll need to check whether the removed 51 were false positives or not...

github-actions bot added the C# label Nov 23, 2023

tamasvajk force-pushed the standalone/ambiguousTypes branch from f7cac13 to f21f16a Compare November 28, 2023 10:23

tamasvajk changed the title ~~C#: Add test case for ambiguous types in Standalone extraction~~ C#: Extract and use ambiguous type information for call target resolution Dec 1, 2023

michaelnebel reviewed Dec 5, 2023

hvitved reviewed Dec 12, 2023

tamasvajk force-pushed the standalone/ambiguousTypes branch 2 times, most recently from 527ce26 to 815c2ec Compare January 11, 2024 09:56

C#: Extract candidate types from ambiguous error type symbols and use them to compute call targets

b67f8f3

tamasvajk force-pushed the standalone/ambiguousTypes branch from 2c65eba to b67f8f3 Compare January 14, 2024 14:02

tamasvajk and others added 4 commits January 16, 2024 13:05

WIP: Performance fixes

9d09624

C#: Fix bad join in UselessUpcast.ql

6f42dd2

C#: Improve some join orders

0e5e7b0

Minor fix

52aa355

tamasvajk force-pushed the standalone/ambiguousTypes branch from d35fc33 to 52aa355 Compare January 18, 2024 07:54
