Skip to content

Conversation

michaelnebel
Copy link
Contributor

@michaelnebel michaelnebel commented Feb 15, 2024

It turns out that the extraction of primary constructors didn't extract initialisers correctly. This is fixed in this PR.

That is, for

public class C1(object o1) { }
public class C2(object o2) : C1(o2) { }

We need to extract

  • The implicit inititaliser (calling the object constructor) for the primary constructor on C1.
  • The explicit initialiser (calling the C1 constructor) for the primary constructor on C2.

This is needed for

  • Getting data-flow to work properly for simple classes with primary constructors (otherwise we won't get a control flow node corresponding to the basic block containing the class declaration). Adding the implicit initialiser means that we at least have a first statement, which is calling the object constructor.
  • First step for getting data flow to work for classes with primary constructors that uses inheritance (the explicit initialiser needs to be extracted otherwise we can't detect flow from the primary constructor of C2 into the C1 base class constructor).

It also turns out that we also need to extract at least an empty (synthetic) body for primary constructors to avoid lots of control flow issues.

@github-actions github-actions bot added the C# label Feb 15, 2024
@michaelnebel michaelnebel force-pushed the csharp/primaryconstructorinitializer branch 8 times, most recently from b6737bb to a149bde Compare February 19, 2024 10:29
@michaelnebel michaelnebel added the no-change-note-required This PR does not need a change note label Feb 19, 2024
@michaelnebel michaelnebel marked this pull request as ready for review February 19, 2024 11:41
@michaelnebel michaelnebel requested a review from a team as a code owner February 19, 2024 11:41
@michaelnebel michaelnebel requested a review from hvitved February 19, 2024 12:01
@michaelnebel
Copy link
Contributor Author

Sadly DCA doesn't show any new results.

Copy link
Contributor

@tamasvajk tamasvajk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some minor comments/question.

@@ -10,8 +12,16 @@ namespace Semmle.Extraction.CSharp.Entities
{
internal class Constructor : Method
{
private readonly Lazy<List<SyntaxNode>> DeclaringReferenceSyntax;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do we gain with Lazy<>? I think we're using these nodes in the ReportingLocation. Doesn't this mean that we need the DeclaringReferenceSyntax for all constructors?

Minor: fields should start with lower case letters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I will just initialize it directly in the constructor.

if (IsPrimary)
{
// Create a synthetic empty body for primary constructors.
Statements.SyntheticEmptyBlock.Create(Context, this, 0, Location);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed on Slack, this might also be useful for default constructors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is a good point!
I would prefer to make furthermore improvements on follow up PRs. Is that ok?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, a separate PR sounds good.

Comment on lines +53 to +57
if (Block is null && ExpressionBody is null && !IsPrimary ||
!IsSourceDeclaration)
{
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed on Slack, this might also apply to default constructors too.

class Base {}
class Derived : Base {}

Derived has a default constructor, which implicitly is calling base().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above (and thank you for putting it here!)

@@ -25,6 +25,7 @@ constructors.cs:
# 23| -1: [TypeMention] object
# 23| 1: [Parameter] s
# 23| -1: [TypeMention] string
# 23| 4: [BlockStmt] {...}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For generated expressions, we're using expr_compiler_generated. Should we mark SyntheticEmptyBlocks as compiler generated too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add it to the compiler_generated table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that will require database changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could create stmt_compiler_generated. Should we do that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be fine to do this in a separate PR, so that you could merge this one.

Is there any benefit of having expr_compiler_generated, compiler_generated, and stmt_compiler_generated? It looks like any solution would need a DB schema change. I think my preference would be merging all of this into compiler_generated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be a nice simplification in any case. I will add that to my list of follow up items!

@michaelnebel michaelnebel force-pushed the csharp/primaryconstructorinitializer branch from a149bde to f246272 Compare February 20, 2024 10:48
Copy link
Contributor

@tamasvajk tamasvajk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@michaelnebel michaelnebel merged commit ed3dba8 into github:main Feb 20, 2024
@michaelnebel michaelnebel deleted the csharp/primaryconstructorinitializer branch February 20, 2024 14:12
@michaelnebel michaelnebel restored the csharp/primaryconstructorinitializer branch February 21, 2024 09:18
@michaelnebel michaelnebel deleted the csharp/primaryconstructorinitializer branch February 21, 2024 09:19
@michaelnebel
Copy link
Contributor Author

I realized that I had forgotten to start the DCA experiment with flags to rebuild the extractor.
The above experiment is based on the merge commit and the parent from main.
It shows some new results, but only in quality queries, mostly in useless-type-test, but they seem unrelated to the changes (but they look sound) and they are all in generated code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C# no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants