Skip to content

Conversation

@michaelnebel
Copy link
Contributor

@michaelnebel michaelnebel commented Nov 28, 2025

It appears that sometimes there are spurious errors when running the dotnet --info command on .NET 10, if tests are running in parallel. However, if we run all tests sequentially, there are no problems (which indicate that this is not as such a "real" extractor issue).

To "handle" the problem we introduce some retry logic when the extractor invokes the dotnet --info command. If the dotnet --info command is killed (sigterm 143), we do up to three retries (with exponential backoff to give the OS time to finish whatever it is doing).

I think this PR further highlights the need for consolidating the "script" implementation and "dotnet" invocations used in the dependency fetcher.

@github-actions github-actions bot added the C# label Nov 28, 2025
@michaelnebel michaelnebel changed the title C#: Gracefully handle non-zero exitcodes for dotnet --info. C#: Do not crash on dotnet --info failure. Nov 28, 2025
@michaelnebel michaelnebel force-pushed the csharp/nocrashdotnetinfo branch from e425d75 to af64e33 Compare November 28, 2025 13:44
@michaelnebel michaelnebel changed the title C#: Do not crash on dotnet --info failure. C#: Gracefully handle dotnet --info exit code 143. Nov 28, 2025
@michaelnebel michaelnebel requested a review from hvitved November 28, 2025 14:38
Copy link
Contributor

@hvitved hvitved left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens to the subsequent dotnet restore invocations, presumably those also fail with the same exit code?

@michaelnebel
Copy link
Contributor Author

What happens to the subsequent dotnet restore invocations, presumably those also fail with the same exit code?

No, they don't appear to be affected (for the current run, the restore fails with exit code 1 for the restore - but that is due to another issue, which I have a fix for).

@michaelnebel michaelnebel force-pushed the csharp/nocrashdotnetinfo branch from 2ad1ba9 to 4a6ae21 Compare December 2, 2025 13:42
@michaelnebel michaelnebel marked this pull request as ready for review December 2, 2025 14:27
@michaelnebel michaelnebel requested a review from a team as a code owner December 2, 2025 14:27
Copilot AI review requested due to automatic review settings December 2, 2025 14:27
@michaelnebel michaelnebel added the no-change-note-required This PR does not need a change note label Dec 2, 2025
@michaelnebel michaelnebel requested a review from hvitved December 2, 2025 14:27
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds retry logic to handle spurious exit code 143 (SIGTERM) failures when running dotnet --info on .NET 10 during parallel test execution. The fix implements up to 4 attempts (initial + 3 retries) with exponential backoff.

Key changes:

  • Added RunCommandExitCode method to return exit codes instead of just success/failure booleans
  • Implemented retry logic with exponential backoff in both synchronous (Info()) and script-based (InfoScript()) contexts
  • Re-enabled previously skipped .NET 10 integration tests

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
csharp/ql/integration-tests/all-platforms/dotnet_10/test.py Removed skip decorators, re-enabling .NET 10 tests that were previously failing
csharp/extractor/Semmle.Extraction.Tests/DotNet.cs Updated test stub to support exit code testing and adjusted error message assertion
csharp/extractor/Semmle.Extraction.CSharp.DependencyFetching/IDotNetCliInvoker.cs Added RunCommandExitCode interface method for retrieving command exit codes
csharp/extractor/Semmle.Extraction.CSharp.DependencyFetching/DotNetCliInvoker.cs Refactored to extract exit code logic and implemented new RunCommandExitCode method
csharp/extractor/Semmle.Extraction.CSharp.DependencyFetching/DotNet.cs Added retry logic in Info() method and new InfoScript() method for build script context
csharp/autobuilder/Semmle.Autobuild.CSharp/DotNetRule.cs Replaced local GetInfoCommand with centralized DotNet.InfoScript implementation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.

Comment on lines 225 to 237
script = BuildScript.Bind(script, ret =>
{
switch (ret)
{
case 0:
return BuildScript.Success;
case 143 when attempt < 3:
HandleRetryExitCode143(dotnet, attempt, logger);
return info.Script;
default:
return BuildScript.Failure;
}
});
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop variable attempt is captured in the closure passed to BuildScript.Bind, which will be executed later. Due to closure semantics in C#, the value of attempt will be its final value (3) when the script runs, not the value from each iteration. This means:

  1. The condition case 143 when attempt < 3 will always be true on the 4th iteration (when attempt == 3)
  2. The sleep duration calculation in HandleRetryExitCode will always use attempt=3, resulting in a sleep of 8 seconds on every retry instead of exponential backoff (1, 2, 4 seconds)

To fix this, capture the loop variable in a local variable:

for (int i = 0; i < 4; i++)
{
    var attempt = i; // Capture in local variable
    script = BuildScript.Bind(script, ret =>
    {
        switch (ret)
        {
            case 0:
                return BuildScript.Success;
            case 143 when attempt < 3:
                HandleRetryExitCode143(dotnet, attempt, logger);
                return info.Script;
            default:
                return BuildScript.Failure;
        }
    });
}
Suggested change
script = BuildScript.Bind(script, ret =>
{
switch (ret)
{
case 0:
return BuildScript.Success;
case 143 when attempt < 3:
HandleRetryExitCode143(dotnet, attempt, logger);
return info.Script;
default:
return BuildScript.Failure;
}
});
var attemptCopy = attempt; // Capture in local variable
script = BuildScript.Bind(script, ret =>
{
switch (ret)
{
case 0:
return BuildScript.Success;
case 143 when attemptCopy < 3:
HandleRetryExitCode143(dotnet, attemptCopy, logger);
return info.Script;
default:
return BuildScript.Failure;
}
});

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn, you got me there CoPilot!!

{
var res = dotnetCliInvoker.RunCommand("--info", silent: false);
if (!res)
// Allow up to three retry attempts to run `dotnet --info`, to mitigate transient issues
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says "Allow up to three retry attempts" but the loop runs 4 times (attempt = 0, 1, 2, 3). This means there are up to 3 retries after the initial attempt, for a total of 4 attempts. Consider updating the comment to clarify this is "up to 4 attempts (with 3 retries)" or adjust the loop to match the comment.

Suggested change
// Allow up to three retry attempts to run `dotnet --info`, to mitigate transient issues
// Allow up to four attempts (with up to three retries) to run `dotnet --info`, to mitigate transient issues

Copilot uses AI. Check for mistakes.
@michaelnebel michaelnebel changed the title C#: Gracefully handle dotnet --info exit code 143. C#: Retry logic for dotnet --info when it fails with exit code 143. Dec 3, 2025
Copy link
Contributor

@hvitved hvitved left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks plausible to me, one comment.

return;
case 143 when attempt < 3:
HandleRetryExitCode143(dotnetCliInvoker.Exec, attempt, logger);
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a continue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The break statement is in the scope of the switch statement to prevent fall-through (it doesn't break out of the loop). As there are no statements after the loop, it should be equivalent to use continue here. Do you prefer that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, it is for the switch. Perhaps continue is better to avoid confusion.

@michaelnebel
Copy link
Contributor Author

DCA looks good (some degree of wobliness, but nothing out of the usual).

@michaelnebel michaelnebel merged commit e74031b into github:main Dec 4, 2025
21 checks passed
@michaelnebel michaelnebel deleted the csharp/nocrashdotnetinfo branch December 4, 2025 08:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C# no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants