[clr-interp] Fix GC liveness reporting stale data for a var, before it is actually set#128709

Open
BrzVlad wants to merge 2 commits into dotnet:main from BrzVlad:fix-clrinterp-pinning

BrzVlad (Member) commented May 28, 2026

Liveness for a var starts at the instruction that writes it. Consider this scenario:

```
call GC.Collect()
ldc var <- null
```

When the GC runs, the ip in this frame points to the address of the ldc instruction. The liveness of var starts at this instruction as well, which means the GC will see it as alive, even though it hasn't been set yet, reporting stale data as a root.

As a simple fix, we consider the liveness start for a var as being the last ip inside the instruction that sets it. The var is not logically alive until the instruction that sets it actually finishes executing.
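To make the scenario concrete, here is a minimal sketch of the lookup in Python. It is a toy model, not the interpreter's code: the `Instr` type, the instruction sizes, the half-open range convention and the `live_end` value are all assumptions made for illustration. It also assumes the defining instruction spans more than one ip slot, since otherwise the last ip inside it equals its first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    offset: int  # ip of the first slot of the instruction
    length: int  # number of slots it occupies (hypothetical sizes)


def live_start_old(defining: Instr) -> int:
    # Before the fix: liveness starts at the first ip of the defining instruction.
    return defining.offset


def live_start_new(defining: Instr) -> int:
    # After the fix: liveness starts at the last ip inside the defining instruction.
    return defining.offset + defining.length - 1


def is_reported(ip: int, live_start: int, live_end: int) -> bool:
    # Assumed convention: a var is reported when live_start <= ip < live_end.
    return live_start <= ip < live_end


call = Instr("call GC.Collect()", offset=0, length=4)
ldc = Instr("ldc var <- null", offset=4, length=2)
live_end = 10  # hypothetical ip of the last use of var

# When the GC runs, the frame's ip points at the ldc instruction.
gc_ip = ldc.offset

reported_before_fix = is_reported(gc_ip, live_start_old(ldc), live_end)
reported_after_fix = is_reported(gc_ip, live_start_new(ldc), live_end)
```

In this model `reported_before_fix` is true (the stale slot is treated as a root) and `reported_after_fix` is false, because the ip at GC time is still below the shifted live start.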

Fixes LibraryImportGenerator.UnitTests.IncrementalGenerationTests.GeneratorRun_WithNewCompilation_DoesNotKeepOldCompilationAlive

While the interpreter is not as precise as JIT when it comes to stack roots (and making it precise is not a clear goal), it seems to work pretty well in practice and I believe this fix is simple enough.

dotnet-policy-service (Contributor)

Tagging subscribers to this area: @JulieLeeMSFT, @BrzVlad, @janvorli, @kg
See info in area-owners.md if you want to be subscribed.

Copilot AI (Contributor) left a comment

Pull request overview

This PR adjusts CoreCLR interpreter GC liveness reporting so a temporary/local variable becomes live only after the instruction that defines it has completed, avoiding reporting stale stack data as a GC root.

Changes:

  • Shifts non-global variable live-start offsets past the defining instruction.
  • Skips zero-length live ranges for variables that are set but never subsequently used.
  • Updates interpreter GC diagnostic dump formatting to use hex offsets.
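As a rough illustration of the last two bullets, the sketch below skips zero-length ranges and prints offsets in hex. The function name, the variable name `var3`, the offsets and the output layout are all hypothetical; the actual dump format in the interpreter is not shown in this conversation.

```python
def format_live_ranges(var_name, ranges):
    """Return one dump line per non-empty (start, end) live range."""
    lines = []
    for start, end in ranges:
        if start == end:
            # Zero-length range: the var is set but never subsequently used.
            continue
        lines.append(f"{var_name}: [0x{start:x}, 0x{end:x})")
    return lines


# Hypothetical input: one empty range and one real range.
dump = format_live_ranges("var3", [(0x1C, 0x1C), (0x20, 0x34)])
```

Here `dump` contains a single line for the `0x20` to `0x34` range; the empty range is dropped.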

kg (Member) commented May 28, 2026

How does this interact with calls and their return address? Based on your description, it sounds like the return address wouldn't be live 'during' the call instruction, even though the call target may write to the return address mid-execution and then trigger a GC.

BrzVlad (Member, Author) commented May 28, 2026

@kg The same rules should apply, i.e. the return value is dead during the call. I don't see a scenario in which a GC could be triggered mid-execution. The returnVar is written at the very end, when the called frame executes one of the INTOP_RET.. opcodes, which jumps to EXIT_FRAME. No GC can happen during this time.

davidwrighton (Member)
@BrzVlad That's not a good assumption when there can be jitted code that returns back into the interpreter. I believe the JIT requires return buffers to be kept live for the lifetime of the called method. (Note that this doesn't apply if the return is a simple object return, but it does apply in many struct return situations.)

BrzVlad (Member, Author) commented May 28, 2026

@davidwrighton Hmm, I thought we were using intermediary buffers and copying the result to the interpreter frame only during the call stubs, but this doesn't necessarily seem to be the case. In that case the change seems unreasonably risky, regardless of whether it works right now. Thoughts on restricting this PR to vars not created by calls? (Although I would have to double-check whether it still passes the test in question; otherwise it is rather pointless.)

EDIT: Might also work if we set the liveness start as the end of the defining instruction (rather than the start of the next one)

BrzVlad added 2 commits May 29, 2026 08:03
This matches the way the interp IR is being logged.
…executes

Liveness for a var starts at the instruction that writes it. Consider this scenario:
```
call GC.Collect()
ldc var <- null
```

When the GC runs, the ip in this frame points to the address of the `ldc` instruction. The liveness of var starts at this instruction as well, which means the GC will see it as alive, even though it hasn't been set yet, reporting stale data as a root.

As a simple fix, we consider the liveness start for a var as being the last ip inside the instruction that sets it. The var is not logically alive until the instruction that sets it actually finishes executing.

BrzVlad force-pushed the fix-clrinterp-pinning branch from 4750d10 to 4bdc501 May 29, 2026 05:50

BrzVlad (Member, Author) commented May 29, 2026

Looking at this again, the claim that the return value is dead during the call is misleading: unlike for normal instructions, during a call the ip points to the next instruction, where the return value is alive regardless of this change.

There was, however, a corner case in my previous change: if the return value is not used anywhere, liveStart and liveEnd would be equal (both the ip of the instruction following the call), so we would end up not reporting any live range at all because the var is considered always dead. This could have been problematic if the JIT treats this space as a root, as David pointed out.
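The corner case can be sketched with a small Python model. The helper names, the instruction size and the `start_at_next_instruction` flag are hypothetical, chosen only to contrast the two choices of live start for a call whose return value is never used.

```python
def live_range(def_offset, def_length, last_use_offset, start_at_next_instruction):
    """Return (live_start, live_end) for a var defined by one instruction."""
    next_ip = def_offset + def_length
    if start_at_next_instruction:
        # Previous change: liveness starts at the instruction after the definition.
        live_start = next_ip
    else:
        # Current change: liveness starts at the last ip inside the definition.
        live_start = next_ip - 1
    # For an unused var, liveness ends at the instruction following the definition.
    live_end = last_use_offset if last_use_offset is not None else next_ip
    return live_start, live_end


def has_live_range(live_range_pair):
    start, end = live_range_pair
    return start < end


# Hypothetical call instruction at ip 0 occupying 4 slots; return value unused.
previous = live_range(0, 4, None, start_at_next_instruction=True)
current = live_range(0, 4, None, start_at_next_instruction=False)
```

With the previous choice the range collapses to `(4, 4)` and nothing is reported for the return slot; with the current choice it is `(3, 4)`, so a live range is still emitted.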

@kg @davidwrighton Let me know if you think the change looks reasonable now.

BrzVlad (Member, Author) commented May 29, 2026

/azp run runtime-interpreter

azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

BrzVlad (Member, Author) commented May 29, 2026

/azp run runtime-libraries-interpreter

azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

4 participants