Skip to content

MonitoringService: fix SetKernel/CollectAsync race with lock-free state snapshot#35

Merged
casuffitsharp merged 4 commits intomainfrom
copilot/synchronize-shared-state-monitoring
Mar 3, 2026
Merged

MonitoringService: fix SetKernel/CollectAsync race with lock-free state snapshot#35
casuffitsharp merged 4 commits intomainfrom
copilot/synchronize-shared-state-monitoring

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Mar 2, 2026

  • Remove SemaphoreSlim _kernelLock and all associated lock/acquire/release code
  • Bundle _kernel + all _prev* + device-state into a single inner KernelState class
  • Replace eleven separate instance fields with a single volatile KernelState _state
  • SetKernel(): build new KernelState fully, then atomically assign _state = newState (no blocking, no locks)
  • CollectAsync(): capture _state once per tick into a local; use it throughout — no lock needed
  • SubscribeDevice/OnIoRequestCompleted close over their own KernelState — IO events always go to the correct state
  • Reorder SetKernel: publish _state = newState before Clear() calls so CollectAsync sees the new state before metrics are wiped
  • Build passes with 0 warnings, 0 errors; CodeQL finds 0 alerts
Original prompt

This section details on the original issue you should resolve

<issue_title>copilot: Synchronize SetKernel/CollectAsync shared state in MonitoringService</issue_title>
<issue_description>Context: Copilot review comment on PR #31
#31 (comment)

File: src/ProcSim.Core/Monitoring/MonitoringService.cs

Offending snippet (reads shared state without synchronization)

// MonitoringService.cs (main @ 87e19c808f6e9a21213fd45805a481fab355d34e)

private async Task CollectAsync(CancellationToken ct)
{
    while (await _timer.WaitForNextTickAsync(ct))
    {
        if (_kernel is null || _kernel.GlobalCycle == 0)
            continue;

        DateTime ts = DateTime.UtcNow;
        ulong nowTick = _kernel.GlobalCycle;

        foreach (Cpu cpu in _kernel.Cpus.Values)
        {
            CpuSnapshot prev = _prevCpu[cpu.Id];
            ...
        }
    }
}

Related mutation/reset happens in SetKernel():

public void SetKernel(Kernel kernel)
{
    _kernel = kernel;

    _prevCpu.Clear();
    _prevProcCycles.Clear();
    _prevChan.Clear();
    ...
}

Copilot notes that CollectAsync reads shared mutable state (_kernel and various *_prev* dictionaries) without synchronization, while SetKernel() can run concurrently (sets _kernel and clears dictionaries). This can lead to observing partially-reset state and runtime exceptions (e.g., _kernel becoming null between reads, or metric collection indexing into cleared dictionaries).

Acceptance criteria

  • Prevent races between SetKernel() and metric collection (CollectAsync and helpers).
  • Ensure metric collection uses a stable snapshot of the kernel + per-kernel metric state (or synchronized access).
  • No possible NullReferenceException / KeyNotFoundException due to concurrent resets.
  • Tests/build/CI pass.</issue_description>

Comments on the Issue (you are @copilot in this section)


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

@casuffitsharp casuffitsharp added the bug Something isn't working label Mar 2, 2026
Copilot AI and others added 2 commits March 2, 2026 23:35
Co-authored-by: casuffitsharp <78829483+casuffitsharp@users.noreply.github.com>
… in MonitoringService

Co-authored-by: casuffitsharp <78829483+casuffitsharp@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix synchronization of SetKernel/CollectAsync shared state in MonitoringService MonitoringService: fix SetKernel/CollectAsync race with lock-free state snapshot Mar 2, 2026
@casuffitsharp casuffitsharp requested a review from Copilot March 2, 2026 23:42
@casuffitsharp casuffitsharp marked this pull request as ready for review March 2, 2026 23:46
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a race condition in MonitoringService between SetKernel() (which could clear and re-initialize shared state dictionaries) and CollectAsync() (which reads that same shared state concurrently). The fix introduces a KernelState inner class that bundles the kernel reference with all associated per-tick state, then publishes it as a single volatile reference swap, so CollectAsync always operates on a coherent snapshot.

Changes:

  • Added a KernelState inner class bundling Kernel, all _prev* dicts, and IO tracking collections into one object.
  • Replaced eleven separate mutable instance fields with a single private volatile KernelState _state; SetKernel builds the new state off the hot path and publishes it atomically.
  • Updated CollectAsync and all Collect* / SubscribeDevice / OnIoRequestCompleted methods to pass KernelState state as a parameter, making event handlers close over their creation-time state.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/ProcSim.Core/Monitoring/MonitoringService.cs Outdated
Co-authored-by: casuffitsharp <78829483+casuffitsharp@users.noreply.github.com>
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented Mar 2, 2026

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@casuffitsharp casuffitsharp merged commit 7cb8f2a into main Mar 3, 2026
11 checks passed
@casuffitsharp casuffitsharp deleted the copilot/synchronize-shared-state-monitoring branch March 3, 2026 00:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

copilot: Synchronize SetKernel/CollectAsync shared state in MonitoringService

3 participants