
Fix CPU spin in BatchingChannelReader when source completes #53

Merged
electricessence merged 5 commits into master from copilot/fix-batching-channel-hang on Jan 31, 2026
Conversation

Contributor

Copilot AI commented Jan 30, 2026

Tasks reading from batched channels would hang indefinitely and consume CPU when the source channel completed but the buffer was empty.

Root Cause

In BatchingChannelReader.WaitToReadAsyncCore, the source task was not awaited after Task.WhenAny returned and before looping. Once the source channel closed, each WaitToReadAsync call returned an already-completed ValueTask, so the loop spun at full speed:

await Task.WhenAny(s.AsTask(), b).ConfigureAwait(false);
if (b.IsCompleted) { /* ... */ }
// Missing await here - loops immediately if source is closed
goto start;
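
The corrected flow awaits the source task after WhenAny. A minimal sketch, assuming the s and b names and goto-style loop from the snippet above (the repository's actual method body differs in detail):

var sTask = s.AsTask();
await Task.WhenAny(sTask, b).ConfigureAwait(false);
if (b.IsCompleted) { /* ... return the buffer's wait result ... */ }

// The fix: observe the source task's result. A closed source completes
// with false here, so the loop can finish instead of re-entering on an
// already-completed ValueTask.
if (!await sTask.ConfigureAwait(false))
{
    /* source is complete; stop looping and finalize per the method's contract */
}

goto start;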

Changes

  • BatchingChannelReader.cs: Await source task after WhenAny to verify completion status before looping
  • HangReproTest.cs: Add test reproducing concurrent batched readers scenario

The fix aligns with the base BufferingChannelReader pattern and prevents the infinite loop while preserving timeout/force-batch semantics.
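
For flavor, a hypothetical shape for such a regression test (the actual HangReproTest.cs may differ; GetSource here stands in for a short async-enumerable source like the one in the issue repro quoted below):

[Fact]
public static async Task MultipleBatchReadersDoNotHang()
{
    // Safety net: if the spin/hang regresses, the token cancels the run
    // instead of hanging the test suite.
    using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));

    Task[] tasks = Enumerable.Range(0, 8)
        .Select(_ => Task.Run(async () =>
        {
            for (int i = 0; i < 25; i++)
            {
                await GetSource()
                    .ToChannel(singleReader: true, cancellationToken: cts.Token)
                    .Batch(Random.Shared.Next(15, 30), singleReader: true)
                    .ReadAllAsync(async batch => await Task.Delay(5), cts.Token);
            }
        }, cts.Token))
        .ToArray();

    await Task.WhenAll(tasks); // faults with OperationCanceledException if the timeout fires
}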

Original prompt

This section details the original issue you should resolve.

<issue_title>Some tasks hang and consume CPU when reading from a batched channel</issue_title>
<issue_description>### Description
When reading from a batched channel using Batch and ReadAllAsync, some tasks occasionally hang indefinitely and consume CPU. This happens even with a simple repro using multiple tasks reading from an IAsyncEnumerable.

Repro

// Usings added for a self-contained repro (Open.ChannelExtensions provides
// ToChannel/Batch/ReadAllAsync):
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Open.ChannelExtensions;

public static class Program
{
    public static async Task Main()
    {
        var counts = new int[10];
        var tasks = Enumerable.Range(0, 10)
            .Select(x => Task.Run(async () =>
            {
                while (true)
                {
                    await GetSource()
                        .ToChannel(singleReader: true, cancellationToken: CancellationToken.None)
                        .Batch(Random.Shared.Next(15, 30), singleReader: true)
                        .ReadAllAsync(async batch =>
                        {
                            await Task.Delay(Random.Shared.Next(2, 15));
                        }, CancellationToken.None);

                    counts[x] += 1;
                    if (counts[x] % 10 == 0)
                    {
                        Console.WriteLine(counts.ToJson()); // ToJson(): reporter's JSON helper (e.g. JsonSerializer.Serialize)
                    }
                }
            })).ToArray();

        await Task.WhenAll(tasks);
    }

    private static async IAsyncEnumerable<int> GetSource()
    {
        foreach (var value in Enumerable.Range(0, Random.Shared.Next(80, 120)))
        {
            yield return value;

            if (value % Random.Shared.Next(15, 25) == 0)
                await Task.Delay(Random.Shared.Next(2, 15));
        }
    }
}

Result

The program works correctly for a while. Then some tasks stop incrementing their counts (for example, the task with index 2). They appear to hang in WaitToReadAsyncCore indefinitely, consuming CPU. Sample output of counts, printed every 10 iterations:

[1094,1096,642,1101,1098,1088,1093,1101,1076,1090]
[1096,1097,642,1102,1099,1090,1095,1103,1078,1092]
[1096,1097,642,1102,1100,1090,1095,1103,1078,1092]
[1098,1099,642,1104,1101,1092,1097,1105,1080,1094]
[1098,1100,642,1104,1101,1092,1097,1105,1080,1094]
[1100,1102,642,1107,1102,1093,1098,1107,1081,1096]
[1101,1103,642,1108,1103,1094,1100,1109,1083,1097]
[1101,1103,642,1108,1103,1095,1100,1110,1083,1097]

</issue_description>

<agent_instructions>Investigate this issue to see if it's something we can fix or is related to a Channel bug that is out of our control.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@electricessence @arkolka, When using batching, have you tried `.WithTimeout` ?
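
A sketch of that suggestion (illustrative; the batch size and timeout values are assumed, and GetSource refers to the repro's source above). .WithTimeout forces a partial batch to be emitted after the given delay, so readers are not left waiting on an unfilled batch:

await GetSource()
    .ToChannel(singleReader: true, cancellationToken: CancellationToken.None)
    .Batch(20, singleReader: true)
    .WithTimeout(1000) // emit any partial batch after 1 second
    .ReadAllAsync(async batch => await Task.Delay(5), CancellationToken.None);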


Copilot AI and others added 3 commits January 30, 2026 22:52, each co-authored by electricessence <5899455+electricessence@users.noreply.github.com>. The visible commit messages:

  • The issue was that after Task.WhenAny completed, the code did not await the source task before looping. This caused a tight CPU-consuming loop when the source channel was closed but the buffer was empty, as the closed source would immediately return a completed ValueTask on each WaitToReadAsync call. The fix properly awaits the source task after WhenAny to check its completion status, preventing the tight loop. This matches the pattern used in the base BufferingChannelReader class.
  • Remove unused System.Text.Json import from the test; use a consistent attribute style in the test; add a clarifying comment explaining the buffer completion check.
Copilot AI changed the title from "[WIP] Fix task hanging and CPU consumption in batched channel reading" to "Fix CPU spin in BatchingChannelReader when source completes" on Jan 30, 2026.
Copilot AI requested a review from electricessence January 30, 2026 23:20
Updated Open.ChannelExtensions.csproj to increase the project version from 9.1.1 to 9.1.2. No other modifications were made.
@electricessence electricessence marked this pull request as ready for review January 31, 2026 01:35
Copilot AI review requested due to automatic review settings January 31, 2026 01:35
@electricessence electricessence merged commit 150c616 into master Jan 31, 2026
4 checks passed
@electricessence electricessence deleted the copilot/fix-batching-channel-hang branch January 31, 2026 01:35

Copilot AI left a comment


Pull request overview

This PR fixes a critical CPU spin bug in BatchingChannelReader where tasks would hang indefinitely and consume CPU when the source channel completed with an empty buffer. The root cause was that after Task.WhenAny returned, the source task was not awaited before looping, causing a tight spin loop when the source channel closed.

Changes:

  • Fixed BatchingChannelReader.WaitToReadAsyncCore to await the source task after WhenAny before looping back
  • Added regression test to verify concurrent batched readers don't hang
  • Bumped version from 9.1.1 to 9.1.2

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Reviewed files:
  • Open.ChannelExtensions/Readers/BatchingChannelReader.cs: Fixes the CPU spin by awaiting the source task after WhenAny and checking buffer completion before looping
  • Open.ChannelExtensions.Tests/HangReproTest.cs: Adds a test to reproduce the hang with multiple concurrent batch readers and verify the fix
  • Open.ChannelExtensions/Open.ChannelExtensions.csproj: Version bump from 9.1.1 to 9.1.2 for the bug-fix release


[Fact]
public static async Task MultipleBatchReadersDoNotHang()
{
    var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30)); // 30 second timeout

Copilot AI Jan 31, 2026


The CancellationTokenSource should be disposed using a using declaration to follow the codebase convention and ensure proper resource cleanup. Other tests in this codebase consistently use using var tokenSource = new CancellationTokenSource(...) pattern.

Suggested change
var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30)); // 30 second timeout
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30)); // 30 second timeout

  <GenerateDocumentationFile>true</GenerateDocumentationFile>
- <Version>9.1.1</Version>
+ <Version>9.1.2</Version>
  <PackageReleaseNotes>Ensure the current task scheduler is used and that it can be configured.</PackageReleaseNotes>

Copilot AI Jan 31, 2026


The PackageReleaseNotes should be updated to reflect the actual changes in this release. The current text "Ensure the current task scheduler is used and that it can be configured." appears to be from a previous version and doesn't describe the CPU spin bug fix that is the focus of this PR.

Suggested change
<PackageReleaseNotes>Ensure the current task scheduler is used and that it can be configured.</PackageReleaseNotes>
<PackageReleaseNotes>Fix a CPU spin issue under high contention when using channel extensions.</PackageReleaseNotes>

// If so, return immediately without trying to pipe more items
if (b.IsCompleted)
return await b.ConfigureAwait(false);

Collaborator


@copilot, after re-reviewing this code, I'm concerned that the assumptions here are wrong.

		if (b.IsCompleted) // Assuming it was bufferWait that completed.
		{
			await tokenSource.CancelAsync().ConfigureAwait(false);
			return await b.ConfigureAwait(false);
		}

If you make it past the above block due to Task.WhenAny, then you absolutely know that sTask must have completed.
So line 258 is moot.
And line 261 might actually change the behavior.
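
(A sketch of the invariant being described, using the sTask and b names from the quoted block; illustrative only:)

await Task.WhenAny(sTask, b).ConfigureAwait(false);

if (b.IsCompleted)
{
    // bufferWait finished (possibly alongside sTask).
}
else
{
    // WhenAny returned and b is not complete, so sTask is necessarily the
    // task that finished; awaiting it only observes the result and cannot
    // block or spin.
    bool sourceHasMore = await sTask.ConfigureAwait(false);
}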

