Skip to content

fix: correct FSI before throwing "Not enough space" to break poisoned-cache loop#59

Merged
mrdevrobot merged 3 commits intomainfrom
copilot/fix-insert-bulk-async-error
Apr 14, 2026
Merged

fix: correct FSI before throwing "Not enough space" to break poisoned-cache loop#59
mrdevrobot merged 3 commits intomainfrom
copilot/fix-insert-bulk-async-error

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 13, 2026

The in-memory Free Space Index (_fsi) is updated immediately (non-transactionally) when DeleteCore compacts a page. A rolled-back delete reverts the physical page via WAL but leaves the FSI with an inflated free-space value. Every subsequent insert is routed to that stale page, hits the physical capacity limit, and throws — without ever correcting the FSI — creating an unrecoverable loop.

Changes

  • DocumentCollection.InsertIntoPage — call _fsi.Update(pageId, actualFreeSpace) before throwing "Not enough space". The error message now captures the stale FSI value (pre-correction) for diagnostics:

    _fsi.TryGetFreeBytes(pageId, out var staleFsi);
    _fsi.Update(pageId, freeSpace);                  // break the loop
    throw new InvalidOperationException(
        $"Not enough space: need {requiredSpace}, have {freeSpace} | ... | FSI={staleFsi}");
  • DynamicCollection.InsertIntoPage — same fix.

  • Regression test InsertAsync_SucceedsAfterRollbackInducedFsiDesync — populates pages with large documents, induces desync via delete + rollback, then asserts that a subsequent insert succeeds within two attempts (first attempt corrects the FSI; second allocates a fresh page).

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • av-build-tel-api-v1.avaloniaui.net
    • Triggering command: /usr/share/dotnet/dotnet dotnet exec --runtimeconfig /home/REDACTED/.nuget/packages/avalonia.buildservices/11.3.2/tools/netstandard2.0/runtimeconfig.json /home/REDACTED/.nuget/packages/avalonia.buildservices/11.3.2/tools/netstandard2.0/Avalonia.BuildServices.Collector.dll (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI and others added 2 commits April 13, 2026 17:19
…-cache loop

When InsertIntoPage detects that the physical page has less free space
than the FSI claimed, update the FSI with the actual physical free-space
value before throwing the InvalidOperationException. This prevents the
"poisoned-cache loop" where subsequent inserts keep routing to the same
full page indefinitely after a transaction rollback caused FSI desync.

Apply the same fix in DynamicCollection.InsertIntoPage.

Add regression test: InsertAsync_SucceedsAfterRollbackInducedFsiDesync.

Agent-Logs-Url: https://github.com/EntglDb/BLite/sessions/42159dfa-db7f-49de-b7b8-352a160c3070

Co-authored-by: mrdevrobot <12503462+mrdevrobot@users.noreply.github.com>
- Show the stale FSI value (before correction) in the "Not enough space"
  error message for clearer diagnosis
- Fix American English spelling in test comment

Agent-Logs-Url: https://github.com/EntglDb/BLite/sessions/42159dfa-db7f-49de-b7b8-352a160c3070

Co-authored-by: mrdevrobot <12503462+mrdevrobot@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix InsertBulkAsync fails with 'Not enough space' error fix: correct FSI before throwing "Not enough space" to break poisoned-cache loop Apr 13, 2026
Copilot AI requested a review from mrdevrobot April 13, 2026 17:23
@mrdevrobot mrdevrobot marked this pull request as ready for review April 14, 2026 07:11
Copilot AI review requested due to automatic review settings April 14, 2026 07:11
@mrdevrobot mrdevrobot merged commit 561da42 into main Apr 14, 2026
2 checks passed
@mrdevrobot mrdevrobot deleted the copilot/fix-insert-bulk-async-error branch April 14, 2026 07:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Addresses a rollback-induced Free Space Index (FSI) desynchronization where a stale (overly optimistic) in-memory FSI entry can cause repeated inserts to be routed to an actually-full page, resulting in a “Not enough space” failure loop.

Changes:

  • Update FSI with the page’s actual free space before throwing “Not enough space” in DocumentCollection.InsertIntoPage.
  • Apply the same “correct FSI before throw” behavior in DynamicCollection.InsertIntoPage.
  • Add a regression test that reproduces the rollback-induced FSI desync and asserts inserts succeed within two attempts.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
tests/BLite.Tests/InsertBulkTests.cs Adds regression test reproducing the poisoned-cache loop scenario.
src/BLite.Core/DynamicCollection.cs Updates FSI before throwing on insufficient page space to break the loop.
src/BLite.Core/Collections/DocumentCollection.cs Captures stale FSI value for diagnostics and corrects FSI before throwing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1323 to +1329
{
// Correct the FSI to match the physical page state so that the next call to
// FindPageWithSpace does not route back to this page (breaks the poisoned-cache loop
// that can arise when a transaction that updated the FSI was later rolled back).
_fsi.Update(pageId, header.AvailableFreeSpace);
throw new InvalidOperationException($"Not enough space in page {pageId}: need {requiredSpace}, have {header.AvailableFreeSpace}");
}
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix was applied to DynamicCollection.InsertIntoPage as well, but the regression test added in this PR only exercises the typed DocumentCollection path. Since DynamicCollection has its own test suite (e.g., tests/BLite.Tests/DynamicCollectionTests.cs), consider adding a corresponding rollback-induced FSI desync test for DynamicCollection to prevent regressions in that codepath too.

Copilot uses AI. Check for mistakes.
Comment on lines +1114 to +1116
_fsi.TryGetFreeBytes(pageId, out var staleFsi);
_fsi.Update(pageId, freeSpace);
throw new InvalidOperationException($"Not enough space: need {requiredSpace}, have {freeSpace} | PageId={pageId} | SlotCount={header.SlotCount} | Start={header.FreeSpaceStart} | End={header.FreeSpaceEnd} | FSI={staleFsi}");
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

staleFsi will be 0 when TryGetFreeBytes returns false, which changes the diagnostic from the previous FSI=n/a to a misleading FSI=0. Consider preserving the prior behavior by capturing the bool return and formatting FSI as n/a when the page isn't tracked in the FSI yet.

Suggested change
_fsi.TryGetFreeBytes(pageId, out var staleFsi);
_fsi.Update(pageId, freeSpace);
throw new InvalidOperationException($"Not enough space: need {requiredSpace}, have {freeSpace} | PageId={pageId} | SlotCount={header.SlotCount} | Start={header.FreeSpaceStart} | End={header.FreeSpaceEnd} | FSI={staleFsi}");
var hasStaleFsi = _fsi.TryGetFreeBytes(pageId, out var staleFsi);
var fsiDiagnostic = hasStaleFsi ? staleFsi.ToString() : "n/a";
_fsi.Update(pageId, freeSpace);
throw new InvalidOperationException($"Not enough space: need {requiredSpace}, have {freeSpace} | PageId={pageId} | SlotCount={header.SlotCount} | Start={header.FreeSpaceStart} | End={header.FreeSpaceEnd} | FSI={fsiDiagnostic}");

Copilot uses AI. Check for mistakes.
inserted = true;
break;
}
catch (InvalidOperationException ex) when (ex.Message.StartsWith("Not enough space"))
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception filter uses ex.Message.StartsWith("Not enough space") without a StringComparison, which makes the match culture-sensitive and a bit brittle. Use the StartsWith(string, StringComparison.Ordinal) overload (or another non-message-based discriminator if available) to keep the test stable across environments.

Suggested change
catch (InvalidOperationException ex) when (ex.Message.StartsWith("Not enough space"))
catch (InvalidOperationException ex) when (ex.Message.StartsWith("Not enough space", System.StringComparison.Ordinal))

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

InsertBulkAsync fails with "Not enough space" due to an unrecoverable desync between in-memory FSI and physical page (v4.3)

3 participants