Fix a crash caused by improper management of async workers#21

Merged
savetheclocktower merged 2 commits into master from fix-subsequence-job-cancellation
Feb 16, 2026

@savetheclocktower

This is the cause of at least one of the crashes reported in pulsar#1438, but hopefully all three.

The theory of the crash, produced during a rubber-ducking session with Claude:

  • TextBuffer::findWordsWithSubsequence is called (this is used by autocomplete-plus’s built-in subsequence provider) and delegates to find_words_with_subsequence_in_range in the C++ code
  • that can be a costly task, so it's async; a job is scheduled and placed in outstanding_workers
  • a subsequent call to TextBuffer::setTextInRange calls set_text_in_range in the C++; this can invalidate whatever data find_words_with_subsequence_in_range was collecting, so it cancels the worker via cancel_queued_workers and worker->Cancel()
  • if we're unlucky, the worker is never removed from outstanding_workers because that happens in the worker’s Execute method — and we could've cancelled before that method was called
  • later, set_text_in_range triggers another call to cancel_queued_workers
  • the already-cancelled job from before is still present in the set
  • we try to call Cancel on it again, but the memory has been freed, and everything falls apart

The fix is to take the code that removes a job from outstanding_workers and move it from Execute to OnWorkComplete. The latter is guaranteed to be called, no matter how a job finished. Also, Execute runs in its own thread, but OnWorkComplete runs on the main thread, where all outstanding_workers access should happen in the first place.
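
To make the difference between the two cleanup points concrete, here is a minimal single-threaded sketch. It is a toy model, not the actual code from this PR or the node-addon-api classes: `Worker`, `Scheduler`, `Drain`, `StaleEntries`, and `simulate` are hypothetical names, and instead of freeing a finished worker it only marks it `completed` so the stale entry can be counted safely. It assumes, as described above, that a cancelled job skips `Execute` but still reaches `OnWorkComplete`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an async worker; not the real node-addon-api class.
struct Worker {
  bool cancelled = false;
  bool completed = false;  // in the real code the worker is freed at this point
};

class Scheduler {
 public:
  explicit Scheduler(bool remove_in_on_work_complete)
      : remove_in_on_work_complete_(remove_in_on_work_complete) {}

  // Schedule a job and track it in outstanding_workers_.
  Worker* Queue() {
    owned_.push_back(std::make_unique<Worker>());
    Worker* worker = owned_.back().get();
    outstanding_workers_.insert(worker);
    return worker;
  }

  // Cancel everything that is still tracked.
  void CancelQueuedWorkers() {
    for (Worker* worker : outstanding_workers_) worker->cancelled = true;
  }

  // Let every queued job finish: cancelled jobs skip Execute,
  // but every job reaches OnWorkComplete.
  void Drain() {
    for (auto& worker : owned_) {
      if (worker->completed) continue;
      if (!worker->cancelled) Execute(worker.get());
      OnWorkComplete(worker.get());
    }
  }

  // Tracked entries whose worker has finished; these would be
  // dangling pointers in the real code.
  std::size_t StaleEntries() const {
    std::size_t count = 0;
    for (const Worker* worker : outstanding_workers_) {
      if (worker->completed) ++count;
    }
    return count;
  }

 private:
  void Execute(Worker* worker) {
    assert(!worker->cancelled);  // never reached for a cancelled job
    // Old behaviour: cleanup lives here, so a cancelled job is never removed.
    if (!remove_in_on_work_complete_) outstanding_workers_.erase(worker);
  }

  void OnWorkComplete(Worker* worker) {
    // New behaviour: cleanup lives here, so it runs for every job.
    if (remove_in_on_work_complete_) outstanding_workers_.erase(worker);
    worker->completed = true;
  }

  bool remove_in_on_work_complete_;
  std::vector<std::unique_ptr<Worker>> owned_;
  std::unordered_set<Worker*> outstanding_workers_;
};

// Queue one job, cancel it before it runs, let it finish,
// then report how many stale entries remain in the set.
std::size_t simulate(bool remove_in_on_work_complete) {
  Scheduler scheduler(remove_in_on_work_complete);
  scheduler.Queue();
  scheduler.CancelQueuedWorkers();
  scheduler.Drain();
  return scheduler.StaleEntries();
}
```

In this model, `simulate(false)` leaves one stale entry behind (the situation that a second `cancel_queued_workers` call would trip over), while `simulate(true)` leaves none.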

Because this is a theory produced by a man and his hallucinating robot, it's important to back it up with proof. The first commit in this PR adds a failing test; the second commit makes the test pass. So if you're reviewing this PR:

  • Check out 1f1caff, run npm run build:node, then npm run test:node, and ensure you see a failing test with a crashdump
  • Check out b89da52, repeat these steps, and see a green test suite

The theory of the crash, produced during a rubber-ducking session with Claude:

* `find_words_with_subsequence_in_range` is called
* a job is scheduled and placed in `outstanding_workers`
* a call to `set_text_in_range` cancels a pending worker via `cancel_queued_workers` and `worker->Cancel`
* the worker is never removed from `outstanding_workers` because that happens in `Execute` — and we cancelled before the job got that far
* later, `set_text_in_range` triggers another call to `cancel_queued_workers`
* the already-cancelled job from before is still present in the set
* we try to call `Cancel` on it again, but the memory has been freed


The fix is to take the code that removes a job from `outstanding_workers` and move it from `Execute` to `OnWorkComplete`. The latter is guaranteed to be called, no matter how a job finished.

Because this is a theory produced by a man and his hallucinating robot, it's important to back it up with proof. The test added in the previous commit should now pass instead of crashing. So this fix works, even if there are slight flaws in the reasoning above.

@confused-Techie confused-Techie left a comment

While I'm not versed in C, the changes here are extremely minimal, simply moving existing code in line with what you've described.

Plus, seeing tests added for exactly this, and seeing them pass in CI, makes me think we are 100% in the clear for this being a resolution.

Just prior to merging, make sure to bump the version key in package.json so we can easily make a release from this. With that change, I'm happy to approve and get this merged!

@savetheclocktower savetheclocktower merged commit 2dd2070 into master Feb 16, 2026
20 checks passed
@savetheclocktower savetheclocktower deleted the fix-subsequence-job-cancellation branch February 16, 2026 20:35