fix: race in attaching to existing sessions #366

Merged
ethanpailes merged 2 commits into master from fix-attach-race on May 7, 2026

Conversation

@ethanpailes (Contributor) commented May 7, 2026

Issue Link

n/a. I just noticed this when I logged on this morning.

AI Policy Ack

Ack

This PR was:

  • mostly or completely vibe coded
  • mostly or completely meat coded
  • bit of both

I wrote the fix; Gemini wrote the test and helped with diagnosis.

Description

This patch fixes a race condition that could happen when two
different processes raced to attach to the same session. Since we
release the session lock briefly as part of the attach handshake,
we were leaving a window open in which two racing attaches could
trample one another. The regression test in this patch shows
the issue in detail.

To fix the issue, select_shell_desc now returns the lock guard
for the session, so the lock is never released between selecting
the session and completing the attach.

To do that, I needed to switch to parking_lot mutexes, since the
stdlib types cannot hand back an owned lock guard for an
Arc<Mutex<T>>.
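
A minimal sketch of the shape of the bug and the fix, using only stdlib types. This is not the code from this patch: `Session`, `racy_check`, `racy_commit`, and `select_free` are hypothetical names, and the bad interleaving is replayed by hand on one thread so the outcome is deterministic. The borrowed `MutexGuard` works here only because the mutex outlives the call; returning a guard for an `Arc<Mutex<T>>` pulled out of a table is the case that needs an owned guard.

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical stand-in for per-session state; not the project's real type.
#[derive(Default)]
struct Session {
    attached_client: Option<u32>,
}

// Racy shape, phase 1: check under the lock, then let the guard drop.
fn racy_check(session: &Mutex<Session>) -> bool {
    let free = session.lock().unwrap().attached_client.is_none();
    free // the lock is already released here: the window is open
}

// Racy shape, phase 2: re-lock and commit, trusting the stale check.
fn racy_commit(session: &Mutex<Session>, client: u32) {
    session.lock().unwrap().attached_client = Some(client);
}

// Fixed shape: return the guard so check and commit share one critical
// section. try_lock is used only so this single-threaded demo cannot
// deadlock; real code would more likely block on lock().
fn select_free(session: &Mutex<Session>) -> Option<MutexGuard<'_, Session>> {
    let guard = session.try_lock().ok()?;
    if guard.attached_client.is_none() {
        Some(guard)
    } else {
        None
    }
}

// Replays the bad interleaving by hand: A checks, B checks, both commit.
fn simulate_racy() -> (bool, bool, Option<u32>) {
    let session = Mutex::new(Session::default());
    let a_ok = racy_check(&session);
    let b_ok = racy_check(&session);
    if a_ok {
        racy_commit(&session, 1);
    }
    if b_ok {
        racy_commit(&session, 2); // tramples A's attach
    }
    let owner = session.lock().unwrap().attached_client;
    (a_ok, b_ok, owner)
}

// Same interleaving, but A keeps the guard, so B cannot get in.
fn simulate_fixed() -> (bool, bool, Option<u32>) {
    let session = Mutex::new(Session::default());
    let a = select_free(&session);
    let b = select_free(&session); // A's guard is still held
    let (a_ok, b_ok) = (a.is_some(), b.is_some());
    if let Some(mut guard) = a {
        guard.attached_client = Some(1);
    }
    let owner = session.lock().unwrap().attached_client;
    (a_ok, b_ok, owner)
}

fn main() {
    println!("racy:  {:?}", simulate_racy());
    println!("fixed: {:?}", simulate_fixed());
}
```

In the racy replay both attachers believe they won and the second overwrites the first; in the fixed replay the second attacher is turned away while the first still holds the guard.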

This patch migrates from stdlib mutexes to parking_lot
mutexes. This probably provides a small perf win, but
the main reason I want it is parking_lot's arc_lock feature
(the lock_arc method), which I'm going to need in order to fix
a race condition in how we handle attaches.
@ethanpailes
Contributor Author

In practice this should only happen during a suspend or something like that. That's how I encountered the issue.

@ethanpailes
Contributor Author

I split this up into one commit for the Mutex migration and one for the actual fix to make it easier to review in isolation.

@ethanpailes merged commit edf44cf into master on May 7, 2026
7 checks passed
@ethanpailes deleted the fix-attach-race branch on May 7, 2026 17:42