Fix race condition in LanguageModelSession when streaming responses#126

Merged
mattt merged 4 commits into huggingface:main from shareup:fix-race-condition
Feb 17, 2026
Conversation


@atdrendel atdrendel commented Feb 14, 2026

LanguageModelSession.transcript was usually, but not always, protected from race conditions by limiting changes to the MainActor. Of course, the "but not always" (such as in streamResponse(to:generating:includeSchemaInPrompt:options:)) means that race conditions were possible and LanguageModelSession's @unchecked Sendable wasn't really accurate. I tested this by adding MainActor.assumeIsolated() in the functions that didn't dispatch to MainActor and ran the tests. The assertions fired, showing that access to LanguageModelSession.transcript was not correctly protected against race conditions.
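A minimal sketch of that diagnostic (class and property names here are illustrative, not the actual library source): `MainActor.assumeIsolated` traps at runtime when the caller is not actually on the main actor, which is what turned the silent race into a visible failure.

```swift
import Foundation

// Illustrative sketch, not the actual library code: a class whose mutable
// state is supposed to be confined to the MainActor. Wrapping the mutation
// in MainActor.assumeIsolated makes any off-main-actor caller trap at
// runtime instead of racing silently.
final class Session: @unchecked Sendable {
    private(set) var transcript: [String] = []

    func append(_ entry: String) {
        MainActor.assumeIsolated {
            transcript.append(entry)
        }
    }
}
```

Calling `append` from a background task would trap immediately, turning a silent data race into a deterministic test failure.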

(Screenshots: MainActor assertion failures captured during the test run.)

When I looked more carefully at the code of LanguageModelSession, I noticed that different synchronization methods were being used. LanguageModelSession.transcript was, as I said, only sometimes protected by dispatching to MainActor. RespondingState, on the other hand, was an actor, which introduced (to my mind) unnecessary thread hops into the codebase.

My solution was to bundle up the mutable state of LanguageModelSession into a State struct and guard it with ~~OSAllocatedUnfairLock&lt;State&gt;~~ Locked&lt;State&gt;.

private struct State: Equatable, Sendable {
    var transcript: Transcript

    var isResponding: Bool { count > 0 }
    private var count = 0

    init(_ transcript: Transcript) {
        self.transcript = transcript
    }

    mutating func beginResponding() {
        count += 1
    }

    mutating func endResponding() {
        count = max(0, count - 1)
    }
}

Running the tests, everything seemed to work fine, and it reduced the number of await calls we need to make. Honestly, compared with the time cost of streaming AI responses, neither thread hops nor lock acquisition matters much, but I would guess that acquiring an uncontested lock is a bit faster than a thread hop.
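For context, the initial approach (later swapped out, per the discussion below) looked roughly like this. This is a sketch, not the PR's actual diff: the session class is abbreviated, and `[String]` stands in for `Transcript`.

```swift
import os

// Sketch of the OSAllocatedUnfairLock approach (requires iOS 16 / macOS 13).
// `Session` and the [String] transcript are illustrative stand-ins; `State`
// mirrors the struct shown above.
final class Session: @unchecked Sendable {
    private struct State: Sendable {
        var transcript: [String]
        private var count = 0

        var isResponding: Bool { count > 0 }

        init(transcript: [String]) {
            self.transcript = transcript
        }

        mutating func beginResponding() { count += 1 }
        mutating func endResponding() { count = max(0, count - 1) }
    }

    private let state = OSAllocatedUnfairLock(initialState: State(transcript: []))

    // Reads and writes go through one short critical section, replacing
    // both the MainActor hops and the RespondingState actor.
    var isResponding: Bool { state.withLock { $0.isResponding } }
    var transcript: [String] { state.withLock { $0.transcript } }

    func beginResponding() { state.withLock { $0.beginResponding() } }
    func endResponding() { state.withLock { $0.endResponding() } }
}
```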

- Fix race condition when streaming responses.
Copilot AI review requested due to automatic review settings February 14, 2026 17:55

Copilot AI left a comment


Pull request overview

This PR attempts to fix race conditions in LanguageModelSession by replacing the mixed synchronization approach (MainActor dispatching and a RespondingState actor) with a unified OSAllocatedUnfairLock<State> pattern. The changes bundle mutable state (transcript and isResponding) into a State struct protected by a lock.

Changes:

  • Replace stored properties isResponding and transcript with computed properties that read from locked state
  • Remove RespondingState actor and replace with State struct protected by OSAllocatedUnfairLock
  • Convert all MainActor.run calls to state.withLock calls for synchronization



atdrendel commented Feb 14, 2026

...shoot. I had looked through the README.md first, and I didn't think we were trying to support Linux with this library. We can't use Mutex because that requires iOS 18, and NSLock can't be used in asynchronous contexts. I'll see what I can come up with.

OK, I replaced OSAllocatedUnfairLock with Locked, which uses a pretty widely-used and widely-understood method of synchronization that is safe on multiple platforms. It's based on my own open-source implementation of Locked, which has been running in production for years without any issues. I also added a bunch of tests for Locked.

Obviously, I would prefer not to introduce any synchronization primitives to the codebase, and I'm not sure how you'll feel about the addition, @mattt, but I'm pretty happy with the fix overall.

...now I'll deal with the Observation stuff Copilot called out.

public private(set) var isResponding: Bool = false
public private(set) var transcript: Transcript
public var isResponding: Bool {
access(keyPath: \.isResponding)
atdrendel (Contributor Author) commented:

According to Philippe Hausler on the Swift Forums, the access() and withMutation() calls need to happen outside of locks.

Comment on lines +102 to +103
withMutation(keyPath: \.isResponding) {
state.access { $0.beginResponding() }
atdrendel (Contributor Author) commented:

According to Philippe Hausler on the Swift Forums, the access() and withMutation() calls need to happen outside of locks.
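That guidance can be sketched like this, assuming a manually observable class backed by an `ObservationRegistrar` (names are illustrative; the real code would use the `@Observable` macro's generated `access`/`withMutation` helpers):

```swift
import Foundation
import Observation

// Sketch of "observation calls outside the lock": the registrar is told
// about the access or mutation first, and the lock is held only for the
// short synchronous read or write. Names here are illustrative.
@available(macOS 14.0, iOS 17.0, tvOS 17.0, watchOS 10.0, *)
final class Session: Observable, @unchecked Sendable {
    private let registrar = ObservationRegistrar()
    private let lock = NSLock()
    private var count = 0

    public var isResponding: Bool {
        registrar.access(self, keyPath: \.isResponding)  // outside the lock
        return lock.withLock { count > 0 }
    }

    func beginResponding() {
        // The mutation notification wraps the locked update, not the other way around.
        registrar.withMutation(of: self, keyPath: \.isResponding) {
            lock.withLock { count += 1 }
        }
    }
}
```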


atdrendel commented Feb 15, 2026

OK, this should be ready to go. I didn't mean for it to get so large when I started out. Sorry about that.


mattt commented Feb 17, 2026

> OK, this should be ready to go. I didn't mean for it to get so large when I started out. Sorry about that.

@atdrendel Not at all! Thank you so much for diving into the implementation details of Swift concurrency and observability. I think what's there now reflects some combination of wishful thinking and misunderstanding on my part. Taking a look now.


mattt commented Feb 17, 2026

@atdrendel I feel like I've gone on the same journey a few times, trying to implement cross-platform synchronization. We're not making it easy for ourselves, with such early deployment targets (maybe we should relax those?), but I swear, it's "oh, right, we can't use that" over and over again 🫠

Where we diverge is that I give up and reach for NSLock. You had a comment about how "NSLock can't be used in asynchronous contexts". For short, synchronous critical sections with no suspension points inside, my understanding is that it should be alright. So I put up #130 to see what that would look like.

Do you think that approach works alright? Like, are you able to reproduce the race condition you observed in that branch?

atdrendel (Contributor Author) commented:

> @atdrendel I feel like I've gone on the same journey a few times, trying to implement cross-platform synchronization. We're not making it easy for ourselves, with such early deployment targets (maybe we should relax those?), but I swear, it's "oh, right, we can't use that" over and over again 🫠
>
> Where we diverge is that I give up and reach for NSLock. You had a comment about how "NSLock can't be used in asynchronous contexts". For short, synchronous critical sections with no suspension points inside, my understanding is that it should be alright. So I put up #130 to see what that would look like.
>
> Do you think that approach works alright? Like, are you able to reproduce the race condition you observed in that branch?

Thanks for taking a look at it, @mattt. Yeah, I wasn't thrilled with the idea of adding a new synchronization primitive to your project, but I felt comfortable with it because the same pattern is used in swift-async-algorithms and swift-nio. Still, I can understand your hesitancy.

The reason I said we couldn't use NSLock was Xcode's warning that NSLock is unavailable in asynchronous contexts (i.e., when NSLock.lock() and NSLock.unlock() are called directly inside an async function). I should have investigated that a bit more. I thought there was something inherently unsafe about the class itself, but the warning is really about the manual lock() and unlock() calls: a suspension point can occur between them, the code may resume on a different thread, and unlocking from a different thread than the one that locked would probably crash the app.
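In other words (a sketch of the distinction, not library code; `model.respond(to:)` in the comment is a made-up placeholder):

```swift
import Foundation

let lock = NSLock()
var counter = 0

// The shape Xcode's warning is about: a suspension point while the lock
// is held. The task can resume on a different thread, and unlocking an
// NSLock from a thread other than the one that locked it is undefined.
//
//   lock.lock()
//   let reply = await model.respond(to: prompt)  // ⚠️ suspension inside the critical section
//   lock.unlock()                                // may run on a different thread
//
// The safe shape: a short, fully synchronous critical section. Calling
// this from async code is fine because nothing suspends while locked.
func increment() {
    lock.withLock { counter += 1 }
}
```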

I checked out your branch. It looked OK and ran fine, but I'm not a huge fan of keeping locks and the thing they are locking stored separately. It would be way too easy, in the future, to forget to lock access to the State. So, I pushed a commit to this branch that replaced the os_unfair_lock and pthread_mutex_t primitives with NSLock, while keeping everything locked away inside of the Locked type. What do you think about that?
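A Locked box along those lines might look like this (a sketch under the assumption that `access` matches the method name in the snippets quoted earlier; the actual implementation in the PR may differ):

```swift
import Foundation

// Sketch of an NSLock-backed Locked box: the value and its lock are
// stored together, so nothing can reach the value without taking the
// lock first. The `access` name matches the review snippets above; the
// rest is an assumption about the implementation.
final class Locked<Value>: @unchecked Sendable {
    private let lock = NSLock()
    private var value: Value

    init(_ value: Value) {
        self.value = value
    }

    func access<R>(_ body: (inout Value) throws -> R) rethrows -> R {
        lock.lock()
        defer { lock.unlock() }
        return try body(&value)
    }
}
```

With this shape, `state.access { $0.beginResponding() }` from the review snippet reads naturally, and there is no separate lock for a future caller to forget.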


mattt commented Feb 17, 2026

> I checked out your branch. It looked OK and ran fine, but I'm not a huge fan of keeping locks and the thing they are locking stored separately. It would be way too easy, in the future, to forget to lock access to the State. So, I pushed a commit to this branch that replaced the os_unfair_lock and pthread_mutex_t primitives with NSLock, while keeping everything locked away inside of the Locked type. What do you think about that?

That's a fair point! There's a competing interest in minimizing API surface area (even in the implementation), because it's hard to keep track of what is and what isn't in Foundation Models. But I think correctness trumps parsimony here.

I'll go ahead and close #130, pick this up, and get it merged.

atdrendel (Contributor Author) commented:

@mattt Thanks, Matt. Oh, and to answer your earlier question about deployment targets, I think keeping them as low as possible is the way to go. I've started to contribute to AnyLanguageModel because I see a world where I can replace a bunch of the custom code in our app with AnyLanguageModel, but increasing the deployment target to iOS 18 right now would be tough for us. We're actually just now increasing it to iOS 17 from iOS 16 because mlx-swift-lm now requires iOS 17. mlx-swift-examples only ever required iOS 16, which had been our minimum.

@mattt mattt merged commit 99679b3 into huggingface:main Feb 17, 2026
3 checks passed

mattt commented Feb 17, 2026

This is now available in 0.7.1

@atdrendel atdrendel deleted the fix-race-condition branch February 17, 2026 21:26