
fix: engine cannot give response to the second user request with streaming #701


Open · lihaiyin88 wants to merge 1 commit into base: main

Conversation

lihaiyin88

No description provided.

@CharlieFRuan (Contributor)

Thanks a lot for the contribution. Would it be possible for you to provide a script for reproducing the issue, or to elaborate on it? Thank you!

CharlieFRuan requested a review from Copilot on June 24, 2025 at 14:55.
Copilot AI left a comment:

Pull Request Overview

This PR restructures the streaming generation method in MLCEngine to ensure the model lock is always released once, fixing an issue where the engine could not handle a second streaming request.

  • Flattened multiple nested try/catch blocks into a single try + finally around the main logic.
  • Converted inner helper functions (_countTrailingReplacementChar, _getChunk) to arrow function expressions.
  • Removed redundant lock.release() calls spread across catches; the lock is now released exactly once in the finally block (a sketch of the pattern follows).
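
For context, here is a minimal, self-contained sketch of the pattern this PR adopts; it is not the engine's actual code. A single try/finally around the generator body guarantees the per-model lock is released exactly once. The `AsyncLock` class below is a hypothetical stand-in for the engine's internal per-model lock.

```typescript
// Hypothetical stand-in for the engine's per-model lock: a simple
// FIFO async mutex with acquire()/release().
class AsyncLock {
  private locked = false;
  private waiters: Array<() => void> = [];

  async acquire(): Promise<void> {
    if (!this.locked) {
      this.locked = true;
      return;
    }
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) {
      next(); // hand the lock directly to the next waiter
    } else {
      this.locked = false;
    }
  }
}

// The pattern the PR adopts: one try/finally around the whole generator
// body. The finally block runs on normal completion, on a thrown error,
// and when the consumer abandons the stream early (which triggers the
// generator's return path), so the lock is released exactly once.
async function* streamWithLock(
  lock: AsyncLock,
  chunks: string[],
): AsyncGenerator<string, void, void> {
  await lock.acquire();
  try {
    for (const chunk of chunks) {
      yield chunk;
    }
  } finally {
    lock.release();
  }
}

// Two consecutive streams over the same lock. With the old nested
// try/catch structure, any path that skipped release() would leave the
// lock held and make the second for-await hang.
async function demo(): Promise<void> {
  const lock = new AsyncLock();
  for await (const c of streamWithLock(lock, ["a", "b"])) console.log(c);
  for await (const c of streamWithLock(lock, ["c", "d"])) console.log(c);
}
```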
Comments suppressed due to low confidence (2)

src/engine.ts:706

  • This TODO is still open—either implement the usage support for non-chat completions or file a tracking issue to ensure it isn't overlooked.
          // TODO(Charlie): support usage for completion

src/engine.ts:483

  • Add a test case that performs two consecutive streaming requests (with and without errors) to verify the lock is always released and reacquired correctly.
    genConfig: GenerationConfig,
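
Such a regression check is not part of this PR, but a hedged sketch might look like the following. It assumes web-llm's OpenAI-style streaming API (`CreateMLCEngine` and `engine.chat.completions.create` with `stream: true`); the model id is illustrative and should be replaced with one available in your build.

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Issue two streaming requests back to back. Before this fix, the model
// lock was not always released after the first stream, so the second
// request could hang waiting to acquire it.
async function twoConsecutiveStreams(): Promise<void> {
  const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC");

  for (const prompt of ["First request", "Second request"]) {
    const stream = await engine.chat.completions.create({
      stream: true,
      messages: [{ role: "user", content: prompt }],
    });
    let reply = "";
    for await (const chunk of stream) {
      reply += chunk.choices[0]?.delta?.content ?? "";
    }
    console.log(`${prompt}: received ${reply.length} characters`);
  }
}
```

An error-path variant could abort the first stream partway (for example, by breaking out of the for-await loop) before issuing the second request, verifying that the lock is released on early termination as well.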

```diff
@@ -483,18 +483,15 @@ export class MLCEngine implements MLCEngineInterface {
     genConfig: GenerationConfig,
     timeReceived: number,
   ): AsyncGenerator<ChatCompletionChunk | Completion, void, void> {
-    // Since it is an async generator, we need to do fine-grained try-catch to ensure lock is
-    // released only when errors occur. Then release at the very end when no error occurs.
-    // TODO: This makes code less readable, is there a better way to do this?
     const lock = this.loadedModelIdToLock.get(model)!;
```
Copilot AI commented on the `const lock` line, Jun 24, 2025:
Consider adding a brief comment above this try/finally to clarify its scope is for lock acquisition and release, improving future readability.

Suggested change:

```diff
-    const lock = this.loadedModelIdToLock.get(model)!;
+    const lock = this.loadedModelIdToLock.get(model)!;
+    // Acquire the lock and ensure its release in the `finally` block.
```
