[releases/2026/0][Legacy] Remove disconnected requests from execution queue (#3968) #3973
mzegla merged 2 commits into releases/2026/0
Conversation
Pull request overview
This PR adds request-level disconnection tracking to legacy LLM/VLM servables so disconnected clients’ queued requests can be skipped during execution.
Changes:
- Add a `clientDisconnected` flag to legacy execution contexts.
- Register a client disconnection callback during scheduling to flip the flag.
- Update legacy executors to skip generation when the request is marked disconnected (a rough sketch of the combined pattern follows).
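For orientation, here is a minimal, self-contained sketch of that pattern. The `Client`, `ExecutionContext`, free functions, and queue shape are simplified stand-ins, not the repository's classes; only the `std::atomic<bool>` flag, the `weak_ptr` capture in the callback, and the skip-on-disconnect check mirror the diffs reviewed below.

```cpp
#include <atomic>
#include <functional>
#include <iostream>
#include <memory>
#include <vector>

// Simplified stand-ins for the real client and execution-context types.
struct Client {
    // Invoked by the transport layer when the peer goes away.
    void registerDisconnectionCallback(std::function<void()> cb) { onDisconnect = std::move(cb); }
    void disconnect() { if (onDisconnect) onDisconnect(); }
    std::function<void()> onDisconnect;
};

struct ExecutionContext {
    std::shared_ptr<Client> client;
    std::atomic<bool> clientDisconnected{false};
    bool success{true};
};

// Scheduling: flip the flag through a weak_ptr so the callback never keeps the context alive.
void schedule(const std::shared_ptr<ExecutionContext>& ctx,
              std::vector<std::shared_ptr<ExecutionContext>>& queue) {
    std::weak_ptr<ExecutionContext> weakCtx = ctx;
    ctx->client->registerDisconnectionCallback([weakCtx]() {
        if (auto c = weakCtx.lock()) {
            c->clientDisconnected = true;
        }
    });
    queue.push_back(ctx);
}

// Execution: skip generation for requests whose client has already disconnected.
void execute(std::vector<std::shared_ptr<ExecutionContext>>& queue) {
    for (auto& ctx : queue) {
        if (ctx->clientDisconnected.load(std::memory_order_relaxed)) {
            ctx->success = false;
            std::cout << "Client disconnected, skipping request processing.\n";
            continue;
        }
        std::cout << "Generating...\n";  // pipe->generate(...) would run here
    }
}

int main() {
    std::vector<std::shared_ptr<ExecutionContext>> queue;
    auto ctx = std::make_shared<ExecutionContext>();
    ctx->client = std::make_shared<Client>();
    schedule(ctx, queue);
    ctx->client->disconnect();  // client goes away while the request is still queued
    execute(queue);             // the queued request is skipped
}
```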
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Summary per file:
| File | Description |
|---|---|
| src/llm/visual_language_model/legacy/servable.hpp | Add clientDisconnected flag to VLM legacy execution context. |
| src/llm/visual_language_model/legacy/servable.cpp | Register disconnection callback during VLM scheduling. |
| src/llm/visual_language_model/legacy/legacy_executor.cpp | Skip VLM generation when clientDisconnected is set. |
| src/llm/language_model/legacy/servable.hpp | Add clientDisconnected flag to LLM legacy execution context. |
| src/llm/language_model/legacy/servable.cpp | Register disconnection callback during LLM scheduling. |
| src/llm/language_model/legacy/legacy_executor.cpp | Skip LLM generation when clientDisconnected is set. |
In `src/llm/visual_language_model/legacy/servable.cpp`:

```cpp
std::weak_ptr<VisualLanguageModelLegacyServableExecutionContext> weakContext = legacyExecutionContext;
legacyExecutionContext->payload.client->registerDisconnectionCallback([weakContext]() {
    if (auto context = weakContext.lock()) {
        context->clientDisconnected = true;
    }
});
if (legacyExecutionContext->payload.client->isDisconnected()) {
    return absl::CancelledError();
}
```
Registering the disconnection callback before the isDisconnected() early-return can add unnecessary callback registrations for already-disconnected clients. Consider checking isDisconnected() first and returning absl::CancelledError() before registering the callback.
Suggested change:

```diff
-std::weak_ptr<VisualLanguageModelLegacyServableExecutionContext> weakContext = legacyExecutionContext;
-legacyExecutionContext->payload.client->registerDisconnectionCallback([weakContext]() {
-    if (auto context = weakContext.lock()) {
-        context->clientDisconnected = true;
-    }
-});
-if (legacyExecutionContext->payload.client->isDisconnected()) {
-    return absl::CancelledError();
-}
+if (legacyExecutionContext->payload.client->isDisconnected()) {
+    return absl::CancelledError();
+}
+std::weak_ptr<VisualLanguageModelLegacyServableExecutionContext> weakContext = legacyExecutionContext;
+legacyExecutionContext->payload.client->registerDisconnectionCallback([weakContext]() {
+    if (auto context = weakContext.lock()) {
+        context->clientDisconnected = true;
+    }
+});
```
In `src/llm/language_model/legacy/servable.cpp`:

```cpp
std::weak_ptr<LegacyServableExecutionContext> weakContext = legacyExecutionContext;
legacyExecutionContext->payload.client->registerDisconnectionCallback([weakContext]() {
    if (auto context = weakContext.lock()) {
        context->clientDisconnected = true;
    }
});
if (legacyExecutionContext->payload.client->isDisconnected()) {
    return absl::CancelledError();
}
```
Same pattern as the VLM path: checking isDisconnected() before registering the callback avoids unnecessary callback registrations when the client is already disconnected.
Suggested change:

```diff
-std::weak_ptr<LegacyServableExecutionContext> weakContext = legacyExecutionContext;
-legacyExecutionContext->payload.client->registerDisconnectionCallback([weakContext]() {
-    if (auto context = weakContext.lock()) {
-        context->clientDisconnected = true;
-    }
-});
-if (legacyExecutionContext->payload.client->isDisconnected()) {
-    return absl::CancelledError();
-}
+if (legacyExecutionContext->payload.client->isDisconnected()) {
+    return absl::CancelledError();
+}
+std::weak_ptr<LegacyServableExecutionContext> weakContext = legacyExecutionContext;
+legacyExecutionContext->payload.client->registerDisconnectionCallback([weakContext]() {
+    if (auto context = weakContext.lock()) {
+        context->clientDisconnected = true;
+    }
+});
```
In `src/llm/language_model/legacy/legacy_executor.cpp`:

```cpp
auto& requestExecutionContext = requests.front();
if (requestExecutionContext->clientDisconnected) {
    requestExecutionContext->success = false;
    SPDLOG_LOGGER_DEBUG(llm_executor_logger, "Client disconnected, skipping request processing.");
```
Prefer explicitly reading the atomic via load() (optionally with std::memory_order_relaxed) rather than relying on implicit conversion, to make the concurrency intent clear and consistent across platforms/toolchains.
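A sketch of the suggested read applied to the excerpt above; whether `std::memory_order_relaxed` is sufficient here is an assumption (it looks adequate because the flag only gates skipping work, but a stronger ordering may be preferred if other data is published through it):

```cpp
auto& requestExecutionContext = requests.front();
// Explicit atomic read makes the concurrency intent visible at the call site.
if (requestExecutionContext->clientDisconnected.load(std::memory_order_relaxed)) {
    requestExecutionContext->success = false;
    SPDLOG_LOGGER_DEBUG(llm_executor_logger, "Client disconnected, skipping request processing.");
}
```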
In `src/llm/visual_language_model/legacy/legacy_executor.cpp`:

```cpp
auto& requestExecutionContext = requests.front();
if (requestExecutionContext->clientDisconnected) {
    requestExecutionContext->success = false;
    SPDLOG_LOGGER_DEBUG(llm_executor_logger, "Client disconnected, skipping request processing.");
```
Same as the LLM executor: use requestExecutionContext->clientDisconnected.load(...) for a clearer, explicit atomic read.
```cpp
    SPDLOG_LOGGER_TRACE(llm_executor_logger, "Generation started");
    try {
        requestExecutionContext->results = pipe->generate(requestExecutionContext->inputText, requestExecutionContext->inputImages, requestExecutionContext->generationConfigBuilder->getConfig(), requestExecutionContext->textStreamer);
    } catch (std::exception& e) {
        requestExecutionContext->success = false;
        SPDLOG_LOGGER_ERROR(llm_executor_logger, "VLM pipeline generation failed: {}.", e.what());
    }
    SPDLOG_LOGGER_TRACE(llm_executor_logger, "Generation ended");
}
```
The disconnection flag is only checked before starting pipe->generate(...). If a client disconnects during generation, the callback will set clientDisconnected, but the request will still run to completion and be reported as successful unless an exception occurs. If cancellation during generation is a goal, consider re-checking clientDisconnected after generation and marking success = false, or integrating a cancellation mechanism supported by pipe->generate(...) (if available).
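A minimal sketch of the re-check variant described above. The post-generation check and its log message are illustrative additions, not code from this PR, and they only stop the result from being reported as successful; they do not interrupt a generation already in flight:

```cpp
SPDLOG_LOGGER_TRACE(llm_executor_logger, "Generation started");
try {
    requestExecutionContext->results = pipe->generate(requestExecutionContext->inputText, requestExecutionContext->inputImages, requestExecutionContext->generationConfigBuilder->getConfig(), requestExecutionContext->textStreamer);
} catch (std::exception& e) {
    requestExecutionContext->success = false;
    SPDLOG_LOGGER_ERROR(llm_executor_logger, "VLM pipeline generation failed: {}.", e.what());
}
// Hypothetical re-check: the client may have disconnected while generate() was running.
if (requestExecutionContext->clientDisconnected.load(std::memory_order_relaxed)) {
    requestExecutionContext->success = false;
    SPDLOG_LOGGER_DEBUG(llm_executor_logger, "Client disconnected during generation, discarding result.");
}
SPDLOG_LOGGER_TRACE(llm_executor_logger, "Generation ended");
```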