
fix: use _max_concurrent_semantic for Semantic queue worker instead of hardcoded 1 #877

Closed
r266-tech wants to merge 1 commit into volcengine:main from r266-tech:fix/semantic-queue-concurrency

Conversation

@r266-tech
Contributor

Summary

Fixes #873

The `_max_concurrent_semantic` variable was stored in `QueueManager.__init__` but never used in `_start_queue_worker`. As a result, the Semantic queue worker always ran with `max_concurrent=1`, ignoring the configured `vlm.max_concurrent` value for queue-level task concurrency.

Change

Before:

```python
max_concurrent = self._max_concurrent_embedding if queue.name == self.EMBEDDING else 1
```

After:

```python
max_concurrent = self._max_concurrent_embedding if queue.name == self.EMBEDDING else self._max_concurrent_semantic
```
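For context, here is a minimal sketch of how a queue worker might bound in-flight tasks with the selected limit. The worker loop, defaults, and queue names below are illustrative assumptions, not OpenViking's actual `_start_queue_worker` implementation:

```python
import asyncio

EMBEDDING = "embedding"  # assumed value of self.EMBEDDING

class QueueManager:
    def __init__(self, max_concurrent_embedding=10, max_concurrent_semantic=100):
        # Both limits are stored in __init__; the bug was that the
        # semantic one was never read when starting the worker.
        self._max_concurrent_embedding = max_concurrent_embedding
        self._max_concurrent_semantic = max_concurrent_semantic

    def _select_max_concurrent(self, queue_name):
        # The fixed line: fall back to the configured semantic limit, not 1.
        return (self._max_concurrent_embedding
                if queue_name == EMBEDDING
                else self._max_concurrent_semantic)

    async def run_worker(self, queue_name, tasks):
        # Bound concurrent task execution with a semaphore sized by the limit.
        sem = asyncio.Semaphore(self._select_max_concurrent(queue_name))

        async def run(task):
            async with sem:
                return await task()

        return await asyncio.gather(*(run(t) for t in tasks))
```

With the old hardcoded `1`, the semaphore for the Semantic queue would admit only one task at a time regardless of configuration.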

Impact

  • Semantic queue now respects the max_concurrent_semantic parameter (default: 100)
  • Users can configure Semantic queue concurrency through ov.conf via vlm.max_concurrent
  • Enables parallel processing of multiple pending Semantic tasks
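For illustration, the configuration mentioned above might look like the following in ov.conf. The section and key names are inferred from the PR description's `vlm.max_concurrent`, not verified against OpenViking's actual config schema:

```
[vlm]
max_concurrent = 100
```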

@github-actions

Failed to generate code suggestions for PR

The _max_concurrent_semantic variable was stored in QueueManager.__init__
but never used in _start_queue_worker. The Semantic queue worker always
had max_concurrent=1, ignoring the configured vlm.max_concurrent value.

Fixes volcengine#873
@r266-tech force-pushed the fix/semantic-queue-concurrency branch from b155230 to 2c3c22d on March 23, 2026 00:44
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

zeattacker pushed a commit to zeattacker/OpenViking that referenced this pull request Mar 24, 2026
Per-message: skip expensive LLM overview generation when ≤5 files
changed and cached overview exists. Rebuild overview from summaries
without LLM call (0 LLM calls vs 3 batches × 10+ min each).

Daily at 04:00 WIB (21:00 UTC): full LLM regen for directories with
file count delta ≥ 5 since last run, keeping overviews coherent.

Also applies PR volcengine#877 fix: use _max_concurrent_semantic for queue
worker concurrency instead of hardcoded 1. Default set to 2 to
match llama.cpp parallel slots.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
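The caching rule in the commit message above (skip the LLM when ≤5 files changed and a cached overview exists, otherwise regenerate) can be sketched as follows. `should_regenerate_with_llm` and `rebuild_overview` are hypothetical helper names, not taken from the zeattacker fork:

```python
def should_regenerate_with_llm(changed_files, cached_overview, threshold=5):
    """Decide whether a directory overview needs a fresh LLM pass.

    Hypothetical helper reflecting the commit description: with a cached
    overview and a small change set, rebuild cheaply from per-file
    summaries instead of making any LLM calls.
    """
    if cached_overview is None:
        return True  # nothing cached: must generate
    return len(changed_files) > threshold  # small deltas reuse the cache

def rebuild_overview(summaries):
    # Cheap, LLM-free rebuild: join per-file summaries in sorted order.
    return "\n".join(f"- {path}: {text}" for path, text in sorted(summaries.items()))
```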
@zhoujh01
Collaborator

Thank you for your code. I just merged in the same fix (https://github.com/volcengine/OpenViking/pull/905), so I'm closing this PR for now.

@zhoujh01 zhoujh01 closed this Mar 24, 2026
@github-project-automation github-project-automation bot moved this from Backlog to Done in OpenViking project Mar 24, 2026


Development

Successfully merging this pull request may close these issues.

[Bug]: _max_concurrent_semantic variable stored but not used in Semantic queue worker

4 participants