Skip to content

fix(clp-package): Prevent query scheduler from resetting job start time during sub-job dispatch (fixes #1874).#1875

Merged
hoophalab merged 2 commits intoy-scope:mainfrom
hoophalab:starttime
Jan 21, 2026
Merged

fix(clp-package): Prevent query scheduler from resetting job start time during sub-job dispatch (fixes #1874).#1875
hoophalab merged 2 commits intoy-scope:mainfrom
hoophalab:starttime

Conversation

@hoophalab
Copy link
Contributor

@hoophalab hoophalab commented Jan 15, 2026

Description

This PR fixes Issue #1874.

While initially reported in the WebUI, this bug affects any large search (that doesn't write to a network address directly) on a dataset with more than 16 archives.

Cause:

The issue stems from how the QueryScheduler handles sub-jobs for queries that require a reducer.

  1. NetworkAddress Queries: For a query like search.sh '*' that involves 85 archives, all tasks are dispatched in a single batch, and a start time is set when dispatching that batch.

  2. Reducer/WebUI/ResultsCache Queries: When a reducer is required (e.g., search.sh '*' --count ), the scheduler breaks the query into batches (sub-jobs) of 16 archives. code

  3. The Bug: To dispatch reducer queries, the scheduler incorrectly reuses the simple query dispatching method dispatch_job_and_update_db . Inside this method, the job's start_time is overwritten with datetime.now() every time a new batch is dispatched.

    The start_time in database remains unaffected due to the CAS-like operation that only updates rows in a PENDING state.

Fix:

There were several ways to approach this fix, such as implementing a dedicated dispatch function for sub-jobs, or setting the time when constructing a QueryJob

In this PR, we modify the existing dispatch function to ensure the start_time is only set if it has not been assigned previously. This mimics the CAS operation on the database.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Duration is correctly stored in the database. WebUI displays the duration correctly.

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Fixed the job scheduler to properly preserve the original start time for jobs that have already been started, preventing the start time from being incorrectly overwritten during subsequent job updates.

✏️ Tip: You can customize this high-level summary in your review settings.


@hoophalab hoophalab requested a review from a team as a code owner January 15, 2026 21:44
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 15, 2026

Walkthrough

The change modifies dispatch_job_and_update_db to conditionally update new_job.start_time only when it is currently None, preserving existing start_time values instead of unconditionally overwriting them.

Changes

Cohort / File(s) Summary
Start Time Preservation
components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py
Conditional start_time assignment now guards the update to only set start_time if currently None, preserving pre-existing values

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: preventing the query scheduler from resetting job start time during sub-job dispatch, and references the specific issue being fixed.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py`:
- Around line 629-631: Add a brief inline comment above the conditional
assignment that sets new_job.start_time only when it's None to explain that this
preserves the original dispatch start_time across recursive/subsequent calls to
dispatch_job_and_update_db and prevents overwriting when the DB CAS check
(prev_status=QueryJobStatus.PENDING) has already advanced the job; target the
block using the symbols start_time, new_job.start_time, and
dispatch_job_and_update_db so maintainers understand the in-memory vs DB
consistency rationale.
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 491e321 and e7e26fc.

📒 Files selected for processing (1)
  • components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1152
File: components/clp-package-utils/clp_package_utils/scripts/start_clp.py:613-613
Timestamp: 2025-08-08T06:59:42.436Z
Learning: In components/clp-package-utils/clp_package_utils/scripts/start_clp.py, generic_start_scheduler sets CLP_LOGGING_LEVEL using clp_config.query_scheduler.logging_level for both schedulers; compression scheduler should use its own logging level. Tracking via an issue created from PR `#1152` discussion.
📚 Learning: 2024-11-15T20:07:22.256Z
Learnt from: haiqi96
Repo: y-scope/clp PR: 569
File: components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py:380-392
Timestamp: 2024-11-15T20:07:22.256Z
Learning: The current implementation assumes single-threaded execution, so race conditions in functions like `is_target_extracted` in `components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py` are not currently a concern.

Applied to files:

  • components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: package-image
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: check-generated
  • GitHub Check: build (macos-15)

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines 629 to +631
start_time = datetime.datetime.now()
new_job.start_time = start_time
if new_job.start_time is None:
new_job.start_time = start_time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

LGTM! Correctly preserves start_time for sub-job dispatches.

The conditional assignment ensures the original start_time is retained when dispatch_job_and_update_db is called again for subsequent sub-job batches. The database update on line 638 remains unchanged, but since the CAS condition (prev_status=QueryJobStatus.PENDING) prevents overwrites after the first dispatch, this fix properly aligns the in-memory state with the intended database behaviour.

Consider adding a brief comment for future maintainers:

📝 Suggested comment
     start_time = datetime.datetime.now()
+    # Only set start_time on first dispatch; preserve it for sub-job batches.
     if new_job.start_time is None:
         new_job.start_time = start_time
🧰 Tools
🪛 Ruff (0.14.11)

629-629: datetime.datetime.now() called without a tz argument

(DTZ005)

🤖 Prompt for AI Agents
In
`@components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py`
around lines 629 - 631, Add a brief inline comment above the conditional
assignment that sets new_job.start_time only when it's None to explain that this
preserves the original dispatch start_time across recursive/subsequent calls to
dispatch_job_and_update_db and prevents overwriting when the DB CAS check
(prev_status=QueryJobStatus.PENDING) has already advanced the job; target the
block using the symbols start_time, new_job.start_time, and
dispatch_job_and_update_db so maintainers understand the in-memory vs DB
consistency rationale.

@junhaoliao junhaoliao added this to the Backlog milestone Jan 19, 2026
Copy link
Member

@junhaoliao junhaoliao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm too

@junhaoliao junhaoliao modified the milestones: Backlog, February 2026 Jan 21, 2026
@hoophalab hoophalab merged commit 56d7bc7 into y-scope:main Jan 21, 2026
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants