Skip to content

Deactivate one-shot Claude jobs after firing#119

Merged
dcellison merged 2 commits intomainfrom
fix/deactivate-oneshot-claude-jobs
Mar 23, 2026
Merged

Deactivate one-shot Claude jobs after firing#119
dcellison merged 2 commits intomainfrom
fix/deactivate-oneshot-claude-jobs

Conversation

@dcellison
Copy link
Copy Markdown
Owner

Summary

One-shot Claude jobs (schedule_type == "once", job_type == "claude") were never deactivated in the database after firing. APScheduler's run_once removed them from the in-memory scheduler queue, but the DB row stayed active=1 until the next restart when _register_new_jobs caught it as an "expired one-shot."

The reminder path had the correct check since the beginning (if data["schedule_type"] == "once": await sessions.deactivate_job(job_id)). The Claude job path was missing it in two of its three branches.

What changed

Added the one-shot deactivation check to two branches in _job_callback:

  1. CONDITION_NOT_MET path - A one-shot auto-remove job whose condition is not met on its single firing will never fire again. Now correctly deactivated.

  2. Non-conditional else path - A plain one-shot Claude job (no auto_remove) fires, delivers its response, and now correctly deactivates. This was the main bug.

The CONDITION_MET path already deactivates unconditionally via auto_remove logic - no change needed.

What did NOT change

  • No schedule_removal() calls added - APScheduler handles that for run_once jobs
  • No changes to the reminder path (already correct)
  • No changes to the try/except blocks (purely additive)
  • No changes to the CONDITION_MET path (already deactivates)

Fixes #112

Test plan

  • test_one_shot_claude_deactivates_after_sending - one-shot Claude job deactivates after delivery, no schedule_removal called
  • test_recurring_claude_does_not_deactivate - daily/interval jobs NOT deactivated (regression guard)
  • test_one_shot_condition_not_met_deactivates - one-shot auto-remove job deactivates even when condition not met
  • All 1018 tests pass (3 new)

One-shot Claude jobs (schedule_type=once) were never deactivated in
the database after firing. APScheduler removed them from the
in-memory queue, but the DB row stayed active=1 until the next
restart. The reminder path had the correct check; the Claude job
path was missing it in two branches.

Add the schedule_type=once deactivation check to:
- CONDITION_NOT_MET path (auto-remove jobs whose condition wasn't met)
- Non-conditional else path (plain one-shot Claude jobs)

The CONDITION_MET path already deactivates unconditionally via
auto_remove logic.

Fixes #112
@dcellison
Copy link
Copy Markdown
Owner Author

Review by Kai

PR Review: #119 — Deactivate one-shot Claude jobs after firing

Overall: Clean fix. One warning worth discussing, two minor suggestions.


Warning

cron.py else-branch: deactivate_job runs even if message delivery fails

The new deactivate_job call is placed after the try/except block that wraps message sending:

            except Exception:
                log.exception("Failed to send job %d result", job_id)

            # new code — always runs, even after caught exception above
            if data["schedule_type"] == "once":
                await sessions.deactivate_job(job_id)

If sending fails (network error, bot error, etc.), the exception is swallowed, and the job is silently deactivated anyway — the user never receives their one-shot response and gets no indication it was lost. Since APScheduler already removed the job from its queue, the job won't retry regardless, so this doesn't create a data-integrity bug. But it's a silent failure mode worth acknowledging.

The CONDITION_NOT_MET path has the same structure and is consistent, so this appears to be a deliberate pattern — but it's worth confirming: is "fire-and-forget, deactivate even on delivery failure" the intended behavior for one-shot jobs? If yes, a brief comment would make the intent explicit.


Suggestions

cron.py: deactivate_job itself is not guarded

Both new calls to await sessions.deactivate_job(job_id) are outside any try/except. If the DB call fails, the exception propagates uncaught from the callback. This matches the existing reminder-path pattern (per the PR description), so it's consistent — but the callback silently dies without logging. Low risk in practice.

tests/test_cron.py line ~518: schedule_removal.assert_not_called() — worth a comment

self.ctx.job.schedule_removal.assert_not_called()

The assertion is correct (the description explains APScheduler handles queue removal for run_once), but a reader might wonder why schedule_removal isn't called when there's a schedule_removal() call elsewhere in the else-branch. A one-line comment linking this to the run_once / no auto_remove setup would prevent confusion.


What looks good

  • The diagnosis is accurate: CONDITION_MET path already deactivated; the two new additions cover the remaining cases without disrupting existing logic.
  • No unnecessary schedule_removal() calls added.
  • Regression guard (test_recurring_claude_does_not_deactivate) is the right call.
  • Diff is minimal and surgical.

Add comments explaining that one-shot deactivation runs even after
delivery failure (intentional - job can't retry regardless) and why
schedule_removal is not called for run_once jobs.
@dcellison
Copy link
Copy Markdown
Owner Author

Review by Kai

The prior review concerns have been addressed. Here's what changed:

Resolved from prior review:

  • Both new deactivate_job blocks now have explicit comments documenting the fire-and-forget intent ("Runs even if delivery failed above - the job can't retry regardless, so deactivating prevents a stale active=1 row."). The warning is acknowledged and the behavior is intentional.
  • test_one_shot_claude_deactivates_after_sending now includes the schedule_removal.assert_not_called() comment explaining the APScheduler / run_once rationale.

New findings:

Suggestiontests/test_cron.py (new test_one_shot_condition_not_met_deactivates): The other two new tests both patch("kai.cron.log_message"); this one doesn't. It's fine if the CONDITION_NOT_MET path with no notification doesn't call log_message, but the inconsistency is worth a glance to confirm it's deliberate rather than a missing patch. Since the PR states all 1018 tests pass, it's not broken — just worth confirming intent.

Otherwise the PR is clean. The fix is minimal and correct, the three branches are now consistent, the regression guard is solid, and the comments make the intentional silent-failure behavior explicit and readable.

@dcellison dcellison merged commit 8345ca6 into main Mar 23, 2026
1 check passed
@dcellison dcellison deleted the fix/deactivate-oneshot-claude-jobs branch March 23, 2026 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Deactivate one-shot Claude jobs after firing

2 participants