Deactivate one-shot Claude jobs after firing by dcellison · Pull Request #119 · dcellison/kai

dcellison · 2026-03-23T16:29:41Z

Summary

One-shot Claude jobs (schedule_type == "once", job_type == "claude") were never deactivated in the database after firing. APScheduler's run_once removed them from the in-memory scheduler queue, but the DB row stayed active=1 until the next restart when _register_new_jobs caught it as an "expired one-shot."

The reminder path had the correct check since the beginning (if data["schedule_type"] == "once": await sessions.deactivate_job(job_id)). The Claude job path was missing it in two of its three branches.

What changed

Added the one-shot deactivation check to two branches in _job_callback:

CONDITION_NOT_MET path - A one-shot auto-remove job whose condition is not met on its single firing will never fire again. Now correctly deactivated.
Non-conditional else path - A plain one-shot Claude job (no auto_remove) fires, delivers its response, and now correctly deactivates. This was the main bug.

The CONDITION_MET path already deactivates unconditionally via auto_remove logic - no change needed.

What did NOT change

No schedule_removal() calls added - APScheduler handles that for run_once jobs
No changes to the reminder path (already correct)
No changes to the try/except blocks (purely additive)
No changes to the CONDITION_MET path (already deactivates)

Fixes #112

Test plan

test_one_shot_claude_deactivates_after_sending - one-shot Claude job deactivates after delivery, no schedule_removal called
test_recurring_claude_does_not_deactivate - daily/interval jobs NOT deactivated (regression guard)
test_one_shot_condition_not_met_deactivates - one-shot auto-remove job deactivates even when condition not met
All 1018 tests pass (3 new)

One-shot Claude jobs (schedule_type=once) were never deactivated in the database after firing. APScheduler removed them from the in-memory queue, but the DB row stayed active=1 until the next restart. The reminder path had the correct check; the Claude job path was missing it in two branches. Add the schedule_type=once deactivation check to: - CONDITION_NOT_MET path (auto-remove jobs whose condition wasn't met) - Non-conditional else path (plain one-shot Claude jobs) The CONDITION_MET path already deactivates unconditionally via auto_remove logic. Fixes #112

dcellison · 2026-03-23T16:30:44Z

Review by Kai

PR Review: #119 — Deactivate one-shot Claude jobs after firing

Overall: Clean fix. One warning worth discussing, two minor suggestions.

Warning

cron.py else-branch: deactivate_job runs even if message delivery fails

The new deactivate_job call is placed after the try/except block that wraps message sending:

            except Exception:
                log.exception("Failed to send job %d result", job_id)

            # new code — always runs, even after caught exception above
            if data["schedule_type"] == "once":
                await sessions.deactivate_job(job_id)

If sending fails (network error, bot error, etc.), the exception is swallowed, and the job is silently deactivated anyway — the user never receives their one-shot response and gets no indication it was lost. Since APScheduler already removed the job from its queue, the job won't retry regardless, so this doesn't create a data-integrity bug. But it's a silent failure mode worth acknowledging.

The CONDITION_NOT_MET path has the same structure and is consistent, so this appears to be a deliberate pattern — but it's worth confirming: is "fire-and-forget, deactivate even on delivery failure" the intended behavior for one-shot jobs? If yes, a brief comment would make the intent explicit.

Suggestions

cron.py: deactivate_job itself is not guarded

Both new calls to await sessions.deactivate_job(job_id) are outside any try/except. If the DB call fails, the exception propagates uncaught from the callback. This matches the existing reminder-path pattern (per the PR description), so it's consistent — but the callback silently dies without logging. Low risk in practice.

tests/test_cron.py line ~518: schedule_removal.assert_not_called() — worth a comment

self.ctx.job.schedule_removal.assert_not_called()

The assertion is correct (the description explains APScheduler handles queue removal for run_once), but a reader might wonder why schedule_removal isn't called when there's a schedule_removal() call elsewhere in the else-branch. A one-line comment linking this to the run_once / no auto_remove setup would prevent confusion.

What looks good

The diagnosis is accurate: CONDITION_MET path already deactivated; the two new additions cover the remaining cases without disrupting existing logic.
No unnecessary schedule_removal() calls added.
Regression guard (test_recurring_claude_does_not_deactivate) is the right call.
Diff is minimal and surgical.

Add comments explaining that one-shot deactivation runs even after delivery failure (intentional - job can't retry regardless) and why schedule_removal is not called for run_once jobs.

dcellison · 2026-03-23T16:42:02Z

Review by Kai

The prior review concerns have been addressed. Here's what changed:

Resolved from prior review:

Both new deactivate_job blocks now have explicit comments documenting the fire-and-forget intent ("Runs even if delivery failed above - the job can't retry regardless, so deactivating prevents a stale active=1 row."). The warning is acknowledged and the behavior is intentional.
test_one_shot_claude_deactivates_after_sending now includes the schedule_removal.assert_not_called() comment explaining the APScheduler / run_once rationale.

New findings:

Suggestion — tests/test_cron.py (new test_one_shot_condition_not_met_deactivates): The other two new tests both patch("kai.cron.log_message"); this one doesn't. It's fine if the CONDITION_NOT_MET path with no notification doesn't call log_message, but the inconsistency is worth a glance to confirm it's deliberate rather than a missing patch. Since the PR states all 1018 tests pass, it's not broken — just worth confirming intent.

Otherwise the PR is clean. The fix is minimal and correct, the three branches are now consistent, the regression guard is solid, and the comments make the intentional silent-failure behavior explicit and readable.

dcellison mentioned this pull request Mar 23, 2026

Deactivate one-shot Claude jobs after firing #112

Closed

Document fire-and-forget intent, clarify schedule_removal assertion

fdd04c7

Add comments explaining that one-shot deactivation runs even after delivery failure (intentional - job can't retry regardless) and why schedule_removal is not called for run_once jobs.

dcellison merged commit 8345ca6 into main Mar 23, 2026
1 check passed

dcellison deleted the fix/deactivate-oneshot-claude-jobs branch March 23, 2026 16:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deactivate one-shot Claude jobs after firing#119

Deactivate one-shot Claude jobs after firing#119
dcellison merged 2 commits intomainfrom
fix/deactivate-oneshot-claude-jobs

dcellison commented Mar 23, 2026

Uh oh!

dcellison commented Mar 23, 2026

Uh oh!

dcellison commented Mar 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

dcellison commented Mar 23, 2026

Summary

What changed

What did NOT change

Test plan

Uh oh!

dcellison commented Mar 23, 2026

Review by Kai

PR Review: #119 — Deactivate one-shot Claude jobs after firing

Warning

Suggestions

What looks good

Uh oh!

dcellison commented Mar 23, 2026

Review by Kai

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants