Skip to content

Ensure _cmd_apply restarts service on failure#179

Merged
dcellison merged 2 commits intomainfrom
fix/apply-service-restart
Mar 26, 2026
Merged

Ensure _cmd_apply restarts service on failure#179
dcellison merged 2 commits intomainfrom
fix/apply-service-restart

Conversation

@dcellison
Copy link
Copy Markdown
Owner

Summary

Wrap the apply step sequence in try/finally so the service is always restarted after being stopped, even if a step fails. Previously, any exception between _stop_service and _start_service left the bot offline until manual restart.

The problem

_stop_service()    # bot goes offline
Step 1-8...        # if any step raises...
_start_service()   # ...this is never reached

The fix

_stop_service()
try:
    Step 1-8...
except Exception:
    print error guidance
    raise
finally:
    _start_service()   # always runs, even on failure

The inner try/except around _start_service in the finally block prevents a start failure from masking the original exception. Without it, Python would replace the propagating exception with the start error, hiding the actual cause.

Test plan

  • Existing dry run test passes (try/finally is transparent to no-op stop/start)
  • All 1135 tests pass, make check clean
  • Manual: introduce a failure in a step (e.g., bad permissions) and verify the service restarts

Fixes #166

Wrap the apply step sequence in try/finally so _start_service is
always called after the service was stopped, even if a step fails.
Previously, any exception between stop and start left the bot
offline until manual restart.

The inner try/except around _start_service in the finally block
prevents a start failure from masking the original exception.

Fixes #166
@dcellison
Copy link
Copy Markdown
Owner Author

Review by Kai

Review: fix/apply-service-restartsrc/kai/install.py

The core fix is correct and idiomatic. The try/except/raise + finally with an inner try/except around _start_service is the right pattern for this problem, and the PR description accurately explains why the inner guard is needed.

Two real issues:


WARNING — platform-specific hardcoded restart command

print("Manually restart with: sudo launchctl kickstart system/com.syrinx.kai")

launchctl is macOS-only. The rest of the function uses a platform variable and passes it to _start_service, so Linux (systemd) is clearly a supported target. On Linux the correct command would be something like sudo systemctl start kai. This message will actively mislead Linux users during a failure scenario.

Fix: branch on platform here (same as elsewhere), or print a generic "check your service manager" message.


SUGGESTION — silent success when service fails to start after clean apply

If steps 1–8 all succeed but _start_service raises in the finally block, the exception is swallowed, the warning is printed, and execution falls through to the # -- Summary -- block. The caller gets a normal return and (presumably) a zero exit code. There's no observable difference between "applied and started successfully" and "applied successfully, service is now offline."

Depending on how _cmd_apply is wired to its exit code, this may already be acceptable — the print warning is there. But if the summary block prints something like "Installation complete!", that's misleading. Worth verifying that the summary output or exit code reflects the degraded state.


Everything else looks clean. No security issues introduced — the change is purely structural. The except Exception: ... raise in the install body correctly preserves the original traceback. The finally guard correctly prevents exception replacement.

@dcellison
Copy link
Copy Markdown
Owner Author

Review by Kai

Review: fix/apply-service-restart — revision

The prior WARNING is fixed. The platform branch on the manual-restart message correctly handles macOS vs. Linux:

if platform == "darwin":
    print("Manually restart with: sudo launchctl kickstart system/com.syrinx.kai")
else:
    print("Manually restart with: sudo systemctl start kai")

No new issues introduced. The else arm is slightly broad (covers any non-darwin platform rather than gating on Linux explicitly), but that matches the two-target model the rest of the function uses and is reasonable.

The prior SUGGESTION (silent success when _start_service fails after a clean apply) remains unaddressed — the author has seen it and made their decision, so no further comment.

Everything else is clean. The structural fix is correct, the inner try/except around _start_service in finally correctly prevents exception replacement, and the except Exception: ... raise preserves the original traceback. Ready to merge.

@dcellison dcellison merged commit 8abe844 into main Mar 26, 2026
1 check passed
@dcellison dcellison deleted the fix/apply-service-restart branch March 26, 2026 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

install: _cmd_apply leaves service stopped on partial failure

2 participants