handle signals properly#589
Conversation
✅ Deploy Preview for jumpstarter-docs ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
WalkthroughImplements a fork-based supervisor in the CLI to run the exporter in a forked child with signal forwarding and restart semantics; adds an async exporter lifecycle via create_exporter with cancellation‑shielded teardown; makes channel_factory awaitable, adds Exporter.stop(), and hardens session close logging and handler cleanup. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User
participant CLI as Parent (run.py)
participant Child as Forked Child
participant Exporter as Exporter.serve()
User->>CLI: run(config)
CLI->>Child: fork()
rect rgb(235,245,255)
note right of Child: Child process
Child->>Exporter: anyio TG + signal receiver → run exporter.serve()
alt SIGINT/SIGTERM/SIGHUP/SIGQUIT
Child->>Exporter: cancel TG / call stop()
Exporter-->>Child: graceful exit (0 or restart)
else normal completion or error
Child-->>CLI: exit code (non-zero => propagate)
end
end
rect rgb(245,235,245)
note right of CLI: Parent supervision loop
CLI-->>Child: forward signals
CLI->>CLI: waitpid()
alt child exit == 0 (restart)
CLI->>Child: fork again
else non-zero / fatal signal
CLI-->>User: propagate exit status
end
end
sequenceDiagram
autonumber
participant Cfg as ExporterConfigV1Alpha1
participant Exp as Exporter
participant Ctrl as ControllerService (gRPC)
Cfg->>Cfg: async with create_exporter()
Cfg->>Exp: instantiate (await channel_factory)
Cfg->>Exp: await __aenter__() then await serve()
alt lease lost / stop requested / cancel
Exp->>Exp: stop() cancels TG or marks stop_requested
end
Cfg->>Ctrl: Unregister(reason="Exporter shutdown") with timeout (shielded)
Cfg->>Exp: await __aexit__()
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (1)
packages/jumpstarter-cli/jumpstarter_cli/run.py (1)
1-1: Remove unused asyncio importThe
asyncioimport is only used in the child signal handler, which should be replaced with anyio's signal handling as suggested above.Once the signal handling is updated to use anyio, remove the unused import:
-import asyncio import os import signal import sys
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
packages/jumpstarter-cli/jumpstarter_cli/run.py(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: NickCao
PR: jumpstarter-dev/jumpstarter#521
File: packages/jumpstarter-cli/jumpstarter_cli/run.py:1-2
Timestamp: 2025-06-12T17:18:48.519Z
Learning: Exporters in this project are intended to run only on Unix systems, so Unix-specific APIs like `os.fork()` are acceptable and Windows compatibility is not required.
🧬 Code Graph Analysis (1)
packages/jumpstarter-cli/jumpstarter_cli/run.py (4)
packages/jumpstarter-cli-common/jumpstarter_cli_common/config.py (1)
opt_config(99-100)packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)
handle_exceptions(32-46)packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py (1)
signal_handler(9-22)packages/jumpstarter/jumpstarter/exporter/exporter.py (1)
status(115-134)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: pytest-matrix (macos-15, 3.12)
- GitHub Check: e2e
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
- GitHub Check: pytest-matrix (macos-15, 3.11)
| def _serve_with_exc_handling(exporter): | ||
| """Fork-based approach with improved signal handling.""" | ||
| pid = None | ||
|
|
||
| def parent_signal_handler(signum, _): | ||
| """Handle signals: forward to child, restart child on SIGHUP, exit on others.""" | ||
| if pid and pid > 0: | ||
| try: | ||
| os.kill(pid, signum) | ||
| if signum != signal.SIGHUP: | ||
| _, status = os.waitpid(pid, 0) | ||
| sys.exit(os.WEXITSTATUS(status) if os.WIFEXITED(status) else 1) | ||
| except ProcessLookupError: | ||
| pass | ||
| if signum != signal.SIGHUP: | ||
| sys.exit(0) | ||
|
|
||
| # Set up signal handlers | ||
| for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT): | ||
| signal.signal(sig, parent_signal_handler) | ||
|
|
||
| while True: | ||
| result = Value("i", 0) | ||
| pid = os.fork() | ||
|
|
||
| if pid > 0: | ||
| os.waitpid(pid, 0) | ||
| if result.value != 0: | ||
| return result.value | ||
| # Parent process | ||
| if (exit_code := _handle_parent(pid)) is not None: | ||
| return exit_code | ||
| else: | ||
| try: | ||
| anyio.run(exporter.serve) | ||
| except* Exception as excgroup: | ||
| for exc in leaf_exceptions(excgroup): | ||
| click.echo( | ||
| f"Exception while serving on the exporter: {type(exc).__name__}: {exc}", | ||
| err=True, | ||
| ) | ||
| result.value = 1 | ||
| return | ||
| _handle_child(exporter) | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion
Consider handling SIGCHLD to avoid zombie processes
If the child process exits unexpectedly (e.g., crashes), it will become a zombie until the parent's next signal or explicit wait. Consider adding a SIGCHLD handler to reap children promptly.
Add SIGCHLD handling to promptly reap child processes:
def _serve_with_exc_handling(exporter):
"""Fork-based approach with improved signal handling."""
pid = None
+ child_exited = False
+
+ def sigchld_handler(signum, _):
+ """Handle SIGCHLD to reap zombie processes."""
+ nonlocal child_exited
+ try:
+ os.waitpid(-1, os.WNOHANG)
+ child_exited = True
+ except ChildProcessError:
+ pass
def parent_signal_handler(signum, _):
"""Handle signals: forward to child, restart child on SIGHUP, exit on others."""
if pid and pid > 0:
try:
os.kill(pid, signum)
except ProcessLookupError:
pass
if signum != signal.SIGHUP:
pass
# Set up signal handlers
for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT):
signal.signal(sig, parent_signal_handler)
+ signal.signal(signal.SIGCHLD, sigchld_handler)
while True:
+ child_exited = False
pid = os.fork()
if pid > 0:
# Parent process
- if (exit_code := _handle_parent(pid)) is not None:
- return exit_code
+ while not child_exited:
+ signal.pause() # Wait for signals
+ _, status = os.waitpid(pid, os.WNOHANG)
+ if os.WIFEXITED(status):
+ if (exit_code := os.WEXITSTATUS(status)) != 0:
+ return exit_code
+ elif os.WIFSIGNALED(status):
+ sig = os.WTERMSIG(status)
+ if sig != signal.SIGHUP:
+ return 128 + sig
+ # Continue loop for SIGHUP or normal exit
else:
_handle_child(exporter)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _serve_with_exc_handling(exporter): | |
| """Fork-based approach with improved signal handling.""" | |
| pid = None | |
| def parent_signal_handler(signum, _): | |
| """Handle signals: forward to child, restart child on SIGHUP, exit on others.""" | |
| if pid and pid > 0: | |
| try: | |
| os.kill(pid, signum) | |
| if signum != signal.SIGHUP: | |
| _, status = os.waitpid(pid, 0) | |
| sys.exit(os.WEXITSTATUS(status) if os.WIFEXITED(status) else 1) | |
| except ProcessLookupError: | |
| pass | |
| if signum != signal.SIGHUP: | |
| sys.exit(0) | |
| # Set up signal handlers | |
| for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT): | |
| signal.signal(sig, parent_signal_handler) | |
| while True: | |
| result = Value("i", 0) | |
| pid = os.fork() | |
| if pid > 0: | |
| os.waitpid(pid, 0) | |
| if result.value != 0: | |
| return result.value | |
| # Parent process | |
| if (exit_code := _handle_parent(pid)) is not None: | |
| return exit_code | |
| else: | |
| try: | |
| anyio.run(exporter.serve) | |
| except* Exception as excgroup: | |
| for exc in leaf_exceptions(excgroup): | |
| click.echo( | |
| f"Exception while serving on the exporter: {type(exc).__name__}: {exc}", | |
| err=True, | |
| ) | |
| result.value = 1 | |
| return | |
| _handle_child(exporter) | |
| def _serve_with_exc_handling(exporter): | |
| """Fork-based approach with improved signal handling.""" | |
| pid = None | |
| child_exited = False | |
| def sigchld_handler(signum, _): | |
| """Handle SIGCHLD to reap zombie processes.""" | |
| nonlocal child_exited | |
| try: | |
| os.waitpid(-1, os.WNOHANG) | |
| child_exited = True | |
| except ChildProcessError: | |
| pass | |
| def parent_signal_handler(signum, _): | |
| """Handle signals: forward to child, restart child on SIGHUP, exit on others.""" | |
| if pid and pid > 0: | |
| try: | |
| os.kill(pid, signum) | |
| if signum != signal.SIGHUP: | |
| _, status = os.waitpid(pid, 0) | |
| sys.exit(os.WEXITSTATUS(status) if os.WIFEXITED(status) else 1) | |
| except ProcessLookupError: | |
| pass | |
| if signum != signal.SIGHUP: | |
| sys.exit(0) | |
| # Set up signal handlers | |
| for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT): | |
| signal.signal(sig, parent_signal_handler) | |
| signal.signal(signal.SIGCHLD, sigchld_handler) | |
| while True: | |
| child_exited = False | |
| pid = os.fork() | |
| if pid > 0: | |
| # Parent process | |
| while not child_exited: | |
| signal.pause() # Wait for signals | |
| _, status = os.waitpid(pid, os.WNOHANG) | |
| if os.WIFEXITED(status): | |
| if (exit_code := os.WEXITSTATUS(status)) != 0: | |
| return exit_code | |
| elif os.WIFSIGNALED(status): | |
| sig = os.WTERMSIG(status) | |
| if sig != signal.SIGHUP: | |
| return 128 + sig | |
| # Continue loop for SIGHUP or normal exit | |
| else: | |
| _handle_child(exporter) |
🤖 Prompt for AI Agents
In packages/jumpstarter-cli/jumpstarter_cli/run.py around lines 51 to 81, the
parent lacks a SIGCHLD handler so exited child processes can become zombies; add
a SIGCHLD handler that non-blockingly reaps children (loop calling
os.waitpid(-1, os.WNOHANG) until it returns (0,0) or raises ChildProcessError),
swallow ProcessLookupError, and install that handler alongside the existing
signal handlers (ensure it returns quickly and does not block the main loop or
conflict with parent_signal_handler).
|
there are several problems with current signal handling. it's not apparent when running interactively and just Ctrl+C it, but when running as a service or from within a container it doesn't do cleanup and it's generally not handled correctly. for forked process the usual way is to send SIGTERM to the process group, both parent and child get the signal and terminate immediately (well, depending on timing parent may attempt to rerun child). in order to gracefully shut down the parent needs to handle wait on child exit (and has to handle its signals) and propagate them to child in case only the parent gets them. The child needs to react to signals as well, cancelling the async tasks that in turn triggers exporter's aexit unregistering from controller first. And with that the child can also communicate back its exit code cleanly, without a need for shared "result" var. |
87dd447 to
de08520
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
packages/jumpstarter-cli/jumpstarter_cli/run.py (2)
68-77: Ensure child never continues executing parent code; reset handlers in childTwo points:
- Add a defensive exit after
_handle_child(exporter)so the child cannot “fall through” in pathological cases.- The child inherits the parent’s Python signal handlers; reset them to defaults to avoid interference with AnyIO’s signal receiver in the child.
Apply this diff:
- else: - _handle_child(exporter) + else: + # Child process: reset parent's handlers to defaults + for _sig in (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT): + try: + signal.signal(_sig, signal.SIG_DFL) + except Exception: + pass + _handle_child(exporter) + os._exit(1) # Should be unreachable; safeguard to prevent fallthrough
40-49: Handle EINTR in waitpid and interpret SIGHUP-terminated child as restartSignals delivered to the parent may interrupt
waitpid()with EINTR, causing an exception and aborting the loop. Also, if the child is terminated by SIGHUP before it can handle it, treat that as a restart instead of exiting.Apply this diff:
-def _handle_parent(pid): +def _handle_parent(pid): """Handle parent process waiting for child.""" - _, status = os.waitpid(pid, 0) + # waitpid() can be interrupted by signals; retry on EINTR + while True: + try: + _, status = os.waitpid(pid, 0) + break + except InterruptedError: + continue if os.WIFEXITED(status): exit_code = os.WEXITSTATUS(status) if exit_code != 0: return exit_code # Child exited normally (code 0) - continue loop for SIGHUP restart - else: - return 128 + (os.WTERMSIG(status) if os.WIFSIGNALED(status) else 0) + elif os.WIFSIGNALED(status): + sig = os.WTERMSIG(status) + # If the child died directly from SIGHUP (missed in-child handling), restart + if sig == signal.SIGHUP: + return None + return 128 + sig + else: + # Stopped/continued are unexpected here with blocking wait + return 1
🧹 Nitpick comments (2)
packages/jumpstarter-cli/jumpstarter_cli/run.py (2)
56-66: Option: forward to the child’s process group to catch grandchildrenIf the exporter spawns subprocesses, signaling only the child PID may leave grandchildren running. Consider making the child a new session leader and forwarding signals to its process group.
Apply this diff to forward to the process group with a safe fallback:
def parent_signal_handler(signum, _): """Forward signals to child process.""" if pid and pid > 0: try: - os.kill(pid, signum) + # Prefer signaling the child's process group, fallback to PID + try: + os.killpg(pid, signum) + except Exception: + os.kill(pid, signum) except ProcessLookupError: pass @@ - for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT): + for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT): signal.signal(sig, parent_signal_handler)And, in the child branch (only once, right after fork), make the child a session leader so it owns its process group:
- else: - # Child process: reset parent's handlers to defaults + else: + # Child process: reset parent's handlers to defaults and become a new session leader for _sig in (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT): try: signal.signal(_sig, signal.SIG_DFL) except Exception: pass + try: + os.setsid() + except Exception: + pass _handle_child(exporter) os._exit(1) # Should be unreachable; safeguard to prevent fallthroughNote: This is POSIX-specific; the fallback to
os.kill(pid, signum)keeps behavior correct on non-POSIX platforms.
12-37: Child-side signal handling and graceful shutdown: overall solid
- Uses AnyIO’s
open_signal_receiverwith a proper synchronous context manager — correct.- Cancels the task group on first signal; maps exit codes appropriately, including special-casing SIGHUP to 0 for restarts.
Minor nit: consider logging which signal was received to aid diagnostics.
Apply this minimal addition:
except get_cancelled_exc_class(): - if received_signal == signal.SIGHUP: + if received_signal == signal.SIGHUP: + click.echo("SIGHUP received, restarting child", err=True) return 0
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
packages/jumpstarter-cli/jumpstarter_cli/run.py(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO 4.6.2+, `open_signal_receiver()` is a regular synchronous context manager that should be used with a plain `with` statement, not `async with`. However, it returns an asynchronous iterator that must be consumed with `async for`. The correct usage pattern is: `with open_signal_receiver(...) as signals: async for signum in signals:`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO, `open_signal_receiver()` returns a synchronous context manager (used with regular `with`), not an asynchronous one. While the context manager is synchronous, it yields an asynchronous iterator that should be consumed with `async for`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
🧬 Code Graph Analysis (1)
packages/jumpstarter-cli/jumpstarter_cli/run.py (3)
packages/jumpstarter-cli-common/jumpstarter_cli_common/config.py (1)
opt_config(99-100)packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)
handle_exceptions(32-46)packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py (1)
signal_handler(9-22)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: e2e
- GitHub Check: pytest-matrix (macos-15, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
- GitHub Check: pytest-matrix (macos-15, 3.12)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
🔇 Additional comments (1)
packages/jumpstarter-cli/jumpstarter_cli/run.py (1)
17-21: No change required — repository requires Python >= 3.11, so await anext(signals) is supportedpyproject metadata and CI show the project targets Python 3.11+ (pyproject requires-python = ">=3.11" in multiple packages and .github/workflows runs 3.11–3.13), while anext() was introduced in Python 3.10. The portability concern in the original comment is therefore not applicable; switching to an async for loop is optional (readability), not required.
Files to note:
- packages/jumpstarter-cli/jumpstarter_cli/run.py — current use:
received_signal = await anext(signals)- packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py — also uses open_signal_receiver
Optional readability diff (not required):
- async def signal_handler(scope): - nonlocal received_signal - with open_signal_receiver(signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT) as signals: - received_signal = await anext(signals) - scope.cancel() + async def signal_handler(scope): + nonlocal received_signal + with open_signal_receiver(signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT) as signals: + async for signum in signals: + received_signal = signum + scope.cancel() + break
bae1231 to
2755970
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (4)
packages/jumpstarter-cli/jumpstarter_cli/run.py (4)
12-41: LGTM! Clean signal handling implementation using anyio.The child process signal handling using anyio's
open_signal_receiveris well-implemented and follows the correct pattern (synchronous context manager with async iteration). The signal forwarding to cancel the task group and proper exit code handling align with Unix conventions.
15-15: Consider using None to distinguish "no signal" from signal 0.Using
0as the default forreceived_signalcould be ambiguous since it's also a valid signal number (though rarely used). When cancellation occurs without a signal, the current code returns 128 + 0 = 128, which might be misleading.Consider using
Noneto represent "no signal received":- received_signal = 0 + received_signal = None async def signal_handler(cancel_scope): nonlocal received_signal with open_signal_receiver(signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT) as signals: received_signal = await anext(signals) # Cancel the entire task group including exporter's tasks cancel_scope.cancel() try: async with create_task_group() as tg: tg.start_soon(signal_handler, tg.cancel_scope) # Pass our task group to exporter.serve() so it adds its tasks directly to our group await exporter.serve(task_group=tg) # If signal was received but task group completed normally, use signal-based exit code - return 128 + received_signal if received_signal else 128 + return 128 + received_signal if received_signal is not None else 0 except get_cancelled_exc_class(): - return 0 if received_signal == signal.SIGHUP else 128 + received_signal # restart child on SIGHUP + if received_signal == signal.SIGHUP: + return 0 # restart child on SIGHUP + elif received_signal is not None: + return 128 + received_signal + else: + return 0 # graceful shutdown without signalAlso applies to: 30-32
79-80: Add defensive exit after _handle_child.While
_handle_childcallssys.exit(), adding a defensive exit ensures the child never continues if something goes wrong.else: _handle_child(exporter) + os._exit(1) # Defensive: should never reach here
57-67: Consider handling SIGCHLD for prompt zombie reaping.If the child exits unexpectedly (e.g., crashes) between signals, it becomes a zombie until the parent receives another signal or explicitly waits. While the current design eventually reaps the child, prompt reaping would be cleaner.
This is a minor edge case that doesn't affect correctness but could improve resource management in long-running scenarios. Consider adding SIGCHLD handling if you observe zombie processes during testing.
🧹 Nitpick comments (2)
packages/jumpstarter/jumpstarter/exporter/exporter.py (1)
119-164: Consider documenting the task group ownership contract.The change from internal task group management to external task group orchestration is well-implemented. However, the contract between the exporter and the caller regarding task group lifecycle and cancellation responsibilities should be documented.
Consider adding a docstring to clarify the contract:
- async def serve(self, task_group): # noqa: C901 + async def serve(self, task_group): # noqa: C901 + """ + Serve the exporter using the provided task group. + + Args: + task_group: An external anyio TaskGroup to which exporter tasks will be added. + The caller is responsible for the task group's lifecycle and cancellation. + + Note: + When lease changes occur, this method may break out of its loop. The caller + should handle appropriate cleanup and restart logic. + """packages/jumpstarter-cli/jumpstarter_cli/run.py (1)
43-67: Signal handler setup timing could miss early signals.The signal handlers are set up after the fork (line 54-55), creating a small window where signals sent immediately after fork but before handler installation might be lost. While rare, this could affect automated testing or rapid signal scenarios.
Consider setting up handlers before fork and resetting them in the child:
def _handle_parent(pid): """Handle parent process waiting for child and signal forwarding.""" + # ... handler definition remains the same ... + _, status = os.waitpid(pid, 0) + # ... rest remains the same ... def _serve_with_exc_handling(exporter): """Fork-based approach with improved signal handling.""" + def parent_signal_handler(signum, _): + """Placeholder handler before fork.""" + pass + + # Set up handlers before fork to avoid race + original_handlers = {} + for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT): + original_handlers[sig] = signal.signal(sig, parent_signal_handler) + while True: pid = os.fork() if pid > 0: # Parent process + # Update handler to forward to actual child PID + def forward_to_child(signum, _): + try: + os.kill(pid, signum) + except ProcessLookupError: + pass + + for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT): + signal.signal(sig, forward_to_child) + if (exit_code := _handle_parent(pid)) is not None: + # Restore original handlers before exit + for sig, handler in original_handlers.items(): + signal.signal(sig, handler) return exit_code else: + # Child: restore default handlers + for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT): + signal.signal(sig, signal.SIG_DFL) _handle_child(exporter)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
📒 Files selected for processing (4)
packages/jumpstarter-cli/jumpstarter_cli/run.py(1 hunks)packages/jumpstarter/jumpstarter/config/exporter.py(2 hunks)packages/jumpstarter/jumpstarter/exporter/exporter.py(4 hunks)packages/jumpstarter/jumpstarter/exporter/session.py(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO 4.6.2+, `open_signal_receiver()` is a regular synchronous context manager that should be used with a plain `with` statement, not `async with`. However, it returns an asynchronous iterator that must be consumed with `async for`. The correct usage pattern is: `with open_signal_receiver(...) as signals: async for signum in signals:`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO, `open_signal_receiver()` returns a synchronous context manager (used with regular `with`), not an asynchronous one. While the context manager is synchronous, it yields an asynchronous iterator that should be consumed with `async for`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
🧬 Code Graph Analysis (4)
packages/jumpstarter/jumpstarter/exporter/session.py (1)
packages/jumpstarter-driver-composite/jumpstarter_driver_composite/driver.py (1)
report(37-38)
packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
packages/jumpstarter-protocol/jumpstarter_protocol/jumpstarter/v1/jumpstarter_pb2_grpc.py (1)
ControllerServiceStub(9-68)packages/jumpstarter/jumpstarter/config/exporter.py (2)
channel_factory(165-173)serve(159-189)packages/jumpstarter/conftest.py (1)
Unregister(59-60)
packages/jumpstarter/jumpstarter/config/exporter.py (1)
packages/jumpstarter/jumpstarter/exporter/exporter.py (2)
serve(119-163)Exporter(24-163)
packages/jumpstarter-cli/jumpstarter_cli/run.py (5)
packages/jumpstarter-cli-common/jumpstarter_cli_common/config.py (1)
opt_config(99-100)packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)
handle_exceptions(32-46)packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py (1)
signal_handler(9-22)packages/jumpstarter/jumpstarter/exporter/exporter.py (2)
serve(119-163)status(126-145)packages/jumpstarter/jumpstarter/config/exporter.py (1)
serve(159-189)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Redirect rules - jumpstarter-docs
- GitHub Check: Header rules - jumpstarter-docs
- GitHub Check: pytest-matrix (macos-15, 3.12)
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
- GitHub Check: pytest-matrix (macos-15, 3.11)
- GitHub Check: e2e
- GitHub Check: Pages changed - jumpstarter-docs
🔇 Additional comments (3)
packages/jumpstarter/jumpstarter/exporter/session.py (1)
46-56: LGTM! Robust error handling during cleanup.The enhanced error handling in
__exit__ensures that driver close failures are properly logged with descriptive information, and the logging handler is always removed via the finally block. This prevents resource leaks and improves debugging.packages/jumpstarter/jumpstarter/exporter/exporter.py (1)
32-52: LGTM! Comprehensive cleanup with timeout protection.The enhanced
__aexit__implementation with a 10-second timeout, nested error handling, and non-masking behavior ensures robust cleanup even when the controller is unresponsive. The descriptive unregister reason improves debugging.packages/jumpstarter/jumpstarter/config/exporter.py (1)
159-190: LGTM! Well-structured lifecycle management with shielded cleanup.The explicit lifecycle management with
__aenter__and__aexit__, combined with the shielded cleanup usingCancelScope(shield=True), ensures proper resource cleanup even during abrupt cancellation. The try/finally pattern guarantees cleanup execution.
ace8994 to
b31e578
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🔭 Outside diff range comments (1)
packages/jumpstarter-cli/jumpstarter_cli/run.py (1)
90-93: Propagate exit code to the shell (Click ignores return values)Returning an int from a Click command doesn’t set process exit status. Use
sys.exit(...)to ensure the CLI exits with the intended code.def run(config): """Run an exporter locally.""" - return _serve_with_exc_handling(config) + sys.exit(_serve_with_exc_handling(config))
♻️ Duplicate comments (1)
packages/jumpstarter-cli/jumpstarter_cli/run.py (1)
59-72: Treat SIGHUP-terminated child (WIFSIGNALED) as restart to avoid race at early startupIf the child receives SIGHUP before installing its handler, it may terminate due to the default action, and the parent currently exits instead of restarting. Handle
WIFSIGNALED(status) == SIGHUPas a restart as well.- else: - # Child killed by unhandled signal - terminate - child_exit_code = 128 + (os.WTERMSIG(status) if os.WIFSIGNALED(status) else 0) - click.echo(f"Child killed by unhandled signal: {child_exit_code}", err=True) - return child_exit_code + else: + # Child terminated by a signal + if os.WIFSIGNALED(status): + sig = os.WTERMSIG(status) + if sig == signal.SIGHUP: + # Treat SIGHUP as a restart trigger even if the child didn't trap it + return None + child_exit_code = 128 + sig + click.echo(f"Child killed by unhandled signal: {child_exit_code}", err=True) + return child_exit_code + # Unexpected termination state + click.echo("Child terminated unexpectedly", err=True) + return 1
🧹 Nitpick comments (4)
packages/jumpstarter/jumpstarter/exporter/session.py (1)
47-55: Harden error logging in exit to avoid cascading failuresIf
root_device.report()raises inside the except block, the original close() error would be masked. Wrap the report retrieval in its own try/except and fall back to the class name.try: self.root_device.close() except Exception as e: - # Get driver name from report for more descriptive logging - report = self.root_device.report() - driver_name = report.labels.get('jumpstarter.dev/name', self.root_device.__class__.__name__) - logger.error("Error closing driver %s: %s", driver_name, e, exc_info=True) + # Best-effort: fetch a descriptive driver name for logging + try: + report = self.root_device.report() + driver_name = report.labels.get("jumpstarter.dev/name", self.root_device.__class__.__name__) + except Exception: + driver_name = self.root_device.__class__.__name__ + logger.error("Error closing driver %s: %s", driver_name, e, exc_info=True) finally: logging.getLogger().removeHandler(self._logging_handler)packages/jumpstarter/jumpstarter/config/exporter.py (1)
159-189: Call aexit only if aenter completed successfullyIf
await exporter.__aenter__()raises, calling__aexit__may perform invalid cleanup. Track anenteredflag to ensure correctness. The shielding looks good.- exporter = None + exporter = None + entered = False try: exporter = Exporter( channel_factory=channel_factory, device_factory=ExporterConfigV1Alpha1DriverInstance(children=self.export).instantiate, tls=self.tls, grpc_options=self.grpcOptions, ) - await exporter.__aenter__() + await exporter.__aenter__() + entered = True await exporter.serve(task_group) finally: # Shield all cleanup operations from abrupt cancellation for clean shutdown - if exporter: + if exporter and entered: with CancelScope(shield=True): await exporter.__aexit__(None, None, None)Would you like me to add defensive logging around
__aexit__as well to avoid masking an original failure during cleanup?packages/jumpstarter/jumpstarter/exporter/exporter.py (1)
159-176: Clarify log message to match the new fork-based lifecycleGiven the process will exit and the supervisor restarts it, the current message “killing existing connections” is misleading. Suggest updating the log to reflect the restart flow.
- logger.info("Lease status changed, killing existing connections") + logger.info("Lease status changed; exiting to allow supervisor to restart")packages/jumpstarter-cli/jumpstarter_cli/run.py (1)
82-84: Optional: place child in its own process group and signal the groupIf the child ever spawns descendants, sending signals to the process group ensures all are terminated/restarted consistently.
- else: - _handle_child(exporter) + else: + # Optional: make the child its own process group leader so we can signal the whole group + try: + os.setpgid(0, 0) + except Exception: + pass + _handle_child(exporter)And in
parent_signal_handler, useos.killpg(pid, signum)instead ofos.kill(pid, signum)if you adopt process groups.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
packages/jumpstarter-cli/jumpstarter_cli/run.py(1 hunks)packages/jumpstarter/jumpstarter/config/exporter.py(2 hunks)packages/jumpstarter/jumpstarter/exporter/exporter.py(4 hunks)packages/jumpstarter/jumpstarter/exporter/session.py(1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Learnt from: NickCao
PR: jumpstarter-dev/jumpstarter#521
File: packages/jumpstarter-cli/jumpstarter_cli/run.py:1-2
Timestamp: 2025-06-12T17:18:48.519Z
Learning: Exporters in this project are intended to run only on Unix systems, so Unix-specific APIs like `os.fork()` are acceptable and Windows compatibility is not required.
📚 Learning: 2025-08-14T13:11:35.034Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Applied to files:
packages/jumpstarter/jumpstarter/exporter/exporter.pypackages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO 4.6.2+, `open_signal_receiver()` is a regular synchronous context manager that should be used with a plain `with` statement, not `async with`. However, it returns an asynchronous iterator that must be consumed with `async for`. The correct usage pattern is: `with open_signal_receiver(...) as signals: async for signum in signals:`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO, `open_signal_receiver()` returns a synchronous context manager (used with regular `with`), not an asynchronous one. While the context manager is synchronous, it yields an asynchronous iterator that should be consumed with `async for`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-08-14T13:20:51.953Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter-cli/jumpstarter_cli/run.py:15-15
Timestamp: 2025-08-14T13:20:51.953Z
Learning: Signal 0 is not a valid receivable signal in Unix systems. It can only be used as an argument to kill(pid, 0) to test if a process exists, but doesn't actually send a signal. Receivable signals are positive integers (e.g., SIGINT=2, SIGTERM=15, SIGHUP=1, SIGQUIT=3). Using 0 to represent "no signal received" in signal handling code is correct.
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
🧬 Code Graph Analysis (4)
packages/jumpstarter/jumpstarter/exporter/session.py (1)
packages/jumpstarter-driver-composite/jumpstarter_driver_composite/driver.py (1)
report(37-38)
packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
packages/jumpstarter-protocol/jumpstarter_protocol/jumpstarter/v1/jumpstarter_pb2_grpc.py (1)
ControllerServiceStub(9-68)packages/jumpstarter/jumpstarter/config/exporter.py (2)
channel_factory(165-173)serve(159-189)packages/jumpstarter/conftest.py (1)
Unregister(59-60)
packages/jumpstarter/jumpstarter/config/exporter.py (1)
packages/jumpstarter/jumpstarter/exporter/exporter.py (2)
serve(120-175)Exporter(24-175)
packages/jumpstarter-cli/jumpstarter_cli/run.py (3)
packages/jumpstarter-cli-common/jumpstarter_cli_common/config.py (1)
opt_config(99-100)packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)
handle_exceptions(32-46)packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py (1)
signal_handler(9-22)
🔇 Additional comments (4)
packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
7-7: Import update looks correctUsing
connect_unix,create_memory_object_stream, andsleepfrom anyio aligns with the new external TaskGroup orchestration.
33-52: Robust unregistration on shutdown — good improvementThe 10s timeout and guarded exception logging in
__aexit__make cleanup safer without masking the original cause. LGTM.
120-131: Docstring clearly states the external TaskGroup contractThis clarifies ownership and is consistent with the fork-based supervisor. Good call.
packages/jumpstarter-cli/jumpstarter_cli/run.py (1)
20-31: Child: correct AnyIO signal handling and graceful cancellationUsing
open_signal_receiverwith a synchronous context manager and canceling the task group on first signal is correct. Good.
b31e578 to
115afe4
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
packages/jumpstarter/jumpstarter/config/exporter.py (3)
175-189: Optional: use AsyncExitStack for idiomatic async context handling with shielded cleanupAsyncExitStack lets you control exactly when aexit runs (inside a shield) without manually invoking dunder methods.
Alternative refactor:
- exporter = None - try: - exporter = Exporter( - channel_factory=channel_factory, - device_factory=ExporterConfigV1Alpha1DriverInstance(children=self.export).instantiate, - tls=self.tls, - grpc_options=self.grpcOptions, - ) - await exporter.__aenter__() - await exporter.serve(task_group) - finally: - # Shield all cleanup operations from abrupt cancellation for clean shutdown - if exporter: - with CancelScope(shield=True): - await exporter.__aexit__(None, None, None) + from contextlib import AsyncExitStack + stack = AsyncExitStack() + try: + exporter = Exporter( + channel_factory=channel_factory, + device_factory=ExporterConfigV1Alpha1DriverInstance(children=self.export).instantiate, + tls=self.tls, + grpc_options=self.grpcOptions, + ) + await stack.enter_async_context(exporter) + await exporter.serve(task_group) + finally: + # Shield all cleanup operations from abrupt cancellation for clean shutdown + with CancelScope(shield=True): + await stack.aclose()
159-163: Type-hint task_group for clarityAnnotating task_group improves readability and IDE assistance. AnyIO exposes the protocol in anyio.abc.TaskGroup.
Apply this diff:
- async def serve(self, task_group): + async def serve(self, task_group: "TaskGroup"): # dynamic import to avoid circular imports - from anyio import CancelScope + from anyio import CancelScope + from anyio.abc import TaskGroup
166-168: Improve diagnostics when endpoint/token are missingInclude alias/path to quickly identify which exporter config failed.
- if self.endpoint is None or self.token is None: - raise ConfigurationError("endpoint or token not set in exporter config") + if self.endpoint is None or self.token is None: + raise ConfigurationError( + f"endpoint or token not set in exporter config {self.alias} (path={self.path})" + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
packages/jumpstarter-cli/jumpstarter_cli/run.py(1 hunks)packages/jumpstarter/jumpstarter/config/exporter.py(2 hunks)packages/jumpstarter/jumpstarter/exporter/exporter.py(4 hunks)packages/jumpstarter/jumpstarter/exporter/session.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- packages/jumpstarter/jumpstarter/exporter/session.py
- packages/jumpstarter-cli/jumpstarter_cli/run.py
- packages/jumpstarter/jumpstarter/exporter/exporter.py
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Learnt from: NickCao
PR: jumpstarter-dev/jumpstarter#521
File: packages/jumpstarter-cli/jumpstarter_cli/run.py:1-2
Timestamp: 2025-06-12T17:18:48.519Z
Learning: Exporters in this project are intended to run only on Unix systems, so Unix-specific APIs like `os.fork()` are acceptable and Windows compatibility is not required.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Redirect rules - jumpstarter-docs
- GitHub Check: Header rules - jumpstarter-docs
- GitHub Check: Pages changed - jumpstarter-docs
- GitHub Check: pytest-matrix (macos-15, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
- GitHub Check: pytest-matrix (macos-15, 3.12)
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: e2e
🔇 Additional comments (2)
packages/jumpstarter/jumpstarter/config/exporter.py (2)
159-190: Good: cancellation-shielded cleanup and external TaskGroup integrationPassing an external task_group and shielding exporter cleanup from cancellation matches the PR’s signal-handling goals and the fork-based supervision model. This should help ensure child-side cleanup runs reliably before exit and restart.
159-190: Exporter.serve signature and call sites are aligned — no change requiredVerified that Exporter defines async def serve(self, task_group) and existing call sites pass task_group (positional or named):
- packages/jumpstarter/jumpstarter/exporter/exporter.py — async def serve(self, task_group) (≈line 120)
- packages/jumpstarter/jumpstarter/config/exporter.py — await exporter.serve(task_group) (≈line 184)
- packages/jumpstarter-cli/jumpstarter_cli/run.py — await exporter.serve(task_group=tg) (line 35)
No action required.
115afe4 to
7aaacc0
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
packages/jumpstarter/jumpstarter/config/exporter.py (1)
175-189: Call aexit only if aenter completed to avoid unsafe cleanupIf
Exporter.__aenter__raises, invoking__aexit__is not guaranteed to be safe and can mask the original error. Track whether the context was successfully entered before calling__aexit__.Apply this diff:
- exporter = None + exporter = None + entered = False try: exporter = Exporter( channel_factory=channel_factory, device_factory=ExporterConfigV1Alpha1DriverInstance(children=self.export).instantiate, tls=self.tls, grpc_options=self.grpcOptions, ) await exporter.__aenter__() + entered = True await exporter.serve(task_group) finally: # Shield all cleanup operations from abrupt cancellation for clean shutdown - if exporter: + if exporter and entered: with CancelScope(shield=True): await exporter.__aexit__(None, None, None)
🧹 Nitpick comments (1)
packages/jumpstarter-cli/jumpstarter_cli/run.py (1)
51-79: Consider handling SIGCHLD to promptly reap zombie processesIf the child process exits unexpectedly (e.g., crashes) without the parent receiving a signal, the child becomes a zombie until the parent receives its next signal. While the current implementation works correctly for signal-driven termination, adding a SIGCHLD handler would improve robustness by promptly reaping children.
Add SIGCHLD handling to immediately detect child termination:
def _handle_parent(pid): """Handle parent process waiting for child and signal forwarding.""" + child_exited = False + + def sigchld_handler(signum, _): + """Handle SIGCHLD to detect child exit.""" + nonlocal child_exited + child_exited = True + def parent_signal_handler(signum, _): logger.info("PARENT: Received %d (%s), forwarding to child PID %d", signum, signal.Signals(signum).name, pid) if pid and pid > 0: try: os.kill(pid, signum) except ProcessLookupError: pass # Set up signal handlers after fork for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT): signal.signal(sig, parent_signal_handler) + signal.signal(signal.SIGCHLD, sigchld_handler) + # Use signal.pause() to wait for signals instead of blocking in waitpid + while not child_exited: + signal.pause() + _, status = os.waitpid(pid, 0)Alternatively, the current implementation is sufficient if prompt zombie reaping isn't critical for your use case.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
packages/jumpstarter-cli/jumpstarter_cli/run.py(1 hunks)packages/jumpstarter/jumpstarter/config/exporter.py(2 hunks)packages/jumpstarter/jumpstarter/exporter/exporter.py(4 hunks)packages/jumpstarter/jumpstarter/exporter/session.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/jumpstarter/jumpstarter/exporter/session.py
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
📚 Learning: 2025-08-14T13:11:35.034Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.pypackages/jumpstarter/jumpstarter/exporter/exporter.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO 4.6.2+, `open_signal_receiver()` is a regular synchronous context manager that should be used with a plain `with` statement, not `async with`. However, it returns an asynchronous iterator that must be consumed with `async for`. The correct usage pattern is: `with open_signal_receiver(...) as signals: async for signum in signals:`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO, `open_signal_receiver()` returns a synchronous context manager (used with regular `with`), not an asynchronous one. While the context manager is synchronous, it yields an asynchronous iterator that should be consumed with `async for`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-08-14T13:20:51.953Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter-cli/jumpstarter_cli/run.py:15-15
Timestamp: 2025-08-14T13:20:51.953Z
Learning: Signal 0 is not a valid receivable signal in Unix systems. It can only be used as an argument to kill(pid, 0) to test if a process exists, but doesn't actually send a signal. Receivable signals are positive integers (e.g., SIGINT=2, SIGTERM=15, SIGHUP=1, SIGQUIT=3). Using 0 to represent "no signal received" in signal handling code is correct.
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
🧬 Code Graph Analysis (2)
packages/jumpstarter-cli/jumpstarter_cli/run.py (5)
packages/jumpstarter-cli-common/jumpstarter_cli_common/config.py (1)
opt_config(99-100)packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)
handle_exceptions(32-46)packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py (1)
signal_handler(9-22)packages/jumpstarter/jumpstarter/config/exporter.py (1)
serve(159-189)packages/jumpstarter/jumpstarter/exporter/exporter.py (2)
serve(120-175)status(138-157)
packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
packages/jumpstarter-protocol/jumpstarter_protocol/jumpstarter/v1/jumpstarter_pb2_grpc.py (1)
ControllerServiceStub(9-68)packages/jumpstarter/jumpstarter/config/exporter.py (2)
channel_factory(165-173)serve(159-189)packages/jumpstarter/conftest.py (1)
Unregister(59-60)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Redirect rules - jumpstarter-docs
- GitHub Check: Header rules - jumpstarter-docs
- GitHub Check: Pages changed - jumpstarter-docs
- GitHub Check: e2e
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
- GitHub Check: pytest-matrix (macos-15, 3.12)
- GitHub Check: pytest-matrix (macos-15, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
🔇 Additional comments (6)
packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
33-53: Cleanup logic looks good with proper timeout and error handlingThe new cleanup implementation is robust:
- 10-second timeout ensures cleanup doesn't hang indefinitely
- Inner exception handling prevents unregistration errors from propagating
- Outer exception handling ensures cleanup issues don't mask original errors
- Clear logging for both success and failure cases
This aligns well with the fork-based architecture where child termination triggers proper cleanup.
120-131: serve() signature change properly delegates lifecycle managementThe new signature accepting an external
task_groupcorrectly shifts responsibility for task lifecycle management to the caller. The docstring clearly documents this contract including:
- External responsibility for the task group lifecycle
- Expected behavior on lease changes
- Caller's responsibility for cleanup/restart logic
This design works well with the fork-based supervisor in the CLI that handles restart on exit.
159-172: External task group integration maintains proper separation of concernsThe migration from internal task group to external task group is well executed:
- Status task is correctly added to the external group via
start_soon- Lease change breaks out of the loop instead of cancelling internally
- The comment clearly indicates the caller handles cancellation
Given the learnings about the fork-based architecture where the child process is restarted on lease changes, this approach eliminates any concerns about orphaned tasks.
packages/jumpstarter-cli/jumpstarter_cli/run.py (3)
15-49: Child process signal handling implementation is well designedThe child process handler correctly:
- Uses anyio's signal handling with proper synchronous context manager (
with) and async iteration- Tracks received signals and prevents duplicate processing
- Logs signals with descriptive names for debugging
- Returns the signal number for parent to interpret
- Properly exits the child process with
sys.exit(anyio.run(...))The use of 0 to represent "no signal received" is correct as noted in the learnings.
51-79: Parent process correctly implements signal forwarding and child restart logicThe parent handler properly:
- Forwards signals to the child with error handling for ProcessLookupError
- Interprets exit code 0 or SIGHUP as restart triggers
- Returns appropriate Unix exit codes (128 + signal number)
- Handles both normal exits and signal terminations
The implementation correctly addresses the PR objectives by ensuring the parent waits for the child and propagates exit codes while treating SIGHUP specially for restarts.
81-91: Fork-based supervisor with restart loop is well implementedThe supervisor correctly:
- Implements a restart loop that continues when
_handle_parentreturns None- Exits with the child's exit code when non-None is returned
- Includes defensive
sys.exit(1)after_handle_childto ensure the child never continuesThis addresses the core PR objectives of proper signal handling and process lifecycle management.
7aaacc0 to
e6b0b78
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
packages/jumpstarter/jumpstarter/exporter/exporter.py (1)
32-51: Bounded unregister and non-masking cleanup are on point
move_on_after(10)to bound the unregister is a solid defensive choice.- Logging and not re-raising from
__aexit__avoids masking original errors during teardown.Minor optional refinements:
- Use
logger.exception(...)for automatic stack traces.- Close the channel after unregister if
channel_factory()creates a new channel per call.Apply this small refinement for the logging and optional channel close:
- try: - if self.registered: - logger.info("Unregistering exporter with controller") - try: - with anyio.move_on_after(10): # 10 second timeout - controller = jumpstarter_pb2_grpc.ControllerServiceStub(await self.channel_factory()) - await controller.Unregister( + try: + if self.registered: + logger.info("Unregistering exporter with controller") + try: + with anyio.move_on_after(10): # 10 second timeout + channel = await self.channel_factory() + controller = jumpstarter_pb2_grpc.ControllerServiceStub(channel) + await controller.Unregister( jumpstarter_pb2.UnregisterRequest( reason="Exporter shutdown", ) ) logger.info("Controller unregistration completed successfully") + finally: + try: + # Close channel if independently created + await channel.close() # no-op if channel is reused or already closed + except Exception: + logger.debug("Controller channel close failed or unnecessary", exc_info=True) except Exception as e: - logger.error("Error during controller unregistration: %s", e, exc_info=True) + logger.exception("Error during controller unregistration: %s", e) except Exception as e: - logger.error("Error during exporter cleanup: %s", e, exc_info=True) + logger.exception("Error during exporter cleanup: %s", e)packages/jumpstarter-cli/jumpstarter_cli/run.py (2)
76-93: Parent forwarding and exit-code mapping look correct
- Signal forwarding is simple and non-blocking.
waitpid()handling and mapping: 0/SIGHUP → restart, signals → 128+N; unexpected terminations are handled.Optional:
- Prefer
logger.erroroverclick.echofor consistency.Apply this tiny logging consistency tweak:
- click.echo(f"Child killed by unhandled signal: {child_exit_signal}", err=True) + logger.error("Child killed by unhandled signal: %s", child_exit_signal)
100-103: Optional: consider process-group delivery for broader shutdownsIf the child ever spawns subprocesses, forwarding to the child’s process group (after
setsid()in the child) ensures all descendants receive the signal. Not necessary now, but worth noting.Possible approach (conceptual, not a direct diff here):
- In the child (immediately after forking), call
os.setsid()to create a new session/process group.- In the parent, forward using
os.killpg(child_pgid, signum).This is only needed if the child will spawn subprocesses that must also be signaled.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
packages/jumpstarter-cli/jumpstarter_cli/run.py(1 hunks)packages/jumpstarter/jumpstarter/config/exporter.py(2 hunks)packages/jumpstarter/jumpstarter/exporter/exporter.py(2 hunks)packages/jumpstarter/jumpstarter/exporter/session.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/jumpstarter/jumpstarter/exporter/session.py
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
📚 Learning: 2025-08-14T13:11:35.034Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Applied to files:
packages/jumpstarter/jumpstarter/exporter/exporter.pypackages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO 4.6.2+, `open_signal_receiver()` is a regular synchronous context manager that should be used with a plain `with` statement, not `async with`. However, it returns an asynchronous iterator that must be consumed with `async for`. The correct usage pattern is: `with open_signal_receiver(...) as signals: async for signum in signals:`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO, `open_signal_receiver()` returns a synchronous context manager (used with regular `with`), not an asynchronous one. While the context manager is synchronous, it yields an asynchronous iterator that should be consumed with `async for`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-08-14T13:20:51.953Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter-cli/jumpstarter_cli/run.py:15-15
Timestamp: 2025-08-14T13:20:51.953Z
Learning: Signal 0 is not a valid receivable signal in Unix systems. It can only be used as an argument to kill(pid, 0) to test if a process exists, but doesn't actually send a signal. Receivable signals are positive integers (e.g., SIGINT=2, SIGTERM=15, SIGHUP=1, SIGQUIT=3). Using 0 to represent "no signal received" in signal handling code is correct.
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
🧬 Code Graph Analysis (2)
packages/jumpstarter-cli/jumpstarter_cli/run.py (5)
packages/jumpstarter-cli-common/jumpstarter_cli_common/config.py (1)
opt_config(99-100)packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (2)
handle_exceptions(32-46)leaf_exceptions(78-102)packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py (1)
signal_handler(9-22)packages/jumpstarter/jumpstarter/exporter/exporter.py (2)
serve(119-164)status(129-148)packages/jumpstarter/jumpstarter/config/exporter.py (1)
serve(159-188)
packages/jumpstarter/jumpstarter/config/exporter.py (1)
packages/jumpstarter/jumpstarter/exporter/exporter.py (2)
Exporter(24-164)serve(119-164)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Redirect rules - jumpstarter-docs
- GitHub Check: Header rules - jumpstarter-docs
- GitHub Check: Pages changed - jumpstarter-docs
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
- GitHub Check: pytest-matrix (macos-15, 3.11)
- GitHub Check: pytest-matrix (macos-15, 3.12)
- GitHub Check: e2e
🔇 Additional comments (7)
packages/jumpstarter/jumpstarter/config/exporter.py (3)
161-161: Good call: import CancelScope locally to avoid circular importsImporting CancelScope within the method keeps dependencies localized and prevents circular-import issues.
175-188: Shielded cleanup on shutdown looks correct and robustThe try/finally with CancelScope(shield=True) ensures
__aexit__runs even under cancellation, aligning with the PR’s graceful shutdown goals.
175-188: Exporter’s__aenter__Is the Default No-Op—Safe to Call__aexit__DirectlyI’ve confirmed that
Exporterdoes not define its own__aenter__(it inherits the default fromcontextlib.AbstractAsyncContextManager, which simply returnsselfwith no side effects). Its custom__aexit__is already guarded by theregisteredflag, so invokingawait exporter.__aexit__(…)without a prior__aenter__is safe here. No changes needed.packages/jumpstarter/jumpstarter/exporter/exporter.py (1)
120-123: Docstring addition is fineThe short docstring for
serve()is harmless and keeps intent clear.packages/jumpstarter-cli/jumpstarter_cli/run.py (3)
24-31: Correct AnyIO signal receiver usageUsing
with open_signal_receiver(...)plusasync formatches AnyIO’s intended pattern and avoids asyncio handler conflicts.
60-64: Return-code semantics are consistent with parent handlerReturning the signal number (or 0) is correctly interpreted by the parent: 0 and SIGHUP → restart; other signals → 128+N.
96-106: Defensive exit in child path is goodThe extra
sys.exit(1)after_handle_child(exporter)guards against unexpected fall-through.
| async def run_exporter(): | ||
| nonlocal exporter_tg | ||
| try: | ||
| async with create_task_group() as tg: | ||
| exporter_tg = tg | ||
| await exporter.serve() | ||
| except* Exception as excgroup: | ||
| from jumpstarter_cli_common.exceptions import leaf_exceptions | ||
| for exc in leaf_exceptions(excgroup): | ||
| click.echo( | ||
| f"Exception while serving on the exporter: {type(exc).__name__}: {exc}", | ||
| err=True, | ||
| ) | ||
|
|
||
| async def signal_handler_wrapper(): | ||
| await signal_handler(lambda: exporter_tg.cancel_scope.cancel() if exporter_tg else None) | ||
|
|
||
| signal_tg.start_soon(signal_handler_wrapper) | ||
| await run_exporter() | ||
| # Cancel the signal handler after exporter completes | ||
| signal_tg.cancel_scope.cancel() | ||
|
|
There was a problem hiding this comment.
Child cancellation path likely raises and skips exit-code mapping
On signal, cancel_func() cancels the exporter’s task group while await exporter.serve() is awaited. Without catching AnyIO’s cancellation exception, the CancelledError (BaseException) propagates out of serve_with_graceful_shutdown, causing anyio.run(...) to raise and bypass your return logic (exit code mapping).
Catch get_cancelled_exc_class() so the child exits cleanly with the intended status.
Apply this diff:
-from anyio import create_task_group, open_signal_receiver
+from anyio import create_task_group, open_signal_receiver, get_cancelled_exc_class
@@
async def run_exporter():
nonlocal exporter_tg
try:
async with create_task_group() as tg:
exporter_tg = tg
await exporter.serve()
- except* Exception as excgroup:
+ except get_cancelled_exc_class():
+ # Graceful cancellation via signal; let the caller decide exit code
+ return
+ except* Exception as excgroup:
from jumpstarter_cli_common.exceptions import leaf_exceptions
for exc in leaf_exceptions(excgroup):
click.echo(
f"Exception while serving on the exporter: {type(exc).__name__}: {exc}",
err=True,
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| async def run_exporter(): | |
| nonlocal exporter_tg | |
| try: | |
| async with create_task_group() as tg: | |
| exporter_tg = tg | |
| await exporter.serve() | |
| except* Exception as excgroup: | |
| from jumpstarter_cli_common.exceptions import leaf_exceptions | |
| for exc in leaf_exceptions(excgroup): | |
| click.echo( | |
| f"Exception while serving on the exporter: {type(exc).__name__}: {exc}", | |
| err=True, | |
| ) | |
| async def signal_handler_wrapper(): | |
| await signal_handler(lambda: exporter_tg.cancel_scope.cancel() if exporter_tg else None) | |
| signal_tg.start_soon(signal_handler_wrapper) | |
| await run_exporter() | |
| # Cancel the signal handler after exporter completes | |
| signal_tg.cancel_scope.cancel() | |
| ++ b/packages/jumpstarter-cli/jumpstarter_cli/run.py | |
| @@ | |
| from anyio import create_task_group, open_signal_receiver, get_cancelled_exc_class | |
| @@ | |
| async def run_exporter(): | |
| nonlocal exporter_tg | |
| try: | |
| async with create_task_group() as tg: | |
| exporter_tg = tg | |
| await exporter.serve() | |
| except get_cancelled_exc_class(): | |
| # Graceful cancellation via signal; let the caller decide exit code | |
| return | |
| except* Exception as excgroup: | |
| from jumpstarter_cli_common.exceptions import leaf_exceptions | |
| for exc in leaf_exceptions(excgroup): | |
| click.echo( | |
| f"Exception while serving on the exporter: {type(exc).__name__}: {exc}", | |
| err=True, | |
| ) | |
| async def signal_handler_wrapper(): | |
| await signal_handler(lambda: exporter_tg.cancel_scope.cancel() if exporter_tg else None) | |
| signal_tg.start_soon(signal_handler_wrapper) | |
| await run_exporter() | |
| # Cancel the signal handler after exporter completes | |
| signal_tg.cancel_scope.cancel() |
e6b0b78 to
b502a28
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
packages/jumpstarter/jumpstarter/config/exporter.py (1)
175-183: Verify lifecycle symmetry: is aexit safe without a preceding aenter?The code no longer calls
__aenter__(), yet still calls__aexit__(). If__aexit__assumes state set up by__aenter__, this can raise and either fail cleanup or mask real errors. Ifserve()is now the sole lifecycle and__aexit__is explicitly safe standalone, that's fine—please confirm.As a safe pattern, consider re-introducing an explicit enter flag around
__aenter__/__aexit__symmetry:Apply this diff (if
Exporter.__aenter__is still required/meaningful):- exporter = None - try: - exporter = Exporter( + exporter = None + entered = False + try: + exporter = Exporter( channel_factory=channel_factory, device_factory=ExporterConfigV1Alpha1DriverInstance(children=self.export).instantiate, tls=self.tls, grpc_options=self.grpcOptions, ) - await exporter.serve() + # Ensure enter/exit symmetry if __aenter__ establishes required state + if hasattr(exporter, "__aenter__"): + await exporter.__aenter__() + entered = True + await exporter.serve() finally: - # Shield all cleanup operations from abrupt cancellation for clean shutdown - if exporter: + # Shield all cleanup operations from abrupt cancellation for clean shutdown + if exporter and (not hasattr(exporter, "__aenter__") or entered): with CancelScope(shield=True): - await exporter.__aexit__(None, None, None) + try: + await exporter.__aexit__(None, None, None) + except Exception: + passRun this script to check whether
Exporterdefines__aenter__and where it’s used:#!/bin/bash # Inspect Exporter lifecycle methods and usages rg -nP -C3 '^\s*class\s+Exporter\b' rg -nP -C2 'def\s+__aenter__\s*\(' rg -nP -C2 'await\s+exporter\.__aenter__\s*\(' rg -nP -C2 'async\s+with\s+Exporter\('
🧹 Nitpick comments (2)
packages/jumpstarter/jumpstarter/exporter/session.py (1)
50-55: Avoid secondary exceptions while preparing error context; derive driver name before closeComputing
report = self.root_device.report()inside the except block can raise again and mask the original close error. Derive a descriptive name before attemptingclose(), and coerce labels to a plain dict to avoid protobuf MapContainer quirks.Apply this diff within this method to compute the name up-front and make label access robust:
def __exit__(self, exc_type, exc_value, traceback): - try: - self.root_device.close() - except Exception as e: - # Get driver name from report for more descriptive logging - report = self.root_device.report() - driver_name = report.labels.get('jumpstarter.dev/name', self.root_device.__class__.__name__) - logger.error("Error closing driver %s: %s", driver_name, e, exc_info=True) - finally: - logging.getLogger().removeHandler(self._logging_handler) + # Best-effort: get a descriptive driver name before closing; never let this raise + driver_name = self.root_device.__class__.__name__ + try: + report = self.root_device.report() + driver_name = dict(getattr(report, "labels", {})).get("jumpstarter.dev/name", driver_name) + except Exception: + pass + + try: + self.root_device.close() + except Exception as e: + logger.error("Error closing driver %s: %s", driver_name, e, exc_info=True) + finally: + logging.getLogger().removeHandler(self._logging_handler)packages/jumpstarter/jumpstarter/config/exporter.py (1)
185-188: Don’t let cleanup mask the primary failure; swallow/log aexit errorsIf
exporter.__aexit__()throws, it will override any exception fromexporter.serve(). Wrap it in try/except to preserve the original failure signal.Apply this diff:
- if exporter: - with CancelScope(shield=True): - await exporter.__aexit__(None, None, None) + if exporter: + with CancelScope(shield=True): + try: + await exporter.__aexit__(None, None, None) + except Exception: + # Don't mask the original exception from serve() + pass
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
packages/jumpstarter-cli/jumpstarter_cli/run.py(1 hunks)packages/jumpstarter/jumpstarter/config/exporter.py(2 hunks)packages/jumpstarter/jumpstarter/exporter/exporter.py(2 hunks)packages/jumpstarter/jumpstarter/exporter/session.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/jumpstarter/jumpstarter/exporter/exporter.py
- packages/jumpstarter-cli/jumpstarter_cli/run.py
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
🧬 Code Graph Analysis (2)
packages/jumpstarter/jumpstarter/exporter/session.py (3)
packages/jumpstarter/jumpstarter/driver/base.py (2)
close(85-87)report(192-209)packages/jumpstarter-driver-composite/jumpstarter_driver_composite/client.py (1)
close(13-16)packages/jumpstarter-driver-composite/jumpstarter_driver_composite/driver.py (1)
report(37-38)
packages/jumpstarter/jumpstarter/config/exporter.py (1)
packages/jumpstarter/jumpstarter/exporter/exporter.py (2)
Exporter(24-164)serve(119-164)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: pytest-matrix (macos-15, 3.12)
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
- GitHub Check: e2e
- GitHub Check: pytest-matrix (macos-15, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
🔇 Additional comments (2)
packages/jumpstarter/jumpstarter/exporter/session.py (1)
47-49: Good: make device close best-effort without breaking teardownWrapping
root_device.close()in a try/except prevents teardown from cascading failures. Thefinallyensures the logging handler is always removed.packages/jumpstarter/jumpstarter/config/exporter.py (1)
161-162: Local CancelScope import is appropriateImporting
CancelScopewithinserve()avoids unnecessary module-level deps and potential circulars. Looks good.
| if self.lease_name != "" and self.lease_name != status.lease_name: | ||
| self.lease_name = status.lease_name | ||
| logger.info("Lease status changed, killing existing connections") | ||
| tg.cancel_scope.cancel() |
There was a problem hiding this comment.
Not necessarily wrong, but asking, why do we stop canceling this scope here?
There was a problem hiding this comment.
It's probably cancelled when you exit the with...
There was a problem hiding this comment.
it's not to trigger "abrupt" cancellation, letting it exit the await (https://github.com/jumpstarter-dev/jumpstarter/pull/589/files#diff-a0c03b6fd1ccd1545c4f26d1e232edb53e3b775a456a96fd78151689fcc8dc5fR43, then in turn line 57) cleanly.
There was a problem hiding this comment.
bah...but then the other tasks exporter's tg (status, handle) doesn't really finish, it does indeed need to be cancelled.
the voluntary/cancelled exit is then handled in run.py line 44-50 - either logging an error or be silent on get_cancelled_exc_class
There was a problem hiding this comment.
i've put it back
| finally: | ||
| # Shield all cleanup operations from abrupt cancellation for clean shutdown | ||
| if exporter: | ||
| with CancelScope(shield=True): |
There was a problem hiding this comment.
|
ok, it looks good but it’s failing consistently the E2E tests for some reason: https://github.com/jumpstarter-dev/jumpstarter/actions?query=branch%3Asignals+workflow%3A%22%22Run+E2E+Tests%22%22++ https://github.com/jumpstarter-dev/jumpstarter/actions/runs/17001120921/job/48202970902#step:2:903 it fails there: https://github.com/jumpstarter-dev/jumpstarter-e2e/blob/main/tests.bats#L125 Something happens with the exporter being taken for a lease. I'll try to reproduce locally. |
b502a28 to
c667913
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (6)
packages/jumpstarter/jumpstarter/exporter/exporter.py (4)
32-51: Good, guarded cleanup; consider closing the channel and logging on timeoutThe unregister path is solid and shielded by a 10s timeout. Two small robustness tweaks:
- Close the channel you create to avoid leaking connections.
- Detect/log when the
move_on_aftertimeout actually fired (so ops can tell unregistration timed out).Apply:
- with anyio.move_on_after(10): # 10 second timeout - controller = jumpstarter_pb2_grpc.ControllerServiceStub(await self.channel_factory()) - await controller.Unregister( + with anyio.move_on_after(10) as scope: # 10 second timeout + channel = await self.channel_factory() + try: + controller = jumpstarter_pb2_grpc.ControllerServiceStub(channel) + await controller.Unregister( jumpstarter_pb2.UnregisterRequest( reason="Exporter shutdown", ) - ) + ) + finally: + try: + channel.close() + except Exception: + logger.debug("Ignoring error closing channel during cleanup", exc_info=True) + if scope.cancelled_caught: + logger.warning("Controller unregistration timed out after 10s")
61-83: Register channel lifetime is unmanaged; close it to avoid leaks
session()creates a new gRPC channel forcontrollerthat is never closed. Over many (re)registrations this can leak connections.Wrap the channel and close it:
- controller = jumpstarter_pb2_grpc.ControllerServiceStub(await self.channel_factory()) + channel = await self.channel_factory() + controller = jumpstarter_pb2_grpc.ControllerServiceStub(channel) with Session( uuid=self.uuid, labels=self.labels, root_device=self.device_factory(), ) as session: async with session.serve_unix_async() as path: async with grpc.aio.secure_channel( f"unix://{path}", grpc.local_channel_credentials(grpc.LocalConnectionType.UDS) ) as channel: response = await jumpstarter_pb2_grpc.ExporterServiceStub(channel).GetReport(empty_pb2.Empty()) logger.info("Registering exporter with controller") await controller.Register( jumpstarter_pb2.RegisterRequest( labels=self.labels, reports=response.reports, ) ) self.registered = True - yield path + try: + yield path + finally: + try: + channel.close() + except Exception: + logger.debug("Ignoring error closing controller channel", exc_info=True)
120-122: Docstring: clarify when serve() returnsHelpful to note the exit conditions to aid operators/debugging.
- """ - Serve the exporter. - """ + """ + Serve the exporter until cancelled or the lease name changes. + Exits when: + - Status stream indicates the lease name changed (triggers controlled shutdown). + - The task group is cancelled (e.g., SIGTERM/SIGINT propagated by the supervisor). + """
162-165: Reduce log verbosity for steady-state statusEmitting “Currently leased/not leased” on every status can spam logs in busy systems.
Consider downgrading to debug or rate-limiting (e.g., only log on state transitions).
packages/jumpstarter/jumpstarter/config/exporter.py (2)
169-174: Guard against None grpcOptionsYAML configs can set
grpcOptions: null. PassingNoneinto the channel helper can backfire. Defaulting to{}is safer.- return aio_secure_channel(self.endpoint, credentials, self.grpcOptions) + return aio_secure_channel(self.endpoint, credentials, self.grpcOptions or {})
175-188: Shielded cleanup is good; consider symmetric aenter/aexit and None-safe optionsTwo tweaks:
- Call
__aenter__()so future changes to Exporter’s enter logic don’t get bypassed.- Pass
{}forgrpc_optionsto keep None out of the exporter.- exporter = None + exporter = None + entered = False try: exporter = Exporter( channel_factory=channel_factory, device_factory=ExporterConfigV1Alpha1DriverInstance(children=self.export).instantiate, tls=self.tls, - grpc_options=self.grpcOptions, + grpc_options=self.grpcOptions or {}, ) - await exporter.serve() + await exporter.__aenter__() + entered = True + await exporter.serve() finally: # Shield all cleanup operations from abrupt cancellation for clean shutdown - if exporter: + if exporter and entered: with CancelScope(shield=True): await exporter.__aexit__(None, None, None)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
packages/jumpstarter-cli/jumpstarter_cli/run.py(1 hunks)packages/jumpstarter/jumpstarter/config/exporter.py(2 hunks)packages/jumpstarter/jumpstarter/exporter/exporter.py(2 hunks)packages/jumpstarter/jumpstarter/exporter/session.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/jumpstarter/jumpstarter/exporter/session.py
- packages/jumpstarter-cli/jumpstarter_cli/run.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#584
File: packages/jumpstarter/jumpstarter/client/grpc.py:38-54
Timestamp: 2025-08-18T07:13:37.602Z
Learning: User michalskrivanek prefers inline code over small helper function extractions when the abstraction doesn't significantly improve readability or reduce complexity. They value straightforward, readable code over reducing minor duplication.
📚 Learning: 2025-08-14T13:11:35.034Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Applied to files:
packages/jumpstarter/jumpstarter/exporter/exporter.py
🧬 Code Graph Analysis (2)
packages/jumpstarter/jumpstarter/config/exporter.py (1)
packages/jumpstarter/jumpstarter/exporter/exporter.py (2)
Exporter(24-165)serve(119-165)
packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
packages/jumpstarter-protocol/jumpstarter_protocol/jumpstarter/v1/jumpstarter_pb2_grpc.py (1)
ControllerServiceStub(9-68)packages/jumpstarter/jumpstarter/config/exporter.py (1)
channel_factory(165-173)packages/jumpstarter/conftest.py (1)
Unregister(59-60)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: pytest-matrix (macos-15, 3.11)
- GitHub Check: pytest-matrix (macos-15, 3.12)
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
- GitHub Check: e2e
🔇 Additional comments (3)
packages/jumpstarter/jumpstarter/exporter/exporter.py (1)
150-157: Correct cancellation on lease changeCancelling the task group and breaking ensures all spawned tasks (status/listen/handle and session) wind down; with the fork-based supervisor this cleanly returns control to the parent for a restart. Looks good.
packages/jumpstarter/jumpstarter/config/exporter.py (2)
161-162: Nice: local CancelScope import avoids cyclesLocal import keeps the module graph clean and avoids circular import headaches during CLI startup/shutdown.
175-188: No configs setgrpcOptions: null
I searched all YAML and JSON configuration files forgrpcOptions: nullorgrpcOptions: ~and found no occurrences. It’s safe to passself.grpcOptionsdirectly without additional null checks.
|
it's a great test. i focused to much on signals testing i missed the most obvious obvious one when no signals are involved, just release the lease |
|
Nice, so that cancel_scope was important, In theory ChatGPT assured that the with would cancel it, but it doesn't?... The Unit tests failed, but I think it was a transient qemu issue, re-running them |
|
Github is not re-running all the unit tests, but it needed a rebase anyway. |
Head branch was pushed to by a user without write access
0d5b1db to
b923cbe
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
packages/jumpstarter/jumpstarter/config/exporter.py (1)
176-191: Guard aexit behind a successful aenter to avoid unsafe cleanupIf
await exporter.__aenter__()raises, calling__aexit__is unsafe and can mask the original error. Track whether the context was successfully entered.Apply this diff:
exporter = None + entered = False try: exporter = Exporter( channel_factory=channel_factory, device_factory=ExporterConfigV1Alpha1DriverInstance(children=self.export).instantiate, tls=self.tls, grpc_options=self.grpcOptions, ) # Initialize the exporter (registration, etc.) - await exporter.__aenter__() + await exporter.__aenter__() + entered = True yield exporter finally: # Shield all cleanup operations from abrupt cancellation for clean shutdown - if exporter: + if exporter and entered: with CancelScope(shield=True): await exporter.__aexit__(None, None, None)packages/jumpstarter-cli/jumpstarter_cli/run.py (1)
43-57: Catch AnyIO cancellation to ensure clean exit code mappingWhen
exporter.stop()cancels its task group,await exporter.serve()will raise AnyIO’s cancellation exception. Without catching it,anyio.run()raises and bypasses your exit-code mapping. Catch the cancellation exception and continue to the return mapping.Apply this diff:
-from anyio import create_task_group, open_signal_receiver +from anyio import create_task_group, open_signal_receiver, get_cancelled_exc_class @@ async with config.create_exporter() as exporter: try: await exporter.serve() + except get_cancelled_exc_class(): + # Graceful cancellation (e.g., due to signals) – proceed to map exit/restart + pass except* Exception as excgroup: from jumpstarter_cli_common.exceptions import leaf_exceptions for exc in leaf_exceptions(excgroup): if not isinstance(exc, anyio.get_cancelled_exc_class()): click.echo( f"Exception while serving on the exporter: {type(exc).__name__}: {exc}", err=True, )
🧹 Nitpick comments (2)
packages/jumpstarter/jumpstarter/config/exporter.py (1)
167-175: Channel factory typing vs. usage mismatch
channel_factory()is awaited by callers (e.g.,await self.channel_factory()in exporter code), so its type is effectivelyCallable[[], Awaitable[grpc.aio.Channel]]. Consider aligning the annotation in theExporterclass to avoid confusion in type checkers and IDEs.If agreeable, I can provide the corresponding change in
Exporterto:
- import
Awaitable- annotate
channel_factory: Callable[[], Awaitable[grpc.aio.Channel]]packages/jumpstarter/jumpstarter/exporter/exporter.py (1)
32-49: Graceful stop API looks good; minor type annotation improvementThe stop semantics with immediate vs. wait-for-lease-exit are sensible and match the forked-supervisor model. One nit:
channel_factoryis awaited elsewhere, so its annotation should beCallable[[], Awaitable[grpc.aio.Channel]]to match usage.I can send a small patch to import
Awaitableand adjust the annotation if you confirm.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
packages/jumpstarter-cli/jumpstarter_cli/run.py(1 hunks)packages/jumpstarter/jumpstarter/config/exporter.py(1 hunks)packages/jumpstarter/jumpstarter/exporter/exporter.py(4 hunks)packages/jumpstarter/jumpstarter/exporter/session.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/jumpstarter/jumpstarter/exporter/session.py
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#584
File: packages/jumpstarter/jumpstarter/client/grpc.py:38-54
Timestamp: 2025-08-18T07:13:37.602Z
Learning: User michalskrivanek prefers inline code over small helper function extractions when the abstraction doesn't significantly improve readability or reduce complexity. They value straightforward, readable code over reducing minor duplication.
📚 Learning: 2025-08-14T13:11:35.034Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Applied to files:
packages/jumpstarter/jumpstarter/exporter/exporter.pypackages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO 4.6.2+, `open_signal_receiver()` is a regular synchronous context manager that should be used with a plain `with` statement, not `async with`. However, it returns an asynchronous iterator that must be consumed with `async for`. The correct usage pattern is: `with open_signal_receiver(...) as signals: async for signum in signals:`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO, `open_signal_receiver()` returns a synchronous context manager (used with regular `with`), not an asynchronous one. While the context manager is synchronous, it yields an asynchronous iterator that should be consumed with `async for`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-08-14T13:20:51.953Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter-cli/jumpstarter_cli/run.py:15-15
Timestamp: 2025-08-14T13:20:51.953Z
Learning: Signal 0 is not a valid receivable signal in Unix systems. It can only be used as an argument to kill(pid, 0) to test if a process exists, but doesn't actually send a signal. Receivable signals are positive integers (e.g., SIGINT=2, SIGTERM=15, SIGHUP=1, SIGQUIT=3). Using 0 to represent "no signal received" in signal handling code is correct.
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
🧬 Code Graph Analysis (3)
packages/jumpstarter/jumpstarter/config/exporter.py (3)
packages/jumpstarter/jumpstarter/exporter/exporter.py (2)
Exporter(25-189)serve(139-189)packages/jumpstarter/jumpstarter/common/grpc.py (2)
ssl_channel_credentials(16-46)aio_secure_channel(49-54)packages/jumpstarter/jumpstarter/config/grpc.py (1)
call_credentials(6-20)
packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
packages/jumpstarter-protocol/jumpstarter_protocol/jumpstarter/v1/jumpstarter_pb2_grpc.py (1)
ControllerServiceStub(9-68)packages/jumpstarter/jumpstarter/config/exporter.py (1)
channel_factory(167-174)packages/jumpstarter/conftest.py (1)
Unregister(59-60)
packages/jumpstarter-cli/jumpstarter_cli/run.py (4)
packages/jumpstarter-cli-common/jumpstarter_cli_common/config.py (1)
opt_config(99-100)packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (2)
handle_exceptions(32-46)leaf_exceptions(78-102)packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
stop(36-49)serve(139-189)status(148-167)packages/jumpstarter/jumpstarter/config/exporter.py (2)
create_exporter(160-191)serve(193-195)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Redirect rules - jumpstarter-docs
- GitHub Check: Header rules - jumpstarter-docs
- GitHub Check: Pages changed - jumpstarter-docs
- GitHub Check: pytest-matrix (macos-15, 3.12)
- GitHub Check: e2e
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
- GitHub Check: pytest-matrix (macos-15, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
🔇 Additional comments (3)
packages/jumpstarter/jumpstarter/config/exporter.py (1)
193-196: LGTM: serve() now centralizes lifecycle via create_exporterUsing
async with self.create_exporter()keeps lifecycle concerns localized and ensures teardown shielding is applied uniformly.packages/jumpstarter/jumpstarter/exporter/exporter.py (1)
169-189: Serve loop and shutdown sequencing are consistent with the forked modelTracking
self._tg, usingstop()to cancel the task group, and breaking out on lease changes or stop requests ensures the child process exits cleanly so the parent can restart it. This aligns with the PR’s supervision goals.packages/jumpstarter-cli/jumpstarter_cli/run.py (1)
65-78: LGTM: Signal forwarding is minimal and avoids double-wait racesForwarding SIGINT/SIGTERM/SIGHUP/SIGQUIT to the child without
waitpid()inside the handler prevents reaping the child twice and avoids handler-side blocking.
b923cbe to
f144c17
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
86-107: Controller channel leak in session(): close the channel after Registersession() opens a controller channel but never closes it. Over time (restarts, errors) this can leak FDs/memory. Close it in a finally.
Apply:
- controller = jumpstarter_pb2_grpc.ControllerServiceStub(await self.channel_factory()) + channel = await self.channel_factory() + controller = jumpstarter_pb2_grpc.ControllerServiceStub(channel) with Session( uuid=self.uuid, labels=self.labels, root_device=self.device_factory(), ) as session: async with session.serve_unix_async() as path: async with grpc.aio.secure_channel( f"unix://{path}", grpc.local_channel_credentials(grpc.LocalConnectionType.UDS) ) as channel: response = await jumpstarter_pb2_grpc.ExporterServiceStub(channel).GetReport(empty_pb2.Empty()) logger.info("Registering exporter with controller") await controller.Register( jumpstarter_pb2.RegisterRequest( labels=self.labels, reports=response.reports, ) ) self.registered = True yield path + await channel.close()If you prefer to be extra safe in the presence of exceptions:
- controller = jumpstarter_pb2_grpc.ControllerServiceStub(await self.channel_factory()) + channel = await self.channel_factory() + try: + controller = jumpstarter_pb2_grpc.ControllerServiceStub(channel) ... - yield path + yield path + finally: + await channel.close()
111-135: Listen stream channel leak; wrap each attempt with a channel lifecyclelisten() establishes a new channel per attempt but never closes it on error/EOF. Close the channel in a finally to avoid leaks on retries.
- async def listen(retries=5, backoff=3): + async def listen(retries=5, backoff=3): retries_left = retries while True: - try: - controller = jumpstarter_pb2_grpc.ControllerServiceStub(await self.channel_factory()) + channel = await self.channel_factory() + try: + controller = jumpstarter_pb2_grpc.ControllerServiceStub(channel) async for request in controller.Listen(jumpstarter_pb2.ListenRequest(lease_name=lease_name)): await listen_tx.send(request) except Exception as e: if retries_left > 0: retries_left -= 1 logger.info( "Listen stream interrupted, restarting in {}s, {} retries left: {}".format( backoff, retries_left, e ) ) await sleep(backoff) else: raise else: retries_left = retries + finally: + await channel.close()
152-172: Status stream channel leak; same fix as listen()The status() loop also creates channels per attempt without closing them. Add a finally to close the channel before retrying.
- async def status(retries=5, backoff=3): + async def status(retries=5, backoff=3): retries_left = retries while True: - try: - controller = jumpstarter_pb2_grpc.ControllerServiceStub(await self.channel_factory()) + channel = await self.channel_factory() + try: + controller = jumpstarter_pb2_grpc.ControllerServiceStub(channel) async for status in controller.Status(jumpstarter_pb2.StatusRequest()): await status_tx.send(status) except Exception as e: if retries_left > 0: retries_left -= 1 logger.info( "Status stream interrupted, restarting in {}s, {} retries left: {}".format( backoff, retries_left, e ) ) await sleep(backoff) else: raise else: retries_left = retries + finally: + await channel.close()packages/jumpstarter-cli/jumpstarter_cli/run.py (1)
107-113: CLI does not propagate exit code to the OS
clickcommands ignore return values by default; returning an int won’t set the process’ exit status. The PR goal is to propagate the child’s exit code back to the parent/CLI. Wrap the result withsys.exit(...)(orraise click.Exit(code)).Apply this diff:
def run(config): """Run an exporter locally.""" - return _serve_with_exc_handling(config) + sys.exit(_serve_with_exc_handling(config))
♻️ Duplicate comments (1)
packages/jumpstarter-cli/jumpstarter_cli/run.py (1)
79-92: SIGHUP should trigger restart; parent exit-code mapping currently terminates insteadPer the PR objective, SIGHUP must restart the child while keeping the parent running. Right now, the child returns
1on SIGHUP and the parent maps any non‑zero exit code to128 + code, resulting in termination with 129 instead of a restart.Proposed fix: treat exit code
0andSIGHUPas “restart,” and also handle the WIFSIGNALED case for completeness.Is the intended behavior to always restart on SIGHUP regardless of whether the child exits normally or is killed by the signal? If yes, apply:
@@ - if os.WIFEXITED(status): - # Interpret child exit code - child_exit_code = os.WEXITSTATUS(status) - if child_exit_code == 0: - return None # restart child (unexpected exit/exception) - else: - # Child indicates termination (signal number) - return 128 + child_exit_code # Return standard Unix exit code - else: - # Child killed by unhandled signal - terminate - child_exit_signal = os.WTERMSIG(status) if os.WIFSIGNALED(status) else 0 - click.echo(f"Child killed by unhandled signal: {child_exit_signal}", err=True) - return 128 + child_exit_signal + if os.WIFEXITED(status): + # Child encodes: 0 -> restart, signal number -> termination (unless SIGHUP) + child_exit_code = os.WEXITSTATUS(status) + if child_exit_code in (0, signal.SIGHUP): + return None # restart child + return 128 + child_exit_code # standard Unix mapping for other signals + elif os.WIFSIGNALED(status): + sig = os.WTERMSIG(status) + if sig == signal.SIGHUP: + return None # restart child on SIGHUP + click.echo(f"Child killed by unhandled signal: {sig} ({signal.Signals(sig).name})", err=True) + return 128 + sig + else: + click.echo("Child terminated unexpectedly.", err=True) + return 1
🧹 Nitpick comments (35)
packages/jumpstarter/jumpstarter/common/ipaddr.py (1)
29-47: Add return type, handle missing binary, and improve error messagesThe async implementation looks good. Two improvements will make it more robust:
- Return type annotation clarifies the API.
- Catch FileNotFoundError to surface a friendlier error when minikube isn’t installed or not in PATH.
- Include exit code and trimmed stderr in the RuntimeError for easier troubleshooting.
Apply this diff:
-async def get_minikube_ip(profile: str = None, minikube: str = "minikube"): +async def get_minikube_ip(profile: str | None = None, minikube: str = "minikube") -> str: - # Create the subprocess with optional profile - cmd = [minikube, "ip"] - if profile: - cmd.extend(["-p", profile]) - - process = await asyncio.create_subprocess_exec(*cmd, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE) + # Create the subprocess with optional profile + cmd = [minikube, "ip"] + if profile: + cmd.extend(["-p", profile]) + + try: + process = await asyncio.create_subprocess_exec( + *cmd, + stdout=asyncio.subprocess.PIPE, + stderr=asyncio.subprocess.PIPE, + ) + except FileNotFoundError as e: + raise RuntimeError(f"{minikube!r} not found in PATH") from e # Wait for it to complete and get the output - stdout, stderr = await process.communicate() + stdout, stderr = await process.communicate() # Decode and strip whitespace - result = stdout.decode().strip() + result = stdout.decode().strip() # Optional: check if command was successful - if process.returncode != 0: - raise RuntimeError(stderr.decode()) + if process.returncode != 0: + err = stderr.decode().strip() or "unknown error" + raise RuntimeError(f"{minikube} ip failed (exit {process.returncode}): {err}") return resultpackages/jumpstarter/jumpstarter/common/ipaddr_test.py (2)
22-27: Prefer asyncio.subprocess.PIPE over magic -1 values in assertionsUsing constants instead of raw -1 improves clarity and resilience across Python versions.
Apply this diff:
@@ -from unittest.mock import AsyncMock, patch +from unittest.mock import AsyncMock, patch @@ -import pytest +import pytest +import asyncio @@ - mock_subprocess.assert_called_once_with( - "minikube", - "ip", - stdout=-1, - stderr=-1, # asyncio.subprocess.PIPE constants - ) + mock_subprocess.assert_called_once_with( + "minikube", + "ip", + stdout=asyncio.subprocess.PIPE, + stderr=asyncio.subprocess.PIPE, + ) @@ - mock_subprocess.assert_called_once_with("minikube", "ip", "-p", "test-profile", stdout=-1, stderr=-1) + mock_subprocess.assert_called_once_with( + "minikube", "ip", "-p", "test-profile", + stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE + ) @@ - mock_subprocess.assert_called_once_with("custom-minikube", "ip", stdout=-1, stderr=-1) + mock_subprocess.assert_called_once_with( + "custom-minikube", "ip", + stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE + ) @@ - mock_subprocess.assert_called_once_with("my-minikube", "ip", "-p", "my-profile", stdout=-1, stderr=-1) + mock_subprocess.assert_called_once_with( + "my-minikube", "ip", "-p", "my-profile", + stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE + )Also applies to: 40-41, 53-54, 77-77
55-65: Add test for missing minikube binary (FileNotFoundError path)If the binary is missing, the function should raise a clear RuntimeError. This test covers that branch and complements the improved error handling.
Apply this diff:
@@ async def test_get_minikube_ip_failure(self, mock_subprocess): @@ with pytest.raises(RuntimeError, match="error: cluster not found"): await get_minikube_ip() + + @pytest.mark.asyncio + @patch("asyncio.create_subprocess_exec") + async def test_get_minikube_ip_not_installed(self, mock_subprocess): + mock_subprocess.side_effect = FileNotFoundError() + with pytest.raises(RuntimeError, match="not found in PATH"): + await get_minikube_ip()docs/source/getting-started/installation/service/service-local.md (3)
20-22: Fix typo and clarify wording in IP detection warningSmall grammar fix and clearer guidance.
Apply this diff:
-```{warning} -Sometimes the automatic IP address detection for will not work correctly, to check if Jumpstarter can determine your IP address, run `jmp admin ip`. If the IP address cannot be determined, use the `--ip` argument to manually set your IP address. -``` +```{warning} +Sometimes automatic IP address detection does not work. To check if Jumpstarter can determine your IP address, run `jmp admin ip`. If it cannot, use the `--ip` argument to set your IP address explicitly. +```
28-34: Avoid emphasis-as-heading (MD036) and tighten the minikube tipConverting emphasis to headings resolves the linter warning and improves structure. The tip can be one concise sentence.
Apply this diff:
-[**kind**](https://kind.sigs.k8s.io/docs/user/quick-start/) (Kubernetes in Docker) is a tool for running local Kubernetes clusters using Docker or Podman containerized "nodes". It's lightweight and fast to start, making it excellent for CI/CD pipelines and quick local testing. +#### kind +[kind](https://kind.sigs.k8s.io/docs/user/quick-start/) runs local Kubernetes clusters using Docker or Podman containerized "nodes". It's lightweight and fast to start, making it excellent for CI/CD pipelines and quick local testing. -[**minikube**](https://minikube.sigs.k8s.io/docs/start/) runs local Kubernetes clusters using VMs or container "nodes". It works across several platforms and supports different hypervisors, making it ideal for local development and testing. It's particularly useful in environments requiring untrusted certificates. +#### minikube +[minikube](https://minikube.sigs.k8s.io/docs/start/) runs local Kubernetes clusters using VMs or container "nodes". It works across platforms and supports different hypervisors. ```{tip} -Consider minikube for environments requiring [untrusted certificates](https://minikube.sigs.k8s.io/docs/handbook/untrusted_certs/). +In environments with untrusted certificates, minikube may be simpler to configure. See the [minikube guide](https://minikube.sigs.k8s.io/docs/handbook/untrusted_certs/).--- `176-217`: **Double-check NodePort ranges vs. chart ports** The minikube example sets the API server NodePort range to 8000-9000 and configures the chart NodePorts to 8082/8083, which is consistent. The kind example uses a wider NodePort range (3000-32767) and maps host ports 8082/8083 to NodePorts 30010/30011 in the kind config. Consider adding a short note to avoid confusion when switching between kind and minikube. Would you like me to add a short note clarifying: - kind uses NodePorts 30010/30011 (mapped to host 8082/8083), - minikube uses NodePorts 8082/8083 directly, after expanding the range to 8000-9000? </blockquote></details> <details> <summary>docs/source/getting-started/installation/service/service-production.md (3)</summary><blockquote> `36-45`: **Minor grammar and clarity improvements in TLS options** Tighten wording and remove sentence fragments. Apply this diff: ```diff -**Option 1: TLS Termination at Ingress/Route (Recommended)** -- Terminate TLS at the ingress controller or OpenShift route -- Simpler certificate management -- Better performance with fewer encryption hops +**Option 1: TLS termination at Ingress/Route (recommended)** +- Terminate TLS at the ingress controller or OpenShift Route. +- Simpler certificate management. +- Fewer encryption hops. -**Option 2: End-to-End TLS** -- TLS from client to Jumpstarter service -- Higher security but more complex certificate management -- Required for strict compliance environments +**Option 2: End-to-end TLS** +- TLS from client to the Jumpstarter service. +- Higher security but more complex certificate management. +- Required for strict compliance environments.
25-31: Add ingress annotations example for gRPC on common controllersProduction issues often trace back to missing gRPC annotations. Consider adding a quick example for NGINX Ingress to minimize support load.
Proposed addition (right after the “gRPC Requirements” bullets):
+For NGINX Ingress, add: + +```yaml +metadata: + annotations: + nginx.ingress.kubernetes.io/backend-protocol: "GRPC" +``` + +For other ingress controllers, consult their gRPC guides for the equivalent settings.
77-105: Recommend using Secrets instead of inlining certificates in RouteInlined certs/keys in manifests can leak via Git history. Prefer referencing a Secret.
Proposed alternative snippet:
apiVersion: v1 kind: Secret metadata: name: jumpstarter-grpc-tls type: kubernetes.io/tls data: tls.crt: <base64 PEM> tls.key: <base64 KEY> --- apiVersion: route.openshift.io/v1 kind: Route metadata: name: jumpstarter-grpc spec: host: grpc.jumpstarter.example.com tls: termination: edge certificate: null key: null to: kind: Service name: jumpstarter-controller-grpcIf you want, I can add a short section explaining how to create the Secret from PEM files.
packages/jumpstarter-kubernetes/jumpstarter_kubernetes/cluster.py (2)
28-35: Consider streaming stderr on failures to surface errors
run_command_with_outputdrops stderr/stdout in the exception path, making triage harder. Optional: add a variant that captures stderr on non-zero exit and includes it in the raised message.If you want, I can add
run_command_with_diagnostics(cmd: list[str]) -> Nonethat:
- streams output live, and
- on non-zero exit, also returns captured tail of stderr (e.g., last 200 lines) to include in the error.
93-128: Minikube create path is reasonable; consider a timeout as a future improvementThe flow (exists -> optionally delete -> start with explicit NodePort range) is sound. If long-running CI tasks are a concern, a future enhancement could allow passing a timeout to abort hung starts.
If you see occasional hangs in CI, I can wire a
timeout: Optional[float]parameter and propagate it torun_command_with_output.packages/jumpstarter-kubernetes/jumpstarter_kubernetes/cluster_test.py (4)
178-202: Assert Kind config is piped via stdin to prevent silent regressionsThe Kind creation flow depends on passing a config via --config=/dev/stdin. Strengthen the test to assert this invariant so a future change (e.g., dropping stdin piping) doesn’t pass unnoticed.
Apply this diff inside the current test to assert the flag:
args, kwargs = mock_subprocess.call_args assert args[0] == "kind" assert args[1] == "create" assert args[2] == "cluster" assert "--name" in args assert "test-cluster" in args + assert "--config=/dev/stdin" in args assert kwargs["stdin"] == asyncio.subprocess.PIPE
247-261: Cover Kind creation with non-zero return code (failure path)We exercise success, not-installed, already-exists, and force-recreate paths, but not the failure path (non-zero return code). Add a test to ensure a non-zero return code from the Kind process raises the expected RuntimeError.
Example test to append:
+ @pytest.mark.asyncio + @patch("jumpstarter_kubernetes.cluster.kind_installed") + @patch("jumpstarter_kubernetes.cluster.kind_cluster_exists") + @patch("asyncio.create_subprocess_exec") + async def test_create_kind_cluster_failure(self, mock_subprocess, mock_cluster_exists, mock_kind_installed): + mock_kind_installed.return_value = True + mock_cluster_exists.return_value = False + mock_process = AsyncMock() + mock_process.returncode = 1 + mock_process.communicate.return_value = (b"", b"boom") + mock_subprocess.return_value = mock_process + + with pytest.raises(RuntimeError, match="Failed to create Kind cluster 'test-cluster'"): + await create_kind_cluster("kind", "test-cluster")
262-275: Exercise Kind deletion failure pathWe test success, not installed, and already deleted for Kind. Add a failure test mirroring the Minikube failure test to pin the error contract.
Example test to append:
+ @pytest.mark.asyncio + @patch("jumpstarter_kubernetes.cluster.kind_installed") + @patch("jumpstarter_kubernetes.cluster.kind_cluster_exists") + @patch("jumpstarter_kubernetes.cluster.run_command_with_output") + async def test_delete_kind_cluster_failure(self, mock_run_command, mock_cluster_exists, mock_kind_installed): + mock_kind_installed.return_value = True + mock_cluster_exists.return_value = True + mock_run_command.return_value = 1 + + with pytest.raises(RuntimeError, match="Failed to delete Kind cluster 'test-cluster'"): + await delete_kind_cluster("kind", "test-cluster")
110-172: Add a Minikube exists “false” branch test for parity with KindWe have the true-path and “not installed” tests for Minikube, but not the case where the command returns non-zero (cluster absent). Add a simple “false” test to keep parity with the Kind suite.
Example test to append:
+ @pytest.mark.asyncio + @patch("jumpstarter_kubernetes.cluster.minikube_installed") + @patch("jumpstarter_kubernetes.cluster.run_command") + async def test_minikube_cluster_exists_false(self, mock_run_command, mock_minikube_installed): + mock_minikube_installed.return_value = True + mock_run_command.return_value = (1, "", "not running") + result = await minikube_cluster_exists("minikube", "test-cluster") + assert result is Falsepackages/jumpstarter-cli-admin/jumpstarter_cli_admin/install_test.py (3)
287-333: Consider adding a focused test for the ip CLI commandWe test get_ip_generic directly, but not the ip CLI wrapper. A small test would validate option wiring and echo behavior.
Example to append:
+ def test_ip_command_help(self): + runner = CliRunner() + result = runner.invoke(ip, ["--help"]) + assert result.exit_code == 0 + assert "--cluster-name" in result.output + + @patch("jumpstarter_cli_admin.install._validate_cluster_type") + @patch("jumpstarter_cli_admin.install.get_ip_generic") + def test_ip_command_kind(self, mock_get_ip_generic, mock_validate_cluster_type): + mock_validate_cluster_type.return_value = "kind" + mock_get_ip_generic.return_value = "192.168.1.10" + runner = CliRunner() + result = runner.invoke(ip, ["--kind", "kind"]) + assert result.exit_code == 0 + assert "192.168.1.10" in result.output
252-265: Add negative test for Minikube IP when binary is missingThere’s no test asserting we raise ClickException when Minikube is requested but not installed. This guards the user experience.
Example to append:
+ @pytest.mark.asyncio + @patch("jumpstarter_cli_admin.install.minikube_installed") + async def test_get_ip_generic_minikube_not_installed(self, mock_minikube_installed): + mock_minikube_installed.return_value = False + with pytest.raises(click.ClickException, match="minikube is not installed"): + await get_ip_generic("minikube", "minikube", "test-cluster")
607-617: Uninstall happy-path is covered; consider asserting user output for UX stabilityOptionally assert on key echo lines (e.g., “Uninstalling Jumpstarter service…”, “Uninstalled Helm release…”) to lock-in UX strings.
packages/jumpstarter-cli-admin/jumpstarter_cli_admin/install.py (8)
102-110: Respect quoted/escaped extra args using shlex.splitPlain .split() breaks when users pass quoted values (e.g., --arg="foo bar"). shlex.split handles shell-like quoting and is the standard for CLI arg parsing.
Apply this diff and import shlex:
@@ - extra_args_list = kind_extra_args.split() if kind_extra_args.strip() else [] + import shlex + extra_args_list = shlex.split(kind_extra_args) if kind_extra_args.strip() else []Do the same in _create_minikube_cluster below.
127-135: Mirror shlex.split for Minikube extra argsKeep parsing behavior consistent across both cluster flows.
- extra_args_list = minikube_extra_args.split() if minikube_extra_args.strip() else [] + import shlex + extra_args_list = shlex.split(minikube_extra_args) if minikube_extra_args.strip() else []
202-211: Fix user-facing typo: “gPRC Mode” -> “gRPC Mode”Minor UX polish in the install output.
- click.echo(f"gPRC Mode: {mode}\n") + click.echo(f"gRPC Mode: {mode}\n")
255-273: Clarify --kind/--minikube options: they take an optional binary path, not flagsThe current declaration uses is_flag=False with flag_value which is slightly confusing and redundant. Declaring them as optional string options improves clarity and help text. Behavior remains the same and aligns with how you compute kind or "kind".
-@click.option( - "--kind", is_flag=False, flag_value="kind", default=None, help="Use default settings for a local Kind cluster" -) -@click.option( - "--minikube", - is_flag=False, - flag_value="minikube", - default=None, - help="Use default settings for a local Minikube cluster", -) +@click.option( + "--kind", + type=str, + default=None, + metavar="PATH", + help='Enable Kind workflow; optionally set the Kind binary path (default: "kind")', +) +@click.option( + "--minikube", + type=str, + default=None, + metavar="PATH", + help='Enable Minikube workflow; optionally set the Minikube binary path (default: "minikube")', +)Note: This is aligned with the preference for straightforward, inline code (no extra helpers). It doesn’t change downstream logic in _validate_cluster_type or in how you pass kind/minikube to helpers.
328-336: Apply the same option clarification to the ip commandKeep CLI consistency across commands.
-@click.option( - "--kind", is_flag=False, flag_value="kind", default=None, help="Use default settings for a local Kind cluster" -) -@click.option( - "--minikube", - is_flag=False, - flag_value="minikube", - default=None, - help="Use default settings for a local Minikube cluster", -) +@click.option( + "--kind", + type=str, + default=None, + metavar="PATH", + help='Use Kind workflow; optionally set the Kind binary path (default: "kind")', +) +@click.option( + "--minikube", + type=str, + default=None, + metavar="PATH", + help='Use Minikube workflow; optionally set the Minikube binary path (default: "minikube")', +)
359-367: Apply the same option clarification to the uninstall commandFor consistency and clearer UX.
-@click.option( - "--kind", is_flag=False, flag_value="kind", default=None, help="Delete the local Kind cluster after uninstalling" -) -@click.option( - "--minikube", - is_flag=False, - flag_value="minikube", - default=None, - help="Delete the local Minikube cluster after uninstalling", -) +@click.option( + "--kind", + type=str, + default=None, + metavar="PATH", + help='Delete the local Kind cluster after uninstalling; optionally set the Kind binary path', +) +@click.option( + "--minikube", + type=str, + default=None, + metavar="PATH", + help='Delete the local Minikube cluster after uninstalling; optionally set the Minikube binary path', +)
188-216: Optional: wrap Helm install failures into ClickException for cleaner CLI UXToday, any exception bubbles up. Converting to ClickException preserves traceback suppression and gives a friendly error. Tests that patch install_helm_chart to raise RuntimeError will still pass if you re-raise as ClickException.
Example:
- await install_helm_chart( - chart, name, namespace, basedomain, grpc_endpoint, router_endpoint, mode, version, kubeconfig, context, helm - ) + try: + await install_helm_chart( + chart, + name, + namespace, + basedomain, + grpc_endpoint, + router_endpoint, + mode, + version, + kubeconfig, + context, + helm, + ) + except Exception as e: + raise click.ClickException(f"Helm installation failed: {e}") from e
218-231: get_ip_generic logic is clear; minor naming nitThe parameter minikube represents the binary path. Consider renaming to minikube_binary for readability (no functional change).
packages/jumpstarter-cli/jumpstarter_cli/config.py (1)
9-11: Docstring addition aligns CLI help; minor style nitGood to see the group help surfaced via the docstring. Consider a concise one-line summary ending with a period to match PEP 257 and Click’s short-help extraction.
Apply this minimal tweak:
- """ - Manage local configurations - """ + """Manage local configurations."""packages/jumpstarter/jumpstarter/client/grpc.py (3)
83-90: Exporter model aligned with proto; consider mapping to a plain dictfrom_protobuf passes data.labels (a map field) directly. Pydantic handles mappings fine, but if you foresee serialization quirks, consider dict(data.labels) for a plain dict cast. Not required.
- return cls(namespace=namespace, name=name, labels=data.labels, online=data.online) + return cls(namespace=namespace, name=name, labels=dict(data.labels), online=data.online)
95-107: Lease display extraction is correct; handle tz-aware timestamps defensivelyFormatting effective_begin_time via strftime is fine; if there’s any chance of tz-naive datetimes slipping in, guard with .astimezone() before formatting. Optional.
- if self.lease.effective_begin_time: - start_time = self.lease.effective_begin_time.strftime("%Y-%m-%d %H:%M:%S") + if self.lease.effective_begin_time: + dt = self.lease.effective_begin_time + if dt.tzinfo is None: + dt = dt.astimezone() + start_time = dt.strftime("%Y-%m-%d %H:%M:%S")
221-249: De-duplicate exclude-fields logic between model_dump_json and model_dumpBoth methods build the same exclude set. Factor it into a small helper to keep behavior in sync and reduce drift.
Apply this diff within the changed segments, and add the helper below (outside the selected range):
- # Determine which fields to exclude - exclude_fields = set() - if not self.include_leases: - exclude_fields.add("lease") - if not self.include_online: - exclude_fields.add("online") + exclude_fields = self._exclude_fields() data = { "exporters": [ - exporter.model_dump(mode="json", exclude=exclude_fields) for exporter in self.exporters + exporter.model_dump(mode="json", exclude=exclude_fields) for exporter in self.exporters ] } return json.dumps(data, **json_kwargs)- exclude_fields = set() - if not self.include_leases: - exclude_fields.add("lease") - if not self.include_online: - exclude_fields.add("online") + exclude_fields = self._exclude_fields() return { "exporters": [ exporter.model_dump(mode="json", exclude=exclude_fields) for exporter in self.exporters ] }And add this helper in the class:
def _exclude_fields(self) -> set[str]: exclude_fields: set[str] = set() if not self.include_leases: exclude_fields.add("lease") if not self.include_online: exclude_fields.add("online") return exclude_fieldspackages/jumpstarter/jumpstarter/client/grpc_test.py (2)
172-179: Positive assertion on column count; consider explicit header assertionsYou already assert counts; optionally assert concrete headers to catch regressions in column order.
- assert len(table.columns) == 5 # NAME, LABELS, LEASED BY, LEASE STATUS, START TIME + headers = [c.header for c in table.columns] + assert headers == ["NAME", "LABELS", "LEASED BY", "LEASE STATUS", "START TIME"]
221-258: Online indicator assertions are good; add JSON gating tests for include_ flags*To fully exercise ExporterList serialization toggles, add tests that verify online/lease are omitted unless opted in.
Proposed additions:
def test_exporter_list_model_dump_json_gating(): exporters = [ Exporter(namespace="ns", name="e1", labels={}, online=True), ] el = SimpleNamespace() # stand-in for ExporterList-like behaviorOr, if you prefer a concrete test within this suite:
from jumpstarter.client.grpc import ExporterList def test_exporter_list_model_dump_json_gating(): el = ExporterList(exporters=[Exporter(namespace="ns", name="e1", labels={"k":"v"}, online=True)]) # Defaults: both fields excluded data = json.loads(el.model_dump_json()) assert "online" not in data["exporters"][0] assert "lease" not in data["exporters"][0] # Enable online only el.include_online = True el.include_leases = False data = json.loads(el.model_dump_json()) assert data["exporters"][0]["online"] is True assert "lease" not in data["exporters"][0] # Enable both el.exporters[0].lease = Mock(spec=Lease) el.include_leases = True data = json.loads(el.model_dump_json()) assert "lease" in data["exporters"][0]packages/jumpstarter/jumpstarter/exporter/exporter.py (1)
77-84: Log level on connection failures__handle logs connection failures at info level. If this is routine noise during shutdown or lease changes, fine. Otherwise, consider warning or include exc_info to aid debugging.
- logger.info("failed to handle connection: {}".format(e)) + logger.warning("failed to handle connection: %s", e, exc_info=True)packages/jumpstarter-cli/jumpstarter_cli/run.py (2)
22-36: Signal-loop control: break after terminal signals to avoid unnecessary wakeupsAfter a non‑SIGHUP (terminal) signal, you mark
signal_handled = Trueand continue looping, which keeps the receiver alive to ignore future signals. Breaking out simplifies control flow and avoids extra wakeups while shutdown is ongoing. SIGHUP should remain non-terminal to allow later SIGTERM to override.Apply this small refinement:
async for sig in signals: if signal_handled: - continue # Ignore duplicate signals + break # Stop listening once a terminating signal was processed @@ if exporter: # Terminate exporter. SIGHUP waits until current lease is let go. Later SIGTERM still overrides if received_signal != signal.SIGHUP: signal_handled = True + break exporter.stop(wait_for_lease_exit=received_signal == signal.SIGHUP)
47-55: Redundant cancellation check inside ExceptionGroup handler
except* Exceptioncannot contain anyio’s cancellation exception (it’s a BaseException subclass). Theisinstance(exc, anyio.get_cancelled_exc_class())check is therefore always False here. Consider simplifying to log all leaf exceptions.Apply:
- for exc in leaf_exceptions(excgroup): - if not isinstance(exc, anyio.get_cancelled_exc_class()): - click.echo( - f"Exception while serving on the exporter: {type(exc).__name__}: {exc}", - err=True, - ) + for exc in leaf_exceptions(excgroup): + click.echo( + f"Exception while serving on the exporter: {type(exc).__name__}: {exc}", + err=True, + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (27)
docs/source/getting-started/guides/setup-distributed-mode.md(1 hunks)docs/source/getting-started/installation/index.md(1 hunks)docs/source/getting-started/installation/service.md(0 hunks)docs/source/getting-started/installation/service/index.md(1 hunks)docs/source/getting-started/installation/service/service-local.md(1 hunks)docs/source/getting-started/installation/service/service-production.md(1 hunks)packages/jumpstarter-cli-admin/jumpstarter_cli_admin/delete.py(1 hunks)packages/jumpstarter-cli-admin/jumpstarter_cli_admin/install.py(4 hunks)packages/jumpstarter-cli-admin/jumpstarter_cli_admin/install_test.py(1 hunks)packages/jumpstarter-cli-common/jumpstarter_cli_common/opt.py(1 hunks)packages/jumpstarter-cli/jumpstarter_cli/config.py(1 hunks)packages/jumpstarter-cli/jumpstarter_cli/get.py(2 hunks)packages/jumpstarter-cli/jumpstarter_cli/get_test.py(1 hunks)packages/jumpstarter-cli/jumpstarter_cli/run.py(1 hunks)packages/jumpstarter-kubernetes/jumpstarter_kubernetes/__init__.py(2 hunks)packages/jumpstarter-kubernetes/jumpstarter_kubernetes/cluster.py(1 hunks)packages/jumpstarter-kubernetes/jumpstarter_kubernetes/cluster_test.py(1 hunks)packages/jumpstarter-kubernetes/jumpstarter_kubernetes/install.py(1 hunks)packages/jumpstarter-protocol/jumpstarter_protocol/jumpstarter/client/v1/client_pb2.py(3 hunks)packages/jumpstarter/jumpstarter/client/grpc.py(4 hunks)packages/jumpstarter/jumpstarter/client/grpc_test.py(1 hunks)packages/jumpstarter/jumpstarter/common/ipaddr.py(1 hunks)packages/jumpstarter/jumpstarter/common/ipaddr_test.py(1 hunks)packages/jumpstarter/jumpstarter/config/client.py(5 hunks)packages/jumpstarter/jumpstarter/config/exporter.py(1 hunks)packages/jumpstarter/jumpstarter/exporter/exporter.py(4 hunks)packages/jumpstarter/jumpstarter/exporter/session.py(1 hunks)
💤 Files with no reviewable changes (1)
- docs/source/getting-started/installation/service.md
✅ Files skipped from review due to trivial changes (5)
- docs/source/getting-started/guides/setup-distributed-mode.md
- docs/source/getting-started/installation/service/index.md
- packages/jumpstarter-kubernetes/jumpstarter_kubernetes/install.py
- docs/source/getting-started/installation/index.md
- packages/jumpstarter-cli-admin/jumpstarter_cli_admin/delete.py
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/jumpstarter/jumpstarter/exporter/session.py
- packages/jumpstarter/jumpstarter/config/exporter.py
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#584
File: packages/jumpstarter/jumpstarter/client/grpc.py:38-54
Timestamp: 2025-08-18T07:13:37.619Z
Learning: User michalskrivanek prefers inline code over small helper function extractions when the abstraction doesn't significantly improve readability or reduce complexity. They value straightforward, readable code over reducing minor duplication.
📚 Learning: 2025-05-28T15:09:35.768Z
Learnt from: NickCao
PR: jumpstarter-dev/jumpstarter#463
File: packages/jumpstarter-cli-admin/jumpstarter_cli_admin/get_test.py:270-270
Timestamp: 2025-05-28T15:09:35.768Z
Learning: The jumpstarter CLI is designed to match kubectl's behavior, including returning exit code 0 (success) when no resources are found, rather than exit code 1 (failure).
Applied to files:
docs/source/getting-started/installation/service/service-local.md
📚 Learning: 2025-08-14T13:11:35.034Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.pypackages/jumpstarter/jumpstarter/exporter/exporter.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO 4.6.2+, `open_signal_receiver()` is a regular synchronous context manager that should be used with a plain `with` statement, not `async with`. However, it returns an asynchronous iterator that must be consumed with `async for`. The correct usage pattern is: `with open_signal_receiver(...) as signals: async for signum in signals:`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO, `open_signal_receiver()` returns a synchronous context manager (used with regular `with`), not an asynchronous one. While the context manager is synchronous, it yields an asynchronous iterator that should be consumed with `async for`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-08-14T13:20:51.953Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter-cli/jumpstarter_cli/run.py:15-15
Timestamp: 2025-08-14T13:20:51.953Z
Learning: Signal 0 is not a valid receivable signal in Unix systems. It can only be used as an argument to kill(pid, 0) to test if a process exists, but doesn't actually send a signal. Receivable signals are positive integers (e.g., SIGINT=2, SIGTERM=15, SIGHUP=1, SIGQUIT=3). Using 0 to represent "no signal received" in signal handling code is correct.
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
🧬 Code graph analysis (13)
packages/jumpstarter-kubernetes/jumpstarter_kubernetes/cluster_test.py (1)
packages/jumpstarter-kubernetes/jumpstarter_kubernetes/cluster.py (9)
create_kind_cluster(130-192)create_minikube_cluster(93-127)delete_kind_cluster(77-90)delete_minikube_cluster(61-74)kind_cluster_exists(49-58)kind_installed(11-13)minikube_cluster_exists(37-46)minikube_installed(6-8)run_command_with_output(28-34)
packages/jumpstarter-cli/jumpstarter_cli/get_test.py (3)
packages/jumpstarter-cli-common/jumpstarter_cli_common/opt.py (1)
parse_comma_separated(156-196)packages/jumpstarter/jumpstarter/client/grpc.py (1)
ExporterList(195-249)packages/jumpstarter/jumpstarter/config/client.py (2)
ClientConfigV1Alpha1(89-370)lease(123-131)
packages/jumpstarter-cli-admin/jumpstarter_cli_admin/install_test.py (1)
packages/jumpstarter-cli-admin/jumpstarter_cli_admin/install.py (13)
install(277-323)_configure_endpoints(42-60)_create_kind_cluster(96-118)_create_minikube_cluster(121-143)_delete_kind_cluster(146-155)_delete_minikube_cluster(158-167)_handle_cluster_creation(63-93)_handle_cluster_deletion(170-185)_validate_cluster_type(29-39)_validate_prerequisites(22-26)get_ip_generic(218-231)uninstall(372-393)ip(339-348)
packages/jumpstarter/jumpstarter/common/ipaddr_test.py (1)
packages/jumpstarter/jumpstarter/common/ipaddr.py (1)
get_minikube_ip(29-47)
packages/jumpstarter-cli/jumpstarter_cli/get.py (3)
packages/jumpstarter-cli-common/jumpstarter_cli_common/opt.py (1)
opt_comma_separated(199-234)packages/jumpstarter-cli/jumpstarter_cli/login.py (1)
relogin_client(141-155)packages/jumpstarter/jumpstarter/config/client.py (1)
list_exporters(144-183)
packages/jumpstarter/jumpstarter/client/grpc_test.py (1)
packages/jumpstarter/jumpstarter/client/grpc.py (4)
WithOptions(20-22)add_display_columns(25-35)add_exporter_row(38-53)get_status(178-192)
packages/jumpstarter-cli/jumpstarter_cli/run.py (5)
packages/jumpstarter-cli-common/jumpstarter_cli_common/config.py (1)
opt_config(99-100)packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (2)
handle_exceptions(32-46)leaf_exceptions(78-102)packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py (1)
signal_handler(9-22)packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
stop(36-49)serve(143-193)status(152-171)packages/jumpstarter/jumpstarter/config/exporter.py (2)
create_exporter(160-193)serve(195-197)
packages/jumpstarter-kubernetes/jumpstarter_kubernetes/__init__.py (3)
packages/jumpstarter-kubernetes/jumpstarter_kubernetes/cluster.py (6)
create_kind_cluster(130-192)create_minikube_cluster(93-127)delete_kind_cluster(77-90)delete_minikube_cluster(61-74)kind_installed(11-13)minikube_installed(6-8)packages/jumpstarter-cli-admin/jumpstarter_cli_admin/install.py (1)
install(277-323)packages/jumpstarter-kubernetes/jumpstarter_kubernetes/install.py (3)
helm_installed(6-8)install_helm_chart(11-66)uninstall_helm_chart(69-91)
packages/jumpstarter-cli-common/jumpstarter_cli_common/opt.py (1)
packages/jumpstarter-cli/jumpstarter_cli/get_test.py (1)
allowed_values(15-17)
packages/jumpstarter/jumpstarter/config/client.py (1)
packages/jumpstarter/jumpstarter/client/grpc.py (3)
ClientService(276-389)Exporter(79-109)ListExporters(293-309)
packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
packages/jumpstarter/jumpstarter/config/exporter.py (1)
channel_factory(167-174)packages/jumpstarter-protocol/jumpstarter_protocol/jumpstarter/v1/jumpstarter_pb2_grpc.py (1)
ControllerServiceStub(9-68)packages/jumpstarter/conftest.py (1)
Unregister(59-60)
packages/jumpstarter-cli-admin/jumpstarter_cli_admin/install.py (3)
packages/jumpstarter-kubernetes/jumpstarter_kubernetes/cluster.py (6)
create_kind_cluster(130-192)create_minikube_cluster(93-127)delete_kind_cluster(77-90)delete_minikube_cluster(61-74)kind_installed(11-13)minikube_installed(6-8)packages/jumpstarter-kubernetes/jumpstarter_kubernetes/install.py (3)
helm_installed(6-8)install_helm_chart(11-66)uninstall_helm_chart(69-91)packages/jumpstarter/jumpstarter/common/ipaddr.py (2)
get_ip_address(7-26)get_minikube_ip(29-47)
packages/jumpstarter/jumpstarter/client/grpc.py (2)
packages/jumpstarter/jumpstarter/client/lease.py (1)
Lease(31-211)packages/jumpstarter-kubernetes/jumpstarter_kubernetes/exporters.py (4)
rich_add_columns(62-73)rich_add_columns(111-112)rich_add_rows(75-97)rich_add_rows(114-116)
🪛 LanguageTool
docs/source/getting-started/installation/service/service-local.md
[grammar] ~10-~10: There might be a mistake here.
Context: ...ed and configured to access your cluster - [Helm](https://helm.sh/docs/intro/install...
(QB_NEW_EN)
[grammar] ~11-~11: There might be a mistake here.
Context: ...onfigured to access your cluster - Helm (version 3.x or newer) - Administrator ...
(QB_NEW_EN)
[grammar] ~11-~11: There might be a mistake here.
Context: ...s/intro/install/) (version 3.x or newer) - Administrator access to your cluster (re...
(QB_NEW_EN)
[grammar] ~52-~52: There might be a mistake here.
Context: ...luster name (default: jumpstarter-lab) - --force-recreate-cluster: Force recreate the cluster if it alrea...
(QB_NEW_EN)
[grammar] ~53-~53: There might be a mistake here.
Context: ...if it already exists (destroys all data) - --kind-extra-args: Pass additional arguments to kind clus...
(QB_NEW_EN)
[grammar] ~54-~54: There might be a mistake here.
Context: ...ional arguments to kind cluster creation - --minikube-extra-args: Pass additional arguments to minikube ...
(QB_NEW_EN)
[grammar] ~113-~113: There might be a mistake here.
Context: ...admin install` command and all available options, see the [MAN pages](../../../re...
(QB_NEW_EN)
docs/source/getting-started/installation/service/service-production.md
[grammar] ~11-~11: There might be a mistake here.
Context: ...s/intro/install/) (version 3.x or newer) - Administrator access to your cluster (re...
(QB_NEW_EN)
[grammar] ~12-~12: There might be a mistake here.
Context: ... cluster (required for CRD installation) - Domain name for service endpoints - Ingr...
(QB_NEW_EN)
[grammar] ~13-~13: There might be a mistake here.
Context: ...ion) - Domain name for service endpoints - Ingress controller installed (for Kubern...
(QB_NEW_EN)
[grammar] ~36-~36: There might be a mistake here.
Context: ...rmination at Ingress/Route (Recommended)** - Terminate TLS at the ingress controller ...
(QB_NEW_EN)
[grammar] ~41-~41: There might be a mistake here.
Context: ...ryption hops Option 2: End-to-End TLS - TLS from client to Jumpstarter service -...
(QB_NEW_EN)
[grammar] ~42-~42: There might be a mistake here.
Context: ...- TLS from client to Jumpstarter service - Higher security but more complex certifi...
(QB_NEW_EN)
[grammar] ~43-~43: There might be a mistake here.
Context: ... but more complex certificate management - Required for strict compliance environme...
(QB_NEW_EN)
🪛 markdownlint-cli2 (0.17.2)
docs/source/getting-started/installation/service/service-local.md
36-36: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
41-41: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
- GitHub Check: pytest-matrix (macos-15, 3.11)
- GitHub Check: pytest-matrix (macos-15, 3.12)
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
- GitHub Check: e2e
| command = [kind, "create", "cluster", "--name", cluster_name, "--config=/dev/stdin"] | ||
| command.extend(extra_args) | ||
|
|
||
| kind_process = await asyncio.create_subprocess_exec( | ||
| *command, stdin=asyncio.subprocess.PIPE | ||
| ) | ||
|
|
||
| await kind_process.communicate(input=cluster_config.encode()) | ||
|
|
||
| if kind_process.returncode == 0: | ||
| return True | ||
| else: | ||
| raise RuntimeError(f"Failed to create Kind cluster '{cluster_name}'") |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Avoid /dev/stdin for cross-platform compatibility; use a temporary file
--config=/dev/stdin will fail on Windows. Using a temp file preserves streaming output and works everywhere.
Apply this diff:
- command = [kind, "create", "cluster", "--name", cluster_name, "--config=/dev/stdin"]
- command.extend(extra_args)
-
- kind_process = await asyncio.create_subprocess_exec(
- *command, stdin=asyncio.subprocess.PIPE
- )
-
- await kind_process.communicate(input=cluster_config.encode())
-
- if kind_process.returncode == 0:
- return True
- else:
- raise RuntimeError(f"Failed to create Kind cluster '{cluster_name}'")
+ import os
+ import tempfile
+
+ with tempfile.NamedTemporaryFile("w", delete=False) as f:
+ f.write(cluster_config)
+ cfg_path = f.name
+ try:
+ command = [kind, "create", "cluster", "--name", cluster_name, f"--config={cfg_path}"]
+ command.extend(extra_args)
+ returncode = await run_command_with_output(command)
+ if returncode == 0:
+ return True
+ else:
+ raise RuntimeError(f"Failed to create Kind cluster '{cluster_name}'")
+ finally:
+ try:
+ os.unlink(cfg_path)
+ except OSError:
+ passCommittable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In packages/jumpstarter-kubernetes/jumpstarter_kubernetes/cluster.py around
lines 180-192, replace the use of --config=/dev/stdin (which breaks on Windows)
by writing cluster_config to a temporary file and passing its path to --config:
create a NamedTemporaryFile (or tempfile.mkstemp) in a try/finally, write
cluster_config.encode(), flush and close the file, build the command with
"--config", temp_path (no /dev/stdin), invoke asyncio.create_subprocess_exec
without piping stdin, await communicate(), then remove the temp file in the
finally block to ensure cleanup even on errors; ensure proper encoding when
writing and that return code handling remains the same.
| _globals['_EXPORTER']._serialized_start=338 | ||
| _globals['_EXPORTER']._serialized_end=598 | ||
| _globals['_EXPORTER_LABELSENTRY']._serialized_start=444 | ||
| _globals['_EXPORTER_LABELSENTRY']._serialized_end=501 | ||
| _globals['_LEASE']._serialized_start=601 | ||
| _globals['_LEASE']._serialized_end=1478 | ||
| _globals['_GETEXPORTERREQUEST']._serialized_start=1480 | ||
| _globals['_GETEXPORTERREQUEST']._serialized_end=1554 | ||
| _globals['_LISTEXPORTERSREQUEST']._serialized_start=1557 | ||
| _globals['_LISTEXPORTERSREQUEST']._serialized_end=1736 | ||
| _globals['_LISTEXPORTERSRESPONSE']._serialized_start=1738 | ||
| _globals['_LISTEXPORTERSRESPONSE']._serialized_end=1864 | ||
| _globals['_GETLEASEREQUEST']._serialized_start=1866 | ||
| _globals['_GETLEASEREQUEST']._serialized_end=1934 | ||
| _globals['_LISTLEASESREQUEST']._serialized_start=1937 | ||
| _globals['_LISTLEASESREQUEST']._serialized_end=2110 | ||
| _globals['_LISTLEASESRESPONSE']._serialized_start=2112 | ||
| _globals['_LISTLEASESRESPONSE']._serialized_end=2226 | ||
| _globals['_CREATELEASEREQUEST']._serialized_start=2229 | ||
| _globals['_CREATELEASEREQUEST']._serialized_end=2393 | ||
| _globals['_UPDATELEASEREQUEST']._serialized_start=2396 | ||
| _globals['_UPDATELEASEREQUEST']._serialized_end=2539 | ||
| _globals['_DELETELEASEREQUEST']._serialized_start=2541 | ||
| _globals['_DELETELEASEREQUEST']._serialized_end=2612 | ||
| _globals['_CLIENTSERVICE']._serialized_start=2615 | ||
| _globals['_CLIENTSERVICE']._serialized_end=3678 | ||
| _globals['_EXPORTER']._serialized_end=627 | ||
| _globals['_EXPORTER_LABELSENTRY']._serialized_start=473 | ||
| _globals['_EXPORTER_LABELSENTRY']._serialized_end=530 | ||
| _globals['_LEASE']._serialized_start=630 | ||
| _globals['_LEASE']._serialized_end=1507 | ||
| _globals['_GETEXPORTERREQUEST']._serialized_start=1509 | ||
| _globals['_GETEXPORTERREQUEST']._serialized_end=1583 | ||
| _globals['_LISTEXPORTERSREQUEST']._serialized_start=1586 | ||
| _globals['_LISTEXPORTERSREQUEST']._serialized_end=1765 | ||
| _globals['_LISTEXPORTERSRESPONSE']._serialized_start=1767 | ||
| _globals['_LISTEXPORTERSRESPONSE']._serialized_end=1893 | ||
| _globals['_GETLEASEREQUEST']._serialized_start=1895 | ||
| _globals['_GETLEASEREQUEST']._serialized_end=1963 | ||
| _globals['_LISTLEASESREQUEST']._serialized_start=1966 | ||
| _globals['_LISTLEASESREQUEST']._serialized_end=2139 | ||
| _globals['_LISTLEASESRESPONSE']._serialized_start=2141 | ||
| _globals['_LISTLEASESRESPONSE']._serialized_end=2255 | ||
| _globals['_CREATELEASEREQUEST']._serialized_start=2258 | ||
| _globals['_CREATELEASEREQUEST']._serialized_end=2422 | ||
| _globals['_UPDATELEASEREQUEST']._serialized_start=2425 | ||
| _globals['_UPDATELEASEREQUEST']._serialized_end=2568 | ||
| _globals['_DELETELEASEREQUEST']._serialized_start=2570 | ||
| _globals['_DELETELEASEREQUEST']._serialized_end=2641 | ||
| _globals['_CLIENTSERVICE']._serialized_start=2644 | ||
| _globals['_CLIENTSERVICE']._serialized_end=3707 |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Runtime version pin note
The generated code enforces Protobuf runtime 6.30.1. Ensure CI and packaging install a compatible google-protobuf runtime to prevent runtime version errors at import time.
🏁 Script executed:
#!/bin/bash
python - <<'PY'
import google.protobuf
print("protobuf version:", google.protobuf.__version__)
PYLength of output: 35
Ensure Protobuf runtime ≥ 6.30.1 in CI and packaging
Our environment currently has google-protobuf 4.25.8 installed, but the generated code requires Protobuf runtime 6.30.1 or newer. Without updating, imports will fail at runtime.
• Update your CI configuration and any packaging scripts to install or pin google-protobuf>=6.30.1.
• Verify that all downstream projects consuming this package also have compatible Protobuf runtimes.
• Consider adding a runtime version check in your test suite to catch mismatches early (e.g., in setup.py or a CI pre-step).
🤖 Prompt for AI Agents
In
packages/jumpstarter-protocol/jumpstarter_protocol/jumpstarter/client/v1/client_pb2.py
around lines 118-143, the generated Protobuf code requires
google-protobuf>=6.30.1 but CI and packaging currently install 4.25.8; update CI
configs and any packaging/requirements files to pin or install
google-protobuf>=6.30.1 (e.g., pyproject/requirements/CI install step), ensure
downstream consumers are informed/updated to use a compatible Protobuf runtime,
and add a lightweight runtime version check in the test/CI pre-step or packaging
metadata (or setup.py) to fail early if the installed google-protobuf is older
than 6.30.1.
There was a problem hiding this comment.
*_pb2.py are auto generated files from grpc
There was a problem hiding this comment.
@mangelajo You're absolutely right - *_pb2.py files are auto-generated from the protobuf compiler. Thanks for the clarification!
The protobuf runtime version mismatch issue still exists, but the solution should focus on the generation tooling and runtime environment alignment rather than manual file changes. The discrepancy between the required runtime version (6.30.1) and what's currently installed (4.25.8) would need to be addressed in:
- The protobuf compiler/generation environment
- CI/packaging dependencies
- Runtime environment setup
Since these files regenerate automatically, any manual edits would be overwritten anyway.
✏️ Learnings added
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter-protocol/jumpstarter_protocol/jumpstarter/client/v1/client_pb2.py:0-0
Timestamp: 2025-09-02T07:32:04.510Z
Learning: *_pb2.py files in the jumpstarter codebase are auto-generated from gRPC/protobuf definitions and should not be manually modified. Issues with these files should be addressed at the protobuf generation tooling level or runtime environment setup.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
f144c17 to
ffa54bc
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
packages/jumpstarter-cli/jumpstarter_cli/run.py (2)
79-87: Parent misinterprets child’s SIGHUP exit; treat it as restartWhen the child returns
1for SIGHUP, the parent maps it to129. Adjust the mapping to restart on either0(graceful) orSIGHUP.if os.WIFEXITED(status): # Interpret child exit code child_exit_code = os.WEXITSTATUS(status) - if child_exit_code == 0: + if child_exit_code in (0, signal.SIGHUP): return None # restart child (unexpected exit/exception) else: # Child indicates termination (signal number) return 128 + child_exit_code # Return standard Unix exit codeAlternatively (or additionally), keep the current parent logic and adopt the child-side change suggested above to return 0 on SIGHUP.
59-61: Map SIGHUP to restart instead of error exitPer PR goals, SIGHUP should restart the child. Returning the raw signal number causes the parent to treat it as an error (128 + 1). Return 0 for SIGHUP so the parent restarts.
- # Return signal number if received, otherwise 0 for immediate restart - return received_signal if received_signal else 0 + # Restart on SIGHUP or no-signal cancellations; otherwise propagate signal number + return 0 if received_signal in (0, signal.SIGHUP) else received_signal
🧹 Nitpick comments (3)
packages/jumpstarter/jumpstarter/exporter/session.py (1)
46-55: Make teardown logging resilient if report() also failsIf
self.root_device.report()raises while handling aclose()error, the exception escapes theexceptblock and can mask the original teardown failure. Fall back gracefully ifreport()fails so we always log and proceed to handler removal.def __exit__(self, exc_type, exc_value, traceback): - try: - self.root_device.close() - except Exception as e: - # Get driver name from report for more descriptive logging - report = self.root_device.report() - driver_name = report.labels.get('jumpstarter.dev/name', self.root_device.__class__.__name__) - logger.error("Error closing driver %s: %s", driver_name, e, exc_info=True) + try: + self.root_device.close() + except Exception as e: + # Try to get a descriptive name, but don't let that fail the teardown path + try: + report = self.root_device.report() + driver_name = report.labels.get('jumpstarter.dev/name', self.root_device.__class__.__name__) + except Exception: + driver_name = self.root_device.__class__.__name__ + logger.error("Error closing driver %s: %s", driver_name, e, exc_info=True) finally: logging.getLogger().removeHandler(self._logging_handler)packages/jumpstarter/jumpstarter/config/exporter.py (1)
166-175: Nit: annotate the contextmanager yield type for IDEs and readersReturning the concrete type helps usage sites and avoids guesswork.
- async def create_exporter(self): + async def create_exporter(self) -> "asynccontextmanager[Exporter]":Note: if forward references are undesirable, you can import
Exporterunder TYPE_CHECKING to avoid runtime cycles.packages/jumpstarter/jumpstarter/exporter/exporter.py (1)
51-76: Close channel after Unregister — good; also guard close against timeout cancellationIf
move_on_after(10)cancels the scope, theawait channel.close()in the innerfinallycan itself be cancelled. Shielding the close ensures resources are released even on timeout.with anyio.move_on_after(10): # 10 second timeout - channel = await self.channel_factory() - try: + channel = await self.channel_factory() + try: controller = jumpstarter_pb2_grpc.ControllerServiceStub(channel) await controller.Unregister( jumpstarter_pb2.UnregisterRequest( reason="Exporter shutdown", ) ) logger.info("Controller unregistration completed successfully") - finally: - await channel.close() + finally: + # Ensure the channel closes even if the scope was cancelled by the timeout + with anyio.CancelScope(shield=True): + await channel.close()Additionally, consider mirroring this “explicit channel + close” pattern in
session(),handle.listen()andserve.status()to avoid accumulating open channels under long-running failures.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
packages/jumpstarter-cli/jumpstarter_cli/run.py(1 hunks)packages/jumpstarter/jumpstarter/config/exporter.py(1 hunks)packages/jumpstarter/jumpstarter/exporter/exporter.py(4 hunks)packages/jumpstarter/jumpstarter/exporter/session.py(1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Learnt from: NickCao
PR: jumpstarter-dev/jumpstarter#521
File: packages/jumpstarter-cli/jumpstarter_cli/run.py:1-2
Timestamp: 2025-06-12T17:18:48.519Z
Learning: Exporters in this project are intended to run only on Unix systems, so Unix-specific APIs like `os.fork()` are acceptable and Windows compatibility is not required.
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#584
File: packages/jumpstarter/jumpstarter/client/grpc.py:38-54
Timestamp: 2025-08-18T07:13:37.619Z
Learning: User michalskrivanek prefers inline code over small helper function extractions when the abstraction doesn't significantly improve readability or reduce complexity. They value straightforward, readable code over reducing minor duplication.
📚 Learning: 2025-08-14T13:11:35.034Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.pypackages/jumpstarter/jumpstarter/exporter/exporter.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO 4.6.2+, `open_signal_receiver()` is a regular synchronous context manager that should be used with a plain `with` statement, not `async with`. However, it returns an asynchronous iterator that must be consumed with `async for`. The correct usage pattern is: `with open_signal_receiver(...) as signals: async for signum in signals:`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-05-07T08:29:38.418Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter#448
File: packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py:10-11
Timestamp: 2025-05-07T08:29:38.418Z
Learning: In AnyIO, `open_signal_receiver()` returns a synchronous context manager (used with regular `with`), not an asynchronous one. While the context manager is synchronous, it yields an asynchronous iterator that should be consumed with `async for`. Reference: https://anyio.readthedocs.io/en/stable/signals.html
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
📚 Learning: 2025-08-14T13:20:51.953Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter-cli/jumpstarter_cli/run.py:15-15
Timestamp: 2025-08-14T13:20:51.953Z
Learning: Signal 0 is not a valid receivable signal in Unix systems. It can only be used as an argument to kill(pid, 0) to test if a process exists, but doesn't actually send a signal. Receivable signals are positive integers (e.g., SIGINT=2, SIGTERM=15, SIGHUP=1, SIGQUIT=3). Using 0 to represent "no signal received" in signal handling code is correct.
Applied to files:
packages/jumpstarter-cli/jumpstarter_cli/run.py
🧬 Code graph analysis (4)
packages/jumpstarter-cli/jumpstarter_cli/run.py (4)
packages/jumpstarter-cli-common/jumpstarter_cli_common/config.py (1)
opt_config(99-100)packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (2)
handle_exceptions(32-46)leaf_exceptions(78-102)packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
stop(36-49)serve(143-193)status(152-171)packages/jumpstarter/jumpstarter/config/exporter.py (2)
create_exporter(160-193)serve(195-197)
packages/jumpstarter/jumpstarter/exporter/session.py (1)
packages/jumpstarter/jumpstarter/driver/base.py (2)
close(85-87)report(192-209)
packages/jumpstarter/jumpstarter/config/exporter.py (4)
packages/jumpstarter/jumpstarter/exporter/exporter.py (2)
Exporter(25-193)serve(143-193)packages/jumpstarter/jumpstarter/common/grpc.py (2)
ssl_channel_credentials(16-46)aio_secure_channel(49-54)packages/jumpstarter/jumpstarter/config/grpc.py (1)
call_credentials(6-20)packages/jumpstarter/jumpstarter/common/utils.py (1)
serve(31-39)
packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
packages/jumpstarter/jumpstarter/config/exporter.py (1)
channel_factory(167-174)packages/jumpstarter-protocol/jumpstarter_protocol/jumpstarter/v1/jumpstarter_pb2_grpc.py (1)
ControllerServiceStub(9-68)packages/jumpstarter/conftest.py (1)
Unregister(59-60)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: pytest-matrix (macos-15, 3.11)
- GitHub Check: e2e
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
- GitHub Check: pytest-matrix (macos-15, 3.12)
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
🔇 Additional comments (6)
packages/jumpstarter/jumpstarter/config/exporter.py (2)
159-194: Lifecycle wrapper looks solid; shielded teardown and aenter/aexit gating LGTMThe explicit lifecycle via
create_exporter()withenteredgating and a shielded__aexit__matches the PR goals and avoids abrupt cancellation during cleanup.
195-197: Delegation to the lifecycle is correct
serve()delegating toasync with self.create_exporter()keeps responsibilities clean and centralizes teardown. Good change.packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
26-35: Type changes align with awaited channel factory and task-group tracking
channel_factory: Callable[[], Awaitable[grpc.aio.Channel]]correctly reflects the async channel creation.- Internal
_tg,_started,_stop_requestedstate fits the new stop semantics.
36-50: Stop semantics are sensible; immediate vs. lease-exit pathsThe split between immediate cancellation and “stop on lease exit” is clear and matches how the child should react to SIGHUP vs. TERM/INT. No issues spotted.
173-194: Serve loop + internal task group assignment LGTMStoring
self._tg, invokingself.stop()on lease changes, and clearing_tgon exit integrates well with the child’s signal-driven stop model.packages/jumpstarter-cli/jumpstarter_cli/run.py (1)
41-42: LGTM: start signal receiver before exporter creationStarting the signal handler immediately reduces window for missed signals and aligns with AnyIO’s guidance on signal handling.
Properly handle termination signals to exit cleanly. Signals are forwarded to child, and its exit code is propagated to parent. SIGHUP restarts the child while keeping the parent.
sending SIGHUP to child process results in delayed termination waiting on lease to be free (either the lease ends or there is another error resulting in child restart) Intended for graceful termination of exporter after the current lease holder is done
ffa54bc to
f4754db
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
packages/jumpstarter/jumpstarter/exporter/exporter.py (2)
114-135: Leak: Listen stream creates channels but never closes them (retries can exhaust FDs)Each retry re-creates a channel. Ensure it’s closed after the stream ends (successfully, error, or cancellation).
async def listen(retries=5, backoff=3): retries_left = retries while True: try: - controller = jumpstarter_pb2_grpc.ControllerServiceStub(await self.channel_factory()) - async for request in controller.Listen(jumpstarter_pb2.ListenRequest(lease_name=lease_name)): - await listen_tx.send(request) + ch = await self.channel_factory() + try: + controller = jumpstarter_pb2_grpc.ControllerServiceStub(ch) + async for request in controller.Listen( + jumpstarter_pb2.ListenRequest(lease_name=lease_name) + ): + await listen_tx.send(request) + finally: + # Shielding optional; add if you expect cancellation here. + await ch.close() except Exception as e: if retries_left > 0: retries_left -= 1 logger.info( "Listen stream interrupted, restarting in {}s, {} retries left: {}".format( backoff, retries_left, e ) ) await sleep(backoff) else: raise else: retries_left = retries
153-173: Leak: Status stream also doesn’t close channels on restartSame problem as Listen; close the channel per attempt.
async def status(retries=5, backoff=3): retries_left = retries while True: try: - controller = jumpstarter_pb2_grpc.ControllerServiceStub(await self.channel_factory()) - async for status in controller.Status(jumpstarter_pb2.StatusRequest()): - await status_tx.send(status) + ch = await self.channel_factory() + try: + controller = jumpstarter_pb2_grpc.ControllerServiceStub(ch) + async for status in controller.Status(jumpstarter_pb2.StatusRequest()): + await status_tx.send(status) + finally: + await ch.close() except Exception as e: if retries_left > 0: retries_left -= 1 logger.info( "Status stream interrupted, restarting in {}s, {} retries left: {}".format( backoff, retries_left, e ) ) await sleep(backoff) else: raise else: retries_left = retries
🧹 Nitpick comments (6)
packages/jumpstarter/jumpstarter/exporter/exporter.py (6)
36-50: stop(): annotate return type; otherwise looks soundMinor polish: add explicit return type.
- def stop(self, wait_for_lease_exit=False): + def stop(self, wait_for_lease_exit: bool = False) -> None:
52-76: Don’t log expected cancellations/timeouts as errors during cleanupUnregistration may be cancelled/timed-out; log at info for CancelledError to keep logs clean. You already shield channel.close(), which is great.
try: if self.registered: logger.info("Unregistering exporter with controller") try: with anyio.move_on_after(10): # 10 second timeout channel = await self.channel_factory() try: controller = jumpstarter_pb2_grpc.ControllerServiceStub(channel) await controller.Unregister( jumpstarter_pb2.UnregisterRequest( reason="Exporter shutdown", ) ) logger.info("Controller unregistration completed successfully") finally: with anyio.CancelScope(shield=True): await channel.close() - except Exception as e: - logger.error("Error during controller unregistration: %s", e, exc_info=True) + except anyio.get_cancelled_exc_class(): + logger.info("Controller unregistration cancelled or timed out") + except Exception as e: + logger.error("Error during controller unregistration: %s", e, exc_info=True) - except Exception as e: - logger.error("Error during exporter cleanup: %s", e, exc_info=True) + except anyio.get_cancelled_exc_class(): + logger.info("Exporter cleanup cancelled") + except Exception as e: + logger.error("Error during exporter cleanup: %s", e, exc_info=True)
86-106: Close gRPC channel used for initial Register to avoid a leakThis channel is not reused; close it after Register.
- controller = jumpstarter_pb2_grpc.ControllerServiceStub(await self.channel_factory()) + ch = await self.channel_factory() + try: + controller = jumpstarter_pb2_grpc.ControllerServiceStub(ch) with Session( uuid=self.uuid, labels=self.labels, root_device=self.device_factory(), ) as session: async with session.serve_unix_async() as path: async with grpc.aio.secure_channel( f"unix://{path}", grpc.local_channel_credentials(grpc.LocalConnectionType.UDS) ) as channel: response = await jumpstarter_pb2_grpc.ExporterServiceStub(channel).GetReport(empty_pb2.Empty()) logger.info("Registering exporter with controller") - await controller.Register( + await controller.Register( jumpstarter_pb2.RegisterRequest( labels=self.labels, reports=response.reports, ) ) self.registered = True yield path + finally: + await ch.close()
145-147: Serve docstring: clarify exit conditions and parent/child contractDocument that serve() returns on lease change or stop request; in the fork-based model the parent restarts the child.
- """ - Serve the exporter. - """ + """ + Serve the exporter. + + Returns when: + - The lease name changes (teardown and let the supervisor restart us), or + - stop(wait_for_lease_exit=False) cancels the task group, or + - stop(wait_for_lease_exit=True) is requested and the exporter is not leased. + + In the fork-based architecture, the parent supervises and restarts the child on exit. + """
178-182: Improve lease-change logging; set both old/new for traceabilityOptional: log transition and then stop. Functional behavior unchanged.
- if self.lease_name != "" and self.lease_name != status.lease_name: - self.lease_name = status.lease_name - logger.info("Lease status changed, killing existing connections") + if self.lease_name != "" and self.lease_name != status.lease_name: + old = self.lease_name + self.lease_name = status.lease_name + logger.info("Lease changed: %s -> %s; tearing down", old, self.lease_name) self.stop() break
191-194: Redundant stop() call before breaking when not leasedYou’re already inside the same TaskGroup; just break and let the outer context exit. The extra cancel is harmless but noisy.
- if self._stop_requested: - self.stop() - break + if self._stop_requested: + break
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
packages/jumpstarter-cli/jumpstarter_cli/run.py(1 hunks)packages/jumpstarter/jumpstarter/config/exporter.py(1 hunks)packages/jumpstarter/jumpstarter/exporter/exporter.py(4 hunks)packages/jumpstarter/jumpstarter/exporter/session.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- packages/jumpstarter/jumpstarter/config/exporter.py
- packages/jumpstarter-cli/jumpstarter_cli/run.py
- packages/jumpstarter/jumpstarter/exporter/session.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#584
File: packages/jumpstarter/jumpstarter/client/grpc.py:38-54
Timestamp: 2025-08-18T07:13:37.619Z
Learning: User michalskrivanek prefers inline code over small helper function extractions when the abstraction doesn't significantly improve readability or reduce complexity. They value straightforward, readable code over reducing minor duplication.
📚 Learning: 2025-08-14T13:11:35.034Z
Learnt from: michalskrivanek
PR: jumpstarter-dev/jumpstarter#589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.
Applied to files:
packages/jumpstarter/jumpstarter/exporter/exporter.py
🧬 Code graph analysis (1)
packages/jumpstarter/jumpstarter/exporter/exporter.py (3)
packages/jumpstarter/jumpstarter/config/exporter.py (1)
channel_factory(167-174)packages/jumpstarter-protocol/jumpstarter_protocol/jumpstarter/v1/jumpstarter_pb2_grpc.py (1)
ControllerServiceStub(9-68)packages/jumpstarter/conftest.py (1)
Unregister(59-60)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Redirect rules - jumpstarter-docs
- GitHub Check: Header rules - jumpstarter-docs
- GitHub Check: Pages changed - jumpstarter-docs
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
- GitHub Check: pytest-matrix (macos-15, 3.12)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
- GitHub Check: pytest-matrix (macos-15, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
- GitHub Check: e2e
🔇 Additional comments (6)
packages/jumpstarter/jumpstarter/exporter/exporter.py (6)
8-8: Using anyio.abc.TaskGroup is fineInterface import keeps typing precise while still compatible with create_task_group() instances.
26-34: Lifecycle fields (_stop_requested, _started, _tg) look goodThese enable explicit shutdown semantics; no issues spotted.
175-175: Storing tg for later cancellation is appropriateThis enables external stop() to cancel the running group.
184-186: Deferred start until lease known — LGTMStarting handle after first non-empty lease name avoids premature listeners.
194-194: Resetting _tg to None on exit — LGTMPrevents stale references after the TaskGroup context closes.
2-2: Async channel_factory usage verified – no missing await callsI ran the suggested grep across the exporter and config modules to find any un-awaited
channel_factory()invocations and confirmed there are none. The switch toAwaitableand the import fromcollections.abcis correct, and all call sites properly await the async factory. Code changes look good as-is.
Properly handle termination signals to exit cleanly. Signals are forwarded to child, and its exit code is propagated to parent. SIGHUP restarts the child while keeping the parent.
Summary by CodeRabbit
New Features
Changes
Bug Fixes