Conversation
fad6553 to
193ce4e
Compare
193ce4e to
6a86741
Compare
5185f37 to
c334ade
Compare
82d2556 to
a83e37a
Compare
📝 WalkthroughWalkthroughRefactors node execution into a two-phase model: plan generation via Changes
Sequence Diagram(s)sequenceDiagram
participant CLI as "CLI (apply/main)"
participant Planner as "plan_for_node"
participant Executor as "execute(plan)"
participant Step as "Step (Ping/Keys/Build/Activate)"
participant Target as "SharedTarget / Remote Host"
CLI->>Planner: make_goal(name, node) -> plan_for_node(...)
Planner-->>CLI: NodePlan (ordered steps)
CLI->>Executor: execute(plan)
Executor->>Step: iterate & run step.execute(ctx)
Step->>Target: read/write via SharedTarget (ssh/ping/push)
Step-->>Executor: result (ok / error)
Executor-->>CLI: completion / error
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
crates/core/src/hive/steps/ping.rs (1)
27-54:⚠️ Potential issue | 🔴 CriticalDeadlock: read lock held when attempting write lock.
The read lock acquired on line 29 (
let target = ...) remains in scope when line 53 attempts to acquire a write lock on the sameRwLock. Sincetokio::sync::RwLockis not reentrant, this will deadlock.The read guard must be explicitly dropped before acquiring the write lock.
🐛 Proposed fix
#[instrument(skip_all, name = "ping")] async fn execute(&self, ctx: &mut Context) -> Result<(), HiveLibError> { loop { let target = self.target.0.read().await; event!( Level::INFO, status = "attempting", host = target.get_preferred_host()?.to_string() ); if target.ping(ctx.modifiers).await.is_ok() { event!( Level::INFO, status = "success", host = target.get_preferred_host()?.to_string() ); return Ok(()); } // ? will take us out if we ran out of hosts + let host_str = target.get_preferred_host()?.to_string(); event!( Level::WARN, status = "failed to ping", - host = target.get_preferred_host()?.to_string() + host = host_str ); + // Drop read lock before acquiring write lock to avoid deadlock + drop(target); + self.target.0.write().await.host_failed(); } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/steps/ping.rs` around lines 27 - 54, In execute(), the read guard obtained by let target = self.target.0.read().await stays in scope and causes a deadlock when you later call self.target.0.write().await.host_failed(); fix by dropping the read guard before acquiring the write lock: ensure the variable named target (the read guard from self.target.0.read()) is out of scope or explicitly dropped (drop(target)) immediately before calling self.target.0.write().await, then acquire the write guard and call host_failed() on it.crates/core/src/hive/node.rs (1)
70-100:⚠️ Potential issue | 🟡 MinorRemove unused
non_interactive_forcedparameter fromcreate_ssh_args.The
non_interactive_forcedparameter is never used in the function body. The test cases at lines 386 and 370 demonstrate that the function produces identical output regardless of whether this parameter istrueorfalse. Either implement the intended functionality using this parameter or remove it from the signature.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/node.rs` around lines 70 - 100, The parameter non_interactive_forced on create_ssh_args is unused; remove it from the function signature and all call sites (and any trait/impl signatures) so create_ssh_args only takes (&self, modifiers: SubCommandModifiers) -> Result<Vec<String>, HiveLibError>, then update tests that passed true/false to call the new signature and run cargo test to ensure no remaining references; if the parameter was intended to change behavior instead, implement the conditional logic inside create_ssh_args using non_interactive_forced and update tests accordingly—prefer removing it if there is no intended behavior change.crates/core/src/hive/steps/activate.rs (1)
30-48:⚠️ Potential issue | 🟠 MajorBug: Initial ping result is discarded, and retry count is inconsistent.
The ping at line 33 is performed but its result is immediately overwritten at line 38 in the first loop iteration—this ping attempt is wasted. Additionally, the loop runs only twice (
0..2→ values 0 and 1), but the log message says "/3", so you're missing the third attempt.🐛 Proposed fix: Check initial ping before retry loop or adjust loop bounds
async fn wait_for_ping(target: &SharedTarget, ctx: &Context) -> Result<(), HiveLibError> { let target = target.0.read().await; let host = target.get_preferred_host()?; - let mut result = target.ping(ctx.modifiers).await; - for num in 0..2 { + for num in 0..3 { warn!("Trying to ping {host} (attempt {}/3)", num + 1); - result = target.ping(ctx.modifiers).await; + let result = target.ping(ctx.modifiers).await; if result.is_ok() { info!("Regained connection to {} via {host}", ctx.name); - break; + return Ok(()); } } - result + Err(HiveLibError::NetworkError(NetworkError::HostsExhausted)) }Alternatively, if you prefer to check the initial ping first before entering the retry loop:
async fn wait_for_ping(target: &SharedTarget, ctx: &Context) -> Result<(), HiveLibError> { let target = target.0.read().await; let host = target.get_preferred_host()?; - let mut result = target.ping(ctx.modifiers).await; + + // Initial ping attempt + if target.ping(ctx.modifiers).await.is_ok() { + info!("Regained connection to {} via {host}", ctx.name); + return Ok(()); + } for num in 0..2 { - warn!("Trying to ping {host} (attempt {}/3)", num + 1); + warn!("Trying to ping {host} (retry {}/2)", num + 1); - result = target.ping(ctx.modifiers).await; + let result = target.ping(ctx.modifiers).await; if result.is_ok() { info!("Regained connection to {} via {host}", ctx.name); - - break; + return Ok(()); } } - result + Err(HiveLibError::NetworkError(NetworkError::HostsExhausted)) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/steps/activate.rs` around lines 30 - 48, In wait_for_ping the initial ping (target.ping at the top) is immediately overwritten by the loop and the loop bounds (0..2) conflict with the "/3" log; fix by treating the initial ping as attempt 1 (check result before entering retries) or remove the pre-loop ping and make the loop perform three attempts consistently; update the retry loop bounds and the warn/info attempt numbering accordingly so target.ping (the call on SharedTarget) is only invoked as intended and the log reflects the actual attempt count.
🧹 Nitpick comments (3)
crates/cli/src/main.rs (1)
89-90: Consider removing redundant type annotation.The explicit type annotation
wire_core::hive::node::ApplyGoalcould be simplified. Thetry_into()method can infer the target type from usage context.♻️ Suggested simplification
- let goal: wire_core::hive::node::ApplyGoal = - apply_args.goal.clone().try_into().unwrap(); + let goal = apply_args.goal.clone().try_into().unwrap();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/cli/src/main.rs` around lines 89 - 90, Remove the redundant explicit type on the `goal` binding: replace `let goal: wire_core::hive::node::ApplyGoal = apply_args.goal.clone().try_into().unwrap();` with a type-inferred binding like `let goal = apply_args.goal.clone().try_into().unwrap();` so the `try_into()` target is inferred; update any nearby code that relied on the explicit annotation if necessary (reference: the `goal` variable and the `apply_args.goal.clone().try_into().unwrap()` expression).crates/core/src/hive/plan.rs (1)
62-70: Confirmunreachable!()is safe here.The
unreachable!()at Line 69 assumesapply_plan_keysis only called whengoalisKeysorSwitchToConfiguration(Switch). This is enforced by the caller at Lines 140-144. However, if future changes add moreApplyGoalvariants that should process keys, this could panic unexpectedly.Consider using an exhaustive match with explicit handling or a compile-time assertion to make the contract clearer.
🔧 Safer alternative using exhaustive match
let (pre_keys, post_keys) = match goal { ApplyGoal::SwitchToConfiguration(SwitchToConfigurationGoal::Switch) => node .keys .clone() .into_iter() .partition(|x| matches!(x.upload_at, UploadKeyAt::PreActivation)), ApplyGoal::Keys => (node.keys.clone(), Vec::new()), - _ => unreachable!(), + ApplyGoal::Build | ApplyGoal::Push | ApplyGoal::SwitchToConfiguration(_) => { + // Other goals don't process keys through this path + unreachable!("apply_plan_keys called with non-key goal: {:?}", goal) + } };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/plan.rs` around lines 62 - 70, The current match in apply_plan_keys uses unreachable!() for non-Keys/non-SwitchToConfiguration(Switch) variants, which can panic if ApplyGoal gains new variants; replace the unreachable!() with an explicit fallback that fails gracefully: change the final arm to return a clear error (e.g., Err(anyhow!("apply_plan_keys called with unsupported goal: {:?}", goal))) or otherwise handle other ApplyGoal variants explicitly, and optionally add a debug_assert!(matches!(goal, ApplyGoal::Keys | ApplyGoal::SwitchToConfiguration(SwitchToConfigurationGoal::Switch))) to document the expectation; reference ApplyGoal and SwitchToConfigurationGoal::Switch in your change so the intent is clear.crates/core/src/hive/steps/activate.rs (1)
84-84: Consider handlingNonecase instead ofunwrap().If
ctx.state.buildisNone, this will panic. While the execution flow may guarantee this is alwaysSomeby this point, usingok_orwith a descriptive error would be more defensive.🛡️ Suggested defensive handling
- let built_path = ctx.state.build.as_ref().unwrap(); + let built_path = ctx.state.build.as_ref().ok_or_else(|| { + HiveLibError::ActivationError(ActivationError::MissingBuildPath) + })?;This would require adding a
MissingBuildPathvariant toActivationError, or using an existing appropriate error type.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/steps/activate.rs` at line 84, Replace the panic-prone unwrap on ctx.state.build in activate.rs by returning a proper error: change let built_path = ctx.state.build.as_ref().unwrap(); to use ok_or / ok_or_else and map to an ActivationError variant (e.g., MissingBuildPath) so callers receive a descriptive error instead of panicking; if ActivationError lacks a suitable variant, add MissingBuildPath to ActivationError and use that when converting the Option into a Result for built_path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Around line 20-22: Update the sentence in CHANGELOG.md that reads 'Previously,
picking what steps would be ran was done on the fly during execution.' to use
the correct past participle: replace "would be ran" with "would be run" so the
sentence reads 'Previously, picking what steps would be run was done on the fly
during execution.'
- Around line 23-24: Update the sentence in CHANGELOG.md that currently reads
"there is no keys to deploy" to use correct subject-verb agreement; change it to
"there are no keys to deploy" so the plural noun "keys" matches the verb "are"
(the rest of the sentence can remain unchanged).
In `@crates/core/src/hive/executor.rs`:
- Around line 38-42: The JSON parse uses expect() inside the map closure for
evaluate_hive_attribute which will panic in the spawned task; replace that panic
with proper error conversion so parsing failures return
HiveInitialisationError::ParseEvaluateError. Change the map(|output|
serde_json::from_str::<Derivation>(&output).expect(...)) into a fallible
transform (e.g. .and_then or .map(|output|
serde_json::from_str::<Derivation>(&output).map_err(|e|
HiveInitialisationError::ParseEvaluateError { source: e.into(), context:
output.clone() }))) so the spawned task returns a Result<Derivation,
HiveInitialisationError> and propagate that Result to the caller instead of
panicking. Ensure the call sites that await evaluate_hive_attribute handle the
Result accordingly.
---
Outside diff comments:
In `@crates/core/src/hive/node.rs`:
- Around line 70-100: The parameter non_interactive_forced on create_ssh_args is
unused; remove it from the function signature and all call sites (and any
trait/impl signatures) so create_ssh_args only takes (&self, modifiers:
SubCommandModifiers) -> Result<Vec<String>, HiveLibError>, then update tests
that passed true/false to call the new signature and run cargo test to ensure no
remaining references; if the parameter was intended to change behavior instead,
implement the conditional logic inside create_ssh_args using
non_interactive_forced and update tests accordingly—prefer removing it if there
is no intended behavior change.
In `@crates/core/src/hive/steps/activate.rs`:
- Around line 30-48: In wait_for_ping the initial ping (target.ping at the top)
is immediately overwritten by the loop and the loop bounds (0..2) conflict with
the "/3" log; fix by treating the initial ping as attempt 1 (check result before
entering retries) or remove the pre-loop ping and make the loop perform three
attempts consistently; update the retry loop bounds and the warn/info attempt
numbering accordingly so target.ping (the call on SharedTarget) is only invoked
as intended and the log reflects the actual attempt count.
In `@crates/core/src/hive/steps/ping.rs`:
- Around line 27-54: In execute(), the read guard obtained by let target =
self.target.0.read().await stays in scope and causes a deadlock when you later
call self.target.0.write().await.host_failed(); fix by dropping the read guard
before acquiring the write lock: ensure the variable named target (the read
guard from self.target.0.read()) is out of scope or explicitly dropped
(drop(target)) immediately before calling self.target.0.write().await, then
acquire the write guard and call host_failed() on it.
---
Nitpick comments:
In `@crates/cli/src/main.rs`:
- Around line 89-90: Remove the redundant explicit type on the `goal` binding:
replace `let goal: wire_core::hive::node::ApplyGoal =
apply_args.goal.clone().try_into().unwrap();` with a type-inferred binding like
`let goal = apply_args.goal.clone().try_into().unwrap();` so the `try_into()`
target is inferred; update any nearby code that relied on the explicit
annotation if necessary (reference: the `goal` variable and the
`apply_args.goal.clone().try_into().unwrap()` expression).
In `@crates/core/src/hive/plan.rs`:
- Around line 62-70: The current match in apply_plan_keys uses unreachable!()
for non-Keys/non-SwitchToConfiguration(Switch) variants, which can panic if
ApplyGoal gains new variants; replace the unreachable!() with an explicit
fallback that fails gracefully: change the final arm to return a clear error
(e.g., Err(anyhow!("apply_plan_keys called with unsupported goal: {:?}", goal)))
or otherwise handle other ApplyGoal variants explicitly, and optionally add a
debug_assert!(matches!(goal, ApplyGoal::Keys |
ApplyGoal::SwitchToConfiguration(SwitchToConfigurationGoal::Switch))) to
document the expectation; reference ApplyGoal and
SwitchToConfigurationGoal::Switch in your change so the intent is clear.
In `@crates/core/src/hive/steps/activate.rs`:
- Line 84: Replace the panic-prone unwrap on ctx.state.build in activate.rs by
returning a proper error: change let built_path =
ctx.state.build.as_ref().unwrap(); to use ok_or / ok_or_else and map to an
ActivationError variant (e.g., MissingBuildPath) so callers receive a
descriptive error instead of panicking; if ActivationError lacks a suitable
variant, add MissingBuildPath to ActivationError and use that when converting
the Option into a Result for built_path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8bd70e6e-5827-4ed1-8a80-f78d5c5fd08d
📒 Files selected for processing (20)
CHANGELOG.mdcrates/cli/src/apply.rscrates/cli/src/cli.rscrates/cli/src/main.rscrates/core/src/commands/common.rscrates/core/src/commands/mod.rscrates/core/src/commands/noninteractive.rscrates/core/src/commands/pty/mod.rscrates/core/src/hive/executor.rscrates/core/src/hive/mod.rscrates/core/src/hive/node.rscrates/core/src/hive/plan.rscrates/core/src/hive/steps/activate.rscrates/core/src/hive/steps/build.rscrates/core/src/hive/steps/cleanup.rscrates/core/src/hive/steps/evaluate.rscrates/core/src/hive/steps/keys.rscrates/core/src/hive/steps/mod.rscrates/core/src/hive/steps/ping.rscrates/core/src/hive/steps/push.rs
💤 Files with no reviewable changes (2)
- crates/core/src/hive/steps/mod.rs
- crates/core/src/hive/steps/cleanup.rs
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: marshmallow <github@althaea.zone>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: marshmallow <github@althaea.zone>
There was a problem hiding this comment.
🧹 Nitpick comments (6)
crates/core/src/hive/node.rs (1)
101-126: Consider usingChildOutputMode::Genericfor the ping operation.The ping method uses
ChildOutputMode::Interactivewhich may trigger terminal handling logic. For a simple connectivity check,ChildOutputMode::Genericmight be more appropriate since no user interaction is expected.♻️ Suggested change
let output = run_command( &CommandArguments::new(command_string, modifiers) .log_stdout() - .mode(crate::commands::ChildOutputMode::Interactive), + .mode(crate::commands::ChildOutputMode::Generic), ) .await?;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/node.rs` around lines 101 - 126, The ping method currently builds a CommandArguments with mode(crate::commands::ChildOutputMode::Interactive) which can trigger terminal/TTY handling; change it to mode(crate::commands::ChildOutputMode::Generic) in the call to CommandArguments::new inside ping (the function named ping on the node type) so the run_command invocation uses Generic output mode for a non-interactive connectivity check; update the single occurrence of ChildOutputMode::Interactive to ChildOutputMode::Generic (refer to run_command, CommandArguments::new, and CommandStringBuilder usage in ping).crates/core/src/hive/executor.rs (1)
59-70: Debug assertions may panic in debug builds if span invariants are violated.These
debug_assert_matches!anddebug_assert!calls verify tracing span invariants. While useful for catching configuration issues during development, consider whether a more graceful handling (e.g., logging a warning) would be preferable to panicking, even in debug builds.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/executor.rs` around lines 59 - 70, Replace the panicking debug_assert_matches! and debug_assert! checks on Span::current() with non-panicking runtime checks that avoid unwraps and log warnings instead; specifically, call Span::current().metadata() into an Option, if Some(metadata) check metadata.name() != "execute" and use tracing::warn! to warn if it differs, and check metadata.fields().field("node").is_none() and warn if missing, and if metadata() is None warn that span metadata is unavailable—remove the debug_assert_matches!, debug_assert!, and any .unwrap()s so the code no longer panics in debug builds.crates/core/src/hive/plan.rs (2)
302-320: Address or remove the TODO comment.The commented code block (lines 306-318) appears to be remnants of old test cases with a TODO note about potential coverage loss. Either restore the relevant test coverage or remove the dead comments to keep the codebase clean.
Would you like me to help identify what test coverage might be missing based on the commented code, or open an issue to track this?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/plan.rs` around lines 302 - 320, The TODO and large commented block inside the assertion for plan.steps (which references Evaluate.into(), Build { target: None }.into(), and the commented Ping, PushKeyAgent, Keys, PushBuildOutput, SwitchToConfiguration entries) should be resolved: either remove the dead comments and the TODO to clean up the test, or restore/add explicit test cases that exercise the missing step variants (Ping, PushKeyAgent, Keys with UploadKeyAt::PostActivation, evaluate/build/push steps, SwitchToConfiguration) and update the assert_eq! on plan.steps to include those variants; locate the assertion around plan.steps in the test that constructs the plan and choose one approach (clean deletion of comments and TODO, or concrete test additions that construct and assert the extra step variants) and implement accordingly.
62-70: Consider replacingunreachable!()with exhaustive pattern matching.While the
unreachable!()is safe due to the call site guard at lines 140-144, using exhaustive pattern matching would make the code more self-documenting and catch any future changes toApplyGoalthat might introduce new variants:♻️ Suggested improvement
let (pre_keys, post_keys) = match goal { ApplyGoal::SwitchToConfiguration(SwitchToConfigurationGoal::Switch) => node .keys .clone() .into_iter() .partition(|x| matches!(x.upload_at, UploadKeyAt::PreActivation)), ApplyGoal::Keys => (node.keys.clone(), Vec::new()), - _ => unreachable!(), + ApplyGoal::Build | ApplyGoal::Push | ApplyGoal::SwitchToConfiguration(_) => { + unreachable!("apply_plan_keys should only be called for Keys or Switch goals") + } };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/plan.rs` around lines 62 - 70, Replace the catch-all _ => unreachable!() in the match on ApplyGoal with exhaustive match arms for each ApplyGoal variant (e.g., explicitly handle ApplyGoal::SwitchToConfiguration(SwitchToConfigurationGoal::Prepare) and any other ApplyGoal variants besides Keys and SwitchToConfiguration(...Switch)), returning appropriate (pre_keys, post_keys) values or, for genuinely impossible branches, call panic! with a clear message; update the match around node.keys.clone().into_iter().partition(...) so every ApplyGoal variant is listed explicitly instead of using unreachable!(), referencing ApplyGoal, SwitchToConfigurationGoal, and the pre_keys/post_keys binding to locate the code.crates/core/src/hive/steps/ping.rs (1)
25-55: Consider reducing lock scope during async ping operation.The read lock acquired at line 29 is held across the async
ping()call (line 37), which performs a network operation. While this prevents concurrent modifications during the ping, it may cause unnecessary lock contention. Consider extracting the necessary data before the async operation:♻️ Suggested optimization to reduce lock scope
async fn execute(&self, ctx: &mut Context) -> Result<(), HiveLibError> { loop { - let target = self.target.0.read().await; - + let (host, ping_result) = { + let target = self.target.0.read().await; + let host = target.get_preferred_host()?.to_string(); + event!( + Level::INFO, + status = "attempting", + host = host + ); + (host, target.ping(ctx.modifiers).await) + }; + - event!( - Level::INFO, - status = "attempting", - host = target.get_preferred_host()?.to_string() - ); - - if target.ping(ctx.modifiers).await.is_ok() { + if ping_result.is_ok() { event!( Level::INFO, status = "success", - host = target.get_preferred_host()?.to_string() + host = host ); return Ok(()); } // ? will take us out if we ran out of hosts event!( Level::WARN, status = "failed to ping", - host = target.get_preferred_host()?.to_string() + host = host ); - self.target.0.write().await.host_failed(); } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/steps/ping.rs` around lines 25 - 55, The read lock is held across the async network call in Ping::execute (self.target.0.read() -> target.ping()), causing contention; instead, inside the read lock clone or extract only the owned pieces needed (e.g., let preferred_host = target.get_preferred_host()?.to_string() and let ping_target = target.clone() or otherwise create an owned handle that implements ping), drop the read guard before awaiting ping(ping_target, ctx.modifiers), use preferred_host for event! logs, and when ping fails reacquire the write lock to call self.target.0.write().await.host_failed(); ensure get_preferred_host(), ping(), and host_failed() usages are updated to operate on the owned/cloned values and locks are not held across await points.crates/core/src/hive/steps/activate.rs (1)
30-48: Read lock held across multiple network operations.The
RwLockread guard is acquired at line 31 and held through all ping retry attempts. This blocks any concurrent write operations on the target for the duration of potentially slow network timeouts, which could cause contention if other tasks need to modify the target state.Consider releasing and re-acquiring the lock between attempts:
♻️ Suggested refactor to reduce lock contention
async fn wait_for_ping(target: &SharedTarget, ctx: &Context) -> Result<(), HiveLibError> { - let target = target.0.read().await; - let host = target.get_preferred_host()?; - let mut result = target.ping(ctx.modifiers).await; + let host = { + let target = target.0.read().await; + target.get_preferred_host()?.clone() + }; + let mut result = target.0.read().await.ping(ctx.modifiers).await; for num in 0..2 { warn!("Trying to ping {host} (attempt {}/3)", num + 1); - result = target.ping(ctx.modifiers).await; + result = target.0.read().await.ping(ctx.modifiers).await; if result.is_ok() { info!("Regained connection to {} via {host}", ctx.name); break; } } result }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/steps/activate.rs` around lines 30 - 48, The read guard on `target` is held across await points in wait_for_ping, causing lock contention; fix by not holding the RwLock across network calls: acquire a short-lived read lock to obtain the preferred host (and any small clonable data you need), then drop the guard before entering the ping loop; inside the loop, re-acquire a read lock only immediately before invoking `target.ping(...)` (or, if `ping` can be called via a SharedTarget method that acquires its own lock, use that) so each ping attempt uses a fresh, brief read lock rather than holding it for the entire retry sequence.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@crates/core/src/hive/executor.rs`:
- Around line 59-70: Replace the panicking debug_assert_matches! and
debug_assert! checks on Span::current() with non-panicking runtime checks that
avoid unwraps and log warnings instead; specifically, call
Span::current().metadata() into an Option, if Some(metadata) check
metadata.name() != "execute" and use tracing::warn! to warn if it differs, and
check metadata.fields().field("node").is_none() and warn if missing, and if
metadata() is None warn that span metadata is unavailable—remove the
debug_assert_matches!, debug_assert!, and any .unwrap()s so the code no longer
panics in debug builds.
In `@crates/core/src/hive/node.rs`:
- Around line 101-126: The ping method currently builds a CommandArguments with
mode(crate::commands::ChildOutputMode::Interactive) which can trigger
terminal/TTY handling; change it to
mode(crate::commands::ChildOutputMode::Generic) in the call to
CommandArguments::new inside ping (the function named ping on the node type) so
the run_command invocation uses Generic output mode for a non-interactive
connectivity check; update the single occurrence of ChildOutputMode::Interactive
to ChildOutputMode::Generic (refer to run_command, CommandArguments::new, and
CommandStringBuilder usage in ping).
In `@crates/core/src/hive/plan.rs`:
- Around line 302-320: The TODO and large commented block inside the assertion
for plan.steps (which references Evaluate.into(), Build { target: None }.into(),
and the commented Ping, PushKeyAgent, Keys, PushBuildOutput,
SwitchToConfiguration entries) should be resolved: either remove the dead
comments and the TODO to clean up the test, or restore/add explicit test cases
that exercise the missing step variants (Ping, PushKeyAgent, Keys with
UploadKeyAt::PostActivation, evaluate/build/push steps, SwitchToConfiguration)
and update the assert_eq! on plan.steps to include those variants; locate the
assertion around plan.steps in the test that constructs the plan and choose one
approach (clean deletion of comments and TODO, or concrete test additions that
construct and assert the extra step variants) and implement accordingly.
- Around line 62-70: Replace the catch-all _ => unreachable!() in the match on
ApplyGoal with exhaustive match arms for each ApplyGoal variant (e.g.,
explicitly handle
ApplyGoal::SwitchToConfiguration(SwitchToConfigurationGoal::Prepare) and any
other ApplyGoal variants besides Keys and SwitchToConfiguration(...Switch)),
returning appropriate (pre_keys, post_keys) values or, for genuinely impossible
branches, call panic! with a clear message; update the match around
node.keys.clone().into_iter().partition(...) so every ApplyGoal variant is
listed explicitly instead of using unreachable!(), referencing ApplyGoal,
SwitchToConfigurationGoal, and the pre_keys/post_keys binding to locate the
code.
In `@crates/core/src/hive/steps/activate.rs`:
- Around line 30-48: The read guard on `target` is held across await points in
wait_for_ping, causing lock contention; fix by not holding the RwLock across
network calls: acquire a short-lived read lock to obtain the preferred host (and
any small clonable data you need), then drop the guard before entering the ping
loop; inside the loop, re-acquire a read lock only immediately before invoking
`target.ping(...)` (or, if `ping` can be called via a SharedTarget method that
acquires its own lock, use that) so each ping attempt uses a fresh, brief read
lock rather than holding it for the entire retry sequence.
In `@crates/core/src/hive/steps/ping.rs`:
- Around line 25-55: The read lock is held across the async network call in
Ping::execute (self.target.0.read() -> target.ping()), causing contention;
instead, inside the read lock clone or extract only the owned pieces needed
(e.g., let preferred_host = target.get_preferred_host()?.to_string() and let
ping_target = target.clone() or otherwise create an owned handle that implements
ping), drop the read guard before awaiting ping(ping_target, ctx.modifiers), use
preferred_host for event! logs, and when ping fails reacquire the write lock to
call self.target.0.write().await.host_failed(); ensure get_preferred_host(),
ping(), and host_failed() usages are updated to operate on the owned/cloned
values and locks are not held across await points.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 01a55f1d-13f3-4a7b-880f-728f53d71cc4
📒 Files selected for processing (20)
CHANGELOG.mdcrates/cli/src/apply.rscrates/cli/src/cli.rscrates/cli/src/main.rscrates/core/src/commands/common.rscrates/core/src/commands/mod.rscrates/core/src/commands/noninteractive.rscrates/core/src/commands/pty/mod.rscrates/core/src/hive/executor.rscrates/core/src/hive/mod.rscrates/core/src/hive/node.rscrates/core/src/hive/plan.rscrates/core/src/hive/steps/activate.rscrates/core/src/hive/steps/build.rscrates/core/src/hive/steps/cleanup.rscrates/core/src/hive/steps/evaluate.rscrates/core/src/hive/steps/keys.rscrates/core/src/hive/steps/mod.rscrates/core/src/hive/steps/ping.rscrates/core/src/hive/steps/push.rs
💤 Files with no reviewable changes (2)
- crates/core/src/hive/steps/mod.rs
- crates/core/src/hive/steps/cleanup.rs
There was a problem hiding this comment.
🧹 Nitpick comments (5)
crates/core/src/hive/steps/activate.rs (2)
150-162: Minor:get_preferred_host()called twice in error path.
target.get_preferred_host()?is called at both line 155 (for logging) and line 160 (for error construction). Consider extracting to a variable.♻️ Suggested refactor
let target = target.0.read().await; + let host = target.get_preferred_host()?; error!( "Failed to get regain connection to {name} via {host} after reboot.", name = ctx.name, - host = target.get_preferred_host()? + host = host ); return Err(HiveLibError::NetworkError( NetworkError::HostUnreachableAfterReboot( - target.get_preferred_host()?.to_string(), + host.to_string(), ), ));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/steps/activate.rs` around lines 150 - 162, Extract the preferred host into a local variable to avoid calling target.get_preferred_host()? twice: after acquiring target = target.0.read().await, let preferred = target.get_preferred_host()? (or similar name) and use preferred in the error! macro and when constructing HiveLibError::NetworkError(NetworkError::HostUnreachableAfterReboot(preferred.to_string())); this reduces duplicate calls and keeps behaviour identical.
30-47: Lock held for entire retry loop, unlikePingstep.The read lock is acquired once at line 31 and held for all 3 retry attempts. This differs from the
Pingstep inping.rswhich drops and reacquires the lock between attempts to allowhost_failed()mutations.This appears intentional since
wait_for_pingis used after reboot/activation to verify connectivity to a specific host, not to cycle through fallback hosts. A brief comment explaining this distinction would improve clarity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/steps/activate.rs` around lines 30 - 47, The read lock in wait_for_ping (acquired via target.0.read().await before calling get_preferred_host) is intentionally held across all retry attempts, unlike the Ping step in ping.rs which drops/reacquires the lock to allow host_failed() to mutate hosts; add a concise comment above the lock acquisition in wait_for_ping explaining that we intentionally lock the target for the entire retry loop because this step verifies connectivity to a specific preferred host after reboot/activation (so we must not allow host list mutations or fallback cycling during retries), and reference get_preferred_host and host_failed to clarify the difference in behavior from ping.rs.crates/core/src/hive/executor.rs (2)
63-74: Debug assertions for span validation are runtime-only.These debug assertions validate that the span has the expected name and fields, but they only run in debug builds. If
cli/tracing_setup.rsis modified incorrectly, the error would only surface during development, not in CI release builds.Consider adding an integration test that validates the span configuration, or documenting the coupling more explicitly in both files.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/executor.rs` around lines 63 - 74, The current checks using debug_assert_matches! and debug_assert! on Span::current().metadata().unwrap().name() and .fields().field("node") only run in debug builds; replace or supplement them with a non-debug runtime assertion (e.g., explicit if-check that panics with a clear message) so CI/release builds also validate the span, or add an integration test that exercises Span::current() metadata and asserts the span name equals "execute" and that fields().field("node") is Some; update or reference the same symbols (Span::current(), metadata(), name(), fields().field("node")) in the test or runtime check and include a clear error message that points to the tracing setup coupling.
60-86: Document the invariant betweengreedy_evaluateandEvaluatestep inclusion.The design correctly prevents the unused channel sender issue: when
greedy_evaluateis false (forApplyGoal::Keys), theEvaluatestep is always excluded from the plan (seeplan.rsline 153-155). All other goals setgreedy_evaluate: trueand include theEvaluatestep. Adding a comment inexecutor.rsexplaining this invariant would improve code clarity for future maintainers.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/executor.rs` around lines 60 - 86, Add a short comment in executor.rs near the creation of the oneshot channel/tx+rx and the conditional tokio::spawn call (around evaluate_task and plan.greedy_evaluate) that documents the invariant: when plan.greedy_evaluate is false the Evaluate step is omitted from the plan (so the tx sender is deliberately unused), and when greedy_evaluate is true Evaluate is included and evaluate_task will consume tx; reference the symbols greedy_evaluate, Evaluate (plan step), evaluate_task, and the tx/rx channel to make clear why the unused-sender case is safe and intentional.crates/core/src/hive/plan.rs (1)
308-322: TODO comment indicates potentially lost test coverage.The commented code block mentions steps that were previously tested but may no longer be covered. Consider addressing this TODO or removing the commented code if it's no longer relevant.
Would you like me to open an issue to track adding test coverage for the commented scenarios, or help identify what specific coverage may have been lost?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/hive/plan.rs` around lines 308 - 322, The TODO/commented block in plan.rs references previously-tested step variants (Build, Ping, PushKeyAgent, Keys, crate::hive::steps::evaluate::Evaluate, crate::hive::steps::build::Build, crate::hive::steps::push::PushBuildOutput, SwitchToConfiguration) and should be resolved: either remove the dead commented code and the TODO, or replace the TODO with a concrete tracking action (create a test-coverage issue and reference its ID in the comment). To fix, decide which variants still need coverage, add unit/integration tests exercising those enum variants (e.g., Build, Ping, PushKeyAgent, Keys, Evaluate, PushBuildOutput, SwitchToConfiguration) and run cargo test to verify, or if they are obsolete, delete the commented examples and the TODO line to avoid misleading future readers; if you choose to track it, update the TODO to include the new issue number and a short note of which variants need tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@crates/core/src/hive/executor.rs`:
- Around line 63-74: The current checks using debug_assert_matches! and
debug_assert! on Span::current().metadata().unwrap().name() and
.fields().field("node") only run in debug builds; replace or supplement them
with a non-debug runtime assertion (e.g., explicit if-check that panics with a
clear message) so CI/release builds also validate the span, or add an
integration test that exercises Span::current() metadata and asserts the span
name equals "execute" and that fields().field("node") is Some; update or
reference the same symbols (Span::current(), metadata(), name(),
fields().field("node")) in the test or runtime check and include a clear error
message that points to the tracing setup coupling.
- Around line 60-86: Add a short comment in executor.rs near the creation of the
oneshot channel/tx+rx and the conditional tokio::spawn call (around
evaluate_task and plan.greedy_evaluate) that documents the invariant: when
plan.greedy_evaluate is false the Evaluate step is omitted from the plan (so the
tx sender is deliberately unused), and when greedy_evaluate is true Evaluate is
included and evaluate_task will consume tx; reference the symbols
greedy_evaluate, Evaluate (plan step), evaluate_task, and the tx/rx channel to
make clear why the unused-sender case is safe and intentional.
In `@crates/core/src/hive/plan.rs`:
- Around line 308-322: The TODO/commented block in plan.rs references
previously-tested step variants (Build, Ping, PushKeyAgent, Keys,
crate::hive::steps::evaluate::Evaluate, crate::hive::steps::build::Build,
crate::hive::steps::push::PushBuildOutput, SwitchToConfiguration) and should be
resolved: either remove the dead commented code and the TODO, or replace the
TODO with a concrete tracking action (create a test-coverage issue and reference
its ID in the comment). To fix, decide which variants still need coverage, add
unit/integration tests exercising those enum variants (e.g., Build, Ping,
PushKeyAgent, Keys, Evaluate, PushBuildOutput, SwitchToConfiguration) and run
cargo test to verify, or if they are obsolete, delete the commented examples and
the TODO line to avoid misleading future readers; if you choose to track it,
update the TODO to include the new issue number and a short note of which
variants need tests.
In `@crates/core/src/hive/steps/activate.rs`:
- Around line 150-162: Extract the preferred host into a local variable to avoid
calling target.get_preferred_host()? twice: after acquiring target =
target.0.read().await, let preferred = target.get_preferred_host()? (or similar
name) and use preferred in the error! macro and when constructing
HiveLibError::NetworkError(NetworkError::HostUnreachableAfterReboot(preferred.to_string()));
this reduces duplicate calls and keeps behaviour identical.
- Around line 30-47: The read lock in wait_for_ping (acquired via
target.0.read().await before calling get_preferred_host) is intentionally held
across all retry attempts, unlike the Ping step in ping.rs which
drops/reacquires the lock to allow host_failed() to mutate hosts; add a concise
comment above the lock acquisition in wait_for_ping explaining that we
intentionally lock the target for the entire retry loop because this step
verifies connectivity to a specific preferred host after reboot/activation (so
we must not allow host list mutations or fallback cycling during retries), and
reference get_preferred_host and host_failed to clarify the difference in
behavior from ping.rs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b533ed73-c38a-4ef3-b359-aef6b59cb55f
📒 Files selected for processing (8)
crates/cli/src/main.rscrates/core/src/commands/noninteractive.rscrates/core/src/commands/pty/mod.rscrates/core/src/hive/executor.rscrates/core/src/hive/node.rscrates/core/src/hive/plan.rscrates/core/src/hive/steps/activate.rscrates/core/src/hive/steps/ping.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/core/src/commands/noninteractive.rs
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/core/src/commands/mod.rs (1)
46-53:⚠️ Potential issue | 🟠 MajorPreserve the escalation command as argv, not a joined shell string.
Line 93 collapses structured args into one string, so values containing spaces or shell metacharacters will be re-split incorrectly when this is later interpolated into
sh -c .... An empty slice also becomesSome(""), which makes the command look elevated even though there is no executable escalation command.Suggested direction
pub(crate) struct CommandArguments<S: AsRef<str>> { modifiers: SubCommandModifiers, target: Option<SharedTarget>, output_mode: ChildOutputMode, command_string: S, keep_stdin_open: bool, - privilege_escalation_command: Option<String>, + privilege_escalation_command: Option<Vec<Arc<str>>>, log_stdout: bool, } @@ pub(crate) fn privileged(mut self, escalation_command: &[Arc<str>]) -> Self { - self.privilege_escalation_command = Some(escalation_command.iter().join(" ")); + self.privilege_escalation_command = + (!escalation_command.is_empty()).then(|| escalation_command.to_vec()); self }Then update the interactive/non-interactive builders to append those args directly, or shell-escape each token at the final boundary if a shell string is unavoidable.
Also applies to: 92-94
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/commands/mod.rs` around lines 46 - 53, The privilege_escalation_command is being stored as a single joined shell string which loses token boundaries and turns empty slices into Some(""), so change CommandArguments (and related builders) to store the escalation command as Vec<String> (or Option<Vec<String>>) instead of Option<String> (refer to the struct CommandArguments and the field privilege_escalation_command), preserve the original argv tokens when setting this field in the interactive/non-interactive builders, and update any code that consumes privilege_escalation_command to append the tokens directly to command argument lists (or perform shell-escaping at the last boundary) rather than interpolating a joined string into sh -c.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@crates/core/src/commands/mod.rs`:
- Around line 46-53: The privilege_escalation_command is being stored as a
single joined shell string which loses token boundaries and turns empty slices
into Some(""), so change CommandArguments (and related builders) to store the
escalation command as Vec<String> (or Option<Vec<String>>) instead of
Option<String> (refer to the struct CommandArguments and the field
privilege_escalation_command), preserve the original argv tokens when setting
this field in the interactive/non-interactive builders, and update any code that
consumes privilege_escalation_command to append the tokens directly to command
argument lists (or perform shell-escaping at the last boundary) rather than
interpolating a joined string into sh -c.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5b28aada-c6bf-47e6-9240-694801c6efc6
📒 Files selected for processing (2)
crates/core/src/commands/mod.rscrates/core/src/commands/noninteractive.rs
re #392
Summary by CodeRabbit