Skip to content

fix: Files panel hangs on "Loading remote files..." indefinitely#461

Merged
pedramamini merged 1 commit intoRunMaestro:mainfrom
croakingtoad:fix/remote-file-tree-loading-hang
Feb 25, 2026
Merged

fix: Files panel hangs on "Loading remote files..." indefinitely#461
pedramamini merged 1 commit intoRunMaestro:mainfrom
croakingtoad:fix/remote-file-tree-loading-hang

Conversation

@croakingtoad
Copy link
Copy Markdown

@croakingtoad croakingtoad commented Feb 25, 2026

Problem

The Files right panel in SSH remote sessions gets permanently stuck showing "Loading remote files..." with a spinner and a stale progress count (e.g., "25 files in 7 folders"), but never actually renders the file tree. This happens especially after extended Maestro usage (hours), making the Files panel effectively non-functional for remote sessions.

image

Root Cause Analysis

I traced the issue through three interacting code paths:

1. No timeout on SSH commands — the primary cause

loadFileTree() (fileExplorer.ts) walks the remote directory tree by making one SSH call per directory via window.maestro.fs.readDir. Each call flows through:

loadFileTreeRecursive (per directory, sequential for-of loop)
  → IPC: fs:readDir
    → readDirRemote (remote-fs.ts)
      → execRemoteCommand
        → execFileNoThrow('ssh', args)  ← NO TIMEOUT

execFileNoThrow (execFile.ts) calls Node's execFile without a timeout option. If any single SSH command hangs (stale ControlMaster socket, keepalive failure, network degradation after hours of use), the promise never resolves. Since directories are processed sequentially in a for...of loop, one hung SSH call blocks the entire tree walk forever.

The progress callback fires after each directory completes, which is why users see "25 files in 7 folders" — that's the count from the last directory that succeeded before the hang. fileTreeLoading stays true permanently, so the UI never transitions from the spinner to the file list.

The overall Promise.all([loadFileTree, directorySize]) in useFileTreeManagement.ts also cannot resolve because even if directorySize (a single du/find command) completes, Promise.all waits for both.

2. Stale closure in async callbacks

The useEffect that triggers the initial file tree load captures activeSessionId from its closure. The .then() and .catch() callbacks use this captured value to identify which session to update. If the user switches sessions while loading is in progress, the callbacks fire with the old activeSessionId — which can cause the completion state to target the wrong session, leaving the original session's fileTreeLoading permanently stuck at true.

3. Timed-out SSH commands weren't retried

Even after adding timeouts, Node kills the process with an ETIMEDOUT error code. This pattern wasn't in the RECOVERABLE_SSH_ERRORS list, so timed-out commands would fail without retry rather than being retried with the existing exponential backoff logic.

Fixes

Fix 1: Add 30s timeout to SSH commands

Files: src/main/utils/execFile.ts, src/main/utils/remote-fs.ts

  • Added timeout option to ExecOptions interface and pass it through to Node's execFile
  • Set SSH_COMMAND_TIMEOUT_MS = 30000 (30 seconds) for all SSH remote filesystem commands
  • This ensures hung SSH connections are killed and surfaced as errors rather than blocking forever

Fix 2: Fix stale closure in async callbacks

File: src/renderer/hooks/git/useFileTreeManagement.ts

  • Capture session.id at the start of the effect into a local const sessionId
  • Use sessionId (stable, bound at effect start) instead of activeSessionId (from closure, can change) in all .then(), .catch(), and progress callbacks
  • This ensures the loading completion always targets the correct session regardless of whether the user switched sessions during the load

Fix 3: Add ETIMEDOUT to recoverable SSH errors

File: src/main/utils/remote-fs.ts

  • Added /ETIMEDOUT/i pattern to RECOVERABLE_SSH_ERRORS
  • When a timeout kills an SSH process, the existing retry logic now kicks in with exponential backoff (up to 3 retries)
  • If all retries fail, the error propagates to the .catch() handler which sets fileTreeError and schedules a 20s retry — so the user sees an error message instead of an infinite spinner

Testing

  • All 68 existing tests pass (execFile.test.ts: 23 tests, remote-fs.test.ts: 45 tests)
  • No new TypeScript errors introduced (pre-existing errors in src/prompts/index.ts are unrelated)

User Impact

Before: Files panel shows "Loading remote files..." forever after SSH connection degrades (common after hours of use). Only workaround was restarting Maestro.

After: If an SSH command hangs, it times out after 30s, retries up to 3 times with backoff, and either succeeds on retry or shows a clear error with automatic retry after 20s. Session switching during loading no longer corrupts loading state.

Summary by CodeRabbit

  • Bug Fixes

    • SSH commands now timeout after 30 seconds and are retried when timeouts occur.
    • File-tree loading fixes prevent stale session state and ensure progress/errors are tied to the correct load session.
    • Recursive directory reads now skip unreadable subfolders instead of aborting the entire tree load.
  • New Features

    • Optional configurable process timeouts: external commands can be killed after a set time, preserve partial output, and report timeout status.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Feb 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds optional process timeouts to exec utilities and applies a 30s timeout to SSH commands (treated as recoverable); preserves stdout and marks timed-out child processes with exitCode 'ETIMEDOUT'. Also fixes stale-closure session handling in file-tree loading and makes directory-size/stat errors non-fatal; unreadable subdirectories are skipped during recursive reads.

Changes

Cohort / File(s) Summary
Process execution utils
src/main/utils/execFile.ts
Add timeout?: number to ExecOptions; propagate timeout through execFileNoThrow and execFileWithInput; implement manual timeout for spawn-based flows (kill child, set exitCode 'ETIMEDOUT', append timeout message to stderr, preserve stdout; ensure timer cleanup).
SSH remote FS
src/main/utils/remote-fs.ts
Introduce SSH_COMMAND_TIMEOUT_MS = 30000; pass timeout into execFileNoThrow; add 'ETIMEDOUT' to RECOVERABLE_SSH_ERRORS and update retry logic to treat timeouts as recoverable while preserving exponential backoff.
File-tree session handling
src/renderer/hooks/git/useFileTreeManagement.ts
Capture a stable sessionId at load start and replace activeSessionId closures with the captured sessionId across initial load, progress, completion, stats-migration, and error callbacks; make directorySize/stat fetching non-fatal and apply stats only when available.
File-tree traversal hardening
src/renderer/utils/fileExplorer.ts
Wrap recursive child-directory reads with try/catch so unreadable subdirectories (permissions, broken symlinks, SSH namespaces) are skipped and represented as empty folders instead of aborting the full traversal.

Sequence Diagram(s)

sequenceDiagram
    participant UI as UI / Caller
    participant RemoteFS as remote-fs.execSsh
    participant ExecUtil as execFileNoThrow / execFileWithInput
    participant Timer as TimeoutHandler
    participant Retry as RetryLogic

    UI->>RemoteFS: request SSH command
    RemoteFS->>ExecUtil: exec SSH cmd with { timeout: 30s }
    ExecUtil->>Timer: start 30s timer

    alt command finishes before timeout
        ExecUtil->>Timer: clear timer
        ExecUtil-->>RemoteFS: return stdout/stderr, exitCode
        RemoteFS-->>UI: success
    else timeout occurs
        Timer->>ExecUtil: trigger timeout -> mark ETIMEDOUT
        ExecUtil->>ExecUtil: kill child, append timeout message to stderr, preserve stdout
        ExecUtil-->>RemoteFS: return exitCode = 'ETIMEDOUT'
        RemoteFS->>Retry: classify 'ETIMEDOUT' as recoverable
        Retry-->>RemoteFS: retry with backoff (or fail after attempts)
        RemoteFS-->>UI: final result
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main fix: addressing a bug where the Files panel hangs on 'Loading remote files...' indefinitely.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

@croakingtoad
Copy link
Copy Markdown
Author

I haven't been able to test this fix as I'm on Windows and the idea of having to build the executable gives me a headache. :)

@croakingtoad croakingtoad marked this pull request as ready for review February 25, 2026 17:20
@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Feb 25, 2026

Greptile Summary

Fixed Files panel hanging indefinitely on "Loading remote files..." by adding 30-second timeouts to SSH commands, fixing stale closure bug in async callbacks, and adding retry logic for timeout errors.

Key Changes:

  • Added timeout parameter to execFileNoThrow and propagated it to Node's execFile API
  • Set 30-second timeout on all SSH remote filesystem operations to prevent indefinite hangs
  • Added ETIMEDOUT to recoverable SSH error patterns for automatic retry with exponential backoff
  • Fixed stale closure in useFileTreeManagement.ts by capturing session.id at effect start instead of using activeSessionId from closure in async callbacks

Issues Found:

  • execFileWithInput function doesn't support the new timeout parameter - when input is provided alongside timeout, the timeout is silently ignored

Confidence Score: 4/5

  • Safe to merge with one logical bug: execFileWithInput doesn't support timeout parameter
  • The PR correctly addresses the root causes of the file tree hanging issue with well-tested timeout and retry logic. However, there's a gap: execFileWithInput (used when stdin input is provided) doesn't support the new timeout parameter, so timeouts would be silently ignored in that code path. This is unlikely to affect the Files panel use case but could cause issues if SSH commands with stdin input are added in the future.
  • Check src/main/utils/execFile.ts - the execFileWithInput function needs timeout support

Important Files Changed

Filename Overview
src/main/utils/execFile.ts Added timeout option to ExecOptions interface and properly propagated it through execFileNoThrow to Node's execFile. The backward compatibility handling for old vs. new signatures looks correct.
src/main/utils/remote-fs.ts Added 30-second timeout constant and passed it to all SSH commands via execFileNoThrow. Also added ETIMEDOUT pattern to recoverable errors list for retry logic. Changes are well-documented and consistent.
src/renderer/hooks/git/useFileTreeManagement.ts Fixed stale closure bug by capturing session.id at effect start and using it consistently in all async callbacks instead of activeSessionId. This prevents targeting wrong session when user switches during load.

Sequence Diagram

sequenceDiagram
    participant UI as Files Panel (React)
    participant Hook as useFileTreeManagement
    participant FileExp as loadFileTree
    participant IPC as fs:readDir (IPC)
    participant RemoteFS as readDirRemote
    participant Exec as execFileNoThrow
    participant SSH as ssh command

    UI->>Hook: Initial render / session change
    Hook->>Hook: Capture sessionId (fix #2)
    Hook->>Hook: Mark fileTreeLoading = true
    
    par Parallel file tree & stats
        Hook->>FileExp: loadFileTree(treeRoot)
        loop For each directory
            FileExp->>IPC: readDir(dirPath)
            IPC->>RemoteFS: readDirRemote(dirPath, sshConfig)
            RemoteFS->>Exec: execFileNoThrow('ssh', args, {timeout: 30000}) [fix #1]
            Exec->>SSH: Execute with 30s timeout
            
            alt SSH hangs > 30s
                SSH-->>Exec: ETIMEDOUT error
                Exec-->>RemoteFS: exitCode = ETIMEDOUT
                RemoteFS->>RemoteFS: Check RECOVERABLE_SSH_ERRORS [fix #3]
                RemoteFS->>RemoteFS: Retry with backoff (up to 3x)
                alt Retry succeeds
                    RemoteFS-->>IPC: Success
                else All retries fail
                    RemoteFS-->>IPC: Error
                end
            else SSH responds
                SSH-->>Exec: Success
                Exec-->>RemoteFS: exitCode = 0
                RemoteFS-->>IPC: Directory entries
            end
            
            IPC-->>FileExp: Results
            FileExp->>Hook: onProgress callback
            Hook->>Hook: Update progress (uses sessionId)
        end
        FileExp-->>Hook: Complete tree
    and
        Hook->>IPC: directorySize(treeRoot)
        IPC-->>Hook: Stats
    end
    
    Hook->>Hook: setSessions with sessionId [fix #2]
    Hook->>UI: fileTreeLoading = false, show tree
Loading

Last reviewed commit: 4d991de

Copy link
Copy Markdown

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Feb 25, 2026

Additional Comments (1)

src/main/utils/execFile.ts
execFileWithInput uses spawn instead of execFile and doesn't support the timeout parameter. When input is provided with a timeout, the timeout will be silently ignored.

async function execFileWithInput(
	command: string,
	args: string[],
	cwd: string | undefined,
	input: string,
	timeout?: number
): Promise<ExecResult> {
	return new Promise((resolve) => {
		const isWindows = process.platform === 'win32';
		const useShell = isWindows && needsWindowsShell(command);

		const child = spawn(command, args, {
			cwd,
			shell: useShell,
			stdio: ['pipe', 'pipe', 'pipe'],
			timeout,
		});

		let stdout = '';
		let stderr = '';

		child.stdout?.on('data', (data) => {
			stdout += data.toString();
		});

		child.stderr?.on('data', (data) => {
			stderr += data.toString();
		});

		child.on('close', (code) => {
			resolve({
				stdout,
				stderr,
				exitCode: code ?? 1,
			});
		});

		child.on('error', (err) => {
			resolve({
				stdout: '',
				stderr: err.message,
				exitCode: (err as any).code ?? 1,
			});
		});

		// Write input to stdin and close it
		if (child.stdin) {
			child.stdin.write(input);
			child.stdin.end();
		}
	});
}

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/main/utils/remote-fs.ts (1)

116-121: SSH_COMMAND_TIMEOUT_MS is declared ~50 lines after it's first used in defaultDeps.

This is safe at runtime (the reference is inside a function body evaluated only when called, by which point the module is fully initialized), but the forward reference is confusing and can mislead readers into thinking a TDZ error is possible. Moving the constant above defaultDeps aligns with convention and eliminates the uncertainty.

♻️ Suggested re-ordering
+/**
+ * Timeout for individual SSH commands in milliseconds.
+ * Prevents hung SSH connections (e.g., stale ControlMaster sockets)
+ * from blocking the file tree load indefinitely.
+ */
+const SSH_COMMAND_TIMEOUT_MS = 30000;
+
 /**
  * Default dependencies using real implementations.
  */
 const defaultDeps: RemoteFsDeps = {
 	execSsh: (command: string, args: string[]): Promise<ExecResult> => {
 		return execFileNoThrow(command, args, undefined, { timeout: SSH_COMMAND_TIMEOUT_MS });
 	},
 	...
 };
 
 ...
 
-/**
- * Timeout for individual SSH commands in milliseconds.
- * ...
- */
-const SSH_COMMAND_TIMEOUT_MS = 30000;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/utils/remote-fs.ts` around lines 116 - 121, Move the
SSH_COMMAND_TIMEOUT_MS constant definition above the defaultDeps declaration to
eliminate the forward reference; specifically, relocate the const
SSH_COMMAND_TIMEOUT_MS = 30000 so it appears before the defaultDeps
function/object, ensuring defaultDeps can reference SSH_COMMAND_TIMEOUT_MS
without a confusing forward reference while keeping behavior unchanged.
src/main/utils/execFile.ts (1)

109-112: timeout is silently dropped when input is also provided.

Lines 110–112 forward to execFileWithInput without the timeout, meaning { input: "...", timeout: 30000 } silently ignores the timeout. This doesn't affect the SSH use-case (no input there), but the API is misleading.

If execFileWithInput won't support timeout, consider documenting the limitation on the timeout field in ExecOptions, or add a guard:

♻️ Minimal guard option
 	// If input is provided, use spawn instead of execFile to write to stdin
 	if (input !== undefined) {
+		// Note: timeout is not supported when input is provided
 		return execFileWithInput(command, args, cwd, input);
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/utils/execFile.ts` around lines 109 - 112, The branch that routes to
execFileWithInput currently drops the timeout from ExecOptions when input is
provided; update the logic so timeout is respected: either pass the timeout
through to execFileWithInput (modify execFileWithInput signature to accept and
honor a timeout option) or explicitly guard and throw/document that
ExecOptions.timeout is incompatible with input; specifically change the
conditional handling around execFile(input) to forward the full ExecOptions
(including timeout) to execFileWithInput or validate and error when both input
and timeout are set, referencing execFileWithInput, execFile and the
ExecOptions.timeout field.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/utils/remote-fs.ts`:
- Line 97: The retry logic never triggers for Node.js timeouts because
execFileNoThrow surfaces timeouts as error.code === 'ETIMEDOUT' (stored in
result.exitCode) while isRecoverableSshError only inspects stdout/stderr for
/ETIMEDOUT/i; update execRemoteCommand to treat a string exitCode 'ETIMEDOUT' as
recoverable by extending the recoverable check (either inside
isRecoverableSshError or just before retry logic in execRemoteCommand) to return
true when result.exitCode === 'ETIMEDOUT' (also reference execFileNoThrow as the
origin of the string exitCode).

In `@src/renderer/hooks/git/useFileTreeManagement.ts`:
- Around line 372-375: The stats-migration useEffect is still closing over
activeSessionId causing directorySize results to be applied to the wrong
session; fix it by capturing the stable session id into a local const (e.g.,
const sessionId = session.id) at the top of that effect and replace occurrences
of activeSessionId inside the .then()/.catch() callbacks and prev.map comparator
(the s.id === ... check) with that sessionId so the migration writes stats to
the session that initiated the request (also ensure any onProgress or async
callbacks in that effect use sessionId rather than activeSessionId).

---

Nitpick comments:
In `@src/main/utils/execFile.ts`:
- Around line 109-112: The branch that routes to execFileWithInput currently
drops the timeout from ExecOptions when input is provided; update the logic so
timeout is respected: either pass the timeout through to execFileWithInput
(modify execFileWithInput signature to accept and honor a timeout option) or
explicitly guard and throw/document that ExecOptions.timeout is incompatible
with input; specifically change the conditional handling around execFile(input)
to forward the full ExecOptions (including timeout) to execFileWithInput or
validate and error when both input and timeout are set, referencing
execFileWithInput, execFile and the ExecOptions.timeout field.

In `@src/main/utils/remote-fs.ts`:
- Around line 116-121: Move the SSH_COMMAND_TIMEOUT_MS constant definition above
the defaultDeps declaration to eliminate the forward reference; specifically,
relocate the const SSH_COMMAND_TIMEOUT_MS = 30000 so it appears before the
defaultDeps function/object, ensuring defaultDeps can reference
SSH_COMMAND_TIMEOUT_MS without a confusing forward reference while keeping
behavior unchanged.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4b01abc and 4d991de.

📒 Files selected for processing (3)
  • src/main/utils/execFile.ts
  • src/main/utils/remote-fs.ts
  • src/renderer/hooks/git/useFileTreeManagement.ts

Comment thread src/main/utils/remote-fs.ts
Comment thread src/renderer/hooks/git/useFileTreeManagement.ts
@croakingtoad croakingtoad force-pushed the fix/remote-file-tree-loading-hang branch from 4d991de to 0a0f579 Compare February 25, 2026 17:34
@croakingtoad
Copy link
Copy Markdown
Author

Update: Fixed a gap in the initial implementation where execFileWithInput (the spawn-based code path used when stdin input is provided) did not support the new timeout parameter — timeout was silently ignored when both input and timeout were specified.

Since spawn() doesn't have a built-in timeout option like execFile, I implemented it manually with setTimeout + child.kill(). When the timeout fires, the process is killed and the result includes an ETIMEDOUT exit code and descriptive stderr message, consistent with how Node's execFile timeout behaves on the execFileAsync path.

The timer is also properly cleaned up in both the close and error handlers to prevent leaks.

@croakingtoad croakingtoad force-pushed the fix/remote-file-tree-loading-hang branch from 0a0f579 to 90c0b98 Compare February 25, 2026 17:39
@croakingtoad
Copy link
Copy Markdown
Author

Update: Addressed both CodeRabbit findings:

1. ETIMEDOUT retry not firing (remote-fs.ts) — Good catch. execFileAsync puts 'ETIMEDOUT' in error.code which surfaces as result.exitCode, not in stderr. The /ETIMEDOUT/i regex only checked stderr/stdout, so it never matched for Node.js-triggered timeouts. Fixed by adding an explicit result.exitCode === 'ETIMEDOUT' check in execRemoteCommand alongside the existing stderr pattern matching:

const isNodeTimeout = result.exitCode === 'ETIMEDOUT';
if ((isRecoverableSshError(combinedOutput) || isNodeTimeout) && attempt < maxRetries) {

I've kept the /ETIMEDOUT/i regex in the patterns list as well — it covers the case where the SSH client itself writes "ETIMEDOUT" to stderr (socket-level timeout), which is a distinct scenario from Node killing the process.

2. Stale closure in stats migration effect (useFileTreeManagement.ts) — Also valid. The stats migration useEffect (line ~487-531) had the same activeSessionId stale closure issue in its .then() callback. Applied the same fix: capture session.id into a local const sessionId and use it in the async callbacks.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main/utils/execFile.ts (1)

120-127: ⚠️ Potential issue | 🟠 Major

Non-stdin timeout path does not normalize ETIMEDOUT errors, breaking retry classification.

execFileAsync (used when no stdin input) does not emit ETIMEDOUT on timeout. Unlike execFileWithInput which properly detects error.killed and sets exitCode: 'ETIMEDOUT', the catch block at lines 134-141 returns exitCode: error.code ?? 1 — which becomes 1 when timeout occurs (since error.killed=true but error.code=null). This prevents retry logic in remote-fs.ts from recognizing timeouts via the RECOVERABLE_SSH_ERRORS /ETIMEDOUT/i pattern.

🐛 Proposed fix
 	} catch (error: any) {
+		const isTimedOut = timeout !== undefined && timeout > 0 && error?.killed === true;
 		// execFile throws on non-zero exit codes
 		// Use ?? instead of || to correctly handle exit code 0 (which is falsy but valid)
 		return {
 			stdout: error.stdout || '',
-			stderr: error.stderr || error.message || '',
-			exitCode: error.code ?? 1,
+			stderr: isTimedOut
+				? `${error.stderr || ''}\nETIMEDOUT: process timed out after ${timeout}ms`
+				: error.stderr || error.message || '',
+			exitCode: isTimedOut ? 'ETIMEDOUT' : (error.code ?? 1),
 		};
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/utils/execFile.ts` around lines 120 - 127, The non-stdin execution
path in execFileAsync does not normalize timeouts to 'ETIMEDOUT', so update the
catch handling in the function that calls execFileAsync to detect when
error.killed is true (timeout) and map exitCode to 'ETIMEDOUT' (or error.code if
present) before returning, mirroring the behavior in execFileWithInput; ensure
the returned object from execFileAsync uses exitCode: error.killed &&
!error.code ? 'ETIMEDOUT' : (error.code ?? 1) so remote-fs.ts and its
RECOVERABLE_SSH_ERRORS /ETIMEDOUT/i pattern will correctly classify timeouts.
♻️ Duplicate comments (1)
src/renderer/hooks/git/useFileTreeManagement.ts (1)

508-512: ⚠️ Potential issue | 🟠 Major

Stats migration async callback still targets activeSessionId (race risk).

If the user switches sessions before directorySize resolves, stats can be written to the wrong session.

🐛 Proposed fix
 	useEffect(() => {
 		const session = sessions.find((s) => s.id === activeSessionId);
 		if (!session) return;
+		const sessionId = session.id;
 ...
 		window.maestro.fs
 			.directorySize(treeRoot, sshContext?.sshRemoteId)
 			.then((stats) => {
 				setSessions((prev) =>
 					prev.map((s) =>
-						s.id === activeSessionId
+						s.id === sessionId
 							? {
 									...s,
 									fileTreeStats: {
 ...
 			.catch((error) => {
 				logger.warn('Stats migration failed', 'FileTreeManagement', {
 					error: error?.message || 'Unknown error',
-					sessionId: activeSessionId,
+					sessionId,
 				});
 			});

Also applies to: 528-529

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/hooks/git/useFileTreeManagement.ts` around lines 508 - 512, The
async directorySize stats callback is closing over the mutable activeSessionId
and may update the wrong session if the user switches sessions; capture the
current session id into a local const (e.g., const targetSessionId =
activeSessionId) before calling directorySize or starting the async chain and
use that captured targetSessionId inside the .then() to match and update
sessions via setSessions (instead of referencing activeSessionId), and apply the
same change to the other occurrence that updates sessions (the block around the
directorySize .then() and the similar code at the other occurrence).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/utils/execFile.ts`:
- Around line 98-103: The current branch that treats options as ExecOptions uses
"'timeout' in options" which only checks existence; update the check and
assignment around options/ExecOptions/execOpts/input/timeout so timeout is
validated as a number before passing to execFile: ensure you cast options to
ExecOptions only after confirming typeof options === 'object' and either 'input'
in options or (typeof (options as ExecOptions).timeout === 'number' || (typeof
(options as ExecOptions).timeout === 'string' && Number.isFinite(Number((options
as ExecOptions).timeout)))), and if a string numeric timeout is allowed convert
it to a Number and reject/ignore non-numeric values (use Number.isFinite or
Number.isInteger as appropriate) so execFile always receives a numeric timeout.

---

Outside diff comments:
In `@src/main/utils/execFile.ts`:
- Around line 120-127: The non-stdin execution path in execFileAsync does not
normalize timeouts to 'ETIMEDOUT', so update the catch handling in the function
that calls execFileAsync to detect when error.killed is true (timeout) and map
exitCode to 'ETIMEDOUT' (or error.code if present) before returning, mirroring
the behavior in execFileWithInput; ensure the returned object from execFileAsync
uses exitCode: error.killed && !error.code ? 'ETIMEDOUT' : (error.code ?? 1) so
remote-fs.ts and its RECOVERABLE_SSH_ERRORS /ETIMEDOUT/i pattern will correctly
classify timeouts.

---

Duplicate comments:
In `@src/renderer/hooks/git/useFileTreeManagement.ts`:
- Around line 508-512: The async directorySize stats callback is closing over
the mutable activeSessionId and may update the wrong session if the user
switches sessions; capture the current session id into a local const (e.g.,
const targetSessionId = activeSessionId) before calling directorySize or
starting the async chain and use that captured targetSessionId inside the
.then() to match and update sessions via setSessions (instead of referencing
activeSessionId), and apply the same change to the other occurrence that updates
sessions (the block around the directorySize .then() and the similar code at the
other occurrence).

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4d991de and 0a0f579.

📒 Files selected for processing (3)
  • src/main/utils/execFile.ts
  • src/main/utils/remote-fs.ts
  • src/renderer/hooks/git/useFileTreeManagement.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/main/utils/remote-fs.ts

Comment on lines +98 to 103
if ('input' in options || 'timeout' in options) {
// New signature with ExecOptions
input = options.input;
const execOpts = options as ExecOptions;
input = execOpts.input;
timeout = execOpts.timeout;
} else {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine the execFile.ts file
fd -t f "execFile.ts" src/main/utils/

Repository: RunMaestro/Maestro

Length of output: 88


🏁 Script executed:

# Read the relevant section of the file around lines 98-103
head -n 150 src/main/utils/execFile.ts | tail -n +80

Repository: RunMaestro/Maestro

Length of output: 2040


🏁 Script executed:

# Search for all call sites of execFileNoThrow to understand how it's being called
rg -nP --type=ts "execFileNoThrow\s*\(" | head -20

Repository: RunMaestro/Maestro

Length of output: 2120


🏁 Script executed:

# Check for env objects being passed and whether they might contain timeout/input keys
rg -nP --type=ts "process\.env|{.*timeout.*}" src/main/ | head -30

Repository: RunMaestro/Maestro

Length of output: 2945


🏁 Script executed:

# Find the ExecOptions type definition
rg -nP --type=ts "type ExecOptions|interface ExecOptions" src/main/

Repository: RunMaestro/Maestro

Length of output: 121


🏁 Script executed:

# Look at the full context around the execFileNoThrow function signature and type definitions
head -n 85 src/main/utils/execFile.ts

Repository: RunMaestro/Maestro

Length of output: 2852


🏁 Script executed:

# Search for any env variable spreads or objects that might contain timeout or input
rg -nP --type=ts "(env\s*=|env:).*timeout|input.*env" src/main/

Repository: RunMaestro/Maestro

Length of output: 169


🏁 Script executed:

# Check if there are any env object spreads with computed properties
rg -nP --type=ts "\{.*\.\.\..*env.*\}" src/main/ | head -20

Repository: RunMaestro/Maestro

Length of output: 316


🏁 Script executed:

# Check the actual calls that pass env to execFileNoThrow
rg -nP -B2 -A2 "execFileNoThrow\([^)]*,\s*\[.*\],\s*undefined,\s*env" src/main/

Repository: RunMaestro/Maestro

Length of output: 2055


🏁 Script executed:

# Look at cliDetection.ts more carefully to see what env objects look like
sed -n '20,40p' src/main/utils/cliDetection.ts
sed -n '55,75p' src/main/utils/cliDetection.ts
sed -n '155,175p' src/main/utils/cliDetection.ts

Repository: RunMaestro/Maestro

Length of output: 2487


🏁 Script executed:

# Check if any code constructs env objects with timeout or input keys
rg -nP --type=ts "timeout\s*:|input\s*:" src/main/ | grep -v "//" | grep -v "ExecOptions" | head -20

Repository: RunMaestro/Maestro

Length of output: 1222


🏁 Script executed:

# Verify the type signature more carefully - are there any overloads?
sed -n '78,92p' src/main/utils/execFile.ts

Repository: RunMaestro/Maestro

Length of output: 685


🏁 Script executed:

# Check the getExpandedEnv implementation
rg -nP -A15 "export.*function getExpandedEnv|function getExpandedEnv" src/main/

Repository: RunMaestro/Maestro

Length of output: 2334


🏁 Script executed:

# Check if there are any env objects that might contain 'input' or 'timeout' keys
rg -nP --type=ts "process\.env\[.*timeout|process\.env\[.*input|timeout\s*=\s*process\.env|input\s*=\s*process\.env" src/main/

Repository: RunMaestro/Maestro

Length of output: 44


🏁 Script executed:

# Look at the actual implementation of the conditional more carefully and check the other path
sed -n '86,135p' src/main/utils/execFile.ts

Repository: RunMaestro/Maestro

Length of output: 1435


🏁 Script executed:

# Get more context on the execFileWithInput function and timeout handling
sed -n '145,200p' src/main/utils/execFile.ts

Repository: RunMaestro/Maestro

Length of output: 1409


🏁 Script executed:

# Check if there's any test coverage for the env vs ExecOptions discrimination
rg -nP --type=ts "execFileNoThrow.*env|execFileNoThrow.*timeout" src/main/__tests__/ src/main/**/__tests__/ 2>/dev/null | head -20

Repository: RunMaestro/Maestro

Length of output: 44


🏁 Script executed:

# Verify the type signature - check if TypeScript would catch misuse at compile time
rg -nP "ExecOptions | NodeJS.ProcessEnv" src/main/

Repository: RunMaestro/Maestro

Length of output: 1206


🏁 Script executed:

# Check if any code ever constructs env objects with 'input' or 'timeout' as literal keys
rg -nP --type=ts "{[^}]*input[^}]*}" src/main/ | grep -i "process\.env\|NodeJS.ProcessEnv\|getExpandedEnv" | head -10

Repository: RunMaestro/Maestro

Length of output: 44


🏁 Script executed:

# Look at the test cases to understand how execFileNoThrow is being tested
find src/main -name "*execFile*test*" -o -name "*test*execFile*"

Repository: RunMaestro/Maestro

Length of output: 44


🏁 Script executed:

# Check how the type union is actually being used in practice - are there TypeScript errors?
rg -nP --type=ts "execFileNoThrow" src/main/ | wc -l

Repository: RunMaestro/Maestro

Length of output: 65


🏁 Script executed:

# Check if there's any dynamic construction of env objects with computed keys
rg -nP --type=ts "\[.*timeout.*\]|\[.*input.*\].*=.*process\.env" src/main/

Repository: RunMaestro/Maestro

Length of output: 383


🏁 Script executed:

# Search for any tests of the execFileNoThrow with env parameter
rg -nP --type=ts "execFileNoThrow.*test|test.*execFileNoThrow" src/main/ -A5 -B5

Repository: RunMaestro/Maestro

Length of output: 44


🏁 Script executed:

# Check the actual remote-fs.ts call to see the exact usage pattern
sed -n '65,75p' src/main/utils/remote-fs.ts

Repository: RunMaestro/Maestro

Length of output: 429


🏁 Script executed:

# Look at git.ts:1406 where input is passed
sed -n '1400,1410p' src/main/ipc/handlers/git.ts

Repository: RunMaestro/Maestro

Length of output: 344


options shape check lacks type validation for the timeout property.

The condition 'timeout' in options only checks existence, not whether timeout is a number. If an env object with a non-numeric timeout key were passed, Node's execFile would receive an invalid type. Use type-safe checks instead.

Suggested fix
 	if (options) {
-		if ('input' in options || 'timeout' in options) {
+		const maybeExecOpts = options as ExecOptions;
+		const hasExecInput = maybeExecOpts.input !== undefined;
+		const hasExecTimeout = typeof maybeExecOpts.timeout === 'number';
+		if (hasExecInput || hasExecTimeout) {
 			// New signature with ExecOptions
-			const execOpts = options as ExecOptions;
-			input = execOpts.input;
-			timeout = execOpts.timeout;
+			input = maybeExecOpts.input;
+			timeout = maybeExecOpts.timeout;
 		} else {
 			// Old signature with just env
 			env = options as NodeJS.ProcessEnv;
 		}
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if ('input' in options || 'timeout' in options) {
// New signature with ExecOptions
input = options.input;
const execOpts = options as ExecOptions;
input = execOpts.input;
timeout = execOpts.timeout;
} else {
if (options) {
const maybeExecOpts = options as ExecOptions;
const hasExecInput = maybeExecOpts.input !== undefined;
const hasExecTimeout = typeof maybeExecOpts.timeout === 'number';
if (hasExecInput || hasExecTimeout) {
// New signature with ExecOptions
input = maybeExecOpts.input;
timeout = maybeExecOpts.timeout;
} else {
// Old signature with just env
env = options as NodeJS.ProcessEnv;
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/utils/execFile.ts` around lines 98 - 103, The current branch that
treats options as ExecOptions uses "'timeout' in options" which only checks
existence; update the check and assignment around
options/ExecOptions/execOpts/input/timeout so timeout is validated as a number
before passing to execFile: ensure you cast options to ExecOptions only after
confirming typeof options === 'object' and either 'input' in options or (typeof
(options as ExecOptions).timeout === 'number' || (typeof (options as
ExecOptions).timeout === 'string' && Number.isFinite(Number((options as
ExecOptions).timeout)))), and if a string numeric timeout is allowed convert it
to a Number and reject/ignore non-numeric values (use Number.isFinite or
Number.isInteger as appropriate) so execFile always receives a numeric timeout.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
src/renderer/hooks/git/useFileTreeManagement.ts (1)

501-534: Stale-closure fix correctly applied to stats migration effect.

sessionId is captured before the directorySize call, the .then() comparator uses it (line 514), and the warning log at line 531 surfaces the stable ID — matches the past review recommendation exactly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/hooks/git/useFileTreeManagement.ts` around lines 501 - 534, The
stats migration effect already correctly captures a stable sessionId before
calling window.maestro.fs.directorySize and uses it in the .then() updater and
logger.warn; no code changes are required to fix a stale-closure here—keep the
captured const sessionId and the existing setSessions update and catch block
as-is and remove/ignore the duplicate review comment.
src/main/utils/execFile.ts (1)

98-102: ⚠️ Potential issue | 🟡 Minor

'timeout' in options is not type-safe for the env vs. ExecOptions disambiguation.

An env object whose keys happen to include 'timeout' or 'input' (e.g., a custom NODE_ENV-style env map) would be silently mis-routed into the ExecOptions branch, ignoring the env and potentially breaking callers. Using value-type guards instead of key presence makes the intent explicit.

🛡️ Proposed fix
 	if (options) {
-		if ('input' in options || 'timeout' in options) {
-			// New signature with ExecOptions
-			const execOpts = options as ExecOptions;
-			input = execOpts.input;
-			timeout = execOpts.timeout;
+		const maybeExecOpts = options as ExecOptions;
+		const hasExecInput = maybeExecOpts.input !== undefined;
+		const hasExecTimeout = typeof maybeExecOpts.timeout === 'number';
+		if (hasExecInput || hasExecTimeout) {
+			// New signature with ExecOptions
+			input = maybeExecOpts.input;
+			timeout = maybeExecOpts.timeout;
 		} else {
 			// Old signature with just env
 			env = options as NodeJS.ProcessEnv;
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/utils/execFile.ts` around lines 98 - 102, The current disambiguation
uses "'input' in options || 'timeout' in options" which is not type-safe because
an env map could include those keys; update the branch in the execFile-related
logic (where options, ExecOptions, input, timeout, and env are handled) to use
value-type guards instead: check that options.input is a valid input type (e.g.,
typeof input === 'string' || Buffer.isBuffer(input) || input instanceof
Uint8Array) and that options.timeout is a number (typeof options.timeout ===
'number') before treating options as ExecOptions, otherwise treat it as env;
adjust any type assertions/casts accordingly so env objects with keys named
"input" or "timeout" are not misclassified.
🧹 Nitpick comments (1)
src/main/utils/remote-fs.ts (1)

116-121: Consider moving SSH_COMMAND_TIMEOUT_MS before its first use in defaultDeps.

SSH_COMMAND_TIMEOUT_MS is declared at line 121 but referenced inside the defaultDeps.execSsh arrow function body at line 71. This works at runtime (the function body is only executed after module initialization completes, so the const is initialized by then — no TDZ violation), but the forward reference to a const looks like a potential temporal dead zone bug to readers and may trigger some linter rules. Moving the constant to above defaultDeps removes the ambiguity entirely.

♻️ Suggested reorganization
+/**
+ * Timeout for individual SSH commands in milliseconds.
+ * Prevents hung SSH connections (e.g., stale ControlMaster sockets)
+ * from blocking the file tree load indefinitely.
+ */
+const SSH_COMMAND_TIMEOUT_MS = 30000;
+
 /**
  * Default dependencies using real implementations.
  */
 const defaultDeps: RemoteFsDeps = {
 	execSsh: (command: string, args: string[]): Promise<ExecResult> => {
 		return execFileNoThrow(command, args, undefined, { timeout: SSH_COMMAND_TIMEOUT_MS });
 	},
 	...
 };

-/**
- * Timeout for individual SSH commands in milliseconds.
- * Prevents hung SSH connections (e.g., stale ControlMaster sockets)
- * from blocking the file tree load indefinitely.
- */
-const SSH_COMMAND_TIMEOUT_MS = 30000;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/utils/remote-fs.ts` around lines 116 - 121, Move the
SSH_COMMAND_TIMEOUT_MS constant so it is declared before it is referenced in
defaultDeps.execSsh to avoid a forward-reference/TDZ-looking situation;
specifically, relocate the const SSH_COMMAND_TIMEOUT_MS declaration above the
defaultDeps object (or at least above the execSsh arrow function) so
defaultDeps.execSsh can read SSH_COMMAND_TIMEOUT_MS without a perceived temporal
dead zone.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/main/utils/execFile.ts`:
- Around line 98-102: The current disambiguation uses "'input' in options ||
'timeout' in options" which is not type-safe because an env map could include
those keys; update the branch in the execFile-related logic (where options,
ExecOptions, input, timeout, and env are handled) to use value-type guards
instead: check that options.input is a valid input type (e.g., typeof input ===
'string' || Buffer.isBuffer(input) || input instanceof Uint8Array) and that
options.timeout is a number (typeof options.timeout === 'number') before
treating options as ExecOptions, otherwise treat it as env; adjust any type
assertions/casts accordingly so env objects with keys named "input" or "timeout"
are not misclassified.

In `@src/renderer/hooks/git/useFileTreeManagement.ts`:
- Around line 501-534: The stats migration effect already correctly captures a
stable sessionId before calling window.maestro.fs.directorySize and uses it in
the .then() updater and logger.warn; no code changes are required to fix a
stale-closure here—keep the captured const sessionId and the existing
setSessions update and catch block as-is and remove/ignore the duplicate review
comment.

---

Nitpick comments:
In `@src/main/utils/remote-fs.ts`:
- Around line 116-121: Move the SSH_COMMAND_TIMEOUT_MS constant so it is
declared before it is referenced in defaultDeps.execSsh to avoid a
forward-reference/TDZ-looking situation; specifically, relocate the const
SSH_COMMAND_TIMEOUT_MS declaration above the defaultDeps object (or at least
above the execSsh arrow function) so defaultDeps.execSsh can read
SSH_COMMAND_TIMEOUT_MS without a perceived temporal dead zone.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0a0f579 and 90c0b98.

📒 Files selected for processing (3)
  • src/main/utils/execFile.ts
  • src/main/utils/remote-fs.ts
  • src/renderer/hooks/git/useFileTreeManagement.ts

@croakingtoad croakingtoad force-pushed the fix/remote-file-tree-loading-hang branch from 90c0b98 to e87fa69 Compare February 25, 2026 18:28
@croakingtoad
Copy link
Copy Markdown
Author

Update: Fixed a fourth issue discovered during testing — directorySize failure was killing the entire file tree load.

Problem: loadFileTree and directorySize were coupled in a Promise.all. If the du SSH command failed (timeout, permission error, etc.), Promise.all rejected immediately and the .catch() handler set fileTreeError — preventing the file tree from ever showing, even when loadFileTree would have succeeded.

This bug was always present but invisible: previously loadFileTree hung forever so Promise.all never settled, meaning directorySize failures were silently swallowed. The timeout fix unmasked it.

Fix: Decoupled the two operations. directorySize errors are now caught independently and logged as a non-fatal warning. The file tree loads and displays regardless of whether stats (du) succeed. If stats fail, fileTreeStats is simply undefined and the size/count display is omitted — the file list still renders normally.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main/utils/execFile.ts (1)

187-203: ⚠️ Potential issue | 🟡 Minor

ETIMEDOUT info is lost when 'error' fires before 'close' after child.kill().

On Windows (and occasionally POSIX), killing a process can emit 'error' before 'close'. Since the 'error' handler resolves the Promise with exitCode: 1, the subsequent 'close' handler's killed ? 'ETIMEDOUT' : … branch becomes a no-op. The remote-fs.ts retry logic then sees exitCode: 1 and isNodeTimeout === false, so the timeout retry never fires — negating the core fix of this PR.

🐛 Proposed fix
 		child.on('error', (err) => {
 			if (timer) clearTimeout(timer);
 			resolve({
 				stdout: '',
-				stderr: err.message,
-				exitCode: 1,
+				stderr: killed
+					? `${err.message}\nETIMEDOUT: process timed out after ${timeout}ms`
+					: err.message,
+				exitCode: killed ? 'ETIMEDOUT' : 1,
 			});
 		});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/utils/execFile.ts` around lines 187 - 203, The Promise resolution in
the child.on('error') handler can lose the ETIMEDOUT signal if 'error' fires
before 'close'; update the error handler in execFile.ts to clear the timer, and
when killed is true set stderr to include the ETIMEDOUT message and set exitCode
to 'ETIMEDOUT' (matching the close handler behavior) instead of always 1; also
ensure you guard against double-resolve by using an internal "finished" boolean
(or similar) that both child.on('error') and child.on('close') check/set so only
the first handler resolves the Promise.
♻️ Duplicate comments (4)
src/renderer/hooks/git/useFileTreeManagement.ts (2)

515-547: LGTM — stale-closure fix and non-fatal stats handling correctly applied.

sessionId is captured at effect entry, used in both .then() and .catch() callbacks, matching the pattern applied to the initial-load effect.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/hooks/git/useFileTreeManagement.ts` around lines 515 - 547, This
change correctly captures sessionId for the async callbacks and handles
non-fatal stats fetches; no code changes needed—leave the directory size fetch
using the captured sessionId, the setSessions update (inside the .then
callback), and the logger.warn in the .catch as-is (references: sessionId,
setSessions, window.maestro.fs.directorySize).

372-375: LGTM — stale-closure fix correctly applied.

sessionId is captured from session.id before any async boundary, and all three callback sites (setSessions in onProgress, .then(), and .catch()) correctly reference the stable sessionId.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/hooks/git/useFileTreeManagement.ts` around lines 372 - 375, The
stale-closure issue has been correctly addressed by capturing session.id into a
stable local (sessionId) before async boundaries; no code changes
necessary—ensure all async callbacks reference sessionId (used in setSessions
inside onProgress, the Promise .then(), and .catch() handlers) rather than
reading session or activeSessionId to avoid stale closures.
src/main/utils/remote-fs.ts (1)

162-191: LGTM — ETIMEDOUT retry logic correctly wired.

isNodeTimeout covers the Node.js-kill path (exitCode === 'ETIMEDOUT'), and the /ETIMEDOUT/i regex in RECOVERABLE_SSH_ERRORS covers SSH-client-emitted messages. Together they ensure all timeout scenarios trigger exponential-backoff retry.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/utils/remote-fs.ts` around lines 162 - 191, No changes required: the
retry logic in the for loop correctly treats Node timeout via isNodeTimeout
(exitCode === 'ETIMEDOUT') alongside isRecoverableSshError(combinedOutput), uses
getBackoffDelay(attempt, baseDelayMs, maxDelayMs) for exponential backoff, logs
via logger.debug, awaits sleep(delay), and returns lastResult as fallback; leave
functions buildSshArgs, execSsh, isRecoverableSshError, getBackoffDelay, and
sleep intact.
src/main/utils/execFile.ts (1)

98-106: 'timeout' in options still admits env objects with a 'timeout' key — existing concern.

An env object containing a 'timeout' key (e.g., { TIMEOUT: '30', timeout: '5' }) would be misclassified as ExecOptions, silently discarding the env and parsing an unexpected timeout string.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/utils/execFile.ts` around lines 98 - 106, The current type-guard
misclassifies plain env objects that happen to have a 'timeout' key; update the
branch that distinguishes ExecOptions vs NodeJS.ProcessEnv by performing
stronger runtime checks (e.g., ensure options is an object and that
options.input is a string/Buffer or options.timeout is a number) instead of only
using "'timeout' in options". In the block referencing ExecOptions (the local
variables input, timeout and the type ExecOptions), cast only when those
stricter checks pass; otherwise treat options as NodeJS.ProcessEnv and assign to
env to avoid accidentally discarding environment variables.
🧹 Nitpick comments (2)
src/main/utils/remote-fs.ts (2)

69-76: SSH_COMMAND_TIMEOUT_MS is defined after its first reference site — consider hoisting.

SSH_COMMAND_TIMEOUT_MS is declared at Line 121 but referenced inside defaultDeps.execSsh at Line 71. This works correctly at runtime because the reference is inside a function body (closure evaluated lazily), but the out-of-order placement is non-obvious and could mislead a reader into suspecting a TDZ issue. Moving the constant above defaultDeps would make the dependency order explicit.

Also applies to: 116-121

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/utils/remote-fs.ts` around lines 69 - 76, Move the
SSH_COMMAND_TIMEOUT_MS constant so it's declared before defaultDeps to make the
dependency order explicit: ensure SSH_COMMAND_TIMEOUT_MS is defined above the
defaultDeps object (which contains execSsh) so the closure in
defaultDeps.execSsh references an earlier-declared symbol; update the file by
hoisting the SSH_COMMAND_TIMEOUT_MS declaration above the defaultDeps block
(also check and hoist any related constants between lines ~116-121 mentioned in
the comment).

69-76: A uniform 30s timeout also gates write/rename/delete operations — verify it's safe for large file transfers.

SSH_COMMAND_TIMEOUT_MS is applied to every command through defaultDeps, including writeFileRemote which embeds base64-encoded file content inline in the shell command. Over slow SSH links, writing a large file can legitimately take more than 30 s, causing spurious ETIMEDOUT retries and ultimately a failure. Consider either excluding mutation operations from the per-command timeout or making the timeout configurable per operation type.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/utils/remote-fs.ts` around lines 69 - 76, The default execSsh in
defaultDeps currently applies SSH_COMMAND_TIMEOUT_MS to every call (via
execFileNoThrow), which can cause timeouts for large writes; update the design
so mutation-heavy operations (e.g., writeFileRemote, renameRemote, deleteRemote
or any function that embeds large payloads) either pass an explicit longer
timeout or use a separate exec path without the 30s limit—add an optional
timeout parameter to execSsh in RemoteFsDeps (or create execSshForMutations) and
update call sites (notably writeFileRemote) to supply a larger or configurable
timeout instead of relying on SSH_COMMAND_TIMEOUT_MS. Ensure default behavior
for lightweight commands remains unchanged and make the new timeout configurable
via existing config structs or function args.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/renderer/hooks/git/useFileTreeManagement.ts`:
- Around line 413-422: The directorySize error is currently only logged via
logger.warn and must also be reported to Sentry: update the catch on the
statsPromise (the call to window.maestro.fs.directorySize inside
useFileTreeManagement) to call the Sentry utility (e.g., captureException or
captureMessage from src/utils/sentry) with the caught err and contextual
tags/extra including treeRoot and sshContext?.sshRemoteId in addition to the
existing logger.warn so the failure is tracked; import the Sentry helper at the
top of the module and ensure the Sentry call runs inside the .catch before
returning undefined.

---

Outside diff comments:
In `@src/main/utils/execFile.ts`:
- Around line 187-203: The Promise resolution in the child.on('error') handler
can lose the ETIMEDOUT signal if 'error' fires before 'close'; update the error
handler in execFile.ts to clear the timer, and when killed is true set stderr to
include the ETIMEDOUT message and set exitCode to 'ETIMEDOUT' (matching the
close handler behavior) instead of always 1; also ensure you guard against
double-resolve by using an internal "finished" boolean (or similar) that both
child.on('error') and child.on('close') check/set so only the first handler
resolves the Promise.

---

Duplicate comments:
In `@src/main/utils/execFile.ts`:
- Around line 98-106: The current type-guard misclassifies plain env objects
that happen to have a 'timeout' key; update the branch that distinguishes
ExecOptions vs NodeJS.ProcessEnv by performing stronger runtime checks (e.g.,
ensure options is an object and that options.input is a string/Buffer or
options.timeout is a number) instead of only using "'timeout' in options". In
the block referencing ExecOptions (the local variables input, timeout and the
type ExecOptions), cast only when those stricter checks pass; otherwise treat
options as NodeJS.ProcessEnv and assign to env to avoid accidentally discarding
environment variables.

In `@src/main/utils/remote-fs.ts`:
- Around line 162-191: No changes required: the retry logic in the for loop
correctly treats Node timeout via isNodeTimeout (exitCode === 'ETIMEDOUT')
alongside isRecoverableSshError(combinedOutput), uses getBackoffDelay(attempt,
baseDelayMs, maxDelayMs) for exponential backoff, logs via logger.debug, awaits
sleep(delay), and returns lastResult as fallback; leave functions buildSshArgs,
execSsh, isRecoverableSshError, getBackoffDelay, and sleep intact.

In `@src/renderer/hooks/git/useFileTreeManagement.ts`:
- Around line 515-547: This change correctly captures sessionId for the async
callbacks and handles non-fatal stats fetches; no code changes needed—leave the
directory size fetch using the captured sessionId, the setSessions update
(inside the .then callback), and the logger.warn in the .catch as-is
(references: sessionId, setSessions, window.maestro.fs.directorySize).
- Around line 372-375: The stale-closure issue has been correctly addressed by
capturing session.id into a stable local (sessionId) before async boundaries; no
code changes necessary—ensure all async callbacks reference sessionId (used in
setSessions inside onProgress, the Promise .then(), and .catch() handlers)
rather than reading session or activeSessionId to avoid stale closures.

---

Nitpick comments:
In `@src/main/utils/remote-fs.ts`:
- Around line 69-76: Move the SSH_COMMAND_TIMEOUT_MS constant so it's declared
before defaultDeps to make the dependency order explicit: ensure
SSH_COMMAND_TIMEOUT_MS is defined above the defaultDeps object (which contains
execSsh) so the closure in defaultDeps.execSsh references an earlier-declared
symbol; update the file by hoisting the SSH_COMMAND_TIMEOUT_MS declaration above
the defaultDeps block (also check and hoist any related constants between lines
~116-121 mentioned in the comment).
- Around line 69-76: The default execSsh in defaultDeps currently applies
SSH_COMMAND_TIMEOUT_MS to every call (via execFileNoThrow), which can cause
timeouts for large writes; update the design so mutation-heavy operations (e.g.,
writeFileRemote, renameRemote, deleteRemote or any function that embeds large
payloads) either pass an explicit longer timeout or use a separate exec path
without the 30s limit—add an optional timeout parameter to execSsh in
RemoteFsDeps (or create execSshForMutations) and update call sites (notably
writeFileRemote) to supply a larger or configurable timeout instead of relying
on SSH_COMMAND_TIMEOUT_MS. Ensure default behavior for lightweight commands
remains unchanged and make the new timeout configurable via existing config
structs or function args.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 90c0b98 and e87fa69.

📒 Files selected for processing (3)
  • src/main/utils/execFile.ts
  • src/main/utils/remote-fs.ts
  • src/renderer/hooks/git/useFileTreeManagement.ts

Comment on lines +413 to +422
// Fetch stats independently — a directorySize failure (e.g., `du` timeout
// on large repos over SSH) should not prevent the file tree from loading.
const statsPromise = window.maestro.fs
.directorySize(treeRoot, sshContext?.sshRemoteId)
.catch((err) => {
logger.warn('directorySize failed (non-fatal)', 'FileTreeManagement', {
error: err?.message || 'Unknown error',
});
return undefined;
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

directorySize failure is silently swallowed — add Sentry reporting.

logger.warn alone doesn't surface directorySize failures to error tracking. Per coding guidelines, use Sentry utilities for explicit error reporting with context.

🛡️ Proposed fix
+import { captureMessage } from 'src/utils/sentry';
 ...
 			const statsPromise = window.maestro.fs
 				.directorySize(treeRoot, sshContext?.sshRemoteId)
 				.catch((err) => {
 					logger.warn('directorySize failed (non-fatal)', 'FileTreeManagement', {
 						error: err?.message || 'Unknown error',
 					});
+					captureMessage('directorySize failed during file tree load', {
+						extra: { sessionId, treeRoot, error: err?.message },
+					});
 					return undefined;
 				});

As per coding guidelines: "Use Sentry utilities (captureException, captureMessage) from 'src/utils/sentry' for explicit error reporting with context."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/hooks/git/useFileTreeManagement.ts` around lines 413 - 422, The
directorySize error is currently only logged via logger.warn and must also be
reported to Sentry: update the catch on the statsPromise (the call to
window.maestro.fs.directorySize inside useFileTreeManagement) to call the Sentry
utility (e.g., captureException or captureMessage from src/utils/sentry) with
the caught err and contextual tags/extra including treeRoot and
sshContext?.sshRemoteId in addition to the existing logger.warn so the failure
is tracked; import the Sentry helper at the top of the module and ensure the
Sentry call runs inside the .catch before returning undefined.

@croakingtoad croakingtoad force-pushed the fix/remote-file-tree-loading-hang branch from e87fa69 to afbb455 Compare February 25, 2026 18:41
@croakingtoad
Copy link
Copy Markdown
Author

Update: Fixed a fifth issue discovered during testing — a single unreadable subdirectory was killing the entire file tree walk.

Problem: loadFileTreeRecursive (fileExplorer.ts) processes directories sequentially with await. If any child directory readDir failed (paths with spaces over SSH, permission denied, broken symlinks, etc.), the error propagated all the way up via throw error, aborting the entire tree walk. In testing, a directory named Dr. Sue Johnson triggered this over SSH.

Fix: Wrapped the recursive child directory call in a try/catch. If a subdirectory can't be read, it's shown as an empty folder and the walk continues with the remaining directories. The root directory failure still propagates (no tree to show at all), but child directory failures are now gracefully handled.

New file changed: src/renderer/utils/fileExplorer.ts

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main/utils/execFile.ts (1)

196-203: ⚠️ Potential issue | 🟡 Minor

Error handler ignores killed flag — ETIMEDOUT exit code and collected stdout are lost when kill() triggers an error event.

child.kill() can fire an error event (e.g., ESRCH when the process exits before the signal is delivered). When that happens the error handler resolves the Promise with exitCode: 1 and stdout: '', discarding both the accumulated stdout and the ETIMEDOUT exit code. The subsequent close event's resolve is silently dropped by the settled Promise, so the caller receives an uninformative exitCode: 1 instead of 'ETIMEDOUT'.

🐛 Proposed fix — make the error handler timeout-aware
 		child.on('error', (err) => {
 			if (timer) clearTimeout(timer);
+			if (killed) {
+				// kill() triggered an error event (e.g., ESRCH); the close event may
+				// not fire — resolve with the same ETIMEDOUT semantics as the close handler.
+				resolve({
+					stdout,
+					stderr: `${stderr}\nETIMEDOUT: process timed out after ${timeout}ms`,
+					exitCode: 'ETIMEDOUT',
+				});
+				return;
+			}
 			resolve({
 				stdout: '',
 				stderr: err.message,
 				exitCode: 1,
 			});
 		});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/utils/execFile.ts` around lines 196 - 203, The child process 'error'
handler currently always resolves with exitCode: 1 and empty stdout, which
discards accumulated stdout and any timeout semantics; update the
child.on('error', ...) logic in execFile.ts to be timeout-aware: if a timeout
timer (timer) had fired or the process was killed (child.killed or a timedOut
flag you add), do not override the eventual close handler's resolution—either
include the collected stdout/stderr and propagate the ETIMEDOUT exit code (or a
'timedOut' marker) when resolving, or simply rethrow/forward the error so the
close handler can settle the Promise; ensure you reference and preserve the
existing stdout buffer (e.g., stdout variable), the timer/timedOut state, and
the close resolution path rather than unconditionally calling resolve({stdout:
'', stderr: err.message, exitCode: 1}).
♻️ Duplicate comments (1)
src/main/utils/execFile.ts (1)

98-103: 'timeout' in options type-safety check — previously flagged.

The options discrimination logic is unchanged from the previous commit where this was flagged.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/utils/execFile.ts` around lines 98 - 103, The discrimination using
"'timeout' in options" is unsafe; create and use a proper type guard (e.g.
isExecOptions(obj): obj is ExecOptions) that checks obj !== null && typeof obj
=== 'object' and the presence of 'input' or 'timeout', then cast to ExecOptions
and extract input/timeout; update the logic around the execFile/execOptions
handling (the options variable and ExecOptions type) to call this guard before
treating options as ExecOptions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/renderer/utils/fileExplorer.ts`:
- Around line 361-363: The empty catch in loadFileTreeRecursive silently
swallows all errors; change it to catch the error object and only handle
expected OS errors (e.g., err.code === 'ENOENT' || 'EACCES' || 'EPERM') by
skipping the unreadable child directory, otherwise report/until-rethrow
unexpected errors to Sentry (captureException or similar) or rethrow so they
bubble up; use the error.code check inside the catch where unreadable child
directories are skipped and ensure unexpected errors are not suppressed.

---

Outside diff comments:
In `@src/main/utils/execFile.ts`:
- Around line 196-203: The child process 'error' handler currently always
resolves with exitCode: 1 and empty stdout, which discards accumulated stdout
and any timeout semantics; update the child.on('error', ...) logic in
execFile.ts to be timeout-aware: if a timeout timer (timer) had fired or the
process was killed (child.killed or a timedOut flag you add), do not override
the eventual close handler's resolution—either include the collected
stdout/stderr and propagate the ETIMEDOUT exit code (or a 'timedOut' marker)
when resolving, or simply rethrow/forward the error so the close handler can
settle the Promise; ensure you reference and preserve the existing stdout buffer
(e.g., stdout variable), the timer/timedOut state, and the close resolution path
rather than unconditionally calling resolve({stdout: '', stderr: err.message,
exitCode: 1}).

---

Duplicate comments:
In `@src/main/utils/execFile.ts`:
- Around line 98-103: The discrimination using "'timeout' in options" is unsafe;
create and use a proper type guard (e.g. isExecOptions(obj): obj is ExecOptions)
that checks obj !== null && typeof obj === 'object' and the presence of 'input'
or 'timeout', then cast to ExecOptions and extract input/timeout; update the
logic around the execFile/execOptions handling (the options variable and
ExecOptions type) to call this guard before treating options as ExecOptions.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e87fa69 and afbb455.

📒 Files selected for processing (4)
  • src/main/utils/execFile.ts
  • src/main/utils/remote-fs.ts
  • src/renderer/hooks/git/useFileTreeManagement.ts
  • src/renderer/utils/fileExplorer.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/renderer/hooks/git/useFileTreeManagement.ts
  • src/main/utils/remote-fs.ts

Comment on lines +361 to +363
} catch {
// Skip unreadable child directories — show them as empty folders
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Empty catch silently swallows unexpected errors — add error-code filtering or Sentry reporting.

The intent is right (skip unreadable directories), but the bare catch {} also suppresses any programming error that surfaces from inside loadFileTreeRecursive — those will never reach Sentry. Per the coding guidelines, only expected/recoverable errors should be handled explicitly with specific error codes; unexpected errors should bubble up.

🛡️ Proposed fix — filter by expected OS error codes
-			} catch {
-				// Skip unreadable child directories — show them as empty folders
-			}
+			} catch (err: any) {
+				// Only swallow expected OS-level errors (permissions, missing entry, I/O).
+				// Unexpected errors (bugs, network protocol failures, etc.) should propagate.
+				const EXPECTED_CODES = new Set(['EACCES', 'EPERM', 'ENOENT', 'ETIMEDOUT', 'ENOTDIR']);
+				if (!EXPECTED_CODES.has(err?.code)) {
+					throw err;
+				}
+				// Skip unreadable child directories — show them as empty folders
+			}

As per coding guidelines: "Handle only expected/recoverable errors explicitly with specific error codes. Let unexpected errors bubble up to Sentry."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/utils/fileExplorer.ts` around lines 361 - 363, The empty catch
in loadFileTreeRecursive silently swallows all errors; change it to catch the
error object and only handle expected OS errors (e.g., err.code === 'ENOENT' ||
'EACCES' || 'EPERM') by skipping the unreadable child directory, otherwise
report/until-rethrow unexpected errors to Sentry (captureException or similar)
or rethrow so they bubble up; use the error.code check inside the catch where
unreadable child directories are skipped and ensure unexpected errors are not
suppressed.

…rever

Three fixes for an issue where the Files right panel never transitions from
"Loading remote files..." to showing the actual file list during SSH sessions:

1. Add 30s timeout to SSH commands (execFile.ts, remote-fs.ts)
   - SSH readDir calls had no timeout, so a stale ControlMaster socket or
     hung connection would block loadFileTree indefinitely
   - Add timeout option to ExecOptions and pass it through execFileAsync
   - Add ETIMEDOUT to recoverable SSH errors so timed-out commands retry

2. Fix stale closure in async callbacks (useFileTreeManagement.ts)
   - The useEffect captured activeSessionId in its closure, but .then()
     and .catch() callbacks could fire after the user switched sessions
   - Capture session.id at effect start and use it in all async callbacks
     to always target the correct session regardless of navigation

3. Ensure timed-out SSH commands get retried with backoff
   - When Node kills a process via timeout, the error code is ETIMEDOUT
   - Added pattern to RECOVERABLE_SSH_ERRORS so the existing retry logic
     handles it with exponential backoff

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@croakingtoad croakingtoad force-pushed the fix/remote-file-tree-loading-hang branch from afbb455 to efe29ed Compare February 25, 2026 19:17
Copy link
Copy Markdown
Collaborator

@pedramamini pedramamini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! Solid root cause analysis on all three bugs — the stale closure, directorySize coupling, and child directory error propagation.

One fix applied before merge: the execFileAsync code path (which is the one SSH commands actually use, since they don't pipe stdin) didn't produce 'ETIMEDOUT' as exitCode on timeout. Node's execFile with timeout sets error.killed = true and error.code = undefined when it kills the process, so the original catch block yielded exitCode: 1 — the retry logic in remote-fs would never recognize it as a timeout. Added detection for timeout && error.killed && !error.code (excluding maxBuffer kills which also set error.killed but have error.code = 'ERR_CHILD_PROCESS_STDIO_MAXBUFFER'), plus tests covering the new behavior.

@pedramamini pedramamini merged commit 1be15b7 into RunMaestro:main Feb 25, 2026
1 check passed
pedramamini added a commit that referenced this pull request Feb 25, 2026
The merged PR #461 added timeout support to execFileNoThrow, but the
execFileAsync code path (used by SSH commands) didn't surface 'ETIMEDOUT'
as the exitCode. Node's execFile with timeout sets error.killed=true and
error.code=undefined, so the catch block yielded exitCode:1 — the
remote-fs retry logic never recognized it as a timeout.

Detect timeout via `timeout && error.killed && !error.code`, excluding
maxBuffer kills which also set error.killed but have a specific error.code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants