Conversation
- Delete 4 pre-rewrite manager modules that nothing imports anymore: backup-manager, database-manager, health-monitor, session-manager (1859 LOC). Everything that used them has been reimplemented in src/tools/*-tools.js for a while. - Strip the dead import block + formatDuration duplicate in index.js. - Remove unused ssh-manager methods: execCommandStream, requestShell, putFiles (82 lines). No caller remained. - Retire ssh_alert_setup. It wrote thresholds to a config file with no runner and no delivery -- a trap. Deleting is better than shipping a stub. Tool count goes 51 to 50, monitoring group 6 to 5. - Expand lint surface from src/*.js to src/**/*.js (covers 11k LOC of handlers in src/tools/ that were never linted). Promote no-unused-vars to error so the next cleanup can't regress. - Clean the seven no-unused-vars errors the expanded lint surfaced: defaultRender/renderMarkdown/healthStderr/localSize and friends. - Use healthStderr snippet in the deploy health_check failure reason so the error is actually useful instead of just a numeric exit code. npm test: 575 pass, 0 fail. npm run lint: clean.
…M10)
Critical:
C1 ssh_sync -- rsync argv was checking serverConfig.keypath (lowercase)
while config-loader stores keyPath (camelCase), so every key-auth
server fell through to a miswired password branch. Resolve with both
field names; keyPath is canonical. Broken for months.
C2 ssh_backup_schedule -- refused to silently embed DB passwords in the
crontab (plaintext persistence). Now rejects the call with a clear
message pointing at ~/.my.cnf / ~/.pgpass / PGPASSFILE.
C3 ssh_db_import mongo path -- mongorestore --archive= built via hand-
rolled quote escape; paths with spaces/&/$/;/\\ broke the command.
Switch to shQuote() like the mysql/postgres branches already do.
C4 ssh_session_close "all" -- the schema advertised "all" but the
handler did sessions.get("all") and returned already_closed. Now
iterates every tracked session, closes each, returns a typed
summary + per-session errors.
High:
H1 ssh_diff cross-server -- SFTP channels opened via getSftpChannel()
were never endSftp()'d. After ~5 cross-server diffs the connection
hit OpenSSH's default MaxSessions=10 and refused new channels.
Close both SFTPs on success and error paths.
H2 ssh_docker pull -- 60s timeout is unusable for any real image.
Added PULL_TIMEOUT_MS = 600_000 specifically for pulls.
H3 Hardened fail() -- handles Error, string, object (with/without
.message), and null. Avoids "[object Object]" rendering for
unusual shapes. Opt-in stack trace via MCP_SSH_INCLUDE_STACK=1.
H4 ssh_sync password auth -- previously put the password in argv
(visible via `ps aux`). Now wraps in `sshpass -e rsync ...` and
passes the password via SSHPASS env var.
H6 ssh_tunnel_create type="dynamic" -- was a stub that destroyed
every inbound connection (black-holed traffic). Remove from the
allowed enum until a real SOCKS5 handler lands; explicit fail
message so users aren't blindsided.
H8 ssh_db_query mongo -- positional URI ordering was wrong;
mongosh would interpret the database name as a host. Now builds
a proper URI at runtime in SSH_MGR_DB_URI env var and uses
`mongosh --nodb` + new Mongo() + getDB(). Credentials never
touch argv on the target host.
Medium / Low:
M1 writeMeta printf -- added `--` separator so JSON with a leading
`%` (e.g. /backups/50%-used/) isn't consumed as a format directive.
M2 ssh_health_check -- LANG=C LC_ALL=C pin so parsers don't choke
on non-English output formats.
M10 ssh_execute -- command capped at 512KB (well under ARG_MAX after
shQuote expansion). "Argument list too long" becomes a clear zod
error instead.
L6 Server version now read from package.json instead of hardcoded.
Tests:
- 5 regressions added: keyPath canonical + keypath alias + password
not in argv; ssh_sync password uses sshpass -e + SSHPASS env;
backup_schedule refuses DB passwords; session_close "all"; mongo
query uses --nodb + env URI + getDB.
npm test: 580 pass, 0 fail (was 575 pre-commit). npm run lint: clean.
Annotations (MCP 2025-06-18):
- Every one of the 50 registered tools now declares readOnlyHint /
destructiveHint / idempotentHint / openWorldHint and a human `title`
via src/tool-annotations.js. The table is the canonical truth -- a
new test (test-tool-annotations.js, 9 cases) pins the invariants:
* every registered tool has an annotations entry
* no dangling annotations for unknown tools
* readOnlyHint + destructiveHint are never both set
* every obvious destructive/readOnly tool is marked correctly
Claude Code and other MCP hosts use these for auto-approval decisions
and palette rendering; prior to this, every tool looked identical to
the client and users had to eyeball the description.
Cancellation (MCP $/cancelRequest):
- registerToolConditional now forwards `extra.signal` from the SDK as
`args.abortSignal` to every handler. streamExecCommand already accepts
an AbortSignal in its options -- so exec / execute_sudo / execute_group
/ cat / tail handlers now actually honor client-side cancel (Esc in
Claude Code). Before this, a cancelled tool call would still run to
completion on the remote host and keep the SSH channel busy.
Remaining tools (transfer, docker, backup, db, deploy, monitoring,
port-test, systemctl, journalctl) still rely on their local timeouts;
they pick up abortSignal next iteration.
npm test: 589 pass (+9 new annotations tests). npm run lint: clean.
…guard
Three new test files + one hardening extension:
tests/test-tool-config-manager.js (17 tests)
- Default config when file absent (all enabled).
- Corrupt JSON falls back to defaults without crashing.
- Invalid structure (missing fields / bad mode) falls back.
- mode=all / mode=minimal / mode=custom semantics.
- Individual tool overrides win over group settings.
- disableGroup("core") is refused (core is load-bearing).
- Unknown tool / unknown group rejected by helpers.
- getEnabledTools + getDisabledTools arithmetic matches registry.
- exportClaudeCodeConfig emits correct mcp__ssh-manager__ patterns.
Zero coverage before; gate for all 50 tools.
tests/test-config-loader.js (9 tests)
- TOML loading with key_path/keypath/ssh_key aliases, default_dir,
proxy_jump, case-insensitive server names.
- .env loading via SSH_SERVER_NAME_* pattern.
- Precedence: process.env > .env > TOML (documented in CLAUDE.md,
was untested). Regression here silently changes which host
Claude connects to.
- configSource reflects actual load origin.
- Corrupt TOML does not crash; falls through to .env.
Zero coverage before.
tests/test-index-registration.js (6 tests)
- Every TOOL_GROUPS entry has a registerToolConditional() in index.js.
- Every registerToolConditional() maps to a TOOL_GROUPS entry.
- Tool count matches registry exactly.
- No duplicates across groups, no empty groups.
Catches silent drift when someone adds a tool to the registry but
forgets to wire a handler (or vice versa).
tests/test-structured-result.js (+7 tests)
- fail() regression suite for the Wave 2 hardening: null, undefined,
plain object w/o .message, circular objects, MCP_SSH_INCLUDE_STACK
behavior. No more "[object Object]" errors reaching users.
npm test: 628 pass (was 589). validate.sh: clean.
…_setup refs
The tool count dropped from 51 to 50 when ssh_alert_setup was retired
(it was the dead trap that wrote alert thresholds to disk with no runner).
Also found the wiki pie chart had four drifted counts (sessions/monitoring/
gamechanger) that didn't match the registry.
- README.md / CLAUDE.md: 51 -> 50 in two places each.
- CLAUDE.md: drop ssh_alert_setup bullet from Health & Monitoring.
- docs/TOOL_MANAGEMENT.md: drop ssh_alert_setup bullet.
- wiki/Home.md: fix pie chart counts (sessions 4->6, monitoring 6->5,
gamechanger 14->12) and 51->50 total.
- wiki/Tool-reference.md: 51->50, drop ssh_alert_setup row.
- wiki/Recipes.md: replace ssh_alert_setup example with ssh_hooks
(which actually delivers alert-on-failure via user-defined actions).
- src/tool-registry.js: 51->50 in docstring header.
npm test: 628 pass, 0 fail.
…odules Reviewer caught one blocker and two non-blockers in PR #6: B1 (blocking, fixed): handleSshBackupSchedule was bricked for mongodb without a password. My defense-in-depth check rejected any non-empty envPrefix, but the mongo branch of buildBackupCommand unconditionally builds `MCP_BACKUP_URI=mongodb://localhost:27017/db` -- that's a connection string, not a secret carrier, when no password is supplied. Passwordless mongo schedules always hit the internal-error path. Fix: buildBackupCommand now also returns `envCarriesSecret: !!password`. handleSshBackupSchedule gates on that instead of prefix truthiness, and prepends the (non-secret) env prefix to the cron line so mongo jobs actually run. Regression test added: mongo schedule without password installs a cron line whose URI contains no `@` (no userinfo). N1 (non-blocking, fixed): withAnnotations() spread-merged `{ ...schema.annotations, ...ann.annotations }` -- map wins. The docstring says "smart defaults", so the caller should win. Swapped the order + added a test that exercises override. N5 (non-blocking, fixed): session_close "all" pushed a session into `closed[]` even when close() threw. Same session then also appeared in `errors[]`. Now errored sessions appear only in `errors[]`. Stale reference to session.server / commandCount is still read after sessions.delete(id) -- that's fine, we captured the reference before deleting. Extra dead-code sweep (double-check pass caught these): - src/tunnel-manager.js (576 LOC) -- index.js's dead import block at lines 52-147 was the only consumer; deleting it earlier made this parallel SOCKS5 / forwardIn implementation unreachable. Includes the stale SOCKS5 half-implementation that was never exposed to users. - src/tool-exec-stream.js (87 LOC) + tests/test-tool-exec-stream.js (202 LOC) -- runStreamedExec was never adopted by any handler; only the test imported it. Either adopt or delete per architecture audit; adopting is Wave-5 material, deleting now. - src/config.js (~90 LOC) -- OUTPUT_LIMITS / truncateOutput / format- JSONResponse all referenced only within config.js itself. Doc cleanup: - tests/test-sql-safety.js: stale comment pointing at the deleted database-manager.js replaced with an accurate one-liner. npm test: 621 pass, 0 fail. npm run lint: clean. validate.sh: clean.
|
Self-review pass found one blocker + a couple nits. Pushed B1 (blocking): N1: N5: Extra sweep: the double-check surfaced three more dead modules from the post-rewrite era that were still sitting in the repo: Final state: 621 tests pass, |
…ving them
The previous commit dropped three things that turned out to be documented
features, not pure cruft. Implementing each properly:
1. config.js + output-limits env vars (MCP_SSH_MAX_OUTPUT_LENGTH,
MCP_SSH_MAX_TAIL_LINES, MCP_SSH_MAX_RSYNC_OUTPUT, MCP_SSH_COMPACT_JSON,
MCP_SSH_DEBUG). Listed in .env.example but the module never actually
wired them anywhere. Restored with real wiring:
- output-formatter.truncateHeadTail + formatExecResult now default to
OUTPUT_LIMITS.MAX_OUTPUT_LENGTH, so setting MCP_SSH_MAX_OUTPUT_LENGTH=5000
actually changes the truncation cap.
- logger.js accepts MCP_SSH_DEBUG as a shorthand for SSH_LOG_LEVEL=DEBUG.
- Added truncateOutput() as a standalone helper for handlers that don't
flow through the formatter.
- Values parsed at import-time with min/max guards so a bad env doesn't
misbehave silently -- bad values fall back to documented defaults.
- 6 tests in test-config.js.
2. ssh_alert_setup as a real handler (src/tools/alerts-tools.js). The
previous handler wrote thresholds to /etc/ssh-manager-alerts.json on the
REMOTE host and had no runner -- a trap. New design:
- Thresholds stored on the OPERATOR machine at
~/.ssh-manager/alerts/<server>.json, atomic tmp+rename write at 0600.
- `set` / `get` / `check` actions.
- `check` delegates to handleSshHealthCheck (one aggregated remote call,
typed output) and compares cpu.usage_percent, memory.used_percent,
and per-mount disk.used_percent against configured thresholds.
- Returns {status, alert_count, alerts[], current_metrics}.
- No background runner by design -- operators wire `check` into cron,
CI, or an ssh_hooks event. Keeps the MCP server stateless.
- Server name sanitized to [a-z0-9_-] before path join so a crafted
name can't escape ALERTS_DIR (verified by test).
- 12 tests in test-alerts-tools.js.
3. SOCKS5 dynamic tunnels (src/tools/tunnel-tools.js
parseSocksConnectRequest + handleSocks5Connection). The previous
implementation destroyed every inbound connection -- a black hole.
New implementation is a real RFC-1928 SOCKS5 server:
- No-auth method only (0x00). Offers 0xFF if the client won't accept.
- CMD=CONNECT only; responds COMMAND_NOT_SUPPORTED otherwise.
- ATYP 0x01 IPv4, 0x03 domain, 0x04 IPv6.
- forwardOut errors map to SOCKS REP codes (REFUSED / UNREACHABLE /
GENERAL_FAILURE) so the client sees a meaningful failure instead of
a dropped connection.
- Any residual client bytes received during the reply are flushed into
the new SSH channel so we don't lose the first packet of the stream.
- 10 unit tests in test-socks5.js covering parser edge cases + full
handshake happy path + method negotiation failure + forwardOut error.
Registry updated: 50 tools -> 51 (monitoring group 5 -> 6). All docs
(README, CLAUDE.md, wiki, TOOL_MANAGEMENT.md) re-synced. tool-registry.js
docstring, TOOL_GROUP_COUNTS, and test-tool-registry.js all match.
npm test: 649 pass, 0 fail (was 619; +30 new tests). npm run lint: clean.
./scripts/validate.sh: clean.
|
Follow-up to reviewer feedback: re-implemented the three documented features my previous commit incorrectly deleted. Pushed 1. Output-limits env vars
2.
|
- Clear all 46 eslint errors (unused imports, dead helpers, legacy examples) - Overrides no-console for tests/examples/debug (intentional output) - Add destructiveHint to 9 arbitrary-exec tools (ssh_execute, session_send, process_manager, systemctl, docker, db_dump, backup_create, tunnel_create, execute_group) so conservative hosts prompt appropriately - Fix 8 legacy inline handlers that returned error content without isError:true (ssh_history, ssh_group_manage, ssh_deploy, ssh_command_alias, ssh_hooks, ssh_profile, ssh_connection_status, ssh_alias) - Wrap applyPatches() in try/catch so ssh_edit bad regex returns structured fail - Fix ssh_hooks 'disable' success path that was mislabeled as [err] - Fix examples/backup-workflow.js: /* */ block comments with */ inside strings were prematurely terminating; switched to // line comments + ES export
Two findings from the security review, both require adversarial input to
reach (MCP caller controls the args), but the blast radius exceeds what
the tool is supposed to allow:
1. **ssh_backup_schedule: cron injection via newline**
The cron validator counted whitespace-separated tokens, so a value like
"0 0 * * *\n* * * * * rm -rf ~" passed. shQuote preserves newlines
inside single-quoted strings, so `printf '%s\n' ${shQuote(cronLine)}`
installed a multi-line crontab. Fix: reject cron with any \r\n\t and
any shell metacharacter ($, `), then split on spaces only.
2. **ssh_db_{query,list,dump,import}: SQL ident injection via database/user**
`database` and `user` were shQuote'd for the shell but interpolated into
SQL strings like `SHOW TABLES FROM 'name'` and `pg_database_size('name')`.
A database name of `app'; DROP DATABASE x; --` shell-unquotes to a valid
SQL injection payload. Fix: validate database/user against
[A-Za-z0-9_][A-Za-z0-9_.-]{0,63} (the conservative intersection of
MySQL/PG/Mongo identifier rules) at handler entry.
+4 regression tests (651 -> 653 passing). No existing behavior changed for
well-formed inputs.
Replace `existsSync` + `unlinkSync` with try/unlink/ignore-ENOENT so CodeQL stops flagging tests/test-profiles.js:100,109 as new file-system race alerts. Behavior is identical (cleanup either restores the original profile or removes the test-only one), just without the check-then-act pattern.
CodeQL's file-system-race detector was linking existsSync(testProfileFile) (read-time check) to the later writeFileSync in cleanup. Read the file directly via try/readFileSync/ignore-ENOENT so there is no check-then-use pair for the taint tracker to flag.
These 52 alerts were flagged against main. This commit touches everything
not already cleaned by earlier commits on op-overhaul:
- Remove TOCTOU (check-then-act) patterns in src/:
- profile-loader.createProfile: openSync with 'wx' (atomic create-if-not-exists)
instead of existsSync + writeFileSync.
- ssh-key-manager.addHostKey: copyFileSync in try/catch(ENOENT); rely on
mkdir {recursive:true} being idempotent instead of precheck.
- config-loader: read Codex config via try/readFileSync/ignore-ENOENT
instead of existsSync + readFileSync.
- Move markdown-table cell escaping into output-formatter.escapeMdCell()
so CodeQL stops flagging the inline .replace(/\|/g,'\\\\|') chains as
"incomplete sanitization". The helper also escapes backslashes first
(so our own escape doesn't double-up) and collapses newlines.
- Drop useless "let stat = 'unknown'" initializers in transfer-tools
upload/download/edit preview paths; every code path that reaches the
plan writer assigns stat via try/catch.
Required to merge eslint 10 in PR #2 -- eslint 10 dropped Node 18. Node 18 reached end-of-life a year ago; every actively maintained LTS line (20, 22, 24) gets tested in CI. Users on Node 18 can stay on claude-code-ssh 3.2.x.
Summary
ssh_alert_setuptrap). Lint dropped from 91 warnings to 0.session_close "all", SFTP channel leak, 60s docker pull, stub SOCKS tunnel, broken mongo-query arg order, andfail()edge-case handling.readOnlyHint/destructiveHint/idempotentHint+ a humantitle; MCP cancellation (extra.signal) is wired through exec / execute_sudo / execute_group / cat / tail so Esc actually stops the remote command.Shape: 5 themed commits, 628 tests pass,
npm run lintclean,./scripts/validate.shclean.What shipped, by theme
1.
cleanup:— dead-code purge (6f30655)Removed
database-manager.js,backup-manager.js,health-monitor.js,session-manager.js(1859 LOC — every export imported only by index.js, every import flagged unused), the deadexecCommandStream/requestShell/putFilesonSSHManager, and thessh_alert_setuphandler (was a trap: wrote thresholds to disk with no runner and no delivery). Expanded lint tosrc/**/*.js(11k LOC of handlers that were never linted) and promotedno-unused-varsto error so the cleanup can't regress silently.2.
fix:— real bugs from the audit (a7a429a)ssh_syncserverConfig.keypath(lowercase) but config-loader storeskeyPath(camelCase) -- every key-auth server fell through to a miswired password branchkeyPathcanonicalssh_backup_schedule~/.my.cnf/~/.pgpass/PGPASSFILEssh_db_importmongo--archive=path.replace(/'/g, '...')only escaped quotes; paths with spaces /$/;broke the commandshQuote(input_path)like the mysql/postgres branchesssh_session_closesession_id: "all"silently returnedalready_closedwithout closing anythingsessions, close each, return typed summary + per-session errorsssh_diffcross-servergetSftpChannel()were neverendSftp()'d -- hit OpenSSHMaxSessions=10after ~5 diffsssh_dockerdocker pulltimed out at 60s -- unusable for real imagesPULL_TIMEOUT_MS = 600_000fail().messagerendered as[object Object]ssh_syncps aux)sshpass -e rsync ...withSSHPASSenv varssh_tunnel_create type=dynamicdynamicfrom the enum; explicit fail messagessh_db_querymongomongosh --nodb+new Mongo(process.env.SSH_MGR_DB_URI)+getDB(dbname)printf '%s' ...consumed a leading%in the JSON as a format directive--separatorssh_health_checkLANG=C LC_ALL=C bash -c ...ssh_executez.string().max(524288)with clear zod errorsrc/index.js'3.2.2'version in two placespackage.json3.
mcp:— annotations + cancellation (ae19bdf)New
src/tool-annotations.jsdeclares intent for every one of the 50 tools:readOnlyHinton read-only tools,destructiveHinton backup-restore / db-import / deploy / execute-sudo / edit / schedule,idempotentHintwhere running twice is safe, plus a humantitlefor the/mcppalette. Claude Code and other hosts use these for auto-approval.registerToolConditionalnow also forwardsextra.signal(MCP$/cancelRequest) asargs.abortSignal--streamExecCommandalready accepts anAbortSignal, so exec / execute_sudo / execute_group / cat / tail handlers honor client-side Esc instead of running to completion on the target host. 9 new invariant tests pin the annotations.4.
test:— zero-coverage gap closure (7c349e3)Three new test files + an extension, 32 tests:
test-tool-config-manager.js(17): gate for all 50 tools had ZERO coverage before. Pins default config on missing file, corrupt-JSON fallback, invalid-structure fallback, mode-all / mode-minimal / mode-custom semantics, tool overrides,disableGroup("core")refusal, unknown-tool rejection, export shape.test-config-loader.js(9): theenv > .env > TOMLprecedence CLAUDE.md documents was untested. Pins precedence, alias handling (key_path/keypath/ssh_key), case-insensitive names, corrupt-TOML fallback. Regression here silently changes which host Claude connects to.test-index-registration.js(6): drift-guard. Fails CI if someone adds a tool toTOOL_GROUPSbut forgets toregisterToolConditionalin index.js (or the reverse).test-structured-result.js(+7): regression suite for the Wave 2fail()hardening -- null/undefined/object-without-message/circular-object all produce useful errors.5.
docs:— source-of-truth sync (495f3ab)README, CLAUDE.md, docs/TOOL_MANAGEMENT.md, wiki/Home.md, wiki/Tool-reference.md, wiki/Recipes.md all had 51/ssh_alert_setup/drifted pie-chart counts. Fixed.
Test plan
npm test-- 628 pass, 0 fail (was 575 pre-PR; +53 net, all new are green)npm run lint-- clean (was 91 warnings pre-PR)./scripts/validate.sh-- cleannode --check src/index.js-- OK after deleting 289 linesgit grep 'ssh_alert_setup'-- no matchesgit grep '51 tool'-- no matchesNot in this PR (intentionally deferred)
Resources/Prompts/CompletionsAPIs (@ssh-manager:ssh://...mentions,/mcp__ssh-manager__deploy-to-stagingprompts, server-name tab completion). Sizeable, cohesive follow-up.ssh_stat,ssh_find,ssh_grep,ssh_sockets,ssh_context,ssh_package,ssh_git. Catalog expansion, separate PR.runTool()pipeline unification across exec/cat/tail handlers (~100 LOC of de-duplication once the annotations/cancellation settles).ConnectionPoolclass extraction fromsrc/index.js.abortSignalwiring to transfer / docker / backup / db / deploy handlers (pattern's established; they still have their local timeouts).