Skip to content

nssh: ntfy read deadlines + one-shot remote prep#3

Merged
abizer merged 1 commit into
masterfrom
nssh-3
Apr 22, 2026
Merged

nssh: ntfy read deadlines + one-shot remote prep#3
abizer merged 1 commit into
masterfrom
nssh-3

Conversation

@abizer
Copy link
Copy Markdown
Owner

@abizer abizer commented Apr 22, 2026

Summary

  • Fix ntfy subscriber hanging on dead TCP sockets (overnight/sleep/NAT rebind). Wrap the dialed net.Conn so every Read resets a 90s deadline; ntfy's ~55s keepalives keep a healthy stream inside the window, silence trips i/o timeout → reconnect.
  • Collapse checkRemoteVersion + writeRemoteSession into a single bash -l -s invocation (prepareRemote). Startup goes from 2 SSH round-trips to 1 before the interactive session.
  • Drop the now-unused probeRemoteVersion/checkRemoteVersion from infect.go.

Test plan

  • go build ./... and go test ./... pass locally
  • nssh <host> opens a session and the ntfy subscriber logs a reconnect line if the network is bounced
  • Leave a session idle > 2 minutes with no activity; clipboard still works after
  • Version-mismatch prompt still fires when the remote nssh is outdated
  • Missing-remote prompt still fires when nssh isn't installed on the remote

🤖 Generated with Claude Code


Note

Medium Risk
Medium risk because it changes the ntfy streaming transport behavior and refactors remote session initialization/version probing into a new single SSH invocation, which could affect connection reliability and upgrade prompts.

Overview
Improves reliability of the ntfy subscriber by dialing with TCP keepalive, enforcing per-read deadlines (to detect zombie sockets), and logging scanner errors before reconnecting.

Refactors remote startup so version probing, session file write, and remote JSONL log seeding happen in one bash -l SSH call (prepareRemote), and removes the now-redundant remote version-check helpers from infect.go. Also updates README wording to better describe nssh’s purpose.

Reviewed by Cursor Bugbot for commit 9af7c7e. Bugbot is set up for automated code reviews on this repo. Configure here.

Two fixes that make nssh survive overnight idle and cut a round-trip at
session start.

ntfy subscriber: the local long-poll held an HTTP connection with no
timeouts at all. When the laptop slept or the connection was silently
dropped (NAT rebind, proxy kill, sleep/wake), scanner.Scan() blocked
forever on a zombie socket and reconnect never fired. Fixed by wrapping
the dialed net.Conn so every Read pushes the read deadline 90s forward.
ntfy sends keepalives every ~55s, so the window is 1.6× safety — any
true silence trips i/o timeout, scanner exits, outer loop reconnects.
Also set ResponseHeaderTimeout and a 15s dialer keepalive as a belt.

Session prep: checkRemoteVersion and writeRemoteSession were two
separate SSH invocations. Merged into a single bash -l login shell that
echoes NSSH_VERSION: <ver|none>, writes the session TOML, and appends
the session-open JSONL event. Caller inlines the warn/prompt-to-infect
flow on the parsed version. Cuts startup from two SSH round-trips to
one (plus the real session connect).

Removed now-unused probeRemoteVersion / checkRemoteVersion from
infect.go; promptYes stays there, it's still used by infect flows.

README: abizer clarified the copy while this was in flight.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@abizer abizer merged commit a28fd7d into master Apr 22, 2026
1 check passed
@abizer abizer deleted the nssh-3 branch April 22, 2026 16:48
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9af7c7ec98

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread cmd/nssh/main.go
Comment on lines +221 to +223
Transport: &http.Transport{
DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
conn, err := dialer.DialContext(ctx, network, addr)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restore proxy support in ntfy HTTP transport

Replacing the default client transport with a custom http.Transport here drops Proxy: http.ProxyFromEnvironment, and in Go a nil Proxy means no proxy is used. That regresses connectivity for environments that require HTTP_PROXY/HTTPS_PROXY to reach ntfy (corporate networks, CI runners, bastion setups), where previous behavior worked via the default transport.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

Bugbot Autofix is ON, but it could not run because the branch was deleted or merged before autofix could start.

Reviewed by Cursor Bugbot for commit 9af7c7e. Configure here.

Comment thread cmd/nssh/main.go
},
ResponseHeaderTimeout: 30 * time.Second,
},
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Custom transport loses DefaultTransport's 30s dial timeout

Medium Severity

The net.Dialer only sets KeepAlive but omits Timeout. The previous &http.Client{} used http.DefaultTransport, which dials with a 30s timeout. With Timeout defaulting to zero, TCP connects to an unreachable ntfy server can hang for minutes (OS-level TCP timeout). This directly undermines the deadlineConn improvement — after detecting a dead socket in ~90s, the reconnect dial itself can stall for far longer. The dialer also drops Proxy: http.ProxyFromEnvironment and TLSHandshakeTimeout, which http.DefaultTransport provides.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 9af7c7e. Configure here.

Comment thread cmd/nssh/main.go
}
return v
}
return ""
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Version output discarded when session write fails

Low Severity

The script uses set -e and echoes NSSH_VERSION: before mkdir/cat. If the version probe succeeds but session-file writing fails (permissions, disk full), the script exits non-zero. Go's cmd.Output() still returns the captured stdout, but the err != nil check discards it and returns "". This causes a misleading "not installed on remote" prompt when nssh IS installed but the session directory is unwritable.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 9af7c7e. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant