Skip to content

Refactor/maintainability#4

Merged
TeoSlayer merged 3 commits intomainfrom
refactor/maintainability
Feb 9, 2026
Merged

Refactor/maintainability#4
TeoSlayer merged 3 commits intomainfrom
refactor/maintainability

Conversation

@TeoSlayer
Copy link
Copy Markdown
Owner

  • Add install section to agent skills documentation for self-contained onboarding
  • Extract named constants and eliminate magic numbers across the daemon package (beacon message types, retry/RTO parameters, buffer capacities,
    heartbeat timing)
  • Define shared sentinel errors (ErrNodeNotFound, ErrConnClosed, ErrDialTimeout, etc.) replacing ~25 ad-hoc fmt.Errorf strings — callers can now
    use errors.Is()
  • Deduplicate recv-push goroutine (startRecvPusher), add jsonRPC() helper in driver, convert short mutex patterns to defer, fix silent error
    ignoring

teovl added 3 commits February 9, 2026 14:06
Agents and users reading SKILLS.md had no install instructions.
Adding the section makes the doc self-contained for onboarding.
Hardcoded numeric literals scattered across the daemon package made
tuning values difficult to find and reason about. This centralizes
them into named constants with clear documentation:

- Beacon message types (protocol/header.go): single source of truth
  used by beacon server, tunnel manager, and tests
- Dial/retransmission constants: retry counts, RTO bounds, intervals
- RTO parameters (RFC 6298): clock granularity, min/max clamp values
- Zero-window probe bounds, accept queue, send buffer capacities
- Handshake timing: replay reaper interval, recv timeout, close delay
- ConnState.String() method replaces inline switch in ConnectionList
- Heartbeat loop uses config keepalive interval instead of hardcoded 30s
Define shared sentinel errors (ErrNodeNotFound, ErrNetworkNotFound,
ErrConnClosed, ErrConnRefused, ErrDialTimeout, ErrChecksumMismatch) in
protocol/header.go and replace ~25 ad-hoc fmt.Errorf strings across
registry, daemon, driver, and beacon — callers can now use errors.Is().

Extract startRecvPusher() in daemon/ipc.go to deduplicate the recv-push
goroutine that was copy-pasted between CmdDial and CmdAccept handlers.
Add jsonRPC() helper and named sub-command constants in driver/driver.go,
reducing 10 methods from 8-12 lines each to 1-5 lines.

Fix silent error ignoring in dashboard (w.Write), gateway (io.Copy),
and nameserver (conn.Write) with explicit discards or error logging.
Convert 7 short mutex patterns to use defer across daemon, driver,
and replication packages. Remove redundant replication count() method.

Add overflow guard comment in packet.go and document InsecureSkipVerify
cert-pinning pattern in registry/client.go.
@TeoSlayer TeoSlayer added the enhancement New feature or request label Feb 9, 2026
Comment thread pkg/protocol/packet.go Dismissed
Comment thread pkg/registry/client.go
// use VerifyPeerCertificate for certificate pinning (SHA-256 fingerprint).
// This is the standard Go pattern — the custom callback below provides
// strictly stronger verification than CA-based trust.
InsecureSkipVerify: true, //nolint:gosec // cert pinning via VerifyPeerCertificate

Check failure

Code scanning / CodeQL

Disabled TLS certificate check High

InsecureSkipVerify should not be used in production code.
@TeoSlayer TeoSlayer merged commit b197c59 into main Feb 9, 2026
1 of 2 checks passed
@TeoSlayer TeoSlayer deleted the refactor/maintainability branch February 9, 2026 12:18
TeoSlayer pushed a commit that referenced this pull request Mar 24, 2026
Add 18 new tests across 3 test files:
- ipc_ops_test: SetTags error paths, beacon register/list, registry
  operations, daemon custom config, IPC after deregister
- handshake_test: trust persistence load/verify, trust load from disk
- tasksubmit_test: unmarshal type mismatches, frame read/write
  round-trips, oversized frame rejection, status update and result
  message marshaling, TaskFile time utility edge cases

Remove dead RecvResult from tasksubmit client.

Key coverage improvements:
- handleBeaconRegister: 0% → 100%
- handleBeaconList: 0% → 100%
- loadTrust: 25% → 81%
- handleSetTags: 55% → 75%
- handleSetHostname: 71% → 82%
- handleSetTaskExec: 59% → 71%
TeoSlayer added a commit that referenced this pull request Apr 30, 2026
* Add install section to agent skills documentation

Agents and users reading SKILLS.md had no install instructions.
Adding the section makes the doc self-contained for onboarding.

* Extract named constants and eliminate magic numbers

Hardcoded numeric literals scattered across the daemon package made
tuning values difficult to find and reason about. This centralizes
them into named constants with clear documentation:

- Beacon message types (protocol/header.go): single source of truth
  used by beacon server, tunnel manager, and tests
- Dial/retransmission constants: retry counts, RTO bounds, intervals
- RTO parameters (RFC 6298): clock granularity, min/max clamp values
- Zero-window probe bounds, accept queue, send buffer capacities
- Handshake timing: replay reaper interval, recv timeout, close delay
- ConnState.String() method replaces inline switch in ConnectionList
- Heartbeat loop uses config keepalive interval instead of hardcoded 30s

* Improve maintainability with sentinel errors, deduplication, and cleanup

Define shared sentinel errors (ErrNodeNotFound, ErrNetworkNotFound,
ErrConnClosed, ErrConnRefused, ErrDialTimeout, ErrChecksumMismatch) in
protocol/header.go and replace ~25 ad-hoc fmt.Errorf strings across
registry, daemon, driver, and beacon — callers can now use errors.Is().

Extract startRecvPusher() in daemon/ipc.go to deduplicate the recv-push
goroutine that was copy-pasted between CmdDial and CmdAccept handlers.
Add jsonRPC() helper and named sub-command constants in driver/driver.go,
reducing 10 methods from 8-12 lines each to 1-5 lines.

Fix silent error ignoring in dashboard (w.Write), gateway (io.Copy),
and nameserver (conn.Write) with explicit discards or error logging.
Convert 7 short mutex patterns to use defer across daemon, driver,
and replication packages. Remove redundant replication count() method.

Add overflow guard comment in packet.go and document InsecureSkipVerify
cert-pinning pattern in registry/client.go.

---------

Co-authored-by: Teodor Calin <teodor@vulturelabs.io>
TeoSlayer pushed a commit that referenced this pull request Apr 30, 2026
Add 18 new tests across 3 test files:
- ipc_ops_test: SetTags error paths, beacon register/list, registry
  operations, daemon custom config, IPC after deregister
- handshake_test: trust persistence load/verify, trust load from disk
- tasksubmit_test: unmarshal type mismatches, frame read/write
  round-trips, oversized frame rejection, status update and result
  message marshaling, TaskFile time utility edge cases

Remove dead RecvResult from tasksubmit client.

Key coverage improvements:
- handleBeaconRegister: 0% → 100%
- handleBeaconList: 0% → 100%
- loadTrust: 25% → 81%
- handleSetTags: 55% → 75%
- handleSetHostname: 71% → 82%
- handleSetTaskExec: 59% → 71%
TeoSlayer pushed a commit that referenced this pull request May 3, 2026
Test pins the CURRENT (buggy) behavior: SendData appends to
conn.NagleBuf without any cap. With a slow / unACKing peer, cwnd
fills after IW10 (40 KB) and sendSegment blocks; the rest of the
caller's data slice — up to 5 MiB in this test — sits in NagleBuf
indefinitely. With many connections in this state, daemon RSS
climbs linearly until OOM.

Reproducer: addPeerOnDaemon (peer with no responder) + Connection
forced into StateEstablished + advertised 1 MiB recv window so the
binding constraint is cwnd, not flow control. The 5 MiB SendData
call is run in a goroutine; after 2 s of polling, NagleBuf > 1 MiB
asserts the bug exists.

Test passes today (NagleBuf grew past 1 MiB). v1.9.1 GREEN #4 will
introduce MaxNagleBuf = 8 * MaxSegmentSize (32 KB) and ErrSendBufFull,
flipping the assertion to expect the cap is honored.
TeoSlayer pushed a commit that referenced this pull request May 3, 2026
SendData now rejects writes that would push the per-connection NagleBuf
past MaxNagleBuf (32 KB = 8 segments). Caller gets ErrSendBufFull and
must back off / retry.

Pre-fix: SendData unconditionally appended to NagleBuf with no size
cap. With a slow / unACKing peer, cwnd filled after IW10 and
sendSegment blocked, but the application's data slice still sat in
NagleBuf. Many connections in this state → daemon RSS climbed
linearly with offered-but-undeliverable load until host OOM-killed
the process.

Implementation:
  - new const MaxNagleBuf = 8 * MaxSegmentSize in pkg/daemon/ports.go
    (placed alongside the other window-parameter constants)
  - new ErrSendBufFull in pkg/daemon/daemon.go
  - SendData early-return ErrSendBufFull when len(NagleBuf) + len(data)
    > MaxNagleBuf, BEFORE the append. Existing nagleFlush logic is
    unchanged (still drains MSS-sized segments and respects cwnd via
    sendSegment's existing zero-window-probe loop).

Test flipped: TestSendDataNagleBufGrowsUnbounded now asserts:
  - 5 MiB SendData returns ErrSendBufFull immediately
  - NagleBuf remains empty after the rejected write
  - boundary case (MaxNagleBuf-1 + 2 > MaxNagleBuf) also rejects
  - NagleBuf never grows past the cap

Race detector clean across full pkg/daemon -race suite (62 s).

Note: existing callers (dataexchange.WriteFrame, etc.) propagate
SendData errors up. ErrSendBufFull is a transient condition — typical
caller pattern is to retry after a backoff, or surface the error to
the application as a "would block" signal. Pre-v1.9.1 daemons silently
accumulated forever; the new error makes the backpressure visible
instead of silent.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants