feat(smtp): distinguish TLS handshake failure from post-TLS application error#718
Conversation
phillip-stephens
left a comment
There was a problem hiding this comment.
Hello @apowis, really appreciate the work on this and the in-depth unit tests. Also good catch on the STATUS_HANDSHAKE_ERROR when we fail to establish a TLS connection.
To clarify what you think the desired status on a failed SMTP (or other protocol where the protocol occurs both before and after a TLS handshake):
- Failure with initial TCP connection -
SCAN_CONNECTION_REFUSED/TIMEOUT/CLOSED - Failure with application protocol, pre-TLS -
SCAN_APPLICATION_ERROR - Failure with TLS handshake -
SCAN_HANDSHAKE_ERROR - Failure with application protocol, post-TLS -
SCAN_TLS_APPLICATION_ERROR
Rename SCAN_TLS_APPLICATION_ERROR -> SCAN_POST_TLS_APPLICATION_ERROR and HandshakeComplete -> HandshakeCompletedSuccessfully (field, JSON tag, schema) per reviewer suggestions for clarity.
Rename SCAN_TLS_APPLICATION_ERROR -> SCAN_POST_TLS_APPLICATION_ERROR and HandshakeComplete -> HandshakeCompletedSuccessfully (field, JSON tag, schema) per reviewer suggestions. Update all test names and references accordingly.
933b145 to
49aa8d5
Compare
|
Thanks for the review @phillip-stephens! I've addressed both points:
I also took the opportunity to add a small feature while I was in this area: Happy to pull Re: applying this to other protocols including HTTP For simple protocols like IMAP, POP3, and FTP — where the module owns the TLS connection directly and calls HTTP is more nuanced. The HTTP module delegates TLS entirely to
Happy to pick that up as a follow-on once this lands. |
49aa8d5 to
1c72142
Compare
Rename SCAN_TLS_APPLICATION_ERROR -> SCAN_POST_TLS_APPLICATION_ERROR and HandshakeComplete -> HandshakeCompletedSuccessfully (field, JSON tag, schema) per reviewer suggestions. Update all test names and references accordingly.
1c72142 to
6fac127
Compare
|
Let's pull out the Also would love to get a PR to add Thanks for this and just @ me when you've pulled out the |
Rename SCAN_TLS_APPLICATION_ERROR -> SCAN_POST_TLS_APPLICATION_ERROR and HandshakeComplete -> HandshakeCompletedSuccessfully (field, JSON tag, schema) per reviewer suggestions. Update all test names and references accordingly.
6fac127 to
823e723
Compare
|
Done — I've pulled For the
We could maybe compress some of these into one PR, such as combining A-E and leaving F and G separate. Let us know what you think. |
Set to true only when the full TLS handshake completes without error, covering both Handshake() and HandshakeContext(). Partial handshake data (ServerHello, certificates) is still surfaced in HandshakeLog on failure via the existing deferred GetHandshakeLog() capture. Registers the new field in the shared tls_log zgrab2 schema SubRecord. Adds unit tests covering success and failure cases.
Distinguishes a protocol-level failure that occurs after a successful TLS handshake from a generic SCAN_APPLICATION_ERROR. Callers can now tell whether a scan failure happened before or after TLS was established.
…on error Track whether TLS is active in the scanner so that application-level errors after a successful handshake return SCAN_POST_TLS_APPLICATION_ERROR instead of SCAN_APPLICATION_ERROR. TLS handshake failures return SCAN_HANDSHAKE_ERROR. TLSLog is populated from partial handshake data even on failure, so callers always see what zcrypto captured. Adds a test that exercises the post-TLS error path using a net.Pipe fake server and the real TLS wrapper with InsecureSkipVerify.
823e723 to
656f1b9
Compare
|
That plan sounds broadly fine with a few changes. Personally, I think we should only use the new For something like Let's:
Thanks for this work, feels like an overdue alignment of our error statuses! |
phillip-stephens
left a comment
There was a problem hiding this comment.
Final nit and then looks good to me
Co-authored-by: Phillip Stephens <pstephens9275@gmail.com>
f149816 to
45bdc97
Compare
phillip-stephens
left a comment
There was a problem hiding this comment.
LGTM, thanks again!
…-distinction-local
…s for all rolled-out modules Add a shared testhelpers package (MakeL4Dialer, GenerateTestCert, RunTLSServer, MakeFailingTLSWrapper, MakeInsecureTLSWrapper) to avoid repeating the net.Pipe/cert-gen/TLS-server boilerplate across six test files. Each module gets the same three test shapes introduced for SMTP in PR zmap#718: - <Module>HandshakeError: TLS wrapper returns error -> SCAN_HANDSHAKE_ERROR - <Module>STARTTLSHandshakeError: pre-TLS exchange succeeds, wrapper fails - <Module>HandshakeCompletedSuccessfully: real stdlib TLS server completes handshake, result.TLSLog.HandshakeCompletedSuccessfully == true ManageSieve additionally gets TestManageSievePostTLSCapabilitiesError which verifies that a dropped connection after a successful handshake returns SCAN_POST_TLS_APPLICATION_ERROR. Also fixes a pre-existing bug in the FTP scanner where implicit-TLS TLSLog was captured into a local results variable via a defer but the returned value was &ftp.results (a copy made before the defer ran), meaning TLSLog was always nil on implicit-TLS success. TLSLog is now captured directly into results before ftp is initialised.
…for all modules Each module gets the same three test shapes introduced for SMTP in PR zmap#718: - Implicit TLS or STARTTLS handshake error -> SCAN_HANDSHAKE_ERROR - STARTTLS handshake error (where applicable) - Real stdlib TLS handshake -> TLSLog.HandshakeCompletedSuccessfully == true ManageSieve additionally gets TestManageSievePostTLSCapabilitiesError: server closes after a successful handshake without sending post-TLS capabilities -> SCAN_POST_TLS_APPLICATION_ERROR. MySQL uses a hand-crafted HandshakeV10 binary packet with CLIENT_SSL set so SupportsTLS() returns true and the scanner reaches the TLS wrapper. Postgres uses a minimal SSLRequest exchange (8 bytes client, 'S' server) to get the scanner to DoSSL, then verifies TLSLog is populated even when the post-TLS Postgres conversation fails (server closes after handshake).
…for all modules Each module gets the same three test shapes introduced for SMTP in PR zmap#718: - Implicit TLS or STARTTLS handshake error -> SCAN_HANDSHAKE_ERROR - STARTTLS handshake error (where applicable) - Real stdlib TLS handshake -> TLSLog.HandshakeCompletedSuccessfully == true ManageSieve additionally gets TestManageSievePostTLSCapabilitiesError: server closes after a successful handshake without sending post-TLS capabilities -> SCAN_POST_TLS_APPLICATION_ERROR. MySQL uses a hand-crafted HandshakeV10 binary packet with CLIENT_SSL set so SupportsTLS() returns true and the scanner reaches the TLS wrapper. Postgres uses a minimal SSLRequest exchange (8 bytes client, 'S' server) to get the scanner to DoSSL, then verifies TLSLog is populated even when the post-TLS Postgres conversation fails (server closes after handshake).
…for all modules Each module gets the same three test shapes introduced for SMTP in PR zmap#718: - Implicit TLS or STARTTLS handshake error -> SCAN_HANDSHAKE_ERROR - STARTTLS handshake error (where applicable) - Real stdlib TLS handshake -> TLSLog.HandshakeCompletedSuccessfully == true ManageSieve additionally gets TestManageSievePostTLSCapabilitiesError: server closes after a successful handshake without sending post-TLS capabilities -> SCAN_POST_TLS_APPLICATION_ERROR. MySQL uses a hand-crafted HandshakeV10 binary packet with CLIENT_SSL set so SupportsTLS() returns true and the scanner reaches the TLS wrapper. Postgres uses a minimal SSLRequest exchange (8 bytes client, 'S' server) to get the scanner to DoSSL, then verifies TLSLog is populated even when the post-TLS Postgres conversation fails (server closes after handshake).
…MySQL, Postgres (#727) * chore: add shared TLS test helpers for module scanner tests Adds modules/testhelpers with five helpers used across the per-module TLS scanner tests: - MakeL4Dialer: wraps a net.Conn as an L4Dialer - GenerateTestCert: throwaway self-signed RSA cert - RunTLSServer: stdlib TLS server goroutine with a post-handshake callback - MakeFailingTLSWrapper: always returns a handshake error - MakeInsecureTLSWrapper: real zcrypto wrapper with InsecureSkipVerify * feat: roll out SCAN_HANDSHAKE_ERROR to IMAP, POP3, FTP, ManageSieve, MySQL, Postgres Each module now returns SCAN_HANDSHAKE_ERROR (instead of TryGetScanStatus or SCAN_APPLICATION_ERROR) when the TLS handshake itself fails, and captures any partial TLSLog before returning the error. ManageSieve also gets SCAN_POST_TLS_APPLICATION_ERROR for post-TLS capability read failures, since it has application-layer exchanges both before and after the STARTTLS handshake (matching the SMTP pattern). IMAP, POP3, FTP, MySQL, and Postgres keep SCAN_APPLICATION_ERROR for application-level errors — these are TCP-TLS-App protocols with no pre-TLS app logic, per reviewer guidance. FTP: GetFTPSCertificates return type changed from error to *zgrab2.ScanError so the caller can propagate the exact status without a TryGetScanStatus call. FTP (bugfix): implicit-TLS TLSLog was being written into a local results variable via a defer, but the returned value was &ftp.results (a value copy made before the defer ran), so TLSLog was silently nil on every implicit-TLS success. Fixed by capturing it directly before initialising ftp. Postgres: DoSSL now stores the TLS connection even on handshake failure so GetTLSLog can surface partial handshake data. The newConnection caller is updated from SCAN_APPLICATION_ERROR to SCAN_HANDSHAKE_ERROR, which also fixes an unintended retry: the SSL-retry loop was incorrectly re-attempting on TLS handshake failures (it should only retry on RequestSSL rejection). * test: verify SCAN_HANDSHAKE_ERROR and HandshakeCompletedSuccessfully for all modules Each module gets the same three test shapes introduced for SMTP in PR #718: - Implicit TLS or STARTTLS handshake error -> SCAN_HANDSHAKE_ERROR - STARTTLS handshake error (where applicable) - Real stdlib TLS handshake -> TLSLog.HandshakeCompletedSuccessfully == true ManageSieve additionally gets TestManageSievePostTLSCapabilitiesError: server closes after a successful handshake without sending post-TLS capabilities -> SCAN_POST_TLS_APPLICATION_ERROR. MySQL uses a hand-crafted HandshakeV10 binary packet with CLIENT_SSL set so SupportsTLS() returns true and the scanner reaches the TLS wrapper. Postgres uses a minimal SSLRequest exchange (8 bytes client, 'S' server) to get the scanner to DoSSL, then verifies TLSLog is populated even when the post-TLS Postgres conversation fails (server closes after handshake). * move conn close defer right after conn creation in ftp/scanner * added 2 new asserts to tests for mysql and postgres --------- Co-authored-by: Phillip Stephens <phillip@cs.stanford.edu>
When scanning with TLS (SMTPS or STARTTLS), zgrab2 previously had no way to tell whether a failed scan was caused by the TLS handshake itself failing, or by the application-layer exchange failing after TLS was successfully established. Both cases returned a generic error status, and the `tls` JSON object gave no indication of whether the handshake completed.
This PR adds that distinction via two changes:
`TLSLog.HandshakeCompletedSuccessfully bool` (`json:"handshake_completed_successfully"`)
Set to `true` in `TLSConnection.Handshake()` only when the handshake returns `nil`. All modules that call `tlsConn.GetLog()` get this field in their JSON output automatically — no per-module changes required.
The `handshake_completed_successfully` field is also registered in the shared `tls_log` SubRecord in `zgrab2_schemas/zgrab2/zgrab2.py` so zschema validation passes for all modules.
`SCAN_POST_TLS_APPLICATION_ERROR` (`"tls-application-error"`)
A new scan status for "TLS handshake succeeded but the application-layer exchange failed." The existing `SCAN_HANDSHAKE_ERROR` already covers TLS failure; this fills the gap.
The SMTP module is updated as a POC:
How to Test
```
go test github.com/zmap/zgrab2/...
go test github.com/zmap/zgrab2/modules/smtp -v -run 'TestScanSMTPS|TestScanSTARTTLS|TestHandshake'
```
Three new SMTP scanner tests cover the three states: SMTPS handshake failure, STARTTLS handshake failure, and TLS-success with a 500 error banner (uses a real in-process TLS handshake between stdlib server and zcrypto client over `net.Pipe`).
Notes
The `handshake_completed_successfully` field is available to all other TLS-capable modules already — only the status-code handling needs updating per protocol. If this approach is accepted, IMAP, POP3, FTP, HTTP, and the remaining modules will follow in separate PRs.