Commit d352dfd
authored
usockets: don't defer TLS close when close_notify flush fails (#30368)
### What
`ssl_handle_shutdown` in `packages/bun-usockets/src/crypto/openssl.c`
returned 0 ("wait for the peer") when `SSL_shutdown` failed with
`SSL_ERROR_WANT_WRITE` / `WANT_READ` during a graceful close
(`force_fast_shutdown == 0`). Change it to return 1 so the raw socket
closes (and `SSL_free` runs) immediately.
### Why
`SSL_shutdown` allocates BoringSSL's `ssl->s3->write_buffer` to hold the
encoded `close_notify` alert. If the BIO write fails (kernel buffer
full, peer already gone), `SSL_shutdown` returns -1 with `WANT_WRITE`.
The old code returned 0 from `ssl_handle_shutdown`, which told
`us_internal_ssl_close` to leave the fd open and wait for the peer.
That deferral is correct for the `SSL_shutdown() == 0` case (alert
flushed, waiting for the peer's reply — see the comment in
`us_internal_ssl_close`). It's wrong here: the alert never went out,
`SSL_SENT_SHUTDOWN` is already set on the first call, and once
`us_internal_ssl_is_shut_down` is true `on_writable`/`on_data`
short-circuit without ever re-dispatching the queued alert. There is no
retry path, so the socket stays in limbo holding `s->ssl` and the write
buffer until *some* other event arrives — which may never happen.
This shows up as an intermittent LSan failure on the Debian x64-asan
lane in `test/js/node/http/node-https-checkServerIdentity.test.ts`: the
spawned child calls `server.close()` and exits right after the request
emits `error`, so the lingering accepted socket never gets another event
and `SSL_free` never runs.
```
Direct leak of 417 byte(s) in 1 object(s) allocated from:
bssl::SSLBuffer::EnsureCap vendor/boringssl/ssl/ssl_buffer.cc:72
bssl::do_tls_write vendor/boringssl/ssl/s3_pkt.cc:194
bssl::tls_dispatch_alert vendor/boringssl/ssl/s3_pkt.cc:373
SSL_shutdown vendor/boringssl/ssl/ssl_lib.cc:1039
ssl_handle_shutdown packages/bun-usockets/src/crypto/openssl.c:821
us_internal_ssl_close packages/bun-usockets/src/crypto/openssl.c:871
us_internal_ssl_on_data
```
This is a recurring pre-existing flake on `main` — see retrigger commits
`8722a10109`, `13a267eabb`, `ded11f3fb7`.
### Behavior
- The unsent `close_notify` is best-effort. Closing without it produces
an abrupt FIN, which is indistinguishable from a half-close to the peer
(TLS 1.3 doesn't require it; TLS 1.2 implementations tolerate it).
- The deferred-graceful-close path (`SSL_shutdown() == 0` → wait for the
peer's `close_notify`) is unchanged.
- `force_fast_shutdown == 1` (forceful close from `_destroy()` / abort)
already returned 1 here, so no change.
### Testing
- `bun bd test test/js/node/http/node-https-checkServerIdentity.test.ts`
— 4 pass, 0 fail.
- `bun bd test test/js/node/tls/node-tls-connect.test.ts
test/js/bun/net/socket.test.ts` — 57 pass, 3 skip, 1 fail. The single
failure (`setSession() should not leak the SSL_SESSION returned by
d2i_SSL_SESSION`, RSS-growth threshold 40 MiB exceeded at ~293 MiB)
reproduces identically on unmodified `main` in a macOS debug build —
pre-existing, unrelated to this change.
- The LSan leak itself is Linux-ASAN-only and timing-dependent; not
reproducible on macOS, so there is no deterministic regression test.1 parent 58fcdad commit d352dfd
1 file changed
Lines changed: 13 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
828 | 828 | | |
829 | 829 | | |
830 | 830 | | |
831 | | - | |
| 831 | + | |
| 832 | + | |
| 833 | + | |
| 834 | + | |
| 835 | + | |
| 836 | + | |
| 837 | + | |
| 838 | + | |
| 839 | + | |
| 840 | + | |
| 841 | + | |
| 842 | + | |
| 843 | + | |
832 | 844 | | |
833 | 845 | | |
834 | 846 | | |
| |||
0 commit comments