TLS write deadlock when CT-out BIO ring < response body

## Summary

When `TLS_BIO_RING_SIZE` (CT-out, `internal_bio` writebuf) is smaller than what the response emit pass tries to push, the connection hangs: client receives `200` + 0 body bytes and waits until timeout. With the current default `64 KiB` the bug is latent — one SSL_write fits 3–4 records of 16 KiB plaintext each, so most responses succeed in one go. Reducing the ring exposes it on any body > one record.

## Reproduction

1. Edit `src/core/tls_layer.c` so the ciphertext BIO pair uses the small ring on both sides:
   ```c
   BIO_new_bio_pair(&internal_bio, TLS_BIO_RING_SIZE_SMALL,
                    &network_bio,  TLS_BIO_RING_SIZE_SMALL)
   ```
2. Build and start the server.
3. `curl --http2 -k -o /dev/null https://localhost:8443/static/app.js` against a 200 KiB body.
4. Result: `code=200 size=0 time=<curl --max-time>`. `reset.css` (8 KiB) works.

## Root cause

Two write paths bypass the `tls_plaintext_bio` queue and call `SSL_write` directly:

- `h2_tls_write_chunk` in `src/http2/http2_strategy.c:1150` — h2 emit
- `http_connection_tls_fsm_send_plaintext_atomic` in `src/core/http_connection_tls.c:498` — h1 sendfile (also h1 parse-error reply)

Both treat `TLS_IO_WANT_WRITE` (wbio full) as failure and `return false`. The undelivered tail (h2 `stage[]` + offset, or the h1 chunk) is discarded; the caller does **not** re-enter when the wbio drains. The existing `tls_drain` retry mechanism (`cipher_completion → tls_advance_state → tls_drain`) only services `tls_plaintext_bio` — it cannot pick up bytes that never went through that queue.

`tls_write_plaintext` itself was already correctly returning `wrote` only on `rc == 1` (all-or-nothing without `SSL_MODE_ENABLE_PARTIAL_WRITE`). PARTIAL_WRITE is set at `SSL_CTX_set_mode`, so partial writes are possible — but the callers above still throw the unwritten remainder away.

## Fix sketch (~150–200 LoC)

1. **h2 emit**: when `h2_tls_write_chunk` hits `WANT_WRITE`, stash the undelivered slice (`stage[]` + offset, or a heap copy if the staging buffer is on the call stack). Hook `tls_zc_write_done_cb` to resume the same write loop after the cipher write to libuv completes. Return success so the upper layer treats the conn as healthy.
2. **h1 sendfile**: rename `*_atomic` to make non-atomic semantics explicit, do the same `tls_zc_write_done_cb` parking on `WANT_WRITE`, resume on completion. The existing h1 sendfile state machine already has a `H1_SEND_PHASE_TLS_DRAIN` phase — reuse it.
3. **Tests**: a phpt under `tests/phpt/server/tls/` that lowers `TLS_BIO_RING_SIZE` via build flag and verifies a body > one record completes.

## Impact

Until fixed, `TLS_BIO_RING_SIZE` (CT-out) must stay ≥ the largest single SSL_write the emit paths can produce. Empirically `64 KiB` is enough for current h2 / h1 sendfile workloads. Once the retry path lands, the ring can shrink (e.g. to `17 KiB`) and save ~94 KiB per TLS connection (the asymmetric change in v0.6.0 already drops the unused half — this would drop the in-use half too).

## Related

- Asymmetric BIO sizing landed in v0.6.0 (CT-in / PT-app back-channel 17 KiB) — see CHANGELOG.
- Double-destroy fix in v0.6.0 is unrelated but found in the same investigation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TLS write deadlock when CT-out BIO ring < response body #29

Summary

Reproduction

Root cause

Fix sketch (~150–200 LoC)

Impact

Related

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

TLS write deadlock when CT-out BIO ring < response body #29

Description

Summary

Reproduction

Root cause

Fix sketch (~150–200 LoC)

Impact

Related

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions