Skip to content

Fix unexpected errors when cancelling QUIC stream mid RPC#408

Merged
n8mgr merged 7 commits intomasterfrom
chris/update-quic
Mar 13, 2026
Merged

Fix unexpected errors when cancelling QUIC stream mid RPC#408
n8mgr merged 7 commits intomasterfrom
chris/update-quic

Conversation

@ChrisSchinnerl
Copy link
Member

@ChrisSchinnerl ChrisSchinnerl commented Mar 13, 2026

This PR is related to this other open PR: https://github.com/SiaFoundation/indexd/pull/864

When testing it out I noticed errors in the logs such as

  • decoding errors due to large prefix
  • proof validation errors
  • data not being a multiple of leaf size

These errors were always happening on cancelled streams.

The problem is that Close read up to 4kib of data while an ongoing RPC might be reading data at the same time. Depending on the timing that would corrupt response data in different ways. By always explicitly closing the reading side, those errors disappeared because we didn't "steal" data from the stream and the RPC still got unblocked.

The reason these errors are bad in the first place is that they are neither context errors, quic errors nor rpc errors. So they lead to the transport being reset which harmed performance.

The other part of this PR is adding descriptions whenever we call CloseWithError. That's just useful information to have when logging a QUIC error.

@ChrisSchinnerl ChrisSchinnerl self-assigned this Mar 13, 2026
Copilot AI review requested due to automatic review settings March 13, 2026 12:19
@github-project-automation github-project-automation bot moved this to In Progress in Sia Mar 13, 2026
@ChrisSchinnerl ChrisSchinnerl changed the title Chris/update quic Fix unexpected errors when cancelling QUIC stream mid RPC Mar 13, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes QUIC stream-close behavior to avoid corrupting in-flight RPC reads on cancelled streams, and improves QUIC/WebTransport connection-close observability by adding close reasons.

Changes:

  • Update stream.Close() to cancel the read side explicitly instead of attempting to drain bytes.
  • Add descriptive reason strings to CloseWithError calls for QUIC/WebTransport connections and sessions.
  • Add changeset entries documenting both changes.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
rhp/v4/quic/quic.go Prevents concurrent close/drain from stealing bytes; adds QUIC close reason strings for better logging/debuggability.
.changeset/pass_proper_descriptions_when_closing_quic_connection.md Documents the addition of close reason strings.
.changeset/always_cancel_read_side_of_stream_when_calling_streamclose.md Documents the stream read-cancellation change on close.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

ChrisSchinnerl and others added 3 commits March 13, 2026 13:41
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
chris124567
chris124567 previously approved these changes Mar 13, 2026
conn, err := quic.DialAddr(ctx, addr, tc, qc)
// Dial the QUIC connection using 0-RTT
// https://quic-go.net/docs/quic/client/#0rtt-security
conn, err := quic.DialAddrEarly(ctx, addr, tc, qc)
Copy link
Member

@chris124567 chris124567 Mar 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the design of the TLS protocol, clients cannot directly request session tickets or unilaterally enable 0-RTT. These capabilities depend on the server’s configuration and support.
It needs to use TLS session resumption. There’s no way to use 0-RTT without a TLS session ticket.
The server’s support for session resumption, indicated by the session ticket issued on the initial connection, must be present.
The ALPN (configured using tls.Config.NextProtos) on the new connection must be the same.

Are we sure 0-RTT will "just work"? Otherwise LGTM

Copy link
Member

@n8mgr n8mgr Mar 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we need to manually handle rejections. Probably drop the 0-RTT change for now.

conn, err := tr.DialEarly(ctx, <server address>, <tls.Config>, <quic.Config>)
// ... error handling
_, err :=conn.AcceptStream()
if errors.Is(err, quic.Err0RTTRejected) {
  // The server rejected 0-RTT. None of the data sent so far was processed.
  newConn := conn.NextConnection()
}

https://quic-go.net/docs/quic/client/#0rtt-rejection

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave this out for now and only enable it explicitly on the server side as that is the default for the http3.Server anyway.

@n8mgr n8mgr merged commit b555900 into master Mar 13, 2026
13 checks passed
@n8mgr n8mgr deleted the chris/update-quic branch March 13, 2026 20:40
@github-project-automation github-project-automation bot moved this from In Progress to Done in Sia Mar 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants