Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify a closed stream with the CANCEL code as valid #600

Closed
wants to merge 8 commits into from

Conversation

DDtKey
Copy link
Contributor

@DDtKey DDtKey commented Feb 2, 2022

According to HTTP/2, the CANCEL code only indicates that the stream is no longer needed.
CANCEL (0x8): Used by the endpoint to indicate that the stream is no longer needed.

But when the client receives the response body data and the CANCEL code from the server, it returns an error.
Error example with hyper:

hyper::Error(Body, Error { kind: Reset(StreamId(3), CANCEL, Remote) })

This happens due to incorrect checking of an open stream:: link

@seanmonstar
Copy link
Member

Hey, thanks for the PR! Hm, I'd love to understand this better. I think the current behavior is correct, that if the server sends a CANCEL reset, we should return an "error" that includes that the stream is canceled. It's not saying that it's a huge internal error, but it is different from the normal Ok case, no?

@DDtKey
Copy link
Contributor Author

DDtKey commented Feb 3, 2022

Not really 🤔
This is a legal situation where the data has been fully received and server has also returned the CANCEL code.
But now, the client receives the entire data and returns an error during the polling the stream, although in fact it was polled to the end.

Although the name of code is contradictory, this code only says that the stream is no longer needed. It looks consistent with the state graph.

@glebpom
Copy link

glebpom commented Feb 4, 2022

Hi. I want to add a few details since we worked with @DDtKey on this issue. We met a flaky integration test, where we sent a large request from reqwest to a poem web server (based on hyper). In some circumstances, we got the canceled error instead of the expected 413 error. The problem existed on CI, but not locally. However, when we changed the reactor to current-thread or limited the number of threads to 1, it started to reliable reproduce locally as well. Wireshark has shown that the response was entirely generated, and the stream was CANCELed after that. We used the HTTPS endpoint, so we decrypt the traffic with pre-master keys.
image
The data was fully sent, and the content-length was also correct. However, the client returned the error here. The first chunk contained only a part of the data.
Applying the patch above reliably fixes the problem.

@DDtKey
Copy link
Contributor Author

DDtKey commented Feb 4, 2022

To be clear, reqwest successfully received headers and formed the rewest::Response.
But when we tried to get the response body (reqwest::Response::bytes() -> hyper::to_bytes() -> .. ), we got this behavior. The data was received, but the client returned an error even though the stream was actually polled to the end (i watched it with debugger, second chunk was correctly filled).
The RST_STREAM with CANCEL code looks acceptable to gracefully end the stream in such case

@seanmonstar
Copy link
Member

I see, I think I understand the issue now. The problem isn't so much about CANCEL specifically, it's that if a stream is reset, even though some data was received already, the reset code trumps getting anymore of the data out. I think what we want to do is allow a user to poll more data out of the buffer, and only when there is nothing left, then check if there is an error code. That would handle any reset code. What do you think?

@glebpom
Copy link

glebpom commented Feb 8, 2022

Looks good to me. However, I'm still thinking that, in accordance with spec, CANCEL should be treated as a closed channel, rather than an error. So, in my opinion, both approaches may be implemented.

@olix0r
Copy link
Collaborator

olix0r commented Feb 9, 2022

@glebpom While I understand that applications may want to handle these cases similarly--and I agree it's important to fix the current behavior--I think it would be bad if we lost the ability to differentiate a CANCEL from an END_STREAM.

For example, imagine if a proxy is transporting a HTTP/2 stream. We'd want the proxy to propagate the CANCEL reset and not simply treat it as an END_STREAM, right? If the library treats these cases identically, it would be impossible for such a proxy to preserve the original server behavior.

@glebpom
Copy link

glebpom commented Feb 9, 2022

@olix0r Ok, that makes sense. Seems like on the h2 layer the CANCEL error should propagate. 👍

@DDtKey
Copy link
Contributor Author

DDtKey commented Feb 15, 2022

It looks good to me too.
But how do you propose to implement it? 🤔

@glebpom
Copy link

glebpom commented Jan 20, 2023

A long time has passed since we discussed the bug. We observed the same problem a few more times with different client implementations on the unpatched h2 versions.
So, what would be the plan to fix it on a hyper level properly? Is there an issue in the hyper repo we can link to? Or can we go with the patch in h2?
Thanks.

@glebpom
Copy link

glebpom commented Jan 31, 2023

Hi, @seanmonstar @olix0r. Any updates on this? Thanks

@DDtKey
Copy link
Contributor Author

DDtKey commented Jun 5, 2023

@seanmonstar I noticed several usages of ensure_recv_open() which doesn't check result of the function, it's bool and looks like false is just ignored in some places. And one test is just hanging.

This can be an issue even without this fix, because there is another match-branch in master: Closed(Cause::EndStream) | HalfClosedRemote(..) | ReservedLocal => Ok(false)

So it seems for me logic is incorrect right now, I'm created separate PR to address this: #687
With these changes it works fine and tests are passing. I'm merged that PR to this one to keep this one working.

@DDtKey
Copy link
Contributor Author

DDtKey commented Jul 21, 2023

After testing the case against latest h2 - this patch doesn't seem to work after #634

Now the h2 uses Reset(NO_ERROR) - what's actually seems correct, but probably it should be fixed on hyper side, because this causes error on client side:
hyper::Error(Body, Error { kind: Reset(StreamId(3), NO_ERROR, Remote) })

cc @seanmonstar

I'm closing this PR because the way to fix should be different now.

UPD: looks like it’s directly related to hyperium/hyper#2872

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants