Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix] should not break directly #1849

Closed
wants to merge 5 commits into from
Closed

[fix] should not break directly #1849

wants to merge 5 commits into from

Conversation

ylht
Copy link
Contributor

@ylht ylht commented May 7, 2024

We should invoke the take function for sent_frames, which is the same as in lines 709 and 753.

@djc
Copy link
Member

djc commented May 7, 2024

Thanks! We should probably have a regression test for this...

@ylht ylht changed the title [fix] should invoke take for sent_frames [fix] should not break directly May 7, 2024
@ylht
Copy link
Contributor Author

ylht commented May 7, 2024

I realized that the error is not coming from func take, this line is correct. The error comes from the fact that we shouldn't just break when not padding, otherwise re-instantiating let builder = builder_storage.get_or_insert(PacketBuilder::new( on line 783 won't work. This will cause that we use the last builder when going to the logic below // Finish the last packet on line 924.

@ylht
Copy link
Contributor Author

ylht commented May 7, 2024

It's hard for me to provide a regression test for my PR, but it does solve my problem temporarily.

@ylht ylht requested a review from djc May 7, 2024 13:23
@djc
Copy link
Member

djc commented May 7, 2024

I'm going to wait for a review from @Ralith here, who touched this code last.

It would be helpful if you could explain in your PR description or in an issue what the symptoms are that led you to this fix. (For example, is this the same as or different from #1850?)

Copy link
Collaborator

@Ralith Ralith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear what this change is intended to accomplish, but its effect is solely to disable intended behavior. Refer to the comments on MAX_PADDING just above your change. Perhaps you could start by describing the problem you're having?

@@ -693,17 +693,15 @@ impl Connection {
let packet_len_unpadded = cmp::max(builder.min_size, buf.len())
- datagram_start
+ builder.tag_len;
if packet_len_unpadded + MAX_PADDING < segment_size {
if packet_len_unpadded + MAX_PADDING >= segment_size {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect this condition to be unsatisfiable. segment_size is the maximum length packet assembly is permitted to produce.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry that you may be confusing. If this condition is always unsatisfiable, then the original condition is always true, and it means that we always execute original break operation in its branch, which is incorrect.

In fact, segment_size is exactly the maximum length, so it is always larger than packet_len_unpadded. But the comparison add extra MAX_PADDING to judge whether we skip amount of padding. Therefore, their sum may be larger than segment_size and then the condition can be satisfiable is some case.

In original version, if the condition is satisfiable, we directly break from the while loop and execute the packet logic for the last packet. In current version, we just skip the padding operation builder.pad_to(segment_size as u16); but keep all the following operations.

In short, the original version skip all the following operations, but the current version just skip the padding operation builder.pad_to(segment_size as u16);.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that the involved commit is d5fc528.

Copy link
Collaborator

@Ralith Ralith May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this condition is always unsatisfiable, then the original condition is always true

Oops, yes, you're correct. Nonetheless, the original behavior is intended. The difference achieved by breaking here is what terminates the GSO batch, which is the purpose of this condition. Removing the break allows the batch to continue with incorrect padding. This is why the tests are failing.

@ylht
Copy link
Contributor Author

ylht commented May 8, 2024

It's not clear what this change is intended to accomplish, but its effect is solely to disable intended behavior. Refer to the comments on MAX_PADDING just above your change. Perhaps you could start by describing the problem you're having?

Thanks for your contribution, I use Quinn to implement a private proxy. Recently, I update Quinn to 0.11.0, but I cannot pass the test of speedtest.net. It reports the error stream reset by peer: error 0.

I test your recent commits one by one using binary search (utilizing quinn={git="...", rev="..."}). I find that the first commit came from d5fc528. I attempt to review this commit and make a PR based on my understanding.

@Ralith
Copy link
Collaborator

Ralith commented May 8, 2024

Recently, I update Quinn to 0.11.0, but I cannot pass the test of speedtest.net. It reports the error stream reset by peer: error 0.

I don't know what "the test of speedtest.net" is. An unexpected "stream reset by peer" error is almost certainly an application logic error. You should investigate why the sender is deciding to issue a reset.

I find that the first commit came from d5fc528

This is most likely a coincidence arising from your application logic being timing-sensitive.

@ylht
Copy link
Contributor Author

ylht commented May 8, 2024

Yes, you are right. I try to check again.

@ylht ylht closed this May 8, 2024
@ylht
Copy link
Contributor Author

ylht commented May 8, 2024

Oh~~, I locate the bug, but I cannot fix it.

I find the unit test gso_truncation in this commit d5fc528 is wrong. In fact, it does not really pass.

Specifically, I run the command cargo test --color=always --lib tests::gso_truncation -- --show-output to observe its output log. The client cannot decrypt the second packet, i.e., server: quinn_proto::connection: failed to authenticate packet.

2024-05-08T12:56:36.427721Z TRACE server: quinn_proto::connection: got Data packet (1053 bytes) from [::1]:44433 using id 4a9a80b04ca84a27
2024-05-08T12:56:36.427744Z TRACE server:recv{space=Data pn=10}: quinn_proto::connection: got datagram frame len=1024
2024-05-08T12:56:36.427761Z TRACE server: quinn_proto::connection: got Data packet (781 bytes) from [::1]:44433 using id 4a9a80b04ca84a27
2024-05-08T12:56:36.427776Z TRACE server: quinn_proto::connection::packet_crypto: decryption failed with packet number 10883353
2024-05-08T12:56:36.427783Z DEBUG server: quinn_proto::connection: failed to authenticate packet
2024-05-08T12:56:36.427790Z TRACE server: quinn_proto::connection: got Data packet (797 bytes) from [::1]:44433 using id 4a9a80b04ca84a27
2024-05-08T12:56:36.427810Z TRACE server:recv{space=Data pn=12}: quinn_proto::connection: got datagram frame len=768

After I delete the following code in the commit d5fc528. The unit test gso_truncation exactly pass. However, I cannot expose the deep reason. Please help me.

                        let packet_len_unpadded = cmp::max(builder.min_size, buf.len())
                            - datagram_start
                            + builder.tag_len;
                        if packet_len_unpadded + MAX_PADDING < segment_size {
                            trace!(
                                "GSO truncated by demand for {} padding bytes",
                                segment_size - packet_len_unpadded
                            );
                            break;
                        }

2024-05-08T12:59:08.856743Z TRACE server: quinn_proto::connection: got Data packet (1053 bytes) from [::1]:44433 using id 8c32e49c8d5f276a
2024-05-08T12:59:08.856789Z TRACE server:recv{space=Data pn=10}: quinn_proto::connection: got datagram frame len=1024
2024-05-08T12:59:08.856824Z TRACE server: quinn_proto::connection: got Data packet (1053 bytes) from [::1]:44433 using id 8c32e49c8d5f276a
2024-05-08T12:59:08.856867Z TRACE server:recv{space=Data pn=11}: quinn_proto::connection: got datagram frame len=768
2024-05-08T12:59:08.856990Z TRACE server: quinn_proto::connection: got Data packet (797 bytes) from [::1]:44433 using id 8c32e49c8d5f276a
2024-05-08T12:59:08.857034Z TRACE server:recv{space=Data pn=12}: quinn_proto::connection: got datagram frame len=768

@ylht
Copy link
Contributor Author

ylht commented May 8, 2024

Please check the unit test further. @Ralith

@ylht
Copy link
Contributor Author

ylht commented May 8, 2024

I have fixed the unit test, and open a new PR #1853.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants