do not cap pto_count to one #148

Merged · 5 commits · Apr 19, 2019

Conversation

kazuho (Member) commented Apr 15, 2019

@janaiyengar Please review.

FWIW, QUICLY_MAX_PTO_COUNT is currently set to 16, meaning that the maximum PTO timeout is about 65 seconds when using a 1ms timer, which should be more than enough.

Closes #150.
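
To put the arithmetic in view, here is a minimal sketch of the backoff being described; this is an illustration rather than quicly's actual code, and only `QUICLY_MAX_PTO_COUNT` comes from this PR (the function and parameter names are made up):

```c
/* Minimal sketch of the PTO backoff arithmetic described above; not quicly's
 * actual code. Only QUICLY_MAX_PTO_COUNT appears in this PR; pto_timeout_ms
 * and pto_base_ms are illustrative names. */
#include <stdint.h>

#define QUICLY_MAX_PTO_COUNT 16

static uint64_t pto_timeout_ms(uint64_t pto_base_ms, unsigned pto_count)
{
    if (pto_count > QUICLY_MAX_PTO_COUNT)
        pto_count = QUICLY_MAX_PTO_COUNT;
    /* the PTO timeout doubles on each consecutive unacknowledged probe */
    return pto_base_ms << pto_count;
}

/* With a 1ms base timer, pto_timeout_ms(1, 16) == 65536ms, i.e. ~65 seconds. */
```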

kazuho requested a review from janaiyengar on Apr 15, 2019 at 13:12
kazuho (Member, Author) commented Apr 15, 2019

I think the proposed changes are correct.

My assumption is that the test failures are due to other bugs becoming prominent now that the endpoints are sending fewer packets when under loss.

janaiyengar (Collaborator) commented:

As discussed offline, let's dispose of the MAX_PTO_COUNT altogether.

janaiyengar (Collaborator) left a comment

Looks great, thanks for doing this.

kazuho changed the title from "cap pto_count to 16 (instead of one)" to "do not cap pto_count to one" on Apr 16, 2019
kazuho (Member, Author) commented Apr 16, 2019

@janaiyengar I believe you'll love e11b9d7.

With that fixed, the remaining issue seems to be the following. I am uncertain how to resolve it; may I ask for your wisdom?

When the network is severely congested in both directions, it is often the case that even the ACKs you receive are outdated, possibly by more than 3 PTOs. However, because quicly discards sentmap entries after 3 PTOs, those ACK frames are ignored. The situation therefore becomes indistinguishable from no communication at all, and the connection times out.

To me it seems that this is something very different from what we see in TCP.

Do we need to fix this issue? If so, how should we?
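
To make the failure mode concrete, here is a simplified model of the behavior being described; all names (`sentmap_entry_t`, `sweep_sentmap`, `on_ack_received`) are hypothetical, and quicly's actual sentmap API differs:

```c
/* Simplified model of the failure mode, not quicly's actual sentmap code;
 * all names here are hypothetical. Entries for packets sent more than 3 PTOs
 * ago are swept out, so an ACK arriving later than that matches nothing. */
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t packet_number;
    int64_t sent_at; /* milliseconds */
    int in_use;
} sentmap_entry_t;

/* retire entries for packets sent more than 3 PTOs ago */
static void sweep_sentmap(sentmap_entry_t *entries, size_t num, int64_t now, int64_t pto_ms)
{
    for (size_t i = 0; i < num; ++i)
        if (entries[i].in_use && entries[i].sent_at < now - 3 * pto_ms)
            entries[i].in_use = 0;
}

/* returns 1 if the ACK was applied; a late ACK for a swept entry returns 0,
 * indistinguishable from receiving no ACK at all */
static int on_ack_received(sentmap_entry_t *entries, size_t num, uint64_t acked_pn)
{
    for (size_t i = 0; i < num; ++i)
        if (entries[i].in_use && entries[i].packet_number == acked_pn) {
            entries[i].in_use = 0; /* acked; release the entry */
            return 1;
        }
    return 0;
}
```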

kazuho (Member, Author) commented Apr 16, 2019

I tried retaining the sentmap for a longer period; that did not fix the issue.

The other case I am concerned about is quicly sending an ACK only once for each new packet it receives (either immediately or after ack-delay). This is different from TCP, where an ACK is bundled with every packet you send. Maybe we need to retransmit an ACK whenever the PTO timer fires (because that is an indication of loss in either or both directions).
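
A sketch of that idea, under assumed names (`conn_t` and its fields are hypothetical, not quicly's types): when the PTO timer fires, the ACK is re-armed so the probe packet carries a fresh ACK frame.

```c
/* Hedged sketch of re-sending an ACK when the PTO timer fires; conn_t and
 * its fields are hypothetical, not quicly's actual types. */
#include <stdint.h>

typedef struct {
    int ack_pending;      /* set when an ACK frame should go into the next packet */
    int64_t pto_deadline; /* milliseconds */
    unsigned pto_count;
} conn_t;

static void on_pto_fired(conn_t *conn, int64_t now, int64_t pto_base_ms)
{
    ++conn->pto_count;
    /* a PTO firing indicates loss in either (or both) directions, so re-arm
     * the ACK as well: the probe packet will then carry a fresh ACK frame */
    conn->ack_pending = 1;
    conn->pto_deadline = now + (pto_base_ms << conn->pto_count);
}
```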

janaiyengar (Collaborator) commented Apr 18, 2019

> @janaiyengar I believe you'll love e11b9d7.

+1!

> When the network is severely congested in both directions, it is often the case that even the ACKs you receive are outdated, possibly by more than 3 PTOs. However, because quicly discards sentmap entries after 3 PTOs, those ACK frames are ignored. The situation therefore becomes indistinguishable from no communication at all, and the connection times out.

This is indeed different from TCP, because TCP doesn't discard anything that hasn't been acked. As discussed offline, let's disable the 75% loss tests for now and proceed as long as the 50% loss tests work. Either way, let's look at the logs next week to figure out what's going on with the tests.

kazuho merged commit deb4908 into master on Apr 19, 2019
kazuho added a commit that referenced this pull request on Feb 22, 2020: "more key exchanges for the OpenSSL backend"
Successfully merging this pull request may close these issues: Broken PTO timeout calculation? (#150)