Refactor DetectLostPackets #2066

Merged
merged 7 commits into from
Dec 6, 2018

martinthomson
Member

DetectLostPackets had a couple of problems that I think this change
addresses, and some that it doesn't.

The first is that it isn't clear what is being iterated over and in what
order. I think that the point here is to iterate over sent_packets,
starting with the oldest. Correct me if that is wrong. In any case,
this change doesn't assume an iteration order. The only reason the
iteration order is important is in setting loss_time, which needs to be
the earliest time (assuming that I'm right).

This simplifies the function, by setting thresholds at the top and doing
a simple comparison.
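
As a rough illustration of that shape, here is a minimal Python sketch. It is not the draft's pseudocode: the `SentPacket` type, the function signature, integer-millisecond timestamps, and the default `reordering_threshold` of 3 are all assumptions made for the example. The thresholds are computed once up front, the loop makes no assumption about iteration order, and `loss_time` ends up as the earliest pending deadline.

```python
from dataclasses import dataclass


@dataclass
class SentPacket:
    # Hypothetical record type for this sketch only.
    packet_number: int
    time_sent: float  # milliseconds on an arbitrary clock


def detect_lost_packets(sent_packets, largest_acked_pn, now,
                        latest_rtt, smoothed_rtt, reordering_threshold=3):
    """Return (lost_packets, loss_time); loss_time is None if nothing is pending."""
    # Thresholds are set once, before the loop.
    loss_delay = 9 / 8 * max(latest_rtt, smoothed_rtt)
    lost_send_time = now - loss_delay                # sent before this: lost
    lost_pn = largest_acked_pn - reordering_threshold  # numbered below this: lost

    lost_packets = []
    loss_time = None
    for unacked in sent_packets:  # any order works
        if unacked.packet_number > largest_acked_pn:
            continue
        # Strict comparisons here; whether the time check should be
        # inclusive is a separate question.
        if (unacked.time_sent < lost_send_time
                or unacked.packet_number < lost_pn):
            lost_packets.append(unacked)
        else:
            # Keep the earliest deadline, regardless of iteration order.
            candidate = unacked.time_sent + loss_delay
            loss_time = candidate if loss_time is None else min(loss_time, candidate)
    return lost_packets, loss_time
```

Because `loss_time` is tracked with `min`, feeding the packets in reverse order produces the same result.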

I've added a note about loss_time potentially being in the past.

The problem that remains is that this appeared to iterate only over
packets that have a packet number less than the largest acknowledged.
I've added that condition to the loop, but I don't think that it's
right. I think that it's just redundant - and while an implementation
might stop its iteration at the largest acknowledged to save cycles,
this function will operate the same without the extra check.

It's also not clear whether the greater than comparisons here were
correct. If you assume that firing of the timer cannot take zero time,
this is never an issue, but with discrete intervals on time values,
that's not always going to happen. As set up, this code could be called
at exactly loss_time for a packet, in which case that packet will not be
declared lost. I think that this wants >= in the time comparison for
that reason.
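
The boundary case can be shown with a small Python sketch. The helper names and the integer-millisecond clock are assumptions for the example; the point is only what happens when the timer fires on exactly the tick it was armed for.

```python
def is_lost_strict(time_sent, now, loss_delay):
    # Lost only once strictly more than loss_delay has elapsed.
    return now - time_sent > loss_delay


def is_lost_inclusive(time_sent, now, loss_delay):
    # Lost as soon as loss_delay has elapsed.
    return now - time_sent >= loss_delay


# Integer milliseconds stand in for a clock with discrete ticks.
time_sent, loss_delay = 1000, 90
loss_time = time_sent + loss_delay  # deadline the timer is armed for
now = loss_time                     # timer fires exactly on that tick

print(is_lost_strict(time_sent, now, loss_delay))     # False
print(is_lost_inclusive(time_sent, now, loss_delay))  # True
```

With the strict form, the timer fires, nothing is declared lost, and the same deadline is computed again.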

Finally, should the early retransmit timer be configurable? Should it
be set based on kTimeReorderingThreshold?

@martinthomson added the -recovery and editorial labels Nov 29, 2018 (editorial: an issue that does not affect the design of the protocol; does not require consensus)
Contributor

@janaiyengar janaiyengar left a comment

Some comments, but I think this refactor is basically correct (mod the corrections I've noted), and is definitely helpful! @ianswett should take a look as well.

delta = largest_acked.packet_number - unacked.packet_number
if (time_since_sent > delay_until_lost ||
    delta > reordering_threshold):
loss_delay = 9/8 * max(latest_rtt, smoothed_rtt)

This can be (1 + time_reordering_fraction)
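
A sketch of that suggestion in Python, assuming the fraction is 1/8 (the value that makes it equal to the hard-coded 9/8); the constant and function names are illustrative only:

```python
from fractions import Fraction

# Assumed value: 1/8 is what makes (1 + fraction) equal the literal 9/8.
TIME_REORDERING_FRACTION = Fraction(1, 8)


def loss_delay(latest_rtt, smoothed_rtt, fraction=TIME_REORDERING_FRACTION):
    # Same computation as 9/8 * max(...), with the fraction named.
    return (1 + fraction) * max(latest_rtt, smoothed_rtt)


print(1 + TIME_REORDERING_FRACTION)  # 9/8
print(loss_delay(80, 64))            # 90
```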

continue

if (unacked.time_sent > lost_send_time ||
    unacked.packet_number > lost_pn):

This should be

    if (lost_send_time > unacked.time_sent || lost_pn > unacked.packet_number)
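
To see why the direction matters, here is a small Python sketch with made-up numbers (the function names and values are assumptions for illustration). The condition as written in the diff flags a recently sent packet and misses an old one; the corrected condition does the opposite.

```python
def lost_as_written(time_sent, pn, lost_send_time, lost_pn):
    # Condition as it appears in the diff under review.
    return time_sent > lost_send_time or pn > lost_pn


def lost_as_corrected(time_sent, pn, lost_send_time, lost_pn):
    # Condition with the comparisons turned around.
    return lost_send_time > time_sent or lost_pn > pn


lost_send_time, lost_pn = 110, 7
fresh = (190, 9)  # sent recently, high packet number
old = (50, 2)     # sent long ago, low packet number

print(lost_as_written(*fresh, lost_send_time, lost_pn))    # True
print(lost_as_written(*old, lost_send_time, lost_pn))      # False
print(lost_as_corrected(*fresh, lost_send_time, lost_pn))  # False
print(lost_as_corrected(*old, lost_send_time, lost_pn))    # True
```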


// Inform the congestion controller of lost packets and
// lets it decide whether to retransmit immediately.
if (!lost_packets.empty()):
  OnPacketsLost(lost_packets)
~~~

This algorithm results in loss_time being set to the earliest time that the
earliest packet was sent. As a result loss_time could be in the past. Timers

This algorithm results in loss_time being set based on the earliest packet that is still in flight. As a result ...

Contributor

@ianswett ianswett left a comment

Thanks for working on this. I think PR#1974 is close to landing, so I'd like to land that and then take another look at this.

martinthomson and others added 4 commits December 5, 2018 17:20
@mikkelfj
Contributor

mikkelfj commented Dec 6, 2018

How about gaps in the packet number (PN) sequence, as used for optimistic ACK defense?

Contributor

@ianswett ianswett left a comment

This looks good, but I think you should move where the timers-in-the-past comment lives.

if (!lost_packets.empty()):
  OnPacketsLost(lost_packets)
~~~

This algorithm results in loss_time being set based on the earliest packet that
is still in flight. Timers set based on a loss_time that has already passed

I think this comment about timers with timeouts that have already passed is more widely applicable, so maybe move it closer to SetLossDetectionTimer()?
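
One way a general timer-setting routine could treat a deadline that has already passed is to clamp the delay at zero so the timer fires at the next opportunity. The Python sketch below is an assumption for illustration, not text from the draft, and `timer_delay` is a hypothetical helper:

```python
def timer_delay(loss_time, now):
    # A deadline at or before `now` yields zero delay (fire as soon as possible);
    # otherwise wait for the remaining time.
    return max(0, loss_time - now)


print(timer_delay(220, 200))  # 20
print(timer_delay(180, 200))  # 0
```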

@martinthomson
Member Author

Thanks for cleaning this up, Jana. I wasn't sure if the timers comment was scoped specifically to this particular instance, so I didn't make it a more general statement. @ianswett, do you think that you could move that for me?

Move and update Martin's comments
@ianswett
Contributor

ianswett commented Dec 6, 2018

Thanks @martinthomson, I updated to what I believe is the right location. Feel free to update further.

@janaiyengar
Contributor

@mikkelfj: the loss recovery algorithm does not handle arbitrary gaps in sequence space. This can be handled with some complexity, which is probably worth writing a sentence or two about. Mind filing an issue?

@janaiyengar janaiyengar merged commit f29543c into master Dec 6, 2018
@MikeBishop MikeBishop deleted the refactor-loss_time branch February 6, 2019 00:11
4 participants