Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] Fixed getFlighSpan function and rexmit bug #1613

Merged

Conversation

maxsharabayko
Copy link
Collaborator

@maxsharabayko maxsharabayko commented Oct 19, 2020

Scope

In short, SRT sender may hang up under certain conditions if the receiver does not send loss reports periodically.
It means the receiver is either SRT version prior to v1.3.0 (didn’t have PeriodicNAK) or disables Periodic NAK explicitly (SRTO_NAKREPORTS option).

The issue found by @CommonsNat in #1000 was introduced in PR #1181 (Mar 17, 2020) with the new getFlightSpan() utility function. The function was later modified in #1190.

Only SRT v1.4.2 is affected by the bug which this PR fixes.

To Reproduce

See this and this comments.

TODO:

  • Resolve inline comments that are left in the code to help with the review.
  • Should there be any difference in LATE_REXMIT and FAST_REXMIT logic except for periodic NAK?

Fixing the getFlighSpan()

The lowest value returned by the getFlightSpan() was 1, while 0 is expected if there are no packets in flight.
The fixed version:

/// Returns the number of packets in flight (sent, but not yet acknowledged).
/// @returns The number of packets in flight belonging to the interval [0; ...)
int32_t getFlightSpan() const
{
    // m_iSndCurrSeqNo is the sequence number of the latest original packet sent.
    // m_iSndLastAck is the sequence number of the first unacknowledged packet.
    // When there are no packets in flight, m_iSndLastAck = incseq(m_iSndCurrSeqNo).

    // Packets sent:
    // | 1 | 2 | 3 | 4 | 5 |
    //   ^               ^
    //   |               |
    // m_iSndLastAck     |
    //               m_iSndCurrSeqNo
    //
    // In Flight: [m_iSndLastAck; m_iSndCurrSeqNo]
    //
    // Normally m_iSndLastAck should be PAST the m_iSndCurrSeqNo,
    // however in a case when the sending stopped and all packets were
    // ACKed, the m_iSndLastAck is one sequence ahead of m_iSndCurrSeqNo.
    // Therefore we increase m_iSndCurrSeqNo by 1 forward and then
    // get the distance towards the last ACK. This way this value may
    // be only positive as seqlen() includes endpoints.
    // Finally, we subtract 1 to exclude the increment added earlier.

    return CSeqNo::seqlen(m_iSndLastAck, CSeqNo::incseq(m_iSndCurrSeqNo)) - 1;
}

About the Breaking Change in PR 1181

The condition in CUDT::checkRexmitTimer(..):

if (is_fastrexmit && (CSeqNo::seqoff(m_iSndLastAck, CSeqNo::incseq(m_iSndCurrSeqNo)) > 0))

in #1181 was replaced by:

if (is_fastrexmit && getFlightSpan() != 0)

essentially identical to

CSeqNo::seqlen(m_iSndLastAck, CSeqNo::incseq(m_iSndCurrSeqNo));
CSeqNo::seqoff(A, A) = 0;
CSeqNo::seqlen(A, A) = 1;
CSeqNo::seqoff(A, A) != CSeqNo::seqlen(A, A);

@maxsharabayko maxsharabayko added Type: Bug Indicates an unexpected problem or unintended behavior [core] Area: Changes in SRT library core labels Oct 19, 2020
@maxsharabayko maxsharabayko added this to the v1.5.0 milestone Oct 19, 2020
@maxsharabayko maxsharabayko self-assigned this Oct 19, 2020
srtcore/core.cpp Outdated Show resolved Hide resolved
srtcore/core.cpp Outdated Show resolved Hide resolved
srtcore/core.h Outdated Show resolved Hide resolved
Copy link
Collaborator

@ethouris ethouris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix as for me should be ok as is - as far as I could understand it. You can consider some proposed improvements.

@maxsharabayko maxsharabayko marked this pull request as ready for review October 20, 2020 15:42
A notable help in investigating the issue:

Co-authored-by: CommonsNat <CommonsNat@hotmail.com>
@maxsharabayko maxsharabayko merged commit 1e6e5ac into Haivision:master Oct 21, 2020
@maxsharabayko maxsharabayko deleted the hotfix/fastrexmit-issue1000 branch October 21, 2020 13:25
@maxsharabayko maxsharabayko modified the milestones: v1.5.0, v1.4.3 Oct 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[core] Area: Changes in SRT library core Priority: Critical Type: Bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants