Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added mutex protection and null check for removeFromGroup #1617

Merged
merged 2 commits into from Oct 21, 2020

Conversation

ethouris
Copy link
Collaborator

Fixes #1608

The removeFromGroup could be called in the meantime or somewhere from inside connectIn call, possibly also in the receiver worker thread as a reaction on connection timeout. The call that has caused a crash was in the exception handler, which stated that the m_IncludedGroup field is still valid as set before. To prevent things like this from happening, this function now rewrites the m_IncludedGroup and reads it under protection. The following call to CUDTGroup::remove is also protected by a group mutex and can be called multiple times (the call on an already removed socket is ignored).

@ethouris ethouris added Priority: Critical Type: Bug Indicates an unexpected problem or unintended behavior [core] Area: Changes in SRT library core labels Oct 20, 2020
srtcore/api.cpp Outdated
Comment on lines 3034 to 3058
void CUDTSocket::removeFromGroup(bool broken)
{
m_IncludedGroup->remove(m_SocketID);
if (broken)
CUDTGroup* pg = 0;
{
// Activate the SRT_EPOLL_UPDATE event on the group
// if it was because of a socket that was earlier connected
// and became broken. This is not to be sent in case when
// it is a failure during connection, or the socket was
// explicitly removed from the group.
m_IncludedGroup->activateUpdateEvent();
ScopedLock grd (m_ControlLock);
pg = m_IncludedGroup;
m_IncludedIter = CUDTGroup::gli_NULL();
m_IncludedGroup = NULL;
}

m_IncludedIter = CUDTGroup::gli_NULL();
m_IncludedGroup = NULL;
// Another facility could have deleted it in the meantime.
if (pg)
{
pg->remove(m_SocketID);
if (broken)
{
// Activate the SRT_EPOLL_UPDATE event on the group
// if it was because of a socket that was earlier connected
// and became broken. This is not to be sent in case when
// it is a failure during connection, or the socket was
// explicitly removed from the group.
pg->activateUpdateEvent();
}
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reduced nesting:

void CUDTSocket::removeFromGroup(bool broken)
{
    // First lock access and clear pointers
    enterCS(m_ControlLock);
    CUDTGroup* pg = m_IncludedGroup;
    m_IncludedIter = CUDTGroup::gli_NULL();
    m_IncludedGroup = NULL;
    leaveCS(m_ControlLock);

    if (!pg)
        return;

    pg->remove(m_SocketID);
    if (broken)
    {
        // Activate the SRT_EPOLL_UPDATE event on the group
        // if it was because of a socket that was earlier connected
        // and became broken. This is not to be sent in case when
        // it is a failure during connection, or the socket was
        // explicitly removed from the group.
        pg->activateUpdateEvent();
    }
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reduces clarity as well. A critical section inside braces marks it clearer than the pair of enter/leave calls. The return on no pg looks good.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wherever there is a code block without ifs, whiles, etc., it means there should have been a function.

// A sample name, not pretending to be good
CUDTGroup* CUDTSocket::clearGroupMembership()
{
    ScopedLock grd (m_ControlLock);
    CUDTGroup* pg = m_IncludedGroup;
    m_IncludedIter = CUDTGroup::gli_NULL();
    m_IncludedGroup = NULL;
    return pg;
}

void CUDTSocket::removeFromGroup(bool broken)
{
    // First lock access and clear pointers
    CUDTGroup* pg = clearGroupMembership();
    if (!pg)
        return;

    pg->remove(m_SocketID);
    if (broken)
    {
        // Activate the SRT_EPOLL_UPDATE event on the group
        // if it was because of a socket that was earlier connected
        // and became broken. This is not to be sent in case when
        // it is a failure during connection, or the socket was
        // explicitly removed from the group.
        pg->activateUpdateEvent();
    }
}

@mbakholdina mbakholdina added this to the v1.5.0 milestone Oct 21, 2020
Copy link
Collaborator

@maxsharabayko maxsharabayko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No more comments except for the suggestion above.

@maxsharabayko maxsharabayko merged commit bda3383 into Haivision:master Oct 21, 2020
@maxsharabayko maxsharabayko modified the milestones: v1.5.0, v1.4.3 Oct 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[core] Area: Changes in SRT library core Priority: Critical Type: Bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

crash when reconnecting a group
3 participants