GCM Notification Memory Leak #90

Closed
Runescrye opened this Issue Jan 30, 2013 · 1 comment

Comments

Projects
None yet
3 participants
@Runescrye

Hello,

I'm maintaining a large scale notification push service (~5-20 million per day). Lately the server has been getting more and more bloated eating up all available memory and keeping many threads open (over 5GB of memory and having ~3000 threads open).

I've profiled the problem and it seems the crux of the problem is that GCM messages are kept alive and cannot be collected by the GC. According to a memory profiler I used, they are kept alive by the AsyncParam objects used in HttpRequests callbacks.

I've looked through the code and I believe I found the problem in the error handling of GCM notifications:

From row 221 at PushSharp.Android.TransportAsync:

void processResponseError(GcmAsyncParameters asyncParam)
{
    var result = new GcmMessageTransportResponse();
    result.ResponseCode = GcmMessageTransportResponseCode.Error;

    if (asyncParam.WebResponse.StatusCode == HttpStatusCode.Unauthorized)
    {
        //401 bad auth token
        result.ResponseCode = GcmMessageTransportResponseCode.InvalidAuthToken;
        throw new GcmAuthenticationErrorTransportException(result);
    }
    else if (asyncParam.WebResponse.StatusCode == HttpStatusCode.BadRequest)
    {
        result.ResponseCode = GcmMessageTransportResponseCode.BadRequest;
        throw new GcmBadRequestTransportException(result);
    }
    else if (asyncParam.WebResponse.StatusCode == HttpStatusCode.ServiceUnavailable)
    {
        //First try grabbing the retry-after header and parsing it.
        TimeSpan retryAfter = new TimeSpan(0, 0, 120);

        var wrRetryAfter = asyncParam.WebResponse.GetResponseHeader("Retry-After");

        if (!string.IsNullOrEmpty(wrRetryAfter))
        {
            DateTime wrRetryAfterDate = DateTime.UtcNow;

            if (DateTime.TryParse(wrRetryAfter, out wrRetryAfterDate))
                retryAfter = wrRetryAfterDate - DateTime.UtcNow;
            else
            {
                int wrRetryAfterSeconds = 120;
                if (int.TryParse(wrRetryAfter, out wrRetryAfterSeconds))
                    retryAfter = new TimeSpan(0, 0, wrRetryAfterSeconds);
            }
        }

        //503 exponential backoff, get retry-after header
        result.ResponseCode = GcmMessageTransportResponseCode.ServiceUnavailable;

        throw new GcmServiceUnavailableTransportException(retryAfter, result);
    }

    asyncParam.WebResponse.Close();

    throw new GcmMessageTransportException("Unknown Transport Error", result);
}

The problem here is that the functions throws out errors, thereby bypassing the asyncParam.WebResponse.Close() line and causing the WebResponses to not close, keeping the requests/threads open which are still referencing the GCM Messages and thus they cannot be collected, causing threads/memory to pile up.

I have altered the code to

void processResponseError(GcmAsyncParameters asyncParam)
{
    try
    {
        var result = new GcmMessageTransportResponse();
        result.ResponseCode = GcmMessageTransportResponseCode.Error;

        if (asyncParam == null || asyncParam.WebResponse == null)
            throw new GcmMessageTransportException("Unknown Transport Error", result);

        if (asyncParam.WebResponse.StatusCode == HttpStatusCode.Unauthorized)
        {
            //401 bad auth token
            result.ResponseCode = GcmMessageTransportResponseCode.InvalidAuthToken;
            throw new GcmAuthenticationErrorTransportException(result);
        }
        else if (asyncParam.WebResponse.StatusCode == HttpStatusCode.BadRequest)
        {
            result.ResponseCode = GcmMessageTransportResponseCode.BadRequest;
            throw new GcmBadRequestTransportException(result);
        }
        else if (asyncParam.WebResponse.StatusCode == HttpStatusCode.InternalServerError)
        {
            result.ResponseCode = GcmMessageTransportResponseCode.InternalServiceError;
            throw new GcmMessageTransportException("Internal Service Error", result);
        }
        else if (asyncParam.WebResponse.StatusCode == HttpStatusCode.ServiceUnavailable)
        {
            //First try grabbing the retry-after header and parsing it.
            TimeSpan retryAfter = new TimeSpan(0, 0, 120);

            var wrRetryAfter = asyncParam.WebResponse.GetResponseHeader("Retry-After");

            if (!string.IsNullOrEmpty(wrRetryAfter))
            {
                DateTime wrRetryAfterDate = DateTime.UtcNow;

                if (DateTime.TryParse(wrRetryAfter, out wrRetryAfterDate))
                    retryAfter = wrRetryAfterDate - DateTime.UtcNow;
                else
                {
                    int wrRetryAfterSeconds = 120;
                    if (int.TryParse(wrRetryAfter, out wrRetryAfterSeconds))
                        retryAfter = new TimeSpan(0, 0, wrRetryAfterSeconds);
                }
            }

            //503 exponential backoff, get retry-after header
            result.ResponseCode = GcmMessageTransportResponseCode.ServiceUnavailable;

            throw new GcmServiceUnavailableTransportException(retryAfter, result);
        }

        throw new GcmMessageTransportException("Unknown Transport Error", result);
    }
    catch (Exception ex)
    {
        if (asyncParam.WebResponse != null)
        {
            asyncParam.WebResponse.Close();
        }
        throw;
    }
}

The service has been behaving since.

@sdasun

This comment has been minimized.

Show comment Hide comment
@sdasun

sdasun Jan 30, 2013

It is better to call in finally block instead of the catch block.

    try
    {
        var result = new GcmMessageTransportResponse();
        ....
        throw new GcmServiceUnavailableTransportException(retryAfter, result);
    }
    finally
    {
        if (asyncParam.WebResponse != null)
        {
            asyncParam.WebResponse.Close();
        }
    }

sdasun commented Jan 30, 2013

It is better to call in finally block instead of the catch block.

    try
    {
        var result = new GcmMessageTransportResponse();
        ....
        throw new GcmServiceUnavailableTransportException(retryAfter, result);
    }
    finally
    {
        if (asyncParam.WebResponse != null)
        {
            asyncParam.WebResponse.Close();
        }
    }

@Redth Redth closed this in aff2a56 Jan 30, 2013

Redth added a commit that referenced this issue Feb 11, 2013

Fixes #99
Had a leak similar to issue #90 but in C2DM as it was only fixed for GCM
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment