Notifications not sent, event when NotificationSent delegate is called (requeue involved) #157

omardelrio · 2013-04-16T23:49:08Z

Setup:
Singleton of PushBroker configured through Autofac. All default settings.

From time to time, we get events for requeuing - and then notification sent. However, sometimes we also see channels being created and destroyed (but sometimes we never see that, only notifications being sent).

Is this expected behavior? We are not even stressing the broker and get what seems apparently low throughput.

For example, this one is a sequence for a single message (although the payload is built every time) sent to only three devices. The notification DID NOT get to the device (the one that was requeued).

15:46:29.8495 DEBUG api.Core.Dispatcher.iosDispatcher Sending notification... 
15:46:34.2485 DEBUG api.Core.Dispatcher.iosDispatcher Sending notification... 
15:46:34.3265 TRACE api.Core.Dispatcher.iosDispatcher Sending notification to <<devicetoken>>, ringin,<<otherdata>> 
15:46:34.3265 TRACE api.Core.Dispatcher.iosDispatcher Sending notification to <<anotherdevicetoken>>, ding, <<otherdata>>
15:46:34.3265 TRACE api.Core.Dispatcher.iosDispatcher Sending notification to <<anothertoken>>, EmergencyNotification, <<otherdata>
15:46:34.5605 ERROR api.Core.Dispatcher.PushService A notification has requested to requeue - {"aps":{"alert":"New Notification","badge":0,"sound":"ding.mp3"},"MessageId":536884044,"Type":"Incident","Subscriber":"SomeName","SubscriberId":"Someid"}  
15:46:36.8068 WARN api.Core.Dispatcher.PushService Notification channel has been created: PushSharp.Apple.ApplePushService 
15:46:38.3512 INFO api.Core.Dispatcher.PushService Notification sent q:0  d:4/16/2013 15:46:34 PM c:PushSharp.Apple.ApplePushService - {"aps":{"alert":"New Notification","badge":0,"sound":"ringin.mp3"},"MessageId":536884044,"Type":"Incident","Subscriber":"SomeName","SubscriberId":"SomeId"} 
15:46:39.3651 INFO api.Core.Dispatcher.PushService Notification sent q:0  d:4/16/2013 15:46:34 PM c:PushSharp.Apple.ApplePushService - {"aps":{"alert":"New Notification","badge":0,"sound":"EmergencyNotification.mp3"},"MessageId":536884044,"Type":"Incident","Subscriber":"SomeName","SubscriberId":"SomeId"} 
15:46:39.3651 INFO api.Core.Dispatcher.PushService Notification sent q:1  d:4/16/2013 15:46:34 PM c:PushSharp.Apple.ApplePushService - {"aps":{"alert":"New Notification","badge":0,"sound":"ding.mp3"},"MessageId":536884044,"Type":"Incident","Subscriber":"SomeName","SubscriberId":"SomeId"} 
15:46:51.8133 INFO PushSharp.Core.Log PushSharp.Apple.ApplePushService -> Destroying Channel

All this is called through async methods (a web api async controller). I am wondering if this the right usage pattern. I included the timestamps in the messages since I thought they might be useful.

Here's the code sending the notifications:

var payload = new AppleNotificationPayload
                              {
                                  Sound = string.IsNullOrWhiteSpace(deviceToken.Item2) ? "" : deviceToken.Item2 + ".mp3",
                                  Badge = 0,
                                  Alert = 
                                      new AppleNotificationAlert
                                          {
                                              Body = !string.IsNullOrWhiteSpace(message.Subject)
                                                        ? message.Subject.Substring(0, 
                                                            message.Subject.Length > 50 
                                                                ? 50 : message.Subject.Length) : "Subject Text"
                                          }
                              };

            payload.AddCustom("MessageId", new object[] {message.MessageId});
            payload.AddCustom("Type", new object[]{ message.Type.ToString()});
            payload.AddCustom("<<someproperty>>", new object[] { deviceToken.Item3});

            payload.AddCustom("<<anotherproperty>>", subs);

            if(_logger.IsTraceEnabled) _logger.Trace("Sending notification to {0}, {1}, {2}", deviceToken.Item1, deviceToken.Item2, deviceToken.Item3);


            _pushService.Service.QueueNotification
                (new AppleNotification().ForDeviceToken(deviceToken.Item1)                    
                    .WithPayload(payload));

And this is how it is being called:

 recipientDevices.AsParallel().ForAll(
                    deviceToken => SendPayloadToSubscribers(message, subs,deviceToken));

Is this the expected usage pattern?

Any help or pointers would be very appreciated. I am using the latest version of the code.

The text was updated successfully, but these errors were encountered:

omardelrio · 2013-04-17T01:03:58Z

After a bit more testing - I am experiencing exactly the same thing as #85.

After a while, no notifications are sent, and I also start getting MaxConnectionAttempts exceeded.

normanhh3 · 2013-04-17T12:14:38Z

You might want to keep a lid on the number of parallel processes being
spawned to see if too many threads are being run.

On Tue, Apr 16, 2013 at 9:04 PM, omardelrio notifications@github.comwrote:

After a bit more testing - I am experiencing exactly the same thing as #85 #85.

After a while, no notifications are sent, and I also start getting
MaxConnectionAttempts exceeded.

—
Reply to this email directly or view it on GitHubhttps://github.com//issues/157#issuecomment-16481277
.

omerkirk · 2013-05-09T17:15:13Z

I am having the same issues and using a single thread to queue the notifications. The exact same frequencies and settings don't cause any issues in the previous releases, if you keep having issues I suggest downgrading the PushSharp version.

Redth · 2013-05-09T17:18:46Z

@omerkirk please tell me which release you are on currently, and which 'previous' release you are referring to by version, and I can see what the changes made between them to see if I can find a regression.

omerkirk · 2013-05-09T17:34:37Z

@Redth The previous files we used are version 1.1.1.0, which seems to work great for us, but as soon as I upgrade the versions to 2.0.4.0 which is I think the final version on NuGet we start receiving a lot of MaxSendAttemptsReached or similar. I am not really sure about the name.

Another stat that might help you is in the older version our ratio of errors to successful notifications was 1 to 1000 however on the latest release we nearly get a 1 to 1 ratio.

We use a single thread to send our notifications so it can't be about parallel connections.

omardelrio · 2013-05-09T17:40:02Z

The way we resolved this was simply restarting the whole service after 15 minutes of inactivity. The trick is to keep a "last notification sent timestamp" and restart everything after some time passes; an additional optimization keeps the service from starting if there is no notification pending (some lazy initialization going on there).

I also removed additional parallelization to avoid possible synchronization issues until I have more time to create an appropriate test case that consistently reproduces the first issue.

I believe my "real" issue has to do with the channel being closed by either APNS servers or intermediary networking equipment; PushSharp, which uses a standard socket, does not find out about this until a few notifications have been considered as "sent", even when they never reached Apple. It appears that the sweet spot for our case was 20 minutes, which seems to be consistent with other reports on this kind of issue.

BTW - the TCP keep-alive option used in the connection does not help, since that only kicks in after 2 hours.

samritchie · 2013-05-16T07:31:37Z

We're having issues with this on 2.0.4 also - we were getting 80-90% of notifications from our AppHarbor worker fail with MaxSendAttemptsReached, although exactly the same code works fine locally. Regularly spinning up a new PushBroker seems to be a workaround.

Looking through the code, it looks like the connection is not properly re-established if it has problems (suspect we are repeatedly hitting ApplePushChannel.cs#136). We'll add some logging to the channel and try to narrow down further.

Redth · 2013-06-05T00:32:13Z

@samritchie did you ever find anything more interesting in your logging?

samritchie · 2013-06-05T00:54:44Z

Not much, because we're restarting the broker now we're not hitting the error. We did get a couple of logs out of it a few weeks ago - some IOExceptions: "Authentication failed because the remote party closed the transport stream"

Redth · 2013-06-05T00:55:25Z

@samritchie Any stack traces you could provide?

samritchie · 2013-06-05T03:58:33Z

No, the logs have been purged since then, sorry. We'll switch back to a singleton lifecycle and wait for the issue to recur if that would be helpful.

Redth · 2013-06-05T12:24:57Z

@samritchie if you can, that would be much appreciated, I'd love to get to the bottom of this!

samritchie · 2013-06-13T02:17:18Z

This is what we seem to be getting the most of at the moment:

System.IO.IOException: Unable to write data to the transport connection: An established connection was aborted by the software in your host machine. ---> System.Net.Sockets.SocketException: An established connection was aborted by the software in your host machine
at System.Net.Sockets.Socket.Send(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
--- End of inner exception stack trace ---
at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.Security._SslStream.StartWriting(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security._SslStream.ProcessWrite(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslStream.Write(Byte[] buffer, Int32 offset, Int32 count)
at PushSharp.Apple.ApplePushChannel.SendNotification(INotification notification, SendNotificationCallbackDelegate callback)

This seems to happen repeatedly and requeue the notifications until there's a MaxSendAttemptsReachedException. As before, we're seeing a handful of notifications going through, but most are failing. There doesn't appear to be a pattern.

alaa9jo · 2013-07-15T09:31:23Z

I can confirm this issue is happening even with the latest-beta build 2.1.1, in my error log I have this error repeated most of the time [The maximum number of Send attempts to send the notification was reached!] and by repeated I'm talking about thousands,most of the users aren't getting any pushes on ios and android

in my code I send in bulks for android [1K of tokens per push] and for ios I send one by one

aviromascanu · 2013-07-15T10:27:10Z

I can also confirm this issue has happened to us as well with the latest stable version (2.0.4)
Were using the library to send notifications to hundreds of applications daily, and were hitting those socket issues with APNS only, no notifications arrive what so ever for some of these applications whitest their certificate is valid and everything else is ok.
We have attempted to disable the auto-scaling and use a single channel, nothing improved though.

We pushed the latest beta to production and will see if anything resolves.

bmatson · 2013-07-17T09:26:30Z

I've seen similar issues, and had implemented a similar fix (i.e. tearing down and re-establishing the connection after a period of "no pushes") in my fork of the older version. I used 30 minutes in my implementation. I've since updated to the latest version of PushSharp in the hope that the keepalive/canwrite checking would fix the issue. I'd vote for adding a 20-30 minute idle check before sending a push, and just closing and re-opening the socket if it's been more than that time since the last push in the channel itself. It worked wonders in the old version...

Actually, after poking around a bit, it may be possible to change the 2 hour default on the keepalives using an IOControl call....might be simpler to just bump the keepalive time down to 20 minutes and call it a day.

I'm working through some other issues in APNS in my fork so I'd be willing to take a stab at rolling this logic back in as a part of that effort. I'm going to open a separate issue for my other fixes since they seem unrelated.

aviromascanu · 2013-07-17T10:07:40Z

I was thinking about implementing an external rate per second monitor and then indeed kill the connection and retry it, why would it make sense to be idled for a period like 20-30 minutes before pushes start going through?

Redth · 2013-07-17T11:19:29Z

@aviromascanu I like the idea of implementing some sort of mechanism to kill idle connections. Apple requests that we keep connections open to them as long as possible, but in reality, killing an idle connection after say 5 minutes or so is not going to get anyone banned from APNS (trust me, I've done MUCH worse to no ill effect).

I've created a new issue: #231 to track this. @bmatson if you want to add something to your fork I'd be more than happy to bring it in. Thanks guys.

…nutes to compensate for crappy NAT devices.

Added logging facade, additional log messages in APNS. Added debug level to allow for repetitive messages. Fixed re-entrancy issues on timers. Corrected potential infinite loop in Dispose(). Possible fix for PushSharp Redth#157, dropped keepalive time to 20 minutes to compensate for crappy NAT devices.

Redth · 2013-08-08T19:49:26Z

All, please try version 2.1.2-beta or newer. It is on NuGet as a Pre-release and it's also in master now on the repository. Feel free to re-open this if you continue to have issues!

samritchie · 2013-08-09T01:44:59Z

FWIW we've been running commit dc37756 in our hosting environment for 2 weeks with no problems - unsure if the underlying issue was fixed somewhere after 2.0.4, or we haven't had a recurrence of the network behaviour within AppHarbor.

I've upgraded to 2.1.2-beta, I'll report back if I see the it happen again.

Redth · 2013-08-09T01:54:01Z

@samritchie thanks for the feedback. I have a feeling 2.1.2 should work much better in general for APNS.. There's a number of improvements now, one of which is the ApplePushService will scale down the number of channels to 0 after being Idle for more than 5 minutes (this idle timeout is configurable in the ApplePushChannelSettings)... Lots of other improvements that should prevent notifications from being lost, etc.

bmatson pushed a commit to bmatson/PushSharp that referenced this issue Jul 17, 2013

Possible fix for PushSharp Redth#157, dropped keepalive time to 20 mi…

78c497f

…nutes to compensate for crappy NAT devices.

bmatson pushed a commit to bmatson/PushSharp that referenced this issue Jul 17, 2013

Possible fix for PushSharp Redth#157, dropped keepalive time to 20 mi…

4093b2d

…nutes to compensate for crappy NAT devices.

samritchie mentioned this issue Jul 26, 2013

Close TcpClient in ApplePushChannel prior to reconnecting #235

Closed

Redth closed this as completed Aug 8, 2013

shaungl mentioned this issue Oct 5, 2016

APNS Notifications are not receiving on device using PushSharp 4.0.10 #744

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Notifications not sent, event when NotificationSent delegate is called (requeue involved) #157

Notifications not sent, event when NotificationSent delegate is called (requeue involved) #157

omardelrio commented Apr 16, 2013

omardelrio commented Apr 17, 2013

normanhh3 commented Apr 17, 2013

omerkirk commented May 9, 2013

Redth commented May 9, 2013

omerkirk commented May 9, 2013

omardelrio commented May 9, 2013

samritchie commented May 16, 2013

Redth commented Jun 5, 2013

samritchie commented Jun 5, 2013

Redth commented Jun 5, 2013

samritchie commented Jun 5, 2013

Redth commented Jun 5, 2013

samritchie commented Jun 13, 2013

alaa9jo commented Jul 15, 2013

aviromascanu commented Jul 15, 2013

bmatson commented Jul 17, 2013

aviromascanu commented Jul 17, 2013

Redth commented Jul 17, 2013

Redth commented Aug 8, 2013

samritchie commented Aug 9, 2013

Redth commented Aug 9, 2013

Notifications not sent, event when NotificationSent delegate is called (requeue involved) #157

Notifications not sent, event when NotificationSent delegate is called (requeue involved) #157

Comments

omardelrio commented Apr 16, 2013

omardelrio commented Apr 17, 2013

normanhh3 commented Apr 17, 2013

omerkirk commented May 9, 2013

Redth commented May 9, 2013

omerkirk commented May 9, 2013

omardelrio commented May 9, 2013

samritchie commented May 16, 2013

Redth commented Jun 5, 2013

samritchie commented Jun 5, 2013

Redth commented Jun 5, 2013

samritchie commented Jun 5, 2013

Redth commented Jun 5, 2013

samritchie commented Jun 13, 2013

alaa9jo commented Jul 15, 2013

aviromascanu commented Jul 15, 2013

bmatson commented Jul 17, 2013

aviromascanu commented Jul 17, 2013

Redth commented Jul 17, 2013

Redth commented Aug 8, 2013

samritchie commented Aug 9, 2013

Redth commented Aug 9, 2013