Negative acknowledgments #17

martinthomson · 2015-04-07T18:33:44Z

We need to consider this.

brianraymor · 2015-04-07T22:44:08Z

The original proposal for context:

The push server MUST generate a 504 (Gateway Timeout) if the user agent fails to acknowledge the receipt of the push message or the push server fails to deliver the message prior to its expiration.

martinthomson · 2015-04-10T17:37:49Z

Action for Martin to choose the right status code to use for NACKs.

martinthomson · 2015-04-13T17:16:01Z

OK, here's the problem: there is no status code that works for this scenario.

We could try to be clever and attach different semantics to 404 and 410, or we could mint a new status code. As for whether this is part of -00, we need text soon if it is going to be in.

We could use 418 (I'm A Teapot), but IANA still don't have it registered.

brianraymor · 2015-04-13T17:21:33Z

I still believe that 504 is closest to what we need:

The 504 (Gateway Timeout) status code indicates that the server, while acting as a gateway or proxy, did not receive a timely response from an upstream server it needed to access in order to complete the request.

martinthomson · 2015-04-13T17:25:39Z

The server (the push service in this case) is not acting as a gateway or proxy. It is the authority for the resource, so using 504 would be nonsensical. A status code for "the contents of the resource expired" is more appropriate.

brianraymor · 2015-04-13T17:34:05Z

503 is also close. Pick one for 00 and discuss -or- mint a new one?

martinthomson · 2015-04-13T17:37:14Z

I think that we need a new one. Close doesn't work.

martinthomson · 2015-04-15T18:33:27Z

We just had a discussion about this issue. The concern raised was that there isn't a great deal of information that is provided to the application server with the NACK. This means that the application server is unable to distinguish between all the error cases: push service failure, subscription removal, TTL expiry, error in the application at the user agent, intentional NACK by the application at the user agent.

The two obvious options for an application server in response to a NACK are to resend or to escalate. Resending has the problem that you could end up in a tight loop. Escalating might work, but escalation options are quickly exhausted. For instance, you can send a different message that has stronger semantics, like sending a reset state message if the state update message fails.
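To make the resend-versus-escalate trade-off concrete, here is a minimal sketch of how an application server might bound its reaction to repeated NACKs. Everything in it is an assumption for illustration: the `NackPolicy` name, the cap of three resends, and the exponential backoff are not part of any proposal in this thread.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class NackPolicy:
    max_resends: int = 3        # hypothetical cap, avoids a tight resend loop
    base_delay_s: float = 1.0   # hypothetical initial backoff in seconds

    def next_action(self, attempts: int) -> Tuple[str, float]:
        """Decide what to do after the `attempts`-th NACK for one message."""
        if attempts <= self.max_resends:
            # Exponential backoff (1s, 2s, 4s, ...) spaces the resends out.
            return ("resend", self.base_delay_s * 2 ** (attempts - 1))
        if attempts == self.max_resends + 1:
            # A single escalation step, e.g. a reset-state message
            # in place of the failed state-update message.
            return ("escalate", 0.0)
        # Escalation options are exhausted.
        return ("give_up", 0.0)
```

With the defaults, the first three NACKs trigger delayed resends, the fourth triggers the one escalation, and anything after that is dropped, which shows how quickly the options run out.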

Other than that, the primary advantage provided to the application server is the running of the TTL timer. I know that Elio thought that it was pretty important to run timers-as-a-service, but I still think that this is better left to the application server. See https://en.wikipedia.org/wiki/End-to-end_principle :

Put in economics terms, the marginal cost of additional reliability in the network exceeds the marginal cost of obtaining the same additional reliability by measures in the end hosts. The economically efficient level of reliability improvement inside the network depends on the specific circumstances; however, it is certainly nowhere near zero:[Ref 2] Clearly, some effort at the lower levels to improve network reliability can have a significant effect on application performance. (p. 281)

brianraymor · 2015-04-15T21:31:21Z

This means that the application server is unable to distinguish between all the error cases: push service failure, subscription removal, TTL expiry, error in the application at the user agent, intentional NACK by the application at the user agent.

This suggests that the push message needs to include data with the status to distinguish between the error cases.

martinthomson · 2015-04-15T22:33:44Z

It was not my intent to suggest that. You will note that at least one of those failures results in no feedback at all.

The intent was to try to lay out why building something like this isn't simple. And why the end-to-end solution is perhaps superior in all respects - other than having the push service run a timer that the application server could run.

Here's another:
My push service manages message TTL in the most efficient way possible. It stores messages against a subscription and delivers them when a user agent requests them. It only expires messages in one of two ways: If it is delivering messages to the user agent, it filters out all the messages that have expired; otherwise, it performs a regular, continuous sweep of all stored messages in the system. This cleanup sweep can take as much as a day to run depending on load and other conditions. The consequence is that a NACK delivery might be sent almost a day late in the worst case.
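A rough sketch of the storage model described above may help show where the delay comes from. The class and method names (`MessageStore`, `deliver`, `sweep`) are hypothetical, and time is passed in explicitly so the behavior is easy to follow; a real push service would look quite different.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoredMessage:
    body: bytes
    expires_at: float  # absolute expiry time in seconds


@dataclass
class MessageStore:
    by_subscription: Dict[str, List[StoredMessage]] = field(default_factory=dict)

    def store(self, sub: str, body: bytes, ttl_s: float, now: float) -> None:
        self.by_subscription.setdefault(sub, []).append(
            StoredMessage(body, now + ttl_s)
        )

    def deliver(self, sub: str, now: float) -> List[bytes]:
        """Path 1: on delivery, filter out messages that have expired."""
        pending = self.by_subscription.pop(sub, [])
        return [m.body for m in pending if m.expires_at > now]

    def sweep(self, now: float) -> int:
        """Path 2: periodic cleanup; returns how many expired messages it found."""
        expired = 0
        for sub, msgs in list(self.by_subscription.items()):
            live = [m for m in msgs if m.expires_at > now]
            expired += len(msgs) - len(live)
            if live:
                self.by_subscription[sub] = live
            else:
                del self.by_subscription[sub]
        return expired
```

In this model the service only learns that a message expired when `deliver` or `sweep` touches it, so any NACK generated from `sweep` lags the real expiry by up to one sweep interval.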

Requiring NACKs in this way makes this sort of push service architecture more expensive to run. And the only benefit it delivers is that application servers can avoid running timers. Well, that is, application servers that care about reliability, but they can't care too much about it, or they are back to running timers again because only they can account for push service failures.
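For comparison, the end-to-end alternative is small on the application server side. The sketch below is only an illustration under assumptions: `OutstandingMessages` and its methods are made-up names, and it assumes the application server learns of successful delivery through some acknowledgment it can match to a message identifier.

```python
from typing import Dict, List


class OutstandingMessages:
    """Tracks sent messages and reports those not acknowledged by their deadline."""

    def __init__(self) -> None:
        self._deadlines: Dict[str, float] = {}

    def sent(self, message_id: str, ttl_s: float, now: float) -> None:
        self._deadlines[message_id] = now + ttl_s

    def acknowledged(self, message_id: str) -> None:
        # Unknown or already-expired identifiers are ignored.
        self._deadlines.pop(message_id, None)

    def timed_out(self, now: float) -> List[str]:
        """Return and forget every message whose deadline has passed."""
        late = sorted(m for m, d in self._deadlines.items() if d <= now)
        for m in late:
            del self._deadlines[m]
        return late
```

A timer like this fires whatever the cause of the silence, including a push service failure that would never produce a NACK.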

martinthomson · 2015-07-01T21:31:47Z

We have the following error scenarios to consider:

a message times out (?)
the push service gives up on delivery (either because it was not acknowledged in time, or otherwise) (504?)
the subscription is removed or deleted (410?)

MS have another state, where the app can send a negative signal indicating "yes, I received your message, but it caused an error". That might be a positive acknowledgment, but some applications want to have additional information carried back.

martinthomson · 2015-07-01T21:34:42Z

Elio suggested that we might want to leave some latitude for the push service to generate a range of status codes. Some suggestions might be sensible though.
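If the push service is given latitude over status codes, an application server would need to tolerate codes it does not recognize. The sketch below shows one way that could look. The 410 and 504 entries are only the tentative, question-marked suggestions from the list above, not settled choices, and `NackCause` and `classify` are hypothetical names.

```python
from enum import Enum


class NackCause(Enum):
    SUBSCRIPTION_GONE = "subscription removed or deleted"
    DELIVERY_ABANDONED = "push service gave up on delivery"
    UNKNOWN = "unclassified"


# Tentative mapping taken from the suggestions in this thread (assumption).
TENTATIVE_CODES = {
    410: NackCause.SUBSCRIPTION_GONE,
    504: NackCause.DELIVERY_ABANDONED,
}


def classify(status: int) -> NackCause:
    """Map a NACK status code to a cause, falling back to UNKNOWN."""
    return TENTATIVE_CODES.get(status, NackCause.UNKNOWN)
```

The fallback matters more than the table: any code outside the suggested set is treated as an unclassified failure instead of an error in the client.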

brianraymor · 2015-07-29T16:42:07Z

Proposal discussed at IETF 93:

Prior to TTL, if the push service gives up on a message (or an acknowledgment) - signal the error to the application server

Even (or especially) with full reliability, there is no point in signaling the expiration of the TTL - the application server might be offline
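Read literally, the proposal reduces to a single check on the push service side. This is a sketch of that reading only; the function name and the use of absolute timestamps are assumptions.

```python
def should_signal_error(gave_up_at: float, expires_at: float) -> bool:
    """True only when the push service abandons a message before its TTL runs out."""
    return gave_up_at < expires_at
```

Giving up at or after the expiry time produces no signal, which matches the point that TTL expiration itself is not reported.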

brianraymor mentioned this issue Apr 9, 2015

Acknowledgment reliability #18

Closed

brianraymor mentioned this issue Apr 13, 2015

Acknowledging TTL=0 messages #24

Closed

brianraymor self-assigned this Apr 15, 2015

brianraymor mentioned this issue Apr 15, 2015

added NACK to section 6.2 #26

Merged

martinthomson unassigned brianraymor Jul 13, 2015

brianraymor added design acknowledgements labels Jul 20, 2015

brianraymor mentioned this issue Aug 4, 2015

Returning data with Acknowledgements #42

Closed

brianraymor self-assigned this Aug 4, 2015

brianraymor added the editor-ready label Aug 4, 2015

This was referenced Oct 13, 2015

editorial changes for negative acknowledgements #48

Merged

Status Code for Negative Acknowledgements #49

Closed

brianraymor closed this as completed in #48 Oct 14, 2015

brianraymor mentioned this issue Jun 8, 2016

Different status codes for negative Push Message Receipts #110

Closed
