
send() can hang due to http2 client.request() apparently having no timeout #24

Closed
TysonAndre opened this issue Sep 30, 2020 · 10 comments

@TysonAndre

TysonAndre commented Sep 30, 2020

It seems like many of the connection timeout and socket timeout customizations were removed, and if there are networking issues, I'm concerned that calls to send() could hang forever (until the server runs out of memory?).

  • Memory usage rose, probably because this was an express app waiting for send() to finish.
  • I was seeing similar issues in Node 12, but I think the spikes in send() request duration did not last arbitrarily long. There was a refactoring of the http2 server to change the timeout from 120 seconds to no timeout, but I can't see why the client would be affected.

https://nodejs.org/api/http2.html#http2_http2_connect_authority_options_listener mentions that options are passed to net.connect.
https://nodejs.org/api/net.html#net_socket_settimeout_timeout_callback (That doesn't look like the right mechanism - I probably want the timeout to apply to requests rather than to socket inactivity.)
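For the connection itself, a session-level inactivity timeout might be enough. This is a minimal sketch (not node-apn's actual code) assuming the session is created with http2.connect(); the 60-second value and the host are just placeholders:

const http2 = require('http2');

// Connect to the push endpoint (host is illustrative).
const client = http2.connect('https://api.push.apple.com');

// If there is no activity on the whole session for 60 seconds, destroy it so
// pending requests fail fast instead of hanging, and reconnect lazily later.
client.setTimeout(60000, () => {
  client.destroy(new Error('http2 session inactivity timeout'));
});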

Similar to node-apn#665
Mentioned in #23


https://nodejs.org/api/http2.html#http2_http2stream_settimeout_msecs_callback may be what I want added to ensure requests eventually time out - there are examples in the documentation. Instead of NGHTTP2_CANCEL, I may want to close the client and destroy it asynchronously if requests hang for too long.

I'm assuming that each request calling req.setTimeout gets its own timeout - otherwise it could get extended indefinitely.

const http2 = require('http2');
const client = http2.connect('http://example.org:8000');
const { NGHTTP2_CANCEL } = http2.constants;
const req = client.request({ ':path': '/' });

// Cancel the stream if there's no activity after 5 seconds
// Instead of NGHTTP2_CANCEL, I want to close the client and destroy it asynchronously if requests hang for too long.
// Not sure what new errors that would cause this to need to handle.
// This should also use the logger to indicate that node is cancelling the stream because it took too long.
// 5 seconds may be too low for a default for a wide variety of use cases - consider using a higher timeout and making this configurable
req.setTimeout(5000, () => req.close(NGHTTP2_CANCEL));
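
A rough sketch of that alternative (untested against real APNs; the timeout value and the logger call are placeholders): cancel the stream on timeout, then tear the whole client down asynchronously so the next send() gets a fresh connection.

// On timeout, cancel the stream, log it, and destroy the session on the next
// tick so this request's 'close'/'error' handlers run first; other requests
// pending on the same session will then error out rather than hang.
req.setTimeout(5000, () => {
  logger('http2 stream timed out; cancelling and destroying the session');
  req.close(NGHTTP2_CANCEL);
  setImmediate(() => client.destroy(new Error('apn request timed out')));
});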
@petitchevalroux

I can confirm our worker is hanging, doing nothing.

The whole "write" part is a bit messy; maybe look at https://github.com/AndrewBarba/apns2/blob/master/lib/http2-client.js as a replacement for client.js. To me it looks cleaner ;)

(Although session.setTimeout is still missing)

@TysonAndre
Author

TysonAndre commented Oct 1, 2020

@petitchevalroux Refer to https://github.com/AndrewBarba/apns2/issues/33 - there are quite a few unaddressed review comments, and that file does call setTimeout but seems to introduce different potential failure modes (e.g. it can establish multiple connections to APNs concurrently in connect() but only use one).
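
For reference, a sketch of one way to avoid that concurrent-connect failure mode - cache the in-flight connect promise so parallel callers share a single session. This is illustrative only, not the apns2 or node-apn implementation:

const http2 = require('http2');

class SingleSessionClient {
  constructor(authority) {
    this._authority = authority;
    this._session = null;     // established session, if any
    this._connecting = null;  // in-flight connect promise, if any
  }

  connect() {
    if (this._session && !this._session.destroyed) {
      return Promise.resolve(this._session);
    }
    if (this._connecting) {
      // A connect is already in progress; reuse it instead of opening a
      // second connection to APNs.
      return this._connecting;
    }
    this._connecting = new Promise((resolve, reject) => {
      const session = http2.connect(this._authority);
      session.once('connect', () => resolve(session));
      session.once('error', reject);
    }).then(
      (session) => { this._session = session; this._connecting = null; return session; },
      (err) => { this._connecting = null; throw err; },
    );
    return this._connecting;
  }
}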

You may be interested in #27 - a few unit tests of failure modes were added, but I haven't tested this against the real APNs servers for a prolonged period of time yet.

@petitchevalroux

No, I haven't seen this problem, but it's a great find; @trevh3's comments sound interesting.
I dove into http2 for the first time yesterday, so I don't understand everything yet. We will check #27 and potentially test its behavior next week.

At the moment we are not sure that the timeout is the real problem, but from my first read of client.js and this issue, it's my first guess. We enabled debug mode in production to see if we can get more information on the "worker hanging" effect.

Thanks for your feedback ;)

@trx222

trx222 commented Oct 5, 2020

I also do not understand why the problem is back now - and we hit it really often. We have actually implemented our own watchdog... it waits 10 seconds, then creates a new APN Provider and continues. But we are sending batches to 100 receivers and I do not know which of them have received the notification and which have not.
We will now try the new version, but I think the timeout alone will not solve the problem. We are already looking into other ways to send - with curl or whatever.
The version worked for months - but now it happens on nearly every message.
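
A sketch of bounding the wait without recreating the Provider each time, assuming node-apn's provider.send(notification, tokens) promise API; the 10-second value mirrors the watchdog described above. When send() does resolve, the { sent, failed } lists show which receivers got the notification:

// Race send() against a watchdog timer so a hung request rejects after `ms`
// instead of blocking forever. This only bounds the hang, it does not fix it.
function sendWithWatchdog(provider, notification, tokens, ms = 10000) {
  let timer;
  const watchdog = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('send() watchdog timeout')), ms);
  });
  return Promise.race([provider.send(notification, tokens), watchdog])
    .finally(() => clearTimeout(timer));
}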

btw, I posted in the Parse forum ( https://community.parseplatform.org/t/node-apn-problem-send-hangs/871/5 ), but it's cool that issues are now enabled here.

@TysonAndre
Author

We have actually implemented our own watchdog... it waits 10 seconds, then creates a new APN Provider and continues. But we are sending batches to 100 receivers and I do not know which of them have received the notification and which have not.

@trx222 frequently opening and closing connections to APNs may make the problem worse - APNs may treat that as a malfunctioning client or a denial-of-service attempt and block connections from your IP address. (I forget where I read that.)
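
For what it's worth, a minimal sketch of keeping one long-lived node-apn Provider and reusing it for every batch (key path, IDs and option values are illustrative; this does not by itself fix the hang):

const apn = require('apn');

// Create the provider once at startup and reuse its APNs connection for every
// batch, per the APNs guidance about keeping connections open.
const provider = new apn.Provider({
  token: { key: 'AuthKey.p8', keyId: 'ABC123DEFG', teamId: 'TEAM123456' },
  production: true,
});

async function pushBatch(notification, tokens) {
  const { sent, failed } = await provider.send(notification, tokens);
  return { sent, failed };
}

// Only shut down when the process is done sending for good.
// provider.shutdown();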

@trevh3

trevh3 commented Oct 6, 2020

You saw it in Apple's APNs documentation:

Keep your connections with APNs open across multiple notifications; do not repeatedly open and close connections. APNs treats rapid connection and disconnection as a denial-of-service attack. You should leave a connection open unless you know it will be idle for an extended period of time—for example, if you only send notifications to your users once a day, it is acceptable practice to use a new connection each day.

@trx222

trx222 commented Oct 6, 2020

Yes, sure - I know that part, and we only do a limited number of retries (3 to 5 times) and then we give up... But what else should I do - I cannot completely stop sending in that case. We have 30 parallel processes for sending anyway to speed things up - otherwise it takes too long to transmit everything.

But the problem also happens if we send to just 300 receivers in one process... it makes no difference.

@davimacedo
Member

Guys, could you try with this branch? #29

It seems to be working better in my tests, but it would be good to have more feedback before merging.

@trx222

trx222 commented Oct 6, 2020

I will setup a test tomorrow.

@rileyweber

I think this can be closed now with the release of 4.0.0?
