Retry calling the API when a retry could succeed #12

ebdrup · 2016-03-02T14:18:06Z

We're seeing a few internal_server_error errors in production.

Could you make figo retry (maybe 3 times) on retriable errors? That means any 5XX status code or any of these errors:

const RETRIABLE_ERRORS = [
    'ECONNRESET',
    'ENOTFOUND',
    'ESOCKETTIMEDOUT',
    'ETIMEDOUT',
    'ECONNREFUSED',
    'EHOSTUNREACH',
    'EPIPE',
    'EAI_AGAIN'
];
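
To make the request concrete, here is a minimal sketch of the retry behaviour described above. It is not figo's implementation: the helper names `isRetriable` and `withRetries`, the `err.code` / `err.statusCode` error shape, the default of 3 attempts in total, and the linearly growing delay are all assumptions chosen for illustration.

```javascript
// Error codes copied from the list above.
const RETRIABLE_ERRORS = [
    'ECONNRESET',
    'ENOTFOUND',
    'ESOCKETTIMEDOUT',
    'ETIMEDOUT',
    'ECONNREFUSED',
    'EHOSTUNREACH',
    'EPIPE',
    'EAI_AGAIN'
];

// Hypothetical helper: true for a listed network error code or any 5XX status.
// Assumes failures carry `code` (socket-level) or `statusCode` (HTTP-level).
function isRetriable(err) {
    if (!err) return false;
    if (RETRIABLE_ERRORS.includes(err.code)) return true;
    return Number.isInteger(err.statusCode) &&
        err.statusCode >= 500 && err.statusCode <= 599;
}

// Hypothetical wrapper: calls `fn` until it succeeds, fails with a
// non-retriable error, or has been attempted `maxAttempts` times.
async function withRetries(fn, maxAttempts = 3, delayMs = 100) {
    for (let attempt = 1; ; attempt++) {
        try {
            return await fn();
        } catch (err) {
            if (attempt >= maxAttempts || !isRetriable(err)) throw err;
            // Wait a little longer before each further attempt.
            await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
        }
    }
}
```

Note that blindly repeating a request is only safe when the request is idempotent; retrying something like a POST could apply the same side effect twice.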

mfilenko · 2016-03-03T09:22:32Z

Hey @ebdrup,

Could you please provide more information on how you ran into those errors so we can investigate?

Thanks!

ebdrup · 2016-03-03T19:03:16Z

We see this kind of network error periodically on (probably) all web requests when the volume is high.

If it were a genuine 500 returned by figo, you would see it in your own logs, which you hopefully monitor. I don't think it's a 500; I think it's a network failure.

That's why we built the module request-retry-stream, which we use almost everywhere we call web services. The retries made all these errors go away. And that's nice, since we are trying to implement zero tolerance for failing web requests.

Unfortunately we can't use request-retry-stream on your web requests as they are embedded in your own raw implementation inside your module.

Allan Ebdrup, CTO @ Debitoor

ebdrup · 2016-03-08T21:15:33Z

@mfilenko Any news on this issue? Today we got some 502s from you. These would probably also succeed if retried. The response we got from you:

<html> <head><title>502 Bad Gateway</title></head> <body bgcolor="white"> <center><h1>502 Bad Gateway</h1></center> <hr><center>nginx</center> </body> </html>

mfilenko · 2016-03-09T10:45:19Z

Hey @ebdrup,

Would it be possible to use your request-retry-stream library for that, or are methods other than GET still truly experimental ;-)?

mfilenko · 2016-03-15T12:51:15Z

Hey @ebdrup,

Thank you again for pointing out this issue.

We are working hard to provide our partners and users with the best experience (you can check our Pingdom uptime history). We also have zero tolerance for failing requests.

We will do our best to eliminate the root cause of this issue at the infrastructure level, so there will be no need for a workaround such as automatically retrying failed requests in the SDK.

ebdrup · 2016-03-15T13:15:31Z

When you are providing an SDK that makes network connections to your API, a situation where retries are never needed simply does not exist. You HAVE to build in retries to make things work reliably. There are no two ways about it, you simply have to. It is NOT a workaround. It's the only real option for distributed computing.

Give this wikipedia article a good read:
https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing

Please reconsider. We need retries in your SDK, as any seasoned developer with experience with distributed computing will be able to tell you.

ebdrup · 2016-03-15T13:23:07Z

@mfilenko Sorry, I didn't see your question about request-retry-stream. No, those are not experimental. We are actually using them in production. I'll update the readme. :-)

ebdrup · 2016-03-15T13:42:51Z

@mfilenko I updated the readme, including information about the errors returned. We often use err.statusCode for error handling. As I mentioned, we are using it in production on Debitoor. It's handling hundreds of thousands of requests every day. Since we added it, all our randomly failing network requests have disappeared.

We would have added it to our requests to figo. But that was impossible since the requests are embedded in your SDK.

We are making quite a lot of HTTP requests with it, because our application is built from a lot of microservices.
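
As an illustration of the kind of err.statusCode-based error handling mentioned above, here is a minimal sketch. The error shape (a numeric `statusCode` for HTTP failures, a string `code` for socket-level failures) and the function name `classifyError` are assumptions made for this example, not the documented request-retry-stream API.

```javascript
// Hypothetical helper: sorts a failure into a coarse category so the
// caller can decide how to report it once retries are exhausted.
function classifyError(err) {
    if (err && typeof err.statusCode === 'number') {
        if (err.statusCode >= 500) return 'server';   // e.g. 500, 502
        if (err.statusCode >= 400) return 'client';   // e.g. 401, 404
    }
    if (err && typeof err.code === 'string') return 'network'; // e.g. ECONNRESET
    return 'unknown';
}
```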

mfilenko · 2016-03-15T14:07:29Z

@ebdrup, great, thanks! We will include this in the next release.

JeremyCraigMartinez · 2016-07-29T13:47:17Z

Issue resolved with PR #25

mfilenko assigned mfilenko and cokeeffe and unassigned mfilenko Mar 15, 2016

mfilenko added the enhancement label Jul 1, 2016

mfilenko assigned JeremyCraigMartinez Jul 22, 2016

JeremyCraigMartinez mentioned this issue Jul 29, 2016

Feature/sdk 15 #25

Merged

JeremyCraigMartinez closed this as completed Jul 29, 2016
