New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add retry logic to requests #33
Conversation
Nice work! I have a couple of questions:
|
Also:
|
|
@grassator @shuhei No problem, I'll move it into the same function while I am adding the transient error configuration |
@shuhei @grassator I have now moved the retry logic into the same function and have defaulted the config to 0 retries, so that it behaves exactly as it did before. For the transient error configuration, I decided on a function that can be passed in by the consumer to allow them to customise the configuration any way they want. I thought that would be easier than expecting lists of status codes or error strings. Perhaps this should just be a noop by default instead of what I have there? Does anyone have suggestions on how to approach the logging of the retries? Should we allow the user to pass in their own logger which will get called or some kind of aggregator? Perhaps we should pass the retry count for a request in the response/error object the same way as the timings? |
test/client.test.js
Outdated
@@ -1,5 +1,7 @@ | |||
'use strict'; | |||
|
|||
/* eslint-disable */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to disable only a certain check and not whole eslint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woops my bad, missed this!
README.md
Outdated
Occasionaly target server can have high latency for a short period of time, or in the case of a stack of servers, one server can be having issues | ||
and retrying the request will allow perron to attempt to access one of the other servers that currently aren't facing issues. | ||
|
||
By default `perron` has retry logic implemented, but configured to perform 0 retries. Internally `perron` [node-retry](https://github.com/tim-kos/node-retry) to handle the retry logic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably missing "uses" after "Internally perron
"
README.md
Outdated
const catWatch = new ServiceClient({ | ||
hostname: 'catwatch.opensource.zalan.do', | ||
// These are the default settings | ||
retryOptions: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that README should provide useful example options, since that is what users will copy-paste by default. If I read node-retry documentation formula right, current config will result in a not linear distribution between 200 and 300ms with more than half delays being 300ms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any recommendations as to what you would prefer here? Currently this example is the same as what the default retryOptions are configured in perron (but with retries set to 1), so I should also update there if you believe that the current config is not appropriate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just change maxTimeout
to 400 probably
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok no worries, I'll update :)
lib/client.js
Outdated
randomize: true, | ||
// this is the default function to check transient errors | ||
// users should define their own check in order to handle errors as they require | ||
transientErrorCheck: (err) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the default function shouldn't be just () => true
?
README.md
Outdated
minTimeout: 200, | ||
maxTimeout: 300, | ||
randomize: true, | ||
transientErrorCheck: (err) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need an arrow function here:
transientErrorCheck(err) {
lib/client.js
Outdated
@@ -248,13 +273,27 @@ class ServiceClient { | |||
|
|||
return new Promise((resolve, reject) => this.breaker.run( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like my comment got lost somehow, but here I'm a little worried that you will call failure
callback multiple times inside a single breaker.run
call, which doesn't seem logically right and also might cause issues. Can we change that so each retry is also a breaker.run
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll look into restructuring it this afternoon.
Any thoughts on how we can provide information on the number of retries performed to the user of perron? I was thinking of possibly adding a value to the response/error object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be OK, but I'm not sure so far what is the best way to provide information about each stage of the request, would be open to any ideas, as now error / info reporting is not really great in perron at the moment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing with @shuhei I will add a callback function like onRetry
that the consumer of perron can define in order to track the retry action how they like
lib/client.js
Outdated
// it takes the error as an argument | ||
transientErrorCheck(err) { | ||
// by default don't retry if the circuit breaker is open | ||
if (err.type === ServiceClient.CIRCUIT_OPEN) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this would be a good idea to have by default, but again if you just want it to return true by default I can remove this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about renaming this as shouldRetry
? Open circuit breaker is also literally transient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In relation to this naming, it would conflict with here: https://github.com/zalando-incubator/perron/pull/33/files#diff-50cfa59973c04321b5da0c6da0fdf4feR314
What would your naming preference be for the already defined shouldRetry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing shouldRetry
is just an internal variable, which matters less than public API. Now I think that the existing one is a result of retry
module's decision, which may have already triggered a retry. So how about retrying
, retryingNext
or willRetry
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, will update
lib/client.js
Outdated
* minTimeout?: number, | ||
* maxTimeout?: number, | ||
* randomize?: boolean | ||
* }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have transientErrorCheck
and onRetry
here.
lib/client.js
Outdated
}, this.options.retryOptions); | ||
|
||
if (this.options.retryOptions.minTimeout > this.options.retryOptions.maxTimeout) { | ||
throw new Error('The `retryInterval` must be equal to or greater than the `minTimeout`'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maxTimeout
intead of retryInterval
?
lib/client.js
Outdated
reject(error); | ||
return; | ||
} | ||
onRetry(currentAttempt, error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it be currentAttempt + 1
? The retry will be the next attempt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first time onRetry is called, currentAttempt is 1 which I think is correct
lib/client.js
Outdated
// it receives the current retry attempt and error as args | ||
// eslint-disable-next-line | ||
onRetry(currentAttempt, err) { | ||
return currentAttempt; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you return this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is just a default placeholder, do you think an empty function would be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if we return something here, people will think that they should also return something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
README.md
Outdated
attempt the retries or not depending on the type of error. If the function returns true and the number of retries hasn't been exceeded, the request can be retried. | ||
|
||
There is also an onRetry function which can be defined by the user of `perron`. This function is called every time a retry request will be triggered. | ||
It is provided the currentAttempt, as well as the error that is causing the retry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to refer to what currentAttempt
means. Is it 0 or 1 for the first retry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first time onRetry is triggered it is 1, I'll update the readme with more info.
test/client.test.js
Outdated
it('should perform the desired number of retries based on the configuration', (done) => { | ||
let numberOfRetries = 0; | ||
clientOptions.retryOptions = { | ||
retries: 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd put 2
or more just to make sure that it retries multiple times.
assert.equal(err.type, ServiceClient.CIRCUIT_OPEN); | ||
done(); | ||
}); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding a test for the transient error function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done :)
lib/client.js
Outdated
}) | ||
.catch(error => { | ||
failure(); reject(error); | ||
failure(); | ||
const shouldRetry = operation.retry(error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh, by the way, this is already triggering retry! This should come after the transient error check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe something like this?
const canRetry = currentAttempt > retries;
if (!shouldRetry(error) || !canRetry) {
reject(error);
return;
}
if(operation.retry(error)) {
onRetry(currentAttempt, error);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or this probably makes more sense:
if (!shouldRetry(error)) {
reject(error);
return;
}
if(!operation.retry(error)) {
reject(error);
return;
}
onRetry(currentAttempt, error);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the second looks good to me.
README.md
Outdated
minTimeout: 200, | ||
maxTimeout: 400, | ||
randomize: true, | ||
shouldRetry(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about it, just the error might not be enough. You might need info from original request as well to make a judgement (e.g. only retry GET requests)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I'll add the original request info as well
@grassator @shuhei Ready for the next review :) |
👍 |
1 similar comment
👍 |
@Joneser Thank you for the contribution! I released 0.4.0-beta1 to NPM |
Thanks a million @grassator and @shuhei for the reviews! |
This is a first PR to get feedback on my approach
Fixes #18