Snapshot restore operations throttle more than specified #13828
@mikemccand can you have a look at this PR?
LGTM, this is a very bad bug! That small |
Are we directly calling |
I checked the other places, and they are all properly guarded. Maybe RateLimiter.pause could have an assertion?
Well, it's called (correctly) by |
bae0790 to c95b2e2
I'm very much in favour of backporting this to 1.7.x, as I saw a 10x slowdown on a production cluster. I added your suggestions. Not closing the RateLimitingInputStream is OK, as it just delegates the close to PartSliceStream (which is now auto-closed). Simply banning RateLimiter.pause does not work, as it is used in other places in ES (e.g. RecoverySourceHandler).
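To illustrate the point about close(), here is a minimal sketch of why leaving a purely delegating wrapper unclosed is harmless once the inner stream is managed by try-with-resources. `TrackingStream` and `PassThroughStream` are hypothetical stand-ins for PartSliceStream and RateLimitingInputStream; the real classes are not shown in this thread, so their behaviour is assumed here.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class DelegatingCloseSketch {

    /** Hypothetical stand-in for the inner stream; records whether close() ran. */
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Hypothetical wrapper whose close() only delegates (FilterInputStream's default). */
    static final class PassThroughStream extends FilterInputStream {
        PassThroughStream(InputStream in) {
            super(in);
        }
    }

    /** Reads everything through the wrapper and reports whether the inner stream was closed. */
    static boolean innerClosedAfterRead(byte[] data) {
        TrackingStream inner = new TrackingStream(data);
        // Only the inner stream is registered as a resource.
        try (InputStream managed = inner) {
            InputStream wrapper = new PassThroughStream(managed);
            while (wrapper.read() != -1) {
                // Drain the stream; the wrapper itself is never closed explicitly.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return inner.closed;
    }

    public static void main(String[] args) {
        System.out.println("inner closed: " + innerClosedAfterRead(new byte[] {1, 2, 3}));
    }
}
```

This only holds when the wrapper owns no resources of its own; a wrapper that buffers or allocates would need its own close().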
OK thanks, and it looks like it does the right thing (checks |
LGTM, thanks @ywelsch!
@clintongormley which ES versions should we push this back to?
@ywelsch I'm good with 1.7.x and above
c95b2e2 to 968419e
Lucene's RateLimiter can sleep too much on small values (see also elastic#6018). The issue here is that calls to "pause" are not properly guarded in "restoreFile". Instead of simply adding the guard, this commit uses the RateLimitingInputStream, as is already done for "snapshotFile". Closes elastic#13828
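The guard described in the commit message can be sketched as follows. This is an illustration, not the code from the commit: `Limiter` is a hypothetical interface that mirrors the two RateLimiter methods relevant here (`pause` and `getMinPauseCheckBytes`), and `CountingLimiter` counts calls instead of sleeping so the difference is visible. The guarded variant accumulates bytes and calls `pause` only once the pending count exceeds the minimum, which is the pattern a rate-limiting input stream is assumed to follow.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class GuardedPauseSketch {

    /** Hypothetical stand-in mirroring the shape of Lucene's RateLimiter. */
    interface Limiter {
        long getMinPauseCheckBytes();

        void pause(long bytes);
    }

    /** Counts pause() calls instead of sleeping, so the behaviour can be observed. */
    static final class CountingLimiter implements Limiter {
        final long minPauseCheckBytes;
        int pauseCalls = 0;
        long bytesSeen = 0;

        CountingLimiter(long minPauseCheckBytes) {
            this.minPauseCheckBytes = minPauseCheckBytes;
        }

        @Override
        public long getMinPauseCheckBytes() {
            return minPauseCheckBytes;
        }

        @Override
        public void pause(long bytes) {
            pauseCalls++;
            bytesSeen += bytes;
        }
    }

    /** Unguarded pattern: pause() is called for every small read. */
    static void copyUnguarded(InputStream in, byte[] buffer, Limiter limiter) {
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                limiter.pause(n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Guarded pattern: accumulate bytes and pause only past the minimum check size. */
    static void copyGuarded(InputStream in, byte[] buffer, Limiter limiter) {
        try {
            long pending = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                pending += n;
                if (pending > limiter.getMinPauseCheckBytes()) {
                    limiter.pause(pending);
                    pending = 0;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[1024];
        CountingLimiter unguarded = new CountingLimiter(256);
        copyUnguarded(new ByteArrayInputStream(data), new byte[16], unguarded);
        CountingLimiter guarded = new CountingLimiter(256);
        copyGuarded(new ByteArrayInputStream(data), new byte[16], guarded);
        System.out.println("unguarded pause calls: " + unguarded.pauseCalls);
        System.out.println("guarded pause calls: " + guarded.pauseCalls);
    }
}
```

With a small read buffer, the unguarded loop calls `pause` once per read, while the guarded loop calls it far less often; any sleep overhead incurred per call shrinks accordingly.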
968419e to 03a4e22