Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store IO throttling throttles far more than asked #6018

Closed
mikemccand opened this issue May 2, 2014 · 2 comments
Closed

Store IO throttling throttles far more than asked #6018

mikemccand opened this issue May 2, 2014 · 2 comments

Comments

@mikemccand
Copy link
Contributor

I've been digging into the "merges can fall behind" at high indexing
rates, and I discovered some serious issues with the IO throttling,
which we recently (#5902) up'd from 20 MB/sec to 50 MB/sec by default.

Net/net I think when we ask for 50 MB/sec today we are really
throttling at something like 8 MB/sec!

Details:

I indexed a bunch of small log-file type docs into 1 shard, 0
replicas, using 1 sync _bulk client, to the point where it did it's
first big-ish merge (611 MB, 440K docs); the merge does not use CFS so
it's really writing 611 MB. I'm using a fast SSD.

With no throttling (index.store.throttle.type=none), the merge takes
20.8 seconds.

With the default 50 MB/sec merge throttling, it takes 72.1 sec, which
far too long (611 MB / 50 = 12.2 sec). The rate limiter enforces the
instantaneous rate, so at worse the merge time should have been 20.8 +
12.2 = 33 sec but likely much less than that because merging takes
CPU time.

So I dug in and discovered one problem, I think caused by the
super.flush and then delegate.flush in BufferedChecksumIndexOutput,
where the RateLimiter is always alternately called first on 8192 bytes
then on 0 bytes. If I fix RateLimiter to just ignore those 0 bytes,
the merge time with 50 MB/sec throttle drops to 49.9 sec: better, but
still too long. (I think once we cutover to Lucene's checksums this 0
byte issue will be fixed?)

System.nanoTime is actually quite costly, so I suspect the overhead of
just checking whether to pause, and of calling Thread.sleep, is way
too much when the pause time is small. So I change SimpleRateLimiter to
just accumulate the incoming bytes and then once it crosses 1 msec
worth at the specified rate, invoke the pause logic.

This really improved it: now the merge takes 25.7 sec at 50 MB/sec
throttle, and 64.9 sec at 10 MB/sec throttle. These times seem correct.

I'll also open a Lucene issue to fix this, and make an XRateLimiter
for ES in the meantime.

@mikemccand mikemccand self-assigned this May 2, 2014
@rmuir
Copy link
Contributor

rmuir commented May 2, 2014

So I dug in and discovered one problem, I think caused by the
super.flush and then delegate.flush in BufferedChecksumIndexOutput,
where the RateLimiter is always alternately called first on 8192 bytes
then on 0 bytes. If I fix RateLimiter to just ignore those 0 bytes,
the merge time with 50 MB/sec throttle drops to 49.9 sec: better, but
still too long. (I think once we cutover to Lucene's checksums this 0
byte issue will be fixed?)

It will solve it in that place: the BufferedChecksumIndexOutput. Note that RateLimitedIndexOutput itself has a similar construction, we should ensure its not doing anything stupid too.

Another similar expensive call is any digester's updateBytes(). We want to avoid calling this native method frequently (e.g. for zero bytes) for the same reasons.

@mikemccand
Copy link
Contributor Author

I opened https://issues.apache.org/jira/browse/LUCENE-5641 to fix this in Lucene; I'll carry the fix over to ES once that's in ...

s1monw added a commit that referenced this issue May 17, 2014
This commit upgrades to the latest Lucene 4.8.1 release including the
following bugfixes:

 * An IndexThrottle now kicks in when merges start falling behind
   limiting index threads to 1 until merges caught up. Closes #6066
 * RateLimiter now kicks in at the configured rate where previously
   the limiter was limiting at ~8MB/sec almost all the time. Closes #6018
@s1monw s1monw added the bug label May 18, 2014
@s1monw s1monw closed this as completed in 85a0b76 May 19, 2014
s1monw added a commit that referenced this issue May 19, 2014
This commit upgrades to the latest Lucene 4.8.1 release including the
following bugfixes:

 * An IndexThrottle now kicks in when merges start falling behind
   limiting index threads to 1 until merges caught up. Closes #6066
 * RateLimiter now kicks in at the configured rate where previously
   the limiter was limiting at ~8MB/sec almost all the time. Closes #6018
s1monw added a commit that referenced this issue May 19, 2014
This commit upgrades to the latest Lucene 4.8.1 release including the
following bugfixes:

 * An IndexThrottle now kicks in when merges start falling behind
   limiting index threads to 1 until merges caught up. Closes #6066
 * RateLimiter now kicks in at the configured rate where previously
   the limiter was limiting at ~8MB/sec almost all the time. Closes #6018
ywelsch pushed a commit to ywelsch/elasticsearch that referenced this issue Oct 3, 2015
Lucene's RateLimiter can do too much sleeping on small values (see also elastic#6018).
The issue here is that calls to "pause" are not properly guarded in "restoreFile".

Instead of simply adding the guard, this commit uses the RateLimitingInputStream similar as for "snapshotFile".

Closes elastic#13828
ywelsch pushed a commit that referenced this issue Oct 3, 2015
Lucene's RateLimiter can do too much sleeping on small values (see also #6018).
The issue here is that calls to "pause" are not properly guarded in "restoreFile".

Instead of simply adding the guard, this commit uses the RateLimitingInputStream similar as for "snapshotFile".

Closes #13828
ywelsch pushed a commit that referenced this issue Oct 3, 2015
Lucene's RateLimiter can do too much sleeping on small values (see also #6018).
The issue here is that calls to "pause" are not properly guarded in "restoreFile".

Instead of simply adding the guard, this commit uses the RateLimitingInputStream similar as for "snapshotFile".

Closes #13828
ywelsch pushed a commit that referenced this issue Oct 3, 2015
Lucene's RateLimiter can do too much sleeping on small values (see also #6018).
The issue here is that calls to "pause" are not properly guarded in "restoreFile".

Instead of simply adding the guard, this commit uses the RateLimitingInputStream similar as for "snapshotFile".

Closes #13828
ywelsch pushed a commit that referenced this issue Oct 3, 2015
Lucene's RateLimiter can do too much sleeping on small values (see also #6018).
The issue here is that calls to "pause" are not properly guarded in "restoreFile".

Instead of simply adding the guard, this commit uses the RateLimitingInputStream similar as for "snapshotFile".

Closes #13828
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants