ACCUMULO-4187: Added rate limiting for major compactions. #90

Merged 1 commit on Apr 19, 2016

ShawnWalker
Contributor

Added configuration property tserver.compaction.major.throughput of type PropertyType.MEMORY to control rate limiting of major compactions on each tserver.

Specifying a value of 0B (the default) disables rate limiting.

If a positive value is specified, then all tablet servers will limit the I/O performed during major compaction accordingly. For example, with tserver.compaction.major.throughput=30M, each tserver will read no more than 30MiB per second and write no more than 30MiB per second, combined over all major compaction threads.

This change involved adding an optional RateLimiter parameter to FileOperations.openReader(...) and FileOperations.openWriter(...). Most of the file changes involve adding an appropriate null to invocations of these methods.
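To illustrate the "optional limiter, null means no limiting" shape described above, here is a minimal sketch of an output stream that charges each write against a limiter when one is supplied. `PermitLimiter` and `ThrottledOutputStream` are hypothetical names invented for this example; they are not the PR's `RateLimiter` or its stream classes.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical stand-in for a rate limiter; only the one method used here. */
interface PermitLimiter {
  void acquire(long permits);
}

/** Output stream that charges every written byte against an optional limiter. */
final class ThrottledOutputStream extends FilterOutputStream {
  private final PermitLimiter limiter; // null means "do not limit"

  ThrottledOutputStream(OutputStream out, PermitLimiter limiter) {
    super(out);
    this.limiter = limiter;
  }

  @Override
  public void write(int b) throws IOException {
    if (limiter != null) {
      limiter.acquire(1);
    }
    out.write(b);
  }

  @Override
  public void write(byte[] buf, int off, int len) throws IOException {
    if (limiter != null) {
      limiter.acquire(len);
    }
    // Write the whole range at once instead of FilterOutputStream's byte-at-a-time default.
    out.write(buf, off, len);
  }
}
```

Callers that do not want throttling pass `null` for the limiter, which mirrors the "adding an appropriate null" pattern mentioned above.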

import java.util.List;
import java.util.WeakHashMap;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.LongAdder;
Member

Unfortunately, LongAdder is only available in JDK8, and we're still on JDK7.

Contributor

I was unaware of LongAdder... it's really neat. Yet another nice thing in JDK8 we can't use yet.
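For code that has to build on JDK 7, `java.util.concurrent.atomic.AtomicLong` (available since Java 5) is a common substitute for `LongAdder`, at the cost of more contention when many threads update it at once. A minimal sketch follows; the class name `ByteCounter` is hypothetical and this is not necessarily the replacement the PR ended up using.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical JDK 7-compatible byte counter; not code from this PR. */
final class ByteCounter {
  private final AtomicLong total = new AtomicLong();

  /** Record that n more bytes were processed. */
  void add(long n) {
    total.addAndGet(n);
  }

  /** Current running total. */
  long sum() {
    return total.get();
  }

  /** Return the running total and atomically reset it to zero. */
  long sumThenReset() {
    return total.getAndSet(0L);
  }
}
```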

@joshelser
Member

Made a first pass through the code. Wow! Great work for a first contribution @ShawnWalker! Some general themes:

  • nit-picky stylistic things
  • Missing javadoc on public classes/methods

Some new tests on these new classes (testing the rate limiting components and input/output streams should be really important) would really make this even better.

I'll have to go back to reread about the use of <T extends Class & Interface> littered everywhere with a fresh mind. First time I've run across it and I don't think I entirely grokked the point.
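For readers who have not met the construct, a bound such as `<T extends Class & Interface>` is a Java intersection type: the type argument must be a subclass of the class and also implement the interface, so the method body can use members of both without casting. The sketch below is self-contained; `Seekable`, `SeekableBytes`, and `readAt` are hypothetical names chosen for illustration and are not taken from the PR.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical interface, for illustration only. */
interface Seekable {
  void seek(long position) throws IOException;
}

/** A stream that is both an InputStream (class) and Seekable (interface). */
final class SeekableBytes extends ByteArrayInputStream implements Seekable {
  SeekableBytes(byte[] data) {
    super(data);
  }

  @Override
  public void seek(long position) {
    // ByteArrayInputStream keeps its read position in the protected field 'pos'.
    this.pos = (int) position;
  }
}

final class IntersectionDemo {
  /** T must extend InputStream AND implement Seekable, so both APIs are usable. */
  static <T extends InputStream & Seekable> int readAt(T in, long offset) throws IOException {
    in.seek(offset); // method from Seekable
    return in.read(); // method from InputStream
  }
}
```

Without the intersection bound, `readAt` would need either two parameters referring to the same object or a cast inside the method.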

private long currentRate;

public GuavaRateLimiter(long initialRate) {
  this.currentRate = initialRate;
Contributor

Should there be a sanity check here to ensure the rate is non-negative, or are there sufficient checks elsewhere?

Contributor Author

I'm adopting the convention that a non-positive rate should mean "unlimited", and so allowing non-positive values as the current rate.

Contributor

The docs for tserver.compaction.major.throughput specify using 0 for unlimited. Is specifying 0 or negative documented elsewhere?

Contributor

Looking into the validation associated with PropertyType.MEMORY, it seems to check for >=0.

Contributor Author

There's a hint in the javadocs for RateLimiter.getRate():

/** Get current QPS of the rate limiter, with a nonpositive rate indicating no limit. */
public long getRate();

But I'll add more explicit comments to GuavaRateLimiter and to SharedRateLimiterFactory
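The "non-positive rate means unlimited" convention discussed in this thread can be sketched as follows. This is a deliberately naive throttle that sleeps in proportion to the permits consumed; it does not use Guava and is not the PR's `GuavaRateLimiter`. The class name `NaiveThrottle` and its methods are hypothetical.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the "rate <= 0 means unlimited" convention. */
final class NaiveThrottle {
  private volatile long rate; // permits (e.g. bytes) per second; <= 0 means unlimited

  NaiveThrottle(long initialRate) {
    this.rate = initialRate; // non-positive values are accepted on purpose
  }

  long getRate() {
    return rate;
  }

  void setRate(long newRate) {
    this.rate = newRate;
  }

  boolean isUnlimited() {
    return rate <= 0;
  }

  /** Nanoseconds to pause after consuming the given permits; 0 when unlimited. */
  long delayNanosFor(long permits) {
    long current = rate; // read once so a concurrent setRate cannot change it mid-computation
    if (current <= 0 || permits <= 0) {
      return 0L;
    }
    return (long) (permits * 1e9 / current);
  }

  /** Block the caller long enough to keep its average rate at or below the limit. */
  void acquire(long permits) throws InterruptedException {
    long nanos = delayNanosFor(permits);
    if (nanos > 0) {
      TimeUnit.NANOSECONDS.sleep(nanos);
    }
  }
}
```

Treating zero and negative values identically keeps the constructor free of validation, which matches the convention stated above while leaving the `>=0` check to property validation.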

@keith-turner
Contributor

I played around with this branch locally. I created a table, test_ingest, with 10,000,000 entries using the following commands.

./bin/accumulo shell -u root -p secret -e "createtable test_ingest"
./bin/accumulo org.apache.accumulo.test.TestIngest -u root -p secret --timestamp 1 --size 50 --random 56 --rows 10000000 --start 0 --cols 1 --instance instance16

I set the rate limit to 5M and forced a compaction. I saw the following in the tserver logs.

Compaction 2<< 10,000,000 read | 10,000,000 written | 122,925 entries/sec | 81.350 secs |  431,758,096 bytes | 5307413.596 byte/sec
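As a sanity check on the log line above, the reported byte rate can be recomputed from the byte count and elapsed time and compared with the configured limit. This sketch assumes 5M is interpreted as 5 MiB (5 × 1024 × 1024 bytes), consistent with the 30M/30MiB example in the PR description; the class and method names are hypothetical.

```java
/** Hypothetical helper that recomputes the rate reported in the compaction log line. */
final class ThroughputCheck {
  // Assumption: "5M" means 5 MiB, i.e. 5,242,880 bytes per second.
  static final long LIMIT_BYTES_PER_SEC = 5L * 1024 * 1024;

  static double bytesPerSecond(long bytes, double seconds) {
    return bytes / seconds;
  }

  static double ratioToLimit(long bytes, double seconds) {
    return bytesPerSecond(bytes, seconds) / LIMIT_BYTES_PER_SEC;
  }

  public static void main(String[] args) {
    long bytes = 431_758_096L; // "bytes" column of the log line
    double seconds = 81.350;   // "secs" column of the log line
    System.out.printf("observed %.0f B/s, limit %d B/s, ratio %.3f%n",
        bytesPerSecond(bytes, seconds), LIMIT_BYTES_PER_SEC, ratioToLimit(bytes, seconds));
  }
}
```

Under that assumption the observed rate is roughly 1% above the limit.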

Then I split the table into 8 tablets and forced a compaction to test the rate limit for multiple threads. I had the default of 3 compaction threads. I saw the following in the logs for this test.

Compaction 2;row_0003749;row_00025 1,249,000 read | 1,249,000 written | 41,866 entries/sec | 29.833 secs |   53,926,291 bytes | 1807605.370 byte/sec
Compaction 2;row_00025;row_000125 1,250,000 read | 1,250,000 written | 41,899 entries/sec | 29.833 secs |   53,970,229 bytes | 1809078.168 byte/sec
Compaction 2;row_000125< 1,250,000 read | 1,250,000 written | 41,783 entries/sec | 29.916 secs |   53,969,343 bytes | 1804029.382 byte/sec
Compaction 2;row_000625;row_0005 1,250,000 read | 1,250,000 written | 42,134 entries/sec | 29.667 secs |   53,970,847 bytes | 1819221.593 byte/sec
Compaction 2;row_0005;row_0003749 1,251,000 read | 1,251,000 written | 42,109 entries/sec | 29.708 secs |   54,012,874 bytes | 1818125.555 byte/sec
Compaction 2;row_00075;row_000625 1,250,000 read | 1,250,000 written | 41,881 entries/sec | 29.846 secs |   53,969,549 bytes | 1808267.406 byte/sec
Compaction 2;row_000875;row_00075 1,250,000 read | 1,250,000 written | 63,909 entries/sec | 19.559 secs |   53,969,511 bytes | 2759318.523 byte/sec
Compaction 2<;row_000875 1,250,000 read | 1,250,000 written | 63,798 entries/sec | 19.593 secs |   53,969,987 bytes | 2754554.535 byte/sec
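One plausible reading of these lines, which the comment itself does not spell out: the six compactions that took about 29.8 seconds ran three at a time, and the last two ran as a pair, so each should get roughly a third or a half of the shared limit. The sketch below checks that reading, again assuming 5M means 5 MiB; the class and method names are hypothetical.

```java
/** Hypothetical check of per-compaction rates against an equal share of a 5 MiB/s limit. */
final class SharedLimitCheck {
  // Assumption: "5M" means 5 MiB, i.e. 5,242,880 bytes per second.
  static final long LIMIT_BYTES_PER_SEC = 5L * 1024 * 1024;

  /** Equal share of the limit when 'concurrent' compactions draw on it at once. */
  static double fairShare(int concurrent) {
    return (double) LIMIT_BYTES_PER_SEC / concurrent;
  }

  static double ratioToShare(double observedBytesPerSec, int concurrent) {
    return observedBytesPerSec / fairShare(concurrent);
  }

  public static void main(String[] args) {
    // Rates copied from the log lines; the 3-vs-2 grouping is an assumption.
    double[] threeAtATime = {1807605.370, 1809078.168, 1804029.382,
        1819221.593, 1818125.555, 1808267.406};
    double[] twoAtATime = {2759318.523, 2754554.535};
    for (double rate : threeAtATime) {
      System.out.printf("%.0f B/s is %.3f of a one-third share%n", rate, ratioToShare(rate, 3));
    }
    for (double rate : twoAtATime) {
      System.out.printf("%.0f B/s is %.3f of a one-half share%n", rate, ratioToShare(rate, 2));
    }
  }
}
```

Under that grouping, every reported rate comes out a few percent above its share of the limit.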

@ShawnWalker
Contributor Author

I've made changes to address most of the comments on this thread. I've also addressed a performance concern that Keith Turner noticed (PositionedOutputs.PositionedOutputStream was behaving poorly).

I've additionally fixed an issue with tracing in TabletServerBatchWriter that was causing the test ShellServerIT.trace(...) to fail for me for reasons unrelated to my changes. Perhaps I should split that out into a separate issue/patch?

@ctubbsii
Member

@ShawnWalker That sounds great! If you can separate out the issue with ShellServerIT as a separate issue, that'd be helpful. I'm guessing it affects older branches, as well.

@keith-turner
Contributor

@ShawnWalker I suspect the issue you fixed that was causing ShellServerIT to fail was introduced by ACCUMULO-1755

@keith-turner
Contributor

+1

@ShawnWalker
Contributor Author

I've reverted the changes to TabletServerBatchWriter (which were moved to ACCUMULO-4191) and squashed the changeset into a single commit.

@joshelser
Member

I think everything I asked about has been taken care of. Thanks for that, Shawn.

The only thing I don't see (and I didn't say it explicitly earlier, so I don't think it's a blocker to merge this in) is a high-level test. I see you added some tests for the rate-limiter piece. I'm wondering if we could make an integration test specifically to test this feature. It's nice when we have a general test class (built around a minicluster) available so that we can test potential bugs and add new tests easily.

@keith-turner LMK if you have time to merge this in and run the tests. Otherwise, I'll kick off something myself.

@keith-turner
Contributor

@joshelser I can merge it. An end-to-end test would be nice to detect regressions. I suppose the test would make sure a compaction doesn't run too fast?

@joshelser
Member

> I suppose the test would make sure a compaction doesn't run too fast?

Yeah, I was thinking about how best to test this. You can only reasonably assert a lower bound on compaction time (to avoid performance skew on certain hosts). Maybe turning off compression for a table, writing a bunch of data and then asserting that a compaction takes at least X time is easiest. You'll still have to account for "compression" from the run-length encoding, but at least that should be uniform across hosts.
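The lower-bound idea can be sketched like this: if a compaction has to move at least B bytes under a limit of R bytes per second, it cannot legitimately finish in under B / R seconds. The helper below is a hypothetical illustration, not the integration test that was later added to this PR; a real test would measure the elapsed time around a forced compaction on a minicluster.

```java
/** Hypothetical lower-bound check for a rate-limited compaction. */
final class CompactionLowerBound {
  /** Minimum time in milliseconds that moving 'bytes' can take at 'bytesPerSecond'. */
  static long minMillis(long bytes, long bytesPerSecond) {
    if (bytesPerSecond <= 0) {
      return 0L; // non-positive rate means unlimited, so there is no lower bound
    }
    return (bytes * 1000L) / bytesPerSecond;
  }

  /** Fail if the compaction finished sooner than the rate limit should allow. */
  static void assertNotTooFast(long bytes, long bytesPerSecond, long elapsedMillis) {
    long min = minMillis(bytes, bytesPerSecond);
    if (elapsedMillis < min) {
      throw new AssertionError(
          "compaction took " + elapsedMillis + " ms, expected at least " + min + " ms");
    }
  }
}
```

Because only a lower bound is asserted, `bytes` should be a conservative estimate of what is actually written after any encoding, so that a slow host can never cause a false failure.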

@ShawnWalker
Contributor Author

With Keith's help, I've added a small end-to-end IT on rate limiting of major compactions.

@joshelser
Member

> With Keith's help, I've added a small end-to-end IT on rate limiting of major compactions.

Looks good. Great work guys! 👍

@ctubbsii
Member

Looks like there are a few trivial findbugs issues to address in the IT.

Added configuration property tserver.compaction.major.throughput of type PropertyType.MEMORY with a default of 0B (unlimited).  If another value is specified (e.g. 30M), then all tablet servers will limit the I/O performed during major compaction accordingly (e.g. neither reading nor writing more than 30MiB per second combined over all major compaction threads).

@asfgit merged commit 783314c into apache:master on Apr 19, 2016