GEODE-6424: Greatly improves statistic counter storage throughput. #3204

Merged
jake-at-work merged 5 commits into apache:develop from jake-at-work:feature/GEODE-6424
Feb 20, 2019

@jake-at-work
Contributor

Reduces thread contention by using LongAdder and DoubleAdder to store
counters. Benchmarking (on specific hardware) showed that
Atomic50StatisticsImpl could perform about 41M increments/second
regardless of the number of threads updating the counter. Poor use of
volatile memory access and CAS operations created unnecessary
contention. The replacement, StripedStatisticsImpl, uses LongAdder and
DoubleAdder to reduce contention. In the same benchmark, throughput
scaled nearly linearly up to the number of physical hardware threads
on the host, reaching as high as 2.8B increments/second on a
36-thread host.
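
For readers unfamiliar with the striped adders, here is a minimal sketch of the idea, not the actual StripedStatisticsImpl code. The class name StripedCountersSketch and its methods are hypothetical and exist only for illustration; the only real APIs used are java.util.concurrent.atomic.LongAdder and DoubleAdder. Each counter is backed by an adder that spreads concurrent updates across internal cells and combines them only when the value is read.

```java
import java.util.concurrent.atomic.DoubleAdder;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration only; this is NOT Geode's StripedStatisticsImpl.
final class StripedCountersSketch {
  private final LongAdder[] longCounters;
  private final DoubleAdder[] doubleCounters;

  StripedCountersSketch(int longCount, int doubleCount) {
    longCounters = new LongAdder[longCount];
    for (int i = 0; i < longCount; i++) {
      longCounters[i] = new LongAdder();
    }
    doubleCounters = new DoubleAdder[doubleCount];
    for (int i = 0; i < doubleCount; i++) {
      doubleCounters[i] = new DoubleAdder();
    }
  }

  // Writers add to the adder; under contention the adder spreads
  // updates over several cells instead of retrying CAS on one value.
  void incLong(int id, long delta) {
    longCounters[id].add(delta);
  }

  void incDouble(int id, double delta) {
    doubleCounters[id].add(delta);
  }

  // Readers pay the cost of summing the cells. The sum is not an
  // atomic snapshot while writers are still active.
  long getLong(int id) {
    return longCounters[id].sum();
  }

  double getDouble(int id) {
    return doubleCounters[id].sum();
  }

  // Increments one counter from several threads and returns the total
  // once every thread has finished.
  static long countWithThreads(int threads, int incrementsPerThread) {
    StripedCountersSketch stats = new StripedCountersSketch(1, 0);
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      workers[t] = new Thread(() -> {
        for (int i = 0; i < incrementsPerThread; i++) {
          stats.incLong(0, 1);
        }
      });
      workers[t].start();
    }
    try {
      for (Thread worker : workers) {
        worker.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for workers", e);
    }
    return stats.getLong(0);
  }

  public static void main(String[] args) {
    System.out.println(countWithThreads(4, 100_000));
  }
}
```

The trade-off is that reads become more expensive and are only weakly consistent while updates are in flight, which is usually acceptable for statistics that are incremented far more often than they are sampled.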

@jake-at-work
Contributor Author

See https://issues.apache.org/jira/browse/GEODE-6424 for benchmarking details.

LocalStatisticsImpl benchmarks even worse than Atomic50StatisticsImpl.
It is tightly integrated with OS stats.

public static void compareStatArchiveFiles(final File expectedStatArchiveFile,
    final File actualStatArchiveFile) throws IOException {
  System.out.println(actualStatArchiveFile);
  System.out.println(expectedStatArchiveFile);
Contributor

Are these new System.out.printlns intentional?

Contributor Author

Whoops! Didn’t mean to check that in. Thanks!

@jake-at-work jake-at-work merged commit 888d2b2 into apache:develop Feb 20, 2019
@jake-at-work jake-at-work deleted the feature/GEODE-6424 branch February 20, 2019 04:49
jake-at-work added a commit that referenced this pull request Feb 20, 2019
…3204)

(cherry picked from commit 888d2b2)