Skip to content

HBASE-27704 Quotas can drastically overflow configured limit#5099

Merged
bbeaudreault merged 1 commit intoapache:masterfrom
HubSpot:HBASE-27704
Mar 29, 2023
Merged

HBASE-27704 Quotas can drastically overflow configured limit#5099
bbeaudreault merged 1 commit intoapache:masterfrom
HubSpot:HBASE-27704

Conversation

@bbeaudreault
Copy link
Copy Markdown
Contributor

Removes limits in refill which did not allow to go negative.
FixedIntervalRateLimiter also needed some changes in getWaitInterval to properly match with how refilling really works.
Updated the tests to revive the original intended logic for quotas, where we really test for overconsumption.

@Apache-HBase
Copy link
Copy Markdown

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 24s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ master Compile Tests _
+1 💚 mvninstall 3m 42s master passed
+1 💚 compile 2m 24s master passed
+1 💚 checkstyle 0m 34s master passed
+1 💚 spotless 0m 40s branch has no errors when running spotless:check.
+1 💚 spotbugs 1m 23s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 3m 13s the patch passed
+1 💚 compile 2m 20s the patch passed
+1 💚 javac 2m 20s the patch passed
+1 💚 checkstyle 0m 30s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 hadoopcheck 12m 46s Patch does not cause any errors with Hadoop 3.2.4 3.3.4.
+1 💚 spotless 0m 37s patch has no errors when running spotless:check.
+1 💚 spotbugs 1m 27s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 10s The patch does not generate ASF License warnings.
37m 44s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5099/1/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #5099
Optional Tests dupname asflicense javac spotbugs hadoopcheck hbaseanti spotless checkstyle compile
uname Linux 836966f77f32 5.4.0-1093-aws #102~18.04.2-Ubuntu SMP Wed Dec 7 00:31:59 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / c2b64e7
Default Java Eclipse Adoptium-11.0.17+8
Max. process+thread count 82 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5099/1/console
versions git=2.34.1 maven=3.8.6 spotbugs=4.7.3
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link
Copy Markdown

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 23s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 💚 mvninstall 3m 41s master passed
+1 💚 compile 0m 40s master passed
+1 💚 shadedjars 4m 27s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 23s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 3m 13s the patch passed
+1 💚 compile 0m 41s the patch passed
+1 💚 javac 0m 41s the patch passed
+1 💚 shadedjars 4m 28s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 20s the patch passed
_ Other Tests _
+1 💚 unit 199m 39s hbase-server in the patch passed.
222m 25s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5099/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #5099
Optional Tests javac javadoc unit shadedjars compile
uname Linux 5b2431fe935f 5.4.0-1094-aws #102~18.04.1-Ubuntu SMP Tue Jan 10 21:07:03 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / c2b64e7
Default Java Eclipse Adoptium-11.0.17+8
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5099/1/testReport/
Max. process+thread count 2437 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5099/1/console
versions git=2.34.1 maven=3.8.6
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link
Copy Markdown

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 51s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 💚 mvninstall 2m 57s master passed
+1 💚 compile 0m 39s master passed
+1 💚 shadedjars 4m 14s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 26s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 2m 46s the patch passed
+1 💚 compile 0m 39s the patch passed
+1 💚 javac 0m 39s the patch passed
+1 💚 shadedjars 4m 15s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 23s the patch passed
_ Other Tests _
+1 💚 unit 218m 30s hbase-server in the patch passed.
240m 14s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5099/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile
GITHUB PR #5099
Optional Tests javac javadoc unit shadedjars compile
uname Linux b3043fee6145 5.4.0-137-generic #154-Ubuntu SMP Thu Jan 5 17:03:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / c2b64e7
Default Java Temurin-1.8.0_352-b08
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5099/1/testReport/
Max. process+thread count 2411 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5099/1/console
versions git=2.34.1 maven=3.8.6
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache9 Apache9 requested a review from infraio March 16, 2023 10:22
@bbeaudreault
Copy link
Copy Markdown
Contributor Author

@infraio any chance you might have time to take a look at this?

@Apache9
Copy link
Copy Markdown
Contributor

Apache9 commented Mar 22, 2023

The implementation of RateLimiter is a bit strange to me. The name of the 'canExecute' method indicates that it just tests whether a request can be executed, which should be read only, but in the method we will update avail? Then what is the purpose for consume method?

And in consume method

  public synchronized void consume(final long amount) {

    if (isBypass()) {
      return;
    }

    if (amount >= 0) {
      this.avail -= amount;
      if (this.avail < 0) {
        this.avail = 0;
      }
    } else {
      if (this.avail <= Long.MAX_VALUE + amount) { // <===== here
        this.avail -= amount;
        this.avail = Math.min(this.avail, this.limit);
      } else {
        this.avail = this.limit;
      }
    }
  }

Why we do a 'Long.MAX_VALUE + amount'? It will always overflow?

@bbeaudreault
Copy link
Copy Markdown
Contributor Author

@Apache9 Thanks for taking a look. I agree the API is weird.

I think the main reason is, unlike a normal RateLimiter, in HBase we can't simply do rateLimiter.acquire(permits). When a request comes in (Get/Scan), we don't know how many bytes it will consume. We also have a bunch of RateLimiters of various types to check.

So at a higher level, we do quota.checkQuota which does two loops of all rate limiters with an estimated read size (100 bytes for get, 1000 bytes for scan):

  • The first loop checks canExecute of the estimate bytes, which refills and confirms whether after refill there are enough.
  • The second loop does consume of the estimate, which actually acquires them from the rate limiter.

Once the request finishes, we know how much bytes it consumed. At that point we call consume once more, with the real byte value minus the estimate. This can overflow the rate limiter.

I was not around for the initial implementation of this. I tried looking up reviews from back then, but we weren't on GitHub yet so it's only in reviewboard where the review was minimal.

My guess on why they went with 2 loops is to minimize the chance that the 5th rate limiter (or whatever) fails to consume, and then you have to rollback all the ones that succeeded in consuming.

Why we do a 'Long.MAX_VALUE + amount'? It will always overflow?

In that code snippet, amount is negative. So despite being confusing to read, it won't overflow. I agree its confusing though.

Copy link
Copy Markdown
Member

@ndimiduk ndimiduk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice tests.

@ndimiduk
Copy link
Copy Markdown
Member

Is there an off-the-shelf rate limiter that we can use instead of our own implementation?

@bbeaudreault
Copy link
Copy Markdown
Contributor Author

Thanks again nick! I agree it'd be nice if there were an OTS rate limiter we can use. That was my first thought here as well, since it took me a bit to really grok what was happening here.

The most obvious candidate is guava, but unfortunately it doesn't really work with how we currently use rate limiters in quotas. As described above, we need sort of a multi-phase commit rate limiter where you can lease and return permits, check for permits without consuming, and consume non-blocking (after request is finished, when we finally know how much we needed).

For this issue I'm partial to merging as is. I can give a quick search around for rate limiters like this and file a jira if any look promising.

@bbeaudreault bbeaudreault merged commit c52f4b2 into apache:master Mar 29, 2023
@bbeaudreault bbeaudreault deleted the HBASE-27704 branch March 29, 2023 20:19
bbeaudreault added a commit to HubSpot/hbase that referenced this pull request Mar 29, 2023
…5099)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
bbeaudreault added a commit to HubSpot/hbase that referenced this pull request Mar 29, 2023
…5099)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants