Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZOOKEEPER-3301:Enforce the quota limit #934

Closed
wants to merge 1 commit into from

Conversation

maoling
Copy link
Member

@maoling maoling commented May 9, 2019

  • Thanks for the original work from ZOOKEEPER-1383, ZOOKEEPER-2593, ZOOKEEPER-451, especially the work from ZOOKEEPER-1383 contributed by Thawan Kooburat(I also sign off his name in the commit message) which also implemented the very valuable throughput quota.In the further, we will also do this.

  • zookeeper.enforeQuota. When enabled and the client exceeds the total bytes or children count hard quota under a znode, the server will reject the request and reply the client a QuotaExceededException by force. The default value is: false.

  • the checkQuota involves the create() and setData() api, not including the delete().

  • When users set the quota which's less than the existing stats, we give a thoughtful warning info.

  • the following code in the StatsTrack has a bad augmentability:

            if (split.length != 2) {
                throw new IllegalArgumentException("invalid string " + stats);
            }
    

    we do a trick to solve it for the expansibility, but we will get a little strange quota info(Output quota for /c2 count=-1,bytes=-1=;byteHardLimit=-1;countHardLimit=5) when using listquota. some UTs has covered it.

  • the logic about checkQuota should be put in the PrepRequestProcessor, other than DataTree.
    we will get the following two negative effects if putting checkQuota in the DataTree:

      1. When the write request has exceeded the quota, the corresponding transaction log will load into disk successfully.It's not good, although it has any data inconsistency issue, because when the server restart, so long as the transaction logs are applied in the same order, the exceeded nodes will not be applied into the state machine.
      1. the client will be blocking and waiting for the response, because when throwing QuotaExceededException in the the DataTree, the rc.stat will be null and BinaryOutputArchive#writeRecord will throw NPE.
      1. Overall, the pre-check about the write request should be done in the PrepRequestProcessor(at least before SyncRequestProcessor)(Look at an example from checkACL())
  • more detail in the ZOOKEEPER-3301.

  • [Added in 2020-02-25] use RateLogger to replace LOG to avoid quota exceed logs flooding the disk

  • A TODO improvement is: only users have admin permission can write to /zookeeper/quota(just like /zookeeper/config) to avoid some users' misoperation

@maoling maoling changed the title Zookeeper-3301:[WIP]Enforce the quota limit and doc the design of the quota. ZOOKEEPER-3301:Enforce the quota limit Jun 11, 2019
@maoling
Copy link
Member Author

maoling commented Jun 12, 2019

retest this please

@anmolnar
Copy link
Contributor

@maoling Do you still work on this patch?
Please rebase.

@maoling
Copy link
Member Author

maoling commented Dec 11, 2019

@anmolnar I will rebase it this weekend

Copy link
Contributor

@ztzg ztzg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @maoling,

Excellent "remix" of the previous work; plus your integration of the logic in PrepRequestProcessor and other clean-ups look good to me. I did not spot any critical issue on the technical front.

I am selecting "Request changes," however, because of a single point which I believe should be addressed as it would difficult to change later: a typo in the enforeQuota property name (enfore instead of enforce—unless I am missing something).

I am quite interested in quotas (and am willing to contribute to other aspects not already handled by your patch), so you will find that my review is a bit "nit-picky": let's be clear that I do not expect you (nor am I trying to push you) to take care of all these points before the "pull request" gets in!

My comments are just personal notes which I am sharing as constructive remarks; moreover, they can very well go in subsequent PRs if considered valid.

In other words: my only concern is enforeQuota; I will change this review to "Approve" is that is changed (or explained).

Cheers, -D

@@ -856,6 +856,10 @@ property, when available, is noted below.
commit log is triggered.
Does not affect the limit defined by *flushDelay*.
Default is 1000.
* *enforeQuota* :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a typo; I suppose you meant enforceQuota. (This applies throughout.)

FYI, Mohammad Arshad was suggesting to use two properties, enforce.number.quota and enforce.byte.quota:

enforce the quota limit at more granular level. can we add enforce.number.quota and enforce.byte.quota ?

(But I don't know his "use case.")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If users really need this two separated property, we can extend it easily in the future

private void checkQuota(String lastPrefix, long bytesDiff, int countDiff)
throws KeeperException.QuotaExceededException {
if (!enforeQuota) {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this mean that LOG.warn("Quota exceeded: ...) entries are completely lost? I.e., not only do we not enforce the quota (as requested), but no warnings are produced. Is that the intent?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enforeQuota enables by default. When it disables, users will see a log to show disable infos at the start-up of server and the LOG.warn("Quota exceeded: ...) will never been seen

LOG.debug("checkQuota: lastPrefix={}, bytesDiff={}, countDiff={}", lastPrefix, bytesDiff, countDiff);

if (lastPrefix == null || lastPrefix.length() == 0) {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that all callers check for lastPrefix != null, but that checkQuota() does another, more thorough, check. I also note that DataTree.getMaxPrefixWithQuota() explicitly avoids returning empty strings. Should we tighten some of these runtime checks, and/or turn them into invariant checks?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use if (StringUtils.isEmpty(lastPrefix)) instead now , checking extra lastPrefix.length() may be very cheap:)

String[] split = stats.split(",");
if (split.length != 2) {
throw new IllegalArgumentException("invalid string " + stats);
String[] keyValuePairs = stats.split("[,;]+");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation for String.split says:

An invocation of this method of the form str.split(regex, n) yields the same result as the expression Pattern.compile(regex).split(str, n)

Looking that String.java, it seems that anything simpler than a single character and/or escaped character does indeed lead to the instantiation of a new Pattern object. I haven't measured the cost, but it doesn't seem trivial.

So I'd perhaps suggest keeping complex compiled patterns around; something like this (untested):

private static final PAIRS_SEPARATOR = Pattern.compile("[,;]+");
[...]
String[] keyValuePairs = PAIRS_SEPARATOR.split(stats);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @ztzg

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch

@@ -109,17 +123,11 @@ public static boolean delQuota(ZooKeeper zk, String path,
} catch (IllegalArgumentException ex) {
throw new MalformedPathException(ex.getMessage());
} catch (KeeperException.NoNodeException ne) {
System.err.println("quota does not exist for " + path);
return true;
throw new KeeperException.NoNodeException(ne.getMessage());
}
StatsTrack strack = new StatsTrack(new String(data));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that the "platform's default charset" is ASCII-compatible. I would suggest being explicit about US-ASCII here:

import static java.nio.charset.StandardCharsets.US_ASCII;
[...]
new String(data, US_ASCII)

Also: there are quite a few instances of new StatsTrack(new String(data)) in this patch; wouldn't it be better to just add a constructor which takes a byte[]?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes and maybe UTF-8 is better

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

  1. set UTF-8
  2. add new constructor public StatsTrack(byte[] stat)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the content of quota znode is generated by server, it only contains character, figure. so no garbled codes?

strack.setByteHardLimit(-1L);
}

zk.setData(quotaPath, strack.toString().getBytes(), -1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same US-ASCII remark as above, but for getBytes:

strack.toString().getBytes(US_ASCII)

with, perhaps, a new byte[] StatsTrack.toNodeData() method which centralizes the conversion in one place?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a new method public byte[] getStatsBytes()

boolean checkByteQuota = (limitStats.getBytes() > -1 || limitStats.getByteHardLimit() > -1);

if (!checkCountQuota && !checkByteQuota) {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thinking out loud: how often does that happen? Should the administrator be alerted, somehow, that the clients are hitting an useless quota node? Or am I missing an existing case and/or foreseen extension?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get the point :)

@@ -602,11 +584,11 @@ public void deleteNode(String path, long zxid)
String lastPrefix = getMaxPrefixWithQuota(path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(GitHub won't let me add a note on line 577—about something I noticed but which was not introduced by this patch.)

Isn't this code slightly wrong?

        if (parentName.startsWith(procZookeeper) && Quotas.limitNode.equals(childName)) {
            // delete the node in the trie.
            // we need to update the trie as well
            pTrie.deletePath(parentName.substring(quotaZookeeper.length()));
        }

Shouldn't it test for parentName.startsWith(quotaZookeeper), instead of procZookeeper?

Oh, and DataTree is full of .substring(quotaZookeeper.length()) which could be replaced by your trimQuotaPath helper :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Yes, parentName.startsWith(quotaZookeeper) is closer to the truth. we can create another ticket to fix it
  • substring(quotaZookeeper.length()) are all replaced with trimQuotaPath

return;
}
synchronized (node) {
limitStats = new StatsTrack(new String(node.data));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does the parsing & "newing" with the node locked; I'd just get the data and move the constructors out of the synchronized block.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, if don't synchronized (node), spotbugs will complain about it

limitStats = new StatsTrack(new String(node.data));
}
//check the quota
boolean checkCountQuota = (limitStats.getCount() > -1 || limitStats.getCountHardLimit() > -1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

boolean checkCountQuota = countDiff != 0 && (limitStats.getCount() > -1 || limitStats.getCountHardLimit() > -1);

(and the same for checkByteQuota/bytesDiff.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch

@eolivelli
Copy link
Contributor

@maoling can you rebase this branch please ?

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really useful for the client to have a distinction between the two kind of errors ?

* *enforeQuota* :
(Java system property: **zookeeper.enforeQuota**)
enfore the quota check. Enable this option to use the [quota feature](http://zookeeper.apache.org/doc/current/zookeeperQuotas.html).
the default value:true.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be disabled by default, otherwise applications may break

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the original soft quota limit(print warning log) had already enabled by default.
Turning on enforeQuota will not break the compatibility

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it breaks because before this change in case of quota related error you had only a log line, now the client will be blocked

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on disabling the feature by default

@@ -403,7 +407,11 @@ public void setCode(int code) {
/** The session has been closed by server because server requires client to do SASL authentication,
* but client is not configured with SASL authentication or configuted with SASL but failed
* (i.e. wrong credential used.). */
SESSIONCLOSEDREQUIRESASLAUTH(-124);
SESSIONCLOSEDREQUIRESASLAUTH(-124),
/** Exceeded the count quota that setted on the path.*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"setted" -> "was set"

SESSIONCLOSEDREQUIRESASLAUTH(-124),
/** Exceeded the count quota that setted on the path.*/
COUNTQUOTAEXCEEDED(-125),
/** Exceeded the bytes quota that setted on the path.*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"setted" -> "was set"

String[] split = stats.split(",");
if (split.length != 2) {
throw new IllegalArgumentException("invalid string " + stats);
String[] keyValuePairs = stats.split("[,;]+");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @ztzg

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maoling I have finished my first pass.

The patch is very big.
A better approach would be to split the patch into smaller patches, for instance:

  • server side changes (always before the client)
  • client side changes (always after the server)
  • CLI

This way we can make review simpler and more effective.

/** Exceeded the count quota that setted on the path.*/
COUNTQUOTAEXCEEDED(-125),
/** Exceeded the bytes quota that setted on the path.*/
BYTEQUOTAEXCEEDED(-126);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to use 2 separate return codes.
I think that the client in any case cannot recover "better" if it has this information.

The real information is "you have no more space" and the client cannot adapt its behavior given the error code.

Usually ZK clients perform tests (with java switch for instance) over the RC, adding 2 new RCs will be very annoying.

The best option would be to not add a new RC, but I can't find a good value to use

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch is very big. A better approach would be to split the patch into smaller patches
the patch

  • Although the patch is a little big, the implementation is straightforward and simple, and lots of codes are written only for testing.
  • They're all in one, I cannot split them. If I split the server and CLI codes, it's not easy for me to test the changes

@@ -59,7 +71,7 @@ public StatsTrack(String stats) {
* @return the count as part of this string
*/
public int getCount() {
return this.count;
return (int) getValue(countStr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we casting to "int" ?
it the value is a "long" we must return a "long" or we have to handle the overflow some way

@@ -109,17 +123,11 @@ public static boolean delQuota(ZooKeeper zk, String path,
} catch (IllegalArgumentException ex) {
throw new MalformedPathException(ex.getMessage());
} catch (KeeperException.NoNodeException ne) {
System.err.println("quota does not exist for " + path);
return true;
throw new KeeperException.NoNodeException(ne.getMessage());
}
StatsTrack strack = new StatsTrack(new String(data));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes and maybe UTF-8 is better

static {
skipACL = System.getProperty("zookeeper.skipACL", "no").equals("yes");
if (skipACL) {
LOG.info("zookeeper.skipACL==\"yes\", ACL checks will be skipped");
}
enforeQuota = System.getProperty("zookeeper.enforeQuota", "true").equals("true");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enforeQuota -> enforceQuota

DataNode n = zkDatabase.getNode(path);
byte[] data = setDataRequest.getData();
if (n != null) {
String lastPrefix = zkDatabase.getDataTree().getMaxPrefixWithQuota(path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are adding computations even if it is not needed (for instance in my project I do not have quotas)
if there any way to not run this code if enforceQuota is not enabled ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there any way to not run this code if enforceQuota is not enabled ?

if enforceQuota is not enabled, this code will not been run. But if enabled, that's inevitable


//check the quota firstly
ZKDatabase zkDatabase = zks.getZKDatabase();
String lastPrefix = zkDatabase.getDataTree().getMaxPrefixWithQuota(path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, please do not make the system work if it is not needed

Copy link
Contributor

@ztzg ztzg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @maoling,

I had another look at the diff since your last push. Thank you for taking care of my nits :) (I also noticed the vastly-expanded test suite; so thank you for that, too!)

I'm "Approving" this as the points I was concerned about are gone, and as I would like GitHub to remove the "Request changes" sign from my review.

I do however agree with @eolivelli that StatsTrack.getCount() and friends could just be longs, as the casting opens an unnecessary can of worms. (And I'll of course defer to him wrt. the actual "mergeability" of this PR.)

In any case, I'll continue monitoring this as I am planning to later build upon it.

Cheers, -D

@maoling
Copy link
Member Author

maoling commented Jan 10, 2020

@ztzg @eolivelli Thanks for your great review work, and I have answered all your review comments
Plz continue to review with latest patch.

* *enforeQuota* :
(Java system property: **zookeeper.enforeQuota**)
enfore the quota check. Enable this option to use the [quota feature](http://zookeeper.apache.org/doc/current/zookeeperQuotas.html).
the default value:true.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it breaks because before this change in case of quota related error you had only a log line, now the client will be blocked

/** Exceeded the count quota that was set on the path.*/
COUNTQUOTAEXCEEDED(-125),
/** Exceeded the bytes quota that was set on the path.*/
BYTESQUOTAEXCEEDED(-126),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that for the client it does not have much value to have two different error codes.
The fact is that "you are using too much resources".

Having two codes would make sense only if the application could automatically handle the error

@maoling
Copy link
Member Author

maoling commented Mar 31, 2020

@eolivelli That's OK. I have received your comment.
@anmolnar @nkalmar Do you have time to review it :)?
After that, I'll process this round review comments in another push.

Copy link
Contributor

@ztzg ztzg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM, in general, and seems to work well. Thank you again for your work on this!

I noticed a bug in DelQuotaCommand and SetQuotaCommand, however. It was not introduced by this patch, but the additional options make it worse :) Feel free to ignore it in this round, in which case I will open a separate ticket & PR (which may be warranted, if other Commands suffer from the same issue).

Comment on lines -49 to +50
og1.addOption(new Option("b", false, "bytes quota"));
og1.addOption(new Option("n", false, "num quota"));
og1.addOption(new Option("n", false, "num soft quota"));
og1.addOption(new Option("b", false, "bytes soft quota"));
og1.addOption(new Option("N", false, "num hard quota"));
og1.addOption(new Option("B", false, "bytes hard quota"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a bug you introduced—it's also in master—but it turns out that that OptionGroup business does not quite work as one would expect:

] create /foo
Created /foo
] setquota -b 42 /foo

Okay.

] create /bar
Created /bar
] setquota -b 42 /bar

Still okay, because we're using the -b flag again. But this:

] create /baz
Created /baz
] setquota -B 42 /baz
org.apache.commons.cli.AlreadySelectedException: The option 'B' was specified but an option from this group has already been selected: 'b'

… now fails, because the "global" OptionGroup object has been modified to mark the -b as in use! I'd suggest getting rid of the option field, and recreating the Option "tree" for each parse.

A workaround is to quit and restart the "cli" program.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ztzg

  • I know this. It may be not a bug, just a feature:). A little inconvenient when changing the option.
  • Change OptionGroup to Option will survive from it? e.g: SetAclCommand

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have been a bit clearer, because unless I missed something, once one has used the -b flag of setquota, for example, one can never, ever again use another flag with setquota, even on another node, unless one exits and restarts zkCli.sh. That seems a bit extreme as a "feature" :)

(But again: you did not introduce that bug; it was there before. I can fix it in another ticket/PR.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Yes it's doable. Please create another ticket for it.
  • using OptionGroup will have that inconvenient, which making options exclusive in one client session.
  • For the fix, no need to restart? just replace OptionGroup to Option, then check the args length in the code?
  • You can also create a ticket for this issue you found:
    if (f (parentName.startsWith(prh(procZookeeper) &&Quotas.limitNode.equals(chi(childName))

Copy link
Contributor

@ztzg ztzg Apr 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay—I'll take care of those :)

(Including experimenting with Option vs. OptionGroup, as you suggested.)

Comment on lines -55 to +57
og1.addOption(new Option("b", true, "bytes quota"));
og1.addOption(new Option("n", true, "num quota"));
og1.addOption(new Option("n", true, "num soft quota"));
og1.addOption(new Option("b", true, "bytes soft quota"));
og1.addOption(new Option("N", true, "num hard quota"));
og1.addOption(new Option("B", true, "bytes hard quota"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same OptionGroup issue as with DelQuotaCommand.

@ztzg
Copy link
Contributor

ztzg commented Oct 12, 2020

Hi, @maoling,

For the record, I asked this question on the dev mailing list:

[Is] something specific blocking your pull request, or is it just a matter of finding time to work on it? Is there a way I can help? It would be great to push this topic over the hump (it started with ZOOKEEPER-451!).

(The whole discussion about 3.7 is visible on this page.)

Copy link
Contributor

@hanm hanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think quota is a useful feature - we have something internally developed but not contributed back. Would be good to have this PR finalized and then we can contribute ours (on top of this) back.

I have some quick inline comments.

Some other general comments (not necessarily has to be done in this PR, but raise for discussion)

  • It might be better to provide an option so server can optionally close client session when a client exceeds the quota. Leaving that client open by simply rejecting the requests does not sound very optimal.
  • Quota check sometimes is expensive so it might worth to add a new processor stage to optimize processing pipeline, instead of doing the check in pre processor.
  • Quota can be applied to not just write operations, but also reads.

* *enforeQuota* :
(Java system property: **zookeeper.enforeQuota**)
enfore the quota check. Enable this option to use the [quota feature](http://zookeeper.apache.org/doc/current/zookeeperQuotas.html).
the default value:true.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on disabling the feature by default


data = zk.getData(statPath, false, new Stat());
StatsTrack statStrack = new StatsTrack(data);
if ((quotaStrack.getCount() > -1 && quotaStrack.getCount() < statStrack.getCount()) || (quotaStrack.getCountHardLimit() > -1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are quite a few checks in this code block that can be extracted into some sort of checkQuota abstraction

@maoling
Copy link
Member Author

maoling commented Oct 14, 2020

@hanm @eolivelli I will rebase this PR and address all your review comments at this weekend

@anmolnar
Copy link
Contributor

@maoling Great!

@maoling
Copy link
Member Author

maoling commented Oct 26, 2020

@hanm @eolivelli @anmolnar I have rebased this PR, update the PR description and address all your review comments. Please remind me if I forget something. New changes:

1. Only one Exception: QuotaExceededException(delete ByteQuotaExceededException and CountQuotaExceededException)
2. zookeeper.enforceQuota is false by default now
3. extract a methond in the SetQuotaCommand#checkQuota to reduce the duplicated codes

Quota check sometimes is expensive so it might worth to add a new processor stage to optimize processing pipeline, instead of doing the check in pre processor.

IMO, if the logics about the quota inflate in the future, we can use a new class: QuotaManager to decouple codes.

Quota can be applied to not just write operations, but also reads.

Strongly Agree

@maoling
Copy link
Member Author

maoling commented Oct 26, 2020

Some thoughts about Quotas:

0. Why is quota necessary?

It is possible for clients to send/fetch very high volumes of data and thus monopolize server resources, cause network saturation and even DOS attack to server. Having quotas protects against these issues and is all the more important in large multi-tenant clusters where a small set of badly behaved clients can degrade user experience for the well behaved ones. Quota provides an isolation effect on users applications, eliminate the noisy of neighbor, avoid users to deploy multiple sets of zk cluster for every application in most general scenarios. Of course, because in the same JVM instance, this isolation provided is not as hard as the hardware/physics isolation.

1. Space Quota

We now can use quota feature to limit the the total children count or bytes under a znode. we can called this Quota type: Space Quota. (Not repeat it again here).

2. Throttle Quota

2.0 brief introduction

We also require another Quota type: Throttle Quota(or throughput Quota) which is able to limit the QPS of requests, throttle request peak/pressure and smooth request traffic. Throttle Quota can also have protective effects on the server as RequestThrottler did, and more flexible as it can be tuning during the runtime

2.1 manual

  • Throttle type: it can be expressed as READ, WRITE, or the default type(read + write).

  • Timeframes: it can be expressed in the following units: sec, min, hour, day. Actually, for simplification, it's enough to only support sec for most user cases.

  • Request sizes: it can be expressed in the following units: B (bytes), K (kilobytes), M (megabytes), G (gigabytes), T(
    terabytes), P (petabytes). For example: 1048576 bytes/sec means: allow to request 1MB data every second.

  • Request number: it is a number. For example: 1000 reqs/sec represents: allow one thousand requests every second.

  • How to measure the request rate?

    • Most systems take a Sliding Window algorithm approach. Its base idea is: the sliding window is that instead of making discrete jumps as the fixed window does, it smoothly slides with passing time. For example, split the one second into ten small time window(every time window is 100ms, we call it a slot or block), then judge which slot the current request located and calculate current QPS. more detail in [6]
        AtomicLong[] countPerSlot = new AtomicLong();
        countPerSlot[index].incrementAndGet();
        count.incrementAndGet();
        index = (index + 1) % slot;
        scheduledExecutorService.scheduleAtFixedRate(slidingWindowRateLimiter, 100, 100, 
        TimeUnit.MILLISECONDS);// slot = 10;
    

2.2 Scope

Background: Suppose that we have 26 downstream businesses(from business-A to business-Z). If latency spike happens and traffic surges, how can we detect which business is responsible for this and control this emergency situation rapidly? To meet this demand, we need to do some works to identity the client. The scope of Throttle Quota can be implemented from these two alternative dimensions: User or Client.id

2.2.1 User:

ZooKeeper doesn't have a thorough user management component. For simplification, we can take advantage of the username:password of ACL digest auth.

String auth = userName + ":" +password;
zk.addAuthInfo("digest", auth.getBytes());
2.2.2 Client.id:

client.id is a logical grouping of clients with a meaningful name chosen by the client application.
client.id are defined in the application using the client.id property. A client ID identifies an application making a request.

2.2.2.1 How to generate the client.id?
  • If using a unauthorized client, the client id is hard-coded by a "client.id" property in the ZKClientConfig.

  • If using an authorized client, we can obtain the secured identity from client certificate as the client.id.
    For a Kerberos' example:

    Client {
      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      principal="businessA/cluster2-host1@HADOOP.COM";
    };
    

    Parse the keyword: "businessA" as the client.id. Of course, an authorized client can also use the hard-coded way as the unauthorized client did. Some discussions in [5].

2.2.3 Hard/Soft Throttle Quota
  • For the hard Throttle Quota, the server will reject requests and reply the client a QuotaExceededException by force.
  • However, for the soft Throttle Quota, we should not do nothing, just log the warnings as count or bytes space quota did. At this case, when server detects this quota violation, the server slows down a client exceeding its quota. It computes the amount of delay needed to bring a guilty client under it's quota and delays the response for that time. This approach keeps the quota violation transparent to clients. This also keeps them from having to implement any special backoff and retry behavior which can get tricky. In fact, bad client behavior (retry without backoff) can exacerbate the very problem quotas are trying to solve. Of course at this case, users need to increase the value of zookeeper.request.timeout to avoid too much client's timeout

3 Some existing designs

3.1

In the original implementation of ZOOKEEPER-1383, it counted the throughput from ServerCnxn#packetReceived based on a simple time-window approach. However it did not distinguish between read and write requests, and it's not easy and appropriate to do it at that place: ServerCnxn. It used the ServerCnxn#resetStats to split/divide time windows, it's not good for controlling the traffic more precisely in a fine-grained manner.

A Simplified request workflow:
request ---> ServerCnxn ---> RequestThrottler ---> PrepRequestProcessor --->......---> FinalRequestProcessor

3.2

In the FB's design[1], they combine with RequestThrottler based on the Rolling-window rate throttling approach. When request arrives at the RequestThrottler, asks for the RequestOracle to update/check Quotas and make a throttling decision, then pipeline requests in the RequestProcessors. more details in [1],[2]

Reference:

@hanm
Copy link
Contributor

hanm commented Oct 27, 2020

Some thoughts about Quotas:

0. Why is quota necessary?

It is possible for clients to send/fetch very high volumes of data and thus monopolize server resources, cause network saturation and even DOS attack to server. Having quotas protects against these issues and is all the more important in large multi-tenant clusters where a small set of badly behaved clients can degrade user experience for the well behaved ones. Quota provides an isolation effect on users applications, eliminate the noisy of neighbor, avoid users to deploy multiple sets of zk cluster for every application in most general scenarios. Of course, because in the same JVM instance, this isolation provided is not as hard as the hardware/physics isolation.

1. Space Quota

We now can use quota feature to limit the the total children count or bytes under a znode. we can called this Quota type: Space Quota. (Not repeat it again here).

2. Throttle Quota

2.0 brief introduction

We also require another Quota type: Throttle Quota(or throughput Quota) which is able to limit the QPS of requests, throttle request peak/pressure and smooth request traffic. Throttle Quota can also have protective effects on the server as RequestThrottler did, and more flexible as it can be tuning during the runtime

2.1 manual

  • Throttle type: it can be expressed as READ, WRITE, or the default type(read + write).

  • Timeframes: it can be expressed in the following units: sec, min, hour, day. Actually, for simplification, it's enough to only support sec for most user cases.

  • Request sizes: it can be expressed in the following units: B (bytes), K (kilobytes), M (megabytes), G (gigabytes), T(
    terabytes), P (petabytes). For example: 1048576 bytes/sec means: allow to request 1MB data every second.

  • Request number: it is a number. For example: 1000 reqs/sec represents: allow one thousand requests every second.

  • How to measure the request rate?

    • Most systems take a Sliding Window algorithm approach. Its base idea is: the sliding window is that instead of making discrete jumps as the fixed window does, it smoothly slides with passing time. For example, split the one second into ten small time window(every time window is 100ms, we call it a slot or block), then judge which slot the current request located and calculate current QPS. more detail in [6]
        AtomicLong[] countPerSlot = new AtomicLong();
        countPerSlot[index].incrementAndGet();
        count.incrementAndGet();
        index = (index + 1) % slot;
        scheduledExecutorService.scheduleAtFixedRate(slidingWindowRateLimiter, 100, 100, 
        TimeUnit.MILLISECONDS);// slot = 10;
    

2.2 Scope

Background: Suppose that we have 26 downstream businesses(from business-A to business-Z). If latency spike happens and traffic surges, how can we detect which business is responsible for this and control this emergency situation rapidly? To meet this demand, we need to do some works to identity the client. The scope of Throttle Quota can be implemented from these two alternative dimensions: User or Client.id

2.2.1 User:

ZooKeeper doesn't have a thorough user management component. For simplification, we can take advantage of the username:password of ACL digest auth.

String auth = userName + ":" +password;
zk.addAuthInfo("digest", auth.getBytes());
2.2.2 Client.id:

client.id is a logical grouping of clients with a meaningful name chosen by the client application.
client.id are defined in the application using the client.id property. A client ID identifies an application making a request.

2.2.2.1 How to generate the client.id?
  • If using a unauthorized client, the client id is hard-coded by a "client.id" property in the ZKClientConfig.

  • If using an authorized client, we can obtain the secured identity from client certificate as the client.id.
    For a Kerberos' example:

    Client {
      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      principal="businessA/cluster2-host1@HADOOP.COM";
    };
    

    Parse the keyword: "businessA" as the client.id. Of course, an authorized client can also use the hard-coded way as the unauthorized client did. Some discussions in [5].

2.2.3 Hard/Soft Throttle Quota
  • For the hard Throttle Quota, the server will reject requests and reply the client a QuotaExceededException by force.
  • However, for the soft Throttle Quota, we should not do nothing, just log the warnings as count or bytes space quota did. At this case, when server detects this quota violation, the server slows down a client exceeding its quota. It computes the amount of delay needed to bring a guilty client under it's quota and delays the response for that time. This approach keeps the quota violation transparent to clients. This also keeps them from having to implement any special backoff and retry behavior which can get tricky. In fact, bad client behavior (retry without backoff) can exacerbate the very problem quotas are trying to solve. Of course at this case, users need to increase the value of zookeeper.request.timeout to avoid too much client's timeout

3 Some existing designs

3.1

In the original implementation of ZOOKEEPER-1383, it counted the throughput from ServerCnxn#packetReceived based on a simple time-window approach. However it did not distinguish between read and write requests, and it's not easy and appropriate to do it at that place: ServerCnxn. It used the ServerCnxn#resetStats to split/divide time windows, it's not good for controlling the traffic more precisely in a fine-grained manner.

A Simplified request workflow:
request ---> ServerCnxn ---> RequestThrottler ---> PrepRequestProcessor --->......---> FinalRequestProcessor

3.2

In the FB's design[1], they combine with RequestThrottler based on the Rolling-window rate throttling approach. When request arrives at the RequestThrottler, asks for the RequestOracle to update/check Quotas and make a throttling decision, then pipeline requests in the RequestProcessors. more details in [1],[2]

Reference:

This is an excellent write up and summarize of existing public JIRA tickets, thanks for spending time pull the information together. Can we move this into a google doc or in a specific JIRA ticket?

@arshadmohammad had an excellent point from security point of view which I agree - and I also think crafting admin only Quota APIs can be done separately because the quota enforce feature is disabled by default.

I'll give a detailed review of this later this week - please bear with my limited review bandwidth.

@ztzg
Copy link
Contributor

ztzg commented Jan 5, 2021

Hi @maoling, all,

There are quite a few outstanding good ideas in the comments above. But I am under the impression that the PR itself is ready to be merged, and that the other items, including your design document, should be lifted into tickets and/or wiki pages. Would you agree? If you do, I would like to give this a last push so that it lands in 3.7.0.

Cheers, -D

@maoling
Copy link
Member Author

maoling commented Jan 6, 2021

@ztzg
Yes, we should create an umbrella JIRA tickets for the future works described in the design doc.
I'll rebase this PR tomorrow

@maoling
Copy link
Member Author

maoling commented Jan 9, 2021

A little busy recently, just basically solve the conflicts in this PR. I will have a booked time at next weekend(01-16 and 01-17) to recheck this patch and give it another push

@maoling maoling closed this Jan 9, 2021
@maoling maoling reopened this Jan 9, 2021
@ztzg
Copy link
Contributor

ztzg commented Jan 10, 2021

A little busy recently, just basically solve the conflicts in this PR. I will have a booked time at next weekend(01-16 and 01-17) to recheck this patch and give it another push

Sure, no problem.

FYI, I had also rebased the PR on top of current master, 1/ to see if there was anything problematic and 2/ because I will probably have to carry it as a patch on top of 3.7.0 anyway.

I have pushed the results there: ztzg@87fa21f . I haven't spotted anything unexpected in the "interdiff"; feel free to grab (and check) my tree if it helps.

@anmolnar
Copy link
Contributor

anmolnar commented Jan 15, 2021

@maoling Hopefully you don't mind that I copied your great write up about Quotas to our wiki pages:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/Quotas

I hope I can get back to this patch sooner rather than later, because I'd really love to see this submitted, but way too busy with other stuff.

Would you mind rebasing it first?

>
>
Co-authored-by: maoling <maoling199210191@sina.com>
Co-authored-by: Thawan Kooburat <thawan@apache.org>
@@ -394,6 +394,7 @@ protected void pRequest2Txn(int type, long zxid, Request request, Record record,
validatePath(path, request.sessionId);
nodeRecord = getRecordForPath(path);
zks.checkACL(request.cnxn, nodeRecord.acl, ZooDefs.Perms.WRITE, request.authInfo, path, null);
zks.checkQuota(path, setDataRequest.getData(), OpCode.setData);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THREAD_SAFETY_VIOLATION: Unprotected write. Non-private method PrepRequestProcessor.pRequest2Txn(...) indirectly writes to field server.ZooKeeperServer.RATE_LOGGER.msg outside of synchronization.
Reporting because this access may occur on a background thread.


switch (type) {
case OpCode.create:
checkQuota(lastPrefix, dataBytes, 1);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THREAD_SAFETY_VIOLATION: Read/Write race. Non-private method ZooKeeperServer.checkQuota(...) indirectly reads without synchronization from server.ZooKeeperServer.RATE_LOGGER.count. Potentially races with write in method ZooKeeperServer.checkQuota(...).
Reporting because this access may occur on a background thread.

@maoling
Copy link
Member Author

maoling commented Jan 17, 2021

@ztzg @anmolnar Thanks for your ping and work. I have rebased this PR, please help me recheck it.

@ztzg
Copy link
Contributor

ztzg commented Jan 17, 2021

@ztzg @anmolnar Thanks for your ping and work. I have rebased this PR, please help me recheck it.

@maoling: Fantastic, thanks! Will do!

@anmolnar, @eolivelli, @hanm, @nkalmar, @symat: I need (at least) a second approval for merging this. Is any of you up to it, so that I can pull this feature into 3.7?

Copy link
Contributor

@symat symat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think this patch is about to be ready to be merged, and this should be one of the highlighted features of 3.7.0. Thanks @maoling for working on this so much and also thanks for all the reviewers!

I quickly went through the code and also checked the tests.
I gave the +1 assuming that the unit tests run successfully (I saw that the maven job "full-build-java-tests" failed, while the "continuous-integration/jenkins/pr-merge" job succeeded, so I assume that the tests errors were due to CI problems...)

@symat
Copy link
Contributor

symat commented Jan 18, 2021

I re-triggered the github CI checks

@ztzg ztzg closed this in 190a227 Jan 19, 2021
ztzg pushed a commit that referenced this pull request Jan 19, 2021
- Thanks for the original work from ZOOKEEPER-1383, ZOOKEEPER-2593, ZOOKEEPER-451, especially the work from ZOOKEEPER-1383 contributed by [Thawan Kooburat](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=thawan)(I also sign off his name in the commit message) which also implemented the very valuable throughput quota.In the further, we will also do this.
- `zookeeper.enforeQuota`. When enabled and the client exceeds the total bytes or children count hard quota under a znode, the server will reject the request and reply the client a `QuotaExceededException` by force. The default value is: false.
- the checkQuota involves the `create()` and `setData()` api, not including the `delete()`.
- When users set the quota which's less than the existing stats, we give a thoughtful warning info.
- the following code in the StatsTrack has a bad augmentability:

  >             if (split.length != 2) {
  >                 throw new IllegalArgumentException("invalid string " + stats);
  >             }

   we do a trick to solve it for the expansibility, but we will get a little strange quota info(`Output quota for /c2 count=-1,bytes=-1=;byteHardLimit=-1;countHardLimit=5`) when using `listquota`. some UTs has covered it.
- the logic about `checkQuota` should be put in the `PrepRequestProcessor`, other than `DataTree`.
  we will get the following two negative effects if putting `checkQuota` in the `DataTree`:
  - 1. When the write request has exceeded the quota, the corresponding transaction log will load into disk successfully.It's not good, although it has any data inconsistency issue, because when the server restart, so long as the transaction logs are applied in the same order, the exceeded nodes will not be applied into the state machine.
  - 2. the client will be blocking and waiting for the response, because when throwing `QuotaExceededException` in the the `DataTree`, the` rc.stat` will be `null` and `BinaryOutputArchive#writeRecord` will throw `NPE`.
  - 3. Overall, the pre-check about the write request should be done in the `PrepRequestProcessor`(at least before `SyncRequestProcessor`)(Look at an example from `checkACL()`)
- more detail in the [ZOOKEEPER-3301](https://issues.apache.org/jira/browse/ZOOKEEPER-3301).
- [Added in 2020-02-25] use `RateLogger` to replace `LOG` to avoid quota exceed logs flooding the disk
- A `TODO` improvement is: only users have admin permission can write to `/zookeeper/quota`(just like `/zookeeper/config`) to avoid some users' misoperation

Author: maoling <maoling199210191@sina.com>

Reviewers: Mate Szalay-Beko <symat@apache.org>, Damien Diederen <ddiederen@apache.org>, Enrico Olivelli <eolivelli@apache.org>, Michael Han <hanm@apache.org>

Closes #934 from maoling/ZOOKEEPER-3301

(cherry picked from commit 190a227)
Signed-off-by: Damien Diederen <ddiederen@apache.org>
ztzg pushed a commit that referenced this pull request Jan 19, 2021
- Thanks for the original work from ZOOKEEPER-1383, ZOOKEEPER-2593, ZOOKEEPER-451, especially the work from ZOOKEEPER-1383 contributed by [Thawan Kooburat](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=thawan)(I also sign off his name in the commit message) which also implemented the very valuable throughput quota.In the further, we will also do this.
- `zookeeper.enforeQuota`. When enabled and the client exceeds the total bytes or children count hard quota under a znode, the server will reject the request and reply the client a `QuotaExceededException` by force. The default value is: false.
- the checkQuota involves the `create()` and `setData()` api, not including the `delete()`.
- When users set the quota which's less than the existing stats, we give a thoughtful warning info.
- the following code in the StatsTrack has a bad augmentability:

  >             if (split.length != 2) {
  >                 throw new IllegalArgumentException("invalid string " + stats);
  >             }

   we do a trick to solve it for the expansibility, but we will get a little strange quota info(`Output quota for /c2 count=-1,bytes=-1=;byteHardLimit=-1;countHardLimit=5`) when using `listquota`. some UTs has covered it.
- the logic about `checkQuota` should be put in the `PrepRequestProcessor`, other than `DataTree`.
  we will get the following two negative effects if putting `checkQuota` in the `DataTree`:
  - 1. When the write request has exceeded the quota, the corresponding transaction log will load into disk successfully.It's not good, although it has any data inconsistency issue, because when the server restart, so long as the transaction logs are applied in the same order, the exceeded nodes will not be applied into the state machine.
  - 2. the client will be blocking and waiting for the response, because when throwing `QuotaExceededException` in the the `DataTree`, the` rc.stat` will be `null` and `BinaryOutputArchive#writeRecord` will throw `NPE`.
  - 3. Overall, the pre-check about the write request should be done in the `PrepRequestProcessor`(at least before `SyncRequestProcessor`)(Look at an example from `checkACL()`)
- more detail in the [ZOOKEEPER-3301](https://issues.apache.org/jira/browse/ZOOKEEPER-3301).
- [Added in 2020-02-25] use `RateLogger` to replace `LOG` to avoid quota exceed logs flooding the disk
- A `TODO` improvement is: only users have admin permission can write to `/zookeeper/quota`(just like `/zookeeper/config`) to avoid some users' misoperation

Author: maoling <maoling199210191@sina.com>

Reviewers: Mate Szalay-Beko <symat@apache.org>, Damien Diederen <ddiederen@apache.org>, Enrico Olivelli <eolivelli@apache.org>, Michael Han <hanm@apache.org>

Closes #934 from maoling/ZOOKEEPER-3301

(cherry picked from commit 190a227)
Signed-off-by: Damien Diederen <ddiederen@apache.org>
@ztzg
Copy link
Contributor

ztzg commented Jan 19, 2021

Hi @maoling,

Enforceable quota support is now in master, branch-3.7 and branch-3.7.0. Thank you very, very much for your work on this! This comes on the heels of your promotion to committer status :)

As you and other mentioned, there are more features to come. I hope this constitutes a good basis for @hanm to refactor, and hopefully, unload, his internal patches. @arshadmohammad had important suggestions, which I have duly noted. As per our discussions above, I also have a few minor followups I have to take care of.

Thank you @anmolnar for taking care of transferring @maoling's notes; @eolivelli, @hanm and others for the early/intermediate reviews, and @symat for the closing one—as well as re-running CI, which allowed the Jenkins test suite to succeed.

@maoling maoling deleted the ZOOKEEPER-3301 branch January 19, 2021 10:15
@maoling
Copy link
Member Author

maoling commented Sep 14, 2021

I found another issue/bug introduced by this PR: ZOOKEEPER-4375 Quota cannot limit the specify value when multiply clients create/set znodes. I will fix this bug asap !!!!

RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Aug 31, 2022
- Thanks for the original work from ZOOKEEPER-1383, ZOOKEEPER-2593, ZOOKEEPER-451, especially the work from ZOOKEEPER-1383 contributed by [Thawan Kooburat](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=thawan)(I also sign off his name in the commit message) which also implemented the very valuable throughput quota.In the further, we will also do this.
- `zookeeper.enforeQuota`. When enabled and the client exceeds the total bytes or children count hard quota under a znode, the server will reject the request and reply the client a `QuotaExceededException` by force. The default value is: false.
- the checkQuota involves the `create()` and `setData()` api, not including the `delete()`.
- When users set the quota which's less than the existing stats, we give a thoughtful warning info.
- the following code in the StatsTrack has a bad augmentability:

  >             if (split.length != 2) {
  >                 throw new IllegalArgumentException("invalid string " + stats);
  >             }

   we do a trick to solve it for the expansibility, but we will get a little strange quota info(`Output quota for /c2 count=-1,bytes=-1=;byteHardLimit=-1;countHardLimit=5`) when using `listquota`. some UTs has covered it.
- the logic about `checkQuota` should be put in the `PrepRequestProcessor`, other than `DataTree`.
  we will get the following two negative effects if putting `checkQuota` in the `DataTree`:
  - 1. When the write request has exceeded the quota, the corresponding transaction log will load into disk successfully.It's not good, although it has any data inconsistency issue, because when the server restart, so long as the transaction logs are applied in the same order, the exceeded nodes will not be applied into the state machine.
  - 2. the client will be blocking and waiting for the response, because when throwing `QuotaExceededException` in the the `DataTree`, the` rc.stat` will be `null` and `BinaryOutputArchive#writeRecord` will throw `NPE`.
  - 3. Overall, the pre-check about the write request should be done in the `PrepRequestProcessor`(at least before `SyncRequestProcessor`)(Look at an example from `checkACL()`)
- more detail in the [ZOOKEEPER-3301](https://issues.apache.org/jira/browse/ZOOKEEPER-3301).
- [Added in 2020-02-25] use `RateLogger` to replace `LOG` to avoid quota exceed logs flooding the disk
- A `TODO` improvement is: only users have admin permission can write to `/zookeeper/quota`(just like `/zookeeper/config`) to avoid some users' misoperation

Author: maoling <maoling199210191@sina.com>

Reviewers: Mate Szalay-Beko <symat@apache.org>, Damien Diederen <ddiederen@apache.org>, Enrico Olivelli <eolivelli@apache.org>, Michael Han <hanm@apache.org>

Closes apache#934 from maoling/ZOOKEEPER-3301
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Aug 31, 2022
- Thanks for the original work from ZOOKEEPER-1383, ZOOKEEPER-2593, ZOOKEEPER-451, especially the work from ZOOKEEPER-1383 contributed by [Thawan Kooburat](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=thawan)(I also sign off his name in the commit message) which also implemented the very valuable throughput quota.In the further, we will also do this.
- `zookeeper.enforeQuota`. When enabled and the client exceeds the total bytes or children count hard quota under a znode, the server will reject the request and reply the client a `QuotaExceededException` by force. The default value is: false.
- the checkQuota involves the `create()` and `setData()` api, not including the `delete()`.
- When users set the quota which's less than the existing stats, we give a thoughtful warning info.
- the following code in the StatsTrack has a bad augmentability:

  >             if (split.length != 2) {
  >                 throw new IllegalArgumentException("invalid string " + stats);
  >             }

   we do a trick to solve it for the expansibility, but we will get a little strange quota info(`Output quota for /c2 count=-1,bytes=-1=;byteHardLimit=-1;countHardLimit=5`) when using `listquota`. some UTs has covered it.
- the logic about `checkQuota` should be put in the `PrepRequestProcessor`, other than `DataTree`.
  we will get the following two negative effects if putting `checkQuota` in the `DataTree`:
  - 1. When the write request has exceeded the quota, the corresponding transaction log will load into disk successfully.It's not good, although it has any data inconsistency issue, because when the server restart, so long as the transaction logs are applied in the same order, the exceeded nodes will not be applied into the state machine.
  - 2. the client will be blocking and waiting for the response, because when throwing `QuotaExceededException` in the the `DataTree`, the` rc.stat` will be `null` and `BinaryOutputArchive#writeRecord` will throw `NPE`.
  - 3. Overall, the pre-check about the write request should be done in the `PrepRequestProcessor`(at least before `SyncRequestProcessor`)(Look at an example from `checkACL()`)
- more detail in the [ZOOKEEPER-3301](https://issues.apache.org/jira/browse/ZOOKEEPER-3301).
- [Added in 2020-02-25] use `RateLogger` to replace `LOG` to avoid quota exceed logs flooding the disk
- A `TODO` improvement is: only users have admin permission can write to `/zookeeper/quota`(just like `/zookeeper/config`) to avoid some users' misoperation

Author: maoling <maoling199210191@sina.com>

Reviewers: Mate Szalay-Beko <symat@apache.org>, Damien Diederen <ddiederen@apache.org>, Enrico Olivelli <eolivelli@apache.org>, Michael Han <hanm@apache.org>

Closes apache#934 from maoling/ZOOKEEPER-3301
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Aug 31, 2022
- Thanks for the original work from ZOOKEEPER-1383, ZOOKEEPER-2593, ZOOKEEPER-451, especially the work from ZOOKEEPER-1383 contributed by [Thawan Kooburat](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=thawan)(I also sign off his name in the commit message) which also implemented the very valuable throughput quota.In the further, we will also do this.
- `zookeeper.enforeQuota`. When enabled and the client exceeds the total bytes or children count hard quota under a znode, the server will reject the request and reply the client a `QuotaExceededException` by force. The default value is: false.
- the checkQuota involves the `create()` and `setData()` api, not including the `delete()`.
- When users set the quota which's less than the existing stats, we give a thoughtful warning info.
- the following code in the StatsTrack has a bad augmentability:

  >             if (split.length != 2) {
  >                 throw new IllegalArgumentException("invalid string " + stats);
  >             }

   we do a trick to solve it for the expansibility, but we will get a little strange quota info(`Output quota for /c2 count=-1,bytes=-1=;byteHardLimit=-1;countHardLimit=5`) when using `listquota`. some UTs has covered it.
- the logic about `checkQuota` should be put in the `PrepRequestProcessor`, other than `DataTree`.
  we will get the following two negative effects if putting `checkQuota` in the `DataTree`:
  - 1. When the write request has exceeded the quota, the corresponding transaction log will load into disk successfully.It's not good, although it has any data inconsistency issue, because when the server restart, so long as the transaction logs are applied in the same order, the exceeded nodes will not be applied into the state machine.
  - 2. the client will be blocking and waiting for the response, because when throwing `QuotaExceededException` in the the `DataTree`, the` rc.stat` will be `null` and `BinaryOutputArchive#writeRecord` will throw `NPE`.
  - 3. Overall, the pre-check about the write request should be done in the `PrepRequestProcessor`(at least before `SyncRequestProcessor`)(Look at an example from `checkACL()`)
- more detail in the [ZOOKEEPER-3301](https://issues.apache.org/jira/browse/ZOOKEEPER-3301).
- [Added in 2020-02-25] use `RateLogger` to replace `LOG` to avoid quota exceed logs flooding the disk
- A `TODO` improvement is: only users have admin permission can write to `/zookeeper/quota`(just like `/zookeeper/config`) to avoid some users' misoperation

Author: maoling <maoling199210191@sina.com>

Reviewers: Mate Szalay-Beko <symat@apache.org>, Damien Diederen <ddiederen@apache.org>, Enrico Olivelli <eolivelli@apache.org>, Michael Han <hanm@apache.org>

Closes apache#934 from maoling/ZOOKEEPER-3301
@maoling maoling restored the ZOOKEEPER-3301 branch September 2, 2022 11:56
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Sep 3, 2022
- Thanks for the original work from ZOOKEEPER-1383, ZOOKEEPER-2593, ZOOKEEPER-451, especially the work from ZOOKEEPER-1383 contributed by [Thawan Kooburat](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=thawan)(I also sign off his name in the commit message) which also implemented the very valuable throughput quota.In the further, we will also do this.
- `zookeeper.enforeQuota`. When enabled and the client exceeds the total bytes or children count hard quota under a znode, the server will reject the request and reply the client a `QuotaExceededException` by force. The default value is: false.
- the checkQuota involves the `create()` and `setData()` api, not including the `delete()`.
- When users set the quota which's less than the existing stats, we give a thoughtful warning info.
- the following code in the StatsTrack has a bad augmentability:

  >             if (split.length != 2) {
  >                 throw new IllegalArgumentException("invalid string " + stats);
  >             }

   we do a trick to solve it for the expansibility, but we will get a little strange quota info(`Output quota for /c2 count=-1,bytes=-1=;byteHardLimit=-1;countHardLimit=5`) when using `listquota`. some UTs has covered it.
- the logic about `checkQuota` should be put in the `PrepRequestProcessor`, other than `DataTree`.
  we will get the following two negative effects if putting `checkQuota` in the `DataTree`:
  - 1. When the write request has exceeded the quota, the corresponding transaction log will load into disk successfully.It's not good, although it has any data inconsistency issue, because when the server restart, so long as the transaction logs are applied in the same order, the exceeded nodes will not be applied into the state machine.
  - 2. the client will be blocking and waiting for the response, because when throwing `QuotaExceededException` in the the `DataTree`, the` rc.stat` will be `null` and `BinaryOutputArchive#writeRecord` will throw `NPE`.
  - 3. Overall, the pre-check about the write request should be done in the `PrepRequestProcessor`(at least before `SyncRequestProcessor`)(Look at an example from `checkACL()`)
- more detail in the [ZOOKEEPER-3301](https://issues.apache.org/jira/browse/ZOOKEEPER-3301).
- [Added in 2020-02-25] use `RateLogger` to replace `LOG` to avoid quota exceed logs flooding the disk
- A `TODO` improvement is: only users have admin permission can write to `/zookeeper/quota`(just like `/zookeeper/config`) to avoid some users' misoperation

Author: maoling <maoling199210191@sina.com>

Reviewers: Mate Szalay-Beko <symat@apache.org>, Damien Diederen <ddiederen@apache.org>, Enrico Olivelli <eolivelli@apache.org>, Michael Han <hanm@apache.org>

Closes apache#934 from maoling/ZOOKEEPER-3301
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants