
enhancement request #8

Closed
bayareagreg opened this issue Dec 30, 2016 · 27 comments
@bayareagreg

Would it be possible to enhance the Bucket interface (or at least the GridBucket impl) to add a method
that tries (without waiting) to consume up to N tokens at once and returns:

  1. the number of tokens actually consumed, AND
  2. the expiration date associated with these tokens.

Thanks. I can explain my use case if needed.
Greg
@vladimir-bukhtoyarov
Collaborator

Yes, it is possible. Moreover, it looks like I have already added this sort of method to the Bucket interface in the 2.0 branch. Please review the javadocs for the interface to be sure that all necessary features are planned.

Unfortunately, I have no plans to continue development in the 1.0 branch, and because of time limitations I am not able to release 2.0 immediately, so the 2.0 version will be released at the end of January 2017. If you need these methods ASAP, try to prepare a pull request for the 1.0 branch yourself.

@bayareagreg
Author

Vladimir, I can wait another month, no problem. However I do not see the method I need in the 2.0 proposed interface. Would you please clarify which method will address what I asked for? Thanks in advance. Greg

@vladimir-bukhtoyarov
Collaborator

vladimir-bukhtoyarov commented Dec 30, 2016

  1. It will be possible to configure statistics and record each consumed token: https://github.com/vladimir-bukhtoyarov/bucket4j/blob/2.0/src/main/java/com/github/bucket4j/statistic/BucketStatistic.java#L9

  2. The second point, "the expiration date associated with these tokens", needs to be explained; I do not understand it.

@bayareagreg
Author

No, that's not quite what I need. Let's try again.

I am looking for something like:

Pair<Long, Long> tryConsume(long numTokens)

The semantics are as follows: try to consume up to numTokens without any waiting, and return how many tokens could be consumed along with how long those tokens are valid for.

For example, say the bucket configuration is 5 tokens per minute. At time t0 the bucket was started. At time t0 + 55 seconds, we call tryConsume(10). The return value should be Pair of <5, t0 + 1 minute>
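Those desired semantics could be sketched like this. Note that the `ConsumptionResult` type and `tryConsumeUpTo` method are purely illustrative, not part of Bucket4j, and the sketch assumes the discrete one-refill-per-period reading used in the example:

```java
// Illustrative sketch of the requested semantics; not part of the Bucket4j API.
// Assumes a bucket that is fully refilled at the end of each period, so all
// tokens issued during a period expire when the period ends.
public final class TryConsumeSketch {

    /** Result pair: how many tokens were consumed and until when they are valid. */
    public record ConsumptionResult(long consumed, long validUntilMillis) {}

    /**
     * Try to consume up to {@code requested} tokens without waiting.
     * {@code available} is the current token count, {@code periodEndMillis}
     * the moment the current refill period ends.
     */
    public static ConsumptionResult tryConsumeUpTo(long requested, long available,
                                                   long periodEndMillis) {
        long consumed = Math.min(requested, available);
        return new ConsumptionResult(consumed, periodEndMillis);
    }

    public static void main(String[] args) {
        // Bucket of 5 tokens/minute started at t0; at t0 + 55s we request 10 tokens.
        long t0 = 0;
        ConsumptionResult r = tryConsumeUpTo(10, 5, t0 + 60_000);
        // Only 5 tokens are consumed, valid until t0 + 1 minute.
        System.out.println(r.consumed() + " tokens valid until t0+" + r.validUntilMillis() + "ms");
    }
}
```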

Hopefully this is a little clearer. As to why I need this, that's a separate question, I can explain that too if you like.
Thanks
Greg

@bayareagreg
Author

Let me explain why I need this. I am trying to use bucket4j library in a very distributed environment. That is, at any given time I may have multiple JVMs running concurrently and all of them sharing one “bucket”. Each JVM is essentially running a servlet container where the requests need to be rate-limited. So each request requires a check against the rate limit.
Due to current constraints out of my control, we do not use Coherence, Ignite or Hazelcast to store shared state. We need to use etcd.

I have implemented a GridProxy for etcd. It works, but it is very expensive to use: every execute() call results in 3 round trips to the etcd server (acquire a distributed lock, get state, update state). This overhead is simply not acceptable for production. So I have thought of a way to cut down on the overhead by adding another level of caching on top of what Bucket offers. Instead of getting one token at a time, I will get 10 at a time, so the next 9 requests will not need to go to the etcd server, provided those 9 tokens are still valid at the time each request comes in.

Something like this:

class MyRateLimiter
{
    private final Bucket _bucket;
    private final EtcdProxy _proxy;
    private final ManualBucket _manual;

    public MyRateLimiter()
    {
        _proxy = new EtcdProxy();
        _bucket = Buckets.withMillisTimePrecision().
                withLimitedBandwidth(5000, TimeUnit.MINUTES, 1).  // 5000 messages per 1 minute
                buildCustomGrid(_proxy);
        _manual = new ManualBucket();
    }

    public synchronized boolean tryConsumeOne()
    {
        if (_manual.remove())
        {
            return true;
        }
        else
        {
            // hypothetical new API: returns <tokens consumed, expiration time>
            Pair<Long, Long> pair = _bucket.someNewTryConsumeMethod(10);
            if (pair.getFirst() > 0)
            {
                _manual.add(pair.getFirst() - 1, pair.getSecond());
                return true;
            }
        }
        return false;
    }

    private static final class ManualBucket
    {
        private long _numTokens;
        private long _expiration;

        public void add(long numTokens, long expiration)
        {
            if (_numTokens != 0)
            {
                throw new IllegalStateException();
            }
            _numTokens = numTokens;
            _expiration = expiration;
        }

        public boolean remove()
        {
            if (_numTokens > 0)
            {
                if (System.currentTimeMillis() <= _expiration)
                {
                    _numTokens--;
                    return true;
                }
                else
                {
                    _numTokens = 0;
                }
            }
            return false;
        }
    }
}

So only 1 out of every 10 calls actually needs to go to the “grid”. Obviously, the trade-off is that the rate limiter is no longer 100% accurate: some tokens might go unused, while another JVM might not get all the tokens it needs (up to 10), but this is acceptable to us.

The only thing I am missing here is an API that lets me try to consume up to, say, 10 tokens without blocking, and also, once I do get these tokens, to know how long they’re valid for.

Thanks
Greg

@vladimir-bukhtoyarov
Collaborator

vladimir-bukhtoyarov commented Dec 31, 2016

Ok, now I clearly understand your problem. It is true that without introducing an "expiration" concept, tokens can be over-consumed in a naive scenario like the following:

public static int BATCH_SIZE = 10;

private final AtomicLong _localTokens;
private final EtcdProxy _proxy;
private final Bucket _bucket;

public MyRateLimiter()
{
    _localTokens = new AtomicLong(0);
    _proxy = new EtcdProxy();
    _bucket = Buckets.withMillisTimePrecision().
            withLimitedBandwidth(5000, TimeUnit.MINUTES, 1).  // 5000 messages per 1 minute
            buildCustomGrid(_proxy);
}

public synchronized boolean tryConsumeOne()
{
    if (_localTokens.get() > 0)
    {
        _localTokens.decrementAndGet();
        return true;
    }
    else
    {
        long issuedTokens = _bucket.consumeAsMuchAsPossible(BATCH_SIZE);
        if (issuedTokens == 0)
        {
            // there are no tokens in the remote bucket
            return false;
        }
        else
        {
            _localTokens.addAndGet(issuedTokens - 1); // 1 token consumed by current request
            return true;
        }
    }
}

In this scenario the distributed limit can be over-consumed by N * (BATCH_SIZE - 1), where N is the number of JVMs in the cluster.
However, is this really critical for you? As I can see, your current BATCH_SIZE is 10, which is dramatically less than the bucket capacity of 5000. For example, if there are 10 JVMs in your cluster, then in the worst case the limit can be broken by at most 1.8%. And let's discuss the probability of the worst case. To reach it, the following things would have to happen:

  • The cluster is initialized and 10 servlet containers are started.
  • Each servlet container processes one request; as a result 100 tokens are consumed from the etcd bucket, and each servlet container holds 10 - 1 = 9 tokens in memory, so 10 containers hold 90 tokens.
  • The system is unused for the time required to refill 100 tokens, for example 2 seconds.
  • An abnormal request storm begins, and at the same time the request distribution is equal across containers.
  • As a result of the storm, the 10 containers consume 5000 tokens from the etcd bucket plus the 90 tokens stored in local memory, so the bucket limit is broken by 90/5000 = 1.8%.

Before going forward and spending time on an API extension, I want to be sure that the described scenario is really critical for you. In my opinion, if BATCH_SIZE is dramatically less than the bucket capacity, then we can ignore the scenario described above. Also, I strongly recommend reading this discussion, especially the last comment in the issue. The problem raised in #2 looks harder than what we are discussing in this thread.
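For reference, the worst-case arithmetic from the scenario above can be checked mechanically (an illustrative helper, not part of any API; numbers are taken from the comment):

```java
// Verifies the worst-case over-consumption bound discussed above:
// each of N nodes can hold at most (BATCH_SIZE - 1) locally cached tokens.
public final class OverconsumptionBound {

    /** Maximum tokens consumed beyond the shared limit, as a percentage of capacity. */
    public static double worstCasePercent(int clusterSize, int batchSize, long capacity) {
        long extraTokens = (long) clusterSize * (batchSize - 1);
        return 100.0 * extraTokens / capacity;
    }

    public static void main(String[] args) {
        // 10 JVMs, batch size 10, capacity 5000 tokens/minute -> 90/5000 = 1.8%
        System.out.println(worstCasePercent(10, 10, 5000) + "%");
    }
}
```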

@bayareagreg
Author

Vladimir,

  1. 5000 is just an example number I happen to be testing with right now. The real limits will be specified by our users. It could be 20 or it could be 20000.
  2. Is consumeAsMuchAsPossible() not blocking? I did not realize that. I thought all consumeXYZ() APIs were blocking until tokens are available and tryConsumeXYZ(...) were not. In my domain, I need to reject requests with 429 Too Many Requests rather than wait.
  3. In your code snippet, locally cached tokens are valid forever. In other words, a token obtained from the remote grid at time t0 would still be usable at time t0 + 1 hour regardless of bucket configuration. That does not work for us. Erring on the side of "over the limit" is not acceptable to us. Erring (a little bit) under the limit is acceptable.

Greg

@bayareagreg
Author

I guess my #3 is what you meant by "over consumed". Again, that is not acceptable in our domain.

@vladimir-bukhtoyarov
Collaborator

consumeAsMuchAsPossible is non-blocking. If there are no tokens in the bucket, it will return zero.

@vladimir-bukhtoyarov
Collaborator

vladimir-bukhtoyarov commented Dec 31, 2016

Anyway, if any kind of "over the limit" consumption is unacceptable, then you should not use any solution based on the leaky/token bucket at all, because of bursts. As explicitly described on the Wikipedia page for the leaky bucket, you should use a leaky bucket only if you accept that bursts can sometimes happen. In the classic algorithm the burst equals the capacity. In your case, in addition to the classic burst, there is yet another burst whose value, as described above, equals CLUSTER_SIZE * (BATCH_SIZE - 1) (let's call it the batch-burst). I would argue that we can neglect the batch-burst: if you accept the classic burst, then you should accept the batch-burst.

@bayareagreg
Copy link
Author

Vladimir,
I guess I disagree with your logic.
I didn't really say bursts are not allowed. However, this "batch-burst", as you call it, is in my opinion not a "real" burst. It's not backed by real demand; it is simply a side effect of our workaround. I do not agree that we can ignore the "batch-burst". To be 100% certain, I would need to check back with my teammates (after the holidays).

Anyway, I am a little confused. Earlier you said "Before going forward, and spend time to API extension". What kind of API extension were you referring to? Originally, I asked for two things, and it looks like one of them is already there. So now the only thing missing for me is knowing how long the cached tokens are valid for. Is that not a reasonable request?
Thanks
Greg

@bayareagreg
Author

Let me add something else. I wasn't really asking for lectures on what a token/leaky bucket is or isn't. I asked for a very specific API enhancement. So either you will do it or you won't, and then I will move on to something else. Caching the tokens without any expiration dates feels all kinds of wrong to me, and I won't use such a solution. Presumably you put this library out here because you thought people might find it useful. Well, here I am, interested in using it save for one enhancement. Just decide whether you want to do it or not. Spare the lectures.
Respectfully
Greg

@vladimir-bukhtoyarov
Collaborator

I will answer tomorrow. I need to think a little bit.

@vladimir-bukhtoyarov
Collaborator

Anyway, I am a little confused. Earlier you said "Before going forward, and spend time to API extension". What kind of API extension were you referring to?

I need to figure out the proper place to put the new functionality. To design the solution correctly, I need as much information as possible.

So now I the only thing missing for me is how long the cached tokens are valid for. Is that not a reasonable request?

No. You should never ask questions in this way. Always prefer to begin from the problem you are trying to solve instead of requesting a particular implementation. This manner will preserve your time and the time of your colleagues. This is an example of a good question in this thread.

Let me add something else. I wasn't really asking fore lectures on what token/leaky bucket is or isn't. I asked for a very specific API enhancement.

Remember that you are not in McDonald's. Moreover, even in McDonald's you are only able to order a hot dog; you cannot specify the way in which hot dogs must be cooked. There are many users who already successfully use Bucket4j. If I grow my library in a monkey-patching style, the library will become too complex in a short time. I want old and new users to be happy using this library. To achieve this goal, I need to keep the API consistent and logically coupled. So I will ask as many questions as I need. Usually I know nothing about how much anybody knows about the details of the leaky bucket, and I ask the same sort of questions of everybody who asks me to implement something new.

I will answer tomorrow. I need to think a little bit.

I rethought this thread from the beginning and found a solution to the problem described in this post:

  • The "expiration" concept should not be part of the Bucket interface, because the mathematical model of the leaky/token bucket does not support this concept by design. The bucket model is responsible for answering only one question: is it possible to consume (immediately or after a delay) N tokens or not. Any attempt to introduce an expiration concept into the Bucket model will fail.
  • The bucket cannot report the concrete date when issued tokens will expire, but it can report how much time is required to refill a similar amount of tokens. The client can use this time period to implement expiration itself.
  • The major point of client-side expiration is that the process of expiration should be continuous; it should not be implemented with a single expiration date, because a single date will not work.
  • The speed of expiration of any portion of consumed tokens should be calculated by the following formula:
    EXPIRATION_SPEED = CONSUMED_TOKENS / TIME_TO_REFILL / CLUSTER_SIZE;
    where:
    CONSUMED_TOKENS - the count of tokens returned by the consumeAsMuchAsPossibleAndCalculateTimeRequiredForRefill(limit) method;
    TIME_TO_REFILL - the time required to refill CONSUMED_TOKENS; like CONSUMED_TOKENS, this is part of the response;
    CLUSTER_SIZE - the count of bucket clients in your cluster.
    Example: CLUSTER_SIZE is 5, CONSUMED_TOKENS is 10, TIME_TO_REFILL is 10 seconds.
    The EXPIRATION_SPEED will be 10 / 10 / 5 = 0.2 tokens per second.
  • The most complicated part of the implementation will be tracking the actual CLUSTER_SIZE. But if I understand correctly, etcd is comparable to Zookeeper, and this problem should be solved by etcd.
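A minimal sketch of the formula above, with values taken from the example (the `ExpirationSpeed` class is an illustrative helper, not a Bucket4j API):

```java
// Client-side expiration speed as proposed above:
// EXPIRATION_SPEED = CONSUMED_TOKENS / TIME_TO_REFILL / CLUSTER_SIZE
public final class ExpirationSpeed {

    /** Tokens that should expire per second on each cluster node. */
    public static double tokensPerSecond(long consumedTokens, double refillSeconds,
                                         int clusterSize) {
        return consumedTokens / refillSeconds / clusterSize;
    }

    public static void main(String[] args) {
        // CLUSTER_SIZE = 5, CONSUMED_TOKENS = 10, TIME_TO_REFILL = 10 seconds
        System.out.println(tokensPerSecond(10, 10.0, 5)); // 0.2 tokens per second
    }
}
```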

The resulting model guarantees that issued tokens will be used to pay for client requests, and unused tokens will be thrown away by the expiration process. Example user code could look like the following:

import com.github.bucket4j.Bucket;

import java.util.concurrent.TimeUnit;

/**
 * This example is pretty simple. For production-like usage, the synchronized blocks can be replaced by atomics.
 */
public class EtcdBucketDecorator {
    
    private final Bucket bucket;
    private final int batchSize;
    
    private int clusterSize;
    
    private double tokens;
    private double expirationSpeedTokensPerMillis = -1.0;
    private long lastRequestTimeMillis;

    public EtcdBucketDecorator(Bucket bucket, int batchSize, int clusterSize) {
        this.bucket = bucket;
        this.batchSize = batchSize;
        this.clusterSize = clusterSize;
    }

    /**
     * Call this method when any JVM which uses the bucket joins or leaves the cluster.
     *
     * @param newClusterSize the new count of JVMs sharing the bucket
     */
    synchronized public void setClusterSize(int newClusterSize) {
        this.clusterSize = newClusterSize;
    }

    synchronized boolean tryConsume() {
        expire();
        if (tokens < 1.0) {
            TokensWithEstimation issuedTokens = bucket.consumeAsMuchAsPossibleAndCalculateTimeRequiredForRefill(batchSize);
            if (issuedTokens.getCount() == 0) {
                return false;
            }
            tokens += issuedTokens.getCount();
            this.expirationSpeedTokensPerMillis = (double) issuedTokens.getCount() 
                    / TimeUnit.NANOSECONDS.toMillis(issuedTokens.getNanosToRefill())
                    / clusterSize;
            this.lastRequestTimeMillis = System.currentTimeMillis();
        }
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        } else {
            return false;
        }
    }

    private void expire() {
        if (expirationSpeedTokensPerMillis == -1.0 || tokens == 0.0) {
            return;
        }
        long nowMillis = System.currentTimeMillis();
        long millisSinceLastRequest = nowMillis - lastRequestTimeMillis;
        this.tokens = Math.max(0.0, tokens - expirationSpeedTokensPerMillis * millisSinceLastRequest);
        this.lastRequestTimeMillis = nowMillis;
    }

}

@bayareagreg
Author

Thank you for your time

@bayareagreg
Author

Would it be possible to add this new method consumeAsMuchAsPossibleAndCalculateTimeRequiredForRefill(limit) to the Bucket 2.0 interface?

@vladimir-bukhtoyarov
Collaborator

Yes, I will add it. I decided to release a 1.1 version to close this issue and #7; these 2 tasks are very simple to close. So you should not worry about the scope of 2.0.

@bayareagreg
Author

Thank you. We are still discussing whether a token bucket is the right solution for us. There is some debate on my team about what exactly it means when I say "rate limit of 5 tokens/minute". The majority say it does not mean 1 token per 12 seconds; it means you should get 5 fresh tokens every minute. It sounds to me like in that case a token bucket, as described on the wiki, would not be appropriate.

@vladimir-bukhtoyarov
Collaborator

vladimir-bukhtoyarov commented Jan 4, 2017

1 token per 12 seconds

RateLimiter from the Guava library works in the same manner. I think it is useless behaviour, but to achieve this behaviour with Bucket4j you can configure the bandwidth as exactly 1 token per 12 seconds instead of 5 tokens per minute.

It means you should get 5 fresh tokens every minute.

FixedIntervalRefillStrategy, created by @bbeck, works in this manner.

Bucket4j works like a token bucket, but its refill strategy is smooth. I will try to describe how the "smooth" strategy works:

  • Imagine that you instantiated a bucket at t0 with a bandwidth of 10 tokens per minute.
  • At t0 + 10 seconds you consume 9 tokens, so 1 token remains.
  • At t0 + 16 seconds you try to consume 2 tokens, and the operation succeeds, because 1 token remained and 1 token was regenerated by the smooth refill strategy. In other words, you do not have to wait until t0 + 1 minute to consume 2 tokens; the refill is continuous. With FixedIntervalRefillStrategy, by contrast, consuming 2 tokens would fail, because nothing is added to the bucket until the whole interval has elapsed; that is the difference between smooth and discrete refill.

In my opinion, the smooth refill implemented in Bucket4j provides a better user experience than discrete refill. But sometimes users ask for discrete refill, and I will provide this functionality as a first-class citizen starting with the 2.0 version.
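The smooth-refill timeline above can be sketched as a simplified model (this illustrates the arithmetic only, not the actual Bucket4j implementation):

```java
// Simplified model of the smooth (continuous) refill described above:
// tokens regenerate at a constant rate instead of in whole-period steps.
public final class SmoothRefillModel {

    /**
     * Tokens available after secondsElapsed of refill, starting from
     * currentTokens, capped at the bucket capacity.
     */
    public static double available(double currentTokens, double capacity,
                                   double tokensPerSecond, double secondsElapsed) {
        return Math.min(capacity, currentTokens + tokensPerSecond * secondsElapsed);
    }

    public static void main(String[] args) {
        double rate = 10.0 / 60.0;  // bandwidth: 10 tokens per minute
        // t0 + 10s: the bucket is still full (refill is capped), consume 9 tokens
        double tokens = available(10.0, 10.0, rate, 10.0) - 9.0;  // 1 token left
        // t0 + 16s: 6 more seconds of smooth refill regenerate ~1 token,
        // so consuming 2 tokens succeeds; with discrete refill it would fail.
        System.out.println(available(tokens, 10.0, rate, 6.0));   // approximately 2.0
    }
}
```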

@bayareagreg
Author

Hi Vladimir

two months ago you said

Yes, I will add it. I decided to release 1.1 version to close this and this #7 issues,
these 2 tasks are very simple to close

I am looking at 1.1 Bucket interface

https://github.com/vladimir-bukhtoyarov/bucket4j/blob/1.1/src/main/java/com/github/bucket4j/Bucket.java
and I don't see it. Why?

Thanks
Greg

@vladimir-bukhtoyarov
Collaborator

@bayareagreg

I am absolutely busy at the moment. But you should not wait for the release of diagnostic features. Just implement the prediction on your side. It should be easy: for example, with a bandwidth of 100 tokens per second, it is obvious that 10 tokens should be expired in 100 milliseconds, 1 token in 10 milliseconds, etc. There is nothing hard here.
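The suggested client-side prediction amounts to a one-line calculation (an illustrative helper, not a Bucket4j API):

```java
// Client-side estimate of refill/expiration time, as suggested above:
// with a bandwidth of N tokens per second, K tokens correspond to K/N seconds.
public final class RefillEstimate {

    /** Milliseconds needed to refill {@code tokens} at {@code tokensPerSecond}. */
    public static long millisToRefill(long tokens, long tokensPerSecond) {
        return tokens * 1000 / tokensPerSecond;
    }

    public static void main(String[] args) {
        // 100 tokens/second bandwidth: 10 tokens in 100 ms, 1 token in 10 ms.
        System.out.println(millisToRefill(10, 100)); // 100
        System.out.println(millisToRefill(1, 100));  // 10
    }
}
```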

@vladimir-bukhtoyarov
Collaborator

A workaround exists; there is no need to change anything.

@bayareagreg
Author

Vladimir,
I'd like to ask a follow-up question about this. Do you have time for an answer?
Thanks
Greg

@vladimir-bukhtoyarov
Collaborator

sure, I can answer

@bayareagreg
Author

Thank you

My question is this:
You might have to go back a few months to review this. Back in January you stated
"The speed of expiration of any portion of consumed tokens should be calculated by the following formula:
EXPIRATION_SPEED = CONSUMED_TOKENS / TIME_TO_REFILL / CLUSTER_SIZE;
where:
CONSUMED_TOKENS - the count of tokens returned by the consumeAsMuchAsPossibleAndCalculateTimeRequiredForRefill(limit) method;
TIME_TO_REFILL - the time required to refill CONSUMED_TOKENS; like CONSUMED_TOKENS, this is part of the response;
CLUSTER_SIZE - the count of bucket clients in your cluster."

I have been working with this problem for a little while, and to be honest, I am having trouble seeing why CLUSTER_SIZE has anything to do with the expiration speed/rate. Why should there be a difference between the expiration rate of a single client that gets a batch of 10 tokens 5 times and that of 5 clients that each get 10 tokens once? It would be great if you could elaborate on this, or better yet give a real example that illustrates the need to factor CLUSTER_SIZE into the calculation of the expiration rate.

Thank you in advance
Greg

@vladimir-bukhtoyarov
Collaborator

vladimir-bukhtoyarov commented Apr 20, 2017

It is simple. Imagine a limitation of 100 tokens per minute, a batch size of 10 tokens, and a cluster of 10 nodes. If the expiration rate were constant (1.67 tokens/second), the following problem would happen:

  • At time X, one request comes to each cluster node; each node acquires 10 tokens from the shared bucket and uses 1 token, so we have 0 tokens in the bucket and 90 tokens cached on the nodes.
  • There is no activity for 6 seconds, so at X + 6 seconds we have 10 tokens in the shared bucket (because of refill) and 0 tokens on each node (because they expired at the constant rate).
  • At X + 6.001 seconds one request comes to node-1, which acquires 10 tokens from the shared bucket, so the shared bucket has 0 tokens from this moment. At X + 6.002 seconds one request comes to node-2, and this request has to be rejected because neither node-2 nor the shared bucket has any tokens. So you wanted to provide 100 tokens per minute, but the actual achieved throughput is dramatically lower.

To prevent the under-consumption problem above, the speed of expiration must be controlled so that it equals the speed of regeneration. According to the configuration above, for the case of 10 client nodes, the expiration speed on each client node should be 1/10 of the token regeneration speed.
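A minimal sketch of that correction, with values from the example (illustrative only):

```java
// Checks that the per-node expiration speed, scaled by cluster size, matches
// the shared bucket's regeneration speed, preventing the under-consumption
// scenario described above.
public final class ExpirationVsRegeneration {

    /** Per-node expiration speed in tokens per second. */
    public static double perNodeExpirationSpeed(double regenerationPerSecond, int clusterSize) {
        return regenerationPerSecond / clusterSize;
    }

    public static void main(String[] args) {
        double regen = 100.0 / 60.0;  // 100 tokens per minute as tokens/second
        // Each of the 10 nodes expires at 1/10 of the regeneration speed,
        // so the aggregate expiration across the cluster equals regeneration.
        System.out.println(perNodeExpirationSpeed(regen, 10));
    }
}
```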

@bayareagreg
Author

Yes you are absolutely correct. Thank you
