Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADR] 53. Email rate limiting #833

Merged
merged 10 commits into from Jan 24, 2022

Conversation

vttranlina
Copy link
Contributor

No description provided.

src/adr/0051-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0051-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0051-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0051-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0051-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0051-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0051-email-rate-limiting.md Outdated Show resolved Hide resolved
Copy link
Contributor

@chibenwa chibenwa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use number 52 not to have conflicts with the Pulsar PR...

src/adr/0051-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0051-email-rate-limiting.md Outdated Show resolved Hide resolved
@vttranlina vttranlina marked this pull request as ready for review January 10, 2022 10:42
@vttranlina vttranlina changed the title [WIP] [ADR] Email rate limiting [ADR] 53. Email rate limiting Jan 10, 2022
vttranlina and others added 2 commits January 10, 2022 17:47
Co-authored-by: Benoit TELLIER <btellier@linagora.com>
Co-authored-by: Benoit TELLIER <btellier@linagora.com>
Copy link
Contributor

@chibenwa chibenwa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Thanks for proposing this.

Copy link
Contributor

@Arsnael Arsnael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just typo remarks, but good job overall :)

src/adr/0053-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0053-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0053-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0053-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0053-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0053-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0053-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0053-email-rate-limiting.md Outdated Show resolved Hide resolved
src/adr/0053-email-rate-limiting.md Show resolved Hide resolved
Co-authored-by: Rene Cordier <rene.cordier@gmail.com>

- Those mailets will be based on a generic RateLimiter interface. We will propose two implementations for it:
- In memory (guava based) suitable for single instance deployments
- [Redis](https://redis.io) based, suitable for distributed deployments.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't Cassandra good enough? or you really want non-persistent data storage to have better performance? What about using Pulsar or Rabbit to propagate counters and use an in-memory datastructure on each node?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You really want non-persistent data storage to have better performance?

Yes,
More point, with Redis

  • we don't need to revoke expired data manually. All counters will be removed after the end of the period (easy to reset counter to zero).
  • Redis support atomic API "getAndIncrement" counters

What about using Pulsar or Rabbit to propagate counters and use an in-memory datastructure on each node?

I don't sure about Pulsar (I don't use it before).
But with RabbitMQ, IMO, It does not guarantee the order of messages -> inconsistency counter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Using a message queue will increase the delay of counter more than using a central database

Copy link
Contributor

@chibenwa chibenwa Jan 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't Cassandra good enough?

Cassandra counters are terrible. Actually I want to try alternatives.

or you really want non-persistent data storage to have better performance?

Yes here the tradeoffs of loosing state IMO is outwaited by better performance.

What about using Pulsar or Rabbit to propagate counters and use an in-memory datastructure on each node?

Others are free not to use it and come up with imaginative solutions to implement this.

@vttranlina that's nonetheless interesting remarks, we could have an alternative section explaining that 1. we do not use Cassandra and its counters for performance reasons nor a Cassandra time serie (because it seems complex) and want to avoid the hammer-nail syndrom.

We can then explain that streaming based systems could be used - but might have implications in term of scalability and implementation complexity. Of course our choice should not be exclusive of alternative solutions. We can show that by splitting the rate limiting maven plugin in two: one for the RateLimiter API and the mailet tooling, one for the Redis implementation....

Would this sound fair to you @mbaechler ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Anyway, I was just curious about the Redis choice, I have a very bad experience performance-wise with Redis.
Cassandra is probably not better for that.
And I think that an in-memory solution fits the need for the reason you described, too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just curious about the Redis choice

So do I.

Honnestly if we get a bad experience too we will try data-grids out ;-)

Co-authored-by: Benoit TELLIER <btellier@linagora.com>

- [JIRA](https://issues.apache.org/jira/browse/JAMES-3693)
- [Redis driver](https://github.com/lettuce-io/lettuce-core#reactive-api)
- [Guava rate limiter](https://guava.dev/releases/19.0/api/docs/index.html?com/google/common/util/concurrent/RateLimiter.html)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read the Guava Rate limiter docs and I do not think its API support rate-limiting by other time units than Second directly. Should we consider other libraries?
I found this library: https://github.com/mokies/ratelimitj
It provides 2 implementations: Inmemory and Redis. The interesting thing is its Redis implementation uses lettuce-io reactive Redis client. So IMO we could benefit it reducing some Redis client code and counters implementation also.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Quan!

This is a very good comment!

I read the Guava Rate limiter docs and I do not think its API support rate-limiting by other time units than Second directly.

Yes this could be problematic as saying "I allow you 100 mails per hour" is not the same as "I allow you one mail every 36 seconds". Also the Guava rate limiter is blocking which is not really fancy.

Should we consider other libraries?

We should ALWAYS look for alternatives!

It provides 2 implementations: Inmemory and Redis. The interesting thing is its Redis implementation uses lettuce-io reactive Redis client. So IMO we could benefit it reducing some Redis client code and counters implementation also.

This is of course very interesting!

Yet there is one key point to me: We need a James interface in plain Java (or Scala) fully abstracting away the RateLimitJ that would then just be an implementation detail (for memory AND redis implementations).

Also the concept of "batching rules together" is interesting. We should definitly add this concept to our rate limiter API as well

(

RedisRateLimiterFactory factory = new RedisRateLimiterFactory(RedisClient.create("redis://localhost"));
Set<RequestLimitRule> rules = new HashSet<>();
rules.add(RequestLimitRule.of(Duration.ofSeconds(1), 10));
rules.add(RequestLimitRule.of(Duration.ofSeconds(3600), 240).withPrecision(60));
ReactiveRequestRateLimiter requestRateLimiter = factory.getInstanceReactive(rules);
Mono<Boolean> overLimit = requestRateLimiter.overLimitWhenIncrementedReactive("ip:127.0.1.6");

)

@quantranhong1999 can you propose changes to this ADR following your remarks?

Copy link
Contributor

@chibenwa chibenwa Jan 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, you propose an external dependency.

Can I ask you to do a quick assessment regarding the health of that community?

-> License ?
-> Are there recent releases ? Are the release frequent ?
-> Code activity ? Release cycles ?
-> Companies behind this ?
-> Are PR and Issues actively / passively managed ?
-> Are there some forums/chats where we can ask questions if we have issues ? Do people receive answers ?
-> Are there several maintainers or is it more or less one guy ?

(Ultimately the question is will we save more energy on the long run depending on this dependency?)

(and yes, we should do this assessment every time we add a dependency)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

License ?

Apache License 2.0

Are there recent releases ? Are the release frequent ?

The latest release is May 2021.

Code activity ? Release cycles ?

The latest code update is 8 months ago and none recently.
image
It looks like the release cycles is about 6 months to 1 year.

Are PR and Issues actively / passively managed ?

I see some issues without answers recently.

Are there some forums/chats where we can ask questions if we have issues ? Do people receive answers ?

I do not see anything but Github.

Are there several maintainers or is it more or less one guy ?

There are 10 contributors but mostly Craig Baker is the main author.
He is the Chief architect at Afterpay (a quite big fintech in Australia). I think it is his personal project other than being backed by his company.

It has 413 stars so I think it is good enough. IMO it may be good enough for us in near future but it is a risk for the long future because it likely has 1 main contributor.

@chibenwa chibenwa merged commit 5d1b762 into apache:master Jan 24, 2022
@vttranlina vttranlina deleted the ADR_email_rate_limiting branch June 29, 2023 05:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants