Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADR] 53. Email rate limiting #833

Merged
merged 10 commits into from Jan 24, 2022
2 changes: 1 addition & 1 deletion src/adr/0050-oidc.md → src/adr/0051-oidc.md
@@ -1,4 +1,4 @@
# 50. Integrate James with OIDC
# 51. Integrate James with OIDC

Date: 2021-12-06

Expand Down
70 changes: 70 additions & 0 deletions src/adr/0053-email-rate-limiting.md
@@ -0,0 +1,70 @@
# 53. Email rate limiting

Date: 2021-01-10

## Status

Accepted (lazy consensus).

Not yet implemented.

## Context

Rate limiting is one of the common features expected from an email system. Examples: SaaS is
one https://www.fastmail.help/hc/en-us/articles/1500000277382-Account-limits#sending/receiving
, https://support.google.com/mail/answer/22839

They limit how many emails users can send/receive from/to each email account over a given period of time.
We believe the rate-limiting will help James to have more benefits:

- Control of the resources
- Easy to configure dynamically the user policy.
- Complements the storage quota
- Can be a security requirement for SaaS deployments.
- Minimise impacts of Open-relay types of compression.
- Limiting the amount of emails sent to third parties can also prevent them from considering you as an open relay and can
be beneficial as well.

## Decision

Set up a new maven project dedicated to rating limiting. This allows the rate limiting mailets to be embedded in a James
server as a soft dependency using the external-jar loading mechanism. Please note that this will take the form of an
extension jar, that could be dropped in one's James installation, and thus is optional, and not a runtime dependency.

Rate limiting will be enabled per sender, per receiver and globally. For each we will provide options for email size and
email count.

- This can be done via mailets:
- PerSenderRateLimit is per sender
- PerRecipientRateLimit is per recipient
- GlobalRateLimit is for everyone. Depending on the position in the pipeline this could allow rate limiting all emails in
transit, relayed emails or locally delivered emails.
The rate limit will be configured
in [mailetcontainer.xml](/server/apps/distributed-app/sample-configuration/mailetcontainer.xml).

- Those mailets will be based on a generic RateLimiter interface. We will propose two implementations for it:
- In memory (guava based) suitable for single instance deployments
- [Redis](https://redis.io) based, suitable for distributed deployments.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't Cassandra good enough? or you really want non-persistent data storage to have better performance? What about using Pulsar or Rabbit to propagate counters and use an in-memory datastructure on each node?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You really want non-persistent data storage to have better performance?

Yes,
More point, with Redis

  • we don't need to revoke expired data manually. All counters will be removed after the end of the period (easy to reset counter to zero).
  • Redis support atomic API "getAndIncrement" counters

What about using Pulsar or Rabbit to propagate counters and use an in-memory datastructure on each node?

I don't sure about Pulsar (I don't use it before).
But with RabbitMQ, IMO, It does not guarantee the order of messages -> inconsistency counter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Using a message queue will increase the delay of counter more than using a central database

Copy link
Contributor

@chibenwa chibenwa Jan 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't Cassandra good enough?

Cassandra counters are terrible. Actually I want to try alternatives.

or you really want non-persistent data storage to have better performance?

Yes here the tradeoffs of loosing state IMO is outwaited by better performance.

What about using Pulsar or Rabbit to propagate counters and use an in-memory datastructure on each node?

Others are free not to use it and come up with imaginative solutions to implement this.

@vttranlina that's nonetheless interesting remarks, we could have an alternative section explaining that 1. we do not use Cassandra and its counters for performance reasons nor a Cassandra time serie (because it seems complex) and want to avoid the hammer-nail syndrom.

We can then explain that streaming based systems could be used - but might have implications in term of scalability and implementation complexity. Of course our choice should not be exclusive of alternative solutions. We can show that by splitting the rate limiting maven plugin in two: one for the RateLimiter API and the mailet tooling, one for the Redis implementation....

Would this sound fair to you @mbaechler ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Anyway, I was just curious about the Redis choice, I have a very bad experience performance-wise with Redis.
Cassandra is probably not better for that.
And I think that an in-memory solution fits the need for the reason you described, too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just curious about the Redis choice

So do I.

Honnestly if we get a bad experience too we will try data-grids out ;-)


The implementation chosen will be configurable as part of mailet configuration. One would be able to configure the
implementation he wishes to use.

- We will document such a setup, and provide a mailetcontainer sample file.

## Consequences

- When having a change in the rate limit configuration, we need to restart James.
- Only protocols allowing to submit emails make sense here so SMTP and JMAP.
- It is more than acceptable to lose all redis data, which is equivalent to resetting the rate limiting.
Arsnael marked this conversation as resolved.
Show resolved Hide resolved

vttranlina marked this conversation as resolved.
Show resolved Hide resolved
## Alternatives

Alternatives implementation of the rate limiter can be proposed, and used within the aforementionned mailet.

For instance one could rely on Cassandra counters and Cassandra time series (thus not needing additional dependencies) however we fear the potential performance impact doing so. Streaming based options, that aggregate in memory counters, might be a viable option too.

## References

- [JIRA](https://issues.apache.org/jira/browse/JAMES-3693)
- [Redis driver](https://github.com/lettuce-io/lettuce-core#reactive-api)
- [Guava rate limiter](https://guava.dev/releases/19.0/api/docs/index.html?com/google/common/util/concurrent/RateLimiter.html)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read the Guava Rate limiter docs and I do not think its API support rate-limiting by other time units than Second directly. Should we consider other libraries?
I found this library: https://github.com/mokies/ratelimitj
It provides 2 implementations: Inmemory and Redis. The interesting thing is its Redis implementation uses lettuce-io reactive Redis client. So IMO we could benefit it reducing some Redis client code and counters implementation also.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Quan!

This is a very good comment!

I read the Guava Rate limiter docs and I do not think its API support rate-limiting by other time units than Second directly.

Yes this could be problematic as saying "I allow you 100 mails per hour" is not the same as "I allow you one mail every 36 seconds". Also the Guava rate limiter is blocking which is not really fancy.

Should we consider other libraries?

We should ALWAYS look for alternatives!

It provides 2 implementations: Inmemory and Redis. The interesting thing is its Redis implementation uses lettuce-io reactive Redis client. So IMO we could benefit it reducing some Redis client code and counters implementation also.

This is of course very interesting!

Yet there is one key point to me: We need a James interface in plain Java (or Scala) fully abstracting away the RateLimitJ that would then just be an implementation detail (for memory AND redis implementations).

Also the concept of "batching rules together" is interesting. We should definitly add this concept to our rate limiter API as well

(

RedisRateLimiterFactory factory = new RedisRateLimiterFactory(RedisClient.create("redis://localhost"));
Set<RequestLimitRule> rules = new HashSet<>();
rules.add(RequestLimitRule.of(Duration.ofSeconds(1), 10));
rules.add(RequestLimitRule.of(Duration.ofSeconds(3600), 240).withPrecision(60));
ReactiveRequestRateLimiter requestRateLimiter = factory.getInstanceReactive(rules);
Mono<Boolean> overLimit = requestRateLimiter.overLimitWhenIncrementedReactive("ip:127.0.1.6");

)

@quantranhong1999 can you propose changes to this ADR following your remarks?

Copy link
Contributor

@chibenwa chibenwa Jan 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, you propose an external dependency.

Can I ask you to do a quick assessment regarding the health of that community?

-> License ?
-> Are there recent releases ? Are the release frequent ?
-> Code activity ? Release cycles ?
-> Companies behind this ?
-> Are PR and Issues actively / passively managed ?
-> Are there some forums/chats where we can ask questions if we have issues ? Do people receive answers ?
-> Are there several maintainers or is it more or less one guy ?

(Ultimately the question is will we save more energy on the long run depending on this dependency?)

(and yes, we should do this assessment every time we add a dependency)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

License ?

Apache License 2.0

Are there recent releases ? Are the release frequent ?

The latest release is May 2021.

Code activity ? Release cycles ?

The latest code update is 8 months ago and none recently.
image
It looks like the release cycles is about 6 months to 1 year.

Are PR and Issues actively / passively managed ?

I see some issues without answers recently.

Are there some forums/chats where we can ask questions if we have issues ? Do people receive answers ?

I do not see anything but Github.

Are there several maintainers or is it more or less one guy ?

There are 10 contributors but mostly Craig Baker is the main author.
He is the Chief architect at Afterpay (a quite big fintech in Australia). I think it is his personal project other than being backed by his company.

It has 413 stars so I think it is good enough. IMO it may be good enough for us in near future but it is a risk for the long future because it likely has 1 main contributor.