Add Data Transfer Limits for each connection, configurable by vhost #88

Merged: 1 commit merged into bloomberg:main on Jun 30, 2022

Conversation

adamncasey (Contributor) commented on Jun 10, 2022

This commit adds support for limiting the amount of data that can be
read from the client-side (ingress) socket in a one-second period.

The aim is to prevent a single connection from overwhelming the proxy,
while also providing an extra tool for reining in an out-of-control vhost,
giving both RabbitMQ and amqpprox time to recover.

These limits are configurable using:

```
amqpprox_ctl ... LIMIT DATA_RATE DEFAULT 500000000        # About 500MB per second limit per connection on all vhosts
amqpprox_ctl ... LIMIT DATA_RATE VHOST vhostname 5000000  # Restrict 'vhostname' down to about 5MB/s
```

The limits are enforced by counting the number of bytes read from the
socket during each read operation, and pausing for one second before
starting a read operation if the connection is at the limit.

This might transfer a bit more data than the precise limit, since amqpprox
always passes the whole read buffer to the server after a read operation, which
technically could overshoot the limit; the next read operation then pauses
until the next 'tick' (up to one second). In testing, the data transferred per
second stays roughly at the configured limit, so this isn't a huge problem.
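
A minimal sketch of the counter this describes (illustrative only; apart from `remainingQuota()` and `getQuota()`, which appear in the diff, the names here are assumptions rather than the exact code in this PR):

```
#include <cstddef>

// Sketch of a per-connection data-rate counter along the lines described
// above: bytes read are recorded against a per-second quota, and a
// once-per-second timer tick resets the remaining quota to the configured
// limit. The read loop checks remainingQuota() before starting each read
// and, if it is zero, waits for the next tick before reading again.
class DataRateLimitSketch {
    std::size_t d_quota     = 0;  // configured bytes-per-second limit
    std::size_t d_remaining = 0;  // bytes still allowed in the current tick

  public:
    void setQuota(std::size_t quota)
    {
        d_quota     = quota;
        d_remaining = quota;
    }

    std::size_t getQuota() const { return d_quota; }

    std::size_t remainingQuota() const { return d_remaining; }

    // Called after each socket read with the number of bytes consumed.
    void recordUsage(std::size_t bytes)
    {
        d_remaining = (bytes >= d_remaining) ? 0 : d_remaining - bytes;
    }

    // Called on each one-second timer tick: start a fresh quota period.
    void onTimer() { d_remaining = d_quota; }
};
```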

I tested and compared maximum data throughput before and after this
change - with no limit enabled and with a high limit enabled. No
particular differences were seen:
Data transfer rates, all measured in MiB/s:

| Run # | Without Change | With Change, No Limit | With Change, 2000 MiB/s Limit | With Change, 50 MiB/s Limit (per client) |
|-------|----------------|-----------------------|-------------------------------|------------------------------------------|
| 1     | 1058           | 1078                  | 1064                          | 497                                      |
| 2     | 1056           | 1030                  | 1060                          | 497                                      |
| 3     | 1049           | 1023                  | 1054                          | 497                                      |

Tested with amqpprox and the performance tester running on the same machine: a VM with 8x 2.5 GHz cores and 32 GB of RAM.

I ran the performance tester using these arguments:

```
$ RUST_LOG=info cargo run --release -- --address amqp://127.0.0.1:5672 --listen-address 0.0.0.0:30424 --message-size 10000000 --num-messages 500 --clients 10 --max-threads 50
```

This runs 10 parallel client threads, each transmitting ~5 GB (500 messages of ~10 MB) as fast as they can. Each test run took between ~45 seconds and ~2-3 minutes, depending on whether the rate limit was applied. Longer tests better reflect the true peak data rate achieved, since some setup/teardown time is included in the summary calculations. Using bmon I could see that most of the tests peaked at around 2.2-2.4 GiB/s, which means our results here aren't too far off that, since the number in bmon reflects the client -> proxy -> server data flow all over the same interface.

When amqpprox has a large number of connections which have hit their
limit at least once since being opened (this triggers the one second
reset interval) there is some additional CPU load from this change.

I ran the following experiment:

  1. Start amqpprox
  2. Connect 21k clients which each send a 100-byte message every minute
  3. Measure CPU usage of amqpprox using pidstat 15 5 -ru -p <pid>

In a few different scenarios:

  1. amqpprox without this change
  2. With the change, with no data rate limits configured
  3. With the change, with a low data rate limit configured (50 bytes per second)
  4. With the change, with a high data rate limit configured but with all timers triggering.
    • Triggering all timers was achieved by setting the data rate limit very low (50 B/s) for a couple of minutes, then setting it high (2 GB/s).

Results:

| CPU usage tests | Without Change | No Limit | Low limit | All connections triggering timers |
|-----------------|----------------|----------|-----------|-----------------------------------|
| %CPU sample 1   | 17.93          | 13.4     | 19.53     | 20.8                              |
| %CPU sample 2   | 15.73          | 18.4     | 12.73     | 24.27                             |
| %CPU sample 3   | 18.8           | 14.53    | 18.8      | 21.47                             |
| %CPU sample 4   | 17.73          | 18.07    | 13.33     | 23.87                             |
| %CPU sample 5   | 18.73          | 13.4     | 19.47     | 20.67                             |
| %CPU average    | 17.79          | 15.56    | 16.77     | 22.21                             |

For each scenario, amqpprox served the same setup: 21k clients, each publishing a 100-byte message every minute.

TODO:

  • Re-run CPU usage comparisons with 20k connections and put the results on this PR
  • Figure out the clang-format changes going on here - unsure why it's reformatting some code which looks fine.

```
if (d_dataRateLimit.remainingQuota() == 0) {
    if (d_dataRateTimerStarted) {
        d_dataRateTimer.cancel();
        d_dataRateTimer.expires_at(d_dataRateTimer.expiry());
```

Contributor:

This seems out of line with the description of the PR as I understood it. It seems like this is resetting the exceeded limit on the next 1-second clock tick, which could be anything in the range 0.0s-1.0s, rather than waiting for a full 1.0 second period after hitting a limit.

adamncasey (author):

Awkward. I believe I started out with waiting a whole second each time, but changed to this after failing to hit the expected performance test results (not that surprising). I just collected some data on throughput when being limited, and it looks like the method in the code is a lot more predictable - which I think is preferable here. I'll update the PR description to reflect the current behaviour - but equally, let me know if you prefer waiting the whole second.

Contributor:

No, I was not happy with the 1 second approach, so I went looking at the code. I think this approach is better, but I am still somewhat concerned about the jitter in confirm times it might introduce, and I wonder if there's a way to smooth the rate limiting out without it being too expensive to implement (it doesn't look like there's any leaky bucket or similar implementation in boost).

If we do decide to take this approach, is the cost of the timers expensive? Because if it might be, you could have a single 1-second clock and just maintain a map of the sockets that are being limited and need calling back to do the socket read on the next cycle.
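
(For illustration only, a rough sketch of that single-clock idea - not code from this PR; all names here are made up:)

```
#include <boost/asio.hpp>

#include <chrono>
#include <functional>
#include <utility>
#include <vector>

// Sketch of the suggestion above: one shared one-second timer instead of a
// timer per throttled connection. A connection that exhausts its quota
// registers a resume callback; the shared timer fires once per second and
// resumes every paused read in one pass. Assumes everything runs on a
// single io_context thread.
class SharedRateLimitClock {
    boost::asio::steady_timer          d_timer;
    std::vector<std::function<void()>> d_pendingReads;  // paused connections

  public:
    explicit SharedRateLimitClock(boost::asio::io_context &ioContext)
    : d_timer(ioContext)
    {
        scheduleTick();
    }

    // Called by a connection that has hit its limit for the current tick.
    void resumeOnNextTick(std::function<void()> resumeRead)
    {
        d_pendingReads.push_back(std::move(resumeRead));
    }

  private:
    void scheduleTick()
    {
        d_timer.expires_after(std::chrono::seconds(1));
        d_timer.async_wait([this](const boost::system::error_code &error) {
            if (error) {
                return;  // e.g. operation_aborted on shutdown
            }
            auto pending = std::move(d_pendingReads);
            d_pendingReads.clear();
            for (auto &resume : pending) {
                resume();  // restart the paused socket read
            }
            scheduleTick();
        });
    }
};
```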

adamncasey (author):

Yeah, as you say, the timers aren't free - the cost of ASIO figuring out which timers are ready does start to show in perf results - but the CPU usage benchmarks I did below seem to indicate this isn't overly expensive. Do you agree? We could switch the implementation to have a single timer which runs every second - I think that would work, it just needs a bit more thought around socket ownership and how many timers are needed to keep latency / CPU spikes to a minimum. At the moment a lower-hanging performance fruit is tidying up statistics collection when there are 20k connections (there's some perf output kicking around internally for this).

adamncasey (author) commented on Jun 20, 2022:

And I think this is everything you already know, but to make it explicit: if someone is pushing a lot of data to the broker they will see increased confirm times when they get throttled, but it's worth noting that we don't do anything to hold back broker confirms flowing to them.

We could introduce a smoother threshold to help bursty clients which would like to push (for example) a large amount of data in only one second out of every 30.

I could go for supporting something like an X limit every second, but allowing up to Y to accumulate if unused - which looks roughly like what tc exposes in one of its modes (sketched below). We did discuss something a bit more complicated than this for connection rate limits, but eventually decided to go for just a simple limit, to keep it easy for users to understand when they get rate limited.

I'm unsure how much of a difference it would make overall, though, since 1 second seems like only a medium period of time for a (mostly) asynchronous application like RabbitMQ. At the moment I don't think we have a great way to measure whether this would make a difference - not to say that means we should stick with what is currently in the PR - it just makes it a bit trickier to decide.
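
(For reference, the "X every second, accumulate up to Y if unused" idea is essentially a token bucket; a minimal illustrative sketch, not part of this PR:)

```
#include <algorithm>
#include <cstddef>

// Token-bucket sketch of the idea above: d_ratePerSecond (X) tokens are added
// on each one-second tick, unused tokens accumulate up to d_burstCapacity (Y),
// and each read consumes tokens equal to the bytes transferred.
class TokenBucketSketch {
    std::size_t d_ratePerSecond;  // X: bytes allowed per second
    std::size_t d_burstCapacity;  // Y: maximum unused allowance that can build up
    std::size_t d_tokens;         // currently available bytes

  public:
    TokenBucketSketch(std::size_t ratePerSecond, std::size_t burstCapacity)
    : d_ratePerSecond(ratePerSecond)
    , d_burstCapacity(burstCapacity)
    , d_tokens(burstCapacity)
    {
    }

    // Called on the existing one-second timer tick.
    void onTimer()
    {
        d_tokens = std::min(d_burstCapacity, d_tokens + d_ratePerSecond);
    }

    // Record bytes read; returns false if the connection should now pause
    // until the next tick.
    bool consume(std::size_t bytes)
    {
        if (bytes > d_tokens) {
            d_tokens = 0;
            return false;
        }
        d_tokens -= bytes;
        return true;
    }
};
```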

adamncasey (author) commented on Jun 15, 2022

Most of this comment is now in the PR description, but this does still note the difference between sleeping an entire second, and sleeping for the remainder of the current rate "tick" (the latter is implemented in this PR).

Data transfer rates, all measured in MiB/s:

| Run # | Without Change | With Change, No Limit | With Change, 2000 MiB/s Limit | With Change, 50 MiB/s Limit (per client) | With Change, 200 MiB/s, 1.0s sleep | With Change, 50 MiB/s, 1.0s sleep |
|-------|----------------|-----------------------|-------------------------------|------------------------------------------|------------------------------------|-----------------------------------|
| 1     | 1058           | 1078                  | 1064                          | 497                                      | 1063                               | 393                               |
| 2     | 1056           | 1030                  | 1060                          | 497                                      | 1076                               | 434                               |
| 3     | 1049           | 1023                  | 1054                          | 497                                      | 1054                               | 401                               |

Tested with amqpprox and the performance tester running on the same machine: a VM with 8x 2.5 GHz cores and 32 GB of RAM.

I ran the performance tester using these arguments:

```
$ RUST_LOG=info cargo run --release -- --address amqp://127.0.0.1:5672 --listen-address 0.0.0.0:30424 --message-size 10000000 --num-messages 500 --clients 10 --max-threads 50
```

This runs 10 parallel client threads, each transmitting ~5 GB (500 messages of ~10 MB) as fast as they can. Each test run took between ~45 seconds and ~2-3 minutes, depending on whether the rate limit was applied. Longer tests better reflect the true peak data rate achieved, since some setup/teardown time is included in the summary calculations. Using bmon I could see that most of the tests peaked at around 2.2-2.4 GiB/s, which means our results here aren't too far off that, since the number in bmon reflects the client -> proxy -> server data flow all over the same interface.

adamncasey force-pushed the datarate branch 2 times, most recently from 72a11a6 to 098a41b, on June 20, 2022 at 10:29
```
@@ -293,7 +406,7 @@ void LimitControlCommand::handleCommand(const std::string & /* command */,
     }

     auto vhostOrDefault = readVhostOrDefault(iss);
-    if (!readVhostOrDefault) {
+    if (!vhostOrDefault) {
```

adamncasey (author):

This is a bug in main at the moment.

Chinmay1412 (Contributor) left a comment:

I think we also need to add the commands to README.md, and document each command in detail here: https://github.com/bloomberg/amqpprox/blob/main/docs/config.md

```
/**
 * Fetch the currently configured quota
 */
std::size_t getQuota();
```

Contributor:

Suggested change:

```
-std::size_t getQuota();
+std::size_t getQuota() const;
```

```
    d_defaultDataRateQuota = quota;
}

void DataRateLimitManager::setDefaultDataRateAlarm(std::size_t quota)
```

Contributor:

Don't we need to maintain a default data rate limit per vhost?

adamncasey (author):

Since the counters which tick down are stored individually under each Session, and changing the values already propagates down into the sessions, the manager only needs to store the values sessions would be assigned when they're created. For that we can just store the default values, plus any vhost overrides which exist.
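
(Roughly, the lookup side of that looks like the sketch below; only the default/alarm setters and d_defaultDataRateQuota appear in the diff, the rest of the names are illustrative:)

```
#include <cstddef>
#include <string>
#include <unordered_map>

// Sketch of the idea described above: sessions copy their quota when they are
// created (and when limits change), so the manager only needs to hold the
// default value plus any per-vhost overrides.
class DataRateLimitManagerSketch {
    std::size_t                                  d_defaultDataRateQuota = 0;
    std::unordered_map<std::string, std::size_t> d_vhostDataRateQuota;

  public:
    void setDefaultDataRateQuota(std::size_t quota)
    {
        d_defaultDataRateQuota = quota;
    }

    void setVhostDataRateQuota(const std::string &vhost, std::size_t quota)
    {
        d_vhostDataRateQuota[vhost] = quota;
    }

    // Value assigned to a new session for `vhost`: the override if one
    // exists, otherwise the default.
    std::size_t getDataRateQuota(const std::string &vhost) const
    {
        auto it = d_vhostDataRateQuota.find(vhost);
        return it != d_vhostDataRateQuota.end() ? it->second
                                                : d_defaultDataRateQuota;
    }
};
```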

```
@@ -331,6 +371,46 @@ class MaybeSecureSocketAdaptor {
                         handler);
    }

    if (!d_alarmed && d_dataRateAlarm.remainingQuota() == 0) {
        if (d_dataRateTimerStarted) {
            LOG_INFO << "Data Rate Alarm: Hit "
```

Contributor:

Should we log this at warning level, with a specific keyword string to alarm on?

adamncasey (author):

I was going for "Data Rate Alarm" being the string we alarm on.

adamncasey (author) commented:

There are a few commits here now, but squashing would make reviewing awkward. I'll squash before merging.

willhoy (Contributor) left a comment:

Think this is good to go for now

```
void onTimer(const boost::system::error_code &error)
{
    if (error == boost::asio::error::operation_aborted) {
        return;
```

Contributor:

Take it this isn't worth logging, e.g. that the timer was cancelled?

adamncasey (author):

Yeah, it gets cancelled every time a connection hits its data rate limit, and also whenever a socket is closed.

Commit message:

This commit adds support for limiting the amount of data which can be
read from the client-side (ingress) socket in a one second time period.

The aim is to be able to avoid one single connection being able to
overwhelm the proxy, while also providing an extra tool for limiting an
out of control vhost - giving time for both RabbitMQ, and amqpprox to
recover.

These limits are configurable using
```
amqpprox_ctl ... LIMIT DATA_RATE DEFAULT 500000000        # About 500MB per second limit per connection on all vhosts

amqpprox_ctl ... LIMIT DATA_RATE VHOST vhostname 5000000  # Restrict 'vhostname' down to about 5MB/s
```

The limits are enforced by counting the number of bytes read from the
socket during each read operation, and pausing for one second before
starting a read operation if the connection is at the limit.

This might transfer more or less data than the precise limit, since amqpprox
will always pass the whole read buffer to the server after a read operation -
which could lead to overshooting the limit - but the next read operation will
pause for a whole second after doing so, which could undershoot. In testing,
the data transferred per second stays roughly at the configured limit.

I tested and compared maximum data throughput before and after this
change - with no limit enabled and with a high limit enabled. No
particular differences were seen.

When amqpprox has a large number of connections which have hit their
limit at least once since being opened (this triggers the one second
reset interval) there is some additional CPU load from this change.
adamncasey merged commit 89eac7b into bloomberg:main on Jun 30, 2022
adamncasey deleted the datarate branch on June 30, 2022 at 12:49
adamncasey added a commit to adamncasey/amqpprox that referenced this pull request on Oct 18, 2022:

This commit changes MaybeSecureSocketAdaptor to template a few critical
types, which allows us to write unit tests against some of its behaviours.

In theory this could probably also replace the SocketIntercept things, which
were a nice solution when the implementation of this class passed straight through
to the asio types.

Since working around some asio bugs bloomberg#69
and adding support for data rate limits bloomberg#88,
the MaybeSecureSocketAdaptor class has expanded enough that we should
be writing unit tests for it - and probably should have done so before now.
willhoy pushed a commit that referenced this pull request Oct 19, 2022