This repository has been archived by the owner on Feb 11, 2020. It is now read-only.

Mosca (and Aedes) performance issues - test suite provided - any fix available? #381

Closed
cefn opened this issue Dec 10, 2015 · 10 comments

@cefn

cefn commented Dec 10, 2015

I've been trying to trace performance issues in an app which relies on either Mosca or Mosquitto as a broker via websockets with both client and server running on v8.

I've seen surprising delays when Mosca dispatches both synchronous and asynchronous (retained) messages. What can I do to improve performance, possibly by changing the server configuration?

While my client code reports 20ms to dispatch 100 messages to Mosca, it reports 1100ms to receive them. I can't see how anything in the handling code on my side could introduce such a large delay, as the same test suite receives its messages through Mosquitto within 227ms.

A repository containing just the minimal test suite has been published at https://github.com/cefn/stressMQTT which has been deliberately simplified and had the promise and eventStream code we're using in the app removed.

Logs are available at https://github.com/cefn/stressMQTT/tree/master/logs for the test suite at https://github.com/cefn/stressMQTT/blob/master/stressMQTT.js (Note that in-order guarantees are only satisfied by Mosca if 'zeroPad' is set to true, because it orders retained message deliveries lexically, not numerically)
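The lexical-versus-numerical ordering mentioned above can be illustrated in isolation. This is a minimal sketch, not the suite's own code: the `zeroPad` helper below is a hypothetical stand-in for whatever the suite's 'zeroPad' option does, and the indices are invented for illustration.

```javascript
// Hypothetical helper: left-pad an index with zeros so that lexical
// (string) order agrees with numeric order.
function zeroPad(n, width) {
  var s = String(n);
  while (s.length < width) {
    s = '0' + s;
  }
  return s;
}

var indices = [1, 2, 10, 100];

// Unpadded names compared as strings: '10' and '100' sort before '2'.
var unpadded = indices.map(String).sort();

// Padded names keep their numeric order under the same string sort.
var padded = indices.map(function (n) { return zeroPad(n, 3); }).sort();

console.log(unpadded); // [ '1', '10', '100', '2' ]
console.log(padded);   // [ '001', '002', '010', '100' ]
```

Any store that returns retained messages in string order would show the first ordering unless the names are padded to a fixed width.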

All these tests are running via the localhost network stack, so I don't expect there to be overhead within the socket streams themselves, and I would expect everything else (parsing frames from websockets, parsing Number from string, and equality tests) to be very fast. However, hopefully someone else can replicate the numbers on a different machine and architecture to validate this.

There also seems to be a trend with the first messages taking around 20ms each, and the last messages taking more like 5ms each, suggesting some kind of systematic performance issue depending on the size of the backlog, or a large processing task still going on from the 100 messages previously received. Is this as expected?

The second test in the suite shows retained messages are slightly more efficient than synchronous messages (taking 451ms, again roughly 5ms each), but this compares to 161ms for Mosquitto to deliver the same.

I've been considering Aedes for its performance benefits, but it seems to be slower on key metrics, and the app relies on retained messages, which I think are not currently supported. A test run of Aedes (without testing retained messages) is provided at https://github.com/cefn/stressMQTT/blob/master/logs/aedes_100_no-retained.log and shows 1371ms to deliver the 100 messages in synchronous mode, slightly worse than Mosca.

I can't use Mosquitto because of astonishing delays in round-trip time of 200ms which show up in the final 'echo' test of round-trip times (Mosquitto:20314ms vs. Mosca 932ms vs Aedes 2289ms), as well as the desire to use a node solution for simple integration with the rest of the application, so Mosca is still the overall winner!

However, I would really like to know if there's any obvious reason underpinning the slowness and anything I can do within the Mosca server configuration to fix it.

PS: Since the laptop has more than one core, there may be some inherent speed-up assuming Mosquitto and V8 (node) running on separate cores. However, this doesn't seem to explain the magnitude of the slowdown.

@mcollina
Collaborator

Thanks for reporting. A couple of notes.

I think you should use an external process for testing Mosca and Aedes as
well. If you are on a multicore box, this might explain a part of the
slowdown on some tests.

As with any VM, V8 needs some time to optimize the JS. To get
production-like numbers you would want a 'preheat' of at least 10k
messages for the code path you are using.
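A warm-up phase of this kind can be sketched as follows. This is only an illustration under stated assumptions: `benchWithWarmup` and `handleMessage` are hypothetical names, and the workload is a stand-in (parsing a number from a string and comparing it), not real broker traffic.

```javascript
// Run `fn` for `warmup` untimed iterations so V8 can optimize the hot path,
// then time `measured` iterations and return the mean cost per call in ms.
function benchWithWarmup(fn, warmup, measured) {
  var i;
  for (i = 0; i < warmup; i++) {
    fn(i); // preheat: results and timings are discarded
  }
  var start = process.hrtime();
  for (i = 0; i < measured; i++) {
    fn(i);
  }
  var diff = process.hrtime(start); // [seconds, nanoseconds] since `start`
  var totalMs = diff[0] * 1e3 + diff[1] / 1e6;
  return totalMs / measured;
}

// Stand-in workload (an assumption): parse a Number from a string and
// run an equality test on it.
function handleMessage(i) {
  return Number(String(i)) === i;
}

var msPerCall = benchWithWarmup(handleMessage, 10000, 10000);
console.log('ms per call after warm-up:', msPerCall);
```

In a real broker test the preheat would publish and receive messages over the same transport and topic pattern as the measured run, so that the same code path is the one being optimized.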

Retained messages should be supported in Aedes; if they are not working as
expected, please report a bug.

Aedes is built to support a high load of concurrent clients and messages
per second. To achieve that, its latency may be slightly higher. I will
check later on.

On my load tests, Aedes can dispatch roughly 100k QoS 0 msg/s, and 40k QoS
1 msg/s to a single client. I have not tested latency.

@mcollina
Collaborator

In my tests I can get a round-trip time of 0.14 ms (average) with both Aedes and Mosquitto, while Mosca sits at 0.4 ms (average).

This is my test: https://github.com/mcollina/aedes/blob/master/benchmarks/pingpong.js

@cefn
Author

cefn commented Dec 14, 2015

When I run your benchmark (using the server.js broker from the same folder, after running npm install on a fresh GitHub checkout) I get the console log shown below. I think it indicates about 40ms per message, based on roughly 25 messages per second, whereas the rate should be nearer 2500 messages per second to match your performance numbers.

This is something which I don't think can be explained by my lesser hardware (this is running on a dual core Chromebook). However I'd better try it on a different machine and build to be sure...

/usr/bin/node benchmarks/pingpong.js
sent/s 24
sent/s 24.2
sent/s 24.6
sent/s 24
sent/s 24.6
sent/s 24.6
sent/s 24.6
sent/s 24.8
sent/s 24.6
sent/s 24.8
sent/s 24.6
sent/s 24.400000000000002
sent/s 24.6
sent/s 24.8
sent/s 24.6
sent/s 25
sent/s 24.8
sent/s 24.8
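The conversion from the logged rate to a per-message time can be checked with a short calculation. This is a sketch using the 'sent/s' values copied from the log above; the helper names `mean` and `msPerMessage` are hypothetical, and it assumes each 'sent/s' figure counts completed round trips per second.

```javascript
// 'sent/s' values copied from the console log above.
var sentPerSecond = [24, 24.2, 24.6, 24, 24.6, 24.6, 24.6, 24.8, 24.6,
                     24.8, 24.6, 24.4, 24.6, 24.8, 24.6, 25, 24.8, 24.8];

// Arithmetic mean of an array of numbers.
function mean(values) {
  var sum = 0;
  for (var i = 0; i < values.length; i++) {
    sum += values[i];
  }
  return sum / values.length;
}

// A rate in messages per second corresponds to 1000 / rate ms per message.
function msPerMessage(ratePerSecond) {
  return 1000 / ratePerSecond;
}

var avgRate = mean(sentPerSecond);
var observedMs = msPerMessage(avgRate);

console.log('average rate (msg/s):', avgRate.toFixed(1));  // about 24.6
console.log('ms per message:', observedMs.toFixed(1));     // about 40.7
console.log('ms per message at 2500 msg/s:', msPerMessage(2500));
```

At the 2500 msg/s target the same formula gives 0.4 ms per message, so the observed figure is roughly two orders of magnitude slower.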

@mcollina
Collaborator

On my box (MacBook Pro 2014, i7, 16GB of RAM):

$ node benchmarks/pingpong.js
sent/s 8239.800000000001
sent/s 9060.6
total 17985.528844000117
average 0.14862967914783295
mode [ 0.202568, 0.204749 ]

Which node.js version are you using? I'm running this on node v4.2.0.

@cefn
Author

cefn commented Dec 14, 2015

(trusty)cefn@localhost:~/Documents/code/imagination/git/aedes$ /usr/bin/node -v
v0.12.8

I'm currently building the setup on a Mac to verify against a completely different net stack. Although we were having similar performance issues when deployed on our Mac OS X Server, that may have been related to our use of Object.defineProperties(...) within the library which was synchronizing local data structures. We've now eliminated this, so it is worth revisiting to see whether it was the only thing slowing that deployment down (and whether this other issue is absent there).

@mcollina
Collaborator

Something else must be going on with your system, because:

$ node -v
v0.12.7
$ node benchmarks/pingpong.js
sent/s 7628
sent/s 8211.6
sent/s 8054.6
total 20693.492406999892
average 0.16176775046317565
mode [ 0.244732 ]

@cefn
Author

cefn commented Dec 14, 2015

Thanks again for engaging with my debugging process here. I have committed the latest changes to the suite I'm running and I'll test against both cases once Xcode (plus MacPorts and Node) has finally been installed on the alternate Mac OS test machine.

The tests I was running previously differ slightly from the pingpong test in that new topics are created frequently and websockets are used.

However, they now include multi-core support by spawning subprocesses to host Aedes + Mosca, so they should be a fairer comparison with Mosquitto. Updates visible at https://github.com/cefn/stressMQTT/

The test results on the (potentially broken) machine looked as follows (I'll look into the Aedes case via the separate issue tracker, and once I've switched to a new architecture) ...

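| Spawned broker (ms/msg) | Realtime total: 100 | Realtime total: 1000 | Realtime total: 10000 | Retained total: 100 | Retained total: 1000 | Retained total: 10000 | Echo roundtrip total (10000) |
|---|---|---|---|---|---|---|---|
| Mosca | 8.8ms | 7.8ms | 15.4ms | 8.8ms | 2.8ms | FAILS | 20.5ms |
| Mosquitto | 2.0ms | 2.0ms | 3.7ms | 0.9ms | 0.5ms | 0.5ms | ~200ms (before termination) |
| Aedes | 11.8ms | 7.4ms | ~10.7 (failed at 6738) | FAILS | FAILS | FAILS | ~10.8 (failed at 5549) |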

@mcollina
Copy link
Collaborator

Please send bug reports against Aedes. It should not fail; at least in my
tests I got the expected behavior. It's alpha software anyway. Make sure
you are on node 4, as it has lots of perf improvements.
Also, please explain what failure means.

@cefn
Author

cefn commented Dec 14, 2015

Following the example of your benchmark test, having rebuilt my development environment on a Mac, I get quite different results.

/usr/local/bin/node benchmarks/pingpong.js
sent/s 3870.8
sent/s 4294.8
sent/s 4292.4

However, running the same test with retained messages, new topic names and a wildcard topic subscription produces this instead:

/usr/local/bin/node benchmarks/pingpong.js
sent/s 1225.8
sent/s 810.8
sent/s 839
sent/s 836
total 22979.612756999984
average 1.1577214346818472
mode [ 1.195709,  1.217539,  1.233395,  1.234064,  1.254005,  1.260854,  1.267436,  1.281156,  1.284388,  1.32123,  1.370942,  1.429676, 1.626149 ]

This all seems much more sane.

Finally I found what I think is a bug where topic names are actually numerical, which I'll file separately.

Below are the updated results from my own test suite at https://github.com/cefn/stressMQTT/ now it's running on a Mac. The first figure in each column is the delay between subscribe or first send (whichever is later) and the first receive. The second figure is the average time between individual messages received.

The tests are as follows...

  • messages sent while the client is subscribed.
  • messages sent before the client is subscribed.
  • messages sent one by one after publication of the last has been notified (ping pong).
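The two figures reported per column (initial delay, then average time between received messages) could be derived from receive timestamps roughly as follows. This is a sketch under assumptions: `summarize` is a hypothetical name, the timestamps are invented, and the suite itself may compute the average differently.

```javascript
// startMs:    time of subscribe or first send, whichever is later (ms)
// receiptsMs: receive times of each message in ms, in arrival order
function summarize(startMs, receiptsMs) {
  var first = receiptsMs[0];
  var last = receiptsMs[receiptsMs.length - 1];
  var gaps = receiptsMs.length - 1; // number of intervals between messages
  return {
    // Delay between the start event and the first message received.
    delayMs: first - startMs,
    // Average interval between consecutive messages (0 for a single message).
    deliveryMsPerMsg: gaps > 0 ? (last - first) / gaps : 0
  };
}

// Invented timestamps for illustration only.
var result = summarize(0, [50, 52, 54, 56]);
console.log(result); // { delayMs: 50, deliveryMsPerMsg: 2 }
```

Separating the two figures keeps a one-off startup cost (connection, subscription, first dispatch) from being averaged into the steady-state per-message time.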

N.B. I had to increase the max Inflight to 20000 for the 10000 retained messages to work on Mosca and Mosquitto.

Failure is a Mocha test failure, typically created because messages were sent but never received (in most cases this means exceeding max Inflight, or hitting a stack overflow if the max Inflight numbers are increased to accommodate the entire backlog). However, in the case of Aedes in the configuration documented currently at https://github.com/cefn/stressMQTT no retained messages are ever received, while the identical test suite running with Mosca is successful. This may be because the topic names are numerical.

| Broker | Messages | Synchronous delay (ms) | Synchronous delivery (ms/msg) | Retained delay (ms) | Retained delivery (ms/msg) | Ping delay (ms) | Ping delivery (ms/msg) |
|---|---|---|---|---|---|---|---|
| Mosca | 10 | 20 | 3 | 8 | 1.3 | 4 | 6.1 |
| Mosca | 100 | 43 | 3.8 | 44 | 1.2 | 4 | 4.2 |
| Mosca | 1000 | 93 | 2.35 | 685 | 0.9 | 3 | 3.1 |
| Mosca | 10000 | 486 | 6.1 | FAIL (maxInflight=20000) | | 4 | 7.6 |
| Mosquitto | 10 | 212 | 0.4 | 414 | 0.4 | 207 | 187 |
| Mosquitto | 100 | 417 | 0.7 | 417 | 0.4 | 206 | 205 |
| Mosquitto | 1000 | 217 | 0.5 | 902 | 0.1 | 207 | 207 |
| Mosquitto | 10000 | 829 | 1.0 | 981 | 0.1 | 207 | ~207 [Gave up] |
| Aedes | 100 | 34 | 4.2 | FAIL | | 5 | 5.5 |
| Aedes | 1000 | 93 | 2.8 | FAIL | | 4 | 3.5 |
| Aedes | 10000 | 471 | 3.4 | FAIL | | 6 | 4.1 |

Will switch to Aedes for ongoing bug focus, or file separate bugs for other performance issues.

@cefn cefn closed this as completed Dec 14, 2015