#Benchmarking

Datafission has a built-in benchmarking test that establishes an expected latency for subscribers receiving published updates (RX latency). The test requires running two processes: a publisher and a subscriber.

##Benchmark scenario

The publisher performs 64 loops, publishing 10,000 record updates in each loop. Each loop spreads the updates over an increasing number of records: the first loop sends all 10,000 updates to a single record, while the final loop distributes the 10,000 updates across 64 records in round-robin fashion. A loop only finishes when the subscriber acknowledges that it has received all 10,000 updates. Each loop is timed, and the receive latency is calculated as the loop time divided by 10,000. Latencies are measured in microseconds. On average, each message is 134 bytes.

In total, the test sends 640,000 record updates (64 loops × 10,000 updates).
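The per-loop timing logic can be pictured with a short sketch. This is illustrative only: the class, method and variable names below are placeholders assumed for the example and are not taken from the actual BenchmarkPublisher/BenchmarkSubscriber sources.

```java
/**
 * Minimal sketch of the per-loop RX latency calculation described above.
 * All names are placeholders, not the actual benchmark code.
 */
public class RxLatencySketch {

    static final int MAX_RECORDS = 64;
    static final int UPDATES_PER_LOOP = 10_000;

    public static void main(String[] args) throws InterruptedException {
        for (int recordCount = 1; recordCount <= MAX_RECORDS; recordCount++) {
            final long start = System.nanoTime();

            // Publish 10,000 updates, spread round-robin over 'recordCount' records.
            for (int update = 0; update < UPDATES_PER_LOOP; update++) {
                publishUpdate(update % recordCount);
            }

            // A loop only completes once the subscriber acknowledges receipt
            // of all 10,000 updates; this call stands in for that wait.
            awaitSubscriberAck(UPDATES_PER_LOOP);

            final long elapsedNanos = System.nanoTime() - start;
            // RX latency = loop time / 10,000, expressed in microseconds.
            final double rxLatencyMicros = (elapsedNanos / 1_000.0) / UPDATES_PER_LOOP;
            System.out.printf("%d record(s): %.1f us per update%n", recordCount, rxLatencyMicros);
        }
    }

    private static void publishUpdate(int recordIndex) {
        // Placeholder: publish one ~134 byte update for the given record.
    }

    private static void awaitSubscriberAck(int updateCount) throws InterruptedException {
        // Placeholder: block until the subscriber confirms it received updateCount updates.
    }
}
```

Each loop produces one latency figure, so the profile across 1 to 64 concurrently updating records can be plotted, as in the results below.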

##Running the benchmark

There are 2 ways to perform a benchmark:

  1. Localhost mode: this provides a benchmark for comparing the performance of one host versus another. The test essentially ignores the network component and exercises the host's local CPU capability. Typically you would run this to compare machine specs and see what benefit more powerful hardware provides.
  2. Network mode: this provides a benchmark for the network (it is also host-sensitive, so is best performed using 2 hosts of equal spec).

In either mode, the same set of tests is executed.

###Step 1: run the publisher

Run the class BenchmarkPublisher [optional IP address to bind the publisher to]. If no IP address is provided, the test defaults to using the loopback device IP (127.0.0.1).

###Step 2: run the subscriber

Run the class BenchmarkSubscriber [optional IP address of the publisher to connect to]. If no IP address is provided, the test defaults to using the loopback device IP (127.0.0.1).
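As a rough illustration of both steps, each class is launched as an ordinary Java main class. The jar name, classpath and IP address below are assumptions for the example, and the class names may need their fully qualified package prefixes from your Datafission distribution:

```
# Publisher host: bind to the given IP (omit the argument to use 127.0.0.1)
java -cp datafission.jar BenchmarkPublisher 192.168.1.10

# Subscriber host: connect to the publisher's IP (omit the argument to use 127.0.0.1)
java -cp datafission.jar BenchmarkSubscriber 192.168.1.10
```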

##Results

###Localhost mode results

The benchmark tests were run on 2 machines. The Core i5 machine consistently shows a much lower RX latency. Whether this is due to the difference in OS or in hardware architecture is not clear. However, the latency is expected to drop on more capable hardware, which is what the results show.

| Machine                  | JVM      | OS          | Average RX latency |
|--------------------------|----------|-------------|--------------------|
| Intel Core i3 @ 2.53 GHz | 1.7.0_72 | Windows 7   | 15 us              |
| Intel Core i5 @ 1.6 GHz  | 1.7.0_72 | Windows 8.1 | 9 us               |

The chart below is for the Core i5 machine and shows a fairly flat latency profile as the number of concurrently updating records increases from 1 to 64. This demonstrates the stable throughput of Datafission's threading model and event handling.

###Network mode results

The benchmark test was then carried out using the Core i5 machine as the publisher and the Core i3 machine as the subscriber. Both machines were connected over an 802.11n wireless network (150 Mbps) with good signal strength. The average RX latency was 56 us. However, the graph shows that the RX latency over the network is not as stable as in the localhost test, which is to be expected.
