Benchmarks

BluePyth edited this page Mar 22, 2012 · 20 revisions

Gatling is brand new, but why would you choose it? After all, there are lots of tools out there that are well known: JMeter, LoadUI, The Grinder, Tsung, etc.

This page provides data that shows why we made Gatling and why we think you should use it.

The Test Protocol #

The Tested Application #

The tested application is the same as the one from our documentation (which was hosted on CloudFoundry). It's an e-banking application developed by eBusiness Information, Excilys Group. It was developed with performance and code quality in mind, which led to a fast and responsive application. It is available here on GitHub.

To test the different stress tools under the best possible conditions, the tests were run on a private gigabit network. The tested application was deployed on Apache Tomcat 7 with a PostgreSQL 9.1 database, which was reset after each run.

All of this was installed on a Dell desktop-class computer with a hyper-threaded quad-core processor @ 2.80GHz and 6 GB of RAM.

In the following benchmarks, we will use two versions of the same application:

  • Normal - The application as is
  • Lagged - The same application with a filter executing a Thread.sleep(500) to increase the response time by at least 500 ms

Note: Increasing the response time allows us to see how the stress tool reacts under heavy load.
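To illustrate the idea behind the Lagged variant, here is a minimal sketch of a request wrapper that delays every request by a fixed amount before passing it on. The benchmark itself used a Java filter calling Thread.sleep(500); the Python WSGI middleware below is only an analogue, and the names `LagMiddleware` and `hello_app` are hypothetical.

```python
import time


class LagMiddleware:
    """WSGI middleware that delays every request by a fixed amount.

    Hypothetical Python analogue of the benchmark's Java filter, which
    calls Thread.sleep(500) before handing the request to the application.
    """

    def __init__(self, app, delay_seconds=0.5, sleep=time.sleep):
        self.app = app
        self.delay_seconds = delay_seconds
        self.sleep = sleep  # injectable so the delay can be stubbed in tests

    def __call__(self, environ, start_response):
        # Block first, so the response time grows by at least the delay.
        self.sleep(self.delay_seconds)
        return self.app(environ, start_response)


def hello_app(environ, start_response):
    """Tiny stand-in application (an assumption, not the e-banking app)."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


# Wrapping the application is enough to obtain the "Lagged" behaviour.
lagged_app = LagMiddleware(hello_app)
```

Because the delay is applied before the application runs, 500 ms is a lower bound on the response time rather than the response time itself.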

Injector Computer Configurations

To make the benchmark as representative as possible, we ran the injectors on two different configurations:

  • A desktop-class computer equipped with an Intel Xeon W3520 @ 2.67GHz (4 cores + Hyper-Threading) and 6 GB of RAM.
  • A laptop-class computer equipped with an Intel Core 2 Duo P8600 @ 2.40GHz (2 cores) and 4 GB of RAM.

Note: Each configuration is referred to as either Desktop or Laptop in the results.

The Simulations #

Two simulations were used to test the injectors. Both use the scenario defined below and the same ramp (200 seconds); they differ only in the number of users: 1500 or 2000.
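To make the ramp concrete, the sketch below computes the arrival rate for each simulation. It assumes users are started at a constant rate over the 200 seconds, which this page does not state explicitly; the function name is illustrative.

```python
RAMP_SECONDS = 200  # ramp duration used by both simulations


def arrival_rate(users, ramp_seconds=RAMP_SECONDS):
    """Users started per second, assuming a linear (constant-rate) ramp."""
    return users / ramp_seconds


for users in (1500, 2000):
    print(f"{users} users over {RAMP_SECONDS}s -> {arrival_rate(users):.1f} users/s")
# 1500 users over 200s -> 7.5 users/s
# 2000 users over 200s -> 10.0 users/s
```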

Pauses in the scenario add up to 518 s.

The scenario executes 184 requests, and the response time of each request contributes to the theoretical duration of the simulation. This duration is therefore computed for each simulation.

The closer a stress tool's simulation duration is to the theoretical duration, the more accurate the tool can be considered.

Note: The accuracy of the tool also depends on the number of failed requests.
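The page does not give the exact formula for the theoretical duration. A plausible model, assumed here, is that the last user starts at the end of the ramp and then runs the whole scenario: ramp + pauses + one response time per request. This is consistent with the 12'00" (720 s) reported for the normal application, which leaves about 2 s of total response time on top of 718 s of ramp and pauses. The sketch also shows how a drift such as +1'10" is obtained from two durations written in the notation of the result tables. All function names are illustrative.

```python
import re

RAMP_SECONDS = 200   # ramp duration
PAUSE_SECONDS = 518  # total pause time in the scenario
REQUESTS = 184       # requests executed in the scenario


def theoretical_duration(mean_response_seconds):
    """Assumed model: ramp, then one full scenario for the last user."""
    return RAMP_SECONDS + PAUSE_SECONDS + REQUESTS * mean_response_seconds


def parse_duration(text):
    """Convert the tables' notation (e.g. 12'02") to seconds."""
    match = re.fullmatch(r"(\d+)'(\d{2})\"", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)


def drift(measured, theoretical):
    """Seconds by which a measured run exceeds the theoretical duration."""
    return parse_duration(measured) - parse_duration(theoretical)


# Ramp and pauses alone account for 718 s; response times add the rest.
print(theoretical_duration(0.0))    # 718.0
print(drift("16'06\"", "14'56\""))  # 70, i.e. +1'10"
```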

The scenario #

  1. The user accesses the login page
  2. The user enters their credentials and logs in
  3. The user is redirected to their accounts list
  4. The user accesses the operations of their first listed account
  5. A request is sent by the browser to get these operations
  6. The user accesses the card operations of the same account
  7. A request is sent by the browser to get these operations
  8. The user accesses the form to perform a transfer
  9. The user enters the required information and performs the transfer
  10. The user is then redirected to their accounts list
  11. The user logs out
  12. The user is redirected to the login page

The results #

For each simulation, we gathered specific information to try to explain what worked well, and what didn't during the simulation. The results will show:

  • The application on which the simulation was run (Normal or Lagged)
  • The number of users requested for the simulation
  • The memory allocated to the tool
  • The theoretical duration of the simulation (depending on the response times recorded during the simulation)
  • The simulation duration on the Desktop machine and the number of errors
  • The simulation duration on the Laptop machine and the number of errors

Gatling's results #

Gatling's simulation has been created using Gatling Recorder and modified by the testers afterwards. You can see the result here.

The simulations ran without any problem; their results are shown below (E: number of errors):

| App | Nb of Users | Allocated Memory | Theoretical duration | Desktop | Laptop |
|---|---|---|---|---|---|
| Normal | 1500 | 512 MB | 12'00" | 12'02" (+0'02") E: 0 | 12'00" (+0'00") E: 2 |
| Normal | 2000 | 512 MB | 12'02" | 12'03" (+0'01") E: 11 | 12'03" (+0'01") E: 13 |
| Lagged | 1500 | 512 MB | 14'56" | 16'06" (+1'10") E: 22 | 15'28" (+0'32") E: 15 |
| Lagged | 2000 | 512 MB | 18'09" | 20'00" (+1'51") E: 675 | 19'44" (+1'35") E: 321 |

As you can see, the duration of the simulation run by Gatling is almost identical to the theoretical one for the normal application. This is what we, as Gatling developers, expected when we developed it.

For the lagged application, things change a little. The nature of the application (long and irregular response times) forces the stress tool to manage several connections and to process responses in a more chaotic manner. Gatling's asynchronous and memory-efficient nature allows it to handle the load imposed by the simulation, but it cannot avoid being delayed.

Gatling vs. JMeter #

JMeter is an Apache project currently available in version 2.5.1. It is one of the most popular stress tools. Its user interface is based solely on a rich GUI.

User Experience #

The testers, new to both Gatling and JMeter, found that JMeter was harder than Gatling to learn and to use for creating the simulations, despite the use of a proxy.

Having a GUI seems to be an advantage over scripts or even code. But the problem with JMeter's GUI is its richness: features are too hard to find.

You can see the simulation here and appreciate the difference with Gatling's scenario file.

Note: JMeter's developers did not intend the file to be edited with a text editor; the comparison is only here to show how readable Gatling's scenarios are.

JMeter's results #

JMeter creates one thread per simulated user. If not enough memory is allocated to the JVM, it can crash while trying to create these threads. For instance, JMeter could not run 1500 users with 512 MB (the amount used for Gatling, even with 2000 users); OutOfMemoryErrors are recorded in the table as OOM.
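The difference between the two concurrency models can be sketched in a few lines. This is not how JMeter or Gatling is implemented (neither is written in Python); it only contrasts "one thread per user" with an asynchronous approach in which a single thread serves every user while they wait. The constants and function names are assumptions chosen for the illustration.

```python
import asyncio
import threading
import time

USERS = 200  # kept small so the sketch runs quickly
WAIT = 0.05  # stand-in for waiting on a response, in seconds


def run_thread_per_user(users=USERS):
    """Start one OS thread per simulated user; return how many threads ran."""
    names = set()
    lock = threading.Lock()

    def user():
        time.sleep(WAIT)  # blocks this thread for the whole wait
        with lock:
            names.add(threading.current_thread().name)

    threads = [threading.Thread(target=user, name=f"user-{i}") for i in range(users)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(names)


async def _simulate(users):
    names = set()

    async def user():
        await asyncio.sleep(WAIT)  # yields to the event loop instead of blocking
        names.add(threading.current_thread().name)

    await asyncio.gather(*(user() for _ in range(users)))
    return len(names)


def run_event_loop(users=USERS):
    """Run every simulated user as a coroutine; return how many threads ran."""
    return asyncio.run(_simulate(users))


print("thread per user:", run_thread_per_user(), "threads")
print("event loop:     ", run_event_loop(), "thread")
```

In the first model every waiting user holds a thread, and each thread needs memory of its own, which is why the number of users is tied to the memory available. In the second, waiting users cost only a small amount of state.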

Another problem occurred with the 2000-user simulations: it seems that JMeter cannot simulate more than 1514 users, regardless of the memory allocated to the JVM. Runs that hit this user limit are recorded as UL in the table.

To run a simulation with 2000 users, we had to use JMeter's cluster mode. It consists of launching several nodes (two in our case), each executing half of the users, plus a master process to control them. Using 2 nodes worked, but it requires 5 commands to be run at the same time and uses JMeter as a server; this is already advanced usage of JMeter for a simple simulation that merely needs to run more than 1514 users.

| App | Nb of Users | Nodes | Allocated Memory | Theoretical duration | Desktop | Laptop |
|---|---|---|---|---|---|---|
| Normal | 1500 | 1 | 512 MB | 12'00" | OOM | OOM |
| Normal | 1500 | 1 | 768 MB | 12'00" | 12'02" (+0'02") E: 0 | 12'05" (+0'05") E: 0 |
| Normal | 2000 | 1 | 768 MB | 12'02" | NT | UL |
| Normal | 2000 | 1 | 1024 MB | 12'02" | NT | UL |
| Normal | 2000 | 1 | 1536 MB | 12'02" | NT | UL |
| Normal | 2000 | 1 | 2048 MB | 12'02" | NT | UL |
| Normal | 2000 | 2 | 3x768 MB | 12'02" | 12'08" (+0'08") E: 149 | 32'19" (+20'19") E: 17 |
| Lagged | 1500 | 1 | 768 MB | 14'11" | 15'56" (+1'45") E: 81 | 15'18" (+1'07") E: 30039 |
| Lagged | 2000 | 2 | 3x768 MB | 15'49" | 20'52" (+5'03") E: 251 | 33'40" (+17'51") E: 15 |

Note: For X nodes, X+1 JVMs are launched (the nodes plus the master process), which is why the allocated memory for 2 nodes is 3x768 MB.

Note: NT means Not Tested. When we saw that JMeter couldn't launch more than 1514 users with one node, we skipped these simulations.

As you can see, for the normal application, the duration of the simulation run by JMeter on a powerful machine is quite good, although a time drift starts to appear as the number of users increases. The drift becomes much larger on a less powerful machine: more than 20 minutes late!

For the lagged application, JMeter takes longer than Gatling to run the simulations, and its drift increases faster. The Laptop run with 1500 users finished quickly, but only because about 30,000 requests failed during the simulation.
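To put the error counts in proportion, they can be turned into error rates. The sketch below assumes that each user runs the 184-request scenario exactly once, so that a simulation issues users x 184 requests; the page does not state this explicitly, and the function name is illustrative.

```python
REQUESTS_PER_USER = 184  # requests executed in the scenario


def error_rate(errors, users, requests_per_user=REQUESTS_PER_USER):
    """Fraction of failed requests, assuming one scenario run per user."""
    return errors / (users * requests_per_user)


# (tool, app, users, machine, errors) taken from the result tables
rows = [
    ("Gatling", "Lagged", 2000, "Desktop", 675),
    ("JMeter", "Lagged", 1500, "Laptop", 30039),
]
for tool, app, users, machine, errors in rows:
    print(f"{tool} {app} {users} {machine}: {error_rate(errors, users):.2%}")
# Gatling Lagged 2000 Desktop: 0.18%
# JMeter Lagged 1500 Laptop: 10.88%
```

Under that assumption, roughly one request in nine failed in the JMeter Laptop run, which is why its short duration cannot be read as a good result.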

Conclusions

JMeter can load test your application, no doubt about it; it has helped many people for many years. The problem with JMeter is the validity of the results. You might think that everything went well and that your application handled the 1200 users without crashing. But JMeter's time drift - due to the 1 user = 1 Thread paradigm - means that the load actually applied was smaller than what you expected before launching the simulation.

If you want to use JMeter, having a powerful computer with lots of cores might save your life ;-)