costajob/ruby-app-servers

Scope

The scope of this comparison is to figure out how modern Ruby application servers perform against a simple Rack application.

Primes

The Ruby application computes the sum of the first N prime numbers, generated by the Prime standard library.
The number of primes to sum is configurable via an HTTP parameter, so that computation time can be stretched.
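
The application source is not reproduced in this README, so the following is only a minimal sketch of what such a Rack app could look like. The class name `PrimesApp` and the default of 1000 are assumptions made for illustration; the parameter name `count` is taken from the wrk URL used in the benchmarks.

```ruby
require "prime"
require "uri"

# Hypothetical Rack-compatible app: any object responding to #call(env)
# and returning [status, headers, body] satisfies the Rack contract.
class PrimesApp
  DEFAULT_COUNT = 1000 # assumed default, not stated in the README

  def call(env)
    # Parse the raw query string, e.g. "count=1000", into a Hash.
    params = URI.decode_www_form(env["QUERY_STRING"].to_s).to_h
    count = Integer(params.fetch("count", DEFAULT_COUNT))
    # Sum the first `count` primes; reduce is used instead of Array#sum
    # because Array#sum is not available on Ruby 2.3.
    body = Prime.first(count).reduce(0, :+).to_s
    [200, { "Content-Type" => "text/plain" }, [body]]
  end
end
```

In a `config.ru` file this sketch would be mounted with `run PrimesApp.new`. Invalid input (for example a non-numeric `count`) raises an exception here; a real application would likely answer with a 400 response instead.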

Ruby

Ruby 2.3 is used for all of the tests.
JRuby 9.1.2.0 is used to test the Puma application server in order to compare the threads-pool model versus the pre-forking one.

Tested servers

I focused only on standalone Ruby server solutions: no external balancers and/or reverse proxies.
For this reason I removed Thin from the pack, since it does not include a balancer for its spawned processes. The pack includes:

Puma

Puma is a concurrent application server crafted by Evan Phoenix.
It builds on the Mongrel HTTP parser, extended to be compatible with the Rack era.
Puma offers the threads-pool and the pre-forking models to grant parallelism on both MRI and JRuby.

Bootstrap:

bundle exec puma -w 7 --preload
jruby --server -S bundle exec puma
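
For reference, the MRI command-line flags above could also be expressed in a Puma configuration file. This is a sketch only: the file path `config/puma.rb` and the explicit port are assumptions (the port matches the one used in the wrk command), and the README itself starts Puma purely from the command line.

```ruby
# config/puma.rb (hypothetical path), loaded with: bundle exec puma -C config/puma.rb

# Equivalent of `-w 7`: fork seven worker processes (pre-forking model, MRI only).
workers 7

# Equivalent of `--preload`: load the application before forking the workers.
preload_app!

# Assumed port, matching the URL targeted by the benchmark.
port 9292
```

The JRuby command above passes no `-w` flag, so it relies on Puma's thread pool alone.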

Passenger

Phusion Passenger is the only Ruby application server that is also available as a commercial product (Enterprise version).
Passenger supports both the pre-forking and threads-pool models; the latter is only available in the commercial version (not tested). Pre-forking automatically spawns new processes on demand (no need to specify the number of workers).

Bootstrap:

bundle exec passenger start -p 9292

Unicorn

Unicorn is an application server using the pre-forking process model to elegantly delegate most of the load balancing to the underlying operating system.
It has proved to be a reliable deployment option for large Rails applications (e.g. GitHub).

Bootstrap:

bundle exec unicorn -c config/unicorn.rb
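
The contents of `config/unicorn.rb` are not shown in this README. A minimal sketch of such a file might look like the following; the worker count, port and timeout are assumptions chosen to mirror the other servers, not the values actually used in the benchmark.

```ruby
# config/unicorn.rb: illustrative values only, not the benchmark's real settings.

# Number of pre-forked worker processes (assumed to mirror Puma's `-w 7`).
worker_processes 7

# Port to listen on (assumed to match the URL targeted by wrk).
listen 9292

# Load the application in the master process before forking workers.
preload_app true

# Kill workers that take longer than this many seconds to respond.
timeout 30
```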

Benchmarks

Platform

I recorded these benchmarks on a MacBook Pro 15 (late 2011) with these specs:

  • OS X El Capitan
  • 2.2 GHz Intel Core i7 (4 cores)
  • 8 GB 1333 MHz DDR3

I measured memory peak consumption by using Xcode's Instruments.

Wrk

I used wrk as the load-testing tool. I measured each application server three times, picking the best run.
The following command is used:

wrk -t 4 -c 100 -d 30s --timeout 2000 http://127.0.0.1:9292/?count=1000
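
The best-of-three selection was presumably done by hand; it could be scripted along these lines. The helper names are hypothetical, and the sketch assumes the `Requests/sec:` summary line that wrk prints at the end of each report.

```ruby
# Extracts the "Requests/sec" figure from one wrk report, or nil if the line is absent.
def requests_per_sec(report)
  match = report.match(/^Requests\/sec:\s+([\d.]+)/)
  match && Float(match[1])
end

# Returns the highest throughput found among several wrk reports,
# ignoring reports that carry no "Requests/sec" line.
def best_run(reports)
  reports.map { |report| requests_per_sec(report) }.compact.max
end
```

The reports could be collected with something like `reports = Array.new(3) { %x(wrk -t 4 -c 100 -d 30s "http://127.0.0.1:9292/?count=1000") }` and then passed to `best_run(reports)`.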

First 1000 numbers

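| App server | Throughput (req/s) | Latency in ms (avg/stdev/max) | RAM peak (MB) |
|------------|--------------------|-------------------------------|---------------|
| Unicorn    | 548.71             | 41.66/24.76/207.39            | ~183          |
| Passenger  | 10036.23           | 9.95/1.35/36.67               | ~138          |
| Puma (MRI) | 27442.68           | 3.43/1.82/73.06               | ~226          |
| Puma (JVM) | 30372.77           | 0.51/0.11/9.83                | 531.69        |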

Considerations

Speed

No crashes were recorded during the benchmarks.
When HTTP pipelining is enabled, Puma outperforms the other application servers by a large margin.
Passenger was simply not able to perform on par with Puma, although its latency is more consistent than that of Puma on MRI (lower standard deviation and maximum).
Unicorn does not seem to support HTTP keep-alive in standalone mode, which explains its disappointing throughput.

Memory

Memory consumption tends to grow with throughput: Passenger and Unicorn are the least memory-hungry application servers, followed by Puma on MRI and, with a large gap, Puma on the JVM.
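
One way to relate memory to throughput is to reduce the table to requests per second per megabyte of peak RAM. The sketch below uses the figures from the table as they are; note that three of the RAM values are approximate (marked with `~` in the table), so the ratios are indicative only.

```ruby
# Figures copied from the benchmark table: [throughput in req/s, RAM peak in MB].
# RAM values for Unicorn, Passenger and Puma (MRI) are approximate in the source.
RESULTS = {
  "Unicorn"    => [548.71, 183.0],
  "Passenger"  => [10036.23, 138.0],
  "Puma (MRI)" => [27442.68, 226.0],
  "Puma (JVM)" => [30372.77, 531.69]
}

# Requests per second obtained for each MB of peak memory, rounded to two decimals.
def throughput_per_mb(results)
  results.map { |server, (req_s, mb)| [server, (req_s / mb).round(2)] }.to_h
end

# Print the servers from most to least memory-efficient.
throughput_per_mb(RESULTS).sort_by { |_, ratio| -ratio }.each do |server, ratio|
  puts format("%-12s %8.2f req/s per MB", server, ratio)
end
```

On these figures Puma on MRI delivers the most requests per megabyte and Unicorn the fewest, with Passenger and Puma on the JVM in between.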

Dependencies

All of the application servers except Unicorn depend on the Rack gem.
Puma and Passenger have no other runtime dependencies.

Configuration

Passenger could run in production without any particular changes. Integration with both Nginx and Apache is a breeze thanks to the installation wizard.
Passenger provides commands to start and stop the server, while Puma relies on a separate executable (pumactl).
Unicorn's configuration is the most demanding of the group: it requires a configuration file, while the rest of the pack can be configured directly from the command line.

Threads vs processes

Puma on the JVM proved to be the fastest of the tested solutions, although the MRI implementation is very close in throughput.
JRuby latency is also better than MRI's, although the JVM confirmed its reputation as a memory-hungry piece of software.
