1 Million Concurrent Connections



Handle 1 Million Concurrent Connections using Ruby

  • App should remain responsive and be able to process at least 100 requests per second

  • App should consume at most 15GB of RAM and keep load under 10 on an 8-CPU machine

  • App should communicate with clients every 15 seconds without any lag

Ruby Software

Espresso Framework - fast and easy to use.

Rainbows! Web Server - supports streaming and forking.

EventMachine - mature and stable I/O library for Ruby.

Rainbows! setup:

Rainbows! do
  use :EventMachine
  keepalive_timeout  3600*12
  worker_connections 128_000
  client_max_body_size nil
  client_header_buffer_size 512
end

worker_processes 8

So connections will be handled concurrently by 8 Ruby processes - 1 process per core.
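As a quick sanity check (my arithmetic, not from the original text), the configured capacity slightly exceeds the 1 million target:

```ruby
# Each Rainbows! worker is capped at 128_000 connections (worker_connections),
# and worker_processes forks 8 of them - one per core.
workers                = 8
connections_per_worker = 128_000

total_capacity = workers * connections_per_worker
puts total_capacity # 1024000 - just above the 1 million target
```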

If you know a way to make a single Ruby process handle 1 million connections, feel free to repeat this test with a single Rainbows! worker or with the Thin web server.

The application code:

class App < E
  map '/'

  # index and status_watcher actions should return event-stream content type
  before :index, :status_watcher do
    content_type 'text/event-stream'
  end

  def index
    stream :keep_open do |stream|

      # communicate to client every 15 seconds
      timer = EM.add_periodic_timer(15) { stream << "\0" }

      stream.errback do      # when connection closed/errored:
        DB.decr :connections # 1. decrement connections amount by 1
        timer.cancel         # 2. cancel timer that communicates to client
      end

      # increment connections amount by 1
      DB.incr :connections
    end
  end

  # frontend for status watchers - http://localhost:5252/status
  def status
  end

  # backend for status watchers
  def status_watcher
    stream :keep_open do |stream|
      # adding a timer that will update status watchers every second
      timer = EM.add_periodic_timer(1) do
        connections = FormatHelper.humanize_number(DB.get :connections)
        stream << "data: %s\n\n" % connections
      end
      stream.errback { timer.cancel } # cancel timer if connection closed/errored
    end
  end

  def get_ping
  end
end

More on Streaming

Ruby Version and Tunings

Used MRI 1.9.3-p385, installed and managed via rbenv.

To make Ruby a bit faster, the following garbage collector tunings were applied:

# The initial number of heap slots as well as the minimum number of slots allocated.

# The minimum number of heap slots that should be available after the GC runs.
# If they are not available then, ruby will allocate more slots.

# The number of C data structures that can be allocated before the GC kicks in.
# If set too low, the GC kicks in even if there are still heap slots available.
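The actual values did not survive in this copy. The three comments above map to MRI 1.9.3's GC environment variables; a sketch with illustrative values (the author's exact numbers are not preserved here):

```shell
# Illustrative values only - the author's exact numbers were not preserved.
export RUBY_HEAP_MIN_SLOTS=800000    # initial/minimum heap slots
export RUBY_FREE_MIN=100000          # slots that must be free after GC
export RUBY_GC_MALLOC_LIMIT=79000000 # C allocations allowed between GC runs
```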

This had a spectacular impact - performance increased by about 40%.

I really wanted to have this test completed on Rubinius 2.0.0rc1 in 1.9 mode as well, but it always segfaults after ~10,000 connections because of some libpthread issue. Had no time to investigate this.

Would be really exciting to see this experiment running on JRuby.

Anyone interested? I can provide load generation farm.

Operating System

Ubuntu 12.04 - really easy to make it accept 1 million connections.

The only files modified were /etc/security/limits.conf:

* - nofile 1048576

and /etc/sysctl.conf:

net.ipv4.netfilter.ip_conntrack_max = 1048576
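To verify the raised file-descriptor limit took effect after re-login, one option (my addition, not from the original setup) is to ask Ruby for its rlimit:

```ruby
# Process.getrlimit returns the [soft_limit, hard_limit] pair for a resource.
soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
puts "open files: soft=#{soft} hard=#{hard}"
# After the limits.conf change (and a re-login), both should report 1048576.
```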

How to Repeat

Prepare the Server

Clone this repo:

git clone https://github.com/slivu/1mc2

run bundler:

cd 1mc2/
bundle install
rbenv rehash # in case you are using rbenv

start the redis server on the default port using the config that comes with this repo:

redis-server ./redis.conf

start app:


Prepare Clients

To generate the load I used 50 EC2 micro instances.

Special thanks to @ashtuchkin for creating a great tool to manage EC2 instances.

Using ec2-fleet it is really easy to manage any number of instances directly from terminal.

Follow their instructions to set up your AWS account. After that is done, start the instances:

$ ./aws.js start 50

Wait about 2 minutes. You can see what is happening by typing $ ./aws.js status

When the instances are ready, point them to the tested host:

$ ./aws.js set host <ip>

our app is running on port 5252:

$ ./aws.js set port 5252

set the number of connections per instance:

$ ./aws.js set n 20000

Now we have 50 instances that will generate 20,000 clients each, resulting in 1,000,000 connections.
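The client-side arithmetic, as a quick check (my addition):

```ruby
instances                = 50
connections_per_instance = 20_000

total_connections = instances * connections_per_instance
puts total_connections # 1000000
```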

The app should start accepting connections now.

To see what happens, open a Server-Sent Events enabled browser (any recent Chrome/Firefox/Safari/Opera) and go to http://localhost:5252/status

You should see the number of established connections as well as requests per second and mean response time.
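The /status page is fed by the Server-Sent Events wire format emitted in status_watcher: each message is a "data: ..." line terminated by a blank line. A minimal sketch of parsing such frames in Ruby (my illustration, not part of the app):

```ruby
# Each SSE message is a block of "field: value" lines terminated by a blank line.
def parse_sse(raw)
  raw.split("\n\n").map do |frame|
    frame.lines.each_with_object({}) do |line, event|
      field, value = line.chomp.split(": ", 2)
      event[field] = value if value
    end
  end
end

frames = parse_sse("data: 999,998\n\ndata: 1,000,000\n\n")
puts frames.inspect # [{"data"=>"999,998"}, {"data"=>"1,000,000"}]
```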


As you can see, all aims were achieved:

  • App remains responsive - it is able to process about 179 requests per second

  • App does communicate with clients every 15 seconds - see network usage, it is about 3MB/s in/out

  • RAM usage is under 15GB and load is under 10

After all connections were established, I kept the clients connected for about one hour.

All clients remained connected and RAM usage did not increase (read: no memory leaks).

Some graphs:

Mean response time is calculated by sending a request every second, registering the time needed for a response, and taking the median of the last 60 requests:
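That statistic (keep the most recent 60 samples, report their median) can be sketched like this - class and method names are mine, not the author's:

```ruby
# Keeps a sliding window of the most recent response times and
# reports the median of that window.
class ResponseTimeWindow
  def initialize(size = 60)
    @size    = size
    @samples = []
  end

  def record(ms)
    @samples << ms
    @samples.shift if @samples.size > @size # drop the oldest sample
  end

  def median
    sorted = @samples.sort
    mid = sorted.size / 2
    sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
  end
end

w = ResponseTimeWindow.new
[12, 18, 15].each { |ms| w.record(ms) }
puts w.median # 15
```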

Mean processed requests per second depending on established connections:

As seen, while the app is holding and communicating with 1 million persistent connections, it is still able to process about 200 standard requests per second.

Pretty good, taking into account that a good half of average websites can process only about 100 requests per second.

On the link below you can see the progress - screenshots taken every 15 seconds (history starts at the 12th slide):

Please leave your comments here - http://news.ycombinator.com/item?id=5249271

Author - Silviu Rusu. License - MIT.