
cluster: use round-robin load balancing
Empirical evidence suggests that OS-level load balancing (that is,
having multiple processes listen on a socket and letting the operating
system wake up one when a connection comes in) produces skewed load
distributions on Linux, Solaris, and possibly other operating systems.

The observed behavior is that a fraction of the listening processes
receive the majority of the connections. From the perspective of the
operating system, that somewhat makes sense: a task switch is expensive
and best avoided whenever possible, so the operating system gives
preferential treatment to a few processes because that reduces the
number of switches.

However, that rather subverts the purpose of the cluster module, which
is to distribute the load as evenly as possible. That's why this commit
adds (and defaults to) round-robin support, meaning that the master
process accepts connections and distributes them to the workers in a
round-robin fashion, effectively bypassing the operating system.

Round-robin is currently disabled on Windows due to how IOCP is wired
up. It works, and you can select it manually, but doing so will
probably incur a heavy performance hit.

Fixes #4435.
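
For illustration, a minimal sketch of selecting the policy by hand with
the `cluster.schedulingPolicy` setting this commit introduces (see the
documentation changes below):

```js
var cluster = require('cluster');

// Pick the policy before the first fork; it is effectively frozen
// once a worker has been spawned or cluster.setupMaster() was called.
cluster.schedulingPolicy = cluster.SCHED_RR;    // master distributes
// cluster.schedulingPolicy = cluster.SCHED_NONE; // OS distributes
```

The same choice can be made without touching code through the
`NODE_CLUSTER_SCHED_POLICY` environment variable, with the values
`rr` and `none`.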
bnoordhuis committed May 13, 2013
1 parent bdc5881 commit e72cd41
Showing 2 changed files with 313 additions and 51 deletions.
53 changes: 40 additions & 13 deletions doc/api/cluster.markdown
@@ -53,14 +53,28 @@
The worker processes are spawned using the `child_process.fork` method,
so that they can communicate with the parent via IPC and pass server
handles back and forth.

-When you call `server.listen(...)` in a worker, it serializes the
-arguments and passes the request to the master process. If the master
-process already has a listening server matching the worker's
-requirements, then it passes the handle to the worker. If it does not
-already have a listening server matching that requirement, then it will
-create one, and pass the handle to the child.
+The cluster module supports two methods of distributing incoming
+connections.
+
+The first one (and the default one on all platforms except Windows)
+is the round-robin approach, where the master process listens on a
+port, accepts new connections and distributes them across the workers
+in a round-robin fashion, with some built-in smarts to avoid
+overloading a worker process.
+
+The second approach is where the master process creates the listen
+socket and sends it to interested workers. The workers then accept
+incoming connections directly.
+
+The second approach should, in theory, give the best performance.
+In practice, however, distribution tends to be very unbalanced due
+to operating system scheduler vagaries. Loads have been observed
+where over 70% of all connections ended up in just two processes,
+out of a total of eight.

-This causes potentially surprising behavior in three edge cases:
+Because `server.listen()` hands off most of the work to the master
+process, there are three cases where the behavior between a normal
+node.js process and a cluster worker differs:

1. `server.listen({fd: 7})` Because the message is passed to the master,
file descriptor 7 **in the parent** will be listened on, and the
@@ -77,12 +91,10 @@
want to listen on a unique port, generate a port number based on the
cluster worker ID.

-When multiple processes are all `accept()`ing on the same underlying
-resource, the operating system load-balances across them very
-efficiently. There is no routing logic in Node.js, or in your program,
-and no shared state between the workers. Therefore, it is important to
-design your program such that it does not rely too heavily on in-memory
-data objects for things like sessions and login.
+There is no routing logic in Node.js, or in your program, and no shared
+state between the workers. Therefore, it is important to design your
+program such that it does not rely too heavily on in-memory data objects
+for things like sessions and login.

Because workers are all separate processes, they can be killed or
re-spawned depending on your program's needs, without affecting other
@@ -91,6 +103,21 @@
continue to accept connections. Node does not automatically manage the
number of workers for you, however. It is your responsibility to manage
the worker pool for your application's needs.

+## cluster.schedulingPolicy
+
+The scheduling policy, either `cluster.SCHED_RR` for round-robin or
+`cluster.SCHED_NONE` to leave it to the operating system. This is a
+global setting and effectively frozen once you spawn the first worker
+or call `cluster.setupMaster()`, whichever comes first.
+
+`SCHED_RR` is the default on all operating systems except Windows.
+Windows will change to `SCHED_RR` once libuv is able to effectively
+distribute IOCP handles without incurring a large performance hit.
+
+`cluster.schedulingPolicy` can also be set through the
+`NODE_CLUSTER_SCHED_POLICY` environment variable. Valid
+values are `"rr"` and `"none"`.

## cluster.settings

* {Object}
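
To make the round-robin behavior described above concrete, here is a
minimal sketch of a clustered HTTP server; the port, worker count, and
response text are illustrative:

```js
var cluster = require('cluster');
var http = require('http');
var os = require('os');

if (cluster.isMaster) {
  // Under SCHED_RR the master owns the listening socket: it accepts
  // each connection and hands it to the next worker in line. Under
  // SCHED_NONE the workers share the listen socket and the operating
  // system decides which one accepts.
  for (var i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
} else {
  http.createServer(function(req, res) {
    res.end('handled by worker ' + cluster.worker.id + '\n');
  }).listen(8000);
}
```

Hitting the server repeatedly should now cycle through the workers
instead of favoring a few of them, which is the skew the commit
message describes.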
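And a sketch of the workaround mentioned in the third edge case above,
deriving a unique port from the worker ID (the base port 8000 is
illustrative):

```js
var cluster = require('cluster');
var net = require('net');

if (cluster.isMaster) {
  cluster.fork();
  cluster.fork();
} else {
  // Each worker computes its own port, so the listen request is not
  // shared through the master and every worker gets a distinct socket.
  net.createServer(function(conn) {
    conn.end('worker ' + cluster.worker.id + '\n');
  }).listen(8000 + cluster.worker.id);
}
```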
