# Einhorn: the language-independent shared socket manager

![Einhorn](https://stripe.com/img/blog/posts/meet-einhorn/einhorn.png)

Let's say you have a server process that handles one request at a
time. Your site is becoming increasingly popular, and this one process
is no longer able to handle all of your inbound connections. However,
you notice that your box's load average is low.

So you start thinking about how to handle more requests. You could
rewrite your server to use threads, but threads are a pain to program
against (and maybe you're writing in Python or Ruby where you don't
have true threads anyway). You could rewrite your server to be
event-driven, but that'd require a ton of effort, and it wouldn't help
you go beyond one core. So instead, you decide to just run multiple
copies of your server process.

Enter Einhorn. Einhorn makes it easy to run (and keep alive) multiple
copies of a single long-lived process. If that process is a server
listening on some socket, Einhorn will open the socket in the master
process so that it's shared among the workers.

Einhorn is designed to be compatible with arbitrary languages and
frameworks, requiring minimal modification of your
application. Einhorn is simple to configure and run.

## Installation

Install from RubyGems:

    $ gem install einhorn

Or build the gem from source:

    $ gem build einhorn.gemspec

And then install the built gem.

## Usage

Einhorn is the language-independent shared socket manager. Run
`einhorn -h` to see detailed usage. At a high level, usage looks like
the following:

    einhorn [options] program

Einhorn will open one or more shared sockets and run multiple copies
of your process. You can seamlessly reload your code, dynamically
reconfigure Einhorn, and more.

## Overview

To set Einhorn up as a master process running 3 copies of `sleep 5`:

    $ einhorn -n 3 sleep 5

You can communicate with your running Einhorn process via `einhornsh`:

    $ einhornsh
    Welcome gdb! You are speaking to Einhorn Master Process 11902
    Enter 'help' if you're not sure what to do.

    Type "quit" or "exit" to quit at any time
    > help
    You are speaking to the Einhorn command socket. You can run the following commands:
    ...

### Server sockets

If your process is a server and listens on one or more sockets,
Einhorn can open these sockets and pass them to the workers. You can
specify the addresses to bind by passing one or more `-b ADDR`
arguments:

    einhorn -b 127.0.0.1:1234 my-command
    einhorn -b 127.0.0.1:1234,r -b 127.0.0.1:1235 my-command

Each address is specified as an IP/port pair, possibly accompanied by options:

    ADDR := (IP:PORT)[<,OPT>...]

In the worker process, the opened file descriptors will be represented
as a space-separated list of file descriptor numbers in the
`EINHORN_FDS` environment variable (respecting the order that the `-b`
options were provided in):

    EINHORN_FDS="6" # 127.0.0.1:1234
    EINHORN_FDS="6 7" # 127.0.0.1:1234,r 127.0.0.1:1235

Valid opts are:

    r, so_reuseaddr: set SO_REUSEADDR on the server socket
    n, o_nonblock: set O_NONBLOCK on the server socket

For example, you can run:

    $ einhorn -b 127.0.0.1:2345,r -m manual -n 4 -- example/time_server

This will run 4 copies of

    EINHORN_FDS=6 example/time_server

where file descriptor 6 is a server socket bound to `127.0.0.1:2345`
with `SO_REUSEADDR` set. It is then your application's job to
figure out how to `accept()` on this file descriptor.
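As a rough illustration of the worker side, here is a minimal Ruby sketch that reads `EINHORN_FDS` and accepts a connection on an inherited descriptor. The helper names (`einhorn_fds`, `einhorn_servers`, `serve_once`) are made up for this example and are not part of Einhorn; `Socket.for_fd` comes from Ruby's standard library. It is not the code in `example/time_server`, just one way a worker could pick up the socket.

```ruby
require 'socket'

# Hypothetical helper: parse the space-separated descriptor list that
# Einhorn exports, keeping the order of the -b options.
def einhorn_fds(env = ENV)
  env.fetch('EINHORN_FDS', '').split.map { |fd| Integer(fd) }
end

# Hypothetical helper: wrap each inherited descriptor in a Socket object.
def einhorn_servers(env = ENV)
  einhorn_fds(env).map { |fd| Socket.for_fd(fd) }
end

# Hypothetical helper: accept one connection, reply with the current
# time, and close the connection.
def serve_once(server)
  client, _addrinfo = server.accept
  client.puts(Time.now.to_s)
ensure
  client.close if client
end
```

A worker started with a single `-b` address could then run something like `server = einhorn_servers.first` followed by `loop { serve_once(server) }`.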

### Command socket

Einhorn opens a UNIX socket to which you can send commands (run
`help` in `einhornsh` to see what admin commands you can
run). Einhorn relies on file permissions to ensure that no malicious
users can gain access. Pass `-d PATH` to change where the socket is
opened.

Note that the command socket uses a line-oriented YAML protocol, and
you should ensure you trust clients to send arbitrary YAML messages
into your process.

### Seamless upgrades

You can cause your code to be seamlessly reloaded by upgrading the
worker code on disk and running

    $ einhornsh
    ...
    > upgrade

Once the new workers have been spawned, Einhorn will send each old
worker a SIGUSR2. SIGUSR2 should be interpreted as a request for a
graceful shutdown.
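One way a Ruby worker might honor that request is to set a flag in a `USR2` handler and finish the unit of work in progress before leaving its loop. The sketch below assumes this pattern; `GracefulWorker` is a hypothetical class written for this example, not something Einhorn provides.

```ruby
# Hypothetical worker wrapper: SIGUSR2 sets a flag, and the loop exits
# after the current unit of work completes.
class GracefulWorker
  def initialize
    @stop = false
    # Keep the handler tiny: it only records that shutdown was requested.
    Signal.trap('USR2') { @stop = true }
  end

  def stopping?
    @stop
  end

  # Yields repeatedly until a SIGUSR2 has been received.
  def run
    yield until stopping?
  end
end
```

A worker could wrap its request loop as `GracefulWorker.new.run { handle_one_request }`, where `handle_one_request` stands in for whatever the application does per iteration.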

### ACKs

After Einhorn spawns a worker, it will only consider the worker up
once it has received an ACK. Currently two ACK mechanisms are
supported: manual and timer.

#### Manual ACK

A manual ACK (configured by passing `-m manual`) requires your
application to send a command to the command socket once it's
ready. This is the safest ACK mechanism. If you're writing in Ruby,
just do

    require 'einhorn/worker'
    Einhorn::Worker.ack!

in your worker code. If you're writing in a different language, or
don't want to include Einhorn in your namespace, you can send the
string

    {"command":"worker:ack", "pid":PID}

to the UNIX socket pointed to by the environment variable
`EINHORN_SOCK_PATH`. (Be sure to include a trailing newline.)
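For workers that prefer not to load `einhorn/worker`, the hand-rolled ACK could look roughly like the following. This is a sketch under the assumptions stated in the text (the message format and `EINHORN_SOCK_PATH`); the method names are hypothetical, and it does not wait for or interpret any reply from the master.

```ruby
require 'socket'

# Build the ACK line; the trailing newline is required.
def einhorn_ack_message(pid = Process.pid)
  %({"command":"worker:ack", "pid":#{pid}}\n)
end

# Hypothetical helper: connect to the command socket and send the ACK.
def einhorn_ack_via_path(path = ENV.fetch('EINHORN_SOCK_PATH'), pid = Process.pid)
  UNIXSocket.open(path) do |sock|
    sock.write(einhorn_ack_message(pid))
  end
end
```

Calling `einhorn_ack_via_path` with no arguments once the worker is ready to serve uses the environment variable and the current process ID.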

To make things even easier, you can pass `-g` to Einhorn, in which
case you just need to `write()` the above message to the open file
descriptor pointed to by `EINHORN_SOCK_FD`.
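With `-g`, the same ACK line can be written straight to the inherited descriptor. The sketch below assumes `EINHORN_SOCK_FD` holds a single integer descriptor; `einhorn_ack_via_fd` is a hypothetical name, and leaving the descriptor open afterwards is a choice made for this example rather than a documented requirement.

```ruby
require 'socket'

# Hypothetical helper: write the ACK line to the command socket that
# Einhorn left open for this worker.
def einhorn_ack_via_fd(fd = Integer(ENV.fetch('EINHORN_SOCK_FD')), pid = Process.pid)
  sock = UNIXSocket.for_fd(fd)
  # Do not close the inherited descriptor when this wrapper is collected.
  sock.autoclose = false
  sock.write(%({"command":"worker:ack", "pid":#{pid}}\n))
end
```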

(See `lib/einhorn/worker.rb` for details of these and other socket
discovery mechanisms.)

#### Timer ACK [default]

By default, Einhorn will use a timer ACK of 1 second. That means that
if your process hasn't exited after 1 second, it is considered ACK'd
and healthy. You can modify this timeout to be more appropriate for
your application (and even set it to 0 if desired). Just pass
`-m FLOAT`.

### Preloading

If you're running a Ruby process, Einhorn can optionally preload its
code, so it only has to load the code once per upgrade rather than
once per worker process. This also saves on memory overhead, since all
of the code in these processes will be stored only once using your
operating system's copy-on-write features.

To use preloading, just give Einhorn a `-p PATH_TO_CODE`, and make
sure you've defined an `einhorn_main` method.

In order to maximize compatibility, we've worked to minimize Einhorn's
dependencies. It has no dependencies outside of the Ruby standard
library.

### Command name

You can set the name that Einhorn and your workers show in `ps`. Just
pass `-c <name>`.

### Options

    -b, --bind ADDR                  Bind an address and add the corresponding FD to EINHORN_FDS
    -c, --command-name CMD_NAME      Set the command name in ps to this value
    -d, --socket-path PATH           Where to open the Einhorn command socket
    -e, --pidfile PIDFILE            Where to write out the Einhorn pidfile
    -f, --lockfile LOCKFILE          Where to store the Einhorn lockfile
    -g, --command-socket-as-fd       Leave the command socket open as a file descriptor, passed in the EINHORN_SOCK_FD environment variable. This allows your worker processes to ACK without needing to know where on the filesystem the command socket lives.
    -h, --help                       Display this message
    -k, --kill-children-on-exit      If Einhorn exits unexpectedly, gracefully kill all its children
    -l, --backlog N                  Connection backlog (assuming this is a server)
    -m, --ack-mode MODE              What kinds of ACK to expect from workers. Choices: FLOAT (number of seconds until assumed alive), manual (process will speak to command socket when ready). Default is MODE=1.
    -n, --number N                   Number of copies to spin up
    -p, --preload PATH               Load this code into memory, and fork but do not exec upon spawn. Must define an "einhorn_main" method
    -q, --quiet                      Make output quiet (can be reconfigured on the fly)
    -s, --seconds N                  Number of seconds to wait until respawning
    -v, --verbose                    Make output verbose (can be reconfigured on the fly)
        --with-state-fd STATE        [Internal option] With file descriptor containing state
        --version                    Show version

## Contributing

Contributions are definitely welcome. To contribute, just follow the
usual workflow:

1. Fork Einhorn
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Added some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create a new GitHub pull request

## History

Einhorn came about when Stripe was investigating seamless code
upgrading solutions for our API worker processes. We really liked the
process model of [Unicorn](http://unicorn.bogomips.org/), but didn't
want to use its HTTP functionality. So Einhorn was born, providing the
master process functionality of Unicorn (and similar preforking
servers) to a wider array of applications.

See https://stripe.com/blog/meet-einhorn for more background.

Stripe currently uses Einhorn in production for a number of
services. Our Thin + EventMachine servers currently require patches to
both Thin and EventMachine (to support file-descriptor passing). You
can obtain these patches from our public forks of the
[respective](https://github.com/stripe/thin)
[projects](https://github.com/stripe/eventmachine). Check out
`example/thin_example` for an example of running Thin under Einhorn.

## Compatibility

Einhorn was developed and tested under Ruby 1.8.7.

## About

Einhorn is a project of [Stripe](https://stripe.com), led by [Greg
Brockman](https://twitter.com/thegdb). Feel free to get in touch at
info@stripe.com.