Name
ngx_chunkin - HTTP 1.1 chunked-encoding request body support for Nginx.
*This module is not distributed with the Nginx source.* See the
installation instructions.
Status
This module is considered production ready.
Version
This document describes chunkin-nginx-module v0.21
(<http://github.com/agentzh/chunkin-nginx-module/tarball/v0.21>)
released on August 3, 2010.
Synopsis
    chunkin on;
    error_page 411 = @my_411_error;
    location @my_411_error {
        chunkin_resume;
    }
    location /foo {
        # your fastcgi_pass/proxy_pass/set/if and
        # any other config directives go here...
    }
    ...
    chunkin on;
    error_page 411 = @my_411_error;
    location @my_411_error {
        chunkin_resume;
    }
    location /bar {
        chunkin_keepalive on;  # WARNING: too experimental!
        # your fastcgi_pass/proxy_pass/set/if and
        # any other config directives go here...
    }
Description
This module adds HTTP 1.1 chunked
(<http://tools.ietf.org/html/rfc2616#section-3.6.1>) input support for
Nginx without the need of patching the Nginx core.
Behind the scenes, it registers an access-phase handler that eagerly
reads and decodes the incoming request body when a "Transfer-Encoding:
chunked" header triggers a 411 error page in Nginx. For requests that
do not use the "chunked" transfer encoding, this module is a no-op.
To enable the magic, just turn on the chunkin config option and define a
custom "411 error_page" using chunkin_resume, like this:
    server {
        chunkin on;
        error_page 411 = @my_411_error;
        location @my_411_error {
            chunkin_resume;
        }
        ...
    }
No other modification is required in your nginx.conf file, and
everything should work out of the box, including the standard proxy
module (HttpProxyModule), except for the known issues. Note that the
chunkin directive is not allowed in location blocks, while the
chunkin_resume directive is only allowed in location blocks.
The core module's client_body_buffer_size, client_max_body_size, and
client_body_timeout directive settings are honored. Note that the "body
sizes" here always refer to the chunked-encoded body, not the decoded
data. The chunked-encoded body is always slightly larger than the
original, unencoded data.
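To make the size difference concrete, here is a minimal Python sketch
(not part of this module) that encodes a body with the chunked transfer
coding and measures the overhead. The function name "encode_chunked" and
the 1 KB chunk size are assumptions chosen only for illustration.

```python
def encode_chunked(data: bytes, chunk_size: int) -> bytes:
    """Encode data with HTTP/1.1 chunked transfer coding.

    Illustration only: no chunk-extensions and no trailers are emitted.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    out = bytearray()
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        out += b"%x\r\n" % len(chunk)   # chunk size in hex, then CRLF
        out += chunk + b"\r\n"          # chunk data, then CRLF
    out += b"0\r\n\r\n"                 # last chunk plus the final CRLF
    return bytes(out)


body = b"x" * 4096                      # decoded payload: 4096 bytes
encoded = encode_chunked(body, 1024)    # four 1 KB chunks
overhead = len(encoded) - len(body)
print(len(body), len(encoded), overhead)  # prints: 4096 4129 33
```

In this example it is the 4129-byte encoded size, not the 4096-byte
decoded size, that would be compared against the limits above.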
The client_body_in_file_only and client_body_in_single_buffer settings
are only partially honored. See Known Issues.
This module is not supposed to be merged into the Nginx core because
I've used Ragel (<http://www.complang.org/ragel/>) to generate the
chunked encoding parser for joy :)
How it works
Nginx explicitly checks for a chunked "Transfer-Encoding" header and a
missing "Content-Length" header very early in request processing, as
early as the "ngx_http_process_request_header" function. So this module
takes a rather tricky approach: it uses an output filter to intercept
the "411 Length Required" error page response issued by
"ngx_http_process_request_header", fixes things up, and finally issues
an internal redirect to the current location. Request processing then
restarts from those phases we all know and love, this time bypassing
the horrible "ngx_http_process_request_header" function.
In the "access" phase of the newly created request, this module eagerly
reads in the chunked request body in a way similar to that of the
standard "ngx_http_read_client_request_body" function, but using its own
chunked parser generated by Ragel. The decoded request body is put into
"r->request_body->bufs" and a corresponding "Content-Length" header is
inserted into "r->headers_in".
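The decoding step can be sketched in a few lines of Python. This is only
an illustration of the chunked transfer coding itself, not the
Ragel-generated parser used by this module: the name "decode_chunked" is
hypothetical, the whole body is assumed to be in memory, chunk-extensions
are skipped, and trailers are rejected rather than parsed.

```python
import string
from typing import Tuple

HEX_DIGITS = set(string.hexdigits.encode("ascii"))


def decode_chunked(raw: bytes) -> Tuple[bytes, int]:
    """Decode a complete chunked body.

    Returns (decoded_body, bytes_consumed). Bytes after bytes_consumed
    belong to whatever follows on the connection.
    """
    body = bytearray()
    pos = 0
    while True:
        eol = raw.find(b"\r\n", pos)
        if eol < 0:
            raise ValueError("incomplete chunk-size line")
        # Anything after ';' is a chunk-extension; it is ignored here.
        size_text = raw[pos:eol].split(b";", 1)[0].strip(b" \t")
        if not size_text or any(c not in HEX_DIGITS for c in size_text):
            raise ValueError("bad chunk size at offset %d" % pos)
        size = int(size_text, 16)
        pos = eol + 2
        if size == 0:
            # Last chunk: this sketch expects the final CRLF right away.
            if raw[pos:pos + 2] != b"\r\n":
                raise ValueError("trailers are not handled in this sketch")
            return bytes(body), pos + 2
        chunk_end = pos + size
        if raw[chunk_end:chunk_end + 2] != b"\r\n":
            raise ValueError("chunk not terminated by CRLF at offset %d"
                             % chunk_end)
        body += raw[pos:chunk_end]
        pos = chunk_end + 2


raw = b"4\r\nhell\r\n1;note=x\r\no\r\n0\r\n\r\n"
body, consumed = decode_chunked(raw)
# The decoded length is what a synthesized Content-Length would carry.
headers = {"Content-Length": str(len(body))}
print(body, consumed, headers)
```

Once the body is decoded and its length is known, downstream consumers
can treat the request as an ordinary request with a "Content-Length"
header.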
Modules that use the standard "ngx_http_read_client_request_body"
function to read the request body will just work out of the box, because
"ngx_http_read_client_request_body" returns immediately when it sees
that "r->request_body->bufs" already exists.
Special efforts have been made to reduce data copying and dynamic memory
allocation.
Directives
chunkin
syntax: *chunkin on|off*
default: *off*
context: *http, server*
phase: *access*
Enables or disables this module's hooks.
chunkin_resume
syntax: *chunkin_resume*
default: *no*
context: *location*
phase: *content*
This directive must be used in your custom "411 error page" location to
help this module work correctly. For example:
    error_page 411 = @my_error;
    location @my_error {
        chunkin_resume;
    }
For the technical reason behind the necessity of this directive, please
read the "nginx-devel" thread Content-Length is not ignored for chunked
requests: Nginx violates RFC 2616
(<http://nginx.org/pipermail/nginx-devel/2009-December/000041.html>).
This directive was first introduced in the v0.17 release.
chunkin_max_chunks_per_buf
syntax: *chunkin_max_chunks_per_buf <number>*
default: *512*
context: *http, server, location*
Sets the maximum chunk count for the buffer whose size is determined by
the client_body_buffer_size directive. If the average chunk size is
1 KB and your client_body_buffer_size setting is 1 MB, then you should
set this threshold to 1024 or 2048.
When the raw body size exceeds client_body_buffer_size *or* the chunk
counter exceeds this "chunkin_max_chunks_per_buf" setting, the decoded
data is temporarily buffered into disk files, the main buffer is
cleared, and the chunk counter is reset to 0 (or 1 if there is a
"pending chunk").
This directive was first introduced in the v0.17 release.
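The interaction between the two thresholds can be illustrated with a
small Python model. This is a simplified sketch, not the module's code:
the function name "simulate_spills" is hypothetical, the "pending chunk"
case is ignored, and both thresholds are assumed to be strict ("greater
than") comparisons.

```python
from typing import Iterable, List


def simulate_spills(raw_chunk_sizes: Iterable[int],
                    buffer_size: int,
                    max_chunks: int = 512) -> List[str]:
    """Return the reason ("size" or "count") for each spill to disk."""
    raw_bytes = 0   # raw (still encoded) bytes in the main buffer
    chunks = 0      # chunks counted since the last spill
    reasons = []
    for raw_len in raw_chunk_sizes:
        raw_bytes += raw_len
        chunks += 1
        if raw_bytes > buffer_size:
            reasons.append("size")
        elif chunks > max_chunks:
            reasons.append("count")
        else:
            continue
        # Spill: the buffer is cleared and the counter starts over.
        raw_bytes = 0
        chunks = 0
    return reasons


ONE_MB = 1024 * 1024
# 1024 chunks of 1 KB each; every chunk costs 1031 raw bytes on the wire
# ("400\r\n" + 1024 data bytes + "\r\n").
chunks_1k = [1031] * 1024

print(simulate_spills(chunks_1k, ONE_MB, max_chunks=512))   # ['count']
print(simulate_spills(chunks_1k, ONE_MB, max_chunks=2048))  # ['size']
```

With the default of 512, the model spills when the 1 MB buffer is only
about half full, which is why a threshold of 1024 or 2048 suits 1 KB
chunks better.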
chunkin_keepalive
syntax: *chunkin_keepalive on|off*
default: *off*
context: *http, server, location, if*
Turns on or turns off HTTP 1.1 keep-alive and HTTP 1.1 pipelining
support.
Keep-alive without pipelining should be quite stable but pipelining
support is very preliminary, limited, and almost untested.
This directive was first introduced in the v0.07 release.
Technical note on the HTTP 1.1 pipelining support
The basic idea is to copy the bytes left by my chunked parser in
"r->request_body->buf" over into "r->header_in" so that nginx's
"ngx_http_set_keepalive" and "ngx_http_init_request" functions will pick
it up for the subsequent pipelined requests. When the request body is
small enough to be completely preread into the "r->header_in" buffer,
then no data copy is needed here -- just setting "r->header_in->pos"
correctly will suffice.
The only issue that remains is how to enlarge "r->header_in" when the
data left in "r->request_body->buf" is too large to be held in the
remaining room between "r->header_in->pos" and "r->header_in->end". For
now, this module will just give up and simply turn off "r->keepalive".
I know we can always use exactly the remaining room in "r->header_in" as
the buffer size when reading data from "c->recv", but that's suboptimal
when the remaining room in "r->header_in" happens to be very small while
"r->request_body->buf" is quite large.
I haven't fully grokked all the details among "r->header_in",
"c->buffer", busy/free lists and those so-called "large header buffers".
Is there a clean and safe way to reallocate or extend the "r->header_in"
buffer?
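The fallback described above can be modeled in Python as follows. This
is a loose sketch of the decision only: the "Buf" class and the
"stash_leftover" function are hypothetical stand-ins for nginx's buffer
structure and for this module's internal logic, and the real code works
with C pointers rather than indexes.

```python
from dataclasses import dataclass


@dataclass
class Buf:
    data: bytearray  # fixed-size storage; len(data) plays the role of "end"
    pos: int = 0     # start of the unparsed bytes
    last: int = 0    # end of the valid bytes


def stash_leftover(header_in: Buf, leftover: bytes) -> bool:
    """Copy bytes left over by the chunked parser into header_in.

    Returns True if keep-alive can stay on, or False if the leftover
    does not fit and keep-alive has to be turned off.
    """
    room = len(header_in.data) - header_in.pos
    if len(leftover) > room:
        return False  # give up: the caller turns keep-alive off
    end = header_in.pos + len(leftover)
    header_in.data[header_in.pos:end] = leftover
    header_in.last = end
    return True


roomy = Buf(bytearray(16), pos=10)
print(stash_leftover(roomy, b"GET /"))    # True: 5 bytes fit into 6
cramped = Buf(bytearray(16), pos=14)
print(stash_leftover(cramped, b"GET /"))  # False: only 2 bytes of room
```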
Troubleshooting
When this module is combined with ngx_proxy or ngx_fastcgi, nginx sends
an invalid "Transfer-Encoding: " header to the backend, which some
backend web servers, for example riak, do not handle well. A workaround
for now is to use the ngx_headers_more module to remove the
"Transfer-Encoding" header completely, as in
    chunkin on;
    error_page 411 = @my_411_error;
    location @my_411_error {
        chunkin_resume;
    }
    location / {
        more_clear_input_headers 'Transfer-Encoding';
        proxy_pass http://riak;
    }
Thanks hoodoos (<http://github.com/hoodoos>) for sharing this trick :)
Installation
Grab the nginx source code from nginx.net (<http://nginx.net/>), for
example, the version 0.8.41 (see nginx compatibility), and then build
the source with this module:
    $ wget 'http://sysoev.ru/nginx/nginx-0.8.41.tar.gz'
    $ tar -xzvf nginx-0.8.41.tar.gz
    $ cd nginx-0.8.41/
    # Here we assume you would install your nginx under /opt/nginx/.
    $ ./configure --prefix=/opt/nginx \
        --add-module=/path/to/chunkin-nginx-module
    $ make -j2
    $ make install
Download the latest version of the release tarball of this module from
chunkin-nginx-module file list
(<http://github.com/agentzh/chunkin-nginx-module/downloads>).
For Developers
The chunked parser is generated by Ragel
(<http://www.complang.org/ragel/>). If you want to regenerate the
parser's C file, i.e., src/chunked_parser.c
(<http://github.com/agentzh/chunkin-nginx-module/blob/master/src/chunked_parser.c>),
use the following command from the root of the chunkin module's source
tree:
    $ ragel -G2 src/chunked_parser.rl
Packages from users
Fedora 13 RPM files
The following source and binary rpm files are contributed by Ernest
Folch, with nginx 0.8.54, ngx_chunkin v0.21 and ngx_headers_more v0.13:
* nginx-0.8.54-1.fc13.src.rpm
(<http://agentzh.org/misc/nginx/ernest/nginx-0.8.54-1.fc13.src.rpm>)
* nginx-0.8.54-1.fc13.x86_64.rpm
(<http://agentzh.org/misc/nginx/ernest/nginx-0.8.54-1.fc13.x86_64.rpm>)
Compatibility
The following versions of Nginx should work with this module:
* 1.0.x (last tested: 1.0.2)
* 0.8.x (last tested: 0.8.54)
* 0.7.x >= 0.7.21 (last tested: 0.7.67)
Earlier versions of Nginx like 0.6.x and 0.5.x will *not* work.
If you find that any particular version of Nginx above 0.7.21 does not
work with this module, please consider reporting a bug.
Report Bugs
Although a lot of effort has been put into testing and code tuning,
there must be some serious bugs lurking somewhere in this module. So
whenever you are bitten by any quirks, please don't hesitate to
1. send a bug report or even patches to <agentzh@gmail.com>,
2. or create a ticket on the issue tracking interface
(<http://github.com/agentzh/chunkin-nginx-module/issues>) provided
by GitHub.
Source Repository
Available on github at agentzh/chunkin-nginx-module
(<http://github.com/agentzh/chunkin-nginx-module>).
ChangeLog
v0.21
* applied a patch from Gong Kaihui (龚开晖) to always call "post_handler"
in "ngx_http_chunkin_read_chunked_request_body".
v0.20
* fixed a bug that may read incomplete chunked body. thanks Gong
Kaihui (龚开晖).
* fixed various memory issues in the implementation which may cause
nginx processes to crash.
* added support for chunked PUT requests.
* now we always require "error_page 411 @resume" and no default
(buggy) magic any more. thanks Gong Kaihui (龚开晖).
v0.19
* we now use ragel -G2 to generate the chunked parser and we're 36%
faster.
* we now eagerly read the data octets in the chunked parser and we're
43% faster.
v0.18
* added support for "chunk-extension" to the chunked parser as per RFC
2616 (<http://tools.ietf.org/html/rfc2616#section-3.6.1>), but we
just ignore them (if any) because we don't understand them.
* added more diagnostic information for certain error messages.
v0.17
* implemented the chunkin_max_chunks_per_buf directive to allow
overriding the default 512 setting.
* we now bypass nginx's discard request body bug
(<http://nginx.org/pipermail/nginx-devel/2009-December/000041.html>)
by requiring our users to define explicit "411 error_page" with
chunkin_resume in the error page location. Thanks J for reporting
related bugs.
* fixed "r->phase_handler" in our post read handler. our handler may
run one more time before :P
* the chunkin handler now returns "NGX_DECLINED" rather than "NGX_OK"
when our "ngx_http_chunkin_read_chunked_request_body" function
returns "NGX_OK", to avoid bypassing other access-phase handlers.
v0.16
* turned off ddebug in the previous release. thanks J for reporting
it.
v0.15
* fixed a regression that ctx->chunks_count never incremented in
earlier versions.
v0.14
* now we no longer skip those operations between the (interrupted)
ngx_http_process_request_header and the server rewrite phase. this
fixed the security issues regarding the internal directive as well
as SSL sessions.
* try to ignore CR/LF/SP/HT at the beginning of the chunked body.
* now we allow HT as padding spaces and ignore leading CRLFs.
* improved diagnostic info in the error.log messages when parsefail
occurs.
v0.11
* added a random valid-chunked-request generator in t/random.t.
* fixed a new connection leak issue caught by t/random.t.
v0.10
* fixed a serious bug in the chunked parser grammar: there would be
ambiguity when CRLF appears in the chunked data sections. Thanks J
for reporting it.
v0.08
* fixed gcc compilation errors on x86_64, thanks J for reporting it.
* used the latest Ragel 6.6 to generate the "chunked_parser.c" file in
the source tree.
v0.07
* marked the discarded 411 error page's output chain bufs as consumed
by setting "buf->pos = buf->last". (See this nginx-devel thread
(<http://nginx.org/pipermail/nginx-devel/2009-December/000025.html>)
for more details.)
* added the chunkin_keepalive directive which can enable HTTP 1.1
keep-alive and HTTP 1.1 pipelining, and defaults to "off".
* fixed the "alphtype" bug in the Ragel parser spec; which caused
rejection of non-ascii octets in the chunked data. Thanks J for his
bug report.
* added "Test::Nginx::Socket" to test our nginx module on the socket
level. Thanks J for his bug report.
* rewrote the bufs recycling part and preread-buf-to-rb-buf transition
part, also refactored the Ragel parser spec, thus eliminating lots
of serious bugs.
* provided better diagnostics in the error log message for "bad
chunked body" parsefails in the chunked parser. For example:
2009/12/02 17:35:52 [error] 32244#0: *1 bad chunked body (offset 7, near "4^M
hell <-- HERE o^M
0^M
^M
", marked by " <-- HERE ").
, client: 127.0.0.1, server: localhost, request: "POST /main
HTTP/1.1", host: "localhost"
* added some code to let the chunked parser handle special 0-size
chunks that are not the last chunk.
* fixed a connection leak bug regarding incorrect "r->main->count"
reference counter handling for nginx 0.8.11+ (well, the
"ngx_http_read_client_request_body" function in the nginx core also
has this issue, I'll report it later.)
v0.06
* minor optimization: we won't traverse the output chain link if the
chain count is not large enough.
Test Suite
This module comes with a Perl-driven test suite. The test cases
(<http://github.com/agentzh/chunkin-nginx-module/tree/master/test/t/>)
are declarative
(<http://github.com/agentzh/chunkin-nginx-module/blob/master/test/t/sanity.t>)
too, thanks to the Test::Base
(<http://search.cpan.org/perldoc?Test::Base>) module in the Perl world.
To run it on your side:
    $ cd test
    $ PATH=/path/to/your/nginx-with-chunkin-module:$PATH prove -r t
You need to terminate any Nginx processes before running the test suite
if you have changed the Nginx server binary.
At the moment, LWP::UserAgent
(<http://search.cpan.org/perldoc?LWP::UserAgent>) is used by the test
scaffold
(<http://github.com/agentzh/chunkin-nginx-module/blob/master/test/lib/Test/Nginx/LWP.pm>)
for simplicity.
Because a single nginx server (by default, "localhost:1984") is used
across all the test scripts (".t" files), it's meaningless to run the
test suite in parallel by specifying "-jN" when invoking the "prove"
utility.
Some parts of the test suite require the proxy and echo modules to be
enabled when building Nginx.
Known Issues
* May not work with certain 3rd party modules like the upload module
(<http://www.grid.net.ru/nginx/upload.en.html>) because it
implements its own request body reading mechanism.
* "client_body_in_single_buffer on" may *not* be obeyed for short
contents and fast network.
* "client_body_in_file_only on" may *not* be obeyed for short contents
and fast network.
* HTTP 1.1 pipelining may not fully work yet.
TODO
* make the chunkin handler run at the end of the "access phase" rather
than beginning.
* add support for "trailers" as specified in the RFC 2616
(<http://tools.ietf.org/html/rfc2616#section-3.6.1>).
* fix the known issues.
Getting involved
You are very welcome to submit patches to the author or to ask for a
commit bit to the source repository on GitHub.
Author
Zhang "agentzh" Yichun (章亦春) *<agentzh@gmail.com>*
This wiki page is also maintained by the author himself, and everybody
is encouraged to improve this page as well.
Copyright & License
The basic client request body reading code is based on the
"ngx_http_read_client_request_body" function and its utility functions
in the Nginx 0.8.20 core. This part of code is copyrighted by Igor
Sysoev.
Copyright (c) 2009, Taobao Inc., Alibaba Group (http://www.taobao.com).
Copyright (c) 2009, 2010, 2011, Zhang "agentzh" Yichun (章亦春)
<agentzh@gmail.com>.
This module is licensed under the terms of the BSD license.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of the Taobao Inc. nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
See Also
* The original thread on the Nginx mailing list that inspires this
module's development: "'Content-Length' header for POSTs"
(<http://forum.nginx.org/read.php?2,4453,20543>).
* The original announcement thread on the Nginx mailing list: "The
chunkin module: Experimental chunked input support for Nginx"
(<http://forum.nginx.org/read.php?2,22967>).
* The original blog post
(<http://agentzh.spaces.live.com/blog/cns!FF3A735632E41548!481.entry>)
about this module's initial development.
* The thread discussing chunked input support on the nginx-devel
mailing list: "Chunked request body and HTTP header parser"
(<http://nginx.org/pipermail/nginx-devel/2009-December/000021.html>).
* The echo module (HttpEchoModule), used for automated testing of Nginx
modules.
* RFC 2616 - Chunked Transfer Coding
(<http://tools.ietf.org/html/rfc2616#section-3.6.1>).