-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTTP normalization #2
Comments
Motivation#628 implements strict alphabet validation. However, for example, if an alphabet prohibits space (
Allowed characters (bytes) must be taken from the same configuration options as for #628. The encodings must be validated, see for example Traffic normalization for intrusion detection is well studied, see for example Network Intrusion Detection: Evasion, HPACK & QPACKHuffman decoder and encoders should be reviewed: at the moment we use 1-character decoding table which shows better performance than for nghttp2 and Nginx decoders #1176 (comment) . However, LiteSpeed uses large tables and batching to speedup Huffman encoding and decoding. Probably allowed characters (in sense of #628), already decoded (in sense of this issue) can be encoded in the large table. Also see #1207. References:
Modes of operationTo not to hurt performance in cases which don't require strong security, the feature should be configurable per-vhost and per-location in the same sense as #688. The transformation logic (as described in RFC 7230 5.7.2) for cookies and URI must be done by a configuration option (see also #902 ):
, e.g.
Following checks and transformations must be done:
Additional alphabets must be introduced to validate the strings after all the decodings. These alphabets may prevent double percent encodings (e.g.
Implementation requirementsIf none of the normalization option is specified, then the HTTP parser must not perform detailed processing and just validate allowed alphabet as now, i.e. there must be zero performance overhead if normalization isn't required by configuration. All the decoders and The normalization must be done before the cache processing to minimize different URI keys stored in the cache. Since it's unwished to grow current HTTP parser states set, the logic must be done in the plugable (external) FSM by conditional unlikely jump (no need to support the compilation directive any more). Also please fix the TODO for URI abs_path for more accurate filtering of injection attacks, e.g. it'd be good to be able to prohibit SIMDThere are SIMD implementations of UTF-8 validation or recoding (e.g. to/from UTF-16 or UTF-32). See for example
However, probably it makes sense to sacrifice SIMD to do percent-decoding, UTF-8 validation, validation of allowed character sets (in sense of #628 ) and transformations (path or arguments) in single pass. Tests and docsPlease update https://github.com/tempesta-tech/tempesta/wiki/Web-security Wiki page on finishing the task.
Further possible extensionsWe leave back-end server personality normalization for further development if there will be any real requests. Probably this won't be needed since we're going to provide full Web server functionality and leave really heavy processing logic to dedicated WAF solutions. HTTP responses also aren't normalized - we target initial attacks filtering instead of filtration of their consequences. Also the decoders set is very restricted, e.g. there is no lower case conversion or Microsoft %U decoding or unicode normalization, so please keep in mind possible further extension of the decoder. |
there is a chance that the server connection resources wont't be released on server disconnect and tempesta shutdown threafter so we have to explicitly call the tfw_connection_release() [17193.213542] ------------[ cut here ]------------ [17193.217477] Kernel BUG at ffffffffc04f06e7 [verbose debug info unavailable] [17193.217477] invalid opcode: 0000 [#1] SMP [17193.217477] Modules linked in: tfw_sched_ratio(O) tfw_sched_http(O) tfw_sched_hash(O) tempesta_fw(O) tempesta_db(O) tempesta_tls(O) bochs_drm ttm drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect ppdev input_leds led_class sg serio_raw sysimgblt parport_pc parport pcspkr button ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto mbcache sr_mod cdrom sd_mod ata_generic crc32c_intel psmouse ata_piix libata i2c_piix4 e1000 scsi_mod floppy [last unloaded: tempesta_tls] [17193.217477] CPU: 1 PID: 4288 Comm: sysctl Tainted: G O 4.9.35 #2 [17193.217477] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 [17193.217477] task: ffff93939a8f6700 task.stack: ffffa400005f0000 [17193.217477] RIP: 0010:[<ffffffffc04f06e7>] [<ffffffffc04f06e7>] tfw_sock_srv_del_conns+0xf7/0x110 [tempesta_fw] [17193.217477] RSP: 0018:ffffa400005f3ce8 EFLAGS: 00010202 [17193.217477] RAX: 0000000000000001 RBX: ffff93937642b2a0 RCX: ffffffffc0500a50 [17193.217477] RDX: ffff93937642b350 RSI: ffff93939ba9c700 RDI: ffff9393734a6420 [17193.217477] RBP: ffff93937642b2d8 R08: fffffffffffffffc R09: 0000000000000003 [17193.217477] R10: 0010000100023588 R11: 0000000000000000 R12: ffff9393734a63d8 [17193.217477] R13: ffff9393734a6410 R14: ffff9393734a6410 R15: ffffa400005f3f20 [17193.217477] FS: 00007fbc0655c880(0000) GS:ffff9393bfd00000(0000) knlGS:0000000000000000 [17193.217477] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [17193.217477] CR2: 00007fbc05c55d00 CR3: 0000000036565000 CR4: 00000000000006e0 [17193.217477] Stack: [17193.217477] ffff9393734a6410 ffff93939ba9c490 ffffffffc04f05f0 ffff93939ba9c480 [17193.217477] 0000561d95e7e2e0 ffffffffc04eceb8 ece581022d7a1302 ffffffffc0500fd0 [17193.217477] ffffffffc0500fd0 dead000000000200 dead000000000100 ffffffffc04f0835 [17193.217477] Call Trace: [17193.217477] [<ffffffffc04f05f0>] ? tfw_cfg_handle_ratio_predyn_opts+0x150/0x150 [tempesta_fw] [17193.217477] [<ffffffffc04eceb8>] ? tfw_sg_for_each_srv+0x58/0x90 [tempesta_fw] [17193.217477] [<ffffffffc04f0835>] ? tfw_clean_srv_groups+0x135/0x150 [tempesta_fw] [17193.217477] [<ffffffffc04d0c5f>] ? tfw_cfg_stop+0x6f/0xb0 [tempesta_fw] [17193.217477] [<ffffffffc04eb98d>] ? handle_sysctl_state_io+0x19d/0x1d0 [tempesta_fw] [17193.217477] [<ffffffffc04eb82a>] ? handle_sysctl_state_io+0x3a/0x1d0 [tempesta_fw] [17193.217477] [<ffffffffa55fc22e>] ? proc_sys_call_handler+0xde/0x100 [17193.217477] [<ffffffffa558b6ae>] ? __vfs_write+0x2e/0x160 [17193.217477] [<ffffffffa558bdab>] ? vfs_write+0xab/0x190 [17193.217477] [<ffffffffa558d14d>] ? SyS_write+0x4d/0xb0 [17193.217477] [<ffffffffa59278e4>] ? entry_SYSCALL_64_fastpath+0x17/0x98 [17193.217477] Code: 38 48 83 e8 38 4c 39 f5 74 23 4d 8b ac 24 88 00 00 00 4d 85 ed 74 14 49 8b 54 24 38 4c 89 e3 49 89 c4 48 39 d5 0f 85 46 ff ff ff <0f> 0b 5b 31 c0 5d 41 5c 41 5d 41 5e c3 66 90 66 2e 0f 1f 84 00 [17193.217477] RIP [<ffffffffc04f06e7>] tfw_sock_srv_del_conns+0xf7/0x110 [tempesta_fw] [17193.217477] RSP <ffffa400005f3ce8> [17193.473816] ---[ end trace c254541427767bd1 ]--- [17193.485634] [tempesta] Un-registering scheduler: hash [17193.518667] [tempesta] Un-registering scheduler: http [17193.550624] [tempesta] Un-registering scheduler: ratio [17193.585109] [tempesta] exiting... [17193.588362] kmem_cache_destroy tfw_srv_conn_cache: Slab cache still has objects [17193.594033] CPU: 0 PID: 4301 Comm: rmmod Tainted: G D O 4.9.35 #2 [17193.598002] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 [17193.598002] 0000000000000000 ffffffffa569ddf8 ffff93939a0bdb40 0000000000000000 [17193.598002] ffffffffa553c59e 00ff93939a0bdae0 ffffa400003b7e70 ffffa400003b7e70 [17193.598002] ffffa400003b7e80 ffffa400003b7e80 4501bb40293c7162 000000000000000a [17193.598002] Call Trace: [17193.598002] [<ffffffffa569ddf8>] ? dump_stack+0x46/0x5e [17193.598002] [<ffffffffa553c59e>] ? kmem_cache_destroy+0x23e/0x250 [17193.598002] [<ffffffffc04eb9eb>] ? tfw_exit+0x2b/0x60 [tempesta_fw] [17193.598002] [<ffffffffa54cb858>] ? SyS_delete_module+0x178/0x240 [17193.598002] [<ffffffffa5402064>] ? exit_to_usermode_loop+0x64/0x80 [17193.598002] [<ffffffffa59278e4>] ? entry_SYSCALL_64_fastpath+0x17/0x98 Signed-off-by: Denis Kirjanov <dk@tempesta-tech.com>
The issue was wrongly closed |
Review fixes #2 - fuse domain fronting ans strict host checking frang validations together - show full SNI in "unknown host" message
We can do this, because connection won't be removed while we have alvie response-request pair.
We can do this, because connection won't be removed while we have alvie response-request pair.
We need at least minimal HTTP requests normalization (like ngx_http_parse_complex_uri() from Nginx does it). The normalization must be implemented as a part of current HTTP FSM (to avoid double processing as in Nginx) and write normalized fields to appropriate part of TfwHttpReq.
Normalization depending on back-end server personality also must be done, however this is very customizable and expensive logic, so it should be possible to switch the functionality off. Thus it must be implemented in http_norm.h as plugable HTTP FSM states.
Basically, we shouldn't do the normalization if
Cache-Control: no-transform
is presented.Depends on #902. Linked with #1207.
The text was updated successfully, but these errors were encountered: