META-ISSUE: Performance task force #1734

zimmerle · 2018-04-06T21:31:29Z

Performance discussion in an environment such as libModSecurity, which is deployed in different manners and meant to be used in different scenarios, is always interesting because it leads to circumstances where a benefit for some may be a drawback for others. That said, this ticket will be open forever :).

The definition of performance optimization is: "The process of making something, especially a computer system, work as effectively as possible."

Because some optimizations may benefit only a few, different angles and points of view are expected to be discussed. Luckily for us, numbers are the same numbers for everybody. :)

The main point is: what counts as good or bad performance?

Usually performance is a trade-off between CPU, memory, and I/O, arranged as a triangle: decrease the usage of one and you increase the usage of another. libModSecurity was designed to support low latency, even if it costs throughput. That is kind of desirable in a cloud-like deployment.

How to measure the performance

Usually ModSecurity is deployed together with different components:

libModSecurity
Connector: ModSecurity-nginx, ModSecurity-apache [etc].
The webserver: nginx, apache [etc].

To avoid confusion: the numbers here are about the library (libModSecurity) only. The library already ships the benchmark utility and the stap scripts necessary to plot the data.

Other testing components, tools and methods may be suitable for their own tickets and discussion.

What the numbers mean

Do "bad" benchmark numbers mean that my web application will be that bad? Not necessarily, although there can be a relation. In a real-world deployment other factors, such as network latency and the back-end application, add substantial processing time, making the overhead of the WAF negligible, at least in a theoretical scenario where well-performing rules meet a performant core.

Notice that the request also plays an important role in this equation, and it is hard to choose what is a typical one these days.

Knowing all those variables, the numbers give us a strong indication of where to look in order to improve performance.

How can I send a suggestion here?

The idea is to take everybody's input into consideration. The decision will be whatever is best for the community overall, so, in order for everybody to understand your performance problem, please explain your suggestion by pointing to facts.

I want to participate but I'm not following the discussions

There are a few tasks that you may want to help with:

Variable computation up to utilization

ModSecurity v3 computes all variables regardless of whether they are used; the variables are gradually filled every time a new piece of information is delivered to ModSecurity. It does not need to be that way. As the rules are pre-compiled, a variable only needs to be computed (and memory allocated for it) if it is really used. The architecture already fits this implementation; it just needs to be done. In this process, patches that remove configuration directives that will be deprecated are more than welcome, for example SecResponseBodyAccess.
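A minimal sketch of the idea, written in Python only to keep it short (the library itself is C++). Everything here is hypothetical and not libModSecurity API: the `LazyVariables` class, the `referenced_variables` helper, and the dictionary shape used for a "rule" are illustration only. The point it shows is that the set of referenced variables can be derived once from the pre-compiled rules, and each variable is then computed on first access.

```python
from typing import Callable, Dict, Iterable, List, Set


def referenced_variables(rules: Iterable[dict]) -> Set[str]:
    """Collect the variable names that the pre-compiled rules actually read."""
    return {name for rule in rules for name in rule["variables"]}


class LazyVariables:
    """Per-transaction store that materializes a variable on first access."""

    def __init__(self, producers: Dict[str, Callable[[], str]], used: Set[str]):
        # Producers for variables no rule references are dropped up front,
        # so they can never be computed or allocated for this transaction.
        self._producers = {n: p for n, p in producers.items() if n in used}
        self._values: Dict[str, str] = {}
        self.computed: List[str] = []  # materialization order, for inspection

    def get(self, name: str) -> str:
        if name not in self._values:
            self._values[name] = self._producers[name]()  # KeyError if unused
            self.computed.append(name)
        return self._values[name]


if __name__ == "__main__":
    rules = [{"id": 1, "variables": ["REQUEST_URI"]}]
    line = "GET /index.php?p=X HTTP/1.1"
    tx = LazyVariables(
        {
            "REQUEST_LINE": lambda: line,
            "REQUEST_URI": lambda: line.split(" ")[1],
        },
        referenced_variables(rules),
    )
    print(tx.get("REQUEST_URI"))  # computed now, on first use
    print(tx.computed)            # REQUEST_LINE was never materialized
```

In this sketch the cost of an unused variable is zero per transaction; the price is one dictionary lookup on every access.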

Memory pools to avoid memory fragmentation

Memory fragmentation in our case is a consequence of the very small pieces of information that are allocated in each transaction. On busy servers that will most certainly be an issue. Further info: Memory fragmentation
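To make the pool idea concrete, here is a conceptual sketch in Python (chosen for brevity; a real implementation would live in the C++ core). The `TransactionArena` class is hypothetical. It shows a bump allocator: one buffer is reserved per transaction, small values are carved out of it sequentially, and everything is released in one step when the transaction ends, so the general-purpose allocator never sees the many tiny allocations.

```python
class TransactionArena:
    """Bump allocator: hands out slices of one preallocated buffer."""

    def __init__(self, capacity: int):
        self._buf = bytearray(capacity)      # single up-front allocation
        self._view = memoryview(self._buf)
        self._used = 0

    def store(self, data: bytes) -> memoryview:
        """Copy `data` into the arena and return a view of the stored bytes."""
        end = self._used + len(data)
        if end > len(self._buf):
            raise MemoryError("arena exhausted")
        self._view[self._used:end] = data
        chunk = self._view[self._used:end]
        self._used = end
        return chunk

    def reset(self) -> None:
        """Release everything at once; earlier views must not be used afterwards."""
        self._used = 0

    @property
    def used(self) -> int:
        return self._used


if __name__ == "__main__":
    arena = TransactionArena(64)
    name = arena.store(b"Host")
    value = arena.store(b"example.com")
    print(bytes(name), bytes(value), arena.used)
    arena.reset()  # end of transaction: one operation frees all values
```

The sketch fails hard when the buffer is full; a production pool would more likely chain additional blocks instead.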

Technical investigation on the feasibility to reduce the unused variables or variables with same or similar content.

There are some variables that may hold the same content, for example:

https://github.com/SpiderLabs/ModSecurity/wiki/Reference-Manual-(v2.x)#REQUEST_LINE
https://github.com/SpiderLabs/ModSecurity/wiki/Reference-Manual-(v2.x)#REQUEST_URI
https://github.com/SpiderLabs/ModSecurity/wiki/Reference-Manual-(v2.x)#REQUEST_URI_RAW

Do they need to co-exist? If so, can we use offsets in a way to use the same memory space to represent them?
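A small sketch of the offset idea, in Python for brevity. It assumes the derived values are byte-for-byte substrings of the request line, which will not hold for every variable or every request. The `Span` type and both helper functions are hypothetical names. Each variable is stored as a (start, end) pair into the single request-line buffer rather than as its own copy.

```python
from typing import NamedTuple, Tuple


class Span(NamedTuple):
    start: int
    end: int


def split_request_line(line: bytes) -> Tuple[Span, Span, Span]:
    """Return spans for method, raw URI and protocol inside `line`."""
    first = line.index(b" ")
    last = line.rindex(b" ")
    return Span(0, first), Span(first + 1, last), Span(last + 1, len(line))


def view(buf: memoryview, span: Span) -> memoryview:
    """Zero-copy view of one variable inside the shared buffer."""
    return buf[span.start:span.end]


if __name__ == "__main__":
    line = b"GET /index.php?p=X HTTP/1.1"
    method, uri, protocol = split_request_line(line)
    buf = memoryview(line)
    print(uri)                    # offsets only, no payload bytes copied
    print(bytes(view(buf, uri)))  # copy made here just for printing
```

Variables whose value differs from the raw bytes (for example after normalization) would still need their own storage, which is consistent with the point that the variables themselves must keep existing.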

Support for cloud-like collection storage.

Collections are now saved into a key-value storage, which has the benefit of simplicity but is not optimal in terms of performance. Handing that off to an external process will lead to a good reduction of CPU usage.

Performant implementation for popular transformations.

Some of the transformations are executed too many times. Some already do in-memory operations, but better logic in the implementation will help a lot. Assembly is welcome here.

[to be updated with new stuff as needed.]
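One possible way to attack the "executed too many times" part is to remember the result of a transformation chain within a transaction, so that several rules applying the same chain to the same value pay for it once. The sketch below is in Python for brevity and is not how the library does it: `TransformCache` is a hypothetical name, and `str.lower` and `str.strip` are only stand-ins for the real `t:lowercase` and `t:trim` implementations.

```python
from typing import Callable, Dict, Tuple

# Stand-ins for real transformations; the names follow the t:... actions.
TRANSFORMS: Dict[str, Callable[[str], str]] = {
    "lowercase": str.lower,
    "trim": str.strip,
}


class TransformCache:
    """Per-transaction memo of (value, chain) -> transformed value."""

    def __init__(self) -> None:
        self._cache: Dict[Tuple[str, Tuple[str, ...]], str] = {}
        self.runs = 0  # how many transformation functions actually executed

    def apply(self, value: str, chain: Tuple[str, ...]) -> str:
        key = (value, chain)
        if key not in self._cache:
            result = value
            for name in chain:
                result = TRANSFORMS[name](result)
                self.runs += 1
            self._cache[key] = result
        return self._cache[key]


if __name__ == "__main__":
    cache = TransformCache()
    chain = ("trim", "lowercase")
    print(cache.apply("  /Admin ", chain))  # runs both transformations
    print(cache.apply("  /Admin ", chain))  # served from the cache
    print(cache.runs)
```

Whether the memory spent on the cache pays off depends on how often rules share chains, which is something the benchmark numbers would have to show.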

dune73 · 2018-04-07T19:52:41Z

References to the exchange on the mailing list

Requests per Second: Realworld scenario : https://sourceforge.net/p/mod-security/mailman/message/36286327/
Requests per Second: Localhost : https://sourceforge.net/p/mod-security/mailman/message/36285879/
Substantial Slowdown report : https://sourceforge.net/p/mod-security/mailman/message/36285471/
Performance Expectations : https://sourceforge.net/p/mod-security/mailman/message/36285446/
Limitations in core code and TODOs : https://sourceforge.net/p/mod-security/mailman/message/36285367/
Problem with Rule::evaluate : https://sourceforge.net/p/mod-security/mailman/message/36284403/

Related Issues

Cache/cheapen evaluated Rule variables : #1732
Reduce dynamic memory allocations in hot path code : #1731

dune73 · 2018-04-07T20:05:27Z

I do not understand the "designed to support low latency" part. What does this mean exactly? I take it this is not the latency that the client sees, or is it?

Other than that, I get the feeling these are all very good considerations, but we are not yet in a situation where libModSecurity has efficient code and further optimization demands priorities in the classic triangle. We are in a situation where inefficient code means subpar performance, and I do not see this mirrored above. Various people have taken a closer look at the source code and pointed out where they see room for optimization. There is only a small intersection with the ideas described in the introduction to this issue. We are missing a statement from the lead developer regarding these suggestions.

dune73 · 2018-04-07T20:08:00Z

REQUEST_LINE vs. REQUEST_URI vs. REQUEST_URI_RAW vs. REQUEST_FILENAME

Yes, they are all needed in different situations. If you can "use offsets in a way to use the same memory space to represent" them, then yes, by all means. But each of these 4 cousins is really important.

This is a powerful anti-evasion rule for example:

SecRule REQUEST_URI "!@streq %{REQUEST_URI_RAW}" \
    "id:11000,phase:1,deny,t:normalizePathWin,log,\
    msg:'URI evasion attempt'"

jeremyjpj0916 · 2020-02-04T07:19:58Z

We spent some time playing with ModSec v3/master + NGINX-connector and OWASP v3.2 CRS today too in our personal DCs (all the latest flagship releases of the libs as of 02/04/2020). Figured I would throw some of our results and configuration information out there for yah to digest:

We run NGINX + OpenResty + Kong (API Gateway application) for REST/SOAP/Websocket API services.

2020/02/04 06:53:02 [debug] Kong: 1.4.3
2020/02/04 06:48:58 [debug] ngx_lua: 10015
2020/02/04 06:48:58 [debug] nginx: 1015008
2020/02/04 06:48:58 [debug] Lua: LuaJIT 2.1.0-beta3

6 worker processes.

Environments:
4 Pods total, each with:
CPU: 6 cores
Memory: 10 GiB

Relevant OWASP Config info:

Paranoia Level: 1

Request ruleset minimizations:

# Remove Drupal App Specific Ruleset
SecRuleRemoveById 9001100-9001200
# Remove Wordpress App Specific Ruleset
SecRuleRemoveById 9002000-9002900
# Remove NextCloud App Specific Ruleset
SecRuleRemoveById 9003000-9003500
# Remove Dokuwiki App Specific Ruleset
SecRuleRemoveById 9004000-9004380
# Remove cPanel App Specific Ruleset
SecRuleRemoveById 9005000-9005100
# Remove XenForo App Specific Ruleset
SecRuleRemoveById 9006000-9006950
# Remove PHP Lang App Specific Ruleset(AFAIK we are protecting no PHP based APIs, all Java/.NET)
SecRuleRemoveByTag "language-php"
# Remove NodeJS Lang App Specific Ruleset(AFAIK we are protecting no NodeJS based APIs, all Java/.NET)
SecRuleRemoveByTag "language-javascript"

Response ruleset minimizations:

# Remove Response reading
SecRuleRemoveById 950000-999999

Generic ModSec Configurations(removed response body reading + disabled audit log for extra perf):

# -- Rule engine initialization ----------------------------------------------
SecRuleEngine On
SecRequestBodyAccess On
SecRule REQUEST_HEADERS:Content-Type "(?:application(?:/soap\+|/)|text/)xml" \
     "id:'200000',phase:1,t:none,t:lowercase,pass,nolog,ctl:requestBodyProcessor=XML"

SecRule REQUEST_HEADERS:Content-Type "application/json" \
     "id:'200001',phase:1,t:none,t:lowercase,pass,nolog,ctl:requestBodyProcessor=JSON"

SecRequestBodyLimit 13107200
SecRequestBodyNoFilesLimit 131072
SecRequestBodyLimitAction Reject

SecRule REQBODY_ERROR "!@eq 0" \
"id:'200002', phase:2,t:none,log,deny,status:400,msg:'Failed to parse request body.',logdata:'%{reqbody_error_msg}',severity:2"

SecRule MULTIPART_STRICT_ERROR "!@eq 0" \
"id:'200003',phase:2,t:none,log,deny,status:400, \
msg:'Multipart request body failed strict validation: \
PE %{REQBODY_PROCESSOR_ERROR}, \
BQ %{MULTIPART_BOUNDARY_QUOTED}, \
BW %{MULTIPART_BOUNDARY_WHITESPACE}, \
DB %{MULTIPART_DATA_BEFORE}, \
DA %{MULTIPART_DATA_AFTER}, \
HF %{MULTIPART_HEADER_FOLDING}, \
LF %{MULTIPART_LF_LINE}, \
SM %{MULTIPART_MISSING_SEMICOLON}, \
IQ %{MULTIPART_INVALID_QUOTING}, \
IP %{MULTIPART_INVALID_PART}, \
IH %{MULTIPART_INVALID_HEADER_FOLDING}, \
FL %{MULTIPART_FILE_LIMIT_EXCEEDED}'"

SecRule MULTIPART_UNMATCHED_BOUNDARY "@eq 1" \
    "id:'200004',phase:2,t:none,log,deny,msg:'Multipart parser detected a possible unmatched boundary.'"

SecPcreMatchLimit 1000
SecPcreMatchLimitRecursion 1000

SecRule TX:/^MSC_/ "!@streq 0" \
        "id:'200005',phase:2,t:none,deny,msg:'ModSecurity internal error flagged: %{MATCHED_VAR_NAME}'"


SecResponseBodyAccess Off
SecResponseBodyMimeType text/plain text/html text/xml
SecResponseBodyLimit 524288
SecResponseBodyLimitAction ProcessPartial

SecTmpDir /tmp/

SecDataDir /tmp/

#SecUploadDir /opt/modsecurity/var/upload/
#SecUploadKeepFiles RelevantOnly
#SecUploadFileMode 0600
#SecDebugLog /opt/modsecurity/var/log/debug.log
#SecDebugLogLevel 3
SecAuditEngine Off
SecAuditLogRelevantStatus "^(?:5|4(?!04))"
SecAuditLogParts ABIJDEFHZ
SecAuditLogType Concurrent
SecAuditLog /usr/local/kong/logs/modsec_audit.log
# Specify the path for concurrent audit logging.
#SecAuditLogStorageDir /opt/modsecurity/var/audit/
SecArgumentSeparator &
SecCookieFormat 0
SecUnicodeMapFile unicode.mapping 20127
SecStatusEngine Off

LoadTest environment:

Beefy machine running WRK client (https://github.com/wg/wrk), parameters:

TEST_DURATION (seconds) | 300
TEST_PAYLOAD_SIZE (kb) | 10
TEST_THREADS | 10
TEST_CONNECTIONS | 10

RESULTS

Before installing the WAF:

THROUGHPUT | 2185 TPS
BYTES | 10562966112
REQUESTS | 655874

E2E Client Latency Observed:

P50 | 8ms
P90 | 12ms
P95 | 15ms

After installing the WAF:

THROUGHPUT | 1397 TPS
BYTES | 6755109941
REQUESTS | 419445

E2E Client Latency Observed:

P50 | 12ms
P90 | 21ms
P95 | 23ms

Fairly substantial drop in throughput with this sample workload: roughly 36% (2185 → 1397 TPS), with P50 latency rising from 8ms to 12ms.
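For reference, the relative changes implied by the two result tables can be worked out as follows. This is a small Python helper (`relative_change` is just an illustrative name) fed with the figures reported above; nothing else is assumed.

```python
def relative_change(before: float, after: float) -> float:
    """Signed change from `before` to `after`, as a percentage of `before`."""
    return (after - before) / before * 100.0


# (before WAF, after WAF), taken from the RESULTS tables above.
results = {
    "throughput_tps": (2185, 1397),
    "p50_ms": (8, 12),
    "p90_ms": (12, 21),
    "p95_ms": (15, 23),
}

for name, (before, after) in results.items():
    print(f"{name}: {relative_change(before, after):+.1f}%")
# throughput_tps: -36.1%
# p50_ms: +50.0%
# p90_ms: +75.0%
# p95_ms: +53.3%
```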

Overall I don't consider the performance degradation to be enough for me to scrap use of the application. But based on our testing there is certainly room for improvement. I believe I have minimized the ruleset and optimized just about as much as I can for an API Gateway's use case (feel free to correct me if you can think of any rules irrelevant to my use case that I could drop to improve these numbers/performance).

Anyways thanks for all the hard work put into this application and the widely adopted rulesets! I remember glancing at ModSec in 2016 for NGINX and deciding against going down that rabbit hole at the time. Seems the application has matured and is much more flexible here in 2020 so cheers for that. Maybe a future release can dial in on areas for performance improvement (geared to help with throughput/latency bottlenecks associated with the WAF layer).

dune73 · 2020-02-04T11:06:14Z

Thank you for this extensive writeup. It's interesting to see what you can get out of it, when you drop rules that are of no interest to you.

zimmerle · 2020-02-04T12:11:07Z

Thank you for the notes @jeremyjpj0916. There is this wiki page - https://github.com/defanator/modsecurity-performance/wiki where @defanator carefully collects data on the performance of the engine with and without CRS loaded.

zimmerle self-assigned this Apr 6, 2018

victorhora added enhancement RIP - libmodsecurity 3.x Related to ModSecurity version 3.x labels Apr 6, 2018

p0pr0ck5 mentioned this issue Jul 7, 2018

Performance issue with commercial ruleset in modsecv3 on nginx #1832

Closed

This was referenced Dec 10, 2018

Reduce dynamic memory allocations in hot path code #1731

Closed

Should we cancel Macro Expansion support to tag action? #1950

Closed
