Skip to content
This repository has been archived by the owner on May 14, 2020. It is now read-only.

Improve invalid character rule (960901/920270) #323

Closed
zinoe opened this issue Mar 26, 2016 · 12 comments
Closed

Improve invalid character rule (960901/920270) #323

zinoe opened this issue Mar 26, 2016 · 12 comments
Assignees

Comments

@zinoe
Copy link

zinoe commented Mar 26, 2016

As part of the CRS3 paranoia project, we will be discussing possible stricter siblings for some rules. The goal of this series of issues is to discuss the exact details.

This issue brings up rule 960901 (2.2.x) / 920270 (3.0.0-rc1). Information on the proposed changes can be found here.

Please add your thoughts, proposals or concerns in the comment section.

@dune73
Copy link
Contributor

dune73 commented Apr 21, 2016

One thing I do not understand with this rule is, why Request-Header Referer is excluded. This allows the Referer to carry null characters.

The standard rule restricts to ascii 1-255. The 2.2.X rules carry an optional rule, which does 32-126 and includes REQUEST_BODY.

Paranoia Mode gives us more options. Here is my proposal:

920270  PL1 All but null        1-255
920271  PL2 All printable       32-126,128-255
920272  PL3 All low range ascii 32-126
920273  PL4 a-Z0-9              48-57,65-90,97-122

So, PL1 remains the same, old optional rule moves to PL3. PL2 and PL4 are new variants.
The PL4 variant demands to exclude a few additional request headers.

Outside of this, I am inclined to remove the exclusion for Referer on PL1/2/3. And I plan to add REQUEST_BODY to PL3 and PL4.

@dune73
Copy link
Contributor

dune73 commented Apr 22, 2016

Ran some tests. For 920273, I had to disable all the Request-Headers or FPs would skyrocket. Then I added - _ = & to reduce the remaining FPs. This brings us to the following stats based on a sample of 8K legitimate requests and 24K attacking request.

                Total   PL1/920270      PL2/920271      PL3/920272      PL4/920273
Vanilla:         8405            0               3              93            5778
Attack:         26960          456            1048            1053            9874

I see two issues with this. There are not enough real positives on PL3 and still too many false positives in PL4.

@zinoe
Copy link
Author

zinoe commented Apr 22, 2016

@dune73: There are not enough real positives on PL3 and still too many false positives in PL4.

Must we alter the rules for every single paranoia level then? Would it significantly limit PM in its purpose if we did something like this?

920270  PL1 All but null              1-255
920271  PL2 All printable             32-126,128-255
920272  PL3 All printable             32-126,128-255
920273  PL4 All low range ascii       32-126

This is just an example; not further interfering on PL4 but also not providing significant improvements on security from PL3. I think one divergent rule - in terms of FP's, like 920273 - across a certain PL could really throw off the whole concept since no one would ever run that level.
On the other hand, further tinkering on specific ranges seems not to be the goal of this CRS but should be left for individual use.

@dune73
Copy link
Contributor

dune73 commented Apr 22, 2016

@zinoe A bit of tinkering seems worthwhile if we can improve the stats. In fact two adjustments did the trick for me: PL3: Exclude %. This does not affect the reaction on my legitimate test traffic much, but it raises the real positives on the attack traffic significantly. And for PL4: Allow , . :

This brings us to:

SecRule ARGS|ARGS_NAMES|REQUEST_HEADERS|!REQUEST_HEADERS:Referer "@validateByteRange 1-255" 
SecRule ARGS|ARGS_NAMES|REQUEST_HEADERS                          "@validateByteRange 32-126,128-255" 
SecRule ARGS|ARGS_NAMES|REQUEST_BODY|REQUEST_HEADERS             "@validateByteRange 32-36,38-126" 
SecRule ARGS|ARGS_NAMES|REQUEST_BODY                             "@validateByteRange 38,44-46,48-58,61,65-90,95,97-122" 

My new stats:

           Total    PL1/920270      PL2/920271      PL3/920272      PL4/920273
Vanilla:    8405             0               3             140            2492
Attack:    26960           456            1048            4591            9851

I think this is close to a sweet spot. PL4 brings a lot of false positives, but I assume users accept that when running in PL4. The other positive numbers look good to me: The attacks are substantially more affected.

Where I am unsure is the exclusion of REQUEST_HEADERS from the strict charset of PL4.
One option could be a separate PL4 strict sibling for REQUEST_HEADERS where at least some additional chars are acceptable (like the space character).

And finally: All these rules give a score of 4. I wonder if we do not want to raise this to 5. The stats make this useful in my eyes. The proposed rule 920273 is so strict, I can not see how an attack could be possible (ruling out extremely stupid applications of course) with 920273. Setting an anomaly limit of 5 would then block with 920973.

@zinoe
Copy link
Author

zinoe commented Apr 22, 2016

@dune73: A bit of tinkering seems worthwhile if we can improve the stats. In fact two adjustments did the trick for me ...

Fair enough. The stats definitely talk for them selfs.

Thanks for the detailed approach - well done.

@csanders-git
Copy link
Contributor

@zinoe and @dune73 I can craft this rule if we think it is ready. In general I think this is a pretty good state at least to include in an RC1. If we get lots of issues we can always tinker more but the numbers look good for PL1 and PL2 certainly. Just let me know :)

@dune73
Copy link
Contributor

dune73 commented Jul 11, 2016

Hold on. This is still in my queue. I just wanted to have the DDoS stuff merged before I continue to work on a 2nd controversial PR. I have made a bit progress since April though, and my plan is to have it ready for RC1. The REQUEST_HEADERS make this really quite difficult, but I would really like to include them as well.

Thoughts on raising the score to 5? I means it's an invalid character. Invalid characters should be blocked.

@csanders-git
Copy link
Contributor

I'll review the DOS stuff today. I think the score at 5 makes sense as this is specified in the RFC.
Also, I should note for PL-1 we shouldn't need ALL ASCII only up to 128. The RFC says (if i'm reading it right says a CHAR = <any US-ASCII character (octets 0 - 127)>

"
Many HTTP/1.1 header field values consist of words separated by LWS
or special characters. These special characters MUST be in a quoted
string to be used within a parameter value (as defined in section
3.6).

   token          = 1*<any CHAR except CTLs or separators>
   separators     = "(" | ")" | "<" | ">" | "@"
                  | "," | ";" | ":" | "\" | <">
                  | "/" | "[" | "]" | "?" | "="
                  | "{" | "}" | SP | HT

"

@dune73
Copy link
Contributor

dune73 commented Jul 14, 2016

Thank you. Probably one of the reasons > 127 is never seen. :)

This means that the ranges for the various PLs will need to be adjusted.

@dune73
Copy link
Contributor

dune73 commented Jul 24, 2016

I tried to stick with the lower ASCII range, but this gives an impressive amount of FPs. I think the high range is used in request bodies. So I am only restricting it in higher PLs.

@dune73
Copy link
Contributor

dune73 commented Jul 24, 2016

See PR #435.

@dune73
Copy link
Contributor

dune73 commented Jul 27, 2016

Has been merged. Thus closing.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants