Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seclang parser issue: wrong '\\' sequence handling #2148

Open
airween opened this issue Aug 14, 2019 · 6 comments
Open

Seclang parser issue: wrong '\\' sequence handling #2148

airween opened this issue Aug 14, 2019 · 6 comments
Assignees
Labels
3.x Related to ModSecurity version 3.x

Comments

@airween
Copy link
Member

airween commented Aug 14, 2019

Describing of bug
In v2, if a SecRule contains a sequence of \\ (double backslash), the parser interprets it as a single \ (backslash) character. This is because the reading of config file is the Apache job. The process chain in v2 is something like this: ap_getword_conf (declared in httpd.h), it is called from ap_build_config (declared in http_config.h), and the root of this chain is ap_read_config (also in http_config.h) - but as I saw there is copy of the body of ap_getword_conf() function in modsec source...

This function strips the double \\, and the parser got a single \ character. For example, if the rule looks like this:

SecRule ARGS "@rx \\\\u006C" ...

then the pattern will only \u006C in stage of parsing of rule, and will \u006C in stage of building regex. Then if an argument contains \u006C, then it matches.

ModSecurity3 has an own parser, and it keeps the patterns as is, doesn't strip them, so the \\\\u006C will \\\\u006C, and the regex will got \\u006C - so the mentioned argument will not match.

The essence of this issue (and the contradictory behavior), that if the SecRule is not in the included configuration file, but is it in the webserver config file (eg in virtual context:

modsecurity_rules '
      SecRuleEngine On
      ....
      SecRule ARGS "@rx \\\\u006C"...

)
then it works. So, that's a bit confused: I can't pass the same rule as same syntax with both avaliable method...

How to reproduce

Set up the Nginx with ModSecurity and OWASP CRS. Send a curl request:

curl -v 'http://localhost/?var=%22in%20\\u0076\\u0061l\\u0075e\\u004F\\u0066%3d

It will passed, ModSecurity 3 doesn't catch it.

Expected behavior

It would be expected that ModSecurity catches this curl request above, like mod_security2.

Affected versions

This issue affects all ModSecurity3 versions.

Rule Set

The good real example is the OWASP CRS, rule 941330:

https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/00077e1401f53c32fc7c170d059f3003cd154f52/rules/REQUEST-941-APPLICATION-ATTACK-XSS.conf#L854

@zimmerle
Copy link
Contributor

Hi @airween,

As explained earlier to you, the CRS is not a good to report issues because it has to many rules; having its own logic sometimes can be confuse to reproduce the issues. Please narrow down that to a rule that only reflect the problem.

@zimmerle zimmerle self-assigned this Sep 13, 2019
@zimmerle zimmerle added the 3.x Related to ModSecurity version 3.x label Sep 13, 2019
@airween
Copy link
Member Author

airween commented Sep 13, 2019

Hi @zimmerle,

here is a gist which contains a regression test:

https://gist.github.com/airween/7f40a0abfa2e8f636bf08893bfd4a4b9

As you can see, in the argument of operator there are 4 backslash before the unicode charcodes, but after the regression test's JSON parser reads that, those will be escaped, and will only \\. Also in the request argument: the \\ will be \. It's important that the SecRule loaded by the regression tests json parser.

Now look at these gist (there are two files):
https://gist.github.com/airween/5b6bb902e725159453bc02de835f55ab
(Note, that you have to copy the second file (with .conf extension) to the ModSecurity_source/test file, because that's the relative root.)

This rule exactly same like in the previous regression test, except that the security rule had organized to an external file. This means that the SecRule will parsed by the libmodsecurity's parser. This test will failed, because the 4 backslash are kept. Again: the two config is same, only the parsing method different, but this same test will failed.

Now finally, look up these two files in this gist:

https://gist.github.com/airween/60bfd8dda2efc1808be2fe0c7ac65556

Here you also have to copy the json file to the test/test-cases/regression directory, and the .conf file to the test/. You can see the differences: in this included file, there are only just the double \\ sequences. Whit this settings, the test will passed.

The problem is the incompatibility between the version 2 and version 3. In case of version 2, the config files parsed by Apache webserver, which unescapes every escaped sequences. I described the parsing flow in the first comment. In case of libmodsecurity3 the used Bison parser doesn't unescapes the sequences.

Hope this will helps to understand the problem. :)

@martinhsv
Copy link
Contributor

I was recently burned by this too.

I'm not really sure what the right solution is here. I'll have to think about this a bit more.

@martinhsv martinhsv self-assigned this May 31, 2021
@airween
Copy link
Member Author

airween commented May 31, 2021

Hi @martinhsv,

thanks for caring about this.

Once I have written a seclang parser in C, used flex, just for fun. In that case I read the config files into a buffer first, where I could replaced the sequences - here is a very similar function. (It does not need to replace the \\ occurrence everywhere, only between the " signs.) After that I just passed the transformed buffer to flex, used yy_scan_buffer instead of using the yyin as directly. As I remember handling of include directive is a bit harder on this way.

@martinhsv
Copy link
Contributor

martinhsv commented Jun 3, 2021

For the following use case:

  • you want to detect two consecutive literal backslash (ASCII 5c) characters with a regex AND
  • you want to be able to use the same rule across ModSecurity v2.9.x and v3.0.x

One simple way to deal with this is to specify each individual backslash with a character class.

A regex character class can be written with individual characters specified multiple times. E.g. [aab] is the same as [ab]

We can take advantage of this by matching exactly one literal backslash character with

[\\\\]

... and then it doesn't matter whether what is passed to the regex engine for compilation is only one backslash character within the square brackets or whether it is two.

For example, suppose you want to be able to detect, within the arg 'var', the 11-character literal string:

aa\\u0076bb

You can write your rule in one of these two ways:

SecRule ARGS:var "@rx aa[\\\\][\\\\]u0076bb" ...
SecRule ARGS:var "@rx aa[\\\\]{2}u0076bb" ...

@airween
Copy link
Member Author

airween commented Jun 6, 2021

Thanks for the tip - I can confirm this workaround works as well in case of both versions: mod_security2 and libmodsecurity3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.x Related to ModSecurity version 3.x
Projects
None yet
Development

No branches or pull requests

3 participants