Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix/feat: Macro expansions, error logs redundancy, support msg/logdata in inner rules #792

Merged
merged 9 commits into from May 18, 2023

Conversation

M4tteoP
Copy link
Member

@M4tteoP M4tteoP commented May 12, 2023

This PR aims to:

  • Fix macro expansions leading to correctly log MATCHED_* variables.
  • Fix capture action propagation.
  • Fix Incorrect log message from coraza v3 for CRS rule 920450 #570 (Closed by the bot, but actually still an open issue)
  • Reduce verbosity by avoiding printing matched data without actual data to be printed.
  • Support Msg/Datalog as part of the inner rule chains without affecting regression (related consideration).
  • Add tests.

Example 1: Rule 920170

Current Coraza:

[id "920170"] [rev ""] [msg "GET or HEAD Request with Body Content"] [data "GET"]

Apache w/ModSec:

[id "920170"] [msg "GET or HEAD Request with Body Content"] [data "2"]

This PR:

[id "920170"] [rev ""] [msg "GET or HEAD Request with Body Content"] [data "2"]

920170 is a CRS chained rule with a first match of REQUEST_METHOD "@rx ^(?:GET|HEAD)$", followed by REQUEST_HEADERS:Content-Length "!@rx ^0?$". The MATCHED_VAR is printed by logdata. Current Coraza expands the latter right after the parent rule execution, therefore [data "GET"] is printed. ModSecurity behavior is different, expanding the macro at the of the whole rule. It leads to logging more meaningful data like the size of the body content. If the meaningful data that has to be printed is the matched variable of the parent rule, CRS already address it using some support variables and printing them (E.g. see TX.932200_MATCHED_VAR_NAME used by rule 932200.

Example 2: Rule 920450 (Reported by Issue #570)

Current Coraza:

[client "127.0.0.1"] Coraza: Warning. HTTP header is restricted by policy (user-agent) [file ""] [line "0"] [id "920450"] [rev ""] [msg "HTTP header is restricted by policy (user-agent)"] [data "Restricted header detected: user-agent"] [severity "critical"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/"] [unique_id "fmtcIalunsExuoQXkJC"]
[client "127.0.0.1"] Coraza: Warning. HTTP header is restricted by policy (user-agent) [file ""] [line "0"] [id "920450"] [rev ""] [msg "HTTP header is restricted by policy (accept)"] [data "Restricted header detected: accept"] [severity "critical"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/"] [unique_id "fmtcIalunsExuoQXkJC"]
[client "127.0.0.1"] Coraza: Warning. HTTP header is restricted by policy (user-agent) [file ""] [line "0"] [id "920450"] [rev ""] [msg "HTTP header is restricted by policy (proxy)"] [data "Restricted header detected: proxy"] [severity "critical"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/"] [unique_id "fmtcIalunsExuoQXkJC"]
[client "127.0.0.1"] Coraza: Warning. HTTP header is restricted by policy (user-agent) [file ""] [line "0"] [id "920450"] [rev ""] [msg "HTTP header is restricted by policy (host)"] [data "Restricted header detected: host"] [severity "critical"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/"] [unique_id "fmtcIalunsExuoQXkJC"]
[client "127.0.0.1"] Coraza: Warning. HTTP header is restricted by policy (user-agent) [file ""] [line "0"] [id "920450"] [rev ""] [msg ""] [data ""] [severity "critical"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/"] [unique_id "fmtcIalunsExuoQXkJC"]

Apache w/ModSec:

ModSecurity: Warning. String match within "/accept-charset/ /content-encoding/ /proxy/ /lock-token/ /content-range/ /if/ /x-http-method-override/ /x-http-method/ /x-method-override/" at TX:header_name_proxy. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf"] [line "1156"] [id "920450"] [msg "HTTP header is restricted by policy (/proxy/)"] [data "Restricted header detected: /proxy/"]

This PR:

Coraza: Warning. HTTP header is restricted by policy (/proxy/) [file "default.conf"] [line "74"] [id "920450"] [rev ""] [msg "HTTP header is restricted by policy (/proxy/)"] [data "Restricted header detected: /proxy/"]

This rule matches with the parent rule multiple request headers names (REQUEST_HEADERS_NAMES "@rx ^.*$"). Because of this, currently we are just printing random header names, not the one that matched against the list of restricted headers (inner rule of the chain). Furthermore, it shows that we are iterating over matched_data generating a lot of verbosity.

Example 3: Rule 920240

Current Coraza:

Coraza: Warning. URL Encoding Abuse Attack Attempt [file "default.conf"] [line "46"] [id "920240"] [rev ""] [msg "URL Encoding Abuse Attack Attempt"] [data "application/x-www-form-urlencoded"] [severity "emergency"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/anything"] [unique_id "pDblvdXCLhyVQFVEpoO"]
Coraza: Warning. URL Encoding Abuse Attack Attempt [file "default.conf"] [line "46"] [id "920240"] [rev ""] [msg ""] [data ""] [severity "emergency"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/anything"] [unique_id "pDblvdXCLhyVQFVEpoO"]
Coraza: Warning. URL Encoding Abuse Attack Attempt [file "default.conf"] [line "46"] [id "920240"] [rev ""] [msg ""] [data ""] [severity "emergency"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/anything"] [unique_id "pDblvdXCLhyVQFVEpoO"]

Apache w/ModSec:

ModSecurity: Warning. Invalid URL Encoding: Non-hexadecimal digits used at REQUEST_BODY. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf"] [line "403"] [id "920240"] [msg "URL Encoding Abuse Attack Attempt"] [data "x=new %w20$"] [severity "WARNING"] [ver "OWASP_CRS/4.0.0-rc1"] [tag "modsecurity"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-protocol"] [tag "paranoia-level/1"] [tag "OWASP_CRS"] [tag "capec/1000/255/153/267/72"] [hostname "localhost"] [uri "/anything"] [unique_id "ZFpTz9FCx2YFblEP59zHAQAAAIw"]

This PR:

Coraza: Warning. URL Encoding Abuse Attack Attempt [file "default.conf"] [line "46"] [id "920240"] [rev ""] [msg "URL Encoding Abuse Attack Attempt"] [data "application/x-www-form-urlencoded"] [severity "emergency"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/anything"] [unique_id "vyqsOcYPqNodHzkBIGC"]

Iteration over matched_data that leads to verbosity in which I see not much value, is clearer in this example. Rule 920240 is a chained rule made of 3 total rules, therefore we expect 3 matched_data. Iterating over them leads to logging three times the Coraza: Warning line, effectively printing some data only at the first line. (Just like previous examples, the printed data is also not very useful, not printing the real payload, but just that the request has been done with Content-Type:application/x-www-form-urlencoded.

Example 4: Chained rule with multiple msg/logdata.

Even if ModSecurity explicitly states that metadata actions (id, rev, msg, tag, severity, logdata), skip, and skipAfter actions can be specified only by the chain starter rule., the aforementioned iteration over matched_data allowed to print multiple msg/logdata. This "feature" was also escalated by engine/chains.go tests in which every inner rule is printing different msg and logdata. This PR proposes to better support this feature by printing extra msg/logdata:

  • In a less redundant way.
  • Only if they actually exist.
    Example rule:
SecRule REQUEST_PROTOCOL "@pm HTTP/1.0 HTTP/1.1" \
    "id:99,\
    phase:1,\
    deny,\
    log,\
    msg:'Parent rule',\
    logdata:'%{tx.99_matched_var}',\
    setvar:'tx.99_matched_var=%{matched_var}',\
    chain"
    SecRule REQUEST_METHOD "@streq POST" \
        "msg:'Inner rule',\
        logdata:'%{MATCHED_VAR_NAME}:%{MATCHED_VAR}',\
        chain"
        SecRule REQUEST_HEADERS:Content-Length "@gt 1" \
            "msg:'Inner inner rule',\
            logdata:'%{MATCHED_VAR_NAME}:%{MATCHED_VAR}'"

Current Coraza:

[client "127.0.0.1"] Coraza: Warning. Parent rule [file "default.conf"] [line "17"] [id "99"] [rev ""] [msg "Parent rule"] [data "HTTP/1.1"] [severity "emergency"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/"] [unique_id "lMXZSIDmmjusLHHMSdv"]
[client "127.0.0.1"] Coraza: Warning. Parent rule [file "default.conf"] [line "17"] [id "99"] [rev ""] [msg "Inner rule"] [data "REQUEST_METHOD:POST"] [severity "emergency"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/"] [unique_id "lMXZSIDmmjusLHHMSdv"]
[client "127.0.0.1"] Coraza: Warning. Parent rule [file "default.conf"] [line "17"] [id "99"] [rev ""] [msg "Inner inner rule"] [data "REQUEST_HEADERS:Content-Length:7"] [severity "emergency"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/"] [unique_id "lMXZSIDmmjusLHHMSdv"]

This PR:

[client "127.0.0.1"] Coraza: Warning. Parent rule [file "default.conf"] [line "17"] [id "99"] [rev ""] [msg "Parent rule"] [data "HTTP/1.1"] [severity "emergency"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/"] [unique_id "rwfUZgkNVEVGIjWHpWB"][msg_match_1 "Inner rule"] [data_match_1 "REQUEST_METHOD:POST"][msg_match_2 "Inner inner rule"] [data_match_2 "REQUEST_HEADERS:Content-Length:7"]

Example 5: Rules with multiMatch

The data_match_* and msg_match_* affects also rules relying on multiMatch. This probably was the initial reason to iterate over matched_data.

Example rule:

SecRule ARGS_GET "@rx .*" "id:96, phase:1, log, chain, deny, t:lowercase, status:403, msg:'msg', logdata:'%{MATCHED_VAR} in %{MATCHED_VAR_NAME}',multiMatch

Current Coraza:

[client "127.0.0.1"] Coraza: Warning. msg [file "default.conf"] [line "9"] [id "9696"] [rev ""] [msg "msg"] [data "tEsT1 in ARGS_GET:testArgGet"] [severity "emergency"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/?testArgGet=tEsT1"] [unique_id "FIizwseojdbvFfiToEb"]
[client "127.0.0.1"] Coraza: Warning. msg [file "default.conf"] [line "9"] [id "9696"] [rev ""] [msg "msg"] [data "test1 in ARGS_GET:testArgGet"] [severity "emergency"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/?testArgGet=tEsT1"] [unique_id "FIizwseojdbvFfiToEb"]

Apache w/ModSec:

ModSecurity: Warning. Pattern match ".*" at ARGS_GET:testArgGet. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [line "746"] [id "9696"] [msg "msg"] [data "tEsT1 in ARGS_GET:testArgGet"] [tag "modsecurity"] [hostname "localhost"] [uri "/"] [unique_id "ZFzSsaEyTBNDRYIZ378DDAAAAME"]

ModSecurity: Warning. Pattern match ".*" at ARGS_GET:testArgGet. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [line "746"] [id "9696"] [msg "msg"] [data "test1 in ARGS_GET:testArgGet"] [tag "modsecurity"] [hostname "localhost"] [uri "/"] [unique_id "ZFzSsaEyTBNDRYIZ378DDAAAAME"]

This PR:

[client "127.0.0.1"] Coraza: Warning. msg [file "default.conf"] [line "9"] [id "9696"] [rev ""] [msg "msg"] [data "tEsT1 in ARGS_GET:testArgGet"] [severity "emergency"] [ver ""] [maturity "0"] [accuracy "0"] [hostname ""] [uri "/?testArgGet=tEsT1"] [unique_id "KmnXmuAjmDzWkKWJZRi"][msg_match_1 "msg"] [data_match_1 "test1 in ARGS_GET:testArgGet"]

This is the only case where the current Coraza is more aligned with ModSecurity rather than this PR. If there are no strong concerns about this divergence in error logs, I feel that the PR solution:

  • Avoids redundancy.
  • Prints in a more concise way a single error log line per rule.
  • Leads Coraza to be less prone to logs being flooded by multimatch (That seems to be something you might face with ModSec).

Looking forward to your feedback!

Closes #570

@@ -159,7 +171,7 @@ func (mr MatchedRule) AuditLog(code int) string {
} else {
log.WriteString("Coraza: Warning. ")
}
mr.matchData(matchData, log)
mr.matchData(log, matchData)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, swaps parameters order just to keep consistency across mr.matchData() and mr.writeDetails()

@@ -239,12 +239,81 @@ func TestTagsAreNotPrintedTwice(t *testing.T) {
}
re := regexp.MustCompile(`\[tag "some1"\]`)
for _, l := range logs {
if len(re.FindAllString(l, -1)) < 2 {
Copy link
Member Author

@M4tteoP M4tteoP May 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully get this test. The title (TestTagsAreNotPrintedTwice) seems like that wants to avoid the same tag appearing more than once inside the logs, but then we are trying to find the same tag twice. Originally it was enforcing being just one, but it has been changed with https://github.com/corazawaf/coraza/pull/220/files#diff-090979aa58246451224457e41b61bf533048a2e7091724b5db0345c04f5ef561L248-R249.
Currently, the test is trying to match the following:

"[client \"\"] Coraza: Warning.  [file \"\"] [line \"2\"] [id \"1\"] [rev \"\"] [msg \"\"] [data \"\"] [severity \"emergency\"] [ver \"\"] [maturity \"0\"] [accuracy \"0\"] [tag \"some1\"] [tag \"some2\"] [hostname \"\"] [uri \"\"] [unique_id \"xDFbZiZkbAmlAAKUNXc\"]\n[client \"\"] Coraza: Warning.  [file \"\"] [line \"2\"] [id \"1\"] [rev \"\"] [msg \"\"] [data \"\"] [severity \"emergency\"] [ver \"\"] [maturity \"0\"] [accuracy \"0\"] [tag \"some1\"] [tag \"some2\"] [hostname \"\"] [uri \"\"] [unique_id \"xDFbZiZkbAmlAAKUNXc\"]\n"

And it seems that the ending part is very lengthy and not relevant (Current PR tries to avoid redundancy, therefore it does not print it):

[client \"\"] Coraza: Warning.  [file \"\"] [line \"2\"] [id \"1\"] [rev \"\"] [msg \"\"] [data \"\"] [severity \"emergency\"] [ver \"\"] [maturity \"0\"] [accuracy \"0\"] [tag \"some1\"] [tag \"some2\"] [hostname \"\"] [uri \"\"] [unique_id \"xDFbZiZkbAmlAAKUNXc\"]\n"

cc. @piyushroshan

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @jptosso

@codecov
Copy link

codecov bot commented May 12, 2023

Codecov Report

Patch coverage: 85.54% and project coverage change: +0.09 🎉

Comparison is base (27b14f8) 81.75% compared to head (ad38730) 81.84%.

Additional details and impacted files
@@            Coverage Diff             @@
##           v3/dev     #792      +/-   ##
==========================================
+ Coverage   81.75%   81.84%   +0.09%     
==========================================
  Files         156      156              
  Lines        8631     8663      +32     
==========================================
+ Hits         7056     7090      +34     
+ Misses       1343     1342       -1     
+ Partials      232      231       -1     
Flag Coverage Δ
default 76.15% <85.54%> (+0.21%) ⬆️
examples 25.41% <37.34%> (+0.06%) ⬆️
ftw 47.24% <66.26%> (+0.04%) ⬆️
ftw-multiphase 49.39% <66.26%> (+0.03%) ⬆️
tinygo 74.30% <85.54%> (+0.27%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
internal/actions/deny.go 84.21% <0.00%> (ø)
internal/actions/drop.go 84.21% <0.00%> (ø)
internal/actions/redirect.go 85.71% <0.00%> (ø)
internal/corazarules/rule_match.go 60.65% <86.66%> (+9.22%) ⬆️
internal/corazawaf/rule.go 94.91% <89.65%> (-0.69%) ⬇️
internal/corazawaf/transaction.go 78.22% <100.00%> (+0.07%) ⬆️

... and 1 file with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@@ -11,3 +11,4 @@ testoverride:
920420-8: 'HEAD request with data. Go/http does not allow it - 400 Bad Request'
920430-5: 'Test has expect_error, Go/http and Envoy return 400'
920430-8: 'Go/http does no allow HTTP/3.0 - 505 HTTP Version Not Supported'
932200-13: 'wip'
Copy link
Member Author

@M4tteoP M4tteoP May 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a very unfortunate combination of rules triggered, 932200-13 test is now failing. It is intended to check that the correct log has been printed, looking for the wrong one found within MATCHED_VAR. The latter string is matched not inside the 932200 log, but inside a 942440 that is triggered alongside it. The latter match is legit, but it results to be a bit more verbose compared to Modsec MATCHED_VARS:ARGS:host vs ARGS:host, therefore the test is marked as failing. A tested workaround would be just looking for found within MATCHED_VAR: instead of found within MATCHED_VAR. I will evolve this discussion CRS side and update the exclusion comment with the related issue/pr.

@M4tteoP M4tteoP marked this pull request as ready for review May 12, 2023 11:12
@M4tteoP M4tteoP requested a review from a team as a code owner May 12, 2023 11:12
@fzipi
Copy link
Member

fzipi commented May 12, 2023

This is awesome progress, and it shows that we are keeping and eye on details! Excellent work @M4tteoP !

Copy link
Member

@fzipi fzipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some notes.

internal/corazarules/rule_match.go Outdated Show resolved Hide resolved
internal/corazarules/rule_match.go Outdated Show resolved Hide resolved
internal/corazarules/rule_match.go Outdated Show resolved Hide resolved
internal/corazawaf/rule.go Outdated Show resolved Hide resolved
@@ -520,3 +520,64 @@ func TestTransformArgNoCacheForTXVariable(t *testing.T) {
t.Errorf("Expected 0 transformations in cache, got %d. It is not expected to cache TX variable transformations", len(transformationCache))
}
}

func TestCaptureNotPropagatedToInnerChainRule(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<3

@jcchavezs jcchavezs added the v3 label May 15, 2023
@M4tteoP
Copy link
Member Author

M4tteoP commented May 16, 2023

Added a few commits that:

  • propagates to matchedRule the Disruptive characteristic of the rule (Used by Error/Audit logs to print different lines)
  • fixes and extends TestLogCallback test
  • fixes conflicts with v3/dev

if len(data) > maxSizeLogMessage {
data = data[:maxSizeLogMessage]
}
log.WriteString(fmt.Sprintf("[msg_match_%d %q] [data_match_%d %q]", n, msg, n, data))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to use printf with strings.Builder, I think you can still avoid the extra allocation with something like

fmt.Fprintf(log, "[msg_match_%d %q] [data_match_%d %q]", n, msg, n, data)

Applies throughout the PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it's a bit surprising to be printing the same value n twice

@M4tteoP M4tteoP merged commit eed5643 into corazawaf:v3/dev May 18, 2023
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect log message from coraza v3 for CRS rule 920450
4 participants