Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix/feat: Macro expansions, error logs redundancy, support msg/logdata in inner rules #792

Merged
merged 9 commits into from
May 18, 2023
48 changes: 35 additions & 13 deletions internal/corazarules/rule_match.go
Original file line number Diff line number Diff line change
Expand Up @@ -125,7 +125,19 @@
mr.ServerIPAddress_, mr.URI_, mr.TransactionID_))
}

func (mr MatchedRule) matchData(matchData types.MatchData, log *strings.Builder) {
func (mr MatchedRule) writeExtraRuleDetails(log *strings.Builder, matchData types.MatchData, n int) {
M4tteoP marked this conversation as resolved.
Show resolved Hide resolved
msg := matchData.Message()
data := matchData.Data()
if len(msg) > 200 {
msg = msg[:200]
}

Check warning on line 133 in internal/corazarules/rule_match.go

View check run for this annotation

Codecov / codecov/patch

internal/corazarules/rule_match.go#L132-L133

Added lines #L132 - L133 were not covered by tests
if len(data) > 200 {
data = data[:200]
fzipi marked this conversation as resolved.
Show resolved Hide resolved
}
log.WriteString(fmt.Sprintf("[msg_match_%d %q] [data_match_%d %q]", n, msg, n, data))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to use printf with strings.Builder, I think you can still avoid the extra allocation with something like

fmt.Fprintf(log, "[msg_match_%d %q] [data_match_%d %q]", n, msg, n, data)

Applies throughout the PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it's a bit surprising to be printing the same value n twice

}

func (mr MatchedRule) matchData(log *strings.Builder, matchData types.MatchData) {

Check warning on line 140 in internal/corazarules/rule_match.go

View check run for this annotation

Codecov / codecov/patch

internal/corazarules/rule_match.go#L140

Added line #L140 was not covered by tests
value := matchData.Value()
if len(value) > 200 {
value = value[:200]
fzipi marked this conversation as resolved.
Show resolved Hide resolved
Expand Down Expand Up @@ -159,7 +171,7 @@
} else {
log.WriteString("Coraza: Warning. ")
}
mr.matchData(matchData, log)
mr.matchData(log, matchData)

Check warning on line 174 in internal/corazarules/rule_match.go

View check run for this annotation

Codecov / codecov/patch

internal/corazarules/rule_match.go#L174

Added line #L174 was not covered by tests
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, swaps parameters order just to keep consistency across mr.matchData() and mr.writeDetails()

mr.writeDetails(log, matchData)
log.WriteString("\n")
}
Expand All @@ -168,7 +180,8 @@

// ErrorLog returns the same as audit log but without matchData
func (mr MatchedRule) ErrorLog(code int) string {
msg := mr.MatchedDatas_[0].Message()
matchData := mr.MatchedDatas_[0]
msg := matchData.Message()
for _, md := range mr.MatchedDatas_ {
// Use 1st set message of rule chain as message
if md.Message() != "" {
Expand All @@ -182,17 +195,26 @@

log := &strings.Builder{}

for _, matchData := range mr.MatchedDatas_ {
log.WriteString(fmt.Sprintf("[client %q] ", mr.ClientIPAddress_))
if mr.Disruptive_ {
log.WriteString(fmt.Sprintf("Coraza: Access denied with code %d (phase %d). ", code, mr.Rule_.Phase()))
} else {
log.WriteString("Coraza: Warning. ")
log.WriteString(fmt.Sprintf("[client %q] ", mr.ClientIPAddress_))
if mr.Disruptive_ {
log.WriteString(fmt.Sprintf("Coraza: Access denied with code %d (phase %d). ", code, mr.Rule_.Phase()))

Check warning on line 200 in internal/corazarules/rule_match.go

View check run for this annotation

Codecov / codecov/patch

internal/corazarules/rule_match.go#L200

Added line #L200 was not covered by tests
} else {
log.WriteString("Coraza: Warning. ")
}
log.WriteString(msg)
log.WriteString(" ")
fzipi marked this conversation as resolved.
Show resolved Hide resolved
mr.writeDetails(log, matchData)

for n, matchData := range mr.MatchedDatas_ {
if n == 0 {
// Skipping first matchData, it has been just added to the log
continue
}
if matchData.Message() != "" || matchData.Data() != "" {
mr.writeExtraRuleDetails(log, matchData, n)
}
log.WriteString(msg)
log.WriteString(" ")
mr.writeDetails(log, matchData)
log.WriteString("\n")
}

log.WriteString("\n")
return log.String()
}
41 changes: 28 additions & 13 deletions internal/corazawaf/rule.go
Original file line number Diff line number Diff line change
Expand Up @@ -167,9 +167,7 @@
}

func (r *Rule) doEvaluate(phase types.RulePhase, tx *Transaction, cache map[transformationKey]*transformationValue) []types.MatchData {
if r.Capture {
tx.Capture = true
}
tx.Capture = r.Capture
rid := r.ID_
if rid == 0 {
fzipi marked this conversation as resolved.
Show resolved Hide resolved
rid = r.ParentID_
Expand All @@ -193,11 +191,14 @@
if r.operator == nil {
tx.DebugLogger().Debug().Int("rule_id", rid).Msg("Forcing rule to match")
md := &corazarules.MatchData{}
if r.Msg != nil {
md.Message_ = r.Msg.Expand(tx)
}
if r.LogData != nil {
md.Data_ = r.LogData.Expand(tx)
if r.ParentID_ != 0 || r.MultiMatch {
// In order to support Msg and LogData for inner rules, we need to expand them now
if r.Msg != nil {
md.Message_ = r.Msg.Expand(tx)
}
if r.LogData != nil {
md.Data_ = r.LogData.Expand(tx)
}

Check warning on line 201 in internal/corazawaf/rule.go

View check run for this annotation

Codecov / codecov/patch

internal/corazawaf/rule.go#L200-L201

Added lines #L200 - L201 were not covered by tests
}
matchedValues = append(matchedValues, md)
r.matchVariable(tx, md)
Expand Down Expand Up @@ -253,11 +254,14 @@
// Set the txn variables for expansions before usage
r.matchVariable(tx, mr)

if r.Msg != nil {
mr.Message_ = r.Msg.Expand(tx)
}
if r.LogData != nil {
mr.Data_ = r.LogData.Expand(tx)
if r.ParentID_ != 0 || r.MultiMatch {
// In order to support Msg and LogData for inner rules, we need to expand them now
if r.Msg != nil {
mr.Message_ = r.Msg.Expand(tx)
}
if r.LogData != nil {
mr.Data_ = r.LogData.Expand(tx)
}
}
matchedValues = append(matchedValues, mr)
tx.DebugLogger().Debug().
Expand All @@ -284,6 +288,7 @@
}

// disruptive actions and rules affecting the rule flow are only evaluated by parent rules
// also, expansion of Msg and LogData of the parent rule is postponed after the chain evaluation (if any)
if r.ParentID_ == 0 {
// we only run the chains for the parent rule
for nr := r.Chain; nr != nil; {
Expand All @@ -295,6 +300,16 @@
matchedValues = append(matchedValues, matchedChainValues...)
nr = nr.Chain
}
// Expansion of Msg and LogData is postponed here. It allows to run it only if the whole rule/chain
// matches and to rely on MATCHED_* variables updated by the chain, not just by the fist rule.
if !r.MultiMatch {
if r.Msg != nil {
matchedValues[0].(*corazarules.MatchData).Message_ = r.Msg.Expand(tx)
}
if r.LogData != nil {
matchedValues[0].(*corazarules.MatchData).Data_ = r.LogData.Expand(tx)
}
}

for _, a := range r.actions {
if a.Function.Type() == plugintypes.ActionTypeFlow {
Expand Down
61 changes: 61 additions & 0 deletions internal/corazawaf/rule_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -520,3 +520,64 @@ func TestTransformArgNoCacheForTXVariable(t *testing.T) {
t.Errorf("Expected 0 transformations in cache, got %d. It is not expected to cache TX variable transformations", len(transformationCache))
}
}

func TestCaptureNotPropagatedToInnerChainRule(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<3

r := NewRule()
r.ID_ = 1
r.operator = nil
r.HasChain = true
r.Phase_ = 1
r.Capture = true
chainedRule := NewRule()
chainedRule.ID_ = 0
chainedRule.ParentID_ = 1
chainedRule.operator = nil
chainedRule.Capture = false
r.Chain = chainedRule
tx := NewWAF().NewTransaction()
r.doEvaluate(types.PhaseRequestHeaders, tx, tx.transformationCache)
// We expect that capture is false after doEvaluate.
if tx.Capture {
t.Errorf("Expected capture to be false. The parent rule enables capture, but inner rule should disable it.")
}
}

func TestExpandMacroAfterWholeRuleEvaluation(t *testing.T) {
r := NewRule()
r.ID_ = 1
r.operator = nil
r.HasChain = true
r.Phase_ = 1
r.Msg, _ = macro.NewMacro("%{MATCHED_VAR_NAME}-msg")
r.LogData, _ = macro.NewMacro("%{MATCHED_VAR_NAME}-data")
action := &dummyNonDisruptiveAction{}
_ = r.AddAction("dummyNonDisruptiveAction", action)
chainedRule := NewRule()
chainedRule.ID_ = 0
chainedRule.ParentID_ = 1
chainedRule.operator = nil

_ = r.AddVariable(variables.RequestURI, "", false)
// ArgsGet is the variable matched by the inner rule. We expect that MATCHED_VAR_NAME
// will be ARGS_GET being the last variable matched
_ = chainedRule.AddVariable(variables.ArgsGet, "", false)
dummyEqOp := &dummyEqOperator{}
r.SetOperator(dummyEqOp, "@eq", "0")
chainedRule.SetOperator(dummyEqOp, "@eq", "0")

r.Chain = chainedRule
tx := NewWAF().NewTransaction()
tx.ProcessURI("0", "GET", "HTTP/1.1")
tx.AddGetRequestArgument("test", "0")

matchdata := r.doEvaluate(types.PhaseLogging, tx, tx.transformationCache)
if len(matchdata) != 2 {
t.Errorf("Expected 2 matchdata from a chained rule (total 2 rules), got %d", len(matchdata))
}
if matchdata[0].Message() != "ARGS_GET:test-msg" {
t.Errorf("Expected ArgsGet-msg, got %s", matchdata[0].Message())
}
if matchdata[0].Data() != "ARGS_GET:test-data" {
t.Errorf("Expected ArgsGet-data, got %s", matchdata[0].Data())
}
}
2 changes: 1 addition & 1 deletion internal/corazawaf/transaction.go
Original file line number Diff line number Diff line change
Expand Up @@ -440,7 +440,7 @@ func (tx *Transaction) ParseRequestReader(data io.Reader) (*types.Interruption,
return tx.ProcessRequestBody()
}

// matchVariable Creates the MATCHED_ variables required by chains and macro expansion
// matchVariable creates MATCHED_* variables required by chains and macro expansions
// MATCHED_VARS, MATCHED_VAR, MATCHED_VAR_NAME, MATCHED_VARS_NAMES
func (tx *Transaction) matchVariable(match *corazarules.MatchData) {
var varName string
Expand Down
73 changes: 71 additions & 2 deletions internal/seclang/rules_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -239,12 +239,81 @@ func TestTagsAreNotPrintedTwice(t *testing.T) {
}
re := regexp.MustCompile(`\[tag "some1"\]`)
for _, l := range logs {
if len(re.FindAllString(l, -1)) < 2 {
Copy link
Member Author

@M4tteoP M4tteoP May 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully get this test. The title (TestTagsAreNotPrintedTwice) seems like that wants to avoid the same tag appearing more than once inside the logs, but then we are trying to find the same tag twice. Originally it was enforcing being just one, but it has been changed with https://github.com/corazawaf/coraza/pull/220/files#diff-090979aa58246451224457e41b61bf533048a2e7091724b5db0345c04f5ef561L248-R249.
Currently, the test is trying to match the following:

"[client \"\"] Coraza: Warning.  [file \"\"] [line \"2\"] [id \"1\"] [rev \"\"] [msg \"\"] [data \"\"] [severity \"emergency\"] [ver \"\"] [maturity \"0\"] [accuracy \"0\"] [tag \"some1\"] [tag \"some2\"] [hostname \"\"] [uri \"\"] [unique_id \"xDFbZiZkbAmlAAKUNXc\"]\n[client \"\"] Coraza: Warning.  [file \"\"] [line \"2\"] [id \"1\"] [rev \"\"] [msg \"\"] [data \"\"] [severity \"emergency\"] [ver \"\"] [maturity \"0\"] [accuracy \"0\"] [tag \"some1\"] [tag \"some2\"] [hostname \"\"] [uri \"\"] [unique_id \"xDFbZiZkbAmlAAKUNXc\"]\n"

And it seems that the ending part is very lengthy and not relevant (Current PR tries to avoid redundancy, therefore it does not print it):

[client \"\"] Coraza: Warning.  [file \"\"] [line \"2\"] [id \"1\"] [rev \"\"] [msg \"\"] [data \"\"] [severity \"emergency\"] [ver \"\"] [maturity \"0\"] [accuracy \"0\"] [tag \"some1\"] [tag \"some2\"] [hostname \"\"] [uri \"\"] [unique_id \"xDFbZiZkbAmlAAKUNXc\"]\n"

cc. @piyushroshan

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @jptosso

t.Errorf("failed to log tag, missing instances (%d)\n%s", len(re.FindAllString(l, -1)), l)
if len(re.FindAllString(l, -1)) > 1 {
t.Errorf("failed to log tag, got multiple instances (%d)\n%s", len(re.FindAllString(l, -1)), l)
}
}
}

func TestPrintedExtraMsgAndDataFromChainedRules(t *testing.T) {
waf := corazawaf.NewWAF()
var logs []string
waf.SetErrorCallback(func(mr types.MatchedRule) {
logs = append(logs, mr.ErrorLog(403))
})
parser := NewParser(waf)
err := parser.FromString(`
SecRule ARGS_GET "@rx .*" "id:1, phase:1, log, chain, deny, status:403, msg:'Parent msg', logdata:'%{MATCHED_VAR} in %{MATCHED_VAR_NAME}"
SecRule ARGS_GET "@rx .*" "msg:'Inner message 1', logdata:'%{MATCHED_VAR} in %{MATCHED_VAR_NAME}', chain"
SecRule ARGS_GET "@rx .*" "msg:'Inner message 2', logdata:'%{MATCHED_VAR} in %{MATCHED_VAR_NAME}'"
`)
if err != nil {
t.Error(err.Error())
}
tx := waf.NewTransaction()
tx.AddGetRequestArgument("test", "1")
it := tx.ProcessRequestHeaders()
if it == nil {
t.Error("failed to interrupt")
} else if it.Status != 403 {
t.Errorf("failed to set status, got %d", it.Status)
}
if len(logs) != 1 {
t.Errorf("failed to log with %d", len(logs))
}
if count := strings.Count(logs[0], "1 in ARGS_GET:test"); count != 3 {
t.Errorf("failed to log logdata, expected 3 repetitions, got %d", count)
}
if count := strings.Count(logs[0], "Inner message 1"); count != 1 {
t.Errorf("Unexpected number of msg from inner rule 1, expected 1 got %d", count)
}
if count := strings.Count(logs[0], "Inner message 2"); count != 1 {
t.Errorf("Unexpected number of msg from inner rule 2, expected 1 got %d", count)
}
}

func TestPrintedMultipleMsgAndDataWithMultiMatch(t *testing.T) {
waf := corazawaf.NewWAF()
var logs []string
waf.SetErrorCallback(func(mr types.MatchedRule) {
logs = append(logs, mr.ErrorLog(403))
})
parser := NewParser(waf)
err := parser.FromString(`
SecRule ARGS_GET "@rx .*" "id:9696, phase:1, log, chain, deny, t:lowercase, status:403, msg:'msg', logdata:'%{MATCHED_VAR} in %{MATCHED_VAR_NAME}',multiMatch"
`)
if err != nil {
t.Error(err.Error())
}
tx := waf.NewTransaction()
tx.AddGetRequestArgument("testArgGet", "tEsT1")
it := tx.ProcessRequestHeaders()
if it == nil {
t.Error("failed to interrupt")
} else if it.Status != 403 {
t.Errorf("failed to set status, got %d", it.Status)
}
if len(logs) != 1 {
t.Errorf("failed to log with %d", len(logs))
}
if count := strings.Count(logs[0], "tEsT1 in ARGS_GET"); count != 1 {
t.Errorf("failed to log logdata, expected \"tEsT1 in ARGS_GET\" occurence, got %s", logs[0])
}
if count := strings.Count(logs[0], "test1 in ARGS_GET"); count != 1 {
t.Errorf("failed to log logdata, expected \"test1 in ARGS_GET\" occurence, got %s", logs[0])
}
}

func TestStatusFromInterruptions(t *testing.T) {
waf := corazawaf.NewWAF()
var logs []string
Expand Down
1 change: 1 addition & 0 deletions testing/coreruleset/.ftw.yml
Original file line number Diff line number Diff line change
Expand Up @@ -11,3 +11,4 @@ testoverride:
920420-8: 'HEAD request with data. Go/http does not allow it - 400 Bad Request'
920430-5: 'Test has expect_error, Go/http and Envoy return 400'
920430-8: 'Go/http does no allow HTTP/3.0 - 505 HTTP Version Not Supported'
932200-13: 'wip'
Copy link
Member Author

@M4tteoP M4tteoP May 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a very unfortunate combination of rules triggered, 932200-13 test is now failing. It is intended to check that the correct log has been printed, looking for the wrong one found within MATCHED_VAR. The latter string is matched not inside the 932200 log, but inside a 942440 that is triggered alongside it. The latter match is legit, but it results to be a bit more verbose compared to Modsec MATCHED_VARS:ARGS:host vs ARGS:host, therefore the test is marked as failing. A tested workaround would be just looking for found within MATCHED_VAR: instead of found within MATCHED_VAR. I will evolve this discussion CRS side and update the exclusion comment with the related issue/pr.