Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] CRS rule 920171 is not applied on chunked-encoded requests #53

Closed
nick-ge opened this issue Apr 7, 2023 · 4 comments · Fixed by corazawaf/coraza#768 or #55
Closed

[Bug] CRS rule 920171 is not applied on chunked-encoded requests #53

nick-ge opened this issue Apr 7, 2023 · 4 comments · Fixed by corazawaf/coraza#768 or #55

Comments

@nick-ge
Copy link

nick-ge commented Apr 7, 2023

Dear Coraza-Caddy maintainer,

it seems if CRS rule 920171 is not applied on chunked-encoded requests (i.e., requests using Transfer-Encoding: chunked). I used the following request to identify this behavior:

GET / HTTP/1.1
Host: localhost
Connection: close
Transfer-Encoding: chunked

1
A
0

Note: Keep in mind of CRLFs after each line

Issue

This request gets accepted by Coraza+Caddy as the screenshots below demostrate:

image

However, it is expected that a GET request with a message body is blocked by rule 920171.

Side Note

Sending this request to another CRS engine (e.g., ModSecurity) triggers the expected rejection:

image

Here ModSecurity's logs indicating the resposible rule:

image

Impact

This has no severe impact since it seems all other rules are still applied on the body as expected. Here a further request line was embedded into the body which triggers the respective rule:

image

Setup

Summarized I've used:

  • Coraza-Caddy plugin v1.2.2
  • Caddy v2.6.2
  • CRS v4.0.0-rc1 with default settings

Here is my Dockerfile I've used to build Caddy + Coraza:

# Caddy builder image is our base in order to compile the binary with Coraza support
FROM caddy:2.6.2-builder AS builder

# Build Caddy binary with Coraza plugin included
RUN xcaddy build \
    --with github.com/corazawaf/coraza-caddy@v1.2.2

# Replace original caddy binary within our custom image with binary containing Coraza
FROM nick/caddy:2.6.2-no-expose
COPY --from=builder /usr/bin/caddy /usr/bin/caddy

RUN mkdir -p /etc/caddy/crs/rules

# Reverse proxy to dummy backend
CMD ["caddy", "run", "--config", "/etc/caddy/Caddyfile"]

nick/caddy:2.6.2-no-expose is a customized Dockerfile which removed Docker's EXPOSE statements and an error (see this issue for more information). Here is its source code:

FROM alpine:3.16

RUN apk add --no-cache ca-certificates mailcap

RUN set -eux; \
	mkdir -p \
		/config/caddy \
		/data/caddy \
		/etc/caddy \
		/usr/share/caddy \
	; \
	wget -O /etc/caddy/Caddyfile "https://github.com/caddyserver/dist/raw/8c5fc6fc265c5d8557f17a18b778c398a2c6f27b/config/Caddyfile"; \
	wget -O /usr/share/caddy/index.html "https://github.com/caddyserver/dist/raw/8c5fc6fc265c5d8557f17a18b778c398a2c6f27b/welcome/index.html"

# https://github.com/caddyserver/caddy/releases
ENV CADDY_VERSION v2.6.2

RUN set -eux; \
	apkArch="$(apk --print-arch)"; \
	case "$apkArch" in \
		x86_64)  binArch='amd64'; checksum='ae18c0facae7c8ee872492a1ba63a4c7608915d6d9fe267aef4f869cf65ebd4b7f9ff57f609aff2bd52db98c761d877b574aea2c0c9ddc2ec53d0d5e174cb542' ;; \
		armhf)   binArch='armv6'; checksum='6de688e6514df67594635c79be51a3fe3b7b29254b36349955979571d0624dd9bb228abcb798e76fc64ec0e1c4884443c3fd5074a5b5124ee895d29d239bcf1c' ;; \
		armv7)   binArch='armv7'; checksum='ba2186fd97c2e3f8930ee04bf01774938fc3682365fe4be70d9326f2b2a430f337c617f2c385aaa3d4e6dccb7ba980990d26cdc395574e6a5172c4f74cd9391d' ;; \
		aarch64) binArch='arm64'; checksum='91b5d50cd5c0dd84bf7dfcc437880df0d39dc62af57574cea2b560000c5bf12ba415b8723c5cb091276a93b979249ff939d567fef3a2a6ed417f93af266effcc' ;; \
		ppc64el|ppc64le) binArch='ppc64le'; checksum='f8aa3a478a989217a5f4e6b58936d2a69ffb99f2b7625760451ecab6fdc6d2534f815b8414a2121d63cdbea4f92022cebaa8550f9e3a61681ec25893ebf11ee6' ;; \
		s390x)   binArch='s390x'; checksum='2c8f9b6b28194dcc14db98c0657f6a47f35dbfa6c0a45fc485b488ada7c5b77abb4f880d3763dac1699d1007ba8e0f622a075fc7f394a0f3898fb90883c00407' ;; \
		*) echo >&2 "error: unsupported architecture ($apkArch)"; exit 1 ;;\
	esac; \
	wget -O /tmp/caddy.tar.gz "https://github.com/caddyserver/caddy/releases/download/v2.6.2/caddy_2.6.2_linux_${binArch}.tar.gz"; \
	echo "$checksum  /tmp/caddy.tar.gz" | sha512sum -c; \
	tar x -z -f /tmp/caddy.tar.gz -C /usr/bin caddy; \
	rm -f /tmp/caddy.tar.gz; \
	chmod +x /usr/bin/caddy; \
	caddy version

# set up nsswitch.conf for Go's "netgo" implementation
# - https://github.com/docker-library/golang/blob/1eb096131592bcbc90aa3b97471811c798a93573/1.14/alpine3.12/Dockerfile#L9
RUN [ -e /etc/nsswitch.conf ] && echo 'hosts: files dns' > /etc/nsswitch.conf

# See https://caddyserver.com/docs/conventions#file-locations for details
ENV XDG_CONFIG_HOME /config
ENV XDG_DATA_HOME /data

LABEL org.opencontainers.image.version=v2.6.2
LABEL org.opencontainers.image.title=Caddy
LABEL org.opencontainers.image.description="a powerful, enterprise-ready, open source web server with automatic HTTPS written in Go"
LABEL org.opencontainers.image.url=https://caddyserver.com
LABEL org.opencontainers.image.documentation=https://caddyserver.com/docs
LABEL org.opencontainers.image.vendor="Light Code Labs"
LABEL org.opencontainers.image.licenses=Apache-2.0
LABEL org.opencontainers.image.source="https://github.com/caddyserver/caddy-docker"

# DON'T NEED THEM
#EXPOSE 80
#EXPOSE 443
#EXPOSE 443/udp
#EXPOSE 2019

WORKDIR /srv

CMD ["caddy", "run", "--config", "/etc/caddy/Caddyfile", "--adapter", "caddyfile"]

Caddy configuration file:

# Global configuration block
{
	# This line says, in a chain of HTTP handlers execute coraza_waf block first
	order coraza_waf first
}

# Configuration block for site 0.0.0.0:80
:80 {
	# Configure Coraza WAF with CRS v4.0.0-rc1
	coraza_waf {
		include /etc/caddy/coraza.conf
		include /etc/caddy/crs/crs-setup.conf
		include /etc/caddy/crs/rules/*.conf
	}
}

Coraza configuration:

# -- Rule engine initialization ----------------------------------------------

# Enable Coraza, attaching it to every transaction. Use detection
# only to start with, because that minimises the chances of post-installation
# disruption.
#
SecRuleEngine On


# -- Request body handling ---------------------------------------------------

# Allow Coraza to access request bodies. If you don't, Coraza
# won't be able to see any POST parameters, which opens a large security
# hole for attackers to exploit.
#
SecRequestBodyAccess On

# Enable XML request body parser.
# Initiate XML Processor in case of xml content-type
#
SecRule REQUEST_HEADERS:Content-Type "(?:application(?:/soap\+|/)|text/)xml" \
     "id:'200000',phase:1,t:none,t:lowercase,pass,nolog,ctl:requestBodyProcessor=XML"

# Enable JSON request body parser.
# Initiate JSON Processor in case of JSON content-type; change accordingly
# if your application does not use 'application/json'
#
SecRule REQUEST_HEADERS:Content-Type "application/json" \
     "id:'200001',phase:1,t:none,t:lowercase,pass,nolog,ctl:requestBodyProcessor=JSON"

# Sample rule to enable JSON request body parser for more subtypes.
# Uncomment or adapt this rule if you want to engage the JSON
# Processor for "+json" subtypes
#
#SecRule REQUEST_HEADERS:Content-Type "^application/.+[+]json$" \
#     "id:'200006',phase:1,t:none,t:lowercase,pass,nolog,ctl:requestBodyProcessor=JSON"

# Maximum request body size we will accept for buffering. If you support
# file uploads then the value given on the first line has to be as large
# as the largest file you are willing to accept. The second value refers
# to the size of data, with files excluded. You want to keep that value as
# low as practical.
#
SecRequestBodyLimit 13107200
SecRequestBodyNoFilesLimit 131072

# What to do if the request body size is above our configured limit.
# Keep in mind that this setting will automatically be set to ProcessPartial
# when SecRuleEngine is set to DetectionOnly mode in order to minimize
# disruptions when initially deploying Coraza.
#
SecRequestBodyLimitAction Reject

# Verify that we've correctly processed the request body.
# As a rule of thumb, when failing to process a request body
# you should reject the request (when deployed in blocking mode)
# or log a high-severity alert (when deployed in detection-only mode).
#
SecRule REQBODY_ERROR "!@eq 0" \
"id:'200002', phase:2,t:none,log,deny,status:400,msg:'Failed to parse request body.',logdata:'%{reqbody_error_msg}',severity:2"

# By default be strict with what we accept in the multipart/form-data
# request body. If the rule below proves to be too strict for your
# environment consider changing it to detection-only. You are encouraged
# _not_ to remove it altogether.
#
SecRule MULTIPART_STRICT_ERROR "!@eq 0" \
"id:'200003',phase:2,t:none,log,deny,status:400, \
msg:'Multipart request body failed strict validation: \
PE %{REQBODY_PROCESSOR_ERROR}, \
BQ %{MULTIPART_BOUNDARY_QUOTED}, \
BW %{MULTIPART_BOUNDARY_WHITESPACE}, \
DB %{MULTIPART_DATA_BEFORE}, \
DA %{MULTIPART_DATA_AFTER}, \
HF %{MULTIPART_HEADER_FOLDING}, \
LF %{MULTIPART_LF_LINE}, \
SM %{MULTIPART_MISSING_SEMICOLON}, \
IQ %{MULTIPART_INVALID_QUOTING}, \
IP %{MULTIPART_INVALID_PART}, \
IH %{MULTIPART_INVALID_HEADER_FOLDING}, \
FL %{MULTIPART_FILE_LIMIT_EXCEEDED}'"

# Did we see anything that might be a boundary?
#
# Here is a short description about the Coraza Multipart parser: the
# parser returns with value 0, if all "boundary-like" line matches with
# the boundary string which given in MIME header. In any other cases it returns
# with different value, eg. 1 or 2.
#
# The RFC 1341 descript the multipart content-type and its syntax must contains
# only three mandatory lines (above the content):
# * Content-Type: multipart/mixed; boundary=BOUNDARY_STRING
# * --BOUNDARY_STRING
# * --BOUNDARY_STRING--
#
# First line indicates, that this is a multipart content, second shows that
# here starts a part of the multipart content, third shows the end of content.
#
# If there are any other lines, which starts with "--", then it should be
# another boundary id - or not.
#
# After 3.0.3, there are two kinds of types of boundary errors: strict and permissive.
#
# If multipart content contains the three necessary lines with correct order, but
# there are one or more lines with "--", then parser returns with value 2 (non-zero).
#
# If some of the necessary lines (usually the start or end) misses, or the order
# is wrong, then parser returns with value 1 (also a non-zero).
#
# You can choose, which one is what you need. The example below contains the
# 'strict' mode, which means if there are any lines with start of "--", then
# Coraza blocked the content. But the next, commented example contains
# the 'permissive' mode, then you check only if the necessary lines exists in
# correct order. Whit this, you can enable to upload PEM files (eg "----BEGIN.."),
# or other text files, which contains eg. HTTP headers.
#
# The difference is only the operator - in strict mode (first) the content blocked
# in case of any non-zero value. In permissive mode (second, commented) the
# content blocked only if the value is explicit 1. If it 0 or 2, the content will
# allowed.
#

#
# See #1747 and #1924 for further information on the possible values for
# MULTIPART_UNMATCHED_BOUNDARY.
#
SecRule MULTIPART_UNMATCHED_BOUNDARY "@eq 1" \
    "id:'200004',phase:2,t:none,log,deny,msg:'Multipart parser detected a possible unmatched boundary.'"

# Some internal errors will set flags in TX and we will need to look for these.
# All of these are prefixed with "MSC_".  The following flags currently exist:
#
# COR_PCRE_LIMITS_EXCEEDED: PCRE match limits were exceeded.
#
SecRule TX:/^COR_/ "!@streq 0" \
        "id:'200005',phase:2,t:none,deny,msg:'Coraza internal error flagged: %{MATCHED_VAR_NAME}'"


# -- Response body handling --------------------------------------------------

# Allow Coraza to access response bodies.
# You should have this directive enabled in order to identify errors
# and data leakage issues.
#
# Do keep in mind that enabling this directive does increases both
# memory consumption and response latency.
#
SecResponseBodyAccess On

# Which response MIME types do you want to inspect? You should adjust the
# configuration below to catch documents but avoid static files
# (e.g., images and archives).
#
SecResponseBodyMimeType text/plain text/html text/xml

# Buffer response bodies of up to 512 KB in length.
SecResponseBodyLimit 524288

# What happens when we encounter a response body larger than the configured
# limit? By default, we process what we have and let the rest through.
# That's somewhat less secure, but does not break any legitimate pages.
#
SecResponseBodyLimitAction ProcessPartial


# -- Filesystem configuration ------------------------------------------------

# The location where Coraza stores temporary files (for example, when
# it needs to handle a file upload that is larger than the configured limit).
#
# This default setting is chosen due to all systems have /tmp available however,
# this is less than ideal. It is recommended that you specify a location that's private.
#
SecTmpDir /tmp/

# The location where Coraza will keep its persistent data.  This default setting
# is chosen due to all systems have /tmp available however, it
# too should be updated to a place that other users can't access.
#
SecDataDir /tmp/


# -- File uploads handling configuration -------------------------------------

# The location where Coraza stores intercepted uploaded files. This
# location must be private to Coraza. You don't want other users on
# the server to access the files, do you?
#
#SecUploadDir /opt/coraza/var/upload/

# By default, only keep the files that were determined to be unusual
# in some way (by an external inspection script). For this to work you
# will also need at least one file inspection rule.
#
#SecUploadKeepFiles RelevantOnly

# Uploaded files are by default created with permissions that do not allow
# any other user to access them. You may need to relax that if you want to
# interface Coraza to an external program (e.g., an anti-virus).
#
#SecUploadFileMode 0600


# -- Debug log configuration -------------------------------------------------

# Default debug log path
# Debug levels:
# 1: fatal
# 2: panic
# 3: error
# 4: warn
# 5: info
# 6: debug
# Most logging has not been implemented because it will be replaced with
# advanced rule profiling options
SecDebugLog /var/log/coraza/debug.log
SecDebugLogLevel 3


# -- Audit log configuration -------------------------------------------------

# Log the transactions that are marked by a rule, as well as those that
# trigger a server error (determined by a 5xx or 4xx, excluding 404,
# level response status codes).
#
SecAuditEngine RelevantOnly
SecAuditLogRelevantStatus "^(?:(5|4)(0|1)[0-9])$"

# Log everything we know about a transaction.
SecAuditLogParts ABIJDEFHZ

# Use a single file for logging. This is much easier to look at, but
# assumes that you will use the audit log only occasionally.
#
SecAuditLogType Serial
SecAuditLog /var/log/modsec_audit.log


# -- Miscellaneous -----------------------------------------------------------

# Use the most commonly used application/x-www-form-urlencoded parameter
# separator. There's probably only one application somewhere that uses
# something else so don't expect to change this value.
#
SecArgumentSeparator &

# Settle on version 0 (zero) cookies, as that is what most applications
# use. Using an incorrect cookie version may open your installation to
# evasion attacks (against the rules that examine named cookies).
#
SecCookieFormat 0

If I'm missing something please let me know. Otherwise I'm glad if I can contribute to your project!

King regards
Nick

@jptosso
Copy link
Member

jptosso commented Apr 7, 2023

Hey ! First of all, thank you for the detailed issue. It's way easier to debug this way.

I will validate but I think chunk transfers are removed from the http request objects in golang, which means coraza will never have access to it in this connector. Probably coraza receives the full buffered request.

I think @M4tteoP has information on this

@M4tteoP
Copy link
Member

M4tteoP commented Apr 7, 2023

Hi, many thanks for taking the time on providing lots of details!

I will validate but I think chunk transfers are removed from the http request objects in golang, which means Coraza will never have access to it in this connector.

I really think this is the issue. 920171-2 and 920171-3 are two of the few CRS tests that are still failing against the go/http middleware (that is the same used by the latest commit in the caddy connector). The Go library removes the Transfer-Encoding header (see here), and this is exactly what the 920171 rule is looking for (SecRule &REQUEST_HEADERS:Transfer-Encoding "!@eq 0").

What we should maybe do is understand if it is possible from the http.Request to read the Chunked boolean. If so we can populate again the header. But I don't know if it is feasible or if the HTTP library makes the Chunked logic completely transparent.

Update: https://github.com/golang/go/blob/master/src/net/http/transfer.go#L586 this line is definitely interesting. Maybe something like the following code can fix it.

if r.TransferEncoding != nil && r.TransferEncoding[0] == "chunked" {
	r.Header.Set("Transfer-Encoding", "chunked")
}

I will give it a go next week :)

@nick-ge
Copy link
Author

nick-ge commented Apr 11, 2023

Thank you for your fast response!

@M4tteoP This could do the trick and you should definitively give it a shot!

In my opinion however, it is not a problem of this connector but of the go/http library which gives you a HTTP request without a header that indicates a body is available. As the RFC says, Content-Length and Transfer-Encoding are responsible for that. So a more cleaner solution would be, if the go/http library replaces the Transfer-Encoding with a appropriate Content-Length header instead of removing it completely. This should trigger CRS rule 920170 then. So maybe we should populate this problem to the devs of go/http?

@anuraaga
Copy link
Contributor

I believe the behavior Go is going for is that, because the body that is provided doesn't have chunked headers, the header is removed - I guess the idea is to present to business logic in the same way regardless of encoding.

But even so the body is still streamed so it can't compute a content length.

There doesn't seem to be a great reason to remove the encoding header anyways but I guess they'll keep the behavior for backwards compatibility.

Probably synthesizing the header for the coraza transaction is going to be the way to go if it's possible. Otherwise maybe a phase 2 rule could be added that checks &REQUEST_BODY.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants