Accept charset in Content-Type header application/x-www-form-urlencoded.
Warn when this does not fit the default encoding.
This is my proposal from a comment on plone/buildout.coredev#844.
It fixes several problems in Plone.
mauritsvanrees committed Mar 9, 2023
1 parent baf54b8 commit 305d2dc
Showing 2 changed files with 25 additions and 1 deletion.
3 changes: 3 additions & 0 deletions CHANGES.rst
@@ -20,6 +20,9 @@ https://github.com/zopefoundation/Zope/blob/4.x/CHANGES.rst

Fix encoding handling and ``:bytes`` converter.

Accept but ignore illegal charset in Content-Type header ``application/x-www-form-urlencoded``.
Warn when this does not fit the default encoding.

See `#1094 <https://github.com/zopefoundation/Zope/pull/1094>`_.

- Clean out and refactor dependency configuration files.
23 changes: 22 additions & 1 deletion src/ZPublisher/HTTPRequest.py
@@ -16,6 +16,7 @@

import codecs
import html
import logging
import os
import random
import re
@@ -51,6 +52,8 @@
from .cookie import getCookieValuePolicy


logger = logging.getLogger('ZPublisher')

# DOS attack protection -- limiting the amount of memory for forms
# probably should become configurable
FORM_MEMORY_LIMIT = 2 ** 20 # memory limit for forms
@@ -1425,7 +1428,25 @@ def __init__(self, fp, environ):
disk_limit=FORM_DISK_LIMIT,
memfile_limit=FORM_MEMFILE_LIMIT,
charset="latin-1").parts()
- elif content_type == "application/x-www-form-urlencoded":
+ elif content_type.startswith("application/x-www-form-urlencoded"):
# In some cases we get a charset:
# "application/x-www-form-urlencoded; charset=UTF-8"
# This is illegal according to the specification.
# See https://github.com/plone/buildout.coredev/pull/844
# We ignore it.
# When the charset does not match our default encoding,
# we log a warning, to make the user aware.
ct_split = content_type.split("charset=")
if len(ct_split) > 1:
requested_charset = ct_split[1].lower()
if requested_charset != default_encoding:
logger.warning(
"Specifying a charset in this Content-Type header "
"is not allowed by the HTTP specification. "
"We ignore the charset. Header is: %r",
content_type
)

if qs:
qs += "&"
qs += fp.read(FORM_MEMORY_LIMIT).decode("latin-1")
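
The charset check in this hunk can be tried out in isolation. The following is a minimal standalone sketch, not the code from this commit: `split_content_type`, `check_form_charset` and the logger name are hypothetical names introduced for illustration, and the default encoding of `utf-8` is an assumption. Unlike the `split("charset=")` call above, the sketch parses the header parameter by parameter, so a parameter that follows the charset does not end up in the compared value.

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("content_type_sketch")

FORM_TYPE = "application/x-www-form-urlencoded"


def split_content_type(header):
    """Return (media_type, charset or None) for a Content-Type value."""
    media_type, _, params = header.partition(";")
    charset = None
    for param in params.split(";"):
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset":
            # Drop optional quotes and normalise case before comparing.
            charset = value.strip().strip('"').lower()
    return media_type.strip().lower(), charset


def check_form_charset(header, default_encoding="utf-8"):
    """Return True for a form-urlencoded header.

    A charset parameter is accepted but ignored; a warning is logged
    when it differs from default_encoding (assumed here to be utf-8).
    """
    media_type, charset = split_content_type(header)
    if media_type != FORM_TYPE:
        return False
    if charset is not None and charset != default_encoding.lower():
        logger.warning(
            "Ignoring charset %r in Content-Type header %r", charset, header
        )
    return True
```

Calling `check_form_charset("application/x-www-form-urlencoded; charset=UTF-8")` returns `True` without logging, while a header with `charset=latin-1` also returns `True` but logs a warning first.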