From d1eee606b2a1491509acae25136dbe03501eefa8 Mon Sep 17 00:00:00 2001
From: "Benjamin C. Wiley Sittler"
Date: Wed, 6 Dec 2017 14:36:45 -0800
Subject: [PATCH 1/6] Specify HTML numeric character reference fallback encoding for multipart upload filename characters not representable in form charset
---
 source | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/source b/source
index e3ad9bf4b9a..07ed9df5709 100755
--- a/source
+++ b/source
@@ -56318,10 +56318,11 @@ fur
 values must be encoded using the character encoding selected above.

 File names included in the generated multipart/form-data resource (as part of
- file fields) must use the character encoding selected above, though the precise name may be
- approximated if necessary (e.g. newlines could be removed from file names, quotes could be
- changed to "%22", and characters not expressible in the selected character encoding could be
- replaced by other characters).
+ file fields) must use the character encoding selected above. For each character in the entry's
+ file name that cannot be expressed using the selected character encoding, replace the character
+ by a string consisting of a U+0026 AMPERSAND character (&), a U+0023 NUMBER SIGN character
+ (#), one or more ASCII digits representing the code point of the character in base
+ ten, and finally a U+003B (;).

 The boundary used by the user agent in generating the return value of this algorithm is the multipart/form-data boundary string. (This value is used
From bf7ad5ce3f592ac58e29c1904d497b6320bd2386 Mon Sep 17 00:00:00 2001
From: "Benjamin C. Wiley Sittler"
Date: Thu, 7 Dec 2017 09:58:10 -0800
Subject: [PATCH 2/6] Add normative quoting of syntactically-significant characters
---
 source | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/source b/source
index 07ed9df5709..a86952ca521 100755
--- a/source
+++ b/source
@@ -56291,7 +56291,8 @@ fur

 • For each character in the entry's name and value that cannot be expressed using the selected character encoding, replace the character by a string consisting of a U+0026 AMPERSAND character (&), a U+0023 NUMBER SIGN character (#), one or more ASCII digits
- representing the code point of the character in base ten, and finally a U+003B (;).
+ representing the code point of the character in base ten, and finally a U+003B SEMICOLON
+ character (;).

@@ -56324,6 +56325,13 @@ fur
 (#), one or more ASCII digits representing the code point of the character in base
 ten, and finally a U+003B (;).

+ Field names and file names included in the generated multipart/form-data
+ resource must undergo multipart parameter-value character replacement for syntactically
+ significant characters: for each character in the parameter value that is one of U+0022
+ QUOTATION MARK ("), U+000A LINE FEED (LF), or U+000D CARRIAGE RETURN (CR), replace the character
+ by a string consisting of a U+0025 PERCENT SIGN character (%) and two ASCII hex
+ digits representing the code point of the character in base sixteen.
+

    The boundary used by the user agent in generating the return value of this algorithm is the multipart/form-data boundary string. (This value is used to generate the MIME type of the form submission payload generated by this algorithm.)

From e9af73f1e6263756bad32e250af4988bf737ed50 Mon Sep 17 00:00:00 2001
From: "Benjamin C. Wiley Sittler"
Date: Thu, 7 Dec 2017 18:36:26 -0800
Subject: [PATCH 3/6] Add NUL to the should-be-escaped set for multipart parameters
---
 source | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/source b/source
index a86952ca521..fdfc94e0c29 100755
--- a/source
+++ b/source
@@ -56327,10 +56327,11 @@ fur

 Field names and file names included in the generated multipart/form-data
 resource must undergo multipart parameter-value character replacement for syntactically
- significant characters: for each character in the parameter value that is one of U+0022
- QUOTATION MARK ("), U+000A LINE FEED (LF), or U+000D CARRIAGE RETURN (CR), replace the character
- by a string consisting of a U+0025 PERCENT SIGN character (%) and two ASCII hex
- digits representing the code point of the character in base sixteen.
+ significant characters: for each character in the parameter value that is one of U+0000
+ <control> (NUL), U+0022 QUOTATION MARK ("), U+000A LINE FEED (LF), or U+000D CARRIAGE
+ RETURN (CR), replace the character by a string consisting of a U+0025 PERCENT SIGN character (%)
+ and two ASCII hex digits representing the code point of the character in base
+ sixteen.

 The boundary used by the user agent in generating the return value of this algorithm is the multipart/form-data boundary string. (This value is used
From 87bc6546f7ebec82155370c853fcb610ba68a749 Mon Sep 17 00:00:00 2001
From: "Benjamin C. Wiley Sittler"
Date: Fri, 8 Dec 2017 15:20:11 -0800
Subject: [PATCH 4/6] Escape all the ASCII controls and double \ to \\ to avoid breaking multipart parsers
---
 source | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/source b/source
index fdfc94e0c29..7d73bc3d76f 100755
--- a/source
+++ b/source
@@ -56327,11 +56327,12 @@ fur

 Field names and file names included in the generated multipart/form-data
 resource must undergo multipart parameter-value character replacement for syntactically
- significant characters: for each character in the parameter value that is one of U+0000
- <control> (NUL), U+0022 QUOTATION MARK ("), U+000A LINE FEED (LF), or U+000D CARRIAGE
- RETURN (CR), replace the character by a string consisting of a U+0025 PERCENT SIGN character (%)
- and two ASCII hex digits representing the code point of the character in base
- sixteen.
+ significant characters: for each character in the parameter value that is U+0022 QUOTATION MARK
+ ("), U+007F <control> (DEL) or any other controls in the
+ range U+0000-U+001F except U+001B <control> (ESC), replace the character by a string
+ consisting of a U+0025 PERCENT SIGN character (%) and two ASCII hex digits
+ representing the code point of the character in base sixteen, and replace each U+005C REVERSE
+ SOLIDUS character (\) with two consecutive U+005C REVERSE SOLIDUS characters (\\).

 The boundary used by the user agent in generating the return value of this algorithm is the multipart/form-data boundary string. (This value is used
From 2400d5df34567d934846358cd65c2f9ab8d8b6dd Mon Sep 17 00:00:00 2001
From: "Benjamin C. Wiley Sittler"
Date: Fri, 8 Dec 2017 15:54:02 -0800
Subject: [PATCH 5/6] typofix
---
 source | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/source b/source
index 7d73bc3d76f..5e7bb4f5357 100755
--- a/source
+++ b/source
@@ -56328,7 +56328,7 @@ fur

 Field names and file names included in the generated multipart/form-data
 resource must undergo multipart parameter-value character replacement for syntactically
 significant characters: for each character in the parameter value that is U+0022 QUOTATION MARK
- ("), U+007F <control> (DEL) or any other controls in the
+ ("), U+007F <control> (DEL) or any other control in the
 range U+0000-U+001F except U+001B <control> (ESC), replace the character by a string
 consisting of a U+0025 PERCENT SIGN character (%) and two ASCII hex digits
 representing the code point of the character in base sixteen, and replace each U+005C REVERSE
From d973d73484a78fce317d1d932b63a58d3e3d5cb9 Mon Sep 17 00:00:00 2001
From: "Benjamin C. Wiley Sittler"
Date: Fri, 8 Dec 2017 19:09:33 -0800
Subject: [PATCH 6/6] explain why ESC is not quoted (thanks, @bzbarsky !)
---
 source | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/source b/source
index 5e7bb4f5357..0764323ffa7 100755
--- a/source
+++ b/source
@@ -56332,7 +56332,8 @@ fur
 range U+0000-U+001F except U+001B <control> (ESC), replace the character by a string
 consisting of a U+0025 PERCENT SIGN character (%) and two ASCII hex digits
 representing the code point of the character in base sixteen, and replace each U+005C REVERSE
- SOLIDUS character (\) with two consecutive U+005C REVERSE SOLIDUS characters (\\).
+ SOLIDUS character (\) with two consecutive U+005C REVERSE SOLIDUS characters (\\). U+001B
+ <control> (ESC) is preserved (passed through unquoted) for Web-compatible ISO-2022-JP.

    The boundary used by the user agent in generating the return value of this algorithm is the multipart/form-data boundary string. (This value is used
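For readers who want to see the net effect of the series after PATCH 6/6, the two replacement rules can be sketched in Python. This is only an illustration under stated assumptions, not the specification's algorithm text or any browser's implementation. The function names are hypothetical. The patches do not fix the case of the hex digits, nor the order in which the two rules apply to a file name, so uppercase hex and "escape first, then encode" are choices made here. Python codec names such as "latin-1" also do not map one-to-one onto the encodings a user agent would select.

```python
def escape_parameter_value(value: str) -> str:
    """Percent-escape syntactically significant characters and double backslashes.

    Follows the wording of PATCH 4/6 through 6/6: U+0022, U+007F and the
    controls U+0000-U+001F are escaped, except U+001B (ESC), which passes
    through for ISO-2022-JP. Uppercase hex digits are an assumption.
    """
    out = []
    for ch in value:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")            # one REVERSE SOLIDUS becomes two
        elif ch == '"' or cp == 0x7F or (cp <= 0x1F and cp != 0x1B):
            out.append(f"%{cp:02X}")      # e.g. LF becomes %0A
        else:
            out.append(ch)
    return "".join(out)


def ncr_fallback(text: str, encoding: str) -> bytes:
    """Encode text, replacing unencodable characters with decimal &#NNNN; references.

    Python's built-in "xmlcharrefreplace" error handler emits "&#", the code
    point in base ten, then ";", which matches the wording of PATCH 1/6.
    """
    return text.encode(encoding, errors="xmlcharrefreplace")


def encode_file_name(name: str, encoding: str) -> bytes:
    """Apply both rules to a file name (the ordering is an assumption)."""
    return ncr_fallback(escape_parameter_value(name), encoding)
```

With these definitions, `encode_file_name('a"b\n€.txt', "latin-1")` escapes the quotation mark and the line feed and falls back to a numeric character reference for the euro sign, which ISO-8859-1 cannot represent.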