From 36ef3c8aef34ff77ebf713b6498d008fe853034f Mon Sep 17 00:00:00 2001
From: Anne van Kesteren
Date: Wed, 31 Jan 2018 09:58:59 +0100
Subject: [PATCH] Define data: URL processing

Unfortunately RFC 2397 has some ambiguities and implementations never really followed it in detail.

Tests: https://github.com/w3c/web-platform-tests/pull/6890.

Fixes #234.
---
 fetch.bs | 117 +++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 93 insertions(+), 24 deletions(-)

diff --git a/fetch.bs b/fetch.bs
index 9701c52b5..9fed0d88f 100644
--- a/fetch.bs
+++ b/fetch.bs
@@ -61,11 +61,6 @@ url: https://tools.ietf.org/html/rfc7234#section-1.2.1;text:delta-seconds;type:d
 "publisher": "IETF",
 "title": "HTTP Client Hints"
 },
- "DATAURL": {
- "authors": ["Simon Sapin"],
- "href": "https://simonsapin.github.io/data-urls/",
- "title": "The data URL scheme"
- },
 "HTTPVERBSEC1": {
 "publisher": "US-CERT",
 "href": "https://www.kb.cert.org/vuls/id/867593",
@@ -151,13 +146,14 @@ of abstraction.

 This specification depends on the Infra Standard. [[!INFRA]]

-This specification uses terminology from the ABNF, Encoding, HTML, HTTP, IDL, Streams, and URL
-Standards.
+This specification uses terminology from the ABNF, Encoding, HTML, HTTP, IDL, MIME Sniffing,
+Streams, and URL Standards.
 [[!ABNF]]
 [[!ENCODING]]
 [[!HTML]]
 [[!HTTP]]
 [[!WEBIDL]]
+[[!MIMESNIFF]]
 [[!STREAMS]]
 [[!URL]]
@@ -2983,23 +2979,21 @@ steps:

"data"
-If obtaining a resource from
-request's current url does not return
-failure, then return a response whose
-header list consist of a single
-header whose name is
-`Content-Type` and value is the
-MIME type and parameters returned from
-obtaining a resource,
-body is the data returned from
-obtaining a resource, and
-HTTPS state is request's
-client's HTTPS state
-if request's client is non-null.
-[[!DATAURL]]
-
-Otherwise, return a network error.
+
+1. Let dataURLStruct be the result of running the
+   data: URL processor on request's current url.
+2. If dataURLStruct is failure, then return a network error.
+3. Return a response whose header list consist of a single
+   header whose name is `Content-Type` and
+   value is dataURLStruct's MIME type,
+   serialized, whose body is
+   dataURLStruct's body, and whose
+   HTTPS state is request's client's
+   HTTPS state if request's
+   client is non-null.
+
 "file"
 "ftp"
@@ -6055,6 +6049,78 @@ if the script checks that the URL has the right hostname.
+

+data: URLs
+
+For an informative description of data: URLs, see RFC 2397. This section replaces
+that RFC's normative processing requirements to be compatible with deployed content. [[RFC2397]]
+
+A data: URL struct is a struct that consists of a
+MIME type (a MIME type) and a
+body (a byte sequence).
+
+The data: URL processor takes a URL
+dataURL and then runs these steps:
+
+1. Assert: dataURL's scheme is "data".
+2. Let input be the result of running the URL serializer on
+   dataURL with the exclude fragment flag set.
+3. Remove the leading "data:" string from input.
+4. Let position point at the start of input.
+5. Let mimeType be the result of collecting a sequence of code points that
+   are not equal to U+002C (,), given position.
+6. Strip leading and trailing ASCII whitespace from mimeType.
+
+   This will only remove U+0020 SPACE code points, if any.
+7. If position is past the end of input, then return failure.
+8. Advance position by 1.
+9. Let encodedBody be the remainder of input.
+10. Let body be the string percent decoding of encodedBody.
+11. If mimeType ends with U+003B (;), followed by zero or more U+0020 SPACE, followed by
+    an ASCII case-insensitive match for "base64", then:
+    1. Let stringBody be the isomorphic decode of body.
+    2. Set body to the forgiving-base64 decode of stringBody.
+    3. If body is failure, then return failure.
+    4. Remove the last 6 code points from mimeType.
+    5. Remove trailing U+0020 SPACE code points from mimeType, if any.
+    6. Remove the last U+003B (;) code point from mimeType.
+12. If mimeType starts with U+003B (;), then prepend "text/plain"
+    to mimeType.
+13. Let mimeTypeRecord be the result of parsing
+    mimeType.
+14. If mimeTypeRecord is failure, then set mimeTypeRecord to
+    text/plain;charset=US-ASCII.
+15. Return a new data: URL struct whose
+    MIME type is mimeTypeRecord and
+    body is body.
+
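Reviewer illustration (not part of the patch): the data: URL processor steps added in this hunk can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a conforming implementation. The names `process_data_url`, `DataURLStruct`, `forgiving_base64_decode`, and `parse_mime_type` are hypothetical. The function takes an already serialized URL string instead of a URL record, and `parse_mime_type` is a simplified stand-in for the MIME Sniffing parser that only checks the `type/subtype` shape and keeps the MIME type as a string.

```python
import base64
import re
from dataclasses import dataclass
from typing import Optional
from urllib.parse import unquote_to_bytes

# HTTP token code points, used by the simplified MIME type check below.
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
_MIME_RE = re.compile(_TOKEN + "/" + _TOKEN + r"[ \t\n\r]*(?:;.*)?", re.DOTALL)


@dataclass
class DataURLStruct:
    mime_type: str  # kept as a string here; the spec stores a parsed MIME type
    body: bytes


def forgiving_base64_decode(data: str) -> Optional[bytes]:
    """Approximation of Infra's forgiving-base64 decode; None means failure."""
    data = re.sub(r"[\t\n\x0c\r ]", "", data)  # drop ASCII whitespace
    if len(data) % 4 == 0:
        if data.endswith("=="):
            data = data[:-2]
        elif data.endswith("="):
            data = data[:-1]
    if len(data) % 4 == 1 or not re.fullmatch(r"[A-Za-z0-9+/]*", data):
        return None
    return base64.b64decode(data + "=" * (-len(data) % 4))


def parse_mime_type(value: str) -> Optional[str]:
    """Hypothetical stand-in for the MIME Sniffing parser (shape check only)."""
    value = value.strip("\t\n\r ")
    return value if _MIME_RE.fullmatch(value) else None


def process_data_url(url: str) -> Optional[DataURLStruct]:
    """Sketch of the processor; None stands for the spec's "failure"."""
    assert url[:5].lower() == "data:"                # step 1
    input_ = url.split("#", 1)[0][5:]                # steps 2-3: drop fragment and "data:"
    comma = input_.find(",")                         # steps 4-5: scan up to the first comma
    if comma == -1:                                  # step 7: no comma means failure
        return None
    mime_type = input_[:comma].strip("\t\n\x0c\r ")  # step 6
    body = unquote_to_bytes(input_[comma + 1:])      # steps 8-10
    match = re.search(r"; *base64\Z", mime_type, re.ASCII | re.IGNORECASE)
    if match:                                        # step 11
        decoded = forgiving_base64_decode(body.decode("latin-1"))
        if decoded is None:
            return None
        body = decoded
        mime_type = mime_type[:match.start()]        # drops ";", spaces, and "base64"
    if mime_type.startswith(";"):                    # step 12
        mime_type = "text/plain" + mime_type
    record = parse_mime_type(mime_type)              # step 13
    if record is None:                               # step 14
        record = "text/plain;charset=US-ASCII"
    return DataURLStruct(record, body)               # step 15
```

For example, `process_data_url("data:text/plain;base64,SGVsbG8=")` yields a struct with MIME type `text/plain` and body `b"Hello"`, while `process_data_url("data:text/html")` returns `None` because the input contains no comma.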

 Background reading

 This section and its subsections are informative only.
@@ -6175,6 +6241,7 @@ Brad Porter,
 Bryan Smith,
 Caitlin Potter,
 Cameron McCormack,
+Chris Rebert,
 Clement Pellerin,
 Collin Jackson,
 Daniel Robertson,
@@ -6231,6 +6298,7 @@ Jxck,
 Keith Yeung,
 Kenji Baheux,
 Lachlan Hunt,
+Larry Masinter,
 Liam Brummitt,
 Louis Ryan,
 Lucas Gonze,
@@ -6276,6 +6344,7 @@ Sharath Udupa,
 Shivakumar Jagalur Matt,
 Sigbjørn Finne,
 Simon Pieters,
+Simon Sapin,
 Srirama Chandra Sekhar Mogali,
 Steven Salat,
 Sunava Dutta,