From 99a4bf14478721004d0804bae678a29e20470149 Mon Sep 17 00:00:00 2001
From: Anne van Kesteren
Date: Sun, 16 Aug 2015 08:53:21 +0200
Subject: [PATCH] Define path segments and use them in the parser.

Second part towards fixing #33.
---
 url.bs   |  84 ++++++++++++++++++++++---------------
 url.html | 126 +++++++++++++++++++++++++++++++------------------------
 2 files changed, 122 insertions(+), 88 deletions(-)

diff --git a/url.bs b/url.bs
index cdacb549..e7d68783 100644
--- a/url.bs
+++ b/url.bs
@@ -887,10 +887,11 @@ the following

 either optionally followed by "?" and a query.
@@ -907,11 +908,17 @@ should be registered in the IANA URI [sic] Schemes registry.
 must be a relative URL, followed by "#" and a fragment.
-A relative URL must be either a
-scheme-relative URL, or a path that does not start with a
-scheme and ":", optionally followed by a
-"?" and a query.
-
+A relative URL must be one of
+the following
+
+
+
+any optionally followed by "?" and a query.

 A non-null base URL is necessary when parsing a relative URL.
@@ -925,11 +932,32 @@ by ":" and a port, optionally followed by a
 ASCII digits.
 A path-absolute URL must be
-a path that starts with "/" and does not start with
-"//".
+"/" followed by a path-relative URL.
+
+A path-relative URL must be
+zero or more path segments, separated from each other by "/", and not
+start with "/".
+
+A path segment must be one
+of the following
+
+
+
+A
+single-dot path segment
+must be "." or an ASCII case-insensitive match for
+"%2e".
-A path must be zero or more
-URL units, excluding "?".
+A
+double-dot path segment
+must be ".." or an ASCII case-insensitive match for one of
+".%2e", "%2e.", and "%2e%2e".
 A query must be zero or more URL units.
@@ -1632,31 +1660,19 @@ optionally with an encoding

 • If url is special and c is "\", syntax violation.
-•
-  If buffer, lowercased, matches any row
-  in the first column of the following table, set buffer to the contents
-  of the cell in the second column of the matched row:
-
-
-  "%2e"     "."
-  ".%2e"    ".."
-  "%2e."
-  "%2e%2e"
-
-
-• If buffer is "..", pop url's
-  path, and then if neither c is "/", nor
-  url is special and c is "\", append the empty
-  string to url's path.
-• Otherwise, if buffer is "." and if neither c
-  is "/", nor url is special and c is
-  "\", append the empty string to url's
+• If buffer is a double-dot path segment, pop
+  url's path, and then if neither c is
+  "/", nor url is special and c is
+  "\", append the empty string to url's path.
+
+• Otherwise, if buffer is a single-dot path segment and if
+  neither c is "/", nor url is special and
+  c is "\", append the empty string to url's
   path.
 •
-  Otherwise, if buffer is not
-  ".", run these subsubsteps:
+  Otherwise, if buffer is not a single-dot path segment, run
+  these subsubsteps:
 1.
diff --git a/url.html b/url.html
index 029274db..e79c5698 100644
--- a/url.html
+++ b/url.html
@@ -30,7 +30,7 @@

      URL

      Living Standard — Last Updated -

      +
      @@ -1430,12 +1430,13 @@

      a scheme that is an ASCII case-insensitive - match for a special scheme, followed by a scheme-relative URL + match for a special scheme, followed by ":" and a + scheme-relative URL

    2. a scheme that is not an - ASCII case-insensitive match for a special scheme, followed by a - relative URL + ASCII case-insensitive match for a special scheme, followed by + ":" and a relative URL

      @@ -1456,11 +1457,25 @@

      fragment.

      -

      A relative URL must be either a -scheme-relative URL, or a path that does not start with a -scheme and ":", optionally followed by a -"?" and a query. +

      A relative URL must be one of +the following + +

      + + +

      any optionally followed by "?" and a query.

      A non-null base URL is necessary when @@ -1478,12 +1493,43 @@

      A path-absolute URL must be -a path that starts with "/" and does not start with -"//". +"/" followed by a path-relative URL.

      -

      A path must be zero or more -URL units, excluding "?". +

      A path-relative URL must be +zero or more path segments, separated from each other by "/", and not +start with "/". + +

      +

      A path segment must be one +of the following + +

      + + + +

      A +single-dot path segment +must be "." or an ASCII case-insensitive match for +"%2e". + +

      +

      A +double-dot path segment +must be ".." or an ASCII case-insensitive match for one of +".%2e", "%2e.", and "%2e%2e".

      A query must be zero or more @@ -2587,51 +2633,23 @@

      lowercased, matches any row - in the first column of the following table, set buffer to the contents - of the cell in the second column of the matched row: - -

      - - - - - - - -
      "%2e" - "." - -
      ".%2e" - ".." - -
      "%2e." - -
      "%2e%2e" - -
      - - - -
    3. -

      If buffer is "..", pop url’s - path, and then if neither c is "/", nor - url is special and c is "\", append the empty - string to url’s path. +

      If buffer is a double-dot path segment, pop + url’s path, and then if neither c is + "/", nor url is special and c is + "\", append the empty string to url’s path.

    4. -

      Otherwise, if buffer is "." and if neither c - is "/", nor url is special and c is - "\", append the empty string to url’s +

      Otherwise, if buffer is a single-dot path segment and if + neither c is "/", nor url is special and + c is "\", append the empty string to url’s path.

    5. -

      Otherwise, if buffer is not - ".", run these subsubsteps: +

      Otherwise, if buffer is not a single-dot path segment, run + these subsubsteps:

        @@ -4772,6 +4790,7 @@

        domainToASCII(domain), in §6.2
      1. domain to Unicode, in §3.1
      2. domainToUnicode(domain), in §6.2 +
      3. double-dot path segment, in §4.1
      4. EOF code point, in §1.1
      5. equals, in §4.4
      6. file host state, in §4.2 @@ -4851,14 +4870,12 @@

        dfn for url, in §4
      7. attribute for URLUtils, URLUtilsReadOnly, in §6.3 -
      8. path - +
      9. path, in §4
      10. path-absolute URL, in §4.1
      11. pathname, in §6.3
      12. path or authority state, in §4.2 +
      13. path-relative URL, in §4.1 +
      14. path segment, in §4.1
      15. path start state, in §4.2
      16. path state, in §4.2
      17. percent decode, in §2 @@ -4907,6 +4924,7 @@

        set the password, in §4.2
      18. set the username, in §4.2
      19. simple encode set, in §2 +
      20. single-dot path segment, in §4.1
      21. special authority ignore slashes state, in §4.2
      22. special authority slashes state, in §4.2
      23. special relative or authority state, in §4.2
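
Reviewer note: the single-dot and double-dot path segment definitions this patch introduces, and the way the path state consumes them, can be sketched roughly as follows. This is an illustration under assumptions, not text from the standard: every helper name (`ascii_lowercase`, `is_single_dot_path_segment`, `is_double_dot_path_segment`, `apply_buffer`) is hypothetical, the `at_separator` flag stands in for the condition 'c is "/", or url is special and c is "\"', pop is assumed to be a no-op on an empty path, and the final branch simply appends buffer because the subsubsteps are not shown in this diff.

```python
# Illustrative sketch only. Helper names are hypothetical and do not come
# from the URL Standard or from this patch.

SINGLE_DOT_FORMS = frozenset({".", "%2e"})
DOUBLE_DOT_FORMS = frozenset({"..", ".%2e", "%2e.", "%2e%2e"})


def ascii_lowercase(text):
    """Lowercase only A-Z, so non-ASCII code points are left untouched."""
    return "".join(chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch for ch in text)


def is_single_dot_path_segment(segment):
    """True for "." or an ASCII case-insensitive match for "%2e"."""
    return ascii_lowercase(segment) in SINGLE_DOT_FORMS


def is_double_dot_path_segment(segment):
    """True for ".." or a case-insensitive ".%2e", "%2e.", or "%2e%2e"."""
    return ascii_lowercase(segment) in DOUBLE_DOT_FORMS


def apply_buffer(path, buffer, at_separator):
    """Apply one buffered segment to path (a list of strings), in place.

    at_separator is an assumed stand-in for: c is "/", or url is special
    and c is "\\".
    """
    if is_double_dot_path_segment(buffer):
        if path:  # assumption: popping an empty path does nothing
            path.pop()
        if not at_separator:
            path.append("")
    elif is_single_dot_path_segment(buffer):
        if not at_separator:
            path.append("")
    else:
        # Simplification: the real subsubsteps are elided in this diff.
        path.append(buffer)


path = ["a", "b"]
apply_buffer(path, "%2E.", at_separator=True)
print(path)  # ['a']
```

Expressing the checks as predicates, rather than rewriting buffer through a lookup table first, mirrors the direction of the patch: the parser steps refer to the named definitions instead of normalizing the percent-encoded forms up front.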