Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uri issues, summarized #998

Closed
5 tasks done
jebrosen opened this issue May 9, 2019 · 8 comments
Closed
5 tasks done

Uri issues, summarized #998

jebrosen opened this issue May 9, 2019 · 8 comments
Labels
feedback wanted User feedback is needed
Milestone

Comments

@jebrosen
Copy link
Collaborator

jebrosen commented May 9, 2019

Background

Rocket 0.4 introduced Typed URIs and a new Uri type. The uri! macro and strict Uri validation is a valuable addition to Rocket's API and is a great example of Rocket's promise of safety and correctness, but there are a few problems with it due to deficiencies and/or bugs in Rocket and in browsers.

This is a summary of the following issues and PRs:

Bugs

  • Uri does not appear to support fragments in any URI from
    • Workaround: Servers should only deal with fragments in URIs they generate, and most of those can be done via format!.
    • Possible solution: add fragment support, but disallow it or treat it as part of the path/query in incoming URIs
  • Location uses Uri and is therefore too strict about which kinds of URIs it accepts (no relative URI, no fragment)
    • Workaround: Manually set the header as a string.
    • Possible solution: Add support for relative URIs and URIs with fragments to Location and/or Uri
  • Uri is strict about characters such as { needing to be percent-encoded
    • Several browsers send { and other characters unencoded, even when they ought to be encoded.
    • There is currently no workaround. If a user enters a { or certain other relatively innocent charactesr in a plain HTML form submitted with the GET method, Rocket may refuse to process the request entirely.
    • Possible solution: Allow { and some other commonly unencoded characters. Clearly document how this adheres or purposefully does not adhere to the relevant standard and why.
  • (unconfirmed) > isn't even legal in the HTTP header, according to hyper 0.10? Worth double-checking what's going on here.
  • Route matching to the request URI is sensitive to percent encoding (e.g. $ does not match %24).
    • There is currently no workaround
    • Possible solution: Follow an algorithm such as the one outlined in RFC 3986 §6 to do normalization when doing route comparison.

Requests

  • uri! only takes a string literal that must parse as an Origin as its mount point part.
    • Workaround: Manually paste URI together via format!.
    • Feature requested: Support any runtime-provided Uri as the mount point
    • Possible solution: Accept any Into<Uri> as a mount point. Maybe with a special case - string literals can be checked via TryInto<Uri> at compile time and raise an error if they are invalid.
  • uri! with Option query parameters works surprisingly (e.g. Can't use uri! with routes that have optional query params #827 (comment))

What next?

Recapitulating what I wrote in #924 (comment):

Rocket states that it adheres to RFC 7230, which relies on RFC 3986 for many of its URI-related definitions. But strictly adhering to RFC 3986 means that some URLs sent by browsers will be rejected as invalid. WHATWG's URL living standard is intended to replace previous URL standards, but it is still a moving target.

I think the first step to take is to decide if WHATWG's URL living standard is an acceptable specification for Rocket to follow at this time. If it is not suitable, careful and documented deviance from the currently followed specification(s) should be made for the few places that browsers do something different from our expectations.

There are approximately three decisions to be made as I see it:

  • What URI/URL standard to follow - RFC 7230 or WHATWG are the only realistic standards I know of at this time
  • When and how to decode and normalize; either as URIs come in or as they are compared to routes, and which encoding tables to use
  • Extending Uri with support for relative URIs and fragments, or loosening the requirements to use Uri in places such as Location

At the end of the day, I don't want to review or merge a bunch of PRs that modify URI handling without a clear roadmap for doing so in a holistic way.


I will follow up with my own opinions on some of these choices, but I would like to get a feel for what other users of and contributors to Rocket expect as well.

@jebrosen jebrosen added the feedback wanted User feedback is needed label May 9, 2019
@bluetech
Copy link

A lot of places get this wrong. I made a comment when Uri was introduced: #443 (comment).

Request Target, e.g. what appears in GET /somepath, is:

     request-target = origin-form
                    / absolute-form
                    / authority-form
                    / asterisk-form

(/ means alternation)

It is not a superset of URI, nor a subset: some Request Targets are not URIs (e.g. *), and some (http) URIs are not Request Targets (e.g. some relative URIs or anything with a fragment).

For Location, the type is:

     Location = URI-reference

and URI-reference is:

URI-reference = URI / relative-ref

      relative-ref  = relative-part [ "?" query ] [ "#" fragment ]

      relative-part = "//" authority path-abempty
                    / path-absolute
                    / path-noscheme
                    / path-empty

Depending on how far you want to go with static safety, the types will multiply.

@sreeise
Copy link

sreeise commented Aug 3, 2019

The OAuth v2.0 implicit grant can return the access token and other key value pairs in the fragment of the URI. The fragment is appended to the redirect URI and the user is redirected after logging in. However, URI fragment parsing is typically done from the user agent or client side.

What is a more valid use case is that a server may need to send a redirect using the location response header field with a URI that includes a fragment. The OAuth spec notes that some servers do not support fragments in the URI and that a workaround could be to return an HTML page with a button linked to the redirect URI. I am not advocating for adding URI fragment support necessarily but I do believe there are some valid use cases like this.

@magnusmanske
Copy link

FWIW, the issues with { and } also appears to cause problems for the pipe character |. I run a production (!) server based on Rocket that many people link to from wikis, where they copy-paste-edit URLs and often modify them with plain-text chars...

@gorup
Copy link

gorup commented Sep 10, 2019

I'm attempting to integrate a web service with AWS Cognito, a hosted authn/authz provider. I redirect my clients to Cognito, the client logs in, then Cognito redirects the client with a JSON Web Token to me with the format

https://www.example.com/#id_token=123456789tokens123456789&expires_in=3600&token_type=Bearer

As you can see, Cognito uses the # fragment identifier to delineate the json web token from the path. It doesn't appear as Rocket supports this from looking at the documentation and from reading above.

I'm posting this comment so there is more data on the various use cases that the current URI implementation does not support. Thanks!


Cognito documentation: https://docs.aws.amazon.com/cognito/latest/developerguide/cognito-user-pools-app-integration.html (scroll near bottom)

UPDATE

It appears that my understanding of fragments was misguided, and I should access the data from the fragment another way, perhaps through javascript.

See hyperium/hyper#1621

@jebrosen
Copy link
Collaborator Author

Here's another issue; moving it here from #827 (comment):

uri! macro with Option is a bit unintutive:

#![feature(proc_macro_hygiene)]

use rocket::{get, routes, uri};

#[get("/?<input>")]
fn hello(input: Option<String>) -> String {
//    let a = uri!(hello: None);
//    let b = uri!(hello: Some("a".to_string()));
    let c = uri!(hello: _);
    let d = uri!(hello: "a");
//    let e = uri!(hello);

    input.unwrap_or_else(|| "Hello, world!".to_string())
}

fn main() {
    rocket::ignite().mount("/", routes![hello]).launch();
}

None of the commented-out uri! invocations work.

This was discussed in IRC (https://mozilla.logbot.info/rocket/20190816) but I don't remember coming to any conclusion.

@jmaargh
Copy link

jmaargh commented May 13, 2020

I hope this is the correct place to add these thoughts. I can open a separate issue if that would be more helpful.

As discussed in #1021 (and probably elsewhere), it's relatively common for server responses to need to include full URIs to its own endpoints. While the current uri! is nice, it's missing some very large parts of this puzzle: scheme, host, and port in particular. I understand that these things are not necessarily simple to provide (e.g. proxies can prevent rocket from being able to infer them accurately at all), but it would be nice for rocket to provide an ergonomic way of solving this problem.

I'm entirely new to rocket, so this initial spitballing is likely to be off-base, so I don't want to suggest a solution which will likely be very wrong. If it is decided that the current set-up is actually ideal and the problem is fundamentally a bit messy, then perhaps providing an example/docs for how to typically do this would be a good substitute.

@SergioBenitez SergioBenitez added this to the 0.6.0 milestone Nov 2, 2020
SergioBenitez added a commit that referenced this issue Feb 26, 2021
Routing:
  * Unicode characters are accepted anywhere in route paths. (#998)
  * Dyanmic query values can (and must) be any `FromForm` type. The `Form` type
    is no longer useable in any query parameter type.

Capped
  * A new `Capped` type is used to indicate when data has been truncated due to
    incoming data limits. It allows checking whether data is complete or
    truncated. `DataStream` methods returns `Capped` types.
  * Several `Capped<T>` types implement `FromData`, `FromForm`.
  * HTTP 413 (Payload Too Large) errors are now returned when the data limit is
    exceeded. (resolves #972)

Hierarchical Limits
  * Data limits are now hierarchical, delimited with `/`. A limit of `a/b/c`
    falls back to `a/b` then `a` when not set.

Temporary Files
  * A new `TempFile` data and form guard allows streaming data directly to a
    file which can then be persisted.
  * A new `temp_dir` config parameter specifies where to store `TempFile`.
  * The limits `file` and `file/$ext`, where `$ext` is the file extension,
    determines the data limit for a `TempFile`.

Forms Revamp
  * All form related types now reside in a new `form` module.
  * Multipart forms are supported. (resolves #106)
  * Collections are supported in body forms and queries. (resolves #205)
  * Nested forms and structures are supported. (resolves #313)
  * Form fields can be ad-hoc validated with `#[field(value = expr)]`.

Core:
  * `&RawStr` no longer implements `FromParam`.
  * `&str` implements `FromParam`, `FromData`, `FromForm`.
  * `FromTransformedData` was removed.
  * `FromData` gained a lifetime for use with request-local data.
  * All dynamic paramters in a query string must typecheck as `FromForm`.
  * `FromFormValue` removed in favor of `FromFormField`.
  * Dyanmic paramters, form values are always percent-decoded.
  * The default error HTML is more compact.
  * `&Config` is a request guard.
  * The `DataStream` interface was entirely revamped.
  * `State` is only exported via `rocket::State`.
  * A `request::local_cache!()` macro was added for storing values in
    request-local cache without consideration for type uniqueness by using a
    locally generated anonymous type.
  * `Request::get_param()` is now `Request::param()`.
  * `Request::get_segments()` is now `Request::segments()`, takes a range.
  * `Request::get_query_value()` is now `Request::query_value()`, can parse any
    `FromForm` including sequences.
  * `std::io::Error` implements `Responder` as `Debug<std::io::Error>`.
  * `(Status, R)` where `R: Responder` implements `Responder` by setting
    overriding the `Status` of `R`.
  * The name of a route is printed first during route matching.
  * `FlashMessage` now only has one lifetime generic.

HTTP:
  * `RawStr` implements `serde::{Serialize, Deserialize}`.
  * `RawStr` implements _many_ more methods, in particular, those related to the
    `Pattern` API.
  * `RawStr::from_str()` is now `RawStr::new()`.
  * `RawStr::url_decode()` and `RawStr::url_decode_lossy()` only allocate as
    necessary, return `Cow`.
  * `(Status, R)` where `R: Responder` is a responder that overwrites the status
    of `R` to `Status`.
  * `Status` implements `Default` with `Status::Ok`.
  * `Status` implements `PartialEq`, `Eq`, `Hash`, `PartialOrd`, `Ord`.
  * Authority and origin part of `Absolute` can be modified with new
    `Absolute::{with,set}_authority()`, `Absolute::{with,set}_origin()` methods.
  * `Origin::segments()` was removed in favor of methods split into query and
    path parts and into raw and decoded parts.
  * The `Segments` iterator is signficantly smarter. Returns `&str`.
  * `Segments::into_path_buf()` is now `Segments::to_path_buf()`, doesn't
    consume.
  * A new `QuerySegments` is the analogous query segment iterator.
  * Once set, the `expires` field on private cookies is not overwritten.
    (resolves #1506)
  * `Origin::path()` and `Origin::query()` return `&RawStr`, not `&str`.

Codegen:
  * Preserve more spans in `uri!` macro.
  * `FromFormValue` derive removed; `FromFormField` added.
  * The `form` `FromForm` and `FromFormField` field attribute is now named
    `field`. `#[form(field = ..)]` is now `#[form(name = ..)]`.

Examples:
  * `form_validation` and `form_kitchen_sink` removed in favor of `forms`
  * `rocket_contrib::Json` implements `FromForm`.
  * The `json!` macro is exported as `rocket_contrib::json::json`.
  * `rocket_contrib::MsgPack` implements `FromForm`.
  * Added clarifying docs to `StaticFiles`.
  * The `hello_world` example uses unicode in paths.

Internal:
  * Codegen uses new `exports` module with the following conventions:
    - Locals starts with `__` and are lowercased.
    - Rocket modules start with `_` are are lowercased.
    - Stdlib types start with `_` are are titlecased.
    - Rocket types are titlecased.
  * A `header` module was added to `http`, contains header types.
  * `SAFETY` is used as doc-string keyword for `unsafe` related comments.
  * The `Uri` parser no longer recognizes Rocket route URIs.
SergioBenitez added a commit that referenced this issue Mar 2, 2021
Routing:
  * All UTF-8 characters are accepted anywhere in route paths. (#998)
  * `path` is now `uri` in `route` attribute: `#[route(GET, path = "..")]`
    becomes `#[route(GET, uri = "..")]`.

Forms Revamp
  * All form related types now reside in a new `form` module.
  * Multipart forms are supported. (resolves #106)
  * Collections are supported in body forms and queries. (resolves #205)
  * Nested forms and structures are supported. (resolves #313)
  * Form fields can be ad-hoc validated with `#[field(value = expr)]`.
  * `FromFormValue` is now `FromFormField`, blanket implements `FromForm`.
  * Form field values are always percent-decoded apriori.

Temporary Files
  * A new `TempFile` data and form guard allows streaming data directly to a
    file which can then be persisted.
  * A new `temp_dir` config parameter specifies where to store `TempFile`.
  * The limits `file` and `file/$ext`, where `$ext` is the file extension,
    determines the data limit for a `TempFile`.

Capped
  * A new `Capped` type is used to indicate when data has been truncated due to
    incoming data limits. It allows checking whether data is complete or
    truncated. `DataStream` methods return `Capped` types.
  * Several `Capped<T>` types implement `FromData`, `FromForm`.
  * HTTP 413 (Payload Too Large) errors are now returned when data limits are
    exceeded. (resolves #972)

Hierarchical Limits
  * Data limits are now hierarchical, delimited with `/`. A limit of `a/b/c`
    falls back to `a/b` then `a`.

Core
  * `&RawStr` no longer implements `FromParam`.
  * `&str` implements `FromParam`, `FromData`, `FromForm`.
  * `FromTransformedData` was removed.
  * `FromData` gained a lifetime for use with request-local data.
  * The default error HTML is more compact.
  * `&Config` is a request guard.
  * The `DataStream` interface was entirely revamped.
  * `State` is only exported via `rocket::State`.
  * A `request::local_cache!()` macro was added for storing values in
    request-local cache without consideration for type uniqueness by using a
    locally generated anonymous type.
  * `Request::get_param()` is now `Request::param()`.
  * `Request::get_segments()` is now `Request::segments()`, takes a range.
  * `Request::get_query_value()` is now `Request::query_value()`, can parse any
    `FromForm` including sequences.
  * `std::io::Error` implements `Responder` as `Debug<std::io::Error>`.
  * `(Status, R)` where `R: Responder` implements `Responder` by overriding the
    `Status` of `R`.
  * The name of a route is printed first during route matching.
  * `FlashMessage` now only has one lifetime generic.

HTTP
  * `RawStr` implements `serde::{Serialize, Deserialize}`.
  * `RawStr` implements _many_ more methods, in particular, those related to the
    `Pattern` API.
  * `RawStr::from_str()` is now `RawStr::new()`.
  * `RawStr::url_decode()` and `RawStr::url_decode_lossy()` only allocate as
    necessary, return `Cow`.
  * `Status` implements `Default` with `Status::Ok`.
  * `Status` implements `PartialEq`, `Eq`, `Hash`, `PartialOrd`, `Ord`.
  * Authority and origin part of `Absolute` can be modified with new
    `Absolute::{with,set}_authority()`, `Absolute::{with,set}_origin()` methods.
  * `Origin::segments()` was removed in favor of methods split into query and
    path parts and into raw and decoded versions.
  * The `Segments` iterator is smarter, returns decoded `&str` items.
  * `Segments::into_path_buf()` is now `Segments::to_path_buf()`.
  * A new `QuerySegments` is the analogous query segment iterator.
  * Once set, `expires` on private cookies is not overwritten. (resolves #1506)
  * `Origin::path()` and `Origin::query()` return `&RawStr`, not `&str`.

Codegen
  * Preserve more spans in `uri!` macro.
  * Preserve spans `FromForm` field types.
  * All dynamic parameters in a query string must typecheck as `FromForm`.
  * `FromFormValue` derive removed; `FromFormField` added.
  * The `form` `FromForm` and `FromFormField` field attribute is now named
    `field`. `#[form(field = ..)]` is now `#[field(name = ..)]`.

Contrib
  * `Json` implements `FromForm`.
  * `MsgPack` implements `FromForm`.
  * The `json!` macro is exported as `rocket_contrib::json::json!`.
  * Added clarifying docs to `StaticFiles`.

Examples
  * `form_validation` and `form_kitchen_sink` removed in favor of `forms`.
  * The `hello_world` example uses unicode in paths.
  * The `json` example only allocates as necessary.

Internal
  * Codegen uses new `exports` module with the following conventions:
    - Locals starts with `__` and are lowercased.
    - Rocket modules start with `_` and are lowercased.
    - `std` types start with `_` and are titlecased.
    - Rocket types are titlecased.
  * A `header` module was added to `http`, contains header types.
  * `SAFETY` is used as doc-string keyword for `unsafe` related comments.
  * The `Uri` parser no longer recognizes Rocket route URIs.
SergioBenitez added a commit that referenced this issue Mar 4, 2021
Routing:
  * All UTF-8 characters are accepted anywhere in route paths. (#998)
  * `path` is now `uri` in `route` attribute: `#[route(GET, path = "..")]`
    becomes `#[route(GET, uri = "..")]`.

Forms Revamp
  * All form related types now reside in a new `form` module.
  * Multipart forms are supported. (resolves #106)
  * Collections are supported in body forms and queries. (resolves #205)
  * Nested forms and structures are supported. (resolves #313)
  * Form fields can be ad-hoc validated with `#[field(value = expr)]`.
  * `FromFormValue` is now `FromFormField`, blanket implements `FromForm`.
  * Form field values are always percent-decoded apriori.

Temporary Files
  * A new `TempFile` data and form guard allows streaming data directly to a
    file which can then be persisted.
  * A new `temp_dir` config parameter specifies where to store `TempFile`.
  * The limits `file` and `file/$ext`, where `$ext` is the file extension,
    determines the data limit for a `TempFile`.

Capped
  * A new `Capped` type is used to indicate when data has been truncated due to
    incoming data limits. It allows checking whether data is complete or
    truncated. `DataStream` methods return `Capped` types.
  * Several `Capped<T>` types implement `FromData`, `FromForm`.
  * HTTP 413 (Payload Too Large) errors are now returned when data limits are
    exceeded. (resolves #972)

Hierarchical Limits
  * Data limits are now hierarchical, delimited with `/`. A limit of `a/b/c`
    falls back to `a/b` then `a`.

Core
  * `&RawStr` no longer implements `FromParam`.
  * `&str` implements `FromParam`, `FromData`, `FromForm`.
  * `FromTransformedData` was removed.
  * `FromData` gained a lifetime for use with request-local data.
  * The default error HTML is more compact.
  * `&Config` is a request guard.
  * The `DataStream` interface was entirely revamped.
  * `State` is only exported via `rocket::State`.
  * A `request::local_cache!()` macro was added for storing values in
    request-local cache without consideration for type uniqueness by using a
    locally generated anonymous type.
  * `Request::get_param()` is now `Request::param()`.
  * `Request::get_segments()` is now `Request::segments()`, takes a range.
  * `Request::get_query_value()` is now `Request::query_value()`, can parse any
    `FromForm` including sequences.
  * `std::io::Error` implements `Responder` as `Debug<std::io::Error>`.
  * `(Status, R)` where `R: Responder` implements `Responder` by overriding the
    `Status` of `R`.
  * The name of a route is printed first during route matching.
  * `FlashMessage` now only has one lifetime generic.

HTTP
  * `RawStr` implements `serde::{Serialize, Deserialize}`.
  * `RawStr` implements _many_ more methods, in particular, those related to the
    `Pattern` API.
  * `RawStr::from_str()` is now `RawStr::new()`.
  * `RawStr::url_decode()` and `RawStr::url_decode_lossy()` only allocate as
    necessary, return `Cow`.
  * `Status` implements `Default` with `Status::Ok`.
  * `Status` implements `PartialEq`, `Eq`, `Hash`, `PartialOrd`, `Ord`.
  * Authority and origin part of `Absolute` can be modified with new
    `Absolute::{with,set}_authority()`, `Absolute::{with,set}_origin()` methods.
  * `Origin::segments()` was removed in favor of methods split into query and
    path parts and into raw and decoded versions.
  * The `Segments` iterator is smarter, returns decoded `&str` items.
  * `Segments::into_path_buf()` is now `Segments::to_path_buf()`.
  * A new `QuerySegments` is the analogous query segment iterator.
  * Once set, `expires` on private cookies is not overwritten. (resolves #1506)
  * `Origin::path()` and `Origin::query()` return `&RawStr`, not `&str`.

Codegen
  * Preserve more spans in `uri!` macro.
  * Preserve spans `FromForm` field types.
  * All dynamic parameters in a query string must typecheck as `FromForm`.
  * `FromFormValue` derive removed; `FromFormField` added.
  * The `form` `FromForm` and `FromFormField` field attribute is now named
    `field`. `#[form(field = ..)]` is now `#[field(name = ..)]`.

Contrib
  * `Json` implements `FromForm`.
  * `MsgPack` implements `FromForm`.
  * The `json!` macro is exported as `rocket_contrib::json::json!`.
  * Added clarifying docs to `StaticFiles`.

Examples
  * `form_validation` and `form_kitchen_sink` removed in favor of `forms`.
  * The `hello_world` example uses unicode in paths.
  * The `json` example only allocates as necessary.

Internal
  * Codegen uses new `exports` module with the following conventions:
    - Locals starts with `__` and are lowercased.
    - Rocket modules start with `_` and are lowercased.
    - `std` types start with `_` and are titlecased.
    - Rocket types are titlecased.
  * A `header` module was added to `http`, contains header types.
  * `SAFETY` is used as doc-string keyword for `unsafe` related comments.
  * The `Uri` parser no longer recognizes Rocket route URIs.
SergioBenitez added a commit that referenced this issue Mar 4, 2021
So. Many. Changes.

This is an insane commit: simultaneously one of the best (because of all
the wonderful improvements!) and one of the worst (because it is just
massive) in the project's history.

Routing:
  * All UTF-8 characters are accepted everywhere in route paths. (#998)
  * `path` is now `uri` in `route` attribute: `#[route(GET, path = "..")]`
    becomes `#[route(GET, uri = "..")]`.

Forms Revamp
  * All form related types now reside in a new `form` module.
  * Multipart forms are supported. (resolves #106)
  * Collections are supported in forms and queries. (resolves #205)
  * Nested structures in forms and queries are supported. (resolves #313)
  * Form fields can be ad-hoc validated with `#[field(validate = expr)]`.
  * `FromFormValue` is now `FromFormField`, blanket implements `FromForm`.
  * Form field values are always percent-decoded apriori.

Temporary Files
  * A new `TempFile` data and form guard allows streaming data directly to a
    file which can then be persisted.
  * A new `temp_dir` config parameter specifies where to store `TempFile`.
  * The limits `file` and `file/$ext`, where `$ext` is the file extension,
    determines the data limit for a `TempFile`.

Capped
  * A new `Capped` type is used to indicate when data has been truncated due to
    incoming data limits. It allows checking whether data is complete or
    truncated.
  * `DataStream` methods return `Capped` types.
  * `DataStream` API has been revamped to account for `Capped` types.
  * Several `Capped<T>` types implement `FromData`, `FromForm`.
  * HTTP 413 (Payload Too Large) errors are now returned when data limits are
    exceeded. (resolves #972)

Hierarchical Limits
  * Data limits are now hierarchical, delimited with `/`. A limit of `a/b/c`
    falls back to `a/b` then `a`.

Core
  * `&RawStr` no longer implements `FromParam`.
  * `&str` implements `FromParam`, `FromData`, `FromForm`.
  * `FromTransformedData` was removed.
  * `FromData` gained a lifetime for use with request-local data.
  * The default error HTML is more compact.
  * `&Config` is a request guard.
  * The `DataStream` interface was entirely revamped.
  * `State` is only exported via `rocket::State`.
  * A `request::local_cache!()` macro was added for storing values in
    request-local cache without consideration for type uniqueness by using a
    locally generated anonymous type.
  * `Request::get_param()` is now `Request::param()`.
  * `Request::get_segments()` is now `Request::segments()`, takes a range.
  * `Request::get_query_value()` is now `Request::query_value()`, can parse any
    `FromForm` including sequences.
  * `std::io::Error` implements `Responder` like `Debug<std::io::Error>`.
  * `(Status, R)` where `R: Responder` implements `Responder` by overriding the
    `Status` of `R`.
  * The name of a route is printed first during route matching.
  * `FlashMessage` now only has one lifetime generic.

HTTP
  * `RawStr` implements `serde::{Serialize, Deserialize}`.
  * `RawStr` implements _many_ more methods, in particular, those related to the
    `Pattern` API.
  * `RawStr::from_str()` is now `RawStr::new()`.
  * `RawStr::url_decode()` and `RawStr::url_decode_lossy()` only allocate as
    necessary, return `Cow`.
  * `Status` implements `Default` with `Status::Ok`.
  * `Status` implements `PartialEq`, `Eq`, `Hash`, `PartialOrd`, `Ord`.
  * Authority and origin part of `Absolute` can be modified with new
    `Absolute::{with,set}_authority()`, `Absolute::{with,set}_origin()` methods.
  * `Origin::segments()` was removed in favor of methods split into query and
    path parts and into raw and decoded versions.
  * The `Segments` iterator is smarter, returns decoded `&str` items.
  * `Segments::into_path_buf()` is now `Segments::to_path_buf()`.
  * A new `QuerySegments` is the analogous query segment iterator.
  * Once set, `expires` on private cookies is not overwritten. (resolves #1506)
  * `Origin::path()` and `Origin::query()` return `&RawStr`, not `&str`.

Codegen
  * Preserve more spans in `uri!` macro.
  * Preserve spans `FromForm` field types.
  * All dynamic parameters in a query string must typecheck as `FromForm`.
  * `FromFormValue` derive removed; `FromFormField` added.
  * The `form` `FromForm` and `FromFormField` field attribute is now named
    `field`. `#[form(field = ..)]` is now `#[field(name = ..)]`.

Contrib
  * `Json` implements `FromForm`.
  * `MsgPack` implements `FromForm`.
  * The `json!` macro is exported as `rocket_contrib::json::json!`.
  * Added clarifying docs to `StaticFiles`.

Examples
  * `form_validation` and `form_kitchen_sink` removed in favor of `forms`.
  * The `hello_world` example uses unicode in paths.
  * The `json` example only allocates as necessary.

Internal
  * Codegen uses new `exports` module with the following conventions:
    - Locals starts with `__` and are lowercased.
    - Rocket modules start with `_` and are lowercased.
    - `std` types start with `_` and are titlecased.
    - Rocket types are titlecased.
  * A `header` module was added to `http`, contains header types.
  * `SAFETY` is used as doc-string keyword for `unsafe` related comments.
  * The `Uri` parser no longer recognizes Rocket route URIs.
@the10thWiz the10thWiz mentioned this issue Apr 28, 2021
@the10thWiz
Copy link
Collaborator

How should Rocket handle certain parts of the Uri being missing? IIRC it's not technically invalid, at least according to spec, but it might cause issues when using the parsed Uri.

Technically, according to RFC3986, a Uri looks sort of like [ scheme ":" ] [ "//" host ] [ origin ].

There is no reason scheme can't be an empty string, and it looks like Rocket's parser allows this. The parse will generate None if the scheme and associated colon is omitted, and a Some("") when the colon is present.

The host futher breaks down into [ userinfo "@" ] domain [ ":" port ]. Rocket's parse also allows all three parts to be empty, although since the port is saved as an integer, when displaying it, it produces a different Uri. Is this acceptable? I think this should be okay for the most part, since they should be sematically the same.

The origin futher breaks down into [ "/" path ]+ [ "?" query ] [ "#" fragment ]. If the path is omitted (including the slash), Rocket treats it as an empty string, which should be fine since the leading slash is included. Rocket sometimes allows the query part to be empty, either treating it as an empty string, or returning a parse error. (Rocket doesn't parse fragments yet).

Rocket considers an empty string to be an invalid Uri, which is sort of an edge case for the RFC. This is probably for the best, since it's highly unlikely to be sent legitamely, and there are multiple ways to interpert it.

Rocket considers an empty query string to be an error if and only if there is no scheme and there is no leading slash. E.g. a?, a/?, and //a? all fail to parse, while /a?, //a?, ://a?, :a?, and :/a? pass the displays_eq test.

tl;dr: It looks like rocket might have some issues, depending on how you interpert the RFC, and Rocket definitly has issues with query strings.

@SergioBenitez
Copy link
Member

SergioBenitez commented May 20, 2021

@the10thWiz Your comment contains a lot of inaccuracies, unfortunately, so I'd like to address them for posterity.

How should Rocket handle certain parts of the Uri being missing? IIRC it's not technically invalid, at least according to spec, but it might cause issues when using the parsed Uri.

Technically, according to RFC3986, a Uri looks sort of like [ scheme ":" ] [ "//" host ] [ origin ].

This is not accurate. There are many forms of URIs specified via 3986, none of which look like this. The closest thing to what you've shown here is a URI-reference in 7230, but even that is different.

There is no reason scheme can't be an empty string, and it looks like Rocket's parser allows this. The parse will generate None if the scheme and associated colon is omitted, and a Some("") when the colon is present.

A scheme in an absolute URI cannot be empty, according to RFC 7230/3896:

scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )

The scheme is followed by a :. if there is no :, there cannot be a scheme.

The host futher breaks down into [ userinfo "@" ] domain [ ":" port ]. Rocket's parse also allows all three parts to be empty, although since the port is saved as an integer, when displaying it, it produces a different Uri. Is this acceptable? I think this should be okay for the most part, since they should be sematically the same.

This is not a host but an authority:

authority     = [ userinfo "@" ] host [ ":" port ]

All parts can be empty according to the RFC. Rocket does not store the port as an integer but as an Option<u16>. It should reflect the URI accurately.

The origin futher breaks down into [ "/" path ]+ [ "?" query ] [ "#" fragment ]. If the path is omitted (including the slash), Rocket treats it as an empty string, which should be fine since the leading slash is included. Rocket sometimes allows the query part to be empty, either treating it as an empty string, or returning a parse error. (Rocket doesn't parse fragments yet).

This is not correct. An origin-form URI must have a leading /, and a fragment is not allowed. Rocket does not "sometimes" allow the query part to be empty; it can always be empty, no matter the URI form. It does not return an error.

origin-form = absolute-path [ "?" query ]

absolute-path = 1*( "/" segment )

Rocket considers an empty string to be an invalid Uri, which is sort of an edge case for the RFC. This is probably for the best, since it's highly unlikely to be sent legitamely, and there are multiple ways to interpert it.

This was wrong. An empty string is a valid URI.

Rocket considers an empty query string to be an error if and only if there is no scheme and there is no leading slash. E.g. a?, a/?, and //a? all fail to parse, while /a?, //a?, ://a?, :a?, and :/a? pass the displays_eq test.

You are conflating different URI forms for which the behavior should differ.

tl;dr: It looks like rocket might have some issues, depending on how you interpert the RFC, and Rocket definitly has issues with query strings.

There is a single way to interpret the RFC, via its grammar, and Rocket had no issues with query strings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feedback wanted User feedback is needed
Projects
None yet
Development

No branches or pull requests

8 participants