Quoted pairs #68

rustonaut · 2017-11-07T13:04:54Z

I originally only wanted to add a as_str_repr, a to_content method and fix equality comparsion wrt. to quoted-pairs but because Name was used for both names and values and names should be usable in
patterns this was not possible (to be usable in patterns derive(PartialEq) is required so no custom eq implementation). So I also had to split it into Name+Value.

rustonaut · 2017-11-07T13:08:22Z

I also ran the benchmarks and there where basically no changes in performance (the new version was a bit faster but only in a degree which can be cause by noise)

rustonaut · 2017-11-07T13:09:27Z

Wups, travis is still testing a quite a old rust version... I will fix this (I my code, not travis)

rustonaut · 2017-11-07T15:43:19Z

It might make change to raise the minimal rust version from 1.15.0 to the first stable rust version which supports pub(crate) which then would allow you have multiple modules instead of pseudo modules separated with // Name ==============. I originally had Name/Value in sub modules but
hade to throw them back into lib.

If its fine with you I would go through the crate and clean thinks a bit up after bumping
the minimal rust version, like e.g. splitting up the parsing function into multiple ones or
even better use nom for it, rename some of the fields like e.g. slash to slash_idx
, split thinks into multiple modules for readability, etc. but that would be another pull request.

seanmonstar · 2017-11-09T00:18:19Z

Wow, this looks legit. I'll need a moment to review, but wanted to answer your questions: it seems that this is a breaking change, by splitting to a new Value type. In that case, it seems fine to include an increase to the minimum Rust version (preferably not the absolute latest).

For utf-8 vs utf8, would it be a good idea to encode that possibility in the Value type itself, such that UTF_8 will equal either version, without needing 2 constants?

rustonaut · 2017-11-10T12:28:13Z

The rust version I would like to bump it to would be 1.18.0 as it is the first which allows pub(crate) in stable.

Wrt. utf8 I currently consider 3 options:

make Value and enum ( Utf8 / Other { source: &'a str })
move all special handling of utf8 in a method (either has_utf8_charset() on Mime or is_utf8() on Value)
remove all special handling for utf8 from Value but something like Pattern trait + impl<T> PartialEq<T> for Value where T: Pattern and a zero-sized UTF_8 type implementing Pattern (this still needs to be flushed out more)

The benefit of 1. is a faster comparison of utf8 but a very slightly slower comparison for any other param.
(Through not slower then currently, more like the same performance for everything except comparing a utf8 Value to a utf8 Value (not string) which is much faster)

The benefit of 2. is no special handling of utf8 anywhere but in the one method.

The benefit of 3. is that people can create there own Patterns.

rustonaut · 2017-11-10T13:09:09Z

I think it's best to go with 1. charset=utf8 is basically the most used mime parameter, so that should be fine.

I also noticed that I made a mistake by assuming the PartialEq implementations for Mime use Name (now Name/Value) to compare parameters, but they don't. So I still have to fix that.

I also noticed that:

the comparsion of two Mimes depends on the order of the parameters, just that in the spec they are order independent
I have to take another look into where you can place whitespaces (i.e. text/plain ;charset=utf8, charset = utf8 etc.)

Through I would leaf this for another pull request (should be combined into a single braking change).

rustonaut · 2017-11-10T18:15:19Z

I think there is a better way to fix all this, which is to "normalize" while parsing.
We do copy the string anyway when parsing, so we could do so and
normalize it:

strip linear whitespaces
make all case insensitive letters lowercase
replacing any unnecessary quoted pair with a unquoted one,
remove quoted where possible
special handle utf8 to replace any form of utf-8 (UTF8, Utf-8, ...) with utf-8

This would:

this removes any problem with utf-8 and utf8 and case sensitivity of utf8 wrt. converting the mime back to an string / Deref<Target=str>
- through a Value of utf8 compared to a string still needs special handling
it also removes the problem that the mime crate can accept mimes which are valid but unusual
and might be rejected by other tools (e.g. r#"param="a\bc""# can be used but r#"param="abc""#
or better r#"param=abc"# should be preferred)
make some comparisons faster (mainly Value to Value as they can memcmp, but also more values which are not quoted and can use memcmp in a Value to str comparision)

I'm not sure why I didn't do that from the beginning...
I will prototype and benchmark it now, to see if parsing is slower.

seanmonstar · 2017-11-14T22:40:20Z

What you say sounds very reasonable! Does that mean you'd like to edit this PR more, or should I merge and you'd like to explore separately?

rustonaut · 2017-11-17T15:43:15Z

I originally wanted to work a bit more on the PR.
While doing so I reread all relevant RFC (parts) and came to the conclusion that long term parser parameters are needed.

So I will now just fix the comparison of Media Types with string to use Name/Value for comparison and conclude this pull request and then continue to work on a rewriting the parser in a way which accepts parameters.

Why parameters are needed:
see new issue: #73

This required to split Name into Name & Value, as Value needs to custom implement PartialEq but Name needs to derive partial eq, as it should be usable with match statements as pattern.

So now quoted-pairs can be used for what they where meant to escape ". Some additional tests where added to make sure nothing breakes with this change.

(rust stability gurantees do not apply to crates using deny(warnings) as introducing warnings is not seen as a braking change)

Before it had been at rust 1.15.0

…airs

Adds newlines to the end of file where nessesary and removes all `eq_str` function moving their functionality into a `PartialEq<str>` implementation

rustonaut · 2017-11-17T18:38:40Z

I cleaned thinks up a bit, should now be ready to be merged.

Note that I for now removed all special "utf8" (instead of utf-8) handling,
as the best way to making it work completely correct is to normalize while parsing
thinks into the internal buffer (called Source in the crate). Which I moved
to a later pull request due to the thinks in #73.

seanmonstar · 2017-12-01T04:11:50Z

Awesome, thank you! (Sorry, I was off the internet completely for over a week).

rustonaut force-pushed the quoted_pairs branch from b3384e4 to 88e8eff Compare November 10, 2017 13:17

rustonaut force-pushed the quoted_pairs branch from 88e8eff to 43da879 Compare November 17, 2017 17:53

rustonaut added 9 commits November 17, 2017 19:25

chore: implemented support for quoted-pairs

466ffd3

This required to split Name into Name & Value, as Value needs to custom implement PartialEq but Name needs to derive partial eq, as it should be usable with match statements as pattern.

chore: changed parser to accept "\\\"" and "\\\t" and reject "\\"

693ba29

So now quoted-pairs can be used for what they where meant to escape ". Some additional tests where added to make sure nothing breakes with this change.

test: added tests for as_str_repr and to_content

d1b39fe

fix: now can be used again on nightly

ced70c2

(rust stability gurantees do not apply to crates using deny(warnings) as introducing warnings is not seen as a braking change)

doc: minor change in (internal) documentation

6c36c1a

chore(travis): bumped test for minimum rust version to rust 1.18

c75dd32

Before it had been at rust 1.15.0

fix: PartialEq for Mime to Mime and Mime to str now respects quoted-p…

7f8be5c

…airs

refactor: slightly clean up the code

c081116

Adds newlines to the end of file where nessesary and removes all `eq_str` function moving their functionality into a `PartialEq<str>` implementation

test: added tests for the order independence of params

06f9a75

rustonaut force-pushed the quoted_pairs branch from 0d2e902 to 06f9a75 Compare November 17, 2017 18:32

test: added (more correct) tests for case insensitive name comparisons

bbf20d8

seanmonstar merged commit 843636c into hyperium:master Dec 1, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Quoted pairs #68

Quoted pairs #68

rustonaut commented Nov 7, 2017

rustonaut commented Nov 7, 2017

rustonaut commented Nov 7, 2017

rustonaut commented Nov 7, 2017 •

edited

seanmonstar commented Nov 9, 2017

rustonaut commented Nov 10, 2017

rustonaut commented Nov 10, 2017 •

edited

rustonaut commented Nov 10, 2017

seanmonstar commented Nov 14, 2017

rustonaut commented Nov 17, 2017

rustonaut commented Nov 17, 2017 •

edited

seanmonstar commented Dec 1, 2017

Quoted pairs #68

Quoted pairs #68

Conversation

rustonaut commented Nov 7, 2017

rustonaut commented Nov 7, 2017

rustonaut commented Nov 7, 2017

rustonaut commented Nov 7, 2017 • edited

seanmonstar commented Nov 9, 2017

rustonaut commented Nov 10, 2017

rustonaut commented Nov 10, 2017 • edited

rustonaut commented Nov 10, 2017

seanmonstar commented Nov 14, 2017

rustonaut commented Nov 17, 2017

rustonaut commented Nov 17, 2017 • edited

seanmonstar commented Dec 1, 2017

rustonaut commented Nov 7, 2017 •

edited

rustonaut commented Nov 10, 2017 •

edited

rustonaut commented Nov 17, 2017 •

edited