Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

percent-encoding: Allow to remove character from AsciiSet #528

Merged
merged 4 commits into from Aug 5, 2019

Conversation

@DoumanAsh
Copy link
Contributor

DoumanAsh commented Jul 24, 2019

Allow to remove character from AsciiSet

I liked trait approach better, but if you're going with something like AsciiSet it would be better to allow to remove bytes, as sometimes it is easier to take NON_ALPHANUMERIC and remove few necessary characters


This change is Reviewable

@DoumanAsh DoumanAsh changed the title oercent-encoding: Allow to remove character from AsciiSet percent-encoding: Allow to remove character from AsciiSet Jul 24, 2019
@DoumanAsh DoumanAsh force-pushed the DoumanAsh:ascii_set_remove branch from 182637a to 0d33781 Jul 24, 2019
@SimonSapin
Copy link
Member

SimonSapin commented Jul 24, 2019

remove looks good to me. What’s the motivation for exposing contains? If you’d like this on crates.io, please also bump the version number to 2.1.0.

@DoumanAsh
Copy link
Contributor Author

DoumanAsh commented Jul 24, 2019

@SimonSapin for contains I have use case when I want to determine if I'm going to use percent_encoding or not at all at runtime.
And previous major version had such ability, so I think it would be convenient to leave it out for user to use too.

Yes, I'll bump version too

UPD: To clarify, my use case is determining whether I should use filename or filename* in Content-Disposition

@SimonSapin
Copy link
Member

SimonSapin commented Jul 24, 2019

Sorry, I still have a hard time seeing how calling contains is useful. Are you writing functions or methods that take a &AsciiSet parameter as input?

What does filename* refer to?

@DoumanAsh
Copy link
Contributor Author

DoumanAsh commented Jul 24, 2019

filename referes to https://tools.ietf.org/html/rfc6266#section-4.3

If you're planning to encode file name in content-disposition header you should use different attribute instead of regular filename and provide encoding information.
I only want to percent encode if file name doesn't contain valid set of symbols for path and/or outside of ASCII.
So originally I always checked ahead if I'm planning to encode file name https://github.com/DoumanAsh/yukikaze/blob/1.0/src/header/content_disposition.rs#L38 like that for 1.0 version

@SimonSapin
Copy link
Member

SimonSapin commented Jul 24, 2019

Simply making contains public and using it in the code above would be incorrect as you’d also need to look for non-ASCII bytes.

It looks like what you’re trying to do is find out whether percent_encode would make any change to a given input string. To do that you could also use std::borrow::Cow::<str>::from(percent_encode(…)). If you get a Cow::Borrowed, the output is the same as the input and none of the bytes are encoded.

@SimonSapin
Copy link
Member

SimonSapin commented Jul 24, 2019

Oh sorry that’s not accurate. That conversion can also return Cow::Borrowed if the input is a single byte that needs to be encoded. (In that case it borrows from a static string.)

Would branching on output == input work for you?

@DoumanAsh
Copy link
Contributor Author

DoumanAsh commented Jul 24, 2019

@SimonSapin Yes, I'd need to check both for non-ASCII and for AsciiSet::contains.

I guess if comparing output with input would be the only choice I'd go for it too, but it would require for me to percent ahead instead of doing it later.
Which may be ok in this case, but I'm not sure about other cases.

Nevertheless, if you feel like you don't want to expose this method, I can remove it from PR.

@DoumanAsh
Copy link
Contributor Author

DoumanAsh commented Aug 1, 2019

@benesch
Copy link
Contributor

benesch commented Aug 1, 2019

This would simplify the AttrCharEncodeSet in seanmonstar/reqwest#583 quite a bit, so I'm greatly in support of this landing as well.

@SimonSapin
Copy link
Member

SimonSapin commented Aug 5, 2019

Looks good, thanks!

@bors-servo r+

@bors-servo
Copy link
Contributor

bors-servo commented Aug 5, 2019

📌 Commit ed7b991 has been approved by SimonSapin

@bors-servo
Copy link
Contributor

bors-servo commented Aug 5, 2019

Testing commit ed7b991 with merge 0e874dd...

bors-servo added a commit that referenced this pull request Aug 5, 2019
percent-encoding: Allow to remove character from AsciiSet

Allow to remove character from AsciiSet

I liked trait approach better, but if you're going with something like AsciiSet it would be better to allow to remove bytes, as sometimes it is easier to take `NON_ALPHANUMERIC` and remove few necessary characters

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/rust-url/528)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Aug 5, 2019

☀️ Test successful - checks-travis
Approved by: SimonSapin
Pushing 0e874dd to master...

@bors-servo bors-servo merged commit ed7b991 into servo:master Aug 5, 2019
3 checks passed
3 checks passed
Travis CI - Pull Request Build Passed
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
homu Test successful
Details
@SimonSapin
Copy link
Member

SimonSapin commented Aug 5, 2019

@DoumanAsh DoumanAsh deleted the DoumanAsh:ascii_set_remove branch Aug 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.