Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reuse url encoding from url crate, don't use separate percent-encoding #11750

Merged
merged 1 commit into from
Feb 21, 2023

Conversation

klensy
Copy link
Contributor

@klensy klensy commented Feb 21, 2023

Reuse encoding from url, don't use separate percent-encoding.

@rustbot
Copy link
Collaborator

rustbot commented Feb 21, 2023

r? @ehuss

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added A-interacts-with-crates.io Area: interaction with registries S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 21, 2023
@klensy
Copy link
Contributor Author

klensy commented Feb 21, 2023

There is similar one use with percent-encoding

pub fn search(&mut self, query: &str, limit: u32) -> Result<(Vec<Crate>, u32)> {
let formatted_query = percent_encode(query.as_bytes(), NON_ALPHANUMERIC);
let body = self.req(
&format!("/crates?q={}&per_page={}", formatted_query, limit),
None,
Auth::Unauthorized,
)?;

and url

pub fn is_url_crates_io(url: &str) -> bool {
Url::parse(url)
.map(|u| u.host_str() == Some("crates.io"))
.unwrap_or(false)
}

but i don't have good (and simple) idea what to do.

@arlosi
Copy link
Contributor

arlosi commented Feb 21, 2023

Thanks! The example in the crates-io is more difficult since it's a relative URL, and the Url crate can't handle that. The other example doesn't use URL encoding at all, so I think it's fine as-is.

@bors r+

@bors
Copy link
Collaborator

bors commented Feb 21, 2023

📌 Commit 37d429c has been approved by arlosi

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 21, 2023
@bors
Copy link
Collaborator

bors commented Feb 21, 2023

⌛ Testing commit 37d429c with merge b6c7fbe...

@bors
Copy link
Collaborator

bors commented Feb 21, 2023

☀️ Test successful - checks-actions
Approved by: arlosi
Pushing b6c7fbe to master...

@bors bors merged commit b6c7fbe into rust-lang:master Feb 21, 2023
@Eh2406
Copy link
Contributor

Eh2406 commented Feb 21, 2023

Are there tests for the encoding of +? I don't want to break existing use cases re: rust-lang/crates.io#4891

@klensy
Copy link
Contributor Author

klensy commented Feb 21, 2023

Are there tests for the encoding of +? I don't want to break existing use cases re: rust-lang/crates.io#4891

No, i don't see tests for that.

cargo search "hello+f" --limit 101
cargo run search "hello+f" --limit 101

both: (go to https://crates.io/search?q=hello%2Bf to see more)

but here the diff (For that usecase, new behavior for is actually more accurate?)

 cargo search "hello f" --limit 101
(go to https://crates.io/search?q=hello%20f to see more)

cargo run search "hello f" --limit 101
(go to https://crates.io/search?q=hello+f to see more)

@arlosi
Copy link
Contributor

arlosi commented Feb 21, 2023

crates.io can handle both encodings of (space) in the query string (%20 and and +). + must always be encoded as %2B

I think there's a copy-paste error in your diff. Searching for "hello f" produces %20 on existing Cargo.

 cargo search "hello f" --limit 101
(go to https://crates.io/search?q=hello%20f to see more)

@Eh2406
Copy link
Contributor

Eh2406 commented Feb 21, 2023

crates.io can handle both encodings of (space) in the query string (%20 and and +).

Yes.

+ must always be encoded as %2B

I think you have it backwards. https://static.crates.io/crates/libgit2-sys/libgit2-sys-0.12.25+1.3.0.crate works but https://static.crates.io/crates/libgit2-sys/libgit2-sys-0.12.25%2B1.3.0.crate dose not. This is a bug in crate.io, or really a bug in s3, but we can not start hitting it.

weihanglo added a commit to weihanglo/rust that referenced this pull request Feb 23, 2023
15 commits in 17b3d0de0897e1c6b8ca347bd39f850bb0a5b9f6..9d5b32f503fc099c4064298465add14d4bce11e6
2023-02-17 19:45:09 +0000 to 2023-02-22 23:04:16 +0000

- refactor(job_queue): docs and move types around (rust-lang/cargo#11758)
- Scrub more of the test environment (rust-lang/cargo#11757)
- Make more reads of environment variables go through the `Config` (rust-lang/cargo#11754)
- Revert "Update curl-sys to use libcurl 7.88.1" (rust-lang/cargo#11755)
- use consistent case (rust-lang/cargo#11748)
- Switch some tests from `build` to `check` (rust-lang/cargo#11725)
- Fix typo in sparse-registry warning message (rust-lang/cargo#11753)
- reuse url encoding from `url` crate, don't use separate `percent-encoding` (rust-lang/cargo#11750)
- Read environment variables through `Config` instead of `std::env::var(_os)` (rust-lang/cargo#11727)
- Update curl-sys to use libcurl 7.88.1 (rust-lang/cargo#11749)
- mdman: update pretty_assertions to reduce deps (rust-lang/cargo#11747)
- Cleanup tests (rust-lang/cargo#11745)
- Enhance help texts of position args (rust-lang/cargo#11740)
- Fix typo (rust-lang/cargo#11741)
- Update comment about cargo-ok (rust-lang/cargo#11724)
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 23, 2023
Update cargo

15 commits in 17b3d0de0897e1c6b8ca347bd39f850bb0a5b9f6..9d5b32f503fc099c4064298465add14d4bce11e6
2023-02-17 19:45:09 +0000 to 2023-02-22 23:04:16 +0000

- refactor(job_queue): docs and move types around (rust-lang/cargo#11758)
- Scrub more of the test environment (rust-lang/cargo#11757)
- Make more reads of environment variables go through the `Config` (rust-lang/cargo#11754)
- Revert "Update curl-sys to use libcurl 7.88.1" (rust-lang/cargo#11755)
- use consistent case (rust-lang/cargo#11748)
- Switch some tests from `build` to `check` (rust-lang/cargo#11725)
- Fix typo in sparse-registry warning message (rust-lang/cargo#11753)
- reuse url encoding from `url` crate, don't use separate `percent-encoding` (rust-lang/cargo#11750)
- Read environment variables through `Config` instead of `std::env::var(_os)` (rust-lang/cargo#11727)
- Update curl-sys to use libcurl 7.88.1 (rust-lang/cargo#11749)
- mdman: update pretty_assertions to reduce deps (rust-lang/cargo#11747)
- Cleanup tests (rust-lang/cargo#11745)
- Enhance help texts of position args (rust-lang/cargo#11740)
- Fix typo (rust-lang/cargo#11741)
- Update comment about cargo-ok (rust-lang/cargo#11724)
@Eh2406
Copy link
Contributor

Eh2406 commented Mar 2, 2023

I did not notice that this was just for search. Please ignore my rant.

@arlosi
Copy link
Contributor

arlosi commented Mar 2, 2023

This is a bug in crate.io, or really a bug in s3, but we can not start hitting it.

I think it's a bug in Cargo & crates.io. We should be encoding the + in the URL. However, fixing it would be a significant breaking change. The actual files in S3 are storing the + as a .

@ehuss ehuss added this to the 1.69.0 milestone Mar 9, 2023
@Turbo87
Copy link
Member

Turbo87 commented Jun 6, 2023

I think it's a bug in Cargo & crates.io. We should be encoding the + in the URL. However, fixing it would be a significant breaking change.

FWIW I looked into the situation today. I agree that it would be a breaking change for cargo, but I don't agree that it is a bug in cargo.

the myth that + needs to be encoded apparently comes from https://datatracker.ietf.org/doc/html/rfc1866#section-8.2.1, but that is only relevant for the application/x-www-form-urlencoded MIME type and for query parameters and
form submission request payloads. it is essentially irrelevant for URL paths from what I can tell.

I've started a more detailed write-up today, but it's still WIP.

@Turbo87
Copy link
Member

Turbo87 commented Jun 7, 2023

for cross-referencing, I've posted my investigation results at rust-lang/crates.io#4891 (comment) :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-interacts-with-crates.io Area: interaction with registries S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants