
JSoup 1.15.4 and 1.16.1 is decoding my urlencoded string #1952

Closed
michaelarnauts opened this issue May 8, 2023 · 4 comments
Labels: bug (Confirmed bug that we should fix), fixed

Comments

@michaelarnauts

I'm calling Jsoup.connect(url) with the URL http://xxx/api/test/%2B32123, but Jsoup sends the request as http://xxx/api/test/+32123, so the backend interprets the last path segment as 32123.

This looks related to #1914, #1902, #1936 and #1928.

Downgrading to 1.15.3 works.
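
The symptom above can be reproduced with java.net.URI alone. The following is only a sketch of the general decode-then-rebuild pitfall, not jsoup's actual code; the class PathRoundTrip and the helper rebuild are hypothetical names made up for this illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustration only: this is not jsoup's code. It shows how reading the decoded
// path and rebuilding the URL loses the distinction between "%2B" and "+".
class PathRoundTrip {
    // Hypothetical helper: parse, read the *decoded* path, rebuild the URL.
    static String rebuild(String url) {
        try {
            URI parsed = new URI(url);
            String decodedPath = parsed.getPath(); // "%2B" is decoded to "+"
            // The multi-argument constructor only quotes illegal characters.
            // "+" is legal in a path, so it is written out unchanged.
            return new URI(parsed.getScheme(), parsed.getHost(), decodedPath, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String original = "http://xxx/api/test/%2B32123";
        System.out.println(URI.create(original).getRawPath()); // /api/test/%2B32123
        System.out.println(rebuild(original));                 // http://xxx/api/test/+32123
    }
}
```

Once the path has been decoded, the information that the plus sign was originally escaped is gone, so no later encoding step can restore it reliably.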

@ntinnemeier

I am experiencing a similar case that probably has the same cause as this issue. We are passing in a URL of the form: http://xxx/api/123+456. The URLBuilder decodes the URL path:
URI uri = new URI(..., decodePart(this.u.getPath()), ...);

which yields: http://xxx/api/123 456

That is, the plus sign is converted to a space.

Afaik, plus signs in a path shouldn't be encoded (see for example https://stackoverflow.com/questions/2678551/when-should-space-be-encoded-to-plus-or-20 and https://www.baeldung.com/java-url-encoding-decoding).

In 1.15.3 the plus sign wasn't decoded to a space and the URL still works fine.
Encoding the plus sign as %2B also works, because the decoding step turns it back into a plus sign. (But I don't think that's a proper solution.)
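
Both observations in this comment match how form-style decoding treats the plus sign. The body of decodePart is not shown in this thread, so the sketch below assumes it behaves like java.net.URLDecoder; the class PlusDecoding and the helper formDecode are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Assumption: decodePart behaves like URLDecoder, which implements
// application/x-www-form-urlencoded decoding ("+" means space).
class PlusDecoding {
    static String formDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formDecode("/api/123+456"));   // /api/123 456
        System.out.println(formDecode("/api/123%2B456")); // /api/123+456
    }
}
```

Form decoding is meant for query strings and form bodies; applied to a path, it changes the meaning of a literal plus sign.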

jhy self-assigned this on Sep 8, 2023
jhy added the bug and fixed labels on Sep 8, 2023
jhy closed this as completed in 1e69577 on Sep 8, 2023
jhy added this to the 1.16.2 milestone on Sep 8, 2023
@jhy
Owner

jhy commented Sep 8, 2023

Thanks for the report and apologies for the issue! Have fixed this by reverting to the previous behavior of not encoding supplied paths, other than normalizing to ASCII.
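
As a rough idea of what "normalizing to ASCII" without otherwise touching the path could look like, here is a minimal sketch. It is not the code from commit 1e69577; the class AsciiPath and the helper asciiOnly are hypothetical, and the sketch assumes that every ASCII character (including "+", "%", and space) is passed through unchanged.

```java
import java.nio.charset.StandardCharsets;

// Sketch under stated assumptions, not jsoup's implementation:
// ASCII characters are copied as-is, anything else is percent-encoded as UTF-8.
class AsciiPath {
    static String asciiOnly(String path) {
        StringBuilder out = new StringBuilder();
        path.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                out.append((char) cp); // leave "+", "%2B", etc. untouched
            } else {
                byte[] utf8 = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append(String.format("%%%02X", b & 0xFF));
                }
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(asciiOnly("/api/test/%2B32123")); // /api/test/%2B32123
        System.out.println(asciiOnly("/api/123+456"));       // /api/123+456
        System.out.println(asciiOnly("/caf\u00e9"));         // /caf%C3%A9
    }
}
```

Because existing escapes and literal plus signs pass through unchanged, both URLs reported in this issue would reach the server exactly as supplied.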

@michaelarnauts
Author

Thanks for your fix! We had pinned the version to 1.15.3 until now, and updating to 1.17.1 works just fine.

@jhy
Owner

jhy commented Nov 30, 2023

That's great, thanks for confirming.
