Commit

Escape supplemental characters correctly
jhy committed Oct 20, 2023
1 parent f0eb6bd commit 6ccd158
Showing 3 changed files with 10 additions and 0 deletions.
3 changes: 3 additions & 0 deletions CHANGES
@@ -49,6 +49,9 @@ Release 1.16.2 [PENDING]
     ASCII.
     <https://github.com/jhy/jsoup/issues/1952>
 
+  * Bugfix: in Jsoup.connect(url), strings containing supplemental characters (e.g. emoji) were not URL escaped
+    correctly.
+
   * Bugfix: in Jsoup.connect(url), the ConstrainableInputStream would clear Thread interrupts when reading the body.
     This precluded callers from spawning a thread, running a number of requests for a length of time, then joining that
     thread after interrupting it.
1 change: 1 addition & 0 deletions src/main/java/org/jsoup/helper/UrlBuilder.java
@@ -90,6 +90,7 @@ private static void appendToAscii(String s, boolean spaceAsPlus, StringBuilder s
             } else if (c > 127) { // out of ascii range
                 sb.append(URLEncoder.encode(new String(Character.toChars(c)), UTF_8.name()));
                 // ^^ is a bit heavy-handed - if perf critical, we could optimize
+                if (Character.charCount(c) == 2) i++; // advance past supplemental
             } else {
                 sb.append((char) c);
             }
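
The hunk above shows only the branch that changed. As a rough, standalone sketch of why the extra `i++` matters, the loop below walks a string by `char` index while reading full code points, which is an assumption about the surrounding (unshown) loop in `appendToAscii`. The class `SupplementalEscapeSketch` and the method `escapeNonAscii` are hypothetical names for illustration, not part of jsoup, and the sketch skips the space handling that the real method performs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SupplementalEscapeSketch {
    // Hypothetical helper (not jsoup API): percent-encodes every non-ASCII
    // code point as UTF-8 and copies ASCII characters through unchanged.
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        try {
            for (int i = 0; i < s.length(); i++) {
                int c = s.codePointAt(i); // reads a whole surrogate pair when one starts at i
                if (c > 127) {
                    sb.append(URLEncoder.encode(new String(Character.toChars(c)), StandardCharsets.UTF_8.name()));
                    // A supplementary code point occupies two chars; skip the low
                    // surrogate so it is not visited again on the next iteration.
                    if (Character.charCount(c) == 2) i++;
                } else {
                    sb.append((char) c);
                }
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("/tools/test\uD83D\uDCA9.html"));
    }
}
```

Without the extra increment, the next iteration would land on the second half of the pair, where `codePointAt` returns the lone low surrogate, and that unpaired value would be passed to the encoder as well.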
6 changes: 6 additions & 0 deletions src/test/java/org/jsoup/helper/HttpConnectionTest.java
@@ -260,6 +260,12 @@ public void caseInsensitiveHeaders(Locale locale) {
         assertEquals("https://test.com/foo%20bar/%5BOne%5D?q=white+space#frag", url2.toExternalForm());
     }
 
+    @Test public void encodeUrlSupplementary() throws MalformedURLException {
+        URL url1 = new URL("https://example.com/tools/test💩.html"); // = "/tools/test\uD83D\uDCA9.html"
+        URL url2 = new UrlBuilder(url1).build();
+        assertEquals("https://example.com/tools/test%F0%9F%92%A9.html", url2.toExternalForm());
+    }
+
     @Test void encodedUrlDoesntDoubleEncode() throws MalformedURLException {
         URL url1 = new URL("https://test.com/foo%20bar/%5BOne%5D?q=white+space#frag%20ment");
         URL url2 = new UrlBuilder(url1).build();
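
For readers checking the expected value in `encodeUrlSupplementary`, the following sketch derives `%F0%9F%92%A9` by hand from the surrogate pair `\uD83D\uDCA9`. It uses only JDK calls; the class `Utf8PercentSketch` and the method `percentEncodeUtf8` are hypothetical names introduced here and do not appear in jsoup.

```java
import java.nio.charset.StandardCharsets;

class Utf8PercentSketch {
    // Hypothetical helper: percent-encodes the UTF-8 bytes of a single code point.
    static String percentEncodeUtf8(int codePoint) {
        byte[] bytes = new String(Character.toChars(codePoint)).getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            // Mask to an unsigned value before formatting as two hex digits.
            sb.append('%').append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int cp = "\uD83D\uDCA9".codePointAt(0);
        System.out.println(Integer.toHexString(cp)); // 1f4a9
        System.out.println(Character.charCount(cp)); // 2
        System.out.println(percentEncodeUtf8(cp));   // %F0%9F%92%A9
    }
}
```

The pair combines into the single code point U+1F4A9, which takes four bytes in UTF-8, so a correct escape is exactly four percent-encoded octets.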