Update default max body to 2MB. Update default UA.
jhy committed Jan 26, 2020
1 parent 9fdda99 commit 86d69ea
Showing 3 changed files with 10 additions and 4 deletions.
4 changes: 4 additions & 0 deletions CHANGES
@@ -11,6 +11,10 @@ jsoup changelog
* Improvement: ensure HTTP keepalives work when fetching content via body() and bodyAsBytes().
<https://github.com/jhy/jsoup/issues/1232>

+ * Improvement: set the default max body size in Jsoup.Connection to 2MB (up from 1MB) so fewer people get trimmed
+ content if they have not set it, but still in sensible bounds. Also updated the default user-agent to improve
+ default compatibility.
+
* Bugfix: on pages fetched by Jsoup.Connection, a "Mark Invalid" exception might be incorrectly thrown, or the page may
miss some data. This occurred on larger pages when the file transfer was chunked, and an invalid HTML entity happened
to cross a chunk boundary.
6 changes: 4 additions & 2 deletions src/main/java/org/jsoup/Connection.java
@@ -98,8 +98,10 @@ public final boolean hasBody() {

/**
* Set the maximum bytes to read from the (uncompressed) connection into the body, before the connection is closed,
- * and the input truncated. The default maximum is 1MB. A max size of zero is treated as an infinite amount (bounded
- * only by your patience and the memory available on your machine).
+ * and the input truncated (i.e. the body content will be trimmed). <b>The default maximum is 2MB</b>. A max size of
+ * <code>0</code> is treated as an infinite amount (bounded only by your patience and the memory available on your
+ * machine).
+ *
* @param bytes number of bytes to read from the input before truncating
* @return this Connection, for chaining
*/
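
The Javadoc above describes the cap in words: read at most `bytes` bytes, truncate the rest, and treat `0` as unlimited. A minimal sketch of those semantics using only the JDK is shown here. The class `BodyLimitSketch` and the method `readBody` are hypothetical names for illustration and are not jsoup's implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, NOT jsoup's code: illustrates the documented
// maxBodySize semantics (cap the bytes read; 0 means no cap).
final class BodyLimitSketch {
    // Mirrors the new default from this commit: 2MB.
    static final int DEFAULT_MAX_BODY_SIZE = 1024 * 1024 * 2;

    static byte[] readBody(InputStream in, int maxBytes) {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be >= 0");
        }
        boolean capped = maxBytes > 0; // 0 is treated as unlimited
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int remaining = maxBytes;
        try {
            while (true) {
                int toRead = capped ? Math.min(buffer.length, remaining) : buffer.length;
                if (capped && toRead == 0) {
                    break; // cap reached: stop reading, the rest is truncated
                }
                int read = in.read(buffer, 0, toRead);
                if (read == -1) {
                    break; // end of stream before the cap
                }
                out.write(buffer, 0, read);
                if (capped) {
                    remaining -= read;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] page = new byte[10_000];
        // Capped read keeps only the first 4096 bytes.
        System.out.println(readBody(new ByteArrayInputStream(page), 4_096).length);
        // A cap of 0 reads the whole stream.
        System.out.println(readBody(new ByteArrayInputStream(page), 0).length);
    }
}
```

Callers who need the full body of a large page can pass `0` to `maxBodySize`, at the cost of unbounded memory use.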
4 changes: 2 additions & 2 deletions src/main/java/org/jsoup/helper/HttpConnection.java
@@ -57,7 +57,7 @@ public class HttpConnection implements Connection {
* vs in jsoup, which would otherwise default to {@code Java}. So by default, use a desktop UA.
*/
public static final String DEFAULT_UA =
- "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36";
+ "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36";
private static final String USER_AGENT = "User-Agent";
public static final String CONTENT_TYPE = "Content-Type";
public static final String MULTIPART_FORM_DATA = "multipart/form-data";
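
The comment in this hunk notes that a request would otherwise identify itself as `Java`, which is why a desktop UA is sent by default. As a rough illustration of the same idea with the JDK's own `HttpURLConnection`, the sketch below sets the header explicitly before any connection is opened. `UserAgentSketch` and `prepare` are hypothetical names, and the URL is a placeholder; this is not how jsoup wires the header internally.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, NOT jsoup's code: sets a desktop User-Agent on a
// plain JDK connection instead of relying on the JDK's default value.
final class UserAgentSketch {
    // Copied from the new DEFAULT_UA value in this commit.
    static final String DESKTOP_UA =
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36";

    static HttpURLConnection prepare(String url, String userAgent) {
        try {
            // openConnection() only creates the object; no network I/O happens yet.
            HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestProperty("User-Agent", userAgent);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn = prepare("http://example.com/", DESKTOP_UA);
        System.out.println(conn.getRequestProperty("User-Agent"));
    }
}
```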
@@ -551,7 +551,7 @@ public static class Request extends HttpConnection.Base<Connection.Request> impl

Request() {
timeoutMilliseconds = 30000; // 30 seconds
- maxBodySizeBytes = 1024 * 1024; // 1MB
+ maxBodySizeBytes = 1024 * 1024 * 2; // 2MB
followRedirects = true;
data = new ArrayList<>();
method = Method.GET;