Fix for PrematureEOF
transfer closed with outstanding read data remaining
Mariusz Smykula committed Apr 26, 2013
1 parent 834d314 commit 49f1647
Showing 2 changed files with 24 additions and 1 deletion.
2 changes: 1 addition & 1 deletion src/main/java/org/jsoup/helper/DataUtil.java
@@ -120,7 +120,7 @@ static ByteBuffer readToByteBuffer(InputStream inStream, int maxSize) throws IOE
int read;
int remaining = maxSize;

- while (true) {
+ while (inStream.available() > 0) {

mariuszs (Owner) commented on Apr 27, 2013:

Stop reading if the transfer is closed by the server.

read = inStream.read(buffer);
if (read == -1) break;
if (capped) {
23 changes: 23 additions & 0 deletions src/test/java/org/jsoup/integration/PrematureEOFTest.java
@@ -0,0 +1,23 @@
package org.jsoup.integration;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.junit.Test;

import java.io.IOException;
import java.net.URL;

import static org.junit.Assert.assertTrue;

/**
* @author mariuszs@gmail.com
*/
public class PrematureEOFTest {

@Test
public void fetchURl() throws IOException {
String url = "http://www.dotnetnuke.com/Resources/Blogs/rssid/99.aspx";
Document doc = Jsoup.connect(url).get();
assertTrue(doc.body().html().contains("channel"));
}
}

1 comment on commit 49f1647

@rdkrajunus

Hi, thanks for your work on this fix. I think I've found a case where this fix causes the library to parse less well than it did before the fix.

Without this commit, the following code prints list elements from 2007-2013 (I'm not sure why jsoup doesn't retrieve the others):

Document doc = Jsoup.connect("http://www.informatik.uni-trier.de/~ley/pers/hd/h/Han:Jiawei.html").get();
Elements allYears = doc.select("li.year");
System.out.println(allYears);

With this commit, that same code prints no list elements, which seems like a step backward.

Do you know why that is the case, and is there a way to fix it?

Thanks again, mariuszs,
Rich
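
One possible explanation for the regression described above, offered as a sketch rather than a confirmed diagnosis: the Javadoc for `InputStream.available()` describes its return value as an estimate of the bytes that can be read without blocking, so a result of 0 does not necessarily mean the stream has ended. If the connection's stream reports 0 before the response body has arrived, the new loop would exit without reading anything. The class `AvailableDemo`, the stream `NotYetBufferedStream`, and both method names below are hypothetical and are not part of jsoup; the stream only imitates a socket whose data is not yet buffered.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class AvailableDemo {

    /**
     * Hypothetical stream imitating a socket whose next bytes have not arrived yet:
     * available() reports 0 although read() would still deliver data.
     */
    static class NotYetBufferedStream extends FilterInputStream {
        NotYetBufferedStream(byte[] data) {
            super(new ByteArrayInputStream(data));
        }

        @Override
        public int available() {
            return 0; // means "nothing readable without blocking", not "end of stream"
        }
    }

    /** Loop shape introduced by the commit: stops as soon as available() returns 0. */
    static byte[] readWhileAvailable(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8];
        while (in.available() > 0) {
            int read = in.read(buffer);
            if (read == -1) break;
            out.write(buffer, 0, read);
        }
        return out.toByteArray();
    }

    /** Loop shape before the commit: stops only when read() signals end of stream. */
    static byte[] readUntilEof(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8];
        while (true) {
            int read = in.read(buffer);
            if (read == -1) break;
            out.write(buffer, 0, read);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] page = "<li class=\"year\">2013</li>".getBytes(StandardCharsets.UTF_8);
        // The available()-guarded loop reads nothing from this stream.
        System.out.println(readWhileAvailable(new NotYetBufferedStream(page)).length);
        // The read-until-EOF loop returns every byte of the page.
        System.out.println(readUntilEof(new NotYetBufferedStream(page)).length == page.length);
    }
}
```

If this is what happens with the page above, the body would reach the parser empty or truncated, which would be consistent with `doc.select("li.year")` returning no elements. Whether the real connection stream behaves like this for that URL is an assumption that would need to be checked against the live response.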
