Elements.select() returns not all matching elements #664

Closed
luksch opened this issue Dec 6, 2015 · 4 comments

luksch commented Dec 6, 2015

Suppose the following example:

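import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

// Four divs with the same classes; the first and last also have the same text.
String htmlpart = ""
        + "<div class=\"a b\">1</div>"
        + "<div class=\"a b\">2</div>"
        + "<div class=\"a b\">3</div>"
        + "<div class=\"a b\">1</div>";
Document doc = Jsoup.parse(htmlpart);

// Two-step selection: all .a elements, then those among them matching .b
Elements elsA = doc.select(".a");
Elements elsAB_1 = elsA.select(".b");
// One-step selection with a combined selector
Elements elsAB_2 = doc.select(".a.b");

System.out.println("elsAB_1:\n" + elsAB_1); // the last div is missing
System.out.println("elsAB_2:\n" + elsAB_2); // all four divs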

I would expect the output of both println statements to be identical, but it is not. The first output does not contain the last div, i.e. the div that looks exactly like the first one.

The second println output looks correct. Why is the last div lost in the first case? Is this expected behavior or a bug?

The origin of this issue is a Stack Overflow question: http://stackoverflow.com/questions/34117769/how-to-use-jsoup-get-repeating-number

luksch (Author) commented Jan 14, 2016

Here is another example of this:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// The second row has one th and three td cells; the last two td cells have identical content.
String htmlStr = ""
        +"<table class=\"xt\">"
        +"  <tbody>"
        +"   <tr><th class=\"bg-warning\" colspan=\"3\">-</th></tr>"
        +"   <tr><th class=\"bg-warning\">2014-08-29</th><td>0</td><td>0.00</td><td>0.00</td></tr>"
        +"  </tbody>"
        +"</table>";
Document doc = Jsoup.parse(htmlStr);
Element xt = doc.select("table.xt").first();
// Chained selection on the second row
int thCols = xt.select("tr").eq(1).select("th").size();
int tdCols = xt.select("tr").eq(1).select("td").size();
// Single combined selector for the same row
int tdCols2 = xt.select("tr:eq(1)>td").size();

System.out.println(thCols); // prints 1
System.out.println(tdCols); // prints 2, but 3 is expected
System.out.println(tdCols2); // prints 3 as expected

Again, the basis of this is a Stack Overflow question: http://stackoverflow.com/questions/34795208/jsoup-am-not-being-able-to-correctly-parse-two-consecutive-table-cells-with-ide/

akuma commented Jan 21, 2016

I am seeing this bug as well.

luksch (Author) commented Jan 25, 2016

It seems to be related to #614

jhy (Owner) commented Apr 18, 2016

Fixed in 1.9.1 with identity equals
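
For readers wondering what "identity equals" changes, here is a minimal sketch that does not use jsoup at all. It assumes the lost elements were dropped by a de-duplication step that relied on a value-based equals(); that is an inference from the comment above and from the two reports, not a description of jsoup's actual internals. FakeElement, sampleDivs and both count methods are hypothetical names made up for this illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class DedupSketch {

    // Hypothetical stand-in for a parsed element. This is NOT jsoup's Element class.
    static final class FakeElement {
        final String classes;
        final String text;

        FakeElement(String classes, String text) {
            this.classes = classes;
            this.text = text;
        }

        // Value-based equality: distinct nodes with the same content compare equal.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof FakeElement)) return false;
            FakeElement other = (FakeElement) o;
            return classes.equals(other.classes) && text.equals(other.text);
        }

        @Override
        public int hashCode() {
            return Objects.hash(classes, text);
        }
    }

    // Same shape as the four divs in the first report: texts 1, 2, 3, 1.
    static List<FakeElement> sampleDivs() {
        return Arrays.asList(
                new FakeElement("a b", "1"),
                new FakeElement("a b", "2"),
                new FakeElement("a b", "3"),
                new FakeElement("a b", "1"));
    }

    // De-duplicate via equals()/hashCode(): equal-looking nodes collapse into one.
    static int countDistinctByValue(List<FakeElement> els) {
        Set<FakeElement> seen = new LinkedHashSet<FakeElement>(els);
        return seen.size();
    }

    // De-duplicate via reference identity (==): only the very same object is dropped.
    static int countDistinctByIdentity(List<FakeElement> els) {
        Set<FakeElement> seen =
                Collections.newSetFromMap(new IdentityHashMap<FakeElement, Boolean>());
        seen.addAll(els);
        return seen.size();
    }

    public static void main(String[] args) {
        List<FakeElement> divs = sampleDivs();
        System.out.println(countDistinctByValue(divs));    // prints 3
        System.out.println(countDistinctByIdentity(divs)); // prints 4
    }
}
```

With value-based equality the fourth div is treated as a duplicate of the first, which matches the missing div in the first report and the missing td in the second. With identity comparison, all four distinct nodes are kept.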

jhy closed this as completed Apr 18, 2016