Added support for XPath 1.0 #80

Closed
wants to merge 399 commits into from



@btd btd commented Mar 18, 2011

You can see the usage in the tests.

jhy added 30 commits February 6, 2010 00:40
jhy and others added 22 commits February 3, 2011 21:45
…. Overriding implementations in Element still return Element.
…, and to change many at once with Elements.tagName(String).
… queries (e.g. "meta[http-equiv], meta[content]") were being parsed incorrectly as OR only queries (e.g. former as "meta, [http-equiv], meta[content]")

Fixed issue where a content-type specified in a meta tag may not be reliably detected, due to the above issue.
@riczhao

riczhao commented Oct 23, 2013

Is no one working on XPath support?

@guitarmind

CSS locators (selectors) are significantly faster than XPath.
It would also be easy to rewrite XPath expressions as CSS locators. :)

For your reference:
http://sauceio.com/index.php/2011/05/why-css-locators-are-the-way-to-go-vs-xpath/

My test with HTMLUnit (using XPath) and Jsoup (using CSS locators) shows the same result!

[Testing Log]

Parsing E:/profiling.html using Xpath ...
Loading doc by HTMLUnit ...
Time spent: 8122.613623 milliseconds.

Searching doc by HTMLUnit ...
XPath: //div[@class="alonesort"]/div[@class="mc"]/dl[@class="fore"]/dd/em/span/a
Matched Element Count: 1260
Time spent: 35.006174 milliseconds.

Parsing E:/profiling.html using CSS Selector ...
Loading doc by Jsoup ...
Time spent: 151.277634 milliseconds.

Searching doc by Jsoup ...
CSS Locator: div.alonesort > div.mc > dl.fore > dd > em > span > a
Matched Element Count: 1260
Time spent: 13.975146 milliseconds.

I think Jsoup is already good enough with its support for CSS locators!
The rules are also shorter and simpler than their XPath counterparts.

@riczhao

riczhao commented Oct 24, 2013

Thanks. My requirement is simple: get an XPath or CSS selector in the browser,
paste it into code, and get the content. Using Chrome, I can easily get the
XPath in the dev tools window.
When I try to translate the XPath into a CSS selector, one problem is that I
don't know how to handle tag[1], which means the first element named 'tag'.
:eq() means the nth child.

@guitarmind

Hi riczhao,

The closest equivalent of "tag[1]" in a CSS selector is "tag:nth-of-type(1)"; "tag:nth-child(1)" matches only when the tag is also the first child of its parent.
You can check the following links:
http://jsoup.org/apidocs/org/jsoup/select/Selector.html
http://stackoverflow.com/questions/16914980/parsing-htmlnot-well-formed-with-jsoup
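
To make the `tag[1]` semantics concrete, here is a minimal sketch that evaluates two XPath 1.0 expressions with the JDK's built-in `javax.xml.xpath` engine rather than jsoup, so it runs without extra dependencies. The class name `XPathPositionDemo`, the helper `select`, and the sample markup are made up for illustration. It shows that `span[1]` counts only `span` siblings, while a "first child" test (what CSS `:nth-child(1)` expresses) counts every element sibling.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathPositionDemo {
    // Evaluates an XPath 1.0 expression against the XML string and
    // returns the text content of every matched node, in document order.
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical markup: the first span is the second child of the div.
        String xml = "<div><p>intro</p><span>first span</span><span>second span</span></div>";

        // span[1] is the first span among its siblings, wherever it sits.
        System.out.println(select(xml, "/div/span[1]"));          // prints [first span]

        // XPath counterpart of CSS span:nth-child(1): the first child, if it is a span.
        System.out.println(select(xml, "/div/*[1][self::span]")); // prints []
    }
}
```

In jsoup selector terms, the first query corresponds to `div > span:nth-of-type(1)` and the second to `div > span:nth-child(1)`.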

@slorber

slorber commented Jun 11, 2014

Hello,

Please document that JSoup does not currently support XPath. This is not really clear at the moment, although it should be obvious at first glance.

@jhy jhy closed this Aug 2, 2015
@ik-j

ik-j commented Mar 22, 2017

I can get a cssSelector using jSoup, but I am not able to get xpath:idrelative, xpath:attributes, or xpath:location.
I am, however, able to get a fixed XPath with the code below:

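StringBuilder absPath = new StringBuilder();
Elements parents = e.parents(); // ancestors of e, closest first; e itself is not included

// Walk from the root down to the immediate parent of e.
for (int j = parents.size() - 1; j >= 0; j--) {
    Element element = parents.get(j);
    // XPath positions are 1-based and count only same-named element siblings,
    // whereas siblingIndex() is 0-based and also counts text nodes.
    int position = 1;
    for (Element prev = element.previousElementSibling(); prev != null; prev = prev.previousElementSibling()) {
        if (prev.tagName().equals(element.tagName())) {
            position++;
        }
    }
    absPath.append("/").append(element.tagName()).append("[").append(position).append("]");
}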

Help would be appreciated :)
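
One way to sanity-check a generated absolute path is to evaluate it and confirm that it resolves back to the starting element. The sketch below does that with the JDK's `org.w3c.dom` and `javax.xml.xpath` APIs instead of jsoup, purely so that it runs without dependencies. The class `AbsoluteXPath`, its helpers, and the sample markup are hypothetical. Unlike the loop above, it also appends the element itself to the path.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AbsoluteXPath {
    // Builds a path such as /html[1]/body[1]/p[2] for the given element.
    // Each step's position is 1-based and counts only preceding element
    // siblings with the same name, which is how XPath interprets name[n].
    static String pathOf(Element element) {
        StringBuilder path = new StringBuilder();
        for (Node node = element; node instanceof Element; node = node.getParentNode()) {
            int position = 1;
            for (Node prev = node.getPreviousSibling(); prev != null; prev = prev.getPreviousSibling()) {
                if (prev instanceof Element && prev.getNodeName().equals(node.getNodeName())) {
                    position++;
                }
            }
            path.insert(0, "/" + node.getNodeName() + "[" + position + "]");
        }
        return path.toString();
    }

    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Evaluates an XPath expression and returns the first matching node, or null.
    static Node evaluate(Document doc, String expression) {
        try {
            return (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical well-formed markup with a text node and a div between the two p elements.
        Document doc = parse("<html><body><p>a</p> text <div/><p>b</p></body></html>");
        Element secondP = (Element) doc.getElementsByTagName("p").item(1);

        String path = pathOf(secondP);
        System.out.println(path); // prints /html[1]/body[1]/p[2]

        // Round trip: the generated path should select the element it was built from.
        System.out.println(secondP.isSameNode(evaluate(doc, path)));
    }
}
```

The same counting logic carries over to jsoup by walking `previousElementSibling()` and comparing `tagName()`.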
