Use CSS selectors, XPath 1.0, or RegExp to select data from HTML. Inspired by Scrapy.
npm i -P xselector
const selector = require('xselector');
let sel = selector.load(html);
sel.xpath('//div').css('img').attr('src');
// when chaining xpath() after another selection, use a relative path (starting with ./)
sel.css('body').xpath('./div//img/@src').values();
sel.regexp(/<title>([^<]+)<\/title>/);
arguments:
- html string
- options object: options of xmldom.
return: Selector extends SelectorList.
- SelectorList#css(selector): SelectorList
- SelectorList#xpath(path): SelectorList
- SelectorList#attr(name): string
- SelectorList#text(): string
- SelectorList#html(): string
- SelectorList#value(): string
  - If selected value is Element, return html;
  - If selected value is Text|Attr, return nodeValue;
  - If selected value is string|number|boolean, return itself.
- SelectorList#regexp(re [, searchText]): string
  - re string|RegExp: a pattern to match a part of the string; if re has match groups, return the first match group.
  - searchText boolean: if true, its context is Selector#text(); if false, its context is Selector#html(). Default is false.
- SelectorList#attrs(name): string[]
- SelectorList#texts(): string[]
- SelectorList#htmls(): string[]
- SelectorList#values(): string[]
- SelectorList#regexps(re [, searchText]): string[]
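
The two rules described for regexp() (first match group wins, and searchText switches the context between html and text) can be sketched in plain JavaScript. This is only an illustration under assumptions, not xselector's actual implementation: firstMatch and regexpSketch are hypothetical helpers, the node is a plain object standing in for a selected element, and returning undefined on a miss is an assumption because the behaviour on no match is not documented here.

```javascript
// Hypothetical helper, not part of xselector: applies the documented rule
// "if re has match groups, return the first match group".
function firstMatch(re, context) {
  // Accept a string or a RegExp; drop the g/y flags so exec() keeps no state between calls.
  const pattern = typeof re === 'string'
    ? new RegExp(re)
    : new RegExp(re.source, re.flags.replace(/[gy]/g, ''));
  const m = pattern.exec(context);
  if (m === null) return undefined; // assumption: a miss yields undefined
  // m[1] is the first capture group; with no groups, fall back to the whole match m[0].
  return m.length > 1 ? m[1] : m[0];
}

// Hypothetical stand-in for regexp(re [, searchText]):
// searchText = true searches the text, false (the default) searches the html.
function regexpSketch(node, re, searchText = false) {
  const context = searchText ? node.text : node.html;
  return firstMatch(re, context);
}

// A plain object standing in for a selected <title> element.
const node = { html: '<title>Example Page</title>', text: 'Example Page' };

console.log(regexpSketch(node, /<title>([^<]+)<\/title>/)); // Example Page
console.log(regexpSketch(node, /\w+/, true));               // Example
console.log(regexpSketch(node, /<title>/, true));           // undefined
```

The third call finds nothing because, with searchText set to true, the tags are no longer part of the searched string.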
Submit an issue if you find a bug or have a suggestion.
Or fork the repo and submit a pull request.
Author: plylrnsdy
GitHub: xselector