<div id="one">
    <div class="two">
        <a href="http://querylist.cc">QueryList官网</a>
        <img src="http://querylist.com/1.jpg" alt="这是图片">
        <img src="http://querylist.com/2.jpg" alt="这是图片2">
    </div>
    <span>其它的<b>一些</b>文本</span>
</div>
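$rules = array(
    // Scrape the plain text inside the element whose id is "one"
    'text' => array('#one','text'),
    // Scrape the href of the link under the element with class "two"
    'link' => array('.two>a','href'),
    // Scrape the src of the second image under the element with class "two"
    'img' => array('.two>img:eq(1)','src'),
    // Scrape the HTML content of the span tag
    'other' => array('span','html')
);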
Array
(
    [0] => Array
        (
            [text] => QueryList官网 其它的一些文本
            [link] => http://querylist.cc
            [img] => http://querylist.com/2.jpg
            [other] => 其它的<b>一些</b>文本
        )
)
<div class="xx">
    <img data-src="/path/to/1.jpg" alt="">
</div>
<div class="xx">
    <img data-src="/path/to/2.jpg" alt="">
</div>
<div class="xx">
    <img data-src="/path/to/3.jpg" alt="">
</div>
array( 'image' => array('.xx>img','data-src') )
Result: the values are placed separately in Array[0], Array[1], and Array[2]?
Array
(
    [0] => Array
        (
            [image] => /path/to/1.jpg
        )
    [1] => Array
        (
            [image] => /path/to/2.jpg
        )
    [2] => Array
        (
            [image] => /path/to/3.jpg
        )
)
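In Case 1 only one set of data matches the scraping rules, so the result contains only one entry. In Case 2 several pieces of data match the rules, so the result contains several entries. Although Array[0] in Case 1 holds multiple values, they form a single unit and together make up one record.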
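One way to picture the grouping described above is the plain-PHP sketch below. It is a simplified model and not QueryList's actual source: it assumes each rule produces a list of matches and that the i-th match of every rule is stored in row i. The function name groupMatchesByIndex is hypothetical, and the match lists are copied by hand from the two cases rather than scraped.

```php
<?php
// Simplified model (an assumption, not QueryList's real implementation):
// every rule yields a list of matches, and the i-th match of each rule
// is written into row i of the result.
function groupMatchesByIndex(array $matchesPerRule): array
{
    $rows = [];
    foreach ($matchesPerRule as $field => $matches) {
        foreach (array_values($matches) as $i => $value) {
            $rows[$i][$field] = $value;
        }
    }
    return $rows;
}

// Case 1: each of the four rules matches exactly one element,
// so all four fields land in row 0.
$case1 = groupMatchesByIndex([
    'text'  => ['QueryList官网 其它的一些文本'],
    'link'  => ['http://querylist.cc'],
    'img'   => ['http://querylist.com/2.jpg'],
    'other' => ['其它的<b>一些</b>文本'],
]);

// Case 2: the single rule matches three elements,
// so the values are spread over rows 0, 1 and 2.
$case2 = groupMatchesByIndex([
    'image' => ['/path/to/1.jpg', '/path/to/2.jpg', '/path/to/3.jpg'],
]);

print_r($case1);
print_r($case2);
```

Under this model the number of rows follows the number of elements each selector matches, not the number of rules, which is why four rules can still give a single row.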
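@jae-jae I'm glad to receive your reply, and thank you for answering my question. After reading your answer, though, I'm still a little confused. In Case 1 I wrote three scraping rules, 'text', 'link', and 'img', and all three collect their corresponding data, so why does your answer say that only one piece of data matches the rules? I've attached the scraping rules below and look forward to your next reply.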
Case 1:
html:
rules:
Result: everything is placed in Array[0]
Case 2:
html:
rules:
Result: the values are placed separately in Array[0], Array[1], and Array[2]?