Writing a web page scraper in Node.js is simple: three packages are enough. Two are built into Node.js, fs and http (or https); the third is cheerio.

Install cheerio: yarn add cheerio

Implementation:
    const http = require('http');
    const fs = require('fs');
    const path = require('path');
    const cheerio = require('cheerio');

    const URL = 'http://www.qdfuns.com/';
    const cls = 'h2.media-heading a:last-child'; // selector for the article title links
    const outDir = path.join(__dirname, 'data');
    let id = 0;     // number of the page currently being fetched
    const Arr = []; // articles collected across all pages

    // Fetch list pages one at a time up to page ID (page 1 is always fetched),
    // then write everything that was collected to data/article.json.
    function startRequest(ID) {
      id++;
      const url = URL + `notes/id/all:all:all:82dec52c2ad6cd55dc2a84ee5cfdc713/page/${id}.html`;
      return http.get(url, res => {
        res.setEncoding('utf8');
        let rawData = '';
        res.on('data', chunk => { rawData += chunk; });
        res.on('end', () => {
          try {
            const $ = cheerio.load(rawData);
            $(cls).each((i, el) => {
              Arr.push({ title: $(el).text(), url: $(el).attr('href') });
            });
          } catch (e) {
            console.error(e.message);
          }
          // Request the next page only after this one has been parsed.
          if (id < ID) {
            startRequest(ID);
            return;
          }
          // Last page done: make sure the output directory exists, then write once.
          fs.mkdirSync(outDir, { recursive: true });
          fs.writeFile(path.join(outDir, 'article.json'), JSON.stringify({ status: 0, data: Arr }), err => {
            if (err) throw err;
            console.log('Article list written');
          });
        });
      }).on('error', e => {
        console.error(`Error: ${e.message}`);
      });
    }

    startRequest(0);
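The post mentions that either http or https can be used, but the code only requires http. As a minimal sketch of how the choice could be made per URL, the helper below picks the matching built-in module from the URL's protocol. The name `clientFor` is hypothetical and not part of the original code.

```javascript
const http = require('http');
const https = require('https');
const urlModule = require('url');

// Hypothetical helper (an assumption, not from the original post):
// return the built-in client module that matches the URL's protocol.
function clientFor(targetUrl) {
  const { protocol } = new urlModule.URL(targetUrl);
  if (protocol === 'https:') return https;
  if (protocol === 'http:') return http;
  throw new Error(`Unsupported protocol: ${protocol}`);
}
```

With this in place, a request could be written as `clientFor(url).get(url, res => { ... })`, since `http.get` and `https.get` accept the same arguments.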