acefei/ace-crawler

Accumulated practice in web crawling

Usage

For details, see the docs of each script in the prototypes/executors directory.

Features

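[x] Integrate scrapy-redis
[x] Adapt the scrapy-redis dupefilter to use a Bloom filter
[x] Add a custom extension that automatically closes scrapy-redis spiders
[x] Redis data migration
[x] General-purpose extraction of the main content of web pages
[x] Deployment with scrapyd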
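
To illustrate the Bloom filter dupefilter item above, here is a minimal in-memory sketch of the idea, not the code in this repository. The names `BloomFilter` and `request_seen`, the bit-array size, and the hash scheme are all assumptions made for illustration. A Redis-backed variant would presumably keep the bits in a Redis string and use `SETBIT`/`GETBIT` instead of a local `bytearray`.

```python
import hashlib


class BloomFilter:
    """Hypothetical in-memory Bloom filter (illustration only)."""

    def __init__(self, num_bits=1 << 20, num_hashes=6):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed into bytes.
        self.bits = bytearray((num_bits + 7) // 8)

    def _offsets(self, value):
        # Derive num_hashes bit offsets from one SHA-256 digest
        # using double hashing: offset_i = h1 + i * h2.
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, value):
        for off in self._offsets(value):
            self.bits[off >> 3] |= 1 << (off & 7)

    def __contains__(self, value):
        # True means "probably seen"; False means "definitely not seen".
        return all(
            self.bits[off >> 3] & (1 << (off & 7))
            for off in self._offsets(value)
        )


def request_seen(bloom, fingerprint):
    """Dupefilter-style check: return True if the fingerprint was
    (probably) seen before, otherwise record it and return False."""
    if fingerprint in bloom:
        return True
    bloom.add(fingerprint)
    return False
```

The trade-off is memory for accuracy: a Bloom filter stores a fixed number of bits regardless of how many fingerprints are added, at the cost of occasional false positives, which for a crawler means a small fraction of unseen URLs may be skipped.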

Inspiration

Fighting back against crawlers: how far can a front-end engineer's imagination go?

About

Distributed General Crawler
