Python crawler: rolling extraction of website URLs from around the world
Latest commit 238dcba Jan 19, 2014
src activating Jan 19, 2014
.gitignore ignore Mar 26, 2013
README.md

spider

Python crawler

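Version 1 feature summary: starting from hao123 as the entry page, the crawler follows outbound links in a rolling fashion and collects site URLs. For each site it records the number of internal and external links, the title, and other information.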
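The rolling crawl described above can be sketched roughly as follows. This is a minimal illustration, not the code in `src`: the names `PageParser`, `summarize_page`, and `crawl` are hypothetical, and so is the `fetch` callable (it takes a URL and returns HTML text or `None`). The sketch assumes "internal" means same host as the page and "external" means any other host, which the README does not spell out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageParser(HTMLParser):
    """Collect the <title> text and every <a href> value on a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.hrefs = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarize_page(page_url, html):
    """Return the title and internal/external link counts for one page."""
    parser = PageParser()
    parser.feed(html)
    host = urlparse(page_url).netloc
    internal = external = 0
    external_sites = []  # root URLs of other hosts, used to continue the crawl
    for href in parser.hrefs:
        target = urlparse(urljoin(page_url, href))
        if target.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar links
        if target.netloc == host:
            internal += 1
        else:
            external += 1
            root = f"{target.scheme}://{target.netloc}/"
            if root not in external_sites:
                external_sites.append(root)
    return {
        "url": page_url,
        "title": parser.title.strip(),
        "internal": internal,
        "external": external,
        "external_sites": external_sites,
    }


def crawl(entry_url, fetch, max_pages=100):
    """Breadth-first crawl: visit a site, then queue the sites it links to."""
    queue = deque([entry_url])
    seen = {entry_url}
    records = []
    while queue and len(records) < max_pages:
        url = queue.popleft()
        html = fetch(url)  # assumed: returns HTML text, or None on failure
        if html is None:
            continue
        record = summarize_page(url, html)
        records.append(record)
        for site in record["external_sites"]:
            if site not in seen:
                seen.add(site)
                queue.append(site)
    return records
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library. For a quick offline check, a dictionary's `get` method can stand in for it, for example `crawl("http://www.hao123.com/", pages.get)` where `pages` maps URLs to HTML strings.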

Tested on 32-bit Windows 7. It currently collects roughly 100,000 records every 24 hours.
