a crawler that should be fast/strong/tricky
Python
Branch: master

scheduler
spiders
test
.gitignore
LICENSE
README.md
__init__.py
application.py
config.py
crawler.py
database.py
downloader.py
exception.py
logmanager.py
monitor.py
options.py
proxy.py
requirements.txt
threadPool.py
util.py
webPage.py

README.md

Crawler

A crawler that appears to be quite robust

TODO

  • rewrite it with gevent
  • add proxy support