crawlit

Python web crawler with limitations.

Installation

  • $ git clone https://github.com/kracekumar/crawlit.git
  • $ cd crawlit
  • $ sudo python setup.py install or $ pip install -r requirements.txt

Usage

Crawl python.org

  • $ crawlit http://python.org

A new directory will be created and all crawled HTML files will be dumped into it.
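The output layout is not documented here, but the dump step might look roughly like the following minimal sketch. save_page, the directory name, and the URL-derived file names are illustrative assumptions, not crawlit's actual scheme:

    import os
    from urllib.parse import urlparse

    def save_page(url, html, out_dir="python.org"):
        """Hypothetical dump step: one .html file per crawled URL.

        The directory and file naming here are assumptions for
        illustration; crawlit's real layout may differ.
        """
        os.makedirs(out_dir, exist_ok=True)
        # Derive a flat file name from the URL path,
        # e.g. /about/apps -> about_apps.html
        name = urlparse(url).path.strip("/").replace("/", "_") or "index"
        path = os.path.join(out_dir, name + ".html")
        with open(path, "w", encoding="utf-8") as f:
            f.write(html)

    save_page("http://python.org/about/", "<html>...</html>")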

Crawl only 2000 pages from python.org

  • $ crawlit http://python.org --count 2000

Features

  • Single threaded
  • Automatic crawler recovery
  • Obeys robots.txt rules
  • Crawls only links from the same domain
  • Downloads only HTML files
  • Uses the requests stream option, so headers are fetched first and the body only when needed (see the sketch after this list)
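Putting the last three features together, a fetch might check robots.txt, the domain, and the Content-Type header before downloading a body. The following is a minimal sketch of that flow under assumptions, not crawlit's actual code; fetch_if_html and START_URL are made-up names:

    import requests
    from urllib.parse import urlparse
    from urllib.robotparser import RobotFileParser

    START_URL = "http://python.org"  # seed URL, as in the usage above

    seed = urlparse(START_URL)
    robots = RobotFileParser(f"{seed.scheme}://{seed.netloc}/robots.txt")
    robots.read()  # fetch and parse robots.txt once up front

    def fetch_if_html(url):
        """Return the page body, or None if a crawl rule rejects it.

        stream=True makes requests return as soon as the headers
        arrive, so Content-Type can be checked before the body is
        transferred.
        """
        if urlparse(url).netloc != seed.netloc:    # same-domain rule
            return None
        if not robots.can_fetch("*", url):         # robots.txt rule
            return None
        resp = requests.get(url, stream=True, timeout=10)
        if "text/html" not in resp.headers.get("Content-Type", ""):
            resp.close()                           # skip non-HTML bodies
            return None
        return resp.text                           # body fetched only here

    print(fetch_if_html(START_URL) is not None)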

TODO

  • Add multiprocessing support for multi-domain URLs
