Web spiders are usually disliked by websites, but useful for recursive API/page downloads for offline analysis.

khilnani/spidey.py

spidey.py

Web spiders are usually disliked by websites, but they are useful for recursively downloading API responses or pages for offline analysis.

Installation

PyPI location: https://pypi.python.org/pypi/spidey.py

  • Using PyPI - pip install spidey

Usage

Run spidey for detailed help.

  • spidey --dir NEW_DIR --filter DOMAIN --url URL [--base BASE_URL]
  • spidey --dir NEW_DIR --filter DOMAIN --url URL --max MAX_DOWNLOADS
  • Example - spidey --dir test --filter 'www.google.com' --url 'https://www.google.com/' --max 20
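To make the --filter and --max options more concrete, here is a minimal sketch of a recursive download limited to one domain and a maximum number of pages. It is not spidey's actual implementation: the names crawl, LinkParser, and fetch are hypothetical, and the assumption that --filter matches the URL's host exactly is ours, not the tool's documentation.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, domain, max_downloads, fetch):
    """Breadth-first crawl restricted to `domain`.

    Stops once `max_downloads` pages have been fetched.
    `fetch` is a caller-supplied function mapping a URL to its HTML text.
    """
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_downloads:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links against the page URL
            # Assumption: the filter is an exact match on the host name.
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


# An in-memory "site" so the sketch runs offline without touching a real server.
site = {
    "https://example.com/": '<a href="/a">A</a> <a href="https://other.org/">X</a>',
    "https://example.com/a": '<a href="/b">B</a>',
    "https://example.com/b": "",
}
pages = crawl("https://example.com/", "example.com", 2, site.__getitem__)
print(sorted(pages))
```

Passing fetch as a parameter keeps the sketch testable; a real run would swap in a function that performs an HTTP request and, ideally, pauses between requests.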

More Examples

# Short options
spidey \
    -d test \
    -f 'www.google.com' \
    -u 'https://www.google.com/' \
    -b 'https://www.google.com/' \
    -hh '{"Accept" : "application/json"}' \
    -n 2 \
    -m 10 \
    -s 5

# The same invocation with long options
spidey \
    --dir test \
    --filter 'www.google.com' \
    --url 'https://www.google.com/' \
    --base 'https://www.google.com/' \
    --headers '{"Accept" : "application/json"}' \
    --depth 2 \
    --max 10 \
    --sleep 5
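When spidey is driven from a script, the argument list can be assembled programmatically instead of hand-quoting a shell command. The sketch below uses only the long options shown in this README; the helper name build_spidey_command and its parameter names are hypothetical, and serializing headers with json.dumps is an assumption based on the JSON string in the examples above.

```python
import json
import shlex


def build_spidey_command(directory, domain, url, base=None, headers=None,
                         depth=None, max_downloads=None, sleep=None):
    """Assemble a spidey argument list from the options shown in this README."""
    cmd = ["spidey", "--dir", directory, "--filter", domain, "--url", url]
    if base is not None:
        cmd += ["--base", base]
    if headers is not None:
        # Assumption: headers are passed as a JSON object string, as in the examples.
        cmd += ["--headers", json.dumps(headers)]
    if depth is not None:
        cmd += ["--depth", str(depth)]
    if max_downloads is not None:
        cmd += ["--max", str(max_downloads)]
    if sleep is not None:
        cmd += ["--sleep", str(sleep)]
    return cmd


cmd = build_spidey_command(
    "test", "www.google.com", "https://www.google.com/",
    headers={"Accept": "application/json"}, depth=2, max_downloads=10, sleep=5,
)
# shlex.join produces a shell-safe string that is handy for logging.
print(shlex.join(cmd))
```

With spidey installed, the list can be handed to subprocess.run(cmd, check=True); passing a list rather than a single string avoids the shell quoting mistakes that are easy to make with nested quotes in the headers value.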
