Scrapes the pages and resources on a domain, starting from the provided URL.
The local directory structure mirrors the URL paths as closely as possible.
Further links are discovered by inspecting the HTML pages for src and href attributes.
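
The behaviour described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in scrape.py: the names LinkCollector, same_domain and local_path are hypothetical, and the choice of "index.html" for URLs ending in a slash is an assumption.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the values of src and href attributes found on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                # Resolve relative references against the page URL.
                self.links.append(urljoin(self.base_url, value))


def same_domain(url, domain):
    """True if the URL's host is the domain itself or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return host == domain or host.endswith("." + domain)


def local_path(url, out_dir="."):
    """Map a URL to a file path under out_dir that mirrors the URL path."""
    path = urlparse(url).path
    if not path or path.endswith("/"):
        # Assumption: directory-like URLs are stored as index.html.
        path += "index.html"
    return str(PurePosixPath(out_dir) / path.lstrip("/"))
```

For example, under these assumptions local_path("http://google.com/a/b.png", "out") gives "out/a/b.png", and only links for which same_domain(link, "google.com") holds would be followed.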

Usage: scrape.py [OPTIONS] domain url

Options:
  -h, --help  show the help message and exit
  --out  output directory; defaults to the current working directory if not provided
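
A command line with this shape could be parsed along the following lines. This is a sketch using argparse from the standard library; the actual script may use a different parser, and build_parser is a hypothetical name.

```python
import argparse
import os


def build_parser():
    """Build a parser matching the usage line: scrape.py [OPTIONS] domain url."""
    parser = argparse.ArgumentParser(
        prog="scrape.py",
        description="Scrapes the pages and resources on a domain.",
    )
    # Assumption: the default output directory is the current working directory.
    parser.add_argument("--out", default=os.getcwd(), help="output directory")
    parser.add_argument("domain", help="domain to stay within, e.g. github.com")
    parser.add_argument("url", help="URL to start scraping from")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["--out", "./github", "github.com", "http://github.com/"]
)
print(args.out, args.domain, args.url)
```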

Examples:

Scrape the google.com domain, starting at http://google.com/:
  python ./scrape.py google.com http://google.com/

Scrape the github.com domain, store in the provided directory:
  python ./scrape.py --out ./github github.com http://github.com/
