serenity-valley/PyCrawler

 
 


usage: PyCrawler.py [-h] [--dbname DBNAME] [--followextern] [--verbose]
                    [--striphtml] [--downloadstatic]
                    starturl crawldepth


positional arguments:
  starturl          The root URL to start crawling from.
  crawldepth        Number of levels to crawl down to before quitting. Default
                    is 10.

optional arguments:
  -h, --help        show this help message and exit
  --dbname DBNAME   The db file to be created for storing crawl data.
  --followextern    Follow external links (disabled by default).
  --verbose         Be verbose while crawling.
  --striphtml       Strip HTML tags from crawled content.
  --downloadstatic  Download static content.
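
The help text above has the shape of output from Python's standard argparse module. As a rough sketch only, a parser along the following lines would produce a matching usage message. It is a reconstruction from the help text, not the project's actual source: the function name build_parser, the int type on crawldepth, and the None default for --dbname are assumptions. Note that the usage line shows crawldepth as a required positional, so the stated default of 10 would only take effect if the argument were declared optional (for example with nargs="?"); this sketch keeps it required to match the usage line.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented PyCrawler.py options (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(prog="PyCrawler.py")

    # Positional arguments, in the order shown in the usage line.
    parser.add_argument("starturl", help="The root URL to start crawling from.")
    parser.add_argument(
        "crawldepth",
        type=int,  # assumption: the depth is parsed as an integer
        help="Number of levels to crawl down to before quitting.",
    )

    # Optional arguments. The boolean switches default to False.
    parser.add_argument(
        "--dbname",
        default=None,  # assumption: the real default file name is not documented
        help="The db file to be created for storing crawl data.",
    )
    parser.add_argument("--followextern", action="store_true",
                        help="Follow external links (disabled by default).")
    parser.add_argument("--verbose", action="store_true",
                        help="Be verbose while crawling.")
    parser.add_argument("--striphtml", action="store_true",
                        help="Strip HTML tags from crawled content.")
    parser.add_argument("--downloadstatic", action="store_true",
                        help="Download static content.")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is
# reproducible; the URL and db file name are placeholders.
example_args = build_parser().parse_args(
    ["http://example.com", "3", "--dbname", "crawl.db", "--verbose"]
)
print(example_args)
```

On the command line, the same arguments would be passed as: PyCrawler.py http://example.com 3 --dbname crawl.db --verbose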

About

A simple Python web crawler


Languages

  • Python 100.0%