vedarthk/crawler

 
 

Crawler

A basic implementation of a web crawler, written in Python.

Usage

  • Install the dependencies: pip install -r requirements.txt
  • Run the crawler (shown here with --help to list the available options):
$ python crawler.py --help
usage: crawler [-h] [-d DEPTH] [-v] url

A basic implementation of a web crawler written in python.

positional arguments:
  url                   the url to crawl

optional arguments:
  -h, --help            show this help message and exit
  -d DEPTH, --depth DEPTH
                        set the max_depth for crawling
  -v, --verbose         Toggle verbose on (default is off)

For more information see http://github.com/shashankgroovy/crawler
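
The help text above is the kind of output the standard-library argparse module generates. As a rough sketch only (this is not taken from crawler.py; the function name build_parser and the default depth of 1 are assumptions made for illustration), a parser accepting the same arguments could look like this:

```python
import argparse


def build_parser():
    """Build a parser mirroring the options in the help text (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="crawler",
        description="A basic implementation of a web crawler written in python.",
        epilog="For more information see http://github.com/shashankgroovy/crawler",
    )
    # Required positional argument: the page to start crawling from.
    parser.add_argument("url", help="the url to crawl")
    # The default of 1 is an assumption; the README does not state the real default.
    parser.add_argument(
        "-d", "--depth", type=int, default=1,
        help="set the max_depth for crawling",
    )
    # A flag that is off unless -v/--verbose is passed.
    parser.add_argument(
        "-v", "--verbose", action="store_true",
        help="Toggle verbose on (default is off)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.url, args.depth, args.verbose)
```

Using type=int for the depth makes argparse reject non-numeric values before any crawling starts.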

Example

$ python crawler.py http://shashank.im -d 10
Fetching urls...
Found: 27 links

https://play.google.com/store/apps/details?id=com.jrummy.root.browserfree
http://youtu.be/ulFeUCAI5xM
http://shashank.im/bucketlist
http://www.utorrent.com/
http://shashank.im
http://shashank.im/articles/2014/05/08/gsoc-selection/
...
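
To illustrate what a depth-limited crawl like the one above might do internally, here is a minimal sketch that uses only the standard library. It is not the code in crawler.py: the names LinkParser, extract_links, fetch and crawl are hypothetical, and the real project may use different libraries (see requirements.txt) and a different traversal strategy.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return the set of absolute http(s) links found in html."""
    parser = LinkParser()
    parser.feed(html)
    found = set()
    for href in parser.links:
        # Resolve relative links and drop any #fragment part.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")):
            found.add(absolute)
    return found


def fetch(url):
    """Download a page and return its body as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, max_depth=1, fetch=fetch):
    """Breadth-first crawl; pages at max_depth are recorded but not fetched."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue
        try:
            html = fetch(url)
        except (OSError, ValueError):
            # Skip pages that fail to load or have an unusable URL.
            continue
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen
```

Calling crawl("http://shashank.im", max_depth=10) would return a set of URLs comparable to the list printed above. The fetch parameter can be swapped out, which lets the traversal be exercised without network access.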

License

MIT License

Copyright (c) 2016 Shashank Srivastav
