Crawler for a specific web site and show directory tree

berkai/crawler

crawler

Crawls a specific website and shows its site map as a directory tree, both in the terminal and in an HTML file. It also saves the URLs to a CSV file.
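To illustrate the idea of turning crawled URLs into a directory tree and a CSV file, here is a minimal sketch. It is not the project's actual code: the function names `build_tree`, `render_tree`, and `write_csv`, the `example.com` URLs, and the single `url` CSV column are all assumptions made for this example. The project itself relies on the `tree` command, while this sketch draws a similar layout in pure Python.

```python
import csv
import io
from urllib.parse import urlparse


def build_tree(urls):
    """Nest URL path segments into dicts, keyed by host first (hypothetical helper)."""
    tree = {}
    for url in urls:
        parsed = urlparse(url)
        node = tree.setdefault(parsed.netloc, {})
        # Skip empty segments produced by leading or trailing slashes.
        for segment in filter(None, parsed.path.split("/")):
            node = node.setdefault(segment, {})
    return tree


def render_tree(tree, prefix=""):
    """Return lines drawn in the style of the `tree` command (hypothetical helper)."""
    lines = []
    names = sorted(tree)
    for i, name in enumerate(names):
        last = i == len(names) - 1
        lines.append(prefix + ("└── " if last else "├── ") + name)
        child_prefix = prefix + ("    " if last else "│   ")
        lines.extend(render_tree(tree[name], child_prefix))
    return lines


def write_csv(urls, handle):
    """Write unique URLs, one per row, under an assumed `url` header."""
    writer = csv.writer(handle)
    writer.writerow(["url"])
    for url in sorted(set(urls)):
        writer.writerow([url])


# Example input; these URLs are placeholders, not real crawl results.
urls = [
    "https://example.com/",
    "https://example.com/blog/post-1",
    "https://example.com/blog/post-2",
    "https://example.com/about",
]
tree_lines = render_tree(build_tree(urls))
print("\n".join(tree_lines))

csv_buffer = io.StringIO()
write_csv(urls, csv_buffer)
```

Keying the tree by host first keeps the sketch usable if URLs from more than one host end up in the list. To write to disk instead of a buffer, pass a file opened with `open("urls.csv", "w", newline="")` to `write_csv`.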

First, clone the project:

git clone https://github.com/berkayberkman/crawler.git

Create and activate a virtual environment:

python3 -m venv scraper

source scraper/bin/activate

Install the dependencies:

pip install -r requirements.txt

sudo apt-get install tree

Go to the script directory:

cd script

Finally run the script via:

python multithread_url_scraper.py

Crawling usually takes less than a minute, depending on your internet connection.
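The script name suggests the crawl is multithreaded. As a rough sketch of how such a crawl can work, and not a copy of `multithread_url_scraper.py`, the example below fetches each level of links in parallel with a thread pool from the standard library. The names `crawl`, `fetch`, `extract_links`, and `LinkParser`, and the `max_workers` and `max_pages` defaults, are assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download a page; return an empty string if the request fails."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def extract_links(base_url, html):
    """Return absolute same-host links found in html, without fragments."""
    parser = LinkParser()
    parser.feed(html)
    host = urlparse(base_url).netloc
    found = set()
    for href in parser.links:
        absolute = urljoin(base_url, href).split("#")[0]
        if urlparse(absolute).netloc == host:
            found.add(absolute)
    return found


def crawl(start_url, fetch_page=fetch, max_workers=8, max_pages=200):
    """Breadth-first crawl; each level is fetched in parallel by a thread pool."""
    seen = {start_url}
    frontier = [start_url]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            # pool.map preserves input order, so pages line up with frontier.
            pages = list(pool.map(fetch_page, frontier))
            next_frontier = []
            for url, html in zip(frontier, pages):
                for link in sorted(extract_links(url, html)):
                    if link not in seen and len(seen) < max_pages:
                        seen.add(link)
                        next_frontier.append(link)
            frontier = next_frontier
    return seen
```

Calling `crawl("https://example.com/")` would return the set of same-host URLs reached from that page, capped at `max_pages`. Because page downloads are I/O-bound, threads help here even under Python's global interpreter lock, which is consistent with the crawl time depending mostly on the network.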

Screenshot from 2020-04-18 05-04-06
