crawler

Crawls a specific website and shows its site map as a directory tree, both in the terminal and in an HTML file. It also saves the URLs to a CSV file.
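To illustrate the idea of turning crawled URLs into a directory tree and a CSV file, here is a minimal sketch using only the Python standard library. It is not the code from this repository: the function names `build_tree`, `render_tree`, and `write_csv`, the single-column `url` CSV layout, and the `example.com` URLs are all assumptions made for illustration.

```python
import csv
import io
from urllib.parse import urlparse


def build_tree(urls):
    """Nest URL path segments into a dict of dicts, one level per directory."""
    tree = {}
    for url in urls:
        node = tree
        for part in urlparse(url).path.strip("/").split("/"):
            if part:  # skip empty segments, e.g. for the site root
                node = node.setdefault(part, {})
    return tree


def render_tree(tree, indent=0):
    """Return the tree as indented lines, sorted alphabetically per level."""
    lines = []
    for name in sorted(tree):
        lines.append("    " * indent + name)
        lines.extend(render_tree(tree[name], indent + 1))
    return lines


def write_csv(urls, handle):
    """Write one URL per row under a single 'url' header (assumed layout)."""
    writer = csv.writer(handle)
    writer.writerow(["url"])
    for url in urls:
        writer.writerow([url])


# Hypothetical crawl result used only for this demonstration.
urls = [
    "https://example.com/",
    "https://example.com/blog/post-1",
    "https://example.com/blog/post-2",
    "https://example.com/about",
]

print("\n".join(render_tree(build_tree(urls))))

# An in-memory buffer keeps the demo free of side effects;
# swap it for open("urls.csv", "w", newline="") to write a real file.
buffer = io.StringIO()
write_csv(urls, buffer)
```

The nested-dict representation keeps the tree easy to render in other formats too, for example as nested HTML lists.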

First, clone the project:

git clone https://github.com/berkayberkman/crawler.git

Create and activate a virtual environment:

python3 -m venv scraper

source scraper/bin/activate

Install the dependencies:

pip install -r requirements.txt

sudo apt-get install tree

Go to the script directory:

cd script

Finally, run the script:

python multithread_url_scraper.py

Crawling takes less than a minute, depending on your internet connection.
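The script name suggests that pages are fetched by multiple threads, which is why a crawl can finish quickly. The following is a rough sketch of that general technique, not the repository's actual implementation: it uses only the standard library (the project itself may rely on other packages from requirements.txt), and the names `fetch`, `extract_links`, `crawl`, and the `max_workers` and `max_pages` parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download a page and return its HTML, or '' if the request fails."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except OSError:  # covers URLError, HTTPError, and timeouts
        return ""


def extract_links(base_url, html):
    """Return absolute links in html that stay on base_url's host."""
    parser = LinkParser()
    parser.feed(html)
    host = urlparse(base_url).netloc
    links = set()
    for href in parser.links:
        absolute = urljoin(base_url, href).split("#")[0]  # drop fragments
        if urlparse(absolute).netloc == host:
            links.add(absolute)
    return links


def crawl(start_url, fetcher=fetch, max_workers=8, max_pages=100):
    """Breadth-first crawl; each level is fetched in parallel by a thread pool."""
    seen = {start_url}
    frontier = [start_url]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier and len(seen) < max_pages:
            pages = pool.map(fetcher, frontier)  # results keep frontier order
            next_frontier = []
            for url, html in zip(frontier, pages):
                for link in extract_links(url, html):
                    if link not in seen and len(seen) < max_pages:
                        seen.add(link)
                        next_frontier.append(link)
            frontier = next_frontier
    return sorted(seen)
```

Calling `crawl("https://example.com/")` would return a sorted list of same-host URLs. The `fetcher` argument is there so the crawl logic can be exercised with canned HTML instead of real network requests.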
