crawler

Crawls a specific website and shows its site map as a directory tree, both in the terminal and in an HTML file. It also saves the URLs to a CSV file.

First, clone the project:

git clone https://github.com/berkayberkman/crawler.git

Create and activate a virtual environment:

python3 -m venv scraper

source scraper/bin/activate

Install the dependencies:

pip install -r requirements.txt

sudo apt-get install tree

Go to the script directory:

cd script

Finally run the script via:

python multithread_url_scraper.py

Crawling usually takes less than a minute, depending on your internet connection.
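
To picture the two outputs mentioned at the top (a directory-tree site map and a CSV of URLs), here is a minimal sketch of how a list of crawled URLs could be turned into both. It is not taken from multithread_url_scraper.py: the function names (build_tree, render_tree, urls_to_csv) and the example.com URLs are assumptions made up for illustration, and the real script may work differently.

```python
import csv
import io
from urllib.parse import urlparse


def build_tree(urls):
    """Nest URL path segments into a dict-of-dicts, rooted at the host name."""
    tree = {}
    for url in urls:
        parsed = urlparse(url)
        node = tree.setdefault(parsed.netloc, {})
        # Skip empty segments produced by leading or trailing slashes.
        for segment in filter(None, parsed.path.split("/")):
            node = node.setdefault(segment, {})
    return tree


def render_tree(tree, prefix=""):
    """Return text lines laid out similarly to the output of the tree command."""
    lines = []
    names = sorted(tree)
    for index, name in enumerate(names):
        last = index == len(names) - 1
        lines.append(prefix + ("└── " if last else "├── ") + name)
        child_prefix = prefix + ("    " if last else "│   ")
        lines.extend(render_tree(tree[name], child_prefix))
    return lines


def urls_to_csv(urls):
    """Serialize the unique URLs as one-column CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url"])
    writer.writerows([url] for url in sorted(set(urls)))
    return buffer.getvalue()


# Hypothetical crawl result; example.com is a placeholder site.
sample_urls = [
    "https://example.com/",
    "https://example.com/blog/post-1",
    "https://example.com/blog/post-2",
    "https://example.com/about",
]

if __name__ == "__main__":
    print("\n".join(render_tree(build_tree(sample_urls))))
    print(urls_to_csv(sample_urls))
```

Keeping the tree as nested dictionaries means the same structure could feed both a terminal view and an HTML view, while the CSV is written from the flat URL list.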
