A web crawler and scraper that extracts data from the Doctoralia site.
The following information is collected for each doctor:
- Name
- Image_link
- Specializations
- Experiences
- City
- State
- Address
- Address_telephone
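
As a rough illustration, the fields above could be modelled as a single record type. This is only a sketch: the class name `DoctorRecord`, the helper `to_row`, and the choice of which fields hold lists are assumptions made for this example, not necessarily how `DoctoraliaWebCrawler.py` stores its data.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class DoctorRecord:
    """Hypothetical container mirroring the collected fields."""
    name: str
    image_link: str
    specializations: List[str] = field(default_factory=list)
    experiences: List[str] = field(default_factory=list)
    city: str = ""
    state: str = ""
    address: str = ""
    # Assumption: one address may list several telephone numbers.
    address_telephone: List[str] = field(default_factory=list)


def to_row(record: DoctorRecord) -> dict:
    """Flatten list fields into '; '-separated strings, e.g. for a CSV row."""
    row = {}
    for key, value in asdict(record).items():
        row[key] = "; ".join(value) if isinstance(value, list) else value
    return row
```

A record built this way can be passed to `csv.DictWriter.writerow(to_row(record))` or any similar sink that expects flat string values.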
After activating your Python virtual environment (venv), run the command below to install the dependencies:
pip install -r requirements.txt
- chromedriver.exe - Web driver used by Selenium to control Chrome. This executable is built for Windows x64. If you are not comfortable using this .exe file, or you use another operating system, you can download the correct version from the Selenium Chrome WebDriver page
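
If you do download a driver for another operating system, the script needs to point Selenium at that file. The sketch below shows one way this might look. It assumes Selenium 4 (the version pinned in `requirements.txt` may differ), and the helper names `chromedriver_path` and `make_driver` are hypothetical, not taken from the project.

```python
import platform
from pathlib import Path


def chromedriver_path(base_dir=".", system=None):
    """Return the expected driver location for the given (or current) OS.

    Assumption: the driver sits in base_dir and keeps its default name,
    'chromedriver.exe' on Windows and 'chromedriver' elsewhere.
    """
    system = system or platform.system()
    filename = "chromedriver.exe" if system == "Windows" else "chromedriver"
    return Path(base_dir) / filename


def make_driver(base_dir="."):
    """Start Chrome through Selenium 4 using the driver found above."""
    # Imported lazily so chromedriver_path works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    service = Service(executable_path=str(chromedriver_path(base_dir)))
    return webdriver.Chrome(service=service)
```

The downloaded driver must match the installed Chrome version, otherwise Selenium will fail to start the browser.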
python DoctoraliaWebCrawler.py
- Pagination is limited to 100 pages and locked to 20 doctors per page
- A doctor can have multiple addresses. This project extracts only the first address and the telephone numbers for that address
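
The two limitations above can be sketched in a few lines. This is an illustration under stated assumptions, not the project's actual code: the constants, the function names, and the dictionary shape used for an address (`"address"` and `"telephones"` keys) are all hypothetical.

```python
MAX_PAGES = 100        # pagination limit described above
DOCTORS_PER_PAGE = 20  # fixed page size described above


def pages_to_visit(total_results):
    """Number of result pages to crawl, capped at MAX_PAGES."""
    needed = -(-total_results // DOCTORS_PER_PAGE)  # ceiling division
    return min(needed, MAX_PAGES)


def first_address(addresses):
    """Keep only the first address and its telephone numbers.

    Assumption: each address is a dict with 'address' and 'telephones' keys.
    Returns None when the doctor has no address listed.
    """
    if not addresses:
        return None
    first = addresses[0]
    return {
        "address": first.get("address", ""),
        "telephones": list(first.get("telephones", [])),
    }
```

With these limits, a single crawl can return at most 100 × 20 = 2000 doctors, however many results a search reports.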