Uses Selenium and ChromeDriver to open webpages in an optionally headless (invisible) Chrome browser and access their content.
Currently scrapable: Amazon India, Flipkart, BigBasket
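As a rough illustration of the headless approach described above, here is a minimal sketch (not the project's actual code). The helper names `chrome_arguments` and `fetch_page_source` and the window size are hypothetical, and the sketch assumes Selenium 4 with a ChromeDriver that Selenium can locate (for example, on PATH).

```python
def chrome_arguments(headless: bool = True) -> list[str]:
    """Return the Chrome command-line switches for a session (hypothetical helper)."""
    args = ["--window-size=1280,800"]  # arbitrary size, chosen for illustration
    if headless:
        args.append("--headless")  # run Chrome without a visible window
    return args


def fetch_page_source(url: str, headless: bool = True) -> str:
    """Open url in Chrome and return the rendered HTML."""
    # Imported here so chrome_arguments() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in chrome_arguments(headless):
        options.add_argument(arg)

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even if loading fails
```

Calling `fetch_page_source("https://www.example.com", headless=False)` would open a visible browser window instead, which can help when debugging.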
Required on your system:
- Python: Install from https://www.python.org/downloads/ and add it to your PATH variable
- Chrome: Install from https://www.google.com/intl/en_us/chrome/
- Chrome Driver: Download (same version as Chrome!) from https://chromedriver.chromium.org/downloads (versions <= 114) or https://googlechromelabs.github.io/chrome-for-testing (versions >= 115)
- Python modules (selenium and tqdm): Run the command
pip install -r path/to/WebScraper/requirements.txt
(see also: Updating)
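A version mismatch between Chrome and Chrome Driver is a common setup problem. The sketch below is an assumption-laden helper, not part of the project: it compares only the major version numbers (a simplification of "same version") and picks the download page using the cutoff listed above. The function names are hypothetical, and the version text is expected to be what you copy from Chrome's About page and from `chromedriver --version`.

```python
import re


def major_version(version_output: str) -> int:
    """Extract the major version from text such as 'ChromeDriver 114.0.5735.90'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def versions_match(chrome_output: str, driver_output: str) -> bool:
    """Return True if Chrome and Chrome Driver share a major version."""
    return major_version(chrome_output) == major_version(driver_output)


def driver_download_page(chrome_major: int) -> str:
    """Pick the Chrome Driver download page for a given Chrome major version."""
    if chrome_major <= 114:
        return "https://chromedriver.chromium.org/downloads"
    return "https://googlechromelabs.github.io/chrome-for-testing"
```

For example, `versions_match("Google Chrome 120.0.6099.109", "ChromeDriver 114.0.5735.90")` returns `False`, which would mean a different Chrome Driver is needed.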
Windows users can start the scraper quickly by clicking on RUN.bat
Otherwise, run the commands
cd path/to/WebScraper
python main.py
If cloned with Git, Windows users can use UPDATE.bat to pull the latest version while preserving consts.txt
Otherwise, run the commands
cd path/to/WebScraper
python updater.py
This will ask whether you would also like to update the Python packages (note that this is time-consuming).
Selenium handshake failure errors can mostly be ignored.
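If the handshake messages clutter the console, one commonly used option is to lower Chrome's log verbosity. The following is a minimal sketch under that assumption and is not taken from this project: the helper names are hypothetical, `--log-level=3` asks Chrome to print only fatal messages, and excluding the `enable-logging` switch stops ChromeDriver from turning on Chrome's console logging.

```python
def quiet_chrome_settings() -> dict:
    """Return settings that reduce Chrome's console output (hypothetical helper)."""
    return {
        "arguments": ["--log-level=3"],          # only fatal messages
        "exclude_switches": ["enable-logging"],  # drop ChromeDriver's logging switch
    }


def apply_quiet_settings(options) -> None:
    """Apply the settings to a selenium.webdriver.chrome.options.Options instance."""
    settings = quiet_chrome_settings()
    for arg in settings["arguments"]:
        options.add_argument(arg)
    options.add_experimental_option("excludeSwitches", settings["exclude_switches"])
```

The `options` object would then be passed to `webdriver.Chrome(options=options)` as usual. This hides the messages rather than fixing their cause, which matches the advice that they can mostly be ignored.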