This script uses Selenium to step through the image search results for a given search term and downloads the requested number of images one at a time.
-
If Python is on your PATH, run the file from a terminal or command prompt using:
python googleImageScraper.py
-
Alternatively, you can run the file from the terminal built into VS Code, PyCharm, or any other IDE of your choice.
The script waits for each image to finish downloading. If an image has still not downloaded after 10 attempts, it is skipped.
These failures are usually caused by a slow internet connection or by an image file that is too large to download in the allotted time. To allow more time, adjust the durations passed to the script's sleep calls (time.sleep() in standard Python; the sys module has no time() function).
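The retry behaviour described above can be sketched roughly as follows. This is an illustration only, not the script's actual code: the function name download_with_retries, the fetch callable, and the delay_seconds parameter are all hypothetical, and the real script may structure its waits differently.

```python
import time


def download_with_retries(fetch, max_attempts=10, delay_seconds=1.0):
    """Call fetch() until it returns data or the attempts run out.

    fetch is a hypothetical callable that returns the image bytes,
    or None if the image is not available yet.
    """
    for attempt in range(1, max_attempts + 1):
        data = fetch()
        if data is not None:
            return data
        # Wait before the next attempt, but not after the last one.
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    # All attempts failed; the caller is expected to skip this image.
    return None
```

Raising delay_seconds (or max_attempts) in a sketch like this gives slow connections or large files more time before the image is skipped.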
Every 25 iterations, Selenium runs into the "Related Searches" element. The script removes this element before scraping, because clicking it would navigate the browser to a separate page and the desired image page would be lost.
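One way to remove such an element is to delete it from the page with JavaScript through Selenium's execute_script method. The sketch below assumes a CSS selector is known for the "Related Searches" block; the selector in the usage note is a placeholder, not the real one, and the helper name remove_elements is hypothetical. The script itself may remove the element differently.

```python
# JavaScript run inside the browser: delete every node matching the
# selector passed as arguments[0] and report how many were removed.
REMOVE_SCRIPT = """
const nodes = document.querySelectorAll(arguments[0]);
nodes.forEach(node => node.remove());
return nodes.length;
"""


def remove_elements(driver, css_selector):
    """Remove all elements matching css_selector from the current page.

    driver is a Selenium WebDriver (or any object exposing
    execute_script). Returns the number of elements removed.
    """
    return driver.execute_script(REMOVE_SCRIPT, css_selector)
```

With a real driver this might be called as remove_elements(driver, "div.related-searches") before iterating over the thumbnails, where "div.related-searches" stands in for whatever selector actually matches the element.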