Selenium Image Crawler
Latest commit 46ce6ab Dec 4, 2017


Reference source code :


  • python>=3.3
  • elasticsearch
  • Pillow>=2.0
  • requests
  • imagehash
  • selenium

Selenium driver: PhantomJS (headless browser). To install PhantomJS, follow


  • Google Image Search and Yandex Image Search are included
  • BaseCrawler is supplied as a base class for other search engines or websites
  • GoogleCrawler and YandexCrawler extend BaseCrawler
  • BaseProcessor is supplied as a base class for processing each search item
  • LogProcessor, DownloadProcessor and ElasticSearchProcessor extend BaseProcessor
  • DownloadProcessor and ElasticSearchProcessor use the Pool class from the multiprocessing library to parallelize downloads
  • example_*.py files are included to show simple usage
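
The crawler/processor split described above can be sketched roughly as follows. This is only an illustration under assumptions, not the repository's code: the method names (`search`, `run`, `process`, `after_process`), the `StaticCrawler` and `BatchProcessor` classes, and the `target_filename` helper are all hypothetical, and the real signatures may differ. To stay runnable without a browser or network access, the sketch uses fixed results instead of Selenium and swaps in `multiprocessing.pool.ThreadPool`, which shares the `map` interface of `multiprocessing.Pool`.

```python
import os
from abc import ABC, abstractmethod
from multiprocessing.pool import ThreadPool
from urllib.parse import urlparse


class BaseProcessor(ABC):
    """Handles each search item; subclasses decide what to do with it."""

    @abstractmethod
    def process(self, item):
        """Called once per search item (hypothetical hook name)."""

    def after_process(self):
        """Called once after the crawl finishes; no-op by default."""


class BaseCrawler(ABC):
    """Runs a search and hands every item to the configured processor."""

    def __init__(self, processor):
        self.processor = processor

    @abstractmethod
    def search(self, keyword):
        """Yield one dict per image found for `keyword`."""

    def run(self, keyword):
        for item in self.search(keyword):
            self.processor.process(item)
        self.processor.after_process()


class StaticCrawler(BaseCrawler):
    """Stand-in for a real crawler: yields fixed items, no browser needed."""

    def search(self, keyword):
        for i in range(3):
            yield {"keyword": keyword,
                   "url": f"https://example.com/{keyword}/{i}.jpg"}


def target_filename(url):
    """Derive a local file name from an image URL (no download happens)."""
    return os.path.basename(urlparse(url).path)


class BatchProcessor(BaseProcessor):
    """Collects items, then maps a worker over their URLs in parallel."""

    def __init__(self, worker, pool_size=4):
        self.worker = worker
        self.pool_size = pool_size
        self.items = []
        self.results = []

    def process(self, item):
        self.items.append(item)

    def after_process(self):
        urls = [item["url"] for item in self.items]
        # ThreadPool is an assumption made for portability; map() keeps
        # the results in the same order as the input URLs.
        with ThreadPool(self.pool_size) as pool:
            self.results = pool.map(self.worker, urls)


processor = BatchProcessor(worker=target_filename)
StaticCrawler(processor).run("cat")
print(processor.results)  # ['0.jpg', '1.jpg', '2.jpg']
```

In this sketch, supporting another search engine would mean subclassing `BaseCrawler` and implementing `search`, while a different storage or download strategy would be a new `BaseProcessor` subclass passed to the crawler.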


  • More drivers will be developed: Bing Image Search
  • Result images and metadata will be stored in databases: MongoDB, Cassandra, PostgreSQL