This is a simple parser for Yandex Images. It allows searching by text query or image.
When searching, you can specify parameters such as:
- Size
- Orientation
- Number of images
- Type (photo, clipart, etc.)
- Color (colorful, b/w, red, orange, etc.)
- Format (jpg, png, gif)
- Site
Delays between requests are automatically randomized within a range of ±15%.
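The ±15% randomization can be pictured with a small sketch. This is not the parser's actual code: the function name `jittered_delay` and the `spread` parameter are hypothetical and only illustrate the idea of drawing each pause uniformly from a window around the nominal delay.

```python
import random


def jittered_delay(base_delay: float, spread: float = 0.15) -> float:
    """Return base_delay shifted by a random amount of up to +/- spread (15% by default)."""
    if base_delay < 0:
        raise ValueError("base_delay must be non-negative")
    low = base_delay * (1 - spread)
    high = base_delay * (1 + spread)
    # Draw uniformly from [low, high] so every request waits a slightly different time.
    return random.uniform(low, high)


# Example: a nominal 2-second delay becomes a value between 1.7 and 2.3 seconds.
delay = jittered_delay(2.0)
```

Varying the pause this way makes the request pattern less regular than a fixed delay would.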
Since Selenium is used for searching, there is no limit of 30 or 300 images in this parser.
It requires installation of the Mozilla Firefox browser!
- Clone the repository:
$ git clone https://github.com/glebtk/yandex_images_parser.git
- Before using, install the project requirements:
$ pip install -r requirements.txt
- Ensure that all requirements are successfully installed.
- Ensure that Mozilla Firefox is installed.
- To test the functionality, you can run example.py.
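As a quick sanity check of the steps above, a short script along these lines could report whether the prerequisites are visible from Python. It is only a sketch: `check_prerequisites` is a hypothetical helper, it assumes the Selenium package is importable as `selenium`, and it only looks for a `firefox` executable on the PATH, so a negative result is not conclusive on systems where Firefox is installed elsewhere.

```python
import importlib.util
import shutil


def check_prerequisites() -> dict:
    """Report whether a Firefox executable and the selenium package can be found."""
    return {
        # shutil.which returns None when no "firefox" executable is on the PATH.
        "firefox_on_path": shutil.which("firefox") is not None,
        # find_spec returns None when the top-level package is not installed.
        "selenium_installed": importlib.util.find_spec("selenium") is not None,
    }


status = check_prerequisites()
for name, ok in status.items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```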
Let's start by creating an instance of the parser class:
from yandex_images_parser import Parser
parser = Parser()
- Let's say we want to find one cat image. Let's do it!
# Call the "query_search" function - search by query:
# the "query" parameter contains the text query
# the "limit" parameter defines the desired number of images
one_cat = parser.query_search(query="cat", limit=1)
# Since the query_search function returns a list, we will extract the zero-th element:
one_cat_url = one_cat[0]
Done! one_cat_url now holds the URL of a cat image.
- Let's find 10 similar cat images using the image_search function:
# Call the "image_search" function - search by image:
# pass the link to the found image through the "url" parameter
# set limit to 10
similar_cats = parser.image_search(url=one_cat_url, limit=10)
The search result is a list of URLs of similar cat images.
- In addition to the limit parameter, you can use parameters such as:
- delay - the delay time between requests (in seconds)
- size - the size of the images
- orientation - the orientation of the images
- image_type - the type of the images (photo, illustration, etc.)
- color - the color of the images (colorful, b/w, red, orange, etc.)
- image_format - the format of the images (jpg, png, gif)
- site - the site where the images are located
For example, if you need to find 128 paintings of famous painters in png format, use this code:
paintings = parser.query_search(query="paintings of famous painters",
                                limit=128,
                                image_format=parser.format.png)
And this code finds 30 b/w face images with a vertical orientation, medium size, and jpg format:
faces = parser.query_search(query="face",
                            limit=30,
                            size=parser.size.medium,
                            color=parser.color.gray,
                            image_type=parser.image_type.face,
                            image_format=parser.format.jpg,
                            orientation=parser.orientation.vertical)
Sometimes, during a complex search, the results may contain duplicate images (with the same URL).
To remove such duplicates before further processing, use the remove_duplicates() function from utils.py.
Import it from utils:
from utils import remove_duplicates
Remove duplicate URLs from the paintings list:
paintings = remove_duplicates(paintings)
To save the images, import the save_images() function from utils:
from utils import save_images
Pass the function a list of URLs and the path of the directory where the images should be saved:
save_images(urls=paintings, dir_path="./images/paintings")
Done!
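To show what removing duplicate URLs amounts to, here is a minimal, order-preserving sketch. It is not the implementation of remove_duplicates() in utils.py, which may work differently; the name `dedupe_urls` and the example URLs are hypothetical.

```python
from typing import Iterable, List


def dedupe_urls(urls: Iterable[str]) -> List[str]:
    """Drop repeated URLs, keeping the first occurrence of each in its original order."""
    # dict keys are unique and preserve insertion order, so this filters repeats in one pass.
    return list(dict.fromkeys(urls))


results = [
    "https://example.com/a.png",
    "https://example.com/b.png",
    "https://example.com/a.png",
]
unique = dedupe_urls(results)
```

Keeping the original order means the first, usually most relevant, search results stay at the front of the list.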
If you have any suggestions or feedback, feel free to contact me by email or via Telegram!



