Kawaii Search (Image similarity)

The blog post describing how it works is here.

This is a demo that applies VGG16 and kNN to build a similar-image search.


You can use any large image dataset. I used pictures from this VK group about fashion: vk/tokyofashion.

-- photos.csv # CSV file with each picture's info and URL
-- images     # directory with the pictures

You can use src/ to get the info and URLs of all pictures from a specified VK group. Use to set the VK group id.

You can use src/ to download the images listed in data/photos.csv.


I use the pretrained VGG16 from Keras without its last layers, keeping only global max pooling. This gives 512 features per image, which I use in kNN with the cosine metric to calculate similarity.
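As a rough illustration of the kNN step, here is a minimal brute-force sketch in NumPy. It is not the code from this repo: the function names `normalize_rows` and `most_similar` are hypothetical, and random vectors stand in for the 512 features that VGG16 would produce.

```python
import numpy as np


def normalize_rows(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    return vectors / np.clip(norms, 1e-12, None)


def most_similar(query: np.ndarray, vectors: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k rows of `vectors` nearest to `query` by cosine distance."""
    unit_vectors = normalize_rows(vectors)
    unit_query = normalize_rows(query.reshape(1, -1))[0]
    distances = 1.0 - unit_vectors @ unit_query
    return np.argsort(distances)[:k]


# Assumption: random stand-ins for the 512-feature vectors of 100 images.
rng = np.random.default_rng(0)
vectors = rng.random((100, 512))

# Querying with an image that is already in the set returns that image first.
neighbours = most_similar(vectors[7], vectors, k=3)
```

Because cosine distance ignores vector length, a scaled copy of a feature vector still matches the original image. For large collections, a library kNN index would likely be a better fit than this brute-force version.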

First, though, you have to generate the 512-sized vector for each image, so run src/ to do it. On a Tesla K80 GPU in Google Cloud it takes 20 minutes for 50,000 images. The result will be saved in the submission/images_vec.npz with submissions/images_order.csv.
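The vectors and the image order are stored separately because row i of the array only makes sense together with the name of image i. A minimal sketch of that round trip is below. The array key `vectors`, the CSV column `filename`, and the helpers `save_vectors` and `load_vectors` are assumptions for illustration; the actual layout of the files written by the script may differ.

```python
import csv
import tempfile
from pathlib import Path

import numpy as np


def save_vectors(out_dir, names, vectors):
    """Write the feature matrix and the matching image order side by side."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the matrix is stored under the key "vectors".
    np.savez_compressed(out_dir / "images_vec.npz", vectors=vectors)
    with open(out_dir / "images_order.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["filename"])  # assumed column name
        writer.writerows([name] for name in names)


def load_vectors(out_dir):
    """Read both files back; names[i] describes row i of the matrix."""
    out_dir = Path(out_dir)
    with np.load(out_dir / "images_vec.npz") as data:
        vectors = data["vectors"]
    with open(out_dir / "images_order.csv", newline="") as f:
        names = [row["filename"] for row in csv.DictReader(f)]
    return names, vectors


# Round trip in a temporary directory with three dummy 512-feature vectors.
with tempfile.TemporaryDirectory() as tmp:
    names = ["a.jpg", "b.jpg", "c.jpg"]
    vectors = np.arange(3 * 512, dtype=np.float32).reshape(3, 512)
    save_vectors(tmp, names, vectors)
    loaded_names, loaded_vectors = load_vectors(tmp)
```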


See test.ipynb for an example of using this model.


Modify app.service as needed and copy it to /etc/systemd/system. Then run systemctl daemon-reload and systemctl enable app.service.


  • create a single main file to do all the steps
  • write a blog post about this
  • build a web service with
  • add a feature to find similar photos by an image URL
  • create a web service for Telegram chat
  • download more images