An application that finds similar pictures from a vk.com group using VGG16 and kNN
data add baseline Dec 7, 2017
etc/systemd/system add systemd Dec 7, 2017
src
submission add baseline Dec 7, 2017
templates typo fix Dec 20, 2017
.gitignore rm secrets Dec 7, 2017
.python-version add phash example Apr 29, 2018
LICENSE Initial commit Dec 7, 2017
Makefile up makefile Dec 20, 2017
README.md update readme: add blog post link Feb 1, 2018
phash.ipynb add phash example Apr 29, 2018
requirements.txt add server Dec 7, 2017
showcase.ipynb add showcase with new trained vgg19 Dec 7, 2017
test.ipynb add predictor Dec 7, 2017

README.md

Kawaii Search (Images similarity)

The blog post describing how it works is here.

This is a demo of using VGG16 and kNN to build a similar-image search.

vk.com/tokyofashion

Dataset

You can use any large image dataset. I used pictures from this fashion group: vk/tokyofashion.

./data
-- photos.csv # is a csv file with pictures' info and url.
-- images     # is a directory with pictures

You can use src/get_images.py to fetch picture info and URLs from a specified vk group. Set the vk group id in config.py.

You can use src/download_images.py to download the images listed in data/photos.csv.
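To give an idea of what the download step involves, here is a minimal sketch. It is not the code from src/download_images.py. The column name `url` and the helper names `target_path` and `download_all` are assumptions made for illustration; check the header of your own photos.csv.

```python
import csv
import os
import urllib.request
from urllib.parse import urlparse


def target_path(url, images_dir):
    """Build a local file path from the last path segment of the URL."""
    name = os.path.basename(urlparse(url).path)
    return os.path.join(images_dir, name)


def download_all(csv_path="data/photos.csv", images_dir="data/images",
                 url_column="url"):
    """Download every image listed in the CSV, skipping files already on disk.

    `url_column` is an assumed column name, not taken from the repository.
    """
    os.makedirs(images_dir, exist_ok=True)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            url = row[url_column]
            path = target_path(url, images_dir)
            if os.path.exists(path):
                continue  # already downloaded on a previous run
            try:
                urllib.request.urlretrieve(url, path)
            except OSError as err:
                # Network and HTTP errors from urllib are subclasses of OSError.
                print(f"skipped {url}: {err}")
```

Skipping existing files makes the script safe to restart after an interrupted run.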

Training

I use the pretrained VGG16 from Keras with the top layers removed and global max pooling on the output. This gives 512 features per image, which I use in kNN with the cosine metric to calculate similarity.
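The idea can be sketched roughly as follows. This is an illustration, not the code from this repository: the function names are made up, the Keras calls assume the `tensorflow.keras` packaging, and the neighbour search is a plain brute-force NumPy version standing in for whatever kNN implementation you prefer.

```python
import numpy as np


def build_extractor(weights="imagenet"):
    """VGG16 without the classifier head, with global max pooling (512 outputs)."""
    # Imported lazily so the search helper below works without TensorFlow.
    from tensorflow.keras.applications.vgg16 import VGG16
    return VGG16(weights=weights, include_top=False, pooling="max")


def extract_features(model, images):
    """images: array of shape (n, height, width, 3), RGB, values in 0..255."""
    from tensorflow.keras.applications.vgg16 import preprocess_input
    batch = preprocess_input(np.asarray(images, dtype="float32").copy())
    return model.predict(batch)  # shape (n, 512)


def most_similar(vectors, query, k=5):
    """Return indices and cosine distances of the k vectors closest to query."""
    vectors = np.asarray(vectors, dtype="float64")
    query = np.asarray(query, dtype="float64")
    # Normalise to unit length; the small floor avoids division by zero.
    v_unit = vectors / np.maximum(np.linalg.norm(vectors, axis=1, keepdims=True), 1e-12)
    q_unit = query / max(np.linalg.norm(query), 1e-12)
    distances = 1.0 - v_unit @ q_unit  # cosine distance = 1 - cosine similarity
    order = np.argsort(distances)[:k]
    return order, distances[order]
```

Brute force compares the query against every stored vector, which is fine for tens of thousands of images. For much larger collections an approximate nearest-neighbour index would be the usual next step.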

Before searching, you have to generate the 512-dimensional vector for each image, so run src/vectorize_image.py. On a Tesla K80 GPU in gcloud it takes 20 minutes for 50,000 images. The result is saved to submission/images_vec.npz, together with submission/images_order.csv.
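The vectors file only makes sense together with the order file, because row i of the matrix has to be mapped back to a picture. Here is a small sketch of how such a pair could be written and read. The array key `vectors`, the CSV column `filename`, and both function names are assumptions for illustration; the files produced by src/vectorize_image.py may be laid out differently.

```python
import csv

import numpy as np


def save_index(vectors, filenames, vec_path, order_path):
    """Store the feature matrix and the matching list of file names."""
    if len(vectors) != len(filenames):
        raise ValueError("need exactly one file name per vector")
    # Key name "vectors" is an assumption, not the repository's format.
    np.savez_compressed(vec_path, vectors=np.asarray(vectors, dtype="float32"))
    with open(order_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["filename"])  # assumed header
        writer.writerows([name] for name in filenames)


def load_index(vec_path, order_path):
    """Load the matrix and file names; row i belongs to filenames[i]."""
    with np.load(vec_path) as data:
        vectors = data["vectors"]
    with open(order_path, newline="") as f:
        filenames = [row["filename"] for row in csv.DictReader(f)]
    return vectors, filenames
```

Storing the vectors as float32 halves the file size compared to float64, and that precision is normally enough for cosine similarity.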

Evaluating

See test.ipynb for an example of using this model.

Deploy

Modify app.service and copy it to /etc/systemd/system. Then run systemctl daemon-reload and systemctl enable app.service.

TODO

  • create a single main file to do all the steps
  • write a blog post about this
  • build a web service with
  • add feature to find similar photos by an image URL
  • create web service for telegram chat
  • download more images