Using the shape context algorithm to classify shoe shapes.
Let's say that you work at a fashion aggregator and you receive hundreds of different shoes every week from different retailers. These retailers send you one identifier and a few images per shoe-model, with different points of view (front-view, side-view, sole, etc). Sometimes they send you a "default" identifier, indicating which of the images for a given shoe should be displayed, normally the side view. However, this is not always the case.
You want your website to look perfect, so ideally, you want to always show the same point of view for all shoe images. Therefore, the question is: How do we find the side view as shoe-images come through the feed?
The proposed solution
Here I show part of the solution that I implemented when I faced this problem. It uses the Shape Context algorithm to compute the shape descriptors and MiniBatchKMeans to cluster the shapes.
For a number of reasons I cannot share the dataset here. Instead, I include plots and diagrams that I hope are sufficient to illustrate the process.
The steps are:
1-. Use the Shape Context Algorithm to parameterise the shape of the shoes
2-. Use k-means (MiniBatch) to cluster the shapes
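To give a feel for step 1, here is a minimal NumPy sketch of a shape context descriptor: for every point, it builds a log-polar histogram of where the other points lie relative to it. This is an illustration under my own assumptions, not the code in this repo. The function name `shape_context`, the bin counts (5 radial, 12 angular) and the radii (0.125 to 2, in units of the mean pairwise distance) are common choices that I picked for the example, and the random points are only a stand-in for points sampled along a shoe's contour.

```python
import numpy as np


def shape_context(points, n_radial=5, n_angular=12, r_inner=0.125, r_outer=2.0):
    """Return an (n_points, n_radial * n_angular) array of log-polar histograms.

    Hypothetical helper for illustration; all defaults are assumptions.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)

    # diff[i, j] is the vector from point i to point j.
    diff = points[None, :, :] - points[:, None, :]
    dist = np.hypot(diff[..., 0], diff[..., 1])

    # Divide by the mean pairwise distance so the descriptor ignores scale.
    off_diag = ~np.eye(n, dtype=bool)
    dist = dist / dist[off_diag].mean()

    # Angle of each offset vector, mapped to [0, 2*pi).
    theta = np.arctan2(diff[..., 1], diff[..., 0]) % (2 * np.pi)

    # Log-spaced upper edges of the radial bins: 0.125, 0.25, 0.5, 1, 2.
    r_edges = np.logspace(np.log10(r_inner), np.log10(r_outer), n_radial)
    r_bin = np.searchsorted(r_edges, dist)  # equals n_radial when beyond r_outer
    t_bin = (theta // (2 * np.pi / n_angular)).astype(int) % n_angular

    # Count every other point that falls inside the outer radius.
    valid = off_diag & (r_bin < n_radial)
    rows = np.nonzero(valid)[0]
    cols = r_bin[valid] * n_angular + t_bin[valid]
    hist = np.zeros((n, n_radial * n_angular))
    np.add.at(hist, (rows, cols), 1)
    return hist


# Stand-in for points sampled along the contour of one shoe image.
rng = np.random.default_rng(0)
contour = rng.random((30, 2))
desc = shape_context(contour)
print(desc.shape)  # (30, 60)
```

Each row describes one contour point. How the rows are combined into a single array per shoe before clustering (for example, sampling a fixed number of points and flattening) is a design choice that this sketch leaves open.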
You will see that after clustering the shapes there is still some processing needed to isolate the side view. If I have the time I will upload an implementation of such processing.
Additional use and further improvements
The shape arrays can also be used to build a content-based recommendation algorithm that recommends shoes of similar shape. One could complement them with color histograms or additional features, such as Haralick texture features or GIST descriptors.
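As a rough sketch of the recommendation idea, one could rank shoes by the cosine similarity between their shape arrays. The helper `most_similar` and the toy array below are hypothetical and only for illustration. I assume one fixed-length shape array per shoe, stacked as rows.

```python
import numpy as np


def most_similar(shape_arrays, query_index, top_n=3):
    """Return indices of the top_n rows most similar to the query row.

    Hypothetical helper: similarity is plain cosine similarity.
    """
    X = np.asarray(shape_arrays, dtype=float)
    # Normalise each row to unit length (guarding against all-zero rows).
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X_unit = X / np.clip(norms, 1e-12, None)
    sims = X_unit @ X_unit[query_index]
    sims[query_index] = -np.inf  # never recommend the query shoe itself
    return np.argsort(-sims)[:top_n]


# Toy data: rows 0 and 1 are alike, rows 2 and 3 are alike.
toy = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
])
print(most_similar(toy, query_index=0, top_n=1))  # [1]
```

Color histograms or texture features could be appended to each row before computing the similarity, ideally after scaling them so that no single feature family dominates.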
Of course, ultimately, if you have the budget to pay for someone to classify the shoe images, you can turn your problem into a supervised one and use Deep Learning. A series of convolutional layers will capture shapes, color, patterns, etc.
I have run this example with 5000 shoe images (jpg) stored on my disk in the directory data/shoe_images/.
In the "real world" the image feed normally comes in the form of JSON files with URLs pointing to the images.
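To illustrate what reading such a feed could look like, here is a small sketch using only the standard library. The schema is entirely made up for the example: the field names `shoe_id`, `images`, `url` and `default`, and the example.com URLs, are assumptions, and a real retailer feed will look different.

```python
import json

# Hypothetical feed: one record per shoe, each with a few image URLs.
SAMPLE_FEED = """
[
  {"shoe_id": "A1", "images": [
      {"url": "https://example.com/A1_front.jpg", "default": false},
      {"url": "https://example.com/A1_side.jpg", "default": true}]},
  {"shoe_id": "B2", "images": [
      {"url": "https://example.com/B2_sole.jpg"}]}
]
"""


def urls_by_shoe(feed_text):
    """Map each shoe identifier to the list of its image URLs."""
    records = json.loads(feed_text)
    return {rec["shoe_id"]: [img["url"] for img in rec["images"]] for rec in records}


def shoes_without_default(feed_text):
    """Return identifiers of shoes where no image is flagged as default."""
    records = json.loads(feed_text)
    return [
        rec["shoe_id"]
        for rec in records
        if not any(img.get("default", False) for img in rec["images"])
    ]


print(urls_by_shoe(SAMPLE_FEED)["A1"])
print(shoes_without_default(SAMPLE_FEED))  # ['B2']
```

Shoes returned by `shoes_without_default` are the ones for which the side view would have to be found from the images themselves.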
The images are 150 pixels wide, with varying height (relative to the width). If you have a similar dataset, you can run the code with:
python cluster_shoe_shapes.py --n_clusters k
Here, k is the number of clusters to use.
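For reference, this is roughly how a `--n_clusters` flag can be wired to scikit-learn's MiniBatchKMeans. It is a sketch under my own assumptions and not the contents of cluster_shoe_shapes.py: the names `build_parser` and `cluster_shapes` are hypothetical, and the synthetic blobs only stand in for real shape arrays (one row per image).

```python
import argparse

import numpy as np
from sklearn.cluster import MiniBatchKMeans


def build_parser():
    """Command-line interface with a single --n_clusters option."""
    parser = argparse.ArgumentParser(description="Cluster shape arrays.")
    parser.add_argument("--n_clusters", type=int, default=8,
                        help="number of clusters to use")
    return parser


def cluster_shapes(shape_arrays, n_clusters, seed=0):
    """Fit MiniBatchKMeans and return (labels, cluster centres)."""
    km = MiniBatchKMeans(n_clusters=n_clusters, random_state=seed, n_init=3)
    labels = km.fit_predict(shape_arrays)
    return labels, km.cluster_centers_


# Equivalent to passing "--n_clusters 3" on the command line.
args = build_parser().parse_args(["--n_clusters", "3"])

# Synthetic stand-in for shape arrays: three well-separated blobs.
rng = np.random.default_rng(0)
fake_shapes = np.vstack(
    [rng.normal(loc=c, scale=0.1, size=(50, 8)) for c in (0.0, 5.0, 10.0)]
)

labels, centers = cluster_shapes(fake_shapes, args.n_clusters)
print(labels.shape, centers.shape)  # (150,) (3, 8)
```

MiniBatchKMeans updates the centres from small random batches, which keeps the fit fast when there are thousands of images. The labels can then be used to inspect which clusters are dominated by side views.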
Perhaps the most useful part of this repo is the demo directory, where details of the process are provided. I recommend going through the notebooks there in the following order:
To emphasize again: I cannot share the dataset. However, I am sure that once you go through the notebooks you will easily be able to "play" with your own images.