Columbia Image and Face Search tool for MEMEX

Columbia University Image and Face Search Tool

Author: Svebor Karaman

This repository implements the image and face search tools developed for the MEMEX project by Dr. Svebor Karaman, Dr. Tao Chen and Prof. Shih-Fu Chang of the DVMM lab at Columbia University.


This project can be used to build a searchable image index that scales to millions of images. It provides a RESTful API for querying the index to find similar images in less than a second.

The image index is built by extracting features from the images. Two feature extraction models are included:

  • A full-image recognition model based on the DeepSentibank feature representation, which was trained to target the Adjective-Noun Pairs (ANP) of the Visual Sentiment Ontology.
  • A face detection and recognition model that uses the publicly available models from the DLib library. See the blog post DLib face recognition for more information about the models.

The cufacesearch package is written in a modular way, so swapping in another image feature extraction model, or another face detection or recognition model, should be fairly easy.

NB: For now, the Python package is still named cufacesearch even though it contains both image and face search capabilities. The package will be renamed soon.



This repository relies on docker and docker-compose for an easy setup, so you will need to have both installed. Install docker-compose on your system following the guidelines at:

You could install all the dependency packages and run the tools outside of docker, but this is considered an advanced setup that is not documented yet.

Set up the environment

The setup folder contains a detailed description of how to set up the tool, with examples that build the index for publicly available datasets. Check the documentation in that folder to get started.

Perform searches

See the documentation file in the www folder for details about the API usage. You can also open your browser at http://localhost/[endpoint]/view_similar_byURL?data=[an_image_URL] to visualize some results.
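The query URL above can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper name build_view_similar_url, the endpoint value my_endpoint and the example image URL are hypothetical placeholders, and it assumes the service accepts a percent-encoded image URL in the data parameter, as standard web servers do.

```python
from urllib.parse import urlencode


def build_view_similar_url(endpoint, image_url, host="http://localhost"):
    """Build the view_similar_byURL address for a given image URL.

    `endpoint` is the name chosen when the service was set up; the value
    used below is only a placeholder.
    """
    # Percent-encode the image URL so that its own ':', '/', '?' and '&'
    # characters are not mistaken for parts of the outer query string.
    query = urlencode({"data": image_url})
    return f"{host}/{endpoint.strip('/')}/view_similar_byURL?{query}"


if __name__ == "__main__":
    # Hypothetical endpoint name and image URL, for illustration only.
    print(build_view_similar_url("my_endpoint", "https://example.com/a.jpg"))
```

Opening the printed address in a browser should display the results page for that image, provided the service is running locally under that endpoint.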


Apache License Version 2.0, see LICENSE.


Please feel free to contact me with any questions you may have. Also, please report any issues you encounter or request features on GitHub.