sketch-recognizer

A simple yet accurate sketch recognizer. Created by Jean-Baptiste Alayrac at INRIA.

License

This code is released under the MIT License (refer to the LICENSE file for details).

Requirements

This code has been tested under Ubuntu 16.04 with an NVIDIA GTX 1080 (in GPU mode) and CUDA 8.0. It may work on earlier versions of Ubuntu, but this is not guaranteed. Similarly, earlier or later versions of CUDA should work as well. Other requirements:

  • Python 2.7: installing it with Anaconda is a safe and simple option
  • Caffe: follow the installation guide (with Python support)
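
A quick way to check these two requirements before going further is sketched below. The helper names `is_supported_python` and `module_available` are made up for this illustration and are not part of this repo; the only assumption about Caffe is that its Python bindings are importable as `caffe`.

```python
import sys


def is_supported_python(version_info):
    """True when the interpreter is Python 2.7, the version this code was tested with."""
    return tuple(version_info[:2]) == (2, 7)


def module_available(name):
    """True when `import name` succeeds in the current environment."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    print("Python 2.7: %s" % is_supported_python(sys.version_info))
    print("pycaffe importable: %s" % module_available("caffe"))
```

If the second line reports False, the usual cause is that the Caffe `python` folder is not on the import path yet, which is what step 3 of the demo instructions takes care of.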

Run the demo

Once everything is properly installed:

  1. Clone this repo and go to the generated folder:
     git clone https://github.com/jalayrac/sketch-recognizer.git
     cd sketch-recognizer
  2. Download the pretrained model:
     wget http://www.di.ens.fr/~alayrac/sketch-recognizer/finetune_googledraw_iter_360000.caffemodel -P ./models/
  3. Set up Caffe and CPU/GPU mode.

Edit caffe_root in demo.py to point to your Caffe installation (of the form /path/to/caffe), then select GPU or CPU mode (CPU by default).

If you installed Caffe with GPU support, also add the CUDA libraries to your LD_LIBRARY_PATH (using the path that corresponds to your installation):

export LD_LIBRARY_PATH=/usr/cuda-8.0/lib64/:$LD_LIBRARY_PATH
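
For reference, the kind of setup this step refers to can be sketched as follows. This is not a copy of demo.py: the function names `caffe_python_dir` and `configure_caffe` and the `use_gpu` / `device_id` parameters are hypothetical, and the sketch only assumes a standard Caffe checkout where the Python bindings live in the `python` subfolder of `caffe_root`.

```python
import os
import sys


def caffe_python_dir(caffe_root):
    """Return the folder holding the pycaffe package for a given Caffe checkout."""
    return os.path.join(os.path.expanduser(caffe_root), "python")


def configure_caffe(caffe_root, use_gpu=False, device_id=0):
    """Make pycaffe importable, then select CPU mode (default) or GPU mode."""
    python_dir = caffe_python_dir(caffe_root)
    if python_dir not in sys.path:
        sys.path.insert(0, python_dir)

    # Imported here so that the sys.path change above is already in effect.
    import caffe

    if use_gpu:
        caffe.set_device(device_id)  # which GPU to use
        caffe.set_mode_gpu()
    else:
        caffe.set_mode_cpu()
    return caffe
```

A call such as `configure_caffe("/path/to/caffe", use_gpu=True)` only succeeds in GPU mode if Caffe was built with GPU support and the CUDA libraries are on LD_LIBRARY_PATH as shown above.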
  4. Run the demo:
     python demo.py samples/*.png

If everything is set up correctly, you should see the following predictions (after some initialization messages from Caffe):

sample_1.png: zebra (0.969), tiger (0.03), horse (0.00), cow (0.00), panda (0.00) (served in 0.122 s)
sample_2.png: sailboat (1.000), canoe (0.00), knife (0.00), chandelier (0.00), submarine (0.00) (served in 0.046 s)
sample_3.png: banana (0.790), boomerang (0.17), moon (0.01), snake (0.01), trombone (0.00) (served in 0.041 s)
sample_4.png: wine-bottle (0.992), wineglass (0.01), socks (0.00), lightbulb (0.00), vase (0.00) (served in 0.038 s)
sample_5.png: eiffel-tower (0.997), skyscraper (0.00), tent (0.00), chandelier (0.00), sword (0.00) (served in 0.041 s)

NB: these timings were obtained in GPU mode with a GTX 1080. In CPU mode, inference on my machine took closer to 1 s per image.
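
If you want to consume these predictions from another script, the output lines above are regular enough to parse. The sketch below is an illustration based only on the format shown in this README; `parse_prediction_line` is a hypothetical helper, not something demo.py provides, and it would need adjusting if the output format changes.

```python
import re

# "<image>: <label> (<score>), ... (served in <seconds> s)"
LINE_RE = re.compile(r"^(?P<name>[^:]+): (?P<preds>.*) \(served in (?P<secs>[\d.]+) s\)$")
PRED_RE = re.compile(r"([\w-]+) \(([\d.]+)\)")


def parse_prediction_line(line):
    """Split one output line into (image name, [(label, score), ...], seconds).

    Returns None for lines that are not predictions, such as Caffe log messages.
    """
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    preds = [(label, float(score))
             for label, score in PRED_RE.findall(match.group("preds"))]
    return match.group("name"), preds, float(match.group("secs"))


if __name__ == "__main__":
    sample = ("sample_4.png: wine-bottle (0.992), wineglass (0.01), socks (0.00), "
              "lightbulb (0.00), vase (0.00) (served in 0.038 s)")
    name, preds, seconds = parse_prediction_line(sample)
    print("%s -> best guess: %s" % (name, preds[0][0]))
```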

  5. Run on your own images.

To run on your own images, create a free-form drawing (with a tool such as this one), save it as a PNG file on your computer, and type (with the correct path to your drawing):

  python demo.py /path/to/my/drawing.png
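
When you have a whole folder of drawings, you can pass them all in one call, the same way `samples/*.png` expands to several paths in step 4. The helpers below are a hypothetical convenience sketch (`find_drawings` and `build_demo_command` are not part of this repo); they only build the command line and leave running it to you.

```python
import glob
import os


def find_drawings(folder):
    """Return the .png files in `folder`, sorted by name (empty list if none)."""
    return sorted(glob.glob(os.path.join(folder, "*.png")))


def build_demo_command(paths, python="python", script="demo.py"):
    """Build the argument list for a single demo.py call covering all PNG `paths`."""
    pngs = [p for p in paths if p.lower().endswith(".png")]
    if not pngs:
        raise ValueError("no PNG drawings to classify")
    return [python, script] + pngs
```

From the repo folder, the resulting list can be handed to `subprocess.call`, for example `subprocess.call(build_demo_command(find_drawings("/path/to/my/drawings")))`.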

Projects using this code

  • Palais de la découverte: This code was originally developed for a permanent exhibition at the Palais de la découverte museum in Paris. More specific code used for that project is provided here.

  • Small web server: see here for an interactive web server that recognizes uploaded drawings.

If you happen to use it for your project, please let me know!

Under the hood

If you wonder how this model was obtained, here are some details [TODO].

Credits

This work wouldn't have been possible without the following great projects:

I would also like to thank Francis Bach, Laurent Viennot, Vincent Blech (from Palais de la Découverte) and Fleur De Papier.