Conjoined Triple Deep Network for Learning Local Image Descriptors
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
train
.gitignore
efficiency.png
eval.lua
pnnet-liberty-nn-ascii.t7
pnnet-liberty.t7
readme.org
stats-liberty.t7
triplets.png
true_positives.png

readme.org

NEW!!

Check the TFeat descriptor from BMVC 2016.

Results should be much better than PN-Net introduced below


PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors

Arxiv Page

Intro

Code for the arxiv paper PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors.

The network extracts feature descriptors from grayscale local patches of size 32x32.

Architecture

nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> output]
  (1): cudnn.SpatialConvolution(1 -> 32, 7x7)
  (2): cudnn.Tanh
  (3): cudnn.SpatialMaxPooling(2,2,2,2)
  (4): cudnn.SpatialConvolution(32 -> 64, 6x6)
  (5): cudnn.Tanh
  (6): nn.View
  (7): nn.Linear(4096 -> 128)
  (8): cudnn.Tanh
}

Implementation details

For optimization details refer to the arxiv publication. Training code is now also available.

Getting the datasets in torch format

Get the Phototour datasets in .t7 format from UBC-Phototour-Patches-Torch and extract liberty yosemite and notredame

How to run the evaluation code

Run th eval.lua

The script will print a series of evaluation results for the patch pairs from the notredame 100k dataset e.g.

1 4.3915 
0 10.6367 
1 5.6122 
0 10.5520 
0 10.4561 
0 10.1167 
0 9.8624 
1 3.2972 
1 2.1507 
1 2.9709 
1 3.4437 
1 7.6362 

The first column contains the patch pair label (0 negative, 1 positive), the second row contains the L2 distance between the two patches of the pair, based on the features extracted from the last layer of our CNN.

From this output, one can compute the ROC curve (e.g. cppROC).

How to train

In the train folder, run th run.lua.

When training with 1.2M triplets on a GTX TITAN X, each epoch takes approximately 2mins

Examples of the training triplets used ./triplets.png

Positive matching examples

Examples of positive nearest neighbour patch matching using the pnnet descriptor in the Oxford matching dataset.

./true_positives.png

Efficiency

Efficiency comparison with MatchNet and deepcompare, both from CVPR 2015. For more results refer to the arxiv paper.

./efficiency.png