New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement SIFT feature detector and descriptor #5472
Merged
Merged
Changes from all commits
Commits
Show all changes
128 commits
Select commit
Hold shift + click to select a range
df9d170
initial commit
e6aa3b7
added SIFT Feature Detector
4b59631
Merge remote-tracking branch 'origin/feature/sift' into feature/sift
2d44490
added comments and docs
8697203
used extract_and_extract
84306ad
fixed small bug that occurred through renaming variables
eb78b77
update todo and made function private
31db91c
fixed small bug that occurred due to renaming method
be41f2b
fixed small bug that occurred due to renaming method
b8bf381
changed np.array to astype
0ad2b8a
Merge branch 'main' of https://github.com/scikit-image/scikit-image i…
df16db1
made Keypoint Class "private"
faberno 86f212e
stylistic changes, added c_max as parameter
faberno 0b8cd14
removed val from keypoint class
faberno c820a1a
removed Keypoint class and added stylistic improvements
faberno ff452fb
stylistic changes
faberno 3f8b966
modified documentation
faberno 9e46fa9
changed the way keypoints are saved, from list of octaves to all keyp…
faberno 08edfe9
fixed a bug that occurred due to last changes
faberno b27dee5
refactoring the structure
faberno af0fd63
added parts of documentation
faberno bc8d43c
corrected example output
faberno e64f89f
Minor typos
AlexanderZeilmann e16e13a
Merge branch 'main' of https://github.com/faberno/scikit-image into f…
faberno 70c93a4
added original paper as reference
faberno 846d7c4
added tests
faberno e515b0a
added Runtime Error if no features are found
faberno b384401
added an example for sift
faberno 115a3e1
added sift to __init__.py
faberno 4b1bc00
added detect and extract to the tests
faberno 5e3b726
changed a few comments
faberno 50c6636
changed a comment
faberno 0927970
some pep08 reformatting
faberno 157b366
Merge branch 'main' of https://github.com/faberno/scikit-image into f…
faberno ba61a18
formatted to line length of 79
faberno 4ece48c
np.int to np.int64
faberno 298159f
more line formatting
faberno 85e8e37
even more line formatting
faberno 87bd115
extended title overline
faberno b0d371d
removed dtype tests
faberno e1e2b0e
set dtypes of multiple attributes
faberno 4c98920
changed DOI links
faberno 3f754ff
less keypoints are plotted
faberno b16a2b7
corrected test name
faberno 18d663a
refactor _find_localize_evaluate for simplicity and performance
grlee77 f9b344e
move edge response calculation into a separate function
grlee77 5a81c61
compute edge response without np.linalg.eig as in the IPOL publication
grlee77 756b139
Only compute non-redundant Hessian entries
grlee77 98b901d
STYLE: replace use of np.hstack and np.vstack
grlee77 5a63274
use math module for scalar math
grlee77 0f733af
consistent scipy.ndimage import style
grlee77 47b03e1
use sparse meshgrid call in _compute_descriptor
grlee77 5df6c2c
move hist and bins initialization outside of the loop in _compute_des…
grlee77 b632864
minor simplification in _compute_descriptor
grlee77 7516a0d
BUG: contrast_threshold should not be divided by n_scales
grlee77 efceb63
use sparse meshgrid in _compute_orientation
grlee77 7ceab89
minor efficiency tweaks to _compute_orientations
grlee77 80f1fdd
STYLE: rename gradientSpace and gradientspace to gradient_space
grlee77 45f544b
STYLE: _compute_orientation
grlee77 82235f4
STYLE: Min -> p_min, Max -> p_max
grlee77 072e5d2
STYLE: _compute_descriptor
grlee77 6e39feb
STYLE: use r, c notation for consistency with scikit-image convention
grlee77 c0bd6ad
renaming for clarity
grlee77 1184889
split normalization and quantization
grlee77 721e04e
pep8 fixes
grlee77 fc24793
BUG: fix range of deltas in SIFT
grlee77 230b625
_create_scalespace: move sigma computations outside of the loop
grlee77 7a8423b
Reorder axes for octaves variable within _create_scalespace
grlee77 88557fe
minor style update
grlee77 2326ab8
Cythonize histogram bottleneck in _compute_descriptor
grlee77 07238d4
Create a fast _local_max Cython function to replace peak_local_max calls
grlee77 6ba4821
move orientation histogram bin and value computation to Cython
grlee77 4c5370e
increase offset_max from 0.5 to 0.6 as recommended in the IPOL paper
grlee77 286aabe
DOC: fix SIFT doctests and update references/notes in the docstring
grlee77 e09b19f
add descriptive docstrings to the Cython functions
grlee77 2387dcc
Make sure _compute_descriptor uses float32 computations for float32 i…
grlee77 7a296a9
Remove potentially annoying max octaves warning
grlee77 3d32be9
changed DOI links
faberno 528eda5
less keypoints are plotted
faberno a093715
corrected test name
faberno 303057f
Update example
grlee77 68bacfe
Merge remote-tracking branch 'faberno/feature/sift' into sift-review-…
grlee77 2087ea0
fix demo
grlee77 dd15765
expanded edges of hist to avoid wrong values in convolution +
faberno a0b2c0c
removed useless if in orientation
faberno 57f4538
implemented oversample_bilin in cython and used it to replace rescale
faberno b29759c
Merge pull request #1 from grlee77/sift-review-updates
faberno 9cea25a
Merge remote-tracking branch 'origin/feature/sift' into feature/sift
faberno c64cfd3
changed way we give delta to oversample_bilin
faberno 64f697f
removed unused variables
faberno 3ac9e2f
shorted too long lines
faberno e67e600
removed redundant check for empty keypoints
faberno 8f72f9d
corrected parameter name in docs of _update_histogram
faberno dbf2d47
could remove keypoints_valid in orientation due to prior removal of t…
faberno df2db70
formatting
faberno 33586cd
readded option to steer maximum number of octaves by self.n_octaves +…
faberno 9ddfa13
fixed dtype problem in _oversample_bilin
faberno b432746
changed test and example output
faberno aad72a7
pep08
faberno f7c8b9d
sort imports by isort style
faberno 053865e
used parentheses instead of backslash for line continuation
faberno 79d1f4d
removed parentheses and sort alphabetically in import
faberno a6011b2
make image C-contiguous before oversampling
faberno fbf48ab
removed range in loops
faberno 3bd3b80
simplified calculation
faberno 92a470a
removed unnecessary 0 in slicing
faberno 167775c
set dtype at creation of deltas
faberno 6c71aad
use skimage.transform.rescale instead of oversample_bilin + change te…
faberno 78bafff
replaced beginnings of detect, extract and detect_and_extract with _p…
faberno ee486d0
replaced 1/self.delta_min by self.upsampling
faberno 26c9d17
avoid copy of histograms by using reshape instead of flatten
faberno 3b2f409
made self.deltas a property and added new member float_dtype
faberno 7372e09
loop over self.deltas to call it less
faberno c535b29
removed unnecessary numpy import in plot_sift.py
faberno 6bfa34b
adjust output of example
faberno 6522d31
removed _oversample_bilin
faberno 89d948d
improve if
faberno f4f0812
fixed indentation and added comment about patent
faberno 2f80b83
improved neighbor looping in _local_max
faberno a4f2ba2
dtype fix double->np_floats in _ori_distances
grlee77 8514777
pep8
grlee77 bf78a40
Merge remote-tracking branch 'upstream/main' into feature/sift
grlee77 c48234b
Merge remote-tracking branch 'origin/feature/sift' into feature/sift
faberno f903adf
Merge branch 'main' of https://github.com/scikit-image/scikit-image i…
faberno 9a182f4
added description of sift
faberno a8e6948
indentation
faberno f48d528
replaced link of ORB paper
faberno 92b03e8
added underscore for references
faberno File filter
Filter by extension
Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
""" | ||
============================================== | ||
SIFT feature detector and descriptor extractor | ||
============================================== | ||
|
||
This example demonstrates the SIFT feature detection and its description | ||
algorithm. | ||
|
||
The scale-invariant feature transform (SIFT) [1]_ was published in 1999 and is | ||
still one of the most popular feature detectors available, as its promises to | ||
be "invariant to image scaling, translation, and rotation, and partially | ||
in-variant to illumination changes and affine or 3D projection" [2]_. Its | ||
biggest drawback is its runtime, that's said to be "at two orders of | ||
magnitude" [3]_ slower than ORB, which makes it unsuitable for real-time | ||
applications. | ||
|
||
References | ||
---------- | ||
.. [1] https://en.wikipedia.org/wiki/Scale-invariant_feature_transform | ||
|
||
.. [2] D.G. Lowe. "Object recognition from local scale-invariant | ||
features", Proceedings of the Seventh IEEE International | ||
Conference on Computer Vision, 1999, vol.2, pp. 1150-1157. | ||
:DOI:`10.1109/ICCV.1999.790410` | ||
|
||
.. [3] Ethan Rublee, Vincent Rabaud, Kurt Konolige and Gary Bradski | ||
"ORB: An efficient alternative to SIFT and SURF" | ||
http://www.gwylab.com/download/ORB_2012.pdf | ||
""" | ||
import matplotlib.pyplot as plt | ||
|
||
from skimage import data | ||
from skimage import transform | ||
from skimage.color import rgb2gray | ||
from skimage.feature import match_descriptors, plot_matches, SIFT | ||
|
||
img1 = rgb2gray(data.astronaut()) | ||
img2 = transform.rotate(img1, 180) | ||
tform = transform.AffineTransform(scale=(1.3, 1.1), rotation=0.5, | ||
translation=(0, -200)) | ||
img3 = transform.warp(img1, tform) | ||
|
||
descriptor_extractor = SIFT() | ||
|
||
descriptor_extractor.detect_and_extract(img1) | ||
keypoints1 = descriptor_extractor.keypoints | ||
descriptors1 = descriptor_extractor.descriptors | ||
|
||
descriptor_extractor.detect_and_extract(img2) | ||
keypoints2 = descriptor_extractor.keypoints | ||
descriptors2 = descriptor_extractor.descriptors | ||
|
||
descriptor_extractor.detect_and_extract(img3) | ||
keypoints3 = descriptor_extractor.keypoints | ||
descriptors3 = descriptor_extractor.descriptors | ||
|
||
matches12 = match_descriptors(descriptors1, descriptors2, max_ratio=0.6, | ||
cross_check=True) | ||
matches13 = match_descriptors(descriptors1, descriptors3, max_ratio=0.6, | ||
cross_check=True) | ||
fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(11, 8)) | ||
|
||
plt.gray() | ||
|
||
plot_matches(ax[0, 0], img1, img2, keypoints1, keypoints2, matches12) | ||
ax[0, 0].axis('off') | ||
ax[0, 0].set_title("Original Image vs. Flipped Image\n" | ||
"(all keypoints and matches)") | ||
|
||
plot_matches(ax[1, 0], img1, img3, keypoints1, keypoints3, matches13) | ||
ax[1, 0].axis('off') | ||
ax[1, 0].set_title("Original Image vs. Transformed Image\n" | ||
"(all keypoints and matches)") | ||
|
||
plot_matches(ax[0, 1], img1, img2, keypoints1, keypoints2, matches12[::15], | ||
only_matches=True) | ||
ax[0, 1].axis('off') | ||
ax[0, 1].set_title("Original Image vs. Flipped Image\n" | ||
"(subset of matches for visibility)") | ||
|
||
plot_matches(ax[1, 1], img1, img3, keypoints1, keypoints3, matches13[::15], | ||
only_matches=True) | ||
ax[1, 1].axis('off') | ||
ax[1, 1].set_title("Original Image vs. Transformed Image\n" | ||
"(subset of matches for visibility)") | ||
|
||
plt.tight_layout() | ||
plt.show() |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,163 @@ | ||
#cython: cdivision=True | ||
#cython: boundscheck=False | ||
#cython: nonecheck=False | ||
#cython: wraparound=False | ||
|
||
import numpy as np | ||
cimport numpy as cnp | ||
from libc.math cimport M_PI, floor | ||
|
||
from .._shared.fused_numerics cimport np_floats | ||
|
||
|
||
cpdef _ori_distances(np_floats[::1] ori_bins, | ||
np_floats[::1] theta): | ||
"""Compute angular minima and their indices. | ||
|
||
Parameters | ||
---------- | ||
ori_bins : ndarray | ||
The orientation histogram bins. | ||
theta : ndarray | ||
The orientation computed for each pixel in the patch. | ||
|
||
Returns | ||
------- | ||
near_t : ndarray | ||
The histogram bin index corresponding to each pixel in the patch. | ||
near_t_val : ndarray | ||
The distance between the orientation of each pixel and the nearest bin. | ||
""" | ||
cdef: | ||
Py_ssize_t n_theta = theta.size | ||
Py_ssize_t n_ori = ori_bins.size | ||
rfezzani marked this conversation as resolved.
Show resolved
Hide resolved
|
||
Py_ssize_t i, j, idx_min | ||
np_floats th, dist, dist_min | ||
np_floats two_pi = 2 * M_PI | ||
Py_ssize_t[::1] near_t = np.empty((n_theta, ), dtype=np.intp) | ||
|
||
if np_floats == cnp.float32_t: | ||
dtype = np.float32 | ||
else: | ||
dtype = np.float64 | ||
cdef np_floats[::1] near_t_vals = np.empty((n_theta, ), dtype=dtype) | ||
|
||
for i in range(n_theta): | ||
dist_min = 100. | ||
th = theta[i] | ||
for j in range(n_ori): | ||
dist = ori_bins[j] - th | ||
rfezzani marked this conversation as resolved.
Show resolved
Hide resolved
|
||
if dist > two_pi: | ||
dist -= two_pi | ||
elif dist < 0: | ||
dist += two_pi | ||
if dist < dist_min: | ||
idx_min = j | ||
dist_min = dist | ||
near_t[i] = idx_min | ||
near_t_vals[i] = dist_min | ||
return np.asarray(near_t), np.asarray(near_t_vals) | ||
|
||
|
||
cpdef _update_histogram(np_floats[:, :, ::1] histograms, | ||
const Py_ssize_t[::1] near_t, | ||
np_floats[::1] near_t_val, # const | ||
np_floats[::1] magnitude, # const | ||
np_floats[:, ::1] dist_r, # const | ||
np_floats[:, ::1] dist_c, # const | ||
np_floats rc_bin_spacing): | ||
"""Compute an array of orientation histograms (Eq. 28 of Otero et. al.) | ||
|
||
Parameters | ||
---------- | ||
histograms : (n_hist, n_hist, n_ori) ndarray | ||
An array of zeros that will contain the histogram output. `n_ori` is | ||
the number of orientation bins and `n_hist` is the number of spatial | ||
bins along each axis. | ||
near_t : (n_patch,) ndarray | ||
The orientation histogram bins obtained from `_ori_distances`. | ||
near_t_val : (n_patch,) ndarray | ||
The orientation histogram values obtained from `_ori_distances`. | ||
magnitude : (n_patch,) ndarray | ||
The magnitude weights based on spatial distance. | ||
dist_r : (n_hist, n_patch) ndarray | ||
Row distances between each point in the patch and the nearest row | ||
histogram bin. Shape (n_hist, n_patch). | ||
dist_c : (n_hist, n_patch) ndarray | ||
Column distances between each point in the patch and the nearest column | ||
histogram bin. Shape (n_hist, n_patch). | ||
rc_bin_spacing : float | ||
The spacing between spatial histogram bins along a single axis. | ||
|
||
""" | ||
cdef: | ||
Py_ssize_t i, p, r, c, k_index, k_index2 | ||
Py_ssize_t n_patch = len(magnitude) | ||
rfezzani marked this conversation as resolved.
Show resolved
Hide resolved
|
||
Py_ssize_t n_hist = histograms.shape[0] | ||
Py_ssize_t n_ori = histograms.shape[2] | ||
np_floats t_val, val_norm1, w0, w1, w2, inv_spacing | ||
rfezzani marked this conversation as resolved.
Show resolved
Hide resolved
|
||
inv_spacing = 1 / rc_bin_spacing | ||
val_norm1 = n_ori / (2 * M_PI) | ||
for p in range(n_patch): | ||
rfezzani marked this conversation as resolved.
Show resolved
Hide resolved
|
||
for r in range(n_hist): | ||
if dist_r[r, p] > rc_bin_spacing: | ||
continue | ||
for c in range(n_hist): | ||
if dist_c[c, p] > rc_bin_spacing: | ||
continue | ||
w0 = ((1 - inv_spacing * dist_r[r, p]) | ||
* (1 - inv_spacing * dist_c[c, p]) | ||
* magnitude[p]) | ||
rfezzani marked this conversation as resolved.
Show resolved
Hide resolved
|
||
t_val = near_t_val[p] | ||
rfezzani marked this conversation as resolved.
Show resolved
Hide resolved
|
||
w1 = w0 * val_norm1 * t_val | ||
w2 = w0 - w1 | ||
k_index = near_t[p] | ||
rfezzani marked this conversation as resolved.
Show resolved
Hide resolved
|
||
k_index2 = k_index + 1 | ||
if k_index2 == n_ori: | ||
k_index2 = 0 | ||
histograms[r, c, k_index] += w1 | ||
histograms[r, c, k_index2] += w2 | ||
|
||
|
||
cpdef _local_max(np_floats[:, :, ::1] octave, double thresh): | ||
cdef: | ||
Py_ssize_t n_r = octave.shape[0] | ||
Py_ssize_t n_c = octave.shape[1] | ||
Py_ssize_t n_s = octave.shape[2] | ||
Py_ssize_t r, c, s | ||
np_floats* center_ptr | ||
Py_ssize_t neighbor_offsets[26] | ||
np_floats center_val, val | ||
int n = 0 | ||
|
||
for r in range(-1, 2): | ||
for c in range(-1, 2): | ||
for s in range(-1, 2): | ||
if (r != 0 or c != 0 or s != 0): | ||
neighbor_offsets[n] = (r * n_c + c) * n_s + s | ||
n += 1 | ||
rfezzani marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
maxima_coords = [] | ||
for r in range(1, n_r - 1): | ||
for c in range(1, n_c - 1): | ||
for s in range(1, n_s - 1): | ||
center_ptr = &octave[r, c, s] | ||
center_val = center_ptr[0] | ||
if abs(center_val) < thresh: | ||
continue | ||
is_local_min = True | ||
is_local_max = False | ||
for offset in neighbor_offsets: | ||
val = center_ptr[offset] | ||
if val <= center_val: | ||
is_local_max = True | ||
is_local_min = False | ||
for offset in neighbor_offsets: | ||
val = center_ptr[offset] | ||
if val >= center_val: | ||
is_local_max = False | ||
break | ||
break | ||
if is_local_min or is_local_max: | ||
maxima_coords.append((r, c, s)) | ||
return np.asarray(maxima_coords, dtype=np.intp) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have some context here about what SIFT is, how it is used, and what other feature detectors exist in skimage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay, I was quite busy :D How about:
This example demonstrates the SIFT feature detection and its description
algorithm.
Even though it was already published in 1999, SIFT is still one of the most
popular feature detectors available, as its promises to be "invariant to image
scaling, translation, and rotation, and partially in-variant to illumination
changes and affine or 3D projection" [1]. Its biggest drawback is its runtime,
that's said to be "at two orders of magnitude" [2] slower than ORB, which makes
it unsuitable for real-time applications.
.. [1] D.G. Lowe. "Object recognition from local scale-invariant
features", Proceedings of the Seventh IEEE International
Conference on Computer Vision, 1999, vol.2, pp. 1150-1157.
:DOI:
10.1109/ICCV.1999.790410
.. [2] Ethan Rublee, Vincent Rabaud, Kurt Konolige and Gary Bradski
"ORB: An efficient alternative to SIFT and SURF"
http://www.vision.cs.chubu.ac.jp/CV-R/pdf/Rublee_iccv2011.pdf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks pretty good to me, but I would modify slightly to spell out SIFT early on:
"The scale-invariant feature transform (SIFT) was published in 1999 and is still one of the most popular..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps a reference to the wikipedia article could also help for those wanting a bit more detail in a less technical format than the conference publications:
https://en.wikipedia.org/wiki/Scale-invariant_feature_transform
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that the link to the paper "ORB: An efficient alternative to SIFT and SURF" does not work anymore. I hope it's alright, that I changed it. Should we also change it in orb.py?