Added fast scanning algorithm #5

annimesh2809 · 2017-06-26T17:42:07Z

An implementation of the fast scanning algorithm as mentioned in this paper. An extension of this algorithm has also been implemented for working with higher dimensional images. A few features like removal of small segments and use of adaptive threshold will be progressively added. The performance of this algorithm for 2-d images is fantastic. After addition of the above-mentioned features, I will add the docstring and tests.

Here are a demo:

Original:

After segmentation and coloring the larger segments (except background):

Benchmark results:

julia> summary(img)
"270×400 Array{RGB{N0f8},2}"

julia> @benchmark fast_scanning(img, 0.21)
BenchmarkTools.Trial: 
  memory estimate:  1.94 MiB
  allocs estimate:  5636
  --------------
  minimum time:     20.537 ms (0.00% GC)
  median time:      21.222 ms (0.00% GC)
  mean time:        21.887 ms (0.60% GC)
  maximum time:     40.295 ms (0.00% GC)
  --------------
  samples:          229
  evals/sample:     1

codecov · 2017-06-26T18:50:19Z

Codecov Report

Merging #5 into master will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff          @@
##           master     #5   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files           3      5    +2     
  Lines         210    281   +71     
=====================================
+ Hits          210    281   +71

Impacted Files	Coverage Δ
src/core.jl	`100% <ø> (ø)`	⬆️
src/region_growing.jl	`100% <100%> (ø)`	⬆️
src/ImageSegmentation.jl	`100% <100%> (ø)`
src/fast_scanning.jl	`100% <100%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0d8720a...22e05e5. Read the comment docs.

timholy

Looks good. The main thing I wonder about is code de-duplication---are there prospects for unifying the two implementations? Obviously there are a number of details (e.g., the use of two integers to left_label and top_label in one case and a Vector{CartesianIndex{N}} in the other), but I wonder if one could define some small utility functions that would allow you to combine the implementations?

This is not crucial, but the advantages of combining them include:

more certainty that bugfixes will be applicable to all cases
the potential for flexibility in allowing the usage of corners even in 2d

timholy · 2017-06-26T19:13:12Z

src/fast_scanning.jl

+# for 2-D images
+function fast_scanning{CT<:Colorant}(img::AbstractArray{CT,2}, threshold::Real, diff_fn::Function = default_diff_fn)
+
+    # Bfs coloring function


What does "bfs" stand for?

By "bfs" I meant Breadth First Search

timholy · 2017-06-26T19:13:34Z

src/fast_scanning.jl

@@ -0,0 +1,235 @@
+
+# for 2-D images
+function fast_scanning{CT<:Colorant}(img::AbstractArray{CT,2}, threshold::Real, diff_fn::Function = default_diff_fn)


Can we handle numbers too? CT<:Union{Real,Colorant} or something.

Yes, that can be done!

timholy · 2017-06-26T19:19:38Z

src/fast_scanning.jl

+    end
+
+    # Remove labels that have been recolored due to the bfs calls
+    @noinline sort_fn!(removed_labels) = sort!(removed_labels)


I guess this is needed because of 15276?

Yes, without it things lit up red in ProfileView.

timholy · 2017-06-26T19:23:51Z

src/fast_scanning.jl

+    _diagmN = diagm([1 for i in 1:N])
+    half_region = ntuple(i-> CartesianIndex{N}(ntuple(j->_diagmN[j,i], Val{N})), Val{N})
+    neg_neighbourhood{N}(x::CartesianIndex{N}) = ntuple(i-> x-half_region[i], Val{N})
+    pos_neighbourhood{N}(x::CartesianIndex{N}) = ntuple(i-> x+half_region[i], Val{N})


So I guess one difference between the two implementations is that "corners" are used here but not above. I wonder if one could unify the two algorithms by having a AbstractStencilIterator, with different concrete implementations, and allowing that as an argument to the function?

annimesh2809 · 2017-06-30T11:59:34Z

After some more changes, the performance has increased with a mean time of about 40 ms for a 512x512 color image.
However, when I tried to unify the two implementations, the performance decreased for 2-d images. Multiple data structures that are required for N-d images are not necessary for the 2-d case.
Another concern is regarding the API of the functions. The methods for fast_scanning have two arguments diff_fn and threshold. These arguments could be replaced with a single argument belongs_to::Function which would return true if the pixel could belong to the region or else it would return false. Internally it could work as belongs_to(region, point) = diff_fn(region, point) < threshold. This would also allow removing the sqrt from default_diff_fn which is slowing the implementation. So it would look like:

default_diff_fn{CT1<:Colorant, CT2<:Colorant}(c1::CT1,c2::CT2) = sum(abs2,(c1)-Images.accum(CT2)(c2))
belongs_to(region, point) = default_diff_fn(region, point) < threshold^2

timholy · 2017-06-30T21:32:28Z

If unifying them is too painful, then by all means keep them separate. I am trying to head out for a holiday, but I'll see if I can review changes over the weekend. Sorry for the delay, I have been very busy with julia 1.0 changes lately...

timholy · 2017-07-02T12:38:56Z

Seems that if a couple of comments above are addressed (spell out bfs and generalize to use Real as well as Colorant) then this is good to go.

annimesh2809 · 2017-07-02T13:06:30Z

Yes, currently I am working on replacing the bfs! function with a union-find data structure to increase performance and also trying to merge the two implementations. I will try to complete this implementation with the tests and docstring by tomorrow.

annimesh2809 · 2017-07-03T13:17:04Z

After removing the bfs_color! function, it was easy to merge the two implementations without any performance loss. Now it uses a Union-find data structure to guarantee a linear time complexity on the number of pixels.

annimesh2809 · 2017-07-03T22:35:02Z

The paper demonstrates that adaptive thresholding produces better segmentation with grayscale images. This can be implemented as another method for CT<:Gray using dispatch. Should we add this method?
A utility function that removes the segments with pixel count < threshold might be useful. I tried searching for it but could not find what I wanted (my googling skills seem to be poor). @timholy and @tejus-gupta, have you guys worked on or know about any such algorithm? The function would assign the pixels of the smaller segments to its nearest valid segment (with pixel count > threshold).

tejus-gupta · 2017-07-04T16:21:39Z

The author's implementation of the "Efficient Graph-Based Image Segmentation" paper post processes the image to remove small segments. They construct a Region Adjacency Graph with each of the segmented region as vertex and edges between neighboring regions measuring dissimilarity between these regions. They iterate through edges (sorted in increasing order of weight) and merge an edge's two vertices if either of the vertex regions has size < minimum region size. Since edge weights measure similarity between vertex regions, small regions will merge with their most similar neighbor (and this merged region will recursively merge with it's most similar region if it's size < minimum region size).

Due to the way felzenszwalb's algorithm and normalized graph-cut algorithm are setup, their is no overhead for constructing the RAG and managing edge links during merge step. This step may slow down your algorithm. Choosing the similarity metric for the RAG that always works is also difficult.

annimesh2809 · 2017-07-04T18:46:20Z

The utility function is not specifically meant for this algorithm. It could be present in core.jl and have a prototype like:

function segfilter{T<:AbstractArray,U<:Union{Colorant, Real}}(seg::SegmentedImage{T,U}, threshold::Int)::SegmentedImage{T,U}

Provided a reasonable threshold, the fast scanning algorithm produces large enough segments.

This function is just in case the user wants to set a lower limit on the size of segments, he can use any of the segmentation methods depending on his requirements and then filter it using this method.

timholy · 2017-07-04T19:24:33Z

💯 for having succeeded in merging the two implementations with no loss in performance. You're leveraging what I think is Julia's main asset, the ability to write incredibly generic code without sacrificing speed. I think it's what sets Julia apart from anything I've ever used before, and you're becoming a master at exploiting this.

I don't see anything that should hold this up from merging, unless you want to tackle the items in #5 (comment). Would the adaptive threshold produce a different value for each pixel? If so, could you once again stick with a single implementation, allowing an AbstractArray input for thresh and then defining something like this?

getscalar(A::AbstractArray, i) = A[i]
getscalar(a::Real, i) = a

As far as merging small components goes, one possibility would be to have a function that constructs a graph from a SegmentedImage keeping track of how many times each component is adjacent to each other component. This graph will be smaller than any pixelwise graph (since it only has as many nodes as there are labels, rather than pixels) and might allow quite a few different types of post-processing.

annimesh2809 · 2017-07-05T08:53:02Z

I like the idea of allowing an AbstractArray input for thresh and using getscalar. I will add adaptive thresholding by today, then we can merge.
I will add the function to merge smaller components in a separate PR.

annimesh2809 · 2017-07-06T21:37:09Z

Support for adaptive thresholding has been added. An array can be passed as threshold which partitions the original image into blocks and each block is assigned a common threshold. For grayscale images, the adpative_thres function returns the N-dimensional threshold array, given the number of partitions in each dimension. The adaptive_thres function calculates the thresholds based on the frequency(sharpness) and variance of each block. The constants have been adjusted to work with common grayscale images, however, they are dependent on the domain of the image. This does not constrain us as the user can provide custom thresholds in the form of arrays.

timholy

Looks good! Merge at will.

timholy · 2017-07-07T18:59:47Z

src/fast_scanning.jl

@@ -0,0 +1,136 @@
+sharpness{CT<:Images.NumberLike,N}(img::AbstractArray{CT,N}) = var(imfilter(img, Kernel.Laplacian(ntuple(i->true, Val{N}))))
+
+function adaptive_thres{CT<:Images.NumberLike,N}(img::AbstractArray{CT,N}, block::NTuple{N,Int})


Do you have a citation for this algorithm? Is it something that might have more widespread use?

Do you have a citation for this algorithm?

The algorithm used in adaptive_thres has been demonstrated in this paper. They have used a linear combination of the variance and frequency as a measure of the threshold of each block. The coefficients have been adjusted to work with common grayscale images.
The sharpness (frequency) of an image is calculated using the variance of Laplacian as demonstrated here (see LAP4).

Is it something that might have more widespread use?

The sharpness function might be useful. However, there are multiple more methods for measuring the focus of an image (about 30 methods have been compared in the mentioned paper above). The adaptive_thres function uses constants that have been hard-coded to work nicely only with a small class of images and only when using abs(x1-x2) as the difference measure. I don't think it has much widespread use.

timholy · 2017-07-07T19:00:53Z

src/fast_scanning.jl

+"""
+    seg_img = fast_scanning(img, threshold, [diff_fn])
+
+Segments the N-D image using a fast scanning algorithm and returns a


Come to think of it, a citation for most of our algorithms might be a good idea. I've not done that systematically in Images.jl, unfortunately.

I will add that

timholy · 2017-07-07T19:05:02Z

src/fast_scanning.jl

+
+```
+"""
+fast_scanning{CT<:Images.NumberLike,N}(img::AbstractArray{CT,N}, block::NTuple{N,Int} = ntuple(i->4,Val{N})) = fast_scanning(img, adaptive_thres(img, block))


Attach the docstring to the "main" method not the one that dispatches to the adaptive threshold?

Also, wrap line after the =?

timholy · 2017-07-07T19:08:13Z

src/fast_scanning.jl

+
+function fast_scanning{CT<:Union{Colorant,Real},N}(img::AbstractArray{CT,N}, threshold::Union{AbstractArray,Real}, diff_fn::Function = default_diff_fn)
+
+    if typeof(threshold) <: AbstractArray


Can also use isa(threshold, AbstractArray)

or even threshold isa AbstractArray

threshold isa AbstractArray fails on Julia v0.5 but isa(threshold, AbstractArray) works. Do we need to support Julia v0.5?

Either is fine by me. If you want to drop 0.5 just make sure you change the REQUIRE and CI scripts appropriately.

timholy · 2017-07-08T22:17:47Z

🎆

tkelman · 2017-09-01T13:53:48Z

src/ImageSegmentation.jl

+using Images, DataStructures, StaticArrays, ImageFiltering
+
+# For efficient hashing of CartesianIndex
+if !isdefined(Base.IteratorsMD, :cartindexhash_seed)


~~is this needed if your minimum julia version is 0.6?~~ nevermind, was added post-0.6 in JuliaLang/julia#22657 - Compat might be a better place for this

Should I remove it from here and send a PR to Compat?

@timholy thoughts? Would this break anything in any way to start getting different hash values for these? If it's otherwise a performance-only improvement, would it make sense for this to be available in a centralized, approximately-always-loaded place instead of only being changed when ImageSegmentation is loaded?

It's just a performance tweak (though one with substantial consequences). I'd support moving it to Compat, though I don't think type piracy that leads only to performance changes is all that much to worry about. (Yes, better in Compat, but no crime having it only here.)

timholy reviewed Jun 26, 2017

View reviewed changes

annimesh2809 added 3 commits July 3, 2017 02:26

Added fast scanning algorithm

c313df9

Replace bfs! function with a union find ds

766c74e

Merged the two implementations without performance degradation

31ab423

annimesh2809 force-pushed the Fast_scanning branch from 9635237 to 31ab423 Compare July 3, 2017 13:10

annimesh2809 added 2 commits July 3, 2017 19:41

Added docstring and some changes

db042c5

Added tests and minor fixes

09bf9d4

Added adaptive thresholding

f53f4f7

Added more tests

5d3ea98

annimesh2809 requested a review from timholy July 7, 2017 10:34

timholy approved these changes Jul 7, 2017

View reviewed changes

annimesh2809 added 2 commits July 8, 2017 02:10

Minor fixes

f2c8270

Withdraw support for julia v0.5

22e05e5

annimesh2809 merged commit d7dff8e into JuliaImages:master Jul 8, 2017

tkelman reviewed Sep 1, 2017

View reviewed changes

		@@ -0,0 +1,235 @@

		# for 2-D images
		function fast_scanning{CT<:Colorant}(img::AbstractArray{CT,2}, threshold::Real, diff_fn::Function = default_diff_fn)

		@@ -0,0 +1,136 @@
		sharpness{CT<:Images.NumberLike,N}(img::AbstractArray{CT,N}) = var(imfilter(img, Kernel.Laplacian(ntuple(i->true, Val{N}))))

		function adaptive_thres{CT<:Images.NumberLike,N}(img::AbstractArray{CT,N}, block::NTuple{N,Int})


		function fast_scanning{CT<:Union{Colorant,Real},N}(img::AbstractArray{CT,N}, threshold::Union{AbstractArray,Real}, diff_fn::Function = default_diff_fn)

		if typeof(threshold) <: AbstractArray

Added fast scanning algorithm #5

Added fast scanning algorithm #5

Conversation

annimesh2809 commented Jun 26, 2017

codecov bot commented Jun 26, 2017 • edited Loading

Codecov Report

timholy left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

annimesh2809 commented Jun 30, 2017 • edited Loading

timholy commented Jun 30, 2017

timholy commented Jul 2, 2017

annimesh2809 commented Jul 2, 2017

annimesh2809 commented Jul 3, 2017

annimesh2809 commented Jul 3, 2017

tejus-gupta commented Jul 4, 2017 • edited Loading

annimesh2809 commented Jul 4, 2017

timholy commented Jul 4, 2017 • edited Loading

annimesh2809 commented Jul 5, 2017

annimesh2809 commented Jul 6, 2017

timholy left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

timholy commented Jul 8, 2017

tkelman Sep 1, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov bot commented Jun 26, 2017 •

edited

Loading

annimesh2809 commented Jun 30, 2017 •

edited

Loading

tejus-gupta commented Jul 4, 2017 •

edited

Loading

timholy commented Jul 4, 2017 •

edited

Loading

tkelman Sep 1, 2017 •

edited

Loading