New superpixel algorithm (F-DBSCAN) #3093

scloke · 2021-10-31T03:34:00Z

Implementation of a new superpixel algorithm, "Accelerated superpixel image segmentation with a parallelized DBSCAN algorithm".

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
The PR is proposed to proper branch
There is reference to original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

force_builders=Linux x64,Win64,Mac,Android armeabi-v7a,Docs,iOS,Win32,ARMv7,ARMv8,Linux x64 Debug

sturkmen72 · 2021-11-07T09:56:38Z

@scloke First of all, thank you for the contribution.
As an ordinary OpenCV user, I compiled the module and tested it with the following Python code,
and I want to share my experience.

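Code:

import cv2 as cv  # needs an OpenCV build that includes the ximgproc contrib module

img = cv.imread('d:/test/hajandrade.jpg')
# Note: the header documentation asks for a Lab image; this test passes the image as loaded (BGR).
# arguments: width, height, number of superpixels, threads (named slices in later revisions), merge_small
ss = cv.ximgproc.createScanSegment(img.shape[1], img.shape[0], 500, 1, True)
tm = cv.TickMeter()
tm.start()
ss.iterate(img)
res = ss.getLabelContourMask(True)
tm.stop()
print(tm.getTimeMilli())  # elapsed time in milliseconds
# overlay the contour mask on the input image
res = cv.cvtColor(res, cv.COLOR_GRAY2BGR)
res = cv.add(res, img)
cv.imshow("Output", res)
cv.waitKey()
cv.destroyAllWindows()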

Is it possible to create one instance of ScanSegment and use it for different-sized images?
Passing different values for the threads argument of createScanSegment produces different results.

[image: result with threads = 1]

[image: result with threads = 4]

Is this pattern intentional?

scloke · 2021-11-07T10:55:23Z

Hi,

Thanks for the comments, much appreciated. I am new to open source, so I will try to explain the background of this contribution in some detail.

This algorithm was developed to speed up superpixel segmentation. The underlying principle is straightforward: it parallelises two important processes, namely 1) the segmentation itself, which is DBSCAN-based, and 2) the merging of small segments.

Normally DBSCAN is considered too slow for real-time image work, and quite hard to parallelise since many small segments are created and large segments formed by different processes will overlap. The innovation here is to limit the cluster size and convert the colour difference function to simple integer-based arithmetic. This can be processed very fast and segments are limited in size and hence don't overlap.
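To make the integer-arithmetic idea concrete, here is a minimal sketch of one possible colour difference: the largest absolute per-channel difference between two 8-bit, 3-channel pixels. The exact distance used by the DBSCAN step is not shown in this thread; the watershed macro quoted later in the review takes a per-channel maximum of this kind. The `Pixel` alias, `channelMaxDiff`, and `withinThreshold` are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical 3-channel 8-bit pixel (e.g. one Lab triplet); not an OpenCV type.
using Pixel = std::array<std::uint8_t, 3>;

// Largest absolute per-channel difference, computed with integer arithmetic only.
// For 8-bit channels the result always lies in [0, 255].
inline int channelMaxDiff(const Pixel& a, const Pixel& b)
{
    int diff = 0;
    for (std::size_t c = 0; c < a.size(); ++c)
        diff = std::max(diff, std::abs(static_cast<int>(a[c]) - static_cast<int>(b[c])));
    return diff;
}

// Assumed clustering test: a neighbour joins the cluster when it is close enough.
inline bool withinThreshold(const Pixel& seed, const Pixel& candidate, int threshold)
{
    return channelMaxDiff(seed, candidate) <= threshold;
}
```

No floating-point operations or square roots are involved, which is the kind of saving the paragraph above describes.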

Similarly, merging of small segments is quite slow as I am using an adaptation of OpenCV's watershed algorithm. I managed to parallelise it efficiently by dividing it in the same way as the segmentation, and using a window with a surrounding margin of 1 pixel.

This explains the things you have noticed:

1) The inverted-T pattern follows from the way the DBSCAN clustering works. On a large, homogeneous, textureless image, this pattern is repeated throughout.
2) The use of 4 threads shows a clear horizontal and vertical division, which comes from the merging algorithm.
3) The output is deterministic when the number of threads is fixed (this is noted in the comments of the .hpp file), but different thread counts give different segmentation results.
4) The initialisation function only allocates the buffers. Each run of the iterate function segments a new image using the allocated buffers. No problems.

This algorithm has quadratic complexity, O(n^2), so it is better suited to smaller images. When tested on the Berkeley Segmentation Dataset (smaller images), it is about six times faster than SEEDS or any of the other OpenCV algorithms (the new algorithm is labelled F-DBSCAN on the chart).

The segmentation accuracy is about the same as the OpenCV routines.

When tested on this random 2 megapixel image on my hard drive with 200 superpixels, it gave 199 segments. In comparison, SEEDS gave 144 segments using the default settings. Runtime was about 32.9 s for 1000 iterations for F-DBSCAN and 58.3 s for 1000 iterations for SEEDS on a 10-core i9 Windows machine.
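As a quick check on those figures, the per-iteration times and the relative speed-up work out as follows (plain arithmetic on the numbers quoted above, nothing measured separately):

```latex
\begin{aligned}
t_{\text{F-DBSCAN}} &= \frac{32.9\ \text{s}}{1000} = 32.9\ \text{ms per iteration} \approx 30.4\ \text{iterations/s} \\
t_{\text{SEEDS}} &= \frac{58.3\ \text{s}}{1000} = 58.3\ \text{ms per iteration} \approx 17.2\ \text{iterations/s} \\
\frac{t_{\text{SEEDS}}}{t_{\text{F-DBSCAN}}} &= \frac{58.3}{32.9} \approx 1.77
\end{aligned}
```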

[images: Original, F-DBSCAN, SEEDS]

The original algorithm was written in ISO C++17, so I had to adapt it to C++11 for OpenCV and use the built-in parallelisation routines. The speed differs from the original, but the segmentation accuracy should be the same. I intend to submit this first; if it is approved, I will use the finalised code to build a short sample program comparing performance against the other OpenCV algorithms. I will also write a proper usage guide with examples.

SC

alalek

Thank you for the contribution!

Please take a look at the comments below.

modules/ximgproc/include/opencv2/ximgproc/scansegment.hpp

modules/ximgproc/src/scansegment.cpp

alalek

Thank you for the updates!

alalek · 2021-11-19T03:41:03Z

modules/ximgproc/src/scansegment.cpp

+dr = std::abs((ptr1)[2] - (ptr2)[2]); \
+diff = ws_max(db, dg);                \
+diff = ws_max(diff, dr);              \
+assert(0 <= diff && diff <= 255);  \


Use of C-style assert() is not allowed.

Use CV_Assert() / CV_DbgAssert() instead,

or prefer code with std::clamp() logic instead of checks (std::clamp is a C++17 feature and not yet available by default, so simulate it with std::min/std::max).

Other cases should be fixed too.
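A minimal sketch of the std::min/std::max simulation mentioned above, usable under C++11. The helper names `clampInt` and `clampToByteRange` are hypothetical and not part of OpenCV or of this PR.

```cpp
#include <algorithm>

// C++11-compatible stand-in for std::clamp (which needs C++17).
// Assumes lo <= hi; returns v limited to the closed range [lo, hi].
inline int clampInt(int v, int lo, int hi)
{
    return std::max(lo, std::min(v, hi));
}

// Example: force a colour difference into the 8-bit range instead of asserting on it.
inline int clampToByteRange(int diff)
{
    return clampInt(diff, 0, 255);
}
```

With this approach an out-of-range value is corrected rather than reported, so it suits cases where the range is an invariant to enforce, not an error to detect.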

scloke · 2021-11-21

Substituted with CV_Assert().

alalek · 2021-11-19T03:44:23Z

modules/ximgproc/src/scansegment.cpp

+else                                \
+q[idx].first = node;            \
+q[idx].last = node;                 \


Add {} braces? Or add indentation to make clear that this code is not broken.

scloke · 2021-11-21

Indentation added.

alalek · 2021-11-19T03:46:30Z

modules/ximgproc/src/scansegment.cpp

+//! @addtogroup ximgproc_superpixel
+//! @{
+
+class ScanSegmentImpl : public ScanSegment


consider using CV_FINAL (final)

alalek · 2021-11-19T03:48:35Z

modules/ximgproc/src/scansegment.cpp

+ScanSegmentImpl::ScanSegmentImpl(int image_width, int image_height, int num_superpixels, int slices, bool merge_small)
+{
+    // set the number of process threads
+    processthreads = std::thread::hardware_concurrency();


Please use cv::getNumThreads() instead, drop #include <thread>

alalek · 2021-11-19T04:02:34Z

modules/ximgproc/src/scansegment.cpp

+
+    // start at the center of the rect, then run through the remainder
+    labBuffer = reinterpret_cast<cv::Vec3b*>(src.data);
+    cv::parallel_for_(cv::Range(0, (int)indexNeighbourVec.size()), PP1(reinterpret_cast<ScanSegmentImpl*>(this)));


BTW, using C++11 lambdas with parallel_for_ looks like this:

parallel_for_(Range(0, (int)indexNeighbourVec.size()), [&](const Range& range) {
    for (int i = range.start; i < range.end; i++) {
        OP1(i);
    }
});

OP4 case:

// copy back to labels mat
parallel_for_(Range(0, (int)indexProcessVec.size()), [&](const Range& range) {
    for (int i = range.start; i < range.end; i++) {
        OP4(indexProcessVec[i]);
    }
});

scloke · 2021-11-21

Thanks, I appreciate the help. I compiled it on my system with C++14 without problems. All replaced.

alalek · 2021-11-19T04:09:13Z

modules/ximgproc/src/scansegment.cpp

+    cv::AutoBuffer<cv::Rect> _seedRects;    // autobuffer of seed rectangles
+    cv::AutoBuffer<cv::Rect> _seedRectsExt; // autobuffer of extended seed rectangles
+    cv::AutoBuffer<cv::Rect> _offsetRects;  // autobuffer of offset rectangles
+    cv::AutoBuffer<cv::Point> _neighbourLoc;// autobuffer of neighbour locations
+    cv::Rect* seedRects;					// array of seed rectangles
+    cv::Rect* seedRectsExt;					// array of extended seed rectangles
+    cv::Rect* offsetRects;					// array of offset rectangles
+    cv::Point* neighbourLoc;				// neighbour locations


In general, it is dangerous to store dedicated RAW pointers (a RAW pointer cannot control the lifetime of the allocated buffer).
It also doesn't make sense, as the .data() method is as fast as a RAW pointer.
Moreover, operator [](size_t i) checks the index against the valid range in debug builds.
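The lifetime concern can be sketched without OpenCV. The example below uses `std::vector` as a stand-in for `cv::AutoBuffer` (an assumption made so the snippet is self-contained), and the class name `SeedStore` and its members are hypothetical. The owning buffer is the only data member, and a pointer is fetched from `.data()` at the point of use instead of being cached next to it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical holder; std::vector stands in for cv::AutoBuffer here.
class SeedStore
{
public:
    explicit SeedStore(std::size_t count) : seeds_(count, 0) {}

    // Reallocating is safe because no raw pointer to the old storage is cached.
    void resize(std::size_t count) { seeds_.assign(count, 0); }

    // Element access goes through the owning buffer every time; at() checks the index.
    void set(std::size_t i, int value) { seeds_.at(i) = value; }
    int get(std::size_t i) const { return seeds_.at(i); }

    // A pointer is obtained only for the duration of one pass over the data.
    long long sum() const
    {
        const int* p = seeds_.data();
        long long total = 0;
        for (std::size_t i = 0; i < seeds_.size(); ++i)
            total += p[i];
        return total;
    }

private:
    std::vector<int> seeds_;  // single owner; no separate int* member that could go stale
};
```

One difference from the stand-in: `std::vector::at()` checks the index in every build, whereas the comment above describes the AutoBuffer index check as a debug-build feature.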

scloke · 2021-11-22

The .data() method is new to me, so I used the CLAHE implementation in OpenCV as a template to follow. The code there converts the result of .data() to a RAW pointer:
cv::AutoBuffer<int> _tileHist(histSize);
int* tileHist = _tileHist.data();

There is a significant speed difference when switching to .data() without dedicated RAW pointers. I ran the code with and without dedicated RAW pointers over a thousand cycles, twice, and the times were 31.1 s (with RAW) vs 37.3 s (without RAW). I managed to improve the speed a bit by converting neighbourLoc to a pre-initialised buffer, but otherwise I left it unchanged.

I think the use of RAW pointers should be safe enough in this case, since they are taken from AutoBuffers that are class members, allocated and deallocated with the lifetime of the class. The evidence for this is:

1) Running several thousand cycles in both debug and release builds showed no instability, memory leaks, or buffer overruns.

2) The CLAHE module in OpenCV uses the same method, and there have been no reports of instability.

alalek · 2021-11-22

I really have several questions about such micro-benchmarks... No idea what they measure.

https://github.com/opencv/opencv/blob/ac4b592b4e550a0ced1977e9aa19e8059a796e3c/modules/core/include/opencv2/core/utility.hpp#L127-L142

scloke · 2021-11-22

From the code you quoted, .data() should return the aligned pointer to the data in the AutoBuffer directly, and operator[] should behave similarly. Hence, as you said, there should not be any speed difference.

Previously, I ran the entire code on the 2 MP image I described earlier, 1000 times. Now I have written a micro-benchmark to test the AutoBuffer specifically.

#include <algorithm>
#include <chrono>
#include <opencv2/core.hpp>

// test 10000 autobuffer with raw vs without raw
void testIterate(int iterate, int buffersize, int* rawtime, int* norawtime)
{
    // case 1: access through a dedicated RAW pointer
    cv::AutoBuffer<int> _testBuffer = cv::AutoBuffer<int>(buffersize);
    int* testBuffer = _testBuffer.data();
    std::fill(testBuffer, testBuffer + buffersize, 0);

    auto tstart1 = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < iterate; i++) {
        for (int j = 0; j < buffersize; j++) {
            testBuffer[j] = 0;
            testBuffer[j] = testBuffer[j] + 1;
        }
    }
    auto tend1 = std::chrono::high_resolution_clock::now();
    *rawtime = (int)std::chrono::duration_cast<std::chrono::microseconds>(tend1 - tstart1).count();

    // case 2: access through .data() on every use
    cv::AutoBuffer<int> testBuffer2 = cv::AutoBuffer<int>(buffersize);
    std::fill(testBuffer2.data(), testBuffer2.data() + buffersize, 0);

    auto tstart2 = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < iterate; i++) {
        for (int j = 0; j < buffersize; j++) {
            testBuffer2.data()[j] = 0;
            testBuffer2.data()[j] = testBuffer2.data()[j] + 1;
        }
    }
    auto tend2 = std::chrono::high_resolution_clock::now();
    *norawtime = (int)std::chrono::duration_cast<std::chrono::microseconds>(tend2 - tstart2).count();
}

These were the results in Debug and Release mode respectively, with the numbers in microseconds, iterating 50000 times, with a buffer size of 50000.

[images: DEBUG results, RELEASE results]

There is a large difference between the use of RAWs and without, so much so that I am wondering if this is a compiler optimisation effect we are looking at.

If this is so, then the use of RAW pointers may be easier for the compiler to optimise, hence the speed difference we are seeing.

What do you think?

scloke · 2021-11-24

I tried again using Xcode on iOS, and here are the new values:

[images: DEBUG results, RELEASE results]

This time the release numbers look more like what was expected. Once again, compiler differences? I am open to suggestions and have kept both versions on my system. If you think it is better to go completely without RAWs, I will upload the new version. Do let me know.

alalek · 2021-11-24

The Release difference is about 0.1%, which is within measurement accuracy. I would say we don't lose performance here.
The Debug difference is expected: the .data() method and other functions are not inlined by default, and more checks are involved to validate code assumptions (e.g., CV_DbgAssert). Usually debug builds are not tracked for performance.

It is better to replace RAW pointers from a code safety/security perspective.
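For anyone repeating such a comparison, one common precaution is to make the timed loop produce a value that is actually used, so that an optimising compiler cannot simply discard the stores being measured. The sketch below is a generic illustration, not the benchmark from this thread: `TimedResult` and `timeIncrementPasses` are hypothetical names, and `std::vector` stands in for `cv::AutoBuffer` to keep the snippet self-contained.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

struct TimedResult
{
    long long microseconds;  // elapsed time of the timed loop
    long long checksum;      // sum of the buffer afterwards, so the stores stay observable
};

// Increments every element `passes` times and reports the time taken plus a checksum.
inline TimedResult timeIncrementPasses(std::vector<int>& buffer, int passes)
{
    const auto start = std::chrono::steady_clock::now();  // monotonic clock
    for (int p = 0; p < passes; ++p)
        for (std::size_t j = 0; j < buffer.size(); ++j)
            buffer[j] += 1;
    const auto stop = std::chrono::steady_clock::now();

    long long checksum = 0;
    for (int v : buffer)
        checksum += v;

    const long long us = static_cast<long long>(
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count());
    return { us, checksum };
}
```

The checksum only guards against the work being removed outright; a compiler may still restructure the loops, so Release timings from a loop like this remain a rough indication.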

scloke · 2021-11-24

I have replaced the RAW pointers in the code, except for labBuffer, since it is taken from Mat.data rather than an AutoBuffer. This should be safe since the lifetime of the pointer is short (it is only used in OP1 and is read just before invocation), and it is read-only for operator values.

alalek

Pushed an update with fixed coding style and an added smoke test.

Please take a look at the comments below.

alalek · 2021-11-28T03:19:26Z

modules/ximgproc/src/scansegment.cpp

+void ScanSegmentImpl::OP2(std::pair<int, int> const& p)
+{
+    std::pair<int, int>& q = const_cast<std::pair<int, int>&>(p);
+    for (int i = q.first; i < q.second; i++) {


Why do we need this alias?

std::pair<int, int>& q = const_cast<std::pair<int, int>&>(p);

The same note applies to OP4.

scloke · 2021-11-29

My apologies. This was old code that got carried over. I have updated it.

alalek · 2021-11-28T03:39:52Z

modules/ximgproc/include/opencv2/ximgproc/scansegment.hpp

+@param image_width Image width.
+@param image_height Image height.


BTW, prefer to use Size image_size instead of 2 dedicated values

scloke · 2021-11-29

I was considering this as well, but in the end I followed the pattern in createSuperpixelSEEDS, which uses two dedicated values. If you prefer cv::Size, let me know and I will change it.

alalek · 2021-11-28T03:41:27Z

modules/ximgproc/include/opencv2/ximgproc/scansegment.hpp

+existing superpixel segmentation methods. When tested on the Berkeley Segmentation Dataset, the average processing speed is 175 frames/s
+with a Boundary Recall of 0.797 and an Achievable Segmentation Accuracy of 0.944. The computational complexity is quadratic O(n2) and
+more suited to smaller images, but can still process a 2MP colour image faster than the SEEDS algorithm in OpenCV. The output is deterministic
+when the number of processing threads is fixed, and requires the source image to be in Lab colour format.


Please add @cite loke2021accelerated in this documentation section.

added citation

bug fixes

alalek

Great job! Thank you for contribution 👍

scloke · 2021-11-29T20:44:08Z

My pleasure 😊 SC

scloke added 14 commits October 31, 2021 16:29

New superpixel algorithm (F-DBSCAN)

68ece0e

Implementation of a new superpixel algorithm, "Accelerated superpixel image segmentation with a parallelized DBSCAN algorithm".

Update scansegment.hpp

79f8436

added newline at end of file

Update scansegment.cpp

3e6aaae

added newline at end of file

Update scansegment.cpp

5d8616e

bug fixes

Update scansegment.cpp

d74bde5

bug fixes

Update scansegment.hpp

9dfe150

bug fixes

Update scansegment.cpp

3e4f25d

bug fixes

Update scansegment.hpp

5a2e153

trailing whitespace removal

Update scansegment.cpp

7130d25

bug fixes

Update scansegment.cpp

ee702c6

bug fixes

Update scansegment.cpp

c4155b0

editing changes

Update scansegment.hpp

ec78937

editing changes

Update scansegment.hpp

4de0e31

minor edits

Update scansegment.cpp

c6a918c

bug fixes

sturkmen72 reviewed Nov 3, 2021

modules/ximgproc/include/opencv2/ximgproc/scansegment.hpp (outdated)

sturkmen72 reviewed Nov 3, 2021

modules/ximgproc/include/opencv2/ximgproc/scansegment.hpp (outdated)

scloke added 3 commits November 3, 2021 15:02

Update scansegment.cpp

67d9143

inserted @addtogroup block

Update scansegment.cpp

0fbdc2f

bug fixes

Update scansegment.hpp

74c6cf5

bug fixes

alalek reviewed Nov 9, 2021

scloke added 8 commits November 10, 2021 15:33

Update scansegment.hpp

650f9c4

indents removed

Update scansegment.cpp

e0a4048

extra indents removed

Update scansegment.cpp

a968096

license agreement updated

Update scansegment.hpp

d98fa2c

license agreement updated

Update ximgproc.bib

a2bdd04

Update scansegment.hpp

0b3bfe4

reference moved to ximgproc.bib

Update scansegment.cpp

a2d5fe8

reference moved to ximgproc.bib

Update scansegment.hpp

5e950ee

c++ def removed

scloke added 10 commits November 10, 2021 15:59

Update scansegment.hpp

d092203

changed threads param

Update scansegment.cpp

86521fc

changed threads param

Update scansegment.cpp

bcbfdc3

tab indents replaced with 4 spaces

Update scansegment.cpp

c64cbf6

bug fixes

Update scansegment.hpp

56112ad

removed trailing whitespace

Update scansegment.cpp

a048792

replace malloc with autobuffer

Update scansegment.hpp

f106c54

updated header guard

Update scansegment.cpp

6635ef2

bug fix

Update scansegment.cpp

e9413ad

bug fixes

Update scansegment.cpp

8e86fc0

fixed process threads to the number of slices

alalek reviewed Nov 19, 2021

scloke and others added 7 commits November 22, 2021 12:18

Update scansegment.cpp

ab87fa1

bug fixes

Update scansegment.cpp

1b586b3

C++ 11 lambdas used instead of cv::ParallelLoopBody

Update scansegment.cpp

a766bb4

changed neighbours location buffer to array

Update scansegment.cpp

67fa9e5

remove whitespace

Update scansegment.cpp

2234e6a

RAW pointers removed

Update scansegment.cpp

3f398a5

bug fixes

ximgproc(ScanSegment): coding style, add smoke test

8b132b5

alalek reviewed Nov 28, 2021

scloke added 2 commits November 29, 2021 16:45

Update scansegment.hpp

30e7b5b

added citation

Update scansegment.cpp

acc879b

bug fixes

alalek approved these changes Nov 29, 2021

alalek merged commit a5cc475 into opencv:4.x Nov 29, 2021

alalek mentioned this pull request Dec 30, 2021

(5.x) Merge 4.x #3142

Merged

alalek mentioned this pull request Feb 22, 2022

(5.x) Merge 4.x #3179

Merged
