-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New superpixel algorithm (F-DBSCAN) #3093
Conversation
Implementation of a new superpixel algorithm, "Accelerated superpixel image segmentation with a parallelized DBSCAN algorithm".
added newline at end of file
added newline at end of file
bug fixes
bug fixes
bug fixes
bug fixes
trailing whitespace removal
bug fixes
bug fixes
editing changes
editing changes
minor edits
bug fixes
inserted @addtogroup block
bug fixes
bug fixes
@scloke first of all thank you for the contribution code :
threads = 1 threads = 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for contribution!
Please take a look on comments below.
indents removed
extra indents removed
license agreement updated
license agreement updated
reference moved to ximgproc.bib
reference moved to ximgproc.bib
c++ def removed
changed threads param
changed threads param
tab indents replaced with 4 spaces
bug fixes
removed trailing whitespace
replace malloc with autobuffer
updated header guard
bug fix
bug fixes
fixed process threads to the number of slices
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for updates!
modules/ximgproc/src/scansegment.cpp
Outdated
dr = std::abs((ptr1)[2] - (ptr2)[2]); \ | ||
diff = ws_max(db, dg); \ | ||
diff = ws_max(diff, dr); \ | ||
assert(0 <= diff && diff <= 255); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using of C-style assert()
is not allowed.
- use
CV_Assert()
/CV_DbgAssert()
instead - or prefer to use code with std::clamp() logic instead of checks (clamp is C++17 feature and not available yet by default, so simulate it through std::min/max)
other cases should be fixed too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
substituted with CV_Assert()
modules/ximgproc/src/scansegment.cpp
Outdated
else \ | ||
q[idx].first = node; \ | ||
q[idx].last = node; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{}
brackets? or add indentation to clarify that this code is not broken
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation added
modules/ximgproc/src/scansegment.cpp
Outdated
//! @addtogroup ximgproc_superpixel | ||
//! @{ | ||
|
||
class ScanSegmentImpl : public ScanSegment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider using CV_FINAL
(final
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
modules/ximgproc/src/scansegment.cpp
Outdated
ScanSegmentImpl::ScanSegmentImpl(int image_width, int image_height, int num_superpixels, int slices, bool merge_small) | ||
{ | ||
// set the number of process threads | ||
processthreads = std::thread::hardware_concurrency(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use cv::getNumThreads()
instead, drop #include <thread>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replaced
modules/ximgproc/src/scansegment.cpp
Outdated
|
||
// start at the center of the rect, then run through the remainder | ||
labBuffer = reinterpret_cast<cv::Vec3b*>(src.data); | ||
cv::parallel_for_(cv::Range(0, (int)indexNeighbourVec.size()), PP1(reinterpret_cast<ScanSegmentImpl*>(this))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, using of C++11 lambdas with parallel_for looks like:
parallel_for_(Range(0, (int)indexNeighbourVec.size()), [&](const Range& range) {
for (int i = range.start; i < range.end; i++) {
OP1(i);
}
});
OP4 case:
// copy back to labels mat
parallel_for_(Range(0, (int)indexProcessVec.size()), [&](const Range& range) {
for (int i = range.start; i < range.end; i++) {
OP4(indexProcessVec[i]);
}
});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks+++ Appreciate the help. I compiled on my system on C++ 14 and no problems. All replaced
modules/ximgproc/src/scansegment.cpp
Outdated
cv::AutoBuffer<cv::Rect> _seedRects; // autobuffer of seed rectangles | ||
cv::AutoBuffer<cv::Rect> _seedRectsExt; // autobuffer of extended seed rectangles | ||
cv::AutoBuffer<cv::Rect> _offsetRects; // autobuffer of offset rectangles | ||
cv::AutoBuffer<cv::Point> _neighbourLoc;// autobuffer of neighbour locations | ||
cv::Rect* seedRects; // array of seed rectangles | ||
cv::Rect* seedRectsExt; // array of extended seed rectangles | ||
cv::Rect* offsetRects; // array of offset rectangles | ||
cv::Point* neighbourLoc; // neighbour locations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general it is dangerous to store dedicated RAW pointers (RAW pointers unable to control lifetime of allocated buffer).
Also it doesn't make sense as .data()
method is fast as RAW pointer.
Moreover operator [](size_t i)
would check index for valid range in debug builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .data() method is new to me, so I used the CLAHE implementation code in OpenCV as a template to follow. The code that was used there converted the .data() to a RAW pointer:
cv::AutoBuffer _tileHist(histSize);
int* tileHist = _tileHist.data();
There is a significant speed difference when converting to .data() without dedicated RAW pointers. I ran with and without dedicated RAW pointers over a thousand cycles, twice, and the speed difference was 31.1s (with RAW) vs 37.3s (without RAW). I managed to improve the speed a bit by converting the neighbourLoc to a pre-initialised buffer, but otherwise I left it unchanged.
I think that the use of RAW pointers in this case should be safe enough since they are sourced from the AutoBuffers which are initialised as class variables that are allocated and deallocated based on the lifetime of the class. The evidence of this is:
- running several thousand cycles in both debug and release builds showed no instability / memory leaks / buffer overruns.
- the CLAHE module in OpenCV uses the same method and there has be no report of instability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really have several questions for such micro-benchmarks... No idea that they measure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the code you quoted, .data() should give a straight reference to the aligned pointer of the data in the AutoBuffer. The operator should also function similarly. Hence, there should not be any speed difference as you said.
Previously, I ran the entire code to process the 2 MP image I described earlier 1000 times. Now I have written a microbenchmark to test the AutoBuffer specifically.
// test 10000 autobuffer with raw vs without raw
void testIterate(int iterate, int buffersize, int* rawtime, int* norawtime)
{
cv::AutoBuffer<int> _testBuffer = cv::AutoBuffer<int>(buffersize);
int* testBuffer = _testBuffer.data();
std::fill(testBuffer, testBuffer + buffersize, 0);
auto tstart1 = std::chrono::high_resolution_clock::now();
for (int i = 0; i < iterate; i++) {
for (int j = 0; j < buffersize; j++) {
testBuffer[j] = 0;
testBuffer[j] = testBuffer[j] + 1;
}
}
auto tend1 = std::chrono::high_resolution_clock::now();
*rawtime = (int)std::chrono::duration_cast<std::chrono::microseconds>(tend1 - tstart1).count();
cv::AutoBuffer<int> testBuffer2 = cv::AutoBuffer<int>(buffersize);
std::fill(testBuffer2.data(), testBuffer2.data() + buffersize, 0);
auto tstart2 = std::chrono::high_resolution_clock::now();
for (int i = 0; i < iterate; i++) {
for (int j = 0; j < buffersize; j++) {
testBuffer2.data()[j] = 0;
testBuffer2.data()[j] = testBuffer2.data()[j] + 1;
}
}
auto tend2 = std::chrono::high_resolution_clock::now();
*norawtime = (int)std::chrono::duration_cast<std::chrono::microseconds>(tend2 - tstart2).count();
}
These were the results in Debug and Release mode respectively, with the numbers in microseconds, iterating 50000 times, with a buffer size of 50000.
There is a large difference between the use of RAWs and without, so much so that I am wondering if this is a compiler optimisation effect we are looking at.
If this is so, then the use of RAW pointers may be easier for the compiler to optimise, hence the speed difference we are seeing.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried again using XCode on iOS, and here are the new values
This time the release numbers are looking more like expected. Once again, compiler differences? I am open to suggestions and have kept both versions on my system. If you think that it's better to go completely without RAWs, then I will upload the new version. Do let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Release difference is about 0.1% - measurement accuracy. I would say we don't loose performance here.
Debug difference is expected - .data()
method and other functions are not inlined by default, more checks are involved to validate code assumptions (e.g, CV_DbgAssert). Usually debug builds are not tracked for performance.
It is better to replace RAW pointers from code safety/security perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have replaced the RAW pointers in the code except for labBuffer since this is taken from a Mat.data rather than an Autobuffer. This should be safe since the lifetime of the pointer is short (only used in OP1 and read just before invocation), and is read-only for operator values.
bug fixes
C++ 11 lambdas used instead of cv::ParallelLoopBody
changed neighbours location buffer to array
remove whitespace
RAW pointers removed
bug fixes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed update with fixed coding style, added smoke test.
Please take a look on the comments below.
modules/ximgproc/src/scansegment.cpp
Outdated
void ScanSegmentImpl::OP2(std::pair<int, int> const& p) | ||
{ | ||
std::pair<int, int>& q = const_cast<std::pair<int, int>&>(p); | ||
for (int i = q.first; i < q.second; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this alias?
std::pair<int, int>& q = const_cast<std::pair<int, int>&>(p);
The same note is about OP4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My apologies. This was some old code that got carried over. Have updated it.
@param image_width Image width. | ||
@param image_height Image height. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, prefer to use Size image_size
instead of 2 dedicated values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of this also, but in the end I went with following the pattern in createSuperpixelSEEDS which used two dedicated values. If you prefer cv::Size, let me know and I will change.
existing superpixel segmentation methods. When tested on the Berkeley Segmentation Dataset, the average processing speed is 175 frames/s | ||
with a Boundary Recall of 0.797 and an Achievable Segmentation Accuracy of 0.944. The computational complexity is quadratic O(n2) and | ||
more suited to smaller images, but can still process a 2MP colour image faster than the SEEDS algorithm in OpenCV. The output is deterministic | ||
when the number of processing threads is fixed, and requires the source image to be in Lab colour format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add @cite loke2021accelerated
in this documentation section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added.
added citation
bug fixes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job! Thank you for contribution 👍
My pleasure 😊
SC
|
Implementation of a new superpixel algorithm, "Accelerated superpixel image segmentation with a parallelized DBSCAN algorithm".
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.