Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connectedcomponents submission #135

Merged
merged 17 commits into from
Jan 10, 2013
Merged

connectedcomponents submission #135

merged 17 commits into from
Jan 10, 2013

Conversation

nevion
Copy link

@nevion nevion commented Nov 5, 2012

Hi,

This is a git pull request for the following issue (that has went on way too long!). I've fixed up the Dustin's complaints... let's get this code out there!

http://code.opencv.org/issues/1236

There are COUNTLESS threads on how to do connectedcomponents almost monthly submissions to the ML. Stackoverflow alone must have 20+ threads on this. Most end up using cvblobslib which is often broken and too slow to boot. There are some other implementations too and it's possible to do it opencv with findContours but it's really a bad situation atm. People actually are rolling their own implementations still (naively).

Anyway this presents an easy to use single function call which gives the labeled image and optionally the statistics one often wants with connected components (bounding box, centroid, area) and does so with better worst case and average case performance than all the other implementations.

I hope I'll have better luck at submission than through the issue tracker.

Jason Newton added 3 commits November 5, 2012 08:10
@ghost ghost assigned vpisarev Nov 7, 2012
@vpisarev
Copy link
Contributor

ok, let's finally put it in. However, I would like you to modify interface significantly and partly modify implementation. Since the current trend in OpenCV is to introduce very stable API and do not change it for a long time, I want the API to be future-proof, and not only at the source code level, but also at binary level.

Why we did not integrate the patch earlier in the first place? Because connected components is quite generic thing. They can be extracted from binary (bi-level) images, they can be extracted from grayscale or even multi-channel images if we define the predicate (close(p, q)) for the neighbor pixels. On the output side, representation of connected components can be very different too, they can be represented as contours, as sets of horizontal/vertical line segments etc. The statistics that is computed in your implementation may be sufficient for some users, but insufficient for others, e.g. J. Matas et al in "Real-time scene text localization and recognition" consider different, in my opinion very useful characteristics like the perimeter, number of holes, number of zero-crossings, number of inflections etc.

So, we can not just "occupy" quite generic name "connected components" for something rather specific and not covering this area.

But I admit that something is better than nothing, and designing an ideal connected component extraction function may take a very long time.

So, let's put the current code in, but make the API more generic and more wrapper-friendly:

vvvvvvvvvvvvvvvvvvvvvvvv

CV_EXPORTS_W int connectedComponents(InputArray image, OutputArray labels, int connectivity = 8, int flags=0);

enum { CC_STAT_LEFT=0, CC_STAT_TOP=1, CC_STAT_WIDTH=2, CC_STAT_HEIGHT=3, CC_STAT_CX=4, CC_STAT_CY=5, CC_STAT_AREA=6, CC_STAT_INTEGRAL_X=7, CC_STAT_INTEGRAL_Y=8, ... };

CV_EXPORTS_W int connectedComponentsWithStats(InputArray image, OutputArray labels, OutputArray stats, int connectivity = 8, int flags=0);

^^^^^^^^^^^^^^^^^^^^^^^^^^^^

First of all, the use of uint64 should be eliminated since it's a weird type and its use here is not justified for any reasonable image size (2 billions of pixels is quite a big number). Then, input image should be made the first parameter, by our common convention and should be made InputArray, which was introduced in 2.4.0. "const Mat&" is implicitly converted to InputArray. The output label image should be made OutputArray. "Mat&" is implicitly converted to OutputArray. The output vector of cc statistics should be also wrapped into OutputArray. The structure ConnectedComponentStats should be removed as too specific and not wrapper-friendly. For Python you provided wrappers, but what about Java? Also, for example, quite a few people want Matlab & Octave interface for OpenCV, and each extra structure means that we will have to provide such a conversion function. Instead, I suggest to pack all the statistics into array of integers. The centroid can be represented in fixed-point format.

Mat labels, stats;
connectedComponents(my_image, labels, stats);
Point2f center_i(stats.at(i,CC_STAT_CX)/256.f, stats.at(i,CC_STAT_CY)/256.f);

I would also add the label of the component in the "default" statistics, because if we remove (filter off) some of the connected components, e.g. too small ones, the ordering will be lost.

flags parameter will regulate what exactly needs to be computed. 0 means your statistics, e.g. CC_STAT_MATAS_MODE could mean adding some more stuff etc.


Now, on the implementation. It's a lot of code in OpenCV already, and we try to do our best to keep functionality/footprint ratio reasonable (when possible, we try to increase it).

The proposed patch implements connected components extraction for binary image, which is 8-bit single-channel image (CV_8UC1) in 99% cases. If not, it can be converted to this format with a single line of code:

connectedComponents((my_32f_image > eps), labels, stats);

So, it's enough to support CV_8UC1 input. Similarly, it's enough to constrain the output image of labels to 32-bit format (CV_32S). We support only this format in our labeled distance transform and watershed transforms functions and did not hear a single complain. So, let's bring the code size down and make the function non-template 8u->32s code. I'm sure we will find a better use of the space (especially on mobile devices).


One final point. For the code that can be tested (i.e. except for camera capturing, GUI stuff) we now demand some tests, at least smoke tests. The sample you prepared is great, but we can not run it nightly in batch mode. So we will need some tests for these connected component extraction functions. You can look at https://github.com/Itseez/opencv/blob/master/modules/imgproc/test/test_watershed.cpp for a relevant test. It just reads some image, runs the function and compares the output with the previously stored one.

What do you think? If you are fine with the proposed changes and willing to do that and update the pull request, I will merge it in.

@nevion
Copy link
Author

nevion commented Nov 23, 2012

This is indeed alot of changes... see my responses inline.

On Thu, Nov 22, 2012 at 10:52 AM, Vadim Pisarevsky <notifications@github.com

wrote:

ok, let's finally put it in. However, I would like you to make a few
significantly modify interface and partly modify implementation. Since the
current trend in OpenCV is to introduce very stable API and do not change
it for a long time, I want the API to be future-proof, and not only at the
source code level, but also at binary level.

Why we did not integrate the patch earlier in the first place? Because
connected components is quite generic thing. They can be extracted from
binary (bi-level) images, they can be extracted from grayscale or even
multi-channel images if we define the predicate (close(p, q)) for the
neighbor pixels. On the output side, representation of connected components
can be very different too, they can be represented as contours, as sets of
horizontal/vertical line segments etc. The statistics that is computed in
your implementation may be sufficient for some users, but insufficient for
others, e.g. J. Matas et al in "Real-time scene text localization and
recognition" consider different, in my opinion very useful characteristics
like the perimeter, number of holes, number of zero-crossings, number of
inflections etc.

Managing a fast implementation of all of the statistics is a hard thing to
do. With templates, as I do in the code, you can redifine the functor used
for statistics computation and pay for what you use. But this is OpenCV
and it is not common to make so much of the implementation available in
headers. Instead I just added the by-far most common statistics... sure
there will be people who want more.... unfortunately given the
difficulty/slownness of providing that, they will just have to do a 3rd
pass over the image, which is all the statistics computation saves you from
doing. Or make the template implementation available.

So, we can not just "occupy" quite generic name "connected components" for
something rather specific and not covering this area.

But I admit that something is better than nothing, and designing an ideal
connected component extraction function may take a very long time.

I consider the dense matrix representation (ie an image) to be the fastest
and most useful representation of the result. I also believe that the
required input is binary in nature (but type independent) and other than
optimization does not make sense to define for multi-channel images.
However I think it is possible to add multi-channel support at the expense
of complicating the implementation further.

So, let's put the current code in, but make the API more generic and more
wrapper-friendly:

vvvvvvvvvvvvvvvvvvvvvvvv

CV_EXPORTS_W int connectedComponents(InputArray image, OutputArray labels,
int connectivity = 8, int flags=0);

enum { CC_STAT_LEFT=0, CC_STAT_TOP=1, CC_STAT_WIDTH=2, CC_STAT_HEIGHT=3,
CC_STAT_CX=4, CC_STAT_CY=5, CC_STAT_AREA=6, CC_STAT_INTEGRAL_X=7,
CC_STAT_INTEGRAL_Y=8, ... };

CV_EXPORTS_W int connectedComponentsWithStats(InputArray image,
OutputArray labels, OutputArray stats, int connectivity = 8, int flags=0);

^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I wasn't aware of this, it's fine, though at this point, annoying change.
Kudos for having a good/easy way to support all those other
languages/environments.

First of all, the use of uint64 should be eliminated since it's a weird

type and its use here is not justified for any reasonable image size (2
billions of pixels is quite a big number)

You may scoff but I've gotten uncomfortably close to the limits of uint32
for labels. I've also gotten surprises from Xlib only supporting images of
2^15 or 16th in dimension before. Leave it and be future proof for all
those nasa or panoramic types, hm? 64GB of ram is only a few hundred
dollars these days.

Then, input image should be made the first parameter, by our common
convention and should be made InputArray, which was introduced in 2.4.0.
"const Mat&" is implicitly converted to InputArray. The output label image
should be made OutputArray. "Mat&" is implicitly converted to OutputArray.
The output vector of cc statistics should be also wrapped into OutputArray.
The structure ConnectedComponentStats should be removed as too specific and
not wrapper-friendly. For Python you provided wrappers, but what about
Java? Also, for example, quite a few people want Matlab & Octave interface
for OpenCV, and each extra structure means that we will have to provide
such a conversion function. Instead, I suggest to pack all the statistics
into array of integers. The centroid can be represent ed in fixed-point
format.

The centroid computations are cheap since the number of divisions is the
number of components. If you're going to convert them to floats at some
point might as well just do it once?

Mat labels, stats;

connectedComponents(my_image, labels, stats);
Point2f center_i(stats.at(i,CC_STAT_CX)/256.f, stats.at
(i,CC_STAT_CY)/256.f);

I would also add the label of the component in the "default" statistics,
because if we remove (filter off) some of the connected components, e.g.
too small ones, the ordering will be lost.

Not needed. Their position in the array tells you the label. If you
reduce the array by filtering, you should keep track of it yourself with a
tuple/pair object or extra column/vector. This is something I encounter
all over the place in unrelated context and code and that's the way I've
dealt with it. I don't believe that approach will slow one down one by any
meaningful amount of time.

flags parameter will regulate what exactly needs to be computed. 0 means

your statistics, e.g. CC_STAT_MATAS_MODE could mean adding some more stuff
etc.

Predefined groups of statistics are the only way to really manage the
complexity of implementation though it still leaves custom statistics out
in the cold for that 3rd pass. Having thought about matas's statistics I'm
not sure if they could or should be implemented in the same way though...
What I currently compute is done in the relabeling step (2nd pass) and the
final result is not actually finished yet (the pixel after the current one
used in statistics computation is not yet labeled with the final label).
So some of these statistics may best run in a 3rd pass if it depends on a
neighborhood or is otherwise not causal. Either way this is looking to be
a relatively large add-on effort. These things are probably best put into
an auxiliary function using a 3rd pass where as the statistics I have
computed can be computed in the 2nd pass efficiently.


Now, on the implementation. It's a lot of code in OpenCV already, and we
try to do our best to keep functionality/footprint ratio reasonable (when
possible, we try to increase it).

The proposed patch implements connected components extraction for binary
image, which is 8-bit single-channel image (CV_8UC1) in 99% cases. If not,
it can be converted to this format with a single line of code:

connectedComponents((my_32f_image > eps), labels, stats);

So, it's enough to support CV_8UC1 input. Similarly, it's enough to
constrain the output image of labels to 32-bit format (CV_32S). We support
only this format in our labeled distance transform and watershed transforms
functions and did not hear a single complain. So, let's bring the code size
down and make the function non-template 8u->32s code. I'm sure we will find
a better use of the space (especially on mobile devices).

Input is bi-level, that is established. However output depends on the size
of the images or characteristics of the input and proper output choice
matters for speed. For some images uint8 is enough, others uint16, and
others more uint32... and finally, only because it is the next largest
size, uint64 for those working on extremely large images (signed types make
no sense here).

Further in-place threshholding a float without new image being constructed
and set is still faster... that's what this algorithm/implementation is all
about, getting it as fast as possible and giving the user enough power and
flexibility to shoot themselves in the foot (output-types) but also shave
time off and get real-time interactive output and reduce memory
requirements / cache thrashing.

So I think it is unfortunate and true that there is some unneeded code wrt
an individuals program, but it's not worth the sacrifice. Too bad strip
--strip-unneeded doesn't work here, right?


One final point. For the code that can be tested (i.e. except for camera
capturing, GUI stuff) we now demand some tests, at least smoke tests. The
sample you prepared is great, but we can not run it nightly in batch mode.
So we will need some tests for this connected component extraction
functions. You can look at
https://github.com/Itseez/opencv/blob/master/modules/imgproc/test/test_watershed.cppfor a similar function. It just reads some image, runs the function and
compares the output with the previously stored one.

I'll do that.

What do you think? If you are fine with the proposed changes and willing
to do that and update the pull request, I will merge it in.

I'm extremely happy someone finally talked with me and your email was
thorough. I wish it came earlier and maybe in smaller chucks.... I don't
have much time for FOSS contributions these days. .. I've been doing
extensive in the field (air) testing of vision systems every month or so
for the last 5 months so it's been a crunch time and then some leaving me
with little time for anything else.

I should be able to get most of these done hopefully before the weekend is
over but no promises... I was already working on occupational projects.
The things we have dis-agreed on will need re-evaluation though.

Thank you and best regards,
Jason

Reply to this email directly or view it on GitHubhttps://github.com//pull/135#issuecomment-10642632.

@nevion
Copy link
Author

nevion commented Nov 23, 2012

Actually, I'll concede on the return type of uint64. I don't like using a
signed type for the return type given a large enough image though. One can
recover the actual number by casting again back to unsigned but this should
be an unsigned type as it is the count of something. Any final thoughts
on this? I don't suppose opencv will ever add in a uint32...

After seeing the 360k gcc outputs for this binary after stripping I took
the size reduction a little more seriously. I cut the input variations
down to just uint8 which reduced it to 90k, a bit more on par with the
other functions in imgproc. I don't like doing this that much but at least
it's only a few lines to add in support for these types... at the expense
of a private build.

On Thu, Nov 22, 2012 at 6:40 PM, Jason Newton nevion@gmail.com wrote:

This is indeed alot of changes... see my responses inline.

On Thu, Nov 22, 2012 at 10:52 AM, Vadim Pisarevsky <
notifications@github.com> wrote:

ok, let's finally put it in. However, I would like you to make a few
significantly modify interface and partly modify implementation. Since the
current trend in OpenCV is to introduce very stable API and do not change
it for a long time, I want the API to be future-proof, and not only at the
source code level, but also at binary level.

Why we did not integrate the patch earlier in the first place? Because
connected components is quite generic thing. They can be extracted from
binary (bi-level) images, they can be extracted from grayscale or even
multi-channel images if we define the predicate (close(p, q)) for the
neighbor pixels. On the output side, representation of connected components
can be very different too, they can be represented as contours, as sets of
horizontal/vertical line segments etc. The statistics that is computed in
your implementation may be sufficient for some users, but insufficient for
others, e.g. J. Matas et al in "Real-time scene text localization and
recognition" consider different, in my opinion very useful characteristics
like the perimeter, number of holes, number of zero-crossings, number of
inflections etc.

Managing a fast implementation of all of the statistics is a hard thing to
do. With templates, as I do in the code, you can redifine the functor used
for statistics computation and pay for what you use. But this is OpenCV
and it is not common to make so much of the implementation available in
headers. Instead I just added the by-far most common statistics... sure
there will be people who want more.... unfortunately given the
difficulty/slownness of providing that, they will just have to do a 3rd
pass over the image, which is all the statistics computation saves you from
doing. Or make the template implementation available.

So, we can not just "occupy" quite generic name "connected components"
for something rather specific and not covering this area.

But I admit that something is better than nothing, and designing an ideal
connected component extraction function may take a very long time.

I consider the dense matrix representation (ie an image) to be the fastest
and most useful representation of the result. I also believe that the
required input is binary in nature (but type independent) and other than
optimization does not make sense to define for multi-channel images.
However I think it is possible to add multi-channel support at the expense
of complicating the implementation further.

So, let's put the current code in, but make the API more generic and more
wrapper-friendly:

vvvvvvvvvvvvvvvvvvvvvvvv

CV_EXPORTS_W int connectedComponents(InputArray image, OutputArray
labels, int connectivity = 8, int flags=0);

enum { CC_STAT_LEFT=0, CC_STAT_TOP=1, CC_STAT_WIDTH=2, CC_STAT_HEIGHT=3,
CC_STAT_CX=4, CC_STAT_CY=5, CC_STAT_AREA=6, CC_STAT_INTEGRAL_X=7,
CC_STAT_INTEGRAL_Y=8, ... };

CV_EXPORTS_W int connectedComponentsWithStats(InputArray image,
OutputArray labels, OutputArray stats, int connectivity = 8, int flags=0);

^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I wasn't aware of this, it's fine, though at this point, annoying change.
Kudos for having a good/easy way to support all those other
languages/environments.

First of all, the use of uint64 should be eliminated since it's a weird

type and its use here is not justified for any reasonable image size (2
billions of pixels is quite a big number)

You may scoff but I've gotten uncomfortably close to the limits of uint32
for labels. I've also gotten surprises from Xlib only supporting images of
2^15 or 16th in dimension before. Leave it and be future proof for all
those nasa or panoramic types, hm? 64GB of ram is only a few hundred
dollars these days.

Then, input image should be made the first parameter, by our common
convention and should be made InputArray, which was introduced in 2.4.0.
"const Mat&" is implicitly converted to InputArray. The output label image
should be made OutputArray. "Mat&" is implicitly converted to OutputArray.
The output vector of cc statistics should be also wrapped into OutputArray.
The structure ConnectedComponentStats should be removed as too specific and
not wrapper-friendly. For Python you provided wrappers, but what about
Java? Also, for example, quite a few people want Matlab & Octave interface
for OpenCV, and each extra structure means that we will have to provide
such a conversion function. Instead, I suggest to pack all the statistics
into array of integers. The centroid can be represent ed in fixed-point
format.

The centroid computations are cheap since the number of divisions is the
number of components. If you're going to convert them to floats at some
point might as well just do it once?

Mat labels, stats;

connectedComponents(my_image, labels, stats);
Point2f center_i(stats.at(i,CC_STAT_CX)/256.f, stats.at
(i,CC_STAT_CY)/256.f);

I would also add the label of the component in the "default" statistics,
because if we remove (filter off) some of the connected components, e.g.
too small ones, the ordering will be lost.

Not needed. Their position in the array tells you the label. If you
reduce the array by filtering, you should keep track of it yourself with a
tuple/pair object or extra column/vector. This is something I encounter
all over the place in unrelated context and code and that's the way I've
dealt with it. I don't believe that approach will slow one down one by any
meaningful amount of time.

flags parameter will regulate what exactly needs to be computed. 0 means

your statistics, e.g. CC_STAT_MATAS_MODE could mean adding some more stuff
etc.

Predefined groups of statistics are the only way to really manage the
complexity of implementation though it still leaves custom statistics out
in the cold for that 3rd pass. Having thought about matas's statistics I'm
not sure if they could or should be implemented in the same way though...
What I currently compute is done in the relabeling step (2nd pass) and the
final result is not actually finished yet (the pixel after the current one
used in statistics computation is not yet labeled with the final label).
So some of these statistics may best run in a 3rd pass if it depends on a
neighborhood or is otherwise not causal. Either way this is looking to be
a relatively large add-on effort. These things are probably best put into
an auxiliary function using a 3rd pass where as the statistics I have
computed can be computed in the 2nd pass efficiently.


Now, on the implementation. It's a lot of code in OpenCV already, and we
try to do our best to keep functionality/footprint ratio reasonable (when
possible, we try to increase it).

The proposed patch implements connected components extraction for binary
image, which is 8-bit single-channel image (CV_8UC1) in 99% cases. If not,
it can be converted to this format with a single line of code:

connectedComponents((my_32f_image > eps), labels, stats);

So, it's enough to support CV_8UC1 input. Similarly, it's enough to
constrain the output image of labels to 32-bit format (CV_32S). We support
only this format in our labeled distance transform and watershed transforms
functions and did not hear a single complain. So, let's bring the code size
down and make the function non-template 8u->32s code. I'm sure we will find
a better use of the space (especially on mobile devices).

Input is bi-level, that is established. However output depends on the
size of the images or characteristics of the input and proper output choice
matters for speed. For some images uint8 is enough, others uint16, and
others more uint32... and finally, only because it is the next largest
size, uint64 for those working on extremely large images (signed types make
no sense here).

Further in-place threshholding a float without new image being constructed
and set is still faster... that's what this algorithm/implementation is all
about, getting it as fast as possible and giving the user enough power and
flexibility to shoot themselves in the foot (output-types) but also shave
time off and get real-time interactive output and reduce memory
requirements / cache thrashing.

So I think it is unfortunate and true that there is some unneeded code wrt
an individuals program, but it's not worth the sacrifice. Too bad strip
--strip-unneeded doesn't work here, right?


One final point. For the code that can be tested (i.e. except for camera
capturing, GUI stuff) we now demand some tests, at least smoke tests. The
sample you prepared is great, but we can not run it nightly in batch mode.
So we will need some tests for this connected component extraction
functions. You can look at
https://github.com/Itseez/opencv/blob/master/modules/imgproc/test/test_watershed.cppfor a similar function. It just reads some image, runs the function and
compares the output with the previously stored one.

I'll do that.

What do you think? If you are fine with the proposed changes and
willing to do that and update the pull request, I will merge it in.

I'm extremely happy someone finally talked with me and your email was
thorough. I wish it came earlier and maybe in smaller chucks.... I don't
have much time for FOSS contributions these days. .. I've been doing
extensive in the field (air) testing of vision systems every month or so
for the last 5 months so it's been a crunch time and then some leaving me
with little time for anything else.

I should be able to get most of these done hopefully before the weekend is
over but no promises... I was already working on occupational projects.
The things we have dis-agreed on will need re-evaluation though.

Thank you and best regards,
Jason

Reply to this email directly or view it on GitHubhttps://github.com//pull/135#issuecomment-10642632.

…but this breaks python bindings;

remove non-8bit input type support, not worth the binary size
@vpisarev
Copy link
Contributor

Jason,
no need to implement Matas etc. props, we just need to leave space for it in the API. Actually, I now think of going even further to make it an "Algorithm":

class CComp : public Algorithm
{
public:
...
int label(InputArray img, OutputArray labels) const;
int labelAndGetStat(InputArray img, OutputArray labels, OutputArray stats) const;
};

and everything else can be made properties, so it's super-extendable:

Ptr ccomp = Algorithm::create("connected-component-extractor" /* or use some better name? */ );
// change parameters if the default values are not good enough
ccomp->setInt("connectivity", 4);
ccomp->setString("mode", "Matas");
ccomp->setInt("outputFormat", CV_64F);
// run the algorithm
ccomp->label(img, labels, stats);

I understand that:

  1. you may not have enough time to modify the code and those InputArray/OutputArray/Algorithm concepts is something extra that you do not spend time on.
  2. whatever changes we propose, the end result should work for you, otherwise it would be very silly situation - contribute something, spend time on it and get unusable result.

so I suggest to do it together. If you make a branch in your repository and give me access to it, I can help to adjust the code. Or I can submit a pull request to your repository at github. Is it fine?

On the various data type support. uint32 is actually needed very rarely, as we learnt from our experience and users' feedback. If int32 is not enough (we prefer to call it int, since on any known for us 32-bit or 64-bit OS sizeof(int)==4), we found double to be the best alternative. It's supported in hardware anywhere nowadays, from most low-power ARMs to the modern Intel chips with AVX instructions (that can process 4 double's at once). double can represent any integer <=2*53 exactly, it takes the same space as [u]int64, it's usually faster to process than 64-bit integers on 32-bit CPUs and about as fast on 64-bit CPUs (actually, it's faster there too if we take into account SSE2/AVX that have a very limited support for 64-bit integers). So, uint32, int64/uint64 are useless types in my opinion. (BTW, if "int" as return type is not enough for connectedComponent functions, I would suggest to use "double"). And if we have some spare resources to add another datatype to OpenCV's Mat, I would immediately choose float16 (half).

@vpisarev
Copy link
Contributor

oh, I forgot to add that another reason to make it an Algorithm is to add [later, if not now] some other useful properties, e.g.
ccomp->setInt("minArea", 10);
if a connected component is too small, the algorithm could wipe it out (or just do to store the statistics for it, but for that we need the "label" component in each "stat" row). Filtering out tiny connected components is very useful feature, sometimes it's the only purpose of using connected component function (like in our stereo correspondence algorithms)

-Change input/output order from (out Labeled, in Image) -> (in Image, Out Labeled) and convert
to Input/OutputArrays in the process.

-Adopt OutputArray for statistics export so that the algorithm is "wrapper friendly" and not requiring a new struct in
language bindings at the expense of using doubles for everything and slowing statistics computation down..
@nevion
Copy link
Author

nevion commented Nov 28, 2012

Hi Vadim,

I've made alot of the changes we've discussed... I'll go through the
checklist again later but the last thing before the introduction of the
algorithm functor is the autobuild friendly test.

On the stats thing, I made it an OutputArray of Ints (using the types as
unsigned int internally). This doesn't really fit in with the centroid
though even though it is easily computable from these ints. I don't think
fixed point makes it any easier to deal with and that it throws away
precision is just more mud in the face.

So 1) the fixed point way is:
Point2f center_i(stats.at http://stats.at(l,CC_STAT_CX)/256.f,
stats.at http://stats.at(l,CC_STAT_CY)/256.f);
2) The compute as needed way:
Point2f center_i(stats.at http://stats.at(l,CC_STAT_INTEGRAL_X)/
float(stats.at http://stats.at(l,CC_STAT_AREA)),
stats.athttp://stats.at
(l,CC_STAT_INTEGRAL_Y)//float(stats.at http://stats.at
(l,CC_STAT_AREA));
OR
Point2f center(Vec2f(stats.at http://stats.at(l,CC_STAT_INTEGRAL_X)),
stats.at http://stats.at(l,CC_STAT_INTEGRAL_Y))/stats.athttp://stats.at
(l,CC_STAT_AREA));

  1. The doubles (slower) approach:
    Point2f center_i(stats.at http://stats.at(l,CC_STAT_CX),
    stats.at http://stats.at(l,CC_STAT_CY));

I think the compute as needed way (2) is the best approach but would be
often repeated and is a fairly long line. It also assumes users will
recognize the integrals and areas to get centroids. Any suggestions on
dealing with this so it is more friendly to users? Perhaps a
mention/example in documentation is the only way to go?

I'm also a little worried about the integrals as well. Imagine a trivial
all-1 bi-level of 8192x8192, this would have an integral on x and y
equating to (2^13)_(2^13-1)/2_2^13 or 2^38. Well exceeding the int's
capabilities. The only solutions here are:
1a) Don't export the integral and use a private workspace for it (with
uint64 or double), export results as fixed point integers
1b) Same as 1 but export it as an additional OutputArray, as double for
centroids
2) Use a struct as originally done
3) use all doubles

Please let me know your thoughts on the best solution. I'm thinking if we
want to guard against these perfectly possible issues, either internal
integrals of uint64 or structs is the way to go. The former still would
require either fixed point or an additional matrix for centroids.

I suppose I wouldn't mind seeing the following signature, as a user:

CV_EXPORTS_W int connectedComponentsWithStats(InputArray image, OutputArray
labels, OutputArray stats, OutputArray centroids, int connectivity = 8);


On the topic of the functor... I'm a little worried about bloat. I'm not
saying such an object wouldn't be useful to quickly chain a vision stack
together but it seems like it could get to be pretty large rather than the
simple stand-alone utility I originally envisioned.

Perhaps both should be provided... and maybe not just for this function,
but for all of these constructs of image processing. I would say they are
beyond my scope of use though, personally. Its a little reminiscent of ITK
(although they force their iterator concepts upon everyone and are a
heavily templated library).

On Fri, Nov 23, 2012 at 12:44 AM, Vadim Pisarevsky <notifications@github.com

wrote:

oh, I forgot to add that another reason to make it an Algorithm is to add
[later, if not now] some other useful properties, e.g.
ccomp->setInt("minArea", 10);
if a connected component is too small, the algorithm could wipe it out (or
just do to store the statistics for it, but for that we need the "label"
component in each "stat" row). Filtering out tiny connected components is
very useful feature, sometimes it's the only purpose of using connected
component function (like in our stereo correspondence algorithms)


Reply to this email directly or view it on GitHubhttps://github.com//pull/135#issuecomment-10653087.

@vpisarev
Copy link
Contributor

Hi Jason!
good progress, it's converging.
Some comments:

  1. First of all, the output arrays are not created in your code, instead, you use the existing type to dispatch to the right function. This does not conform to the current OpenCV guidelines. The output arrays can empty or have wrong size or type, and the function should work correctly nevertheless. In fact, it should ignore the current size and type of the output image. If the output image type can be different, it should be explicitly passed to the function. Also, do we really need 8u output type for labels? It's 1/3 of the code size and seem quite impractical. Still, if you want to support 8u, 16u and 32u/s, you need to add another output parameter:

int connectedComponents(InputArray _img, OutputArray _labels, int connectivity=8, int ltype=CV_32S)
{
Mat img = _img.getMat();
_labels.create(img.size(), CV_MAT_TYPE(ltype));
Mat labels = _labels.getMat();
// ... call internal functions.
}

  1. the related note - the internal functions should probably take [const] Mat& instead of InputArrays, it's more efficient and safe.

  2. even though you already refactored your code, I would suggest to declare the ConnectedCompStat structure (or how you call it) to the .cpp and make the template functions return vector. The transformation vector<C...Stat> => OutputArray stats can be moved to a separate function, which can be non-template (it does not depend on the type of labels image, right?). Also, this transformation will help to keep the code more elegant and easier to handle various issues with accuracy etc. - they are postponed till the very end.

  3. on CC_STAT_CX, CC_STAT_CY - I did not quite understand what you mean, the formatting in your comment is broken.

  4. on CC_STAT_INTEGRAL_* - I did not quite understand it too, but for a different reason. I thought, the characteristics depends on the area and shape of the connected component, not on its position, and therefore 32-bit value is enough. Can you provide some information on what is CC_STAT_INTEGRAL_X/Y exactly?

@nevion
Copy link
Author

nevion commented Nov 29, 2012

Vadim,

Inlined comments.

On Wed, Nov 28, 2012 at 11:36 AM, Vadim Pisarevsky <notifications@github.com

wrote:

Hi Jason!
good progress, it's converging.
Some comments:

  1. First of all, the output arrays are not created in your code, instead,
    you use the existing type to dispatch to the right function. This does not
    conform to the current OpenCV guidelines. The output arrays can empty or
    have wrong size or type, and the function should work correctly
    nevertheless.

Fair enough. The reason it is like that is to allow reuse without realloc
of the output array. Does the allocator for output arrays optimize the case
of an existing sufficiently sized buffer? I also liked that it internally
passed the label type in. But guidelines are guidelines so this will
change to then.

In fact, it should ignore the current size and type of the output image.
If the output image type can be different, it should be explicitly passed
to the function. Also, do we really need 8u output type for labels? It's
1/3 of the code size and seem quite impractical. Still, if you want to
support 8u, 16u and 32u/s, you need to add another output parameter:

Yea that 8u is a funny one... it's too low for my use but I've seen/heard
people use it on QR/barcode readers and in situations where it is not safe
(label overflow) but knowingly do so because it is low probability and they
want the speed. I won't fight for it so I'll leave the decision up to you.

int connectedComponents(InputArray _img, OutputArray _labels, int
connectivity=8, int ltype=CV_32S)
{
Mat img = _img.getMat();
_labels.create(img.size(), CV_MAT_TYPE(ltype));
Mat labels = _labels.getMat();
// ... call internal functions.
}

  1. the related note - the internal functions should probably take [const]
    Mat& instead of InputArrays, it's more efficient and safe.

Ok.

  1. even though you already refactored your code, I would suggest to
    declare the ConnectedCompStat structure (or how you call it) to the .cpp
    and make the template functions return vector. The transformation vector =>
    OutputArray stats can be moved to a separate function, which can be
    non-template (it does not depend on the type of labels image, right?).
    Also, this transformation will help to keep the code more elegant and
    easier to handle various issues with accuracy etc. - they are postponed
    till the very end.

Ok. This also means I won't output the integral stats, they are normally
useless anyway and have range issues with everything less than
double/int64. I may also only do this with the integrals if the stats
output array stays int.

  1. on CC_STAT_CX, CC_STAT_CY - I did not quite understand what you mean,
    the formatting in your comment is broken.

Hm, maybe because I compose on gmail? Boils down to centroids not being
OutputArray friendly with the other parameters unless everything is doubles
or centroids are a separate OutputArray. I think a separate array is
fastest computation time (next to fixed point) and easiest to use (no need
to stick a strange divisor everywhere when using). Otherwise fixed point
and one output array with the divisor though I can see the newbies and
academics' frustrated faces already.

  1. on CC_STAT_INTEGRAL_* - I did not quite understand it too, but for a
    different reason. I thought, the characteristics depends on the area and
    shape of the connected component, not on its position, and therefore 32-bit
    value is enough. Can you provide some information on what is
    CC_STAT_INTEGRAL_X/Y exactly?

The integral is computed for centroid computation. It is the sum of
positions a component is present in and it grows cubically for a square
image where n is the square's dimension. It overflows easily as a result.
Divide this 2-vector by the area and you get centroid (average position).
If the image is effectively one giant connected component (or close to it)
you can see that this would give you an unexpected result easily.

Reply to this email directly or view it on GitHubhttps://github.com//pull/135#issuecomment-10817621.

@vpisarev
Copy link
Contributor

vpisarev commented Dec 1, 2012

ok, so if the ConnectedComp structure is used for intermediate representation (where integrals can be encoded as double's or 64-bit integers) then the output array of statistics can be without them, because they do not much sense for the users.

On 8-bit output label array - let's throw it away and keep just CV_16U and CV_32S options.

CX and CY representation - well, need to think. Personally I do not think that they should be stored with very high accuracy - any noise pixel added to or removed from the component may affect them. I would say, 1/256 should be enough, but I admit that I do not know all the possible use cases. Splitting the output array into 2 - I would say, this is the least convenient option.

Hey, what about yet another option - encode each of CX and CY as a pair of integers?
stat[i][CC_STAT_CX] would store the rounded-to-nearest x-coordinate of the centroid and stat[i][CC_STAT_FX] would store the signed fractional part, multiplied by (2**31)? We can add a constant:

#define CV_CC_STAT_CENTROID_SCALE 4.656612873077393e-10 /* 1/(2**31) */

so that the actual centroid can be computed as stat[i][CC_STAT_CX]+stat[i][CC_STAT_FX]*CV_CC_STAT_CENTROID_SCALE.

It will give significantly better accuracy than single-precision floating-point number. We can add inline function

inline Point2f getCCompCentroid(const int* stat) { return Point2f(stat[CC_STAT_CX]+..., stat[CC_STAT_CY]+...); }

which can be used as

vector<Vec<int, 9> > ccomp;
Mat img, labels;
getConnectedComponentsAndStats(img, labels, ccomp, 8);
for( size_t i = 0; i < ccomp.size(); i++ )
{
Point2f c = getCCompCentroid(&ccomp[i][0]);
}

those who only need very rough position of the centroid can just use Point(stat[CC_STAT_CX], stat[CC_STAT_CY]).

What do you think?

@nevion
Copy link
Author

nevion commented Dec 2, 2012

On Sat, Dec 1, 2012 at 9:16 AM, Vadim Pisarevsky
notifications@github.comwrote:

ok, so if the ConnectedComp structure is used for intermediate
representation (where integrals can be encoded as double's or 64-bit
integers) then the output array of statistics can be without them, because
they do not much sense for the users.

On 8-bit output label array - let's throw it away and keep just CV_16U and
CV_32S options.

Ok.

CX and CY representation - well, need to think. Personally I do not think
that they should be stored with very high accuracy - any noise pixel added
to or removed from the component may affect them. I would say, 1/256 should
be enough, but I admit that I do not know all the possible use cases.
Splitting the output array into 2 - I would say, this is the least
convenient option.

Hey, what about yet another option - encode each of CX and CY as a pair of
integers?
stat[i][CC_STAT_CX] would store the rounded-to-nearest x-coordinate of the
centroid and stat[i][CC_STAT_FX] would store the signed fractional part,
multiplied by (2**31)? We can add a constant:

#define CV_CC_STAT_CENTROID_SCALE 4.656612873077393e-10 /* 1/(2**31) */

so that the actual centroid can be computed as
stat[i][CC_STAT_CX]+stat[i][CC_STAT_FX]*CV_CC_STAT_CENTROID_SCALE.

It will give significantly better accuracy than single-precision
floating-point number. We can add inline function

inline Point2f getCCompCentroid(const int* stat) { return
Point2f(stat[CC_STAT_CX]+..., stat[CC_STAT_CY]+...); }

which can be used as

vector > ccomp;
Mat img, labels;
getConnectedComponentsAndStats(img, labels, ccomp, 8);
for( size_t i = 0; i < ccomp.size(); i++ )
{
Point2f c = getCCompCentroid(&ccomp[i][0]);
}

those who only need very rough position of the centroid can just use
Point(stat[CC_STAT_CX], stat[CC_STAT_CY]).

What do you think?

I think fixed point is more ugly than a separate array and floating point
(double precision). The macro wouldn't be usable in other languages too,
unless they're exported. The function would be. One more alternative is
to split the double into lower and higher halfs and stick the bits in
integer memory. I've heard that some architectures don't like floats using
arbitrary memory or something to that affect. In other languages again
though this could be a large annoyance at interpreting the memory correctly
however your function solution still works and hides whatever complexity
internally.

Having said that, I think the stand alone function is just as ugly as
another output array though.. if not uglier. Please mull it over a little
more.

-Jason


Reply to this email directly or view it on GitHubhttps://github.com//pull/135#issuecomment-10919393.

}
Mat labelImage(img.size(), CV_32S);
int nLabels = connectedComponents(bw, labelImage, 8);
Vec3b colors[nLabels];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an error with MSVC9 (VS2008) since nLabels is not a constant; use new and delete instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't const int fix it?

On Mon, Dec 3, 2012 at 10:49 AM, joshdoe notifications@github.com wrote:

In samples/cpp/connected_components.cpp:

- Mat dst = Mat::zeros(img.size(), CV_8UC3);

  • if( !contours.empty() && !hierarchy.empty() )
  • {
  •    // iterate through all the top-level contours,
    
  •    // draw each connected component with its own random color
    
  •    int idx = 0;
    
  •    for( ; idx >= 0; idx = hierarchy[idx][0] )
    
  •    {
    
  •        Scalar color( (rand()&255), (rand()&255), (rand()&255) );
    
  •        drawContours( dst, contours, idx, color, CV_FILLED, 8, hierarchy );
    
  •    }
    
  • Mat labelImage(img.size(), CV_32S);
  • int nLabels = connectedComponents(bw, labelImage, 8);
  • Vec3b colors[nLabels];

This is an error with MSVC9 (VS2008) since nLabels is not a constant; use
new and delete instead.


Reply to this email directly or view it on GitHubhttps://github.com//pull/135/files#r2296401.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"const int" will not help. Using run-time, not a compile-time value to specify the array size is GCC extension to C/C++ standards. Use

vector colors(nLabels);

instead

@vpisarev
Copy link
Contributor

vpisarev commented Dec 4, 2012

ok, let's output the connected component centroid in a separate output array of 2-channel floating-point type (CV_32FC2 or CV_64FC2 - at your choice).

@nevion
Copy link
Author

nevion commented Dec 10, 2012

Vadim,

I believe I've got everything you mentioned taken care of, including the test case which I have also forked from your branch and updated with my input/expected output for regression testing.

Let me know if there's anything missing still.

-Jason

@vpisarev
Copy link
Contributor

looks good now.
One thing to fix, though, is compile errors on Windows: http://pullrequest.opencv.org (I have to apologize for scrambled error messages; we are aware of this problem and working on it) and then we can probably integrate it and then do incremental improvements

typedef unsigned short uint16_t;
typedef int int32_t;
typedef unsigned int uint32_t;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To fix the Windows errors, at least with VC10, add typedef unsigned __int64 uint64_t;. However shouldn't this really be in core? Look at core\types_c.h, as uint64 is already defined here. Consider using uint64 etc. in your code, moving the necessary bits of this set of typedefs there.

@alekcac
Copy link
Contributor

alekcac commented Dec 12, 2012

Vadim, any changes?

@nevion
Copy link
Author

nevion commented Dec 12, 2012

I'll have the change(s) incorporate soon, I'm very busy with work at the
moment... even for this small change.

On Wed, Dec 12, 2012 at 1:52 AM, Alexander Shishkov <
notifications@github.com> wrote:

Vadim, any changes?


Reply to this email directly or view it on GitHubhttps://github.com//pull/135#issuecomment-11282732.

@nevion
Copy link
Author

nevion commented Dec 13, 2012

Hm, regarding these types I've added them to types_c.h... common programming lore states that the type of char is not guaranteed to be 8 bits though... and neither is int to be 32 (I've used one of these - a dsp platform with 16 bit ints and still using gcc 2.95 or some such). I'm not sure what to propose for older code but C++ 0x11/c99 guarantees stdint which provides integer types with guaranteed bit-depths.

With that said I made the change reusing existing typedefs so int8 - uint32 are now also defined rather than just int64 and updated connectedcomponents afterwords.

@vpisarev
Copy link
Contributor

Hi Jason,
such a change in type_c.h can potentially cause conflicts. I'm afraid, we can not accept such a change, sorry. If you need such definitions in .cpp, please, put the declarations there.

Otherwise, the patch looks fine. With one little note - please, take a look at http://pullrequest.opencv.org, your code produces some warnings for android; the code can only be merged if it builds without warnings. You can fix it or I can submit pull request to your repository to fix that, as you wish.

@nevion
Copy link
Author

nevion commented Dec 14, 2012

Vadim - yea I thought it was pretty controversial but somebody (joshdoe?) suggested to add such typedefs there. If I may ask where is the conflict you are concerned about and is this a reasonable judgment? I have received no compilation errors in regards to it and it just replicates the naming convention of u/int64 which is a sane and useful thing IMO. Or is this why they were originally missing? I can move them back but I do consider this a small step backwards..

I have been monitoring the builds since you originally provided me the URL, I figured there was such no merge unless ... conditions were in play. RE the android failing, wasn't able to decern why and I don't have an android environment setup... if it's not to much trouble could you submit a patch to fix the error there?

@vpisarev
Copy link
Contributor

Jason,
thanks for understanding and for the prompt reaction. I will try to fix the warnings.

@vpisarev
Copy link
Contributor

Jason, here is the patch that fixes build problems with VS and probably Android; let's see if it works; For some reason, I could not submit a pull request to your repository, so you need to apply the patch manually:

vpisarev@1eae455

basically, you can merge https://github.com/vpisarev/opencv.git, branch cc, to your master.

@nevion
Copy link
Author

nevion commented Dec 17, 2012

Vadim - I added your patch. Still build errors on android... something the compiler is having trouble with on Input/OutputArrays. On mac the regression test also fails with invalid test data... this only happens when the input matrix (loaded from a png file) is empty... ... perhaps the dataset repo isn't up to date on the mac test environment? This is just a warning however.

Btw can you take references of Input/OutputArrays? I noted you changed the CCStatsOp fields to be pointers but thought it was funny you didn't use references instead...

@vpisarev
Copy link
Contributor

Hi Jason,

  1. regarding the tests - can you, please, send me the two files you use for the regression tests? I will put them to opencv_extra. Without the test data the tests will always pass (unless the test will crash somewhere inside the function).
  2. on the android build - I'm still trying to figure out why it fails.
  3. on Input/OutputArray. using references as members of C++ classes is bad, because it issues the correct warning that the assignment operator and copy constructor can not be generated. Instead of suppressing the warning or ignoring it I always choose to fix it properly.

@vpisarev
Copy link
Contributor

ok; I fixed the android build, please check the cc branch. Basically, you can just copy the implementation of the external cv::connectedComponentsWithStats functions to your .cpp file.

The test data is still needed; with failing tests we can not accept the code

@nevion
Copy link
Author

nevion commented Dec 17, 2012

Vadim - the test data is here (pushed a week ago): https://github.com/nevion/opencv_extra

I pushed your patch too.

@vpisarev
Copy link
Contributor

ok, I added test data to our copy of opencv_extra; hopefully, now the tests will pass

@nevion
Copy link
Author

nevion commented Dec 18, 2012

Vadim - not sure if it's been rebuilt since you added the test data. Can you force a rebuild or should I make a dummy commit?

@vpisarev
Copy link
Contributor

I can and I did. Looks like, this is the problem of our buildbot. Still, we have to wait a bit until it's solved.

@vpisarev
Copy link
Contributor

Hi Jason!

2 more things:

  1. can you, please, update the documentation to reflect the API? E.g. the structure description should be removed.

  2. we will integrate your pull request early next week, because this week we prepare 2.4.3.x release, and so we want to minimize the possibility of conflicts during after-release 2.4 -> master merges. is that fine with you?

@nevion
Copy link
Author

nevion commented Dec 19, 2012

I'll get on 1 sometime today, re 2, we're tantalizingly close and the core code is stable and fairly well tested... I'd hope we could get it in in the next release which would allow such things in, whenever that is. I also hope we're not missing one though that seems like a bug fix release. Btw, come January I'm going to be on a long set of work related field tests so it works best for me to finish up before that... or after though I'd have to fight the distraction.

@vpisarev
Copy link
Contributor

Jason,
don't worry, as long as it's integrated (and it's only the documentation that remains to be fixed), we will put it into the nearest OpenCV release.

Putting it to 2.4.3.x is too late; it's very long release process, going through multiple QA steps, it's already on the way. So the earliest release when we can put it in is 2.4.4, which is scheduled for 2013 Feb.

Regards,
Vadim

@kirill-korniakov
Copy link

Vadim, actually the earliest possible release is 2.5, since this is a new functionality and it is targeted for "master" branch. But I don't think that it will cause any issues, since after merge to master this will be a part of OpenCV!

@vpisarev
Copy link
Contributor

oh, that's right; the pull request should actually be retargeted for 2.4 branch in order to be put into 2.4.4

@kirill-korniakov
Copy link

But I thought that 2.4 should accept only bug-fixes and optimizations =) Is it a "must have" for 2.4.4?

@vpisarev
Copy link
Contributor

there are "should" and "could". 2.4.4 can accept anything that does not break 2.4 API :) So, we can potentially add this functionality, no probs

@nevion
Copy link
Author

nevion commented Dec 19, 2012

Ok I've pushed an update to the docs. Re 2.4, should I rebase off 2.4 and submit a pull request to that branch?

@vpisarev
Copy link
Contributor

Hi Jason,

Let's put it to master to keep things going. We can still then to move it to 2.4

@nevion
Copy link
Author

nevion commented Dec 21, 2012

Looks like the android build bot is freaking out again - other than that I see no warnings relevant to connected components, just histogram equalization and a cascade classifier, not sure what to interpret from the docs, but the topic of interest isn't in it's logs.

@vpisarev
Copy link
Contributor

👍

@vpisarev
Copy link
Contributor

vpisarev commented Jan 9, 2013

:shipit:

vpisarev added a commit that referenced this pull request Jan 10, 2013
@opencv-pushbot opencv-pushbot merged commit 4cb25e9 into opencv:master Jan 10, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants