Suggestion: changes to label_components function #226
I certainly support the spirit of this change, but there's an important detail. I was deliberately being a bit lazy here: I confined it to Arrays in part because I made use of linear indexing, and for anything that doesn't have efficient linear indexing (e.g., |
I see, I didn't think of How large is the performance penalty actually when indexing linearly into a SubArray? Should I do some benchmarks? I could also try to change the function to use linear indexing. What I actually want to achieve is behavior like MATLAB's bwconncomp, which returns a cell array of index vectors, one per component. I think this change would get me there, but I would also be glad to hear other ideas for doing this more efficiently. |
Pretty big in 2d, worse in 3d. Example:
julia> A = rand(1000,1000);
julia> B = sub(A, 2:999, 2:999);
julia> function mysum(A)
           s = 0.0
           for i = 1:length(A)
               s += A[i]
           end
           s
end
mysum (generic function with 1 method)
julia> mysum(A)
499876.9882984452
julia> mysum(B)
497909.8790759961
julia> @time mysum(A)
elapsed time: 0.004373402 seconds (13896 bytes allocated)
499876.9882984452
julia> @time mysum(B)
elapsed time: 0.028761602 seconds (96 bytes allocated)
497909.8790759961
julia> function mysum2(A::AbstractMatrix)
           s = 0.0
           for j = 1:size(A,2), i = 1:size(A,1)
               s += A[i,j]
           end
           s
end
mysum2 (generic function with 1 method)
julia> mysum2(A)
499876.9882984452
julia> mysum2(B)
497909.8790759961
julia> @time mysum2(A)
elapsed time: 0.003726706 seconds (96 bytes allocated)
499876.9882984452
julia> @time mysum2(B)
elapsed time: 0.008171102 seconds (96 bytes allocated)
497909.8790759961
The penalty increases by another factor of 2 for 3d. But now I understand your intention much better: for what you want to achieve, this works just fine, so let's not worry about the |
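For readers on a current Julia release, the comparison above can be approximated with the sketch below. It assumes Julia 1.x, where `view` takes the place of the older `sub`; the names `sum_linear` and `sum_eachindex` are illustrative and do not come from the thread. No timings are given, since they depend on the machine and the Julia version.

```julia
# Sketch for Julia 1.x (assumption: `view` replaces the older `sub`).
A = rand(1000, 1000)
B = view(A, 2:999, 2:999)

# Linear indexing: cheap for an Array, but a non-contiguous view has to
# convert every linear index back into (i, j) subscripts.
function sum_linear(A::AbstractArray)
    s = 0.0
    for i in 1:length(A)
        s += A[i]
    end
    return s
end

# eachindex chooses the index type that is efficient for the container:
# integers for an Array, CartesianIndex values for this kind of view.
function sum_eachindex(A::AbstractArray)
    s = 0.0
    for I in eachindex(A)
        s += A[I]
    end
    return s
end

# IndexStyle reports which kind of indexing a type supports efficiently.
@show IndexStyle(typeof(A))
@show IndexStyle(typeof(B))
```

To compare the two loops, call each function once first (so compilation is not measured) and then wrap the calls in `@time`, for example `@time sum_linear(B)` and `@time sum_eachindex(B)`.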
If you want something more like
type ComponentList <: AbstractVector{Vector{Int}}
    # components[v] holds the indices assigned to label v
    components::Vector{Vector{Int}}
end
# Writing A[i] = v records index i under label v
function setindex!(A::ComponentList, v::Int, i::Int)
    if v > length(A.components)
        sizehint(A.components, v)
        # grow the list so that label v has a slot
        for k = length(A.components)+1:v
            push!(A.components, Int[])
        end
    end
    push!(A.components[v], i)
    A
end
|
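A rough adaptation of this idea to Julia 1.x might look like the sketch below. It assumes the current syntax (`struct` in place of `type`) and uses a plainly named helper, `addpixel!`, instead of overloading `setindex!`; the helper name, the zero-argument constructor, and the small `labels` array are all made up for illustration and are not part of Images.

```julia
# Sketch for Julia 1.x; names other than ComponentList are hypothetical.
struct ComponentList <: AbstractVector{Vector{Int}}
    components::Vector{Vector{Int}}   # components[v] = indices with label v
end
ComponentList() = ComponentList(Vector{Int}[])

# Minimal AbstractVector interface so the list can be read like a vector.
Base.size(A::ComponentList) = size(A.components)
Base.getindex(A::ComponentList, k::Int) = A.components[k]

# Hypothetical helper: record that linear index `i` carries label `v`.
function addpixel!(A::ComponentList, v::Int, i::Int)
    while length(A.components) < v
        push!(A.components, Int[])    # make sure label v has a slot
    end
    push!(A.components[v], i)
    return A
end

# Group the linear indices of a small label array by label (0 = background).
labels = [1 0 2;
          1 0 2;
          0 0 2]
cl = ComponentList()
for i in eachindex(labels)
    labels[i] > 0 && addpixel!(cl, labels[i], i)
end
```

After the loop, `cl[v]` is the vector of linear indices that carry label `v`, which is close in spirit to the per-component index lists that bwconncomp returns.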
Maybe we should add this to Images? |
This would be great to have in Images. I just can't see yet how to write an efficient |
You're saying you want something better than There seem to be three options:
|
I think I will go with option 3: use a Dict-based design during construction and build the per-label container struct at the end. Option 2 significantly degrades performance when the array is accessed in line 43,
while the performance penalty for the Dict is still acceptable given the memory savings, which matter more to me here. I will make another PR once I have tested this and wrapped it into a function. Thank you very much for your help. |
Oh, right; I had forgotten this needs to access the previously-written values. Yes, your approach seems reasonable. |
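The thread does not show the resulting implementation, so the following is only a guess at the general shape of such a Dict-based design, written for Julia 1.x. It assumes labels are positive integers with 0 as background, and every name in it (`pixel_label`, `getlabel`, `components_from`) is hypothetical.

```julia
# Sketch, assuming positive Int labels and 0 meaning background.
# During construction: sparse storage, only labeled pixels take memory.
pixel_label = Dict{Int,Int}()          # linear index => label

# Reading a previously written value; missing keys read as background.
getlabel(d::Dict{Int,Int}, i::Int) = get(d, i, 0)

pixel_label[1] = 1
pixel_label[2] = 1
pixel_label[7] = 2

# At the end: build one index vector per label.
function components_from(d::Dict{Int,Int})
    nlabels = isempty(d) ? 0 : maximum(values(d))
    comps = [Int[] for _ in 1:nlabels]
    for (i, v) in d
        push!(comps[v], i)
    end
    foreach(sort!, comps)              # Dict iteration order is unspecified
    return comps
end

comps = components_from(pixel_label)
```

The Dict supports both the writes and the reads of earlier labels that the labeling pass needs, while the conversion to per-label vectors happens once, after all labels are final.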
There are basically 2 changes in this PR:
1. label_components is extended to work with Albl::AbstractArray{Int}.
2. Albl is initialized (zeroed) by the caller and not within the function anymore.
In my typical use cases most of the data is background, so a lot of memory is wasted by allocating the whole output array. I suggest these changes so that other array-like containers, such as a sparse matrix or a memory-mapped array, can be used to write the labels to. A small example is in this gist: https://gist.github.com/meggart/20503e86e18ad21c9bbd
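To make the two changes concrete, here is a small self-contained sketch for Julia 1.x. It is not the Images implementation of label_components and does not reproduce the gist; `label_into!` is a hypothetical stand-in that labels 4-connected regions with a simple flood fill and writes into an output array that the caller has already allocated and zeroed.

```julia
using SparseArrays

# Hypothetical stand-in, not the Images.jl code: labels 4-connected
# foreground regions of `img` and writes them into the caller-supplied,
# already zeroed `Albl`. Returns the number of components found.
function label_into!(Albl::AbstractMatrix{Int}, img::AbstractMatrix{Bool})
    size(Albl) == size(img) || throw(DimensionMismatch("sizes must match"))
    nlabels = 0
    stack = CartesianIndex{2}[]
    neighbors = (CartesianIndex(1, 0), CartesianIndex(-1, 0),
                 CartesianIndex(0, 1), CartesianIndex(0, -1))
    for I in CartesianIndices(img)
        (img[I] && Albl[I] == 0) || continue   # skip background and labeled pixels
        nlabels += 1
        Albl[I] = nlabels
        push!(stack, I)
        while !isempty(stack)                  # flood fill from the seed pixel
            J = pop!(stack)
            for d in neighbors
                K = J + d
                if checkbounds(Bool, img, K) && img[K] && Albl[K] == 0
                    Albl[K] = nlabels
                    push!(stack, K)
                end
            end
        end
    end
    return nlabels
end

img = Bool[1 1 0 0;
           0 0 0 1;
           0 0 0 1]
Albl = spzeros(Int, size(img)...)   # caller allocates; background is not stored
n = label_into!(Albl, img)
```

Because the function only relies on `getindex` and `setindex!`, the same call works with a dense `zeros(Int, size(img))`. Writing entries one at a time into a sparse matrix can be slow, so this trades speed for memory.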