Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix breakage from the new ReinterpretArray #60

Closed
wants to merge 5 commits into from

Conversation

pablosanjose
Copy link
Contributor

Fix for issue #59 (see details there)

@codecov-io
Copy link

codecov-io commented Dec 6, 2017

Codecov Report

Merging #60 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master      #60   +/-   ##
=======================================
  Coverage   93.89%   93.89%           
=======================================
  Files          14       14           
  Lines         508      508           
=======================================
  Hits          477      477           
  Misses         31       31
Impacted Files Coverage Δ
src/kd_tree.jl 100% <ø> (ø) ⬆️
src/tree_data.jl 100% <ø> (ø) ⬆️
src/inrange.jl 95.83% <ø> (ø) ⬆️
src/hyperspheres.jl 100% <ø> (ø) ⬆️
src/ball_tree.jl 100% <ø> (ø) ⬆️
src/utilities.jl 100% <ø> (ø) ⬆️
src/hyperrectangles.jl 69.23% <ø> (ø) ⬆️
src/knn.jl 96.42% <ø> (ø) ⬆️
src/tree_ops.jl 89.83% <ø> (ø) ⬆️
src/datafreetree.jl 82.14% <ø> (ø) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 012a3da...12ec911. Read the comment docs.

Copy link
Owner

@KristofferC KristofferC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I have thought about the Reinterpret stuff for a while but haven't fixed it yet. I will need to do some benchmarks here to see if it is perhaps worth just doing a copy to a new array instead of reinterpreting.

src/ball_tree.jl Outdated
@@ -4,7 +4,7 @@
# The tree uses the triangle inequality to prune the search space
# when finding the neighbors to a point,
struct BallTree{V <: AbstractVector,N,T,M <: Metric} <: NNTree{V,M}
data::Vector{V}
data::AbstractVector{V}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be bad for performance because this is an abstract field. See (https://docs.julialang.org/en/stable/manual/performance-tips/#Avoid-containers-with-abstract-type-parameters-1),

Perhaps we need to just do like

BallTree{V <: AbstractVector, ...}
    data::V
    .
    .
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this was a mistake on my part. I intended to leave data::Vector{V} there, like in BruteTree. In the latter it was necessary to do a convert in the BruteTree constructor, but not with BallTree I think.

@pablosanjose
Copy link
Contributor Author

Ok! But I would have thought that copying to a new array would always be slower, no? Or perhaps there is a substantial overhead with the lazy ReinterpretArray wrapper?

@KristofferC
Copy link
Owner

Yes, indeed, the question is if there is any overhead with the lazy wrapper. Only benchmarking can tell :)

@pablosanjose
Copy link
Contributor Author

pablosanjose commented Dec 8, 2017

I just did some quick tests and there is indeed a rather alarming overhead with this PR.

julia> using NearestNeighbors, BenchmarkTools
julia> srand(1234); data = rand(3, 10^4);

julia v0.6.1 + NearestNeighbors.jl (master)

julia> @btime KDTree($data);
  2.225 ms (50 allocations: 424.77 KiB)

julia v0.7 master + NearestNeighbors.jl (this PR)

julia> @btime KDTree($data);
  16.353 ms (574 allocations: 501.80 KiB)

I also created another branch ("no-reinterpret") on my fork in which all reinterpret calls were swapped to the following reinterpret_or_copy function

if VERSION < v"0.7.0-DEV.2008"
    @inline reinterpret_or_copy(::Type{T}, data, ::Val{dim}) where {T, dim} =
        reinterpret(SVector{dim,T}, data, (length(data) ÷ dim,))
else
    @inline reinterpret_or_copy(::Type{T}, data, ::Val{dim}) where {T, dim} =
        [SVector{dim,T}(ntuple(i -> data[n+i], Val(dim))) for n in 0:dim:(length(data)-1)]
end

Using this I can once more run on julia master, with similar (although a little slower) timings

julia v0.7 master + NearestNeighbors.jl ("no-reinterpret" branch)

julia> @btime KDTree($data);
  2.820 ms (475 allocations: 724.03 KiB)

So you were right! Which makes me wonder, if I this PR follows the expected usage of the 'ReinterpretArray' wrapper, and the overhead is real, how can we get the old reinterpret behaviour back (which is clearly better in this case)? I suspect, by the large number of allocations with this PR, that there is however something fixable here.

Also, should I open a PR with the "no-reinterpret" branch?

@KristofferC
Copy link
Owner

We need a function barrier there. So the dynamic dispatch on dim only happens once.

@pablosanjose
Copy link
Contributor Author

By "there", you mean in the reinterpret_or_copy function, I assume. Decorating it with a @noinline indeed makes the timing a little better.

julia v0.7 master + NearestNeighbors.jl ("no-reinterpret" branch)

julia> @btime KDTree($data);
  2.660 ms (486 allocations: 724.20 KiB)

However, I'd like to understand why the ReinterpretArray is so slow... Copying feels like surrendering :-P

@KristofferC
Copy link
Owner

I think we just gotta resort to copying...

@pablosanjose
Copy link
Contributor Author

That would be too bad! I have posted a question in discourse, to try to understand if the performance problem can be overcome somehow...

@pablosanjose pablosanjose mentioned this pull request Dec 9, 2017
@pablosanjose
Copy link
Contributor Author

I'm closing this in favor of #61. Perhaps in the future, when/if ReinterpretArray is once more as performant as the old reinterpret this simpler (and potentially faster) approach could be revisited, reverting #61.

In such case, it might be good to keep the @compat Array{T}(uninitialized, ...) commit from #61. (That commit should perhaps have been a a separate PR... noobness!)

@pablosanjose pablosanjose deleted the fix-reinterpret branch January 18, 2018 13:30
@KristofferC
Copy link
Owner

Perhaps we can resurrect this. I think new optimizations for reinterpret got merged into master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants