Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MakieCon2023] New (faster) Interpolators #31

Closed
behinger opened this issue Apr 18, 2023 · 11 comments · Fixed by #33
Closed

[MakieCon2023] New (faster) Interpolators #31

behinger opened this issue Apr 18, 2023 · 11 comments · Fixed by #33

Comments

@behinger
Copy link
Collaborator

checkout if we can now use
Interpolators.jl

maybe optimize the use of
ScatteredInterpolators.jl

@behinger
Copy link
Collaborator Author

behinger commented Apr 18, 2023

reimplement https://github.com/scipy/scipy/blob/main/scipy/interpolate/interpnd.pyx cloughtochter2d

(we are in C2)

@behinger
Copy link
Collaborator Author

Regards to interpolation.jl: JuliaMath/Interpolations.jl#118 <-- seems unresolved.

I'm not sure what we possibly could optimize with ScatteredInterpolation, but 7s as in the docs seems quite slow..

@behinger
Copy link
Collaborator Author

If someone wants to have a quickstart, I expose the interpolation algorithms here & run some @time
https://gist.github.com/behinger/d5cf028111e80ab11e5d5cb7f1ef9957

@fatteneder
Copy link

Hi.

Just finished my MakieCon 2023 project to implement scipy's CloughTocherInterpolator in Julia.
Its available here: https://github.com/fatteneder/CloughTocher2DInterpolation.jl
Its not registered yet, but I might do if it turns out to be useful for Topoplots.

Regarding the already available interpolators in Topoplots:
I only just saw that the current "interface" for interpolators is not really ideal with regards to performance.
Usually, interpolators like to be used in two steps:

  1. Construct the interpolator, e.g. precompute a triangulation from the provided reference data, setup caches, etc.
  2. use the interpolator on data.

From what I can see at the moment most of step 1 is also done in step 2.
Perhaps fixing that will make the already existing interpolators fast enough.

@behinger
Copy link
Collaborator Author

behinger commented Apr 22, 2023

Hi - this is really great! Wow. Lot's of lines you covered...!!

Regarding your 1/2 step recommendation. I think the issue is that typically the grid to be evaluated on is not changed, but the underlying data. Thus there is not really anything to be re-used. That would only be the case if we e.g. want to change the resolution of the grid we are evaluating on.

Further, the step 1 is typically extremely fast, factor 1000x compared to the evaluation step. This, of course, depends on the grid-size in the second step (for TopoPlots.jl we currently use a 500x500 grid by default)


I briefly evaluated your implementation, speed wise we are currently ~factor 10 slower than the python implementation (~40ms vs. 380ms)- even after I removed the "we are here" messages (saved me around ~40ms).

Not sure if this will help a lot, but here is a profile view
grafik

finally I noticed that performance is much worse if I provide a matrix to the interpolator, compared to the linearized version

begin
xrange = LinRange(0,1,500)
	yrange = LinRange(0,1,500)
	positions = rand(Point2f,100)
	data = rand(length(positions))
end
begin

	posMat = Float64.(vcat([[p[1],p[2]] for p in positions]...))
	@time s = CloughTocher2DInterpolation.CloughTocher2DInterpolator(posMat, data)
	x = (xrange)' .* ones(length(yrange))
	y = ones(length(xrange))' .* (yrange)

	icoords2 = hcat(x[:],y[:])'

	@time res = s(icoords2)
	
end

I really would need to dig deeper in the code to try to figure out what happens where to optimize. Maybe I will find time this week :)

@fatteneder
Copy link

fatteneder commented Apr 22, 2023

Not sure I understand, but then I did not check how you use the interpolators in the rest of the package.
Just to clarify, I had this usage pattern in mind where the reference data is never changed, so it should only be constructed only once,

ip = CloughTocherInterpolator(ref_xy,ref_z)
...
intrp_xy, intrp_z = zeros(2*N), zeros(N)
for i = 1:1000
   # update intrp_xy
   ip(intrp_z, intrp_xy) # interpolate inplace into intrp_z
end

I think the issue is that typically the grid to be evaluated on is not changed, but the underlying data.

If you mean by that that the reference grid is unchanged but the reference data changes, then we could easily add an update! method that just recomputes the grad field of CloughTocher2DInterpolator, if that is a bottleneck.

Further, the step 1 is typically extremely fast, factor 1000x compared to the evaluation step. This, of course, depends on the grid-size in the second step (for TopoPlots.jl we currently use a 500x500 grid by default)

Ok, I see.


I briefly evaluated your implementation, speed wise we are currently ~factor 10 slower than the python implementation (~40ms vs. 380ms)- even after I removed the "we are here" messages (saved me around ~40ms).

Yeah, the we are here is a left-over debug statement :)
Atm I am trying to polish a bit more and also try to get the rescale keyword argument to work correctly.

Not sure if this will help a lot, but here is a profile view

Thanks for the snapshot, certainly useful, because I did not do any optimizations yet.

inally I noticed that performance is much worse if I provide a matrix to the interpolator, compared to the linearized version

Ok interesting, I need to have a look on that too.

Some more comments:

  1. Atm the interpolation uses only the slow part of the simplices search (_find_simplex_bruteforce), but the scipy version does a clever coarse search before calling into _find_simplex_bruteforce. The reason I haven't implement that (yet) is that for this kind of search one needs the hyperplane equations (normal and offset) that is computed inside qhull but not so easy to extract (one would have to scan the qhull->facets_list and then use an unsafe_load on the pointer, I think). I would also be fine with computing the remaining parts manually (separate from qhull), but I didn't figure out yet how to.
  2. There is also an inplace version for interpolation, that might help with any allocation problems, see the code snippet above.
  3. I think a low hanging fruit for optimization would be to use StaticArrays.jl to keep allocations down.
  4. Since we only call into qhull in the constructor, we have all the freedom to parallelize/multithread the interpolation call, in particular also the _find_simplex_bruteforce search.

Btw: I would appreciate any feedback about the interface. In particular, the ordering of arguments for the inplace interpolation seems to be at odds with that of constructing the interpolator, e.g.

ip = CloughTocherInterpolation(xy,z)
intrp_xy = randn(2,40)
intrp_z = zeros(40)
ip(intrp_z,intrp_xy) # mutate intrp_z

@fatteneder
Copy link

fatteneder commented Apr 23, 2023

Regarding

Atm the interpolation uses only the slow part of the simplices search (_find_simplex_bruteforce), but the scipy version does a clever coarse search before calling into _find_simplex_bruteforce. The reason I haven't implement that (yet) is that for this kind of search one needs the hyperplane equations (normal and offset) that is computed inside qhull but not so easy to extract (one would have to scan the qhull->facets_list and then use an unsafe_load on the pointer, I think). I would also be fine with computing the remaining parts manually (separate from qhull), but I didn't figure out yet how to.

Turns out extracting the equations directly from qhull needs more work, because the data is stored in fields of a C struct facetT, wich in turn can have different precision depending on the build parameters (they use various type aliases like realT, coordT).
I think a better way to grab the data would be to extend the QhullMiniWrapper in Yggdrasil. Although straight forward, I would prefer a more direct method of computation inside Julia, just like I already do for the DelaunayInfo.neighbors.
There is something in the tests of scipy that talks about the hyperplanes, but I could not make sense out of that yet, cf. https://github.com/scipy/scipy/blob/fdea3a89fe0a823c1f0652e78c76f707571b0ea1/scipy/spatial/tests/test_qhull.py#L195-L220

Btw: scipy extracts both neighbors and the equations (or normals + offset) directly from the facetT structs and they can do so, because they also ship their own version of qhull.

@fatteneder
Copy link

Brief update: I managed to compute the hyperplane equations and, thus, also port the smart _find_simplex method from sympy. Comparing with the bruteforce version from before this gives a x10 improvement in my tests.

Regarding this:

finally I noticed that performance is much worse if I provide a matrix to the interpolator, compared to the linearized version

I wanted to give this a shot, but failed in reproducing the problem (but I only tried on current master, perhaps its now already fixed?). Below is my benchmark

using CloughTocher2DInterpolation
using BenchmarkTools

xrange = LinRange(0,1,500)
yrange = LinRange(0,1,500)
positions = rand(2,100)
data = rand(100)

@time s = CloughTocher2DInterpolation.CloughTocher2DInterpolator(positions, data)
x = (xrange)' .* ones(length(yrange))
y = ones(length(xrange))' .* (yrange)

icoords = hcat(x[:],y[:])'
icoords_flat = icoords[:]

@btime $s($icoords)
@btime $s($icoords_flat)

which gives

176.671 ms (6 allocations: 1.91 MiB)
175.796 ms (8 allocations: 1.91 MiB)

@behinger Could you check whether this fixed all your concerns?

@behinger
Copy link
Collaborator Author

behinger commented May 8, 2023

wow! indeed it is now ~10x faster, it takes 35ms on my machine now (tested with @time only) (which is slightly faster than the python version!). Wow this is really great!

Also vector and non-vector are also the same speed now 🎉

@behinger
Copy link
Collaborator Author

behinger commented May 9, 2023

if you register the package, I can include it in TopoPlots.jl already :)

@fatteneder
Copy link

Registration is in progress: JuliaRegistries/General#83213

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants