Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance #1

Open
krober10nd opened this issue Sep 24, 2020 · 14 comments
Open

Performance #1

krober10nd opened this issue Sep 24, 2020 · 14 comments
Labels
enhancement New feature or request

Comments

@krober10nd
Copy link
Collaborator

Hey Darren,

I used line_profiler with a decorator above the kernel _inpoly() to see if there were any hotspots we could improve...Overall great. Below is the output from line_profiler running example.py. Doesn't look like it could be improved much. As you can see, the continue if catch appears to occupy 14% of the total time.

(base) keiths-MacBook-Pro:inpoly-python Keith$ kernprof -l -v example.py
INPOLY2: 2.289767026901245
Wrote profile results to example.py.lprof
Timer unit: 1e-06 s

Total time: 1.26219 s
File: /Users/Keith/junk/inpoly-python/inpoly/inpoly2.py
Function: _inpoly at line 131

Line # Hits Time Per Hit % Time Line Contents

131 @Profile
132 def inpoly(vert, node, edge, ftol, lbar):
133 """
134 INPOLY: the local pycode version of the crossing-number
135 test. Loop over edges; do a binary-search for the first
136 vertex that intersects with the edge y-range; crossing-
137 number comparisons; break when the local y-range is met.
138
139 """
140
141 1 11.0 11.0 0.0 feps = ftol * (lbar ** +2)
142 1 2.0 2.0 0.0 veps = ftol * (lbar ** +1)
143
144 1 27.0 27.0 0.0 stat = np.full(vert.shape[0], False, dtype=np.bool
)
145 1 12.0 12.0 0.0 bnds = np.full(vert.shape[0], False, dtype=np.bool
)
146
147 # ----------------------------------- compute y-range overlap
148 1 47.0 47.0 0.0 XONE = node[edge[:, 0], 0]
149 1 42.0 42.0 0.0 XTWO = node[edge[:, 1], 0]
150 1 86.0 86.0 0.0 YONE = node[edge[:, 0], 1]
151 1 80.0 80.0 0.0 YTWO = node[edge[:, 1], 1]
152
153 1 66.0 66.0 0.0 XMIN = np.minimum(XONE, XTWO)
154 1 83.0 83.0 0.0 XMAX = np.maximum(XONE, XTWO)
155
156 1 62.0 62.0 0.0 XMAX = XMAX + veps
157 1 10.0 10.0 0.0 YMIN = YONE - veps
158 1 46.0 46.0 0.0 YMAX = YTWO + veps
159
160 1 51.0 51.0 0.0 YDEL = YTWO - YONE
161 1 44.0 44.0 0.0 XDEL = XTWO - XONE
162
163 1 750.0 750.0 0.1 ione = np.searchsorted(vert[:, 1], YMIN, "left")
164 1 826.0 826.0 0.1 itwo = np.searchsorted(vert[:, 1], YMAX, "right")
165
166 # ----------------------------------- loop over polygon edges
167 11000 7184.0 0.7 0.6 for epos in range(edge.shape[0]):
168
169 10999 9413.0 0.9 0.7 xone = XONE[epos]
170 10999 8909.0 0.8 0.7 xtwo = XTWO[epos]
171 10999 8982.0 0.8 0.7 yone = YONE[epos]
172 10999 8884.0 0.8 0.7 ytwo = YTWO[epos]
173
174 10999 8760.0 0.8 0.7 xmin = XMIN[epos]
175 10999 8695.0 0.8 0.7 xmax = XMAX[epos]
176
177 10999 8706.0 0.8 0.7 xdel = XDEL[epos]
178 10999 8821.0 0.8 0.7 ydel = YDEL[epos]
179
180 # ------------------------------- calc. edge-intersection
181 264806 186197.0 0.7 14.8 for jpos in range(ione[epos], itwo[epos]):
182
183 253807 182977.0 0.7 14.5 if bnds[jpos]:
184 87497 55301.0 0.6 4.4 continue
185
186 166310 147738.0 0.9 11.7 xpos = vert[jpos, 0]
187 166310 144366.0 0.9 11.4 ypos = vert[jpos, 1]
188
189 166310 117233.0 0.7 9.3 if xpos >= xmin:
190 90503 66346.0 0.7 5.3 if xpos <= xmax:
191 # ------------------- compute crossing number
192 25671 24757.0 1.0 2.0 mul1 = ydel * (xpos - xone)
193 25671 23340.0 0.9 1.8 mul2 = xdel * (ypos - yone)
194
195 25671 71918.0 2.8 5.7 if feps >= np.abs(mul2 - mul1):
196 # ------------------- BNDS -- approx. on edge
197 21998 17140.0 0.8 1.4 bnds[jpos] = True
198 21998 20768.0 0.9 1.6 stat[jpos] = True
199
200 3673 2657.0 0.7 0.2 elif (ypos == yone) and (xpos == xone):
201 # ------------------- BNDS -- match about ONE
202 bnds[jpos] = True
203 stat[jpos] = True
204
205 3673 2516.0 0.7 0.2 elif (ypos == ytwo) and (xpos == xtwo):
206 # ------------------- BNDS -- match about TWO
207 bnds[jpos] = True
208 stat[jpos] = True
209
210 3673 2889.0 0.8 0.2 elif (mul1 < mul2) and (ypos >= yone) and (ypos < ytwo):
211 # ------------------- advance crossing number
212 1819 1829.0 1.0 0.1 stat[jpos] = not stat[jpos]
213
214 75807 57239.0 0.8 4.5 elif (ypos >= yone) and (ypos < ytwo):
215 # ----------------------- advance crossing number
216 66373 56383.0 0.8 4.5 stat[jpos] = not stat[jpos]
217
218 1 1.0 1.0 0.0 return stat, bnds

@dengwirda
Copy link
Owner

I've done a little more testing, and it seems the python version is still around x 10 slower than the MATLAB implementation, so the loops are definitely slowing things down... :-//

The *.py version appears to be (a lot) faster than other python options though, so is probably already useful, but it'd be nice to get close to proper performance here I think.

I'm going to try a quick experiment with cython and see if it's possible to include an optional *.c version of _inpoly that python can auto-build / install via setup.py. Apparently such things are possible!?

@krober10nd
Copy link
Collaborator Author

Ah bummer about the comparison with Matlab.

Yes the cython thing is hard out. I've never tried that but I'd been keen to seen how it works. I've also heard good things.

I have called simple cpp (c with std) using pybind11 like this

https://github.com/krober10nd/simple_cgal

But that time I was using cgal so I was forced to use cpp.

@krober10nd
Copy link
Collaborator Author

@dengwirda
Copy link
Owner

Okay! There's now a cython kernel available in inpoly_.pyx --- this is just the inpoly_ kernel decorated with some typedefs. This can be "compiled" into the CPython-compatible inpoly_.c, which can then be (actually) compiled into inpoly_.so[dylib|dll]. This library can be imported as a normal python module. There's some trickiness in inpoly2.py to fall-back to the "normal" kernel if the compiled one can't be imported.

For me, the cython kernel gives around x 10-15 speedup, which is around what was expected.

In terms of installing, it should (apparently) be possible for this to be used without needing to have cython itself installed --- the cython-generated inpoly_.c should be all that's needed.

I need to learn what else is necessary to set this up for pip install etc, but this is fairly close now I think.

@krober10nd
Copy link
Collaborator Author

For pip install, you just need to create a sdist and wheel and then you use twine to upload it.

https://packaging.python.org/tutorials/packaging-projects/

I’m pretty certain if you clone the repo, you’ll just need to type pip install -e .

@krober10nd
Copy link
Collaborator Author

The cython thing was straightforward and is super cool.

@dengwirda
Copy link
Owner

As you mentioned, multi-threading may be a good thing to consider adding at some stage --- I've seen that for large problems MATLAB may still be a little faster in some cases, but I think this is because it's doing its array operations (espec. sort) on multiple cores. MATLAB's sort seems so fast (compared to np.argsort) that I think it must be doing this in parallel.

Since np.argsort is where the majority of the time is spent in inpoly-python it'd be handy indeed to use the cython openmp wrappers you suggested to try to scale across threads a little. I've re-worked the kernel --- bringing np.argsort inside --- to get set up to try this. In doing this it was also possible to remove some array temporaries, which improved performance a little more anyway.

(Efficient) parallel sorting is not so easy to do, so I'm not sure how well this will work out --- especially via cython. Nonetheless, there's a a few ideas that come to mind --- I'll try to find some time to look at this in another week or two.

In terms of publishing the algorithm --- yes, I'm interested to do this ;) inpoly has always just been part of other projects of mine, so hasn't been published individually. There are other efficient inpolygon methods out there, so it'll be interesting to see whether others find the algorithm interesting or not --- as an "inplace" O(n*log(n))-style method (and especially if the multi-threading works out) it probably is worth submitting a paper on. It surprises me that so many people tend to use O(n^2) approaches.

I suspect "publishability" would be enhanced by combining an algorithm description with actual applications --- perhaps ACM Transactions on Spatial Algorithms and Systems could be something to consider? If you are interested, I'd be happy to work on something like this with you, and include use in the various Earth-system meshing workflows as case studies? Other ideas are welcome here too!

@dengwirda dengwirda reopened this Sep 27, 2020
@dengwirda dengwirda added the enhancement New feature or request label Sep 27, 2020
@krober10nd
Copy link
Collaborator Author

As you mentioned, multi-threading may be a good thing to consider adding at some stage

Yes, it could be very useful for large scale problems. For me I've used this predicate to form reasonably sized signed distance functions of coastal ocean domains to mesh with.

In terms of publishing the algorithm --- yes, I'm interested to do this ;) inpoly has always just been part of other projects of mine, so hasn't been published individually. There are other efficient inpolygon methods out there, so it'll be interesting to see whether others find the algorithm interesting or not --- as an "inplace" O(n*log(n))-style method (and

This paper came up at some point on my search for point-in-poly algorithms: https://www.sciencedirect.com/science/article/pii/S0098300401000371

..especially if the multi-threading works out) it probably is worth submitting a paper on. It surprises me that so many people tend to use O(n^2) approaches.

Yes, just look at matplotlib.path for instance. This is the de facto recommendation on Stack Overflow for point in a polygon queries, which is just mind boggling.

I suspect "publishability" would be enhanced by combining an algorithm description with actual applications --- perhaps ACM Transactions on Spatial Algorithms and Systems could be something to consider? If you are interested, I'd be happy to work on something like this with you, and include use in the various Earth-system meshing workflows as case studies? Other ideas are welcome here too!

I agree that it would be easier to publish with an application in mind such as your suggested journal.
I would be interested in this! I'm still doing my coastal ocean meshing thing on the side. Actually, now attempting to get it in Python (https://github.com/CHLNDDEV/oceanmesh) and perhaps unify it with some other more canonical mesh generation approaches like an advancing front generator. In fact, this is what motivated me to reach out to you about inpoly in the first place because the other approaches were slow as a snail. In the default mesh generation approach, my geometry representation uses SDFs which depend on the inpoly predicate quite a bit. So this could be one application of the application I guess.

I imagine there are probably numerous other geophysical applications one could think of (especially in a package like QGIS or GRASS) such as processing LiDAR data, having a massive PSLG and trying to find intersection with point data, etc. Something to think about but obviously the SDF for mesh generation comes immediately to mind.

@dengwirda
Copy link
Owner

Okay, sounds good --- I'll start to think about what a paper might look like. We also use inpolygon-type queries in many places --- design of mesh spacing functions, analysis of MPAS results, etc, as well as kernels within larger GIS type analysis, etc.

If you're looking for frontal-style meshing algorithms, obviously jigsaw may be of interest --- this is based on my "Frontal-Delaunay" algorithm + CVT iteration, and is what we use for MPAS-O, COMPAS, etc.

@krober10nd
Copy link
Collaborator Author

If you're looking for frontal-style meshing algorithms, obviously jigsaw may be of interest --- this is based on my "Frontal-Delaunay" algorithm + CVT iteration, and is what we use for MPAS-O, COMPAS, etc.

Yep, I had planned on using your jig-saw inside it actually at some point down the line. I want oceanmesh to support the traditional mesh generation paradigm where you propagate a front of elements from a PLSG with a known point spacing and jig-saw would accomplish this nicely in addition to the non-traditional SDF approach.

@dengwirda
Copy link
Owner

👍 but just jigsaw, not jig-saw... 😏

jigsaw

@krober10nd
Copy link
Collaborator Author

that gave a laugh :] yes, I think a jig-saw would be a tool for a different job. haha

@krober10nd
Copy link
Collaborator Author

Hey @dengwirda I got in contact with my friend @HamishB whose one of the GRASS GIS developers about using this algorithm/implementation somehow inside GRASS. He's going to talk to their dev team.

@dengwirda
Copy link
Owner

@krober10nd, @HamishB, sounds good --- let me know if there's interest in this.
I suspect that for a full GIS workflow a "two-level" strategy (where polygons are embedded in an aabb-tree and inpoly is called as the kernel at each leaf) might be the way to go. At least this is how I've done these type of things in the past!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants