Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Floating Point vs Integer #27

Closed
mrgreywater opened this issue Feb 12, 2016 · 11 comments
Closed

Floating Point vs Integer #27

mrgreywater opened this issue Feb 12, 2016 · 11 comments

Comments

@mrgreywater
Copy link
Collaborator

earcut.hpp supports both floating point and integer as vertices type, but isn't using the fastest approach in either case.

When using floating point data, lots of the min/max calculations are done using conditions, which slows them down, as x86/AMD64 both have branchless floating point min/max instructions which won't be used in that case. (see https://github.com/mapbox/earcut.hpp/blob/master/include/earcut.hpp#L286)

When using integer data, the code is casting to double quite frequently which slows things down aswell. Instead of casting to double, when expecting an overflow, it should cast to the next bigger integer type, for example with boost::int_t<size>

@mourner
Copy link
Member

mourner commented Feb 12, 2016

You're right about some of the lines being unoptimized — things like min/max lines were a direct JS ports, certainly they can be made faster with a C++-specific approach.

Regarding integer casts — we'd like to keep the library minimal, avoiding dependencies like boost.

If you have any particular quick fixes in mind, we'll be happy to see more PRs!

@mrgreywater
Copy link
Collaborator Author

It is already depending on boost https://github.com/mapbox/earcut.hpp/blob/master/include/earcut.hpp#L12
Either way, choosing the best type to avoid overflow can be done using some SFINAE, using boost for that is no requirement, I just propose to use it as it's already a dependency.

@mourner
Copy link
Member

mourner commented Feb 12, 2016

Oops, right :) I'm not knowledgeable enough to comment meaningfully here.

@jfirebaugh
Copy link
Contributor

Yes, the manual min/maxing should be replaced with std::min/max, and it would be nice to avoid floating point calculations where it's know that integer calculations suffice.

I'd like to figure out a way to eliminate the boost::pool dependency (#10 (comment)) and have zero dependencies other than the standard library.

@mourner
Copy link
Member

mourner commented Apr 14, 2016

@mrgreywater is this issue still relevant?

@mrgreywater
Copy link
Collaborator Author

Yes, I think it is. Not converting from integer to floating point, and only doing integer arithmetic when working with integer vertices should increase both precision and performance. It's not strictly required though. I may look into it the next week or so, when I got more time.

@mrgreywater
Copy link
Collaborator Author

mrgreywater commented Mar 10, 2018

I added integer math test-wise here:
https://github.com/mrgreywater/earcut.hpp/tree/integer-math

To be honest; performance seems to be the same on a modern cpu (might be a different story on a embedded systems). So since this would only increase code complexity, I'd rather not push it to master, unless there's some other reason to have it (e.g numerical stability issues).

Benchmark before, i5-5200U:

"bench float.exe"
+----------------+--------------------+--------------------+
| Polygon        | earcut             | libtess2           |
+----------------+--------------------+--------------------+
| bad_hole       |       111421 ops/s |        50667 ops/s |
| building       |      1197017 ops/s |       111320 ops/s |
| degenerate     |      2742066 ops/s |       167855 ops/s |
| dude           |        79961 ops/s |        25916 ops/s |
| empty_square   |      2132462 ops/s |       134067 ops/s |
| water          |          815 ops/s |          148 ops/s |
| water2         |          890 ops/s |          917 ops/s |
| water3         |        28188 ops/s |        11434 ops/s |
| water3b        |       279092 ops/s |        74984 ops/s |
| water4         |         3240 ops/s |         1929 ops/s |
| water_huge     |           53 ops/s |           52 ops/s |
| water_huge2    |           26 ops/s |           68 ops/s |
+----------------+--------------------+--------------------+

"bench integer.exe"
+----------------+--------------------+--------------------+
| Polygon        | earcut             | libtess2           |
+----------------+--------------------+--------------------+
| bad_hole       |       121200 ops/s |        50954 ops/s |
| building       |      1284872 ops/s |       112155 ops/s |
| degenerate     |      2857988 ops/s |       164828 ops/s |
| dude           |        80475 ops/s |        26487 ops/s |
| empty_square   |      2338438 ops/s |       135683 ops/s |
| water          |          954 ops/s |          153 ops/s |
| water2         |         1028 ops/s |          858 ops/s |
| water3         |        30108 ops/s |        11431 ops/s |
| water3b        |       320514 ops/s |        74015 ops/s |
| water4         |         3758 ops/s |         1929 ops/s |
| water_huge     |           71 ops/s |           51 ops/s |
| water_huge2    |           33 ops/s |           70 ops/s |
+----------------+--------------------+--------------------+

edit: re-executed bench with high-performance profile

@mrgreywater
Copy link
Collaborator Author

mrgreywater commented Mar 12, 2018

So on my main rig, I actually see some performance benefit across the board with integer-math: Here's the bench of an i5 3570K, release build compiled with gcc-5.1. Still not sure if the increase in complexity is worth it though...

"bench float.exe" @e4b722
+----------------+--------------------+--------------------+
| Polygon        | earcut             | libtess2           |
+----------------+--------------------+--------------------+
| bad_hole       |       158675 ops/s |        75410 ops/s |
| building       |      1634979 ops/s |       162169 ops/s |
| degenerate     |      3771630 ops/s |       237986 ops/s |
| dude           |       108349 ops/s |        40060 ops/s |
| empty_square   |      2951560 ops/s |       201882 ops/s |
| water          |         1189 ops/s |          249 ops/s |
| water2         |         1302 ops/s |         1525 ops/s |
| water3         |        39756 ops/s |        17798 ops/s |
| water3b        |       397352 ops/s |       110745 ops/s |
| water4         |         4984 ops/s |         3198 ops/s |
| water_huge     |           88 ops/s |           87 ops/s |
| water_huge2    |           42 ops/s |          122 ops/s |
+----------------+--------------------+--------------------+

"bench integer.exe" @4759f6
+----------------+--------------------+--------------------+
| Polygon        | earcut             | libtess2           |
+----------------+--------------------+--------------------+
| bad_hole       |       188997 ops/s |        76098 ops/s |
| building       |      1770977 ops/s |       162322 ops/s |
| degenerate     |      4059780 ops/s |       238876 ops/s |
| dude           |       113823 ops/s |        40106 ops/s |
| empty_square   |      3170156 ops/s |       202025 ops/s |
| water          |         1442 ops/s |          249 ops/s |
| water2         |         1530 ops/s |         1544 ops/s |
| water3         |        43668 ops/s |        17928 ops/s |
| water3b        |       474675 ops/s |       110404 ops/s |
| water4         |         5880 ops/s |         3183 ops/s |
| water_huge     |          118 ops/s |           85 ops/s |
| water_huge2    |           53 ops/s |          119 ops/s |
+----------------+--------------------+--------------------+

@mourner @kkaefer what do you wanna do with it?

@mourner
Copy link
Member

mourner commented Mar 15, 2018

I'm not sure, the fraction logic indeed complicates the code considerably... @jfirebaugh have an opinion?

@mourner mourner reopened this Mar 15, 2018
@mrgreywater
Copy link
Collaborator Author

mrgreywater commented Mar 15, 2018

The thing about the fraction logic is that theoretically it could be mostly removed here. Instead one could simply do the (floored) division, and work on the integer grid while finding a hole connection (basically bresenham's line rasterization). With the tanCur/tanMin calculation, it could either stay as is, since it doesn't seem overly complicated, or could be converted into some approximated atan2 angle instead.
While both of these things would work, they would also yield slightly different results than with the floating point math from the original javascript library, and I really like the fact that they're both returning the exact same output right now, especially for unit tests.

Alternatively one could also adapt the javascript library to use integer math depending on the input, but optimization would highly depend on the used jit engine and it's type tagging.

@mrgreywater
Copy link
Collaborator Author

Closing this, as there doesn't seem to be much performance gain to be had. It's especially not worth it given the increased code complexity. There may be a use case for micro-controllers without FPU, but unless somebody requires triangulation on a device like that, this isn't worth it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants