Floating Point vs Integer #27

mrgreywater · 2016-02-12T14:06:10Z

earcut.hpp supports both floating point and integer as vertices type, but isn't using the fastest approach in either case.

When using floating point data, lots of the min/max calculations are done using conditions, which slows them down, as x86/AMD64 both have branchless floating point min/max instructions which won't be used in that case. (see https://github.com/mapbox/earcut.hpp/blob/master/include/earcut.hpp#L286)

When using integer data, the code is casting to double quite frequently which slows things down aswell. Instead of casting to double, when expecting an overflow, it should cast to the next bigger integer type, for example with boost::int_t<size>

The text was updated successfully, but these errors were encountered:

mourner · 2016-02-12T14:11:53Z

You're right about some of the lines being unoptimized — things like min/max lines were a direct JS ports, certainly they can be made faster with a C++-specific approach.

Regarding integer casts — we'd like to keep the library minimal, avoiding dependencies like boost.

If you have any particular quick fixes in mind, we'll be happy to see more PRs!

mrgreywater · 2016-02-12T14:12:43Z

It is already depending on boost https://github.com/mapbox/earcut.hpp/blob/master/include/earcut.hpp#L12
Either way, choosing the best type to avoid overflow can be done using some SFINAE, using boost for that is no requirement, I just propose to use it as it's already a dependency.

mourner · 2016-02-12T14:19:06Z

Oops, right :) I'm not knowledgeable enough to comment meaningfully here.

jfirebaugh · 2016-02-12T18:23:29Z

Yes, the manual min/maxing should be replaced with std::min/max, and it would be nice to avoid floating point calculations where it's know that integer calculations suffice.

I'd like to figure out a way to eliminate the boost::pool dependency (#10 (comment)) and have zero dependencies other than the standard library.

mourner · 2016-04-14T14:40:13Z

@mrgreywater is this issue still relevant?

mrgreywater · 2016-04-14T16:23:36Z

Yes, I think it is. Not converting from integer to floating point, and only doing integer arithmetic when working with integer vertices should increase both precision and performance. It's not strictly required though. I may look into it the next week or so, when I got more time.

mrgreywater · 2018-03-10T17:42:57Z

I added integer math test-wise here:
https://github.com/mrgreywater/earcut.hpp/tree/integer-math

To be honest; performance seems to be the same on a modern cpu (might be a different story on a embedded systems). So since this would only increase code complexity, I'd rather not push it to master, unless there's some other reason to have it (e.g numerical stability issues).

Benchmark before, i5-5200U:

"bench float.exe"
+----------------+--------------------+--------------------+
| Polygon        | earcut             | libtess2           |
+----------------+--------------------+--------------------+
| bad_hole       |       111421 ops/s |        50667 ops/s |
| building       |      1197017 ops/s |       111320 ops/s |
| degenerate     |      2742066 ops/s |       167855 ops/s |
| dude           |        79961 ops/s |        25916 ops/s |
| empty_square   |      2132462 ops/s |       134067 ops/s |
| water          |          815 ops/s |          148 ops/s |
| water2         |          890 ops/s |          917 ops/s |
| water3         |        28188 ops/s |        11434 ops/s |
| water3b        |       279092 ops/s |        74984 ops/s |
| water4         |         3240 ops/s |         1929 ops/s |
| water_huge     |           53 ops/s |           52 ops/s |
| water_huge2    |           26 ops/s |           68 ops/s |
+----------------+--------------------+--------------------+

"bench integer.exe"
+----------------+--------------------+--------------------+
| Polygon        | earcut             | libtess2           |
+----------------+--------------------+--------------------+
| bad_hole       |       121200 ops/s |        50954 ops/s |
| building       |      1284872 ops/s |       112155 ops/s |
| degenerate     |      2857988 ops/s |       164828 ops/s |
| dude           |        80475 ops/s |        26487 ops/s |
| empty_square   |      2338438 ops/s |       135683 ops/s |
| water          |          954 ops/s |          153 ops/s |
| water2         |         1028 ops/s |          858 ops/s |
| water3         |        30108 ops/s |        11431 ops/s |
| water3b        |       320514 ops/s |        74015 ops/s |
| water4         |         3758 ops/s |         1929 ops/s |
| water_huge     |           71 ops/s |           51 ops/s |
| water_huge2    |           33 ops/s |           70 ops/s |
+----------------+--------------------+--------------------+

edit: re-executed bench with high-performance profile

mrgreywater · 2018-03-12T16:37:48Z

So on my main rig, I actually see some performance benefit across the board with integer-math: Here's the bench of an i5 3570K, release build compiled with gcc-5.1. Still not sure if the increase in complexity is worth it though...

"bench float.exe" @e4b722
+----------------+--------------------+--------------------+
| Polygon        | earcut             | libtess2           |
+----------------+--------------------+--------------------+
| bad_hole       |       158675 ops/s |        75410 ops/s |
| building       |      1634979 ops/s |       162169 ops/s |
| degenerate     |      3771630 ops/s |       237986 ops/s |
| dude           |       108349 ops/s |        40060 ops/s |
| empty_square   |      2951560 ops/s |       201882 ops/s |
| water          |         1189 ops/s |          249 ops/s |
| water2         |         1302 ops/s |         1525 ops/s |
| water3         |        39756 ops/s |        17798 ops/s |
| water3b        |       397352 ops/s |       110745 ops/s |
| water4         |         4984 ops/s |         3198 ops/s |
| water_huge     |           88 ops/s |           87 ops/s |
| water_huge2    |           42 ops/s |          122 ops/s |
+----------------+--------------------+--------------------+

"bench integer.exe" @4759f6
+----------------+--------------------+--------------------+
| Polygon        | earcut             | libtess2           |
+----------------+--------------------+--------------------+
| bad_hole       |       188997 ops/s |        76098 ops/s |
| building       |      1770977 ops/s |       162322 ops/s |
| degenerate     |      4059780 ops/s |       238876 ops/s |
| dude           |       113823 ops/s |        40106 ops/s |
| empty_square   |      3170156 ops/s |       202025 ops/s |
| water          |         1442 ops/s |          249 ops/s |
| water2         |         1530 ops/s |         1544 ops/s |
| water3         |        43668 ops/s |        17928 ops/s |
| water3b        |       474675 ops/s |       110404 ops/s |
| water4         |         5880 ops/s |         3183 ops/s |
| water_huge     |          118 ops/s |           85 ops/s |
| water_huge2    |           53 ops/s |          119 ops/s |
+----------------+--------------------+--------------------+

@mourner @kkaefer what do you wanna do with it?

mourner · 2018-03-15T07:03:58Z

I'm not sure, the fraction logic indeed complicates the code considerably... @jfirebaugh have an opinion?

mrgreywater · 2018-03-15T11:35:33Z

The thing about the fraction logic is that theoretically it could be mostly removed here. Instead one could simply do the (floored) division, and work on the integer grid while finding a hole connection (basically bresenham's line rasterization). With the tanCur/tanMin calculation, it could either stay as is, since it doesn't seem overly complicated, or could be converted into some approximated atan2 angle instead.
While both of these things would work, they would also yield slightly different results than with the floating point math from the original javascript library, and I really like the fact that they're both returning the exact same output right now, especially for unit tests.

Alternatively one could also adapt the javascript library to use integer math depending on the input, but optimization would highly depend on the used jit engine and it's type tagging.

mrgreywater · 2021-07-11T19:02:37Z

Closing this, as there doesn't seem to be much performance gain to be had. It's especially not worth it given the increased code complexity. There may be a use case for micro-controllers without FPU, but unless somebody requires triangulation on a device like that, this isn't worth it.

mourner added the question label Feb 12, 2016

mourner added the enhancement label Feb 12, 2016

mrgreywater mentioned this issue Feb 13, 2016

Multiple minor floating point fixes #30

Merged

mrgreywater closed this as completed Mar 10, 2018

mourner reopened this Mar 15, 2018

mrgreywater closed this as completed Jul 11, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Floating Point vs Integer #27

Floating Point vs Integer #27

mrgreywater commented Feb 12, 2016

mourner commented Feb 12, 2016

mrgreywater commented Feb 12, 2016

mourner commented Feb 12, 2016

jfirebaugh commented Feb 12, 2016

mourner commented Apr 14, 2016

mrgreywater commented Apr 14, 2016

mrgreywater commented Mar 10, 2018 •

edited

Loading

mrgreywater commented Mar 12, 2018 •

edited

Loading

mourner commented Mar 15, 2018

mrgreywater commented Mar 15, 2018 •

edited

Loading

mrgreywater commented Jul 11, 2021

Floating Point vs Integer #27

Floating Point vs Integer #27

Comments

mrgreywater commented Feb 12, 2016

mourner commented Feb 12, 2016

mrgreywater commented Feb 12, 2016

mourner commented Feb 12, 2016

jfirebaugh commented Feb 12, 2016

mourner commented Apr 14, 2016

mrgreywater commented Apr 14, 2016

mrgreywater commented Mar 10, 2018 • edited Loading

mrgreywater commented Mar 12, 2018 • edited Loading

mourner commented Mar 15, 2018

mrgreywater commented Mar 15, 2018 • edited Loading

mrgreywater commented Jul 11, 2021

mrgreywater commented Mar 10, 2018 •

edited

Loading

mrgreywater commented Mar 12, 2018 •

edited

Loading

mrgreywater commented Mar 15, 2018 •

edited

Loading