More precise & up to 2-5x faster STL output #4643

ochafik · 2023-05-19T22:41:41Z

Fixes #4316 (and the precision part of #2651)

sphere(10, $fn=1000); now renders 5.3x faster in ASCII STL and 2.5x faster in binary STL (tested on an M1 Mac).

Notes:

- Uses double-conversion's very fast ToShortestSingle to format floats (as suggested by @thehans in the thread below), which also guarantees that parsing gives back the exact same float value.
  - As a result, ASCII STL output now has the same precision as the binary format (reading it is guaranteed to yield the exact same float coordinates).
  - It also means we no longer need to parse strings after formatting, for further speedups.
- Uses an indexed mesh:
  - Added a tesselate_faces overload that outputs the intermediate indexed mesh instead of wasting time building a PolySet.
  - For ASCII export, the result of each vertex's toString is cached.
- Both overloads of tesselate_faces are also faster because we now skip the indexed faces stage (made possible by making the Reindexer's points vector grow incrementally, so it is available for lookups without overhead).
- Skips expensive, superfluous tests in STL output (tessellation guarantees there are no duplicate vertices in each polygon).
- Batches vector outputs in binary STL mode for extra speedups.
- Preallocation of polygon vectors in the PolySet append_poly / append_vertex flow yields up to 10% render speedups on its own and was merged independently in "Avoid excessive reallocations in PolySet" #4642 (this PR's branch has now been rebased).
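To illustrate the indexed-mesh point, here is a minimal sketch of caching each vertex's string once and emitting facets by index. The `IndexedMesh` struct and the function names are hypothetical stand-ins, not the PR's actual types, and `%.9g` is used in place of double-conversion's ToShortestSingle so the example needs only the standard library.

```cpp
#include <array>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for the PR's indexed mesh: unique vertices plus
// triangles that refer to them by index.
struct IndexedMesh {
  std::vector<std::array<float, 3>> vertices;
  std::vector<std::array<int, 3>> triangles;
};

// Format one vertex. %.9g is an assumption standing in for ToShortestSingle.
std::string vertexToString(const std::array<float, 3>& v) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%.9g %.9g %.9g", v[0], v[1], v[2]);
  return buf;
}

// Emit the "outer loop" part of each facet, formatting every unique vertex
// exactly once no matter how many triangles share it.
std::string writeAsciiLoops(const IndexedMesh& mesh) {
  std::vector<std::string> cache;
  cache.reserve(mesh.vertices.size());
  for (const auto& v : mesh.vertices) cache.push_back(vertexToString(v));

  std::string out;
  for (const auto& tri : mesh.triangles) {
    out += "outer loop\n";
    for (int index : tri) out += "  vertex " + cache[index] + "\n";
    out += "endloop\n";
  }
  return out;
}
```

In a closed triangle mesh each vertex is shared by several triangles, so the float-to-string conversions drop from three per triangle to one per unique vertex.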

t-paul · 2023-05-20T16:51:08Z

src/io/export_stl.cc

-  return v;
+  auto cstr = vertexString.c_str();
+  return {
+    strtof(cstr, (char**)&cstr), 


Does that work in locales having something else than '.' as decimal separator? The man page says it's using locale settings.

Ah! Luckily there's a setlocale(LC_NUMERIC, "C"); in the parent of all append_stl functions that's meant to force '.' as decimal separator for both ways of parsing in this context (weirdly tho, I tried to force the locale to "fr_FR.UTF-8" there and the output didn't budge - I was expecting commas 🧐)
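For reference, a minimal sketch of the pattern under discussion: pin LC_NUMERIC to "C" once around a batch of strtof calls so that '.' is always the decimal separator. `CNumericLocale`, `parseVertex` and `parseVertices` are hypothetical names for illustration, not functions from export_stl.cc.

```cpp
#include <array>
#include <clocale>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical RAII guard: switches LC_NUMERIC to "C" and restores the
// previous setting when it goes out of scope.
struct CNumericLocale {
  std::string saved;
  CNumericLocale() {
    const char* current = std::setlocale(LC_NUMERIC, nullptr);
    saved = current ? current : "C";
    std::setlocale(LC_NUMERIC, "C");
  }
  ~CNumericLocale() { std::setlocale(LC_NUMERIC, saved.c_str()); }
};

// Parse "x y z" with strtof; returns false if fewer than three numbers exist.
bool parseVertex(const std::string& text, std::array<float, 3>& out) {
  const char* cursor = text.c_str();
  for (float& coord : out) {
    char* end = nullptr;
    coord = std::strtof(cursor, &end);
    if (end == cursor) return false;  // nothing consumed: no number here
    cursor = end;
  }
  return true;
}

// The locale is set once for the whole batch, not once per vertex.
std::vector<std::array<float, 3>> parseVertices(const std::vector<std::string>& lines) {
  CNumericLocale guard;
  std::vector<std::array<float, 3>> result;
  for (const auto& line : lines) {
    std::array<float, 3> v{};
    if (parseVertex(line, v)) result.push_back(v);
  }
  return result;
}
```

Keeping the setlocale calls outside the per-vertex loop keeps any cost of switching locales independent of mesh size.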

Keep in mind that setlocale has been shown to be incredibly slow on Windows if you enable localization in preferences. #2744

@thehans oh, does that setlocale currently make ASCII STL export slower on Windows? (sounds like something exciting to follow up on, although this PR shouldn't make things worse in that respect)

I don't know how much it is specifically affecting STL output speed.
IIRC the test case there was particularly slow during CSG generation (sample code had tons of polyhedron data, thousands of vertex coordinates that require double to string conversion).

That was the impetus for adding double-conversion library, and so any time we print a number through our Value variant class it now goes through that. But I don't think any of the export code was ever touched as part of those changes.

There has been some discussion more recently about switching that out for libfmt aka {fmt} which is supposed to be even faster than double-conversion.
I spent quite some time a few months ago attempting to swap out the libraries but didn't quite get it to a fully working/presentable state. I might revisit it soon if I find the energy.

You might also try casting double to float and calling ToShortestSingle

edit: I'm surprised it wasn't faster than the standard lib functions, since that's kinda the whole point of it. Are you testing on Release build (i.e. with optimizations)?

Re/ doubleconv it’s weird indeed, maybe I’ve misused it (tried same Params as the Value code, then more naive params), or the mac stdlib is faster than average (might try this benchmark if I have a chance), or… the way I naively stripped trailing zeros negated its performance? (Hadn’t seen ToShortestSingle, will try next)

From your gist trimTrailingZeros doesn't appear to actually get called anywhere.

Anyways, simply stripping zeroes from the end is not even remotely correct behavior, in the event that:

1. There is no decimal point: "1000" becomes "1" (also "0" would become an empty string).
2. There is an exponent:
   a) "1.000000e6" would not get trimmed at all
   b) "1.000000e10" would have its exponent trimmed to "1.000000e1" (see "echo() formating error" #2950)
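To make those failure cases concrete, here is a sketch of a trimmer that only touches zeros in the fractional part of the mantissa. `trimFractionZeros` is a hypothetical helper written for this illustration; it is not the code from the gist or from the PR.

```cpp
#include <string>

// Remove trailing zeros from the fractional part only, leaving integers and
// exponents intact. Assumes the input is an already well-formed number string.
std::string trimFractionZeros(const std::string& s) {
  const std::size_t dot = s.find('.');
  if (dot == std::string::npos) return s;  // "1000" and "0" stay untouched

  // The mantissa ends at the exponent marker, or at the end of the string.
  std::size_t mantissaEnd = s.find_first_of("eE", dot);
  if (mantissaEnd == std::string::npos) mantissaEnd = s.size();

  // Walk back over zeros, but never past the first digit after the '.'.
  std::size_t keep = mantissaEnd;
  while (keep > dot + 1 && s[keep - 1] == '0') --keep;
  if (keep == dot + 1) keep = dot;  // fraction was all zeros: drop the '.' too

  return s.substr(0, keep) + s.substr(mantissaEnd);
}
```

With this, "1000" stays "1000", "1.000000e6" becomes "1e6", and "1.000000e10" becomes "1e10" with its exponent untouched.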

The more I think about it, I feel that using ToShortestSingle would make the most sense here, and would allow for simplifying the code much further:

ToShortestSingle creates the shortest exact representation of the input value (a "round trip" guarantee), so there is no need to worry about trailing zeros (and no need to specify a precision directly).

Additionally, all of the code going from double to string and back was there so that normal calculations would match up after reading back the decimal ASCII-converted vertex positions (#1853).
If you downcast double to float and calculate normals from that, then there's no need to perform the round trip (string back to float) ourselves, because ToShortestSingle loses no further precision after the downcast.
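A rough sketch of that suggestion, assuming plain std::array vectors rather than the Eigen types used in the codebase (`Vec3f`, `toFloat` and `facetNormal` are hypothetical names): downcast first, then compute the normal from the float coordinates that will actually be written.

```cpp
#include <array>
#include <cmath>

using Vec3f = std::array<float, 3>;

// Downcast a double-precision vertex to the float precision STL stores.
Vec3f toFloat(const std::array<double, 3>& v) {
  return {static_cast<float>(v[0]), static_cast<float>(v[1]),
          static_cast<float>(v[2])};
}

// Unit normal of triangle (a, b, c), computed from the already-downcast
// vertices. Returns the zero vector for a degenerate triangle.
Vec3f facetNormal(const Vec3f& a, const Vec3f& b, const Vec3f& c) {
  const Vec3f u{b[0] - a[0], b[1] - a[1], b[2] - a[2]};
  const Vec3f v{c[0] - a[0], c[1] - a[1], c[2] - a[2]};
  const Vec3f n{u[1] * v[2] - u[2] * v[1],
                u[2] * v[0] - u[0] * v[2],
                u[0] * v[1] - u[1] * v[0]};
  const float len = std::sqrt(n[0] * n[0] + n[1] * n[1] + n[2] * n[2]);
  if (len == 0.0f) return {0.0f, 0.0f, 0.0f};
  return {n[0] / len, n[1] / len, n[2] / len};
}
```

Because the formatted strings parse back to exactly these floats, a reader that recomputes normals from the file starts from the same inputs as the writer did.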

For the record, a %.9g format would have had the same property of compact format & round trip guarantee (and simpler code).

But double-conversion is indeed much faster (earlier naive attempt was costly in terms of builder usage, not to mention the bogus trailing zeros logic - note: was inlined / not in the unused helper I'd copied over), so I've switched toString to it, PTAL!
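The round-trip property of %.9g mentioned above can be checked with the standard library alone. This is only a sketch of that fallback, not the double-conversion path the PR ended up using, and `formatFloat` / `roundTrips` are hypothetical names.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Format a float with nine significant digits, which is enough to identify
// any IEEE 754 single-precision value uniquely.
std::string formatFloat(float value) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%.9g", static_cast<double>(value));
  return buf;
}

// True when parsing the formatted text gives back the identical float.
bool roundTrips(float value) {
  return std::strtof(formatFloat(value).c_str(), nullptr) == value;
}
```

Note that %.9g round-trips but is not always the shortest form: 0.1f is printed as 0.100000001, where a shortest-representation formatter such as ToShortestSingle would be expected to print 0.1.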

tests/data/scad/stl/stl-import-limits.stl

thehans · 2023-05-23T01:24:33Z

src/io/export_stl.cc

-std::string toString(const Vector3d& v)
+/* Define values for double-conversion library. */
+#define DC_BUFFER_SIZE (128)
+#define DC_FLAGS (double_conversion::DoubleToStringConverter::UNIQUE_ZERO | double_conversion::DoubleToStringConverter::EMIT_POSITIVE_EXPONENT_SIGN)


I know this is basically copied verbatim from Value.cc, but I would remove the EMIT_POSITIVE_EXPONENT_SIGN flag here. I was planning to remove that one from the original code too. Also meant to convert all these to constexpr instead of defines.

See this snippet from my unmerged PR for the converted values. I had also tweaked a couple of other values, but I don't know if those are relevant to ToShortestSingle (or ToShortest) without reviewing the double-conversion "docs" (header file comments).

I basically halted work on that PR to start implementing libfmt, but that ended up being much more work than anticipated.

Ah, cool PR! I've dropped EMIT_POSITIVE_EXPONENT_SIGN and will keep the defines consistent-ish for now (unless you'd prefer I constexprify both places in this PR).

Re/ libfmt, once all compilers fully support C++20 (clang 14 doesn't seem there yet), would we get enough of its features through std::format?

thehans · 2023-05-23T23:52:24Z

tests/data/scad/stl/stl-import-limits.stl

+      vertex 0 1.2345679 1.2345679
+    endloop
+  endfacet
+  facet normal 0 0 -1.471716e-7


Still not a unit vector

Oh, have now given isZero a zero tolerance to fix this.

thehans · 2023-05-24T00:07:56Z

tests/regression/stlexport/stl-import-export-expected.stl

Is this supposed to be empty?

Sorry it wasn't 😅 (probably still had a bit of fever)

tests/regression/stlexport/stl-import-limits-expected.stl

thehans · 2023-05-24T00:17:49Z

I guess I'm still a bit confused about what exactly the ASCII STL tests are checking for, since various discrepancies are still passing.
Should we be doing a full textual diff of actual vs expected stl file output, to better ensure correct output?

ochafik · 2023-05-27T18:05:39Z

@thehans I've restructured the test so that stl-import-export-limits.scad both imports the expected output of stl-export-limits.scad and is checked against that same file (I modified the CMake and Python test code to accept an explicit --expected-file arg), to minimize the changes needed while still benefiting from TEST_GENERATE=1

This allows using getArray repeatedly during construction of the reindexer without incurring quadratic costs because of array rebuilding

IndexedFace intermediate stage wasn't needed (possible now that Reindexer's vector is usable during construction)

- use strtof instead of istringstream!
- skip tests about polygons w/ duplicate vertices (guaranteed not to happen after tesselation)
- cache tostring / fromstring results for each unique vertex
- normals are now... normalized (-0 -> 0)

…t-limits

* build PolySet directly from manifold mesh
* use small_vector for faces
* faster stl export (#4643)

pca006132 · 2023-12-02T08:11:41Z

Some optimizations are already implemented; the remaining ones should be done as part of the PolySet cleanup.

ochafik mentioned this pull request May 20, 2023

ASCII STL export is slow #4316

Open

t-paul reviewed May 20, 2023


ochafik changed the title ~~Up to 2-3x faster STL output~~ [WIP] Up to 2-3x faster STL output May 21, 2023

ochafik changed the title ~~[WIP] Up to 2-3x faster STL output~~ [WIP] More precise & up to 2-5x faster STL output May 22, 2023

ochafik changed the title ~~[WIP] More precise & up to 2-5x faster STL output~~ More precise & up to 2-5x faster STL output May 22, 2023

ochafik marked this pull request as ready for review May 22, 2023 17:13

thehans reviewed May 23, 2023

tests/data/scad/stl/stl-import-limits.stl (outdated)

thehans reviewed May 23, 2023


thehans reviewed May 24, 2023

tests/regression/stlexport/stl-import-limits-expected.stl (outdated)

ochafik mentioned this pull request May 25, 2023

Integrate non-CSG speedup initiatives [Do not merge] #4654

Draft

ochafik added 15 commits May 30, 2023 18:18

Reindexer::lookup updates vector on the go.

fb6a94a

This allows using getArray repeatedly during construction of the reindexer without incurring quadratic costs because of array rebuilding

Improve tesselate_faces + add indexed output overload

e8d9440

IndexedFace intermediate stage wasn't needed (possible now that Reindexer's vector is usable during construction)

Use indexed triangle tesselation in PolySet->Manifold conversion

1475e2d

Support IndexedTriangleMesh in ExportMesh

5071a6a

Add a few STL export tests (degeneracies & precision limits)

55b43f7

STL: Format floats w/ %.9g to guarantee parsing of same number

72aa20e

Update STL export expectations w/ proper precision

4839400

STL: use double-conversion's ToShortestSingle to format vectors

d6fdf5c

Speed up Binary STL output

23f1888

Update fastcsg-remesh-cube-2-expected.stl

361a7d4

Restore static assert in export_stl

a57cef1

Update stl-import-limits.stl

1b8a504

Remove EMIT_POSITIVE_EXPONENT_SIGN from stl double-conversion flags

8791f07

Support override of expected file in tests + use for stl-import-expor…

68729aa

…t-limits

ochafik force-pushed the fast-stl3 branch from d3c353b to 68729aa Compare May 30, 2023 17:18

Merge remote-tracking branch 'origin/master' into fast-stl3

238a9b1

pca006132 mentioned this pull request Nov 29, 2023

PolySet needs cleanup #4851

Closed

15 tasks

pca006132 added a commit to pca006132/openscad that referenced this pull request Dec 1, 2023

faster stl export (openscad#4643)

87a24e3

pca006132 mentioned this pull request Dec 1, 2023

polyset cleanup 2 #4867

Merged

kintel pushed a commit that referenced this pull request Dec 1, 2023

polyset cleanup 2 (#4867)

27a7819

* build PolySet directly from manifold mesh
* use small_vector for faces
* faster stl export (#4643)

pca006132 closed this Dec 2, 2023
