Speed up script evaluation by prehashing & interning identifier strings #4648
base: master
Conversation
[ Without reviewing the implementation. ] Precalculating the hashes seems entirely plausible. From what you're saying, it sounds like you're thinking of this as caching, and from a quick glance it looks like you aren't touching the parser. You could do this precalculation at parse time. In theory you could resolve lexical-scope (non-$) variable and function references to an offset into a stack, no hashing required. Or, because of context capture, more likely you'd want to resolve into an index into a stack of contexts and an index into that context. $ variables probably require hashing. Is this worth the complexity? For most OpenSCAD programs, probably not, because the rule of thumb is that execution time is trivial. For programs using complex libraries like BOSL2, yeah, probably worthwhile.
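The idea above can be sketched concretely. This is a minimal, hypothetical illustration (none of these names are from the PR or from OpenSCAD): at parse time, a lexical variable reference is resolved to a (scope depth, slot index) pair, so evaluation indexes directly into a stack of scopes with no hashing at all.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical sketch: resolve a non-$ variable reference at parse time to a
// (depth, index) pair relative to the innermost scope.
struct Slot { int depth; int index; };

struct Scope {
  std::vector<std::string> names;  // declaration order = slot order
};

// Walk the scope chain innermost-first; depth counts scopes outward from the
// innermost one, index is the slot within the scope that declares the name.
std::optional<Slot> resolve(const std::vector<Scope>& chain,
                            const std::string& name) {
  for (int depth = 0; depth < (int)chain.size(); ++depth) {
    const Scope& s = chain[chain.size() - 1 - depth];
    for (int i = 0; i < (int)s.names.size(); ++i) {
      if (s.names[i] == name) return Slot{depth, i};
    }
  }
  return std::nullopt;  // unresolved: would fall back to dynamic ($) lookup
}
```

Evaluation would then fetch `scopes[scopes.size() - 1 - slot.depth].values[slot.index]` with no string comparison at all, which is what makes this attractive despite the added complexity.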
@jordanbrown0 Thanks for the rapid feedback! I've updated the poor wording; the code indeed precomputes the hashes when building the
Ah of course, thanks! I was a bit scared of magical semantics but as you point out it's probably limited to $ vars.
I'll take a look, not sure how much more complex it would need to be.
Agree for most programs, but we do get repeated complaints from people who want faster previews, and I tend to be completely constrained by evaluation time now that I've tasted the sweet BOSL2 fruit (and that I've done all I could on the rendering front, e.g. integrating Manifold, fast-csg). The model linked from this PR takes 37sec to preview with the PR's change vs. 48sec before... then only 6 extra seconds to render w/ Manifold 🤦♂️.
add missing include
src/core/Context.cc
Outdated
{
  if (is_config_variable(name)) {
    return session()->lookup_special_function(name, loc);
  if (name.resolved_builtin_function) {
It does feel a little odd to add builtin function caching to the `Identifier` class. Would a separate function cache make sense, just for the sake of separation of concerns?
Btw, the rest of this PR looks good, modulo test failures, so the quickish way forward could be to split out the function caching into a separate PR to move this one forward.
@kintel sorry it took me so long to get back to this, fixed tests + removed the function caching bit as suggested
    return session()->lookup_special_function(name, loc);
  }
  for (const Context *context = this; context != nullptr; context = context->getParent().get()) {
    boost::optional<CallableFunction> result = context->lookup_local_function(name, loc);
    if (result) {
      return result;
      return std::move(result);
I wouldn't worry about this, but just a note to whoever is reading this that at some point we should probably go through our usage of std::move(). In this case, moving shouldn't be needed due to NRVO copy elision, although the effect is likely negligible.
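For readers unfamiliar with the point being made: returning a named local by value already triggers NRVO, and when NRVO doesn't apply the compiler still performs an implicit move, so the explicit `std::move` is redundant and can even inhibit elision. A small standalone illustration (not PR code):

```cpp
#include <utility>

// Probe counts copies so we can observe that neither return style copies.
struct Probe {
  static inline int copies = 0;  // C++17 inline static member
  Probe() = default;
  Probe(const Probe&) { ++copies; }
  Probe(Probe&&) noexcept {}
};

Probe make_plain() { Probe p; return p; }            // NRVO, or implicit move
Probe make_moved() { Probe p; return std::move(p); } // forces move, disables NRVO
```

Both functions avoid copies, but `make_plain` lets the compiler elide even the move, which is why guidelines (and `-Wpessimizing-move` warnings) discourage `return std::move(local)`.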
auto it = interned_names_with_hash.find(name);
if (it == interned_names_with_hash.end()) {
  it = interned_names_with_hash.emplace(name, std::make_pair(name, std::hash<std::string>{}(name))).first;
}
Nit: I think all our supported compilers have C++17 now, so you should be able to do this, if appropriate:
interned_names_with_hash.try_emplace(name, std::make_pair(name, std::hash<std::string>{}(name)))
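To spell out the suggestion: `try_emplace` combines the find and the insert-if-absent into a single probe, and only constructs the mapped value when the key is actually missing. A self-contained sketch of the pattern (map name is illustrative, not the PR's member):

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Returns {inserted_first_time, inserted_second_time} to show try_emplace
// semantics: the second call finds the existing entry and constructs nothing.
std::pair<bool, bool> try_emplace_demo(const std::string& name) {
  std::unordered_map<std::string, std::pair<std::string, size_t>> interned;
  const size_t h = std::hash<std::string>{}(name);
  auto [it1, ins1] = interned.try_emplace(name, name, h);  // inserts
  auto [it2, ins2] = interned.try_emplace(name, name, h);  // no-op lookup
  return {ins1, ins2};
}
```

Note that `try_emplace` forwards its trailing arguments to the mapped type's constructor, so `std::make_pair` isn't needed at the call site.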
bool is_config_variable_ = false;
std::string *interned_name_ptr = NULL;
size_t hash = 0;
Location loc;
Another nit: We don't really have a convention for private variables. Old code used no special syntax and prefers prefixing with `this->` on access. Newer code seems to have adopted a `_` suffix as popularized by certain style guides. The `_` variant makes code less ambiguous to read, so I'm very slightly tending towards it myself. When modifying existing classes, we should, IMO, try to adhere to one of the two though.
For some reason, one of the tests seems to crash on Windows. Needs looking into.
This PR substantially speeds up variable bindings & lookups by interning variable names (allows comparing string pointers instead of string contents) and precomputing their hash, using a new `Identifier` class that's built during AST creation and replaces most `std::string` name uses.

Result: a model like poly_attached_inlined.scad (heavy scripting w/ BOSL2) gets 10% faster preview (measured w/ `-o out.echo`).

Benchmark command-line & results:

    hyperfine --warmup 1 --runs 1 -L build buildMaster,build \
      '{build}/OpenSCAD.app/Contents/MacOS/OpenSCAD poly_attached_inlined.scad -o out.echo'

Notes:
Some extra speedup is had by caching the resolution of builtin functions inside the `Identifier` class (caching other kinds of values is more complex, w/ unclear payoffs so far).
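The core technique can be sketched independently of the PR's actual `Identifier` class. In this hypothetical minimal version (all names invented here), each identifier is interned once into a node-based container, so its address is stable; equality then compares pointers, and hash maps reuse the cached hash instead of rehashing the string:

```cpp
#include <functional>
#include <string>
#include <unordered_set>

// Minimal interner sketch. std::unordered_set is node-based, so pointers to
// stored strings stay valid across inserts (only iterators are invalidated
// on rehash), which is what makes the cached pointer safe to keep.
class Interner {
  std::unordered_set<std::string> pool_;
public:
  struct Id {
    const std::string* name;  // stable pointer into pool_
    size_t hash;              // precomputed once, reused by lookups
    // Interned strings are unique, so pointer equality == string equality.
    bool operator==(const Id& o) const { return name == o.name; }
  };
  Id intern(const std::string& s) {
    const std::string& stored = *pool_.insert(s).first;
    return Id{&stored, std::hash<std::string>{}(s)};
  }
};
```

A hash map keyed on `Id` would then use `id.hash` directly and compare keys by pointer, turning each binding/lookup into integer comparisons instead of a string hash plus character-by-character compares.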