Generic tail recursion #3020

Merged: 11 commits, Sep 6, 2019

Conversation

jschobben

This re-implements tail recursion of functions: any self-call in tail position now benefits from tail-call elimination, rather than only calls in functions that follow a very specific structure. In particular, let/assert/echo and nested ternary operators can now be used freely.

The decision to use tail recursion is now made at run time rather than at parse time as before, and per function call rather than per function declaration.

I tried to separate it into a few commits. Here's a summary of the changes:

  • Remove FunctionTailRecursion and its factory method
  • For a few Expression subclasses, factor out a method evaluateStep from evaluate; it returns the next Expression instead of evaluating it
  • Use a nested while loop for function evaluation; the inner loop iteratively follows a single execution path where possible, and falls back to a regular recursive call otherwise (for binary operators, etc.)
  • Add a few tests
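
A minimal sketch of the evaluateStep/trampoline idea described above (hypothetical, simplified names; the real Expression hierarchy and Context handling in the OpenSCAD codebase are much richer):

```cpp
// Stripped-down sketch of the nested-loop evaluation: each evaluateStep
// either produces a final value (and returns nullptr) or hands back the
// *next* expression to follow, so the evaluator loops instead of growing
// the C++ stack on every self-call.
struct Expression {
    virtual ~Expression() = default;
    // Returns the next expression to evaluate, or nullptr when value is final.
    virtual const Expression* evaluateStep(long &value) const = 0;
};

struct Literal final : Expression {
    long v;
    explicit Literal(long v) : v(v) {}
    const Expression* evaluateStep(long &value) const override {
        value = v;
        return nullptr;  // terminal step
    }
};

// Stands in for a self-call in tail position: it updates its "context"
// (here just a counter) and returns itself, or the base-case expression.
struct TailSelfCall final : Expression {
    mutable long n;
    const Expression *baseCase;
    TailSelfCall(long n, const Expression *baseCase) : n(n), baseCase(baseCase) {}
    const Expression* evaluateStep(long &value) const override {
        if (n == 0) return baseCase;  // tail position: follow that branch
        --n;                          // the "call" just rebinds the argument
        return this;                  // and the loop continues with us
    }
};

// The evaluator's inner loop: follow a single execution path iteratively.
long evaluate(const Expression *e) {
    long value = 0;
    while (e) e = e->evaluateStep(value);
    return value;
}
```

With a `Literal base{42}` and a `TailSelfCall call{1000000, &base}`, `evaluate(&call)` runs in constant stack space, no matter how deep the "recursion" goes.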

A word on performance: I benchmarked using this silly example:
function speed(n=1000000) = n == 0 ? 42 : speed(n-1);
Initial results weren't too good; it was almost twice as slow as master, so I had to hunt down a few optimizations:

  • A custom function to prepare the new tail-call context, caching some data: FunctionCall::prepareTailCallContext
  • A typeid check + static_cast instead of a dynamic_cast
  • Reduced Context creation overhead (not in this PR, though; see PR Use shared_ptr for Context::document_path #3019)

Now the speed is more or less the same as on master.
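
For intuition (this is not code from the PR): once the self-call in tail position is executed iteratively, the benchmark above behaves like this constant-stack C++ loop:

```cpp
// Rough C++ equivalent of what tail-call elimination makes of the
// speed() benchmark: the tail self-call becomes a loop-variable update.
long speed(long n = 1000000) {
    while (n != 0) n -= 1;  // speed(n-1) in tail position -> loop step
    return 42;              // the n == 0 branch
}
```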

@t-paul (Member) left a comment:
This is great. Thanks for splitting up the patch, that helped a lot. I did some additional testing and that looks good too. @kintel, can you have an additional look at the context handling? I don't see any issues, but I'm still not 100% sure I know about all the details 😄.

@t-paul requested a review from kintel, August 25, 2019 18:44
@kintel (Member) commented Sep 6, 2019

LGTM!
It's a bit surprising that the typeid check + static_cast is actually faster than dynamic_pointer_cast, considering the latter even messes with the reference count.
(At some point we should consider adding performance regression tests, as I fear code like this may get refactored to look cleaner in the future.)

@kintel merged commit 48c794f into openscad:master, Sep 6, 2019
@nophead (Member) commented Sep 6, 2019 via email

@kintel (Member) commented Sep 6, 2019

Yeah, I guess dynamic_pointer_cast would need to mess with the ref count as well : /

@jschobben (Author) commented:
Thanks for merging this!

The typeid + static_cast being faster is a combination of two things: the (cheap) typeid check prevents doing more than one cast per loop, and dynamic_cast is slightly slower than static_cast. I'd say it's mostly the typeid check that matters here.
The downside is that typeid is not polymorphic: it only detects an exact match of the type.
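
A sketch of the pattern being discussed (hypothetical types; the real code dispatches over OpenSCAD's Expression subclasses): one cheap exact-type comparison gates a static_cast, so at most one cast runs per loop iteration, but a subclass would not be matched:

```cpp
#include <typeinfo>

struct Expression { virtual ~Expression() = default; };
struct TernaryOp : Expression {
    const Expression *chosenBranch = nullptr;
};
struct FancyTernaryOp : TernaryOp {};  // would NOT pass the typeid test

// typeid(*e) compares the exact dynamic type, so unlike dynamic_cast this
// never matches subclasses -- the non-polymorphic trade-off noted above.
const Expression* stepIfTernary(const Expression *e) {
    if (typeid(*e) == typeid(TernaryOp)) {
        return static_cast<const TernaryOp*>(e)->chosenBranch;
    }
    return nullptr;  // caller falls back to a regular recursive evaluate
}
```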

I was also thinking about how to add a performance test, since results depend on the hardware used. It will probably just be a matter of comparing the PR's performance against master, or something similar.

@t-paul (Member) commented Sep 7, 2019

I recently watched a talk discussing that topic (in the context of adding pattern matching to C++): https://www.youtube.com/watch?v=nOwUzFYt0NQ&feature=youtu.be&t=188
