feat: implement tail call optimization #7641

Draft
wants to merge 2 commits into
base: main

HerrCai0907
Contributor

@HerrCai0907 HerrCai0907 commented Jun 6, 2025

Fixed: #7647
For now this is only a generic proof of concept for tail call optimization. I am not sure whether I am on the right track, so I am posting a draft version in the hope of getting some feedback.

The basic idea is that we can convert

(func ...
   (call $f)
)

to

(func ...
   (return_call $f)
)

and

(func ...
   (return (call $f))
)

to

(func ...
   (return_call $f)
)

and

(func ...
   (if
      (condition)
      (call $f)
   )
)

to

(func ...
   (if
      (condition)
      (return_call $f)
   )
)
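
The three rewrites above can be sketched on a toy expression tree. This is only an illustration under stated assumptions: `Expr`, `makeCall`, `makeReturn`, `makeIf`, and `optimizeTail` are hypothetical names, not Binaryen's real IR or the code in this PR, and the sketch ignores the requirement that the callee's return type match the caller's.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy IR, not Binaryen's real classes.
struct Expr {
  enum Kind { Call, ReturnCall, Return, If, Nop } kind = Nop;
  std::string target;                          // callee name for Call / ReturnCall
  std::vector<std::unique_ptr<Expr>> children; // Return: [value]; If: [then, else?]
};
using ExprPtr = std::unique_ptr<Expr>;

ExprPtr makeCall(std::string target) {
  auto e = std::make_unique<Expr>();
  e->kind = Expr::Call;
  e->target = std::move(target);
  return e;
}

ExprPtr makeReturn(ExprPtr value) {
  auto e = std::make_unique<Expr>();
  e->kind = Expr::Return;
  e->children.push_back(std::move(value));
  return e;
}

// The condition is omitted; only the arms matter for this sketch.
ExprPtr makeIf(ExprPtr thenArm, ExprPtr elseArm = nullptr) {
  auto e = std::make_unique<Expr>();
  e->kind = Expr::If;
  e->children.push_back(std::move(thenArm));
  if (elseArm) e->children.push_back(std::move(elseArm));
  return e;
}

// Rewrite `slot`, which the caller asserts is in tail position
// (the last expression of a function body).
void optimizeTail(ExprPtr& slot) {
  if (!slot) return;
  switch (slot->kind) {
    case Expr::Call:
      // (call $f) -> (return_call $f)
      slot->kind = Expr::ReturnCall;
      break;
    case Expr::Return:
      // (return (call $f)) -> (return_call $f)
      if (!slot->children.empty() && slot->children[0]->kind == Expr::Call) {
        ExprPtr call = std::move(slot->children[0]);
        call->kind = Expr::ReturnCall;
        slot = std::move(call);
      }
      break;
    case Expr::If:
      // Each arm of a tail-position `if` is itself in tail position.
      for (auto& arm : slot->children) optimizeTail(arm);
      break;
    default:
      break;
  }
}
```

Calling `optimizeTail` on a body built with `makeReturn(makeCall("f"))` leaves a single `ReturnCall` node targeting `f`; on `makeIf(makeCall("g"))` it rewrites only the arm.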

@HerrCai0907 HerrCai0907 marked this pull request as draft June 6, 2025 08:00
@kripken
Member

kripken commented Jun 10, 2025

I believe this is valid, though I also vaguely remember there was a reason it might not always be helpful... @fgmccabe @tlively do you know what the best practices are for automatic conversion of normal calls to tail calls (when the return type permits)?

@tlively
Member

tlively commented Jun 10, 2025

The only downside I can think of is that this might lead to confusing stack traces. It would also be good to double check with a microbenchmark that this is actually good for performance; I wouldn't be too surprised if tail calls are slow because the engine needs to do some kind of extra shuffling to make them work.

@fgmccabe

Not sure if this is strictly on point, but automatically converting last calls to tail calls may (will) affect the semantics of applications.
return_call is useful for so-called tail call optimization, but its primary purpose is to support languages (particularly functional[?] languages like Scheme) that depend on it for their semantics.
Since this is not part of C/C++'s semantics, using return_call for Emscripten represents a risk.

@tlively
Member

tlively commented Jun 10, 2025

Can you elaborate on what the observable semantic differences are? Stack exhaustion doesn't count, since the number of stack frames is an implementation limit and therefore allowed to be broadly nondeterministic, even within a single execution.

@fgmccabe

Actually, stack exhaustion, as you put it, DOES count (in languages like Scheme).
Other considerations include meta-level features such as dynamic scope, profiling, etc.

@HerrCai0907
Contributor Author

I agree there are no semantics-related issues: combining a call and a return should not change semantics, apart from the call stack.

> It would also be good to double check with a microbenchmark that this is actually good for performance

According to https://v8.dev/blog/wasm-tail-call#proposal, there should be some improvement, but I am not sure whether V8 itself already does this optimization for some simple cases. I will run this benchmark later.

@tlively
Member

tlively commented Jun 11, 2025

Aha, dynamic scope is the issue. @HerrCai0907, you'll have to make sure that the optimized calls are not inside try or try_table.
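
To illustrate why an enclosing handler matters, here is a minimal analogy in plain C++ rather than Wasm (the function names `callee`, `caller`, and `checkHandlerRuns` are hypothetical). The call sits in syntactic tail position, yet the handler must remain active while the callee runs, so the caller's frame cannot be discarded before the call. Turning such a call into a tail call would let the exception escape past the handler.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical callee that always throws.
int callee() { throw std::runtime_error("boom"); }

// `return callee();` looks like a tail call, but the catch clause
// belongs to this frame and must still be reachable when callee throws.
int caller() {
  try {
    return callee();
  } catch (const std::runtime_error&) {
    return -1; // only reachable because caller's frame is still live
  }
}

// With the frame kept alive, the handler observes the exception.
void checkHandlerRuns() { assert(caller() == -1); }
```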

@HerrCai0907
Contributor Author

> you'll have to make sure that the optimized calls are not inside try or try_table.

I think the current implementation already takes this into account.

@HerrCai0907
Contributor Author

> It would also be good to double check with a microbenchmark that this is actually good for performance

I ran a test that quicksorts an array of 9000 elements on a Mac with an M1 Pro. The optimization reduces execution time by about 0.57%.

@tlively
Member

tlively commented Jun 15, 2025

I took a look at the draft implementation, and it doesn't look like it takes try or try_table into account. (return (call ...)) inside a try or try_table will be optimized incorrectly. You could also use Properties::getFallthrough to see if the value reaching a return or the end of a function comes from a call, even if there are other things like blocks or local.tee instructions in the way.
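
A rough sketch of both points, on a toy tree: follow the returned value through blocks and local.tee to see whether it comes from a call, and skip any return nested in a try. Everything here is an assumption for illustration; `Node`, `leaf`, `wrap`, `fallthrough`, and `countOptimizable` are hypothetical and only mimic the idea behind Properties::getFallthrough, they are not Binaryen's API. The sketch is conservative in that it treats every child of a `Try` node as guarded, and it only detects candidates without rewriting them.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical toy IR, not Binaryen's real classes.
struct Node {
  enum Kind { Call, Block, LocalTee, Return, Try, Const } kind = Const;
  std::vector<std::unique_ptr<Node>> children;
};
using NodePtr = std::unique_ptr<Node>;

NodePtr leaf(Node::Kind kind) {
  auto n = std::make_unique<Node>();
  n->kind = kind;
  return n;
}

// Build a node with one or two children, in order.
NodePtr wrap(Node::Kind kind, NodePtr first, NodePtr last = nullptr) {
  auto n = leaf(kind);
  if (first) n->children.push_back(std::move(first));
  if (last) n->children.push_back(std::move(last));
  return n;
}

// Follow the value through wrappers that pass their last child's
// value along unchanged (blocks and local.tee in this toy model).
const Node* fallthrough(const Node* n) {
  while ((n->kind == Node::Block || n->kind == Node::LocalTee) &&
         !n->children.empty()) {
    n = n->children.back().get();
  }
  return n;
}

// Count returns whose value comes from a call and that are not
// nested inside any Try node.
int countOptimizable(const Node* n, int tryDepth = 0) {
  int count = 0;
  if (n->kind == Node::Return && tryDepth == 0 && !n->children.empty() &&
      fallthrough(n->children[0].get())->kind == Node::Call) {
    count = 1;
  }
  int depth = tryDepth + (n->kind == Node::Try ? 1 : 0);
  for (const auto& child : n->children) {
    count += countOptimizable(child.get(), depth);
  }
  return count;
}
```

In this model, a return whose value is a block ending in a local.tee of a call is counted as a candidate, while the same return wrapped in a `Try` node is not.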

@HerrCai0907
Contributor Author

> You could also use Properties::getFallthrough to see if the value reaching a return or the end of a function comes from a call, even if there are other things like blocks or local.tee instructions in the way.

I think it is a better replacement for checkTailCall in the current PR, right?

@tlively
Member

tlively commented Jun 16, 2025

> I think it is better replacement for checkTailCall in the current PR, right?

Yes, or at least it could be used inside checkTailCall to simplify it and make it more powerful. You will want to keep the handling of if-else in checkTailCall because Properties::getFallthrough does not handle that.

Successfully merging this pull request may close these issues.

Tail-Call optimization
4 participants