feat: implement tail call optimization #7641

HerrCai0907 · 2025-06-06T08:00:47Z

Fixed: #7647
For now this is just a generic proof of concept for tail call optimization. I am not sure whether I am on the right track, so I am posting a draft version and hope to get some feedback.

The basic idea is that we can convert

(func ...
   (call $f)
)

to

(func ...
   (return_call $f)
)

and

(func ...
   (return (call $f))
)

to

(func ...
   (return_call $f)
)

and

(func ...
   (if
      (condition)
      (call $f)
   )
)

to

(func ...
   (if
      (condition)
      (return_call $f)
   )
)
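As a rough illustration of the three rewrites above, here is a minimal Python sketch over a toy expression tree. It is not Binaryen code: the node classes and `to_tail_call` are hypothetical names made up for this example, and the requirement that the callee's result types match the caller's is modeled as plain tuple equality, which is a simplification.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


# Toy IR nodes (hypothetical, for illustration only).
@dataclass(frozen=True)
class Call:
    target: str
    results: Tuple[str, ...] = ()  # result types of the callee


@dataclass(frozen=True)
class ReturnCall:
    target: str


@dataclass(frozen=True)
class Return:
    value: object


@dataclass(frozen=True)
class If:
    condition: str
    if_true: object
    if_false: Optional[object] = None


def to_tail_call(expr, func_results):
    """Rewrite `expr`, assumed to be the last expression of a function
    whose result types are `func_results`."""
    if isinstance(expr, Return):
        inner = expr.value
        # (return (call $f)) -> (return_call $f)
        if isinstance(inner, Call) and inner.results == func_results:
            return ReturnCall(inner.target)
        return expr
    if isinstance(expr, Call):
        # A call that ends the function body -> (return_call $f)
        if expr.results == func_results:
            return ReturnCall(expr.target)
        return expr
    if isinstance(expr, If):
        # Both arms of an if in tail position are themselves in tail position.
        if_false = None
        if expr.if_false is not None:
            if_false = to_tail_call(expr.if_false, func_results)
        return If(expr.condition, to_tail_call(expr.if_true, func_results), if_false)
    return expr
```

For example, `to_tail_call(If("condition", Call("f")), ())` yields `If("condition", ReturnCall("f"))`, mirroring the third pattern.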

kripken · 2025-06-10T19:29:19Z

I believe this is valid, though I also vaguely remember there was a reason it might not always be helpful... @fgmccabe @tlively do you know what the best practices are for automatic conversion of normal calls to tail calls (when the return type permits)?

tlively · 2025-06-10T20:04:09Z

The only downside I can think of is that this might lead to confusing stack traces. It would also be good to double check with a microbenchmark that this is actually good for performance; I wouldn't be too surprised if tail calls are slow because the engine needs to do some kind of extra shuffling to make them work.

fgmccabe · 2025-06-10T20:10:03Z

Not sure if this is strictly on point, but automatically converting last calls to tail calls may (will) affect the semantics of applications.
return_call is useful for so-called tail call optimization, but its primary purpose is supporting languages (particularly functional[?] languages like Scheme) that depend on it for their semantics.
Since this is not part of C/C++'s semantics, using return_call for Emscripten represents a risk.

tlively · 2025-06-10T21:20:17Z

Can you elaborate on what the observable semantic differences are? Stack exhaustion doesn't count, since the number of stack frames is an implementation limit and therefore allowed to be broadly nondeterministic, even within a single execution.

fgmccabe · 2025-06-10T23:15:55Z

Actually, stack exhaustion, as you put it, DOES count (in languages like Scheme).
Other considerations include meta-level features such as dynamic scope, profiling, etc.
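To make the stack-depth point concrete, here is a small Python model. Python has no tail calls, so a trampoline (the hypothetical helper `run`) stands in for `return_call`: instead of calling the next step, a function hands it back and its own frame is gone before the next step starts. All names are made up for this sketch.

```python
import sys


def count_down_call(n):
    # Ordinary call in tail position: every step keeps its frame alive.
    if n == 0:
        return "done"
    return count_down_call(n - 1)


def count_down_tail(n):
    # Modeled tail call: return the next step instead of calling it.
    if n == 0:
        return "done"
    return lambda: count_down_tail(n - 1)


def run(step):
    # Trampoline: keep invoking steps until a non-callable result appears.
    while callable(step):
        step = step()
    return step


depth = sys.getrecursionlimit() + 100

try:
    count_down_call(depth)
    overflowed = False
except RecursionError:
    overflowed = True

finished = run(count_down_tail(depth))
```

With the same `depth`, the ordinary version exhausts the stack while the modeled tail-call version completes. A program written in a language that guarantees tail calls relies on the second behavior.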

HerrCai0907 · 2025-06-11T01:19:10Z

I agree there is no semantic issue: combining the call and the return should not change semantics, apart from the call stack.

> It would also be good to double check with a microbenchmark that this is actually good for performance

According to https://v8.dev/blog/wasm-tail-call#proposal, there should be some improvement, but I am not sure whether V8 itself already does this optimization for some simple cases. I will run this benchmark later.

tlively · 2025-06-11T15:51:06Z

Aha, dynamic scope is the issue. @HerrCai0907, you'll have to make sure that the optimized calls are not inside try or try_table.
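A small Python model may help show why the enclosing handler matters. A tail call discards the caller's frame, and with it the caller's active handlers, before the callee runs. Below, a trampoline (the hypothetical helper `run`) stands in for `return_call`; the function names are made up for this sketch and nothing here is Binaryen or engine code.

```python
def callee():
    raise ValueError("thrown by callee")


def caller_with_call():
    # Models a plain call inside a try: the handler is still active
    # while the callee runs, so the exception is caught here.
    try:
        return callee()
    except ValueError:
        return "caught in caller"


def caller_with_return_call():
    # Models the same code after an incorrect rewrite to a tail call:
    # the callee is handed back and only runs after this frame, and
    # its handler, are gone.
    try:
        return callee
    except ValueError:
        return "caught in caller"


def run(func):
    # Trampoline standing in for the engine performing the tail call.
    result = func()
    while callable(result):
        result = result()
    return result
```

Here `run(caller_with_call)` returns `"caught in caller"`, while `run(caller_with_return_call)` lets the `ValueError` escape, which is the observable difference.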

HerrCai0907 · 2025-06-13T08:45:44Z

> you'll have to make sure that the optimized calls are not inside try or try_table.

I think the current implementation already takes that into account.

HerrCai0907 · 2025-06-13T08:59:01Z

> It would also be good to double check with a microbenchmark that this is actually good for performance

I ran a test that quick-sorts an array of 9000 elements on a Mac M1 Pro. It reduces execution time by about 0.57%.

tlively · 2025-06-15T01:18:40Z

I took a look at the draft implementation, and it doesn't look like it takes try or try_table into account. (return (call ...)) inside a try or try_table will be optimized incorrectly. You could also use Properties::getFallthrough to see if the value reaching a return or the end of a function comes from a call, even if there are other things like blocks or local.tee instructions in the way.
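One way to picture the missing guard is a traversal that remembers whether it is inside a protected region. The Python sketch below uses a toy tree, not Binaryen's IR: `Try`, `Block`, and `convert_returns` are hypothetical names, and `Try` loosely models a legacy `try` with a single catch body. It assumes that only the body of a try is protected by that try's handlers.

```python
from dataclasses import dataclass


# Toy IR nodes (hypothetical, for illustration only).
@dataclass(frozen=True)
class Call:
    target: str


@dataclass(frozen=True)
class ReturnCall:
    target: str


@dataclass(frozen=True)
class Return:
    value: object


@dataclass(frozen=True)
class Block:
    children: tuple


@dataclass(frozen=True)
class Try:
    body: tuple        # protected by this try's handlers
    catch_body: tuple  # not protected by this try, but maybe by an outer one


def convert_returns(expr, inside_try=False):
    """Turn (return (call $f)) into (return_call $f) unless a handler is active."""
    if isinstance(expr, Return) and isinstance(expr.value, Call):
        return expr if inside_try else ReturnCall(expr.value.target)
    if isinstance(expr, Block):
        return Block(tuple(convert_returns(c, inside_try) for c in expr.children))
    if isinstance(expr, Try):
        return Try(
            body=tuple(convert_returns(c, True) for c in expr.body),
            catch_body=tuple(convert_returns(c, inside_try) for c in expr.catch_body),
        )
    return expr
```

The flag is passed down rather than recomputed, so a call nested several levels below a try body stays untouched.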

HerrCai0907 · 2025-06-16T04:22:51Z

> You could also use Properties::getFallthrough to see if the value reaching a return or the end of a function comes from a call, even if there are other things like blocks or local.tee instructions in the way.

I think it is a better replacement for checkTailCall in the current PR, right?

tlively · 2025-06-16T12:59:53Z

> I think it is a better replacement for checkTailCall in the current PR, right?

Yes, or at least it could be used inside checkTailCall to simplify it and make it more powerful. You will want to keep the handling of if-else in checkTailCall because Properties::getFallthrough does not handle that.
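The division of labor described here can be sketched in Python on a toy tree. This is only a model: `fallthrough` below is a hypothetical stand-in that looks through blocks and `local.tee`, it is not Binaryen's `Properties::getFallthrough`, and it ignores complications such as branches that target a block. `tail_calls` plays the role of the check that adds if-else handling on top.

```python
from dataclasses import dataclass
from typing import Optional


# Toy IR nodes (hypothetical, for illustration only).
@dataclass(frozen=True)
class Call:
    target: str


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class LocalTee:
    local: str
    value: object


@dataclass(frozen=True)
class Block:
    children: tuple


@dataclass(frozen=True)
class If:
    condition: str
    if_true: object
    if_false: Optional[object] = None


def fallthrough(expr):
    """Follow the value through blocks and local.tee to its producer."""
    while True:
        if isinstance(expr, Block) and expr.children:
            expr = expr.children[-1]  # a block's value is its last child
        elif isinstance(expr, LocalTee):
            expr = expr.value         # a tee passes its operand through
        else:
            return expr


def tail_calls(expr):
    """Names of the calls whose result becomes the value of `expr`."""
    expr = fallthrough(expr)
    if isinstance(expr, Call):
        return [expr.target]
    # if-else is handled here because fallthrough() does not look into it.
    if isinstance(expr, If) and expr.if_false is not None:
        return tail_calls(expr.if_true) + tail_calls(expr.if_false)
    return []
```

Detecting such a call is only half of the work: an actual rewrite would still have to deal with the wrappers in between, for example the local.tee.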

feat: implement tail call optimization

cd0a99e

HerrCai0907 marked this pull request as draft June 6, 2025 08:00

kripken mentioned this pull request Jun 10, 2025

Tail-Call optimization #7647

Open

handle try catch

e6fcb78
