Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

[otp] Duplicate Erlang's non-tail-call-with-no-stack-frame optimization #6

Open
eriksoe opened this Issue · 5 comments

2 participants

Erik Søe Sørensen Kresten Krab Thorup
Erik Søe Sørensen

The optimization in question makes certain non-tail recursions be rewritten so that they run in fixed stack space.
At present, io_lib:write_binary_body/2 sometimes dies because of Erjang lacking this feature.
Another example is lists:flatten applied to a sufficiently deeply nested list.

Erik Søe Sørensen

The rule seems to be as follows:
Several stack frames with the same return address appear as one entry in the stack traces.
This does not, of course, mean that any optimization is actually going on; just that get_stacktrace() doesn't report the raw trace.
However, the problem with Javastack overflows remains an issue.

Kresten Krab Thorup
Owner

Bugger, we need to figure out how to do this. lists:map needs it; and I don't understand how to do this at runtime.

{function, map, 2, 337}.
{label,336}.
  {func_info,{atom,lists},{atom,map},2}.
{label,337}.
  {test,is_nonempty_list,{f,338},[{x,1}]}.
  {allocate,2,2}.
  {get_list,{x,1},{x,2},{y,1}}.
  {move,{x,0},{x,1}}.
  {move,{x,2},{x,0}}.
  {move,{x,1},{y,0}}.
  {call_fun,1}.
  {move,{x,0},{x,2}}.
  {move,{y,1},{x,1}}.
  {move,{y,0},{x,0}}.
  {move,{x,2},{y,1}}.
  {trim,1,1}.
  {call,2,{f,337}}.
  {test_heap,2,1}.
  {put_list,{y,0},{x,0},{x,0}}.
  {deallocate,1}.
  return.
{label,338}.
  {test,is_nil,{f,336},[{x,1}]}.
  {test,is_function2,{f,336},[{x,0},{integer,1}]}.
  {move,nil,{x,0}}.
  return.
Erik Søe Sørensen

I still think the optimization is not really there (except that when the stack trace is generated, identical return addresses are conflated).
The real trick is heap allocation of stack frames - or, nearly equivalent, lazy stack unpacking by kilim.
You get stack overflow errors? Then now you know why I've been rambling about this lazy vs. eager difference :-)

Kresten Krab Thorup
Owner

Yep, exactly. I am seeing stack overflows. I agree that we need to encode this by allocating stack frames on the heap somehow; but I don't think we should involve Kilim in the matter; had Kilim been a proper continuation-style encoder, it would have made sense. A function which is self-recursive-sans-tail-call needs to have a separate self-call stack. Ugly blugly.

Kresten Krab Thorup
Owner

Found out what goes on. It's the {trim,1,1} insn that just "trims" the size of the current activation record. So the call is really normal-recursive, but that BEAM insn just lowers to cost because it makes it possible to reduce the stack size.

Perhaps we can find a way to re-code the special case of a non-tail, but recursive call that is followed by a cons, i.e. ... encode

  • {trim,N,N}
  • call (self)
  • put_list (i.e. a cons)
  • return

Into something tail-recursive with a reverse on the end.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.