New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce __STACKTRACE__ #7097

Closed
michalmuskala opened this Issue Dec 8, 2017 · 14 comments

Comments

Projects
None yet
6 participants
@michalmuskala
Member

michalmuskala commented Dec 8, 2017

TL;DR Replace the use of System.get_stacktrace() with a pseudo-variable __STACKTRACE__ similar to __MODULE__ or __CALLER__ available only inside catch and rescue clauses of the try expression.

Reasons

Erlang recently set out to overhaul how stack traces are handled. The :erlang.get_stacktrace/0 call (and thus it's Elixir's equivalent System.get_stacktrace/0) are problematic - they may cause the emulator to hold on to the stack trace for a very long time. This can be harmful, since the stack trace can be very big - in case of FunctionClauseError it includes all the arguments. The emulator has to hold on the stack trace for the last exception in general until the next exception is raised - which can be a long time.
To rectify this potential performance issue, OTP 20 started warning on the use of :erlang.get_stacktrace() outside of the catch clause of a try expression (in Elixir this would be outside of catch and rescue). The plan was to make the call return an empty list in some future Erlang version when called outside of catch.
OTP 21 will introduce an even more restrictive mechanism, to understand it we need to understand the syntax for catch in Erlang:

try
  ...
catch
  Kind:Reason ->
    ...
end

Where Kind can be one of :exit, :error and :throw. The proposed (and merged) enhancement replaces this with:

try
  ...
catch
  Kind:Reason:Stack ->
    ...
end

Where the Stack variable would contain the stack-trace for the just-rescued exception. It is forbidden to pattern match on this variable or use an already-bound variable (which would be equivalent to pinning it in Elixir). It's now planned that some future Erlang release will always return empty lists from :erlang.get_stacktrace() and the only way to acquire the stacktrace will be with this syntax. This means we need a compatible Elixir solution.

Proposal

To solve this, we propose introduction of a new pseudo-variable __STACKTRACE__ that would be available inside catch and rescue and would return the stack trace of the exception being currently handled.
This means that this code:

try do
  ...
rescue
  e ->
    ...
    reraise(e, System.get_stacktrace())
end

Would become:

try do
  ...
rescue
  e ->
    ...
    reraise(e, __STACKTRACE__)
end

The __STACKTRACE__ pseudo-variable would be only available inside catch and resuce parts of try - it should fail to compile if present in any other place, similar to how __CALLER__ is not available outside of macros.

Having this as this kind of pseudo-variable is also compatible with how this works in Erlang - since neither pattern matching nor pinning of the stack variable is allowed, we don't loose any features.

Implementation

Since OTP 21 is not stable yet, and we also want to support multiple Erlang versions, a conditional compilation of this feature is required.

On VMs without new syntax

When compiling on systems without support for the new syntax, the __STACKTRACE__ pseudo-variable should compile into a call to :erlang.get_stacktrace/0.
Additionally, the call to :erlang.get_stacktrace/0 should happen as the very first thing in the block, assigning the result to some internal variable, which could be later substituted for the __STACKTRACE__ expression. This provides a very important advantage compared to the current System.get_stacktrace/0 call by protecting against the "wrong stack trace" bug, which is very easy to introduce accidentally:

try do
  ...
catch
  kind, reason ->
    cleanup()
    report_error(kind, reason, System.get_stacktrace())
end

defp cleanup() do
  {:ok, this_may_throw()}
catch
  throw -> {:error, throw}
end

If the this_may_throw() operation does indeed throw, the System.get_stacktrace/0 call would return the value for that throw, instead of the exception we've just caught, since it always returns the stack trace of the last exception. With the use of the __STACKTRACE__ pseudo-variable, this would compile into something conceptually like:

try do
  ...
catch
  kind, reason ->
    stracktrace = :erlag.get_stacktrace()
    cleanup()
    report_error(kind, reason, stacktrace)
end

Which does not suffer from the issue. This means not only is the proposed feature giving us compatibility with Erlang, it is also a significant improvement in terms of preventing strange and hard to find bugs.

On VMs with new syntax

When compiling on a VM that supports the new syntax, the __STACKTRACE__ call should just use it, the previous example would compile into the following equivalent Erlang code:

try do
  ...
catch
  Kind:Reason:Stacktrace ->
    cleanup()
    report_error(Kind, Reason, Stacktrace)
end

Alternatives

An alternative would be to also support something like:

try do
  ...
catch
  kind, reason, stack ->
    ...
end

There are couple of problems with that approach:

  • similar to Erlang, it introduces this special variable that can't be pattern matched
  • there's no clear way of using this syntax in combination with rescue or catch with just one pattern (which catches only throws).
  • the compilation for VMs that don't support the new syntax is more awkward.

Compatibility

Because of the conditional-compilation, the feature becomes compatible with all the Erlang releases. There is the constraint that code compiled on VM that supports the feature wouldn't work on older VMs, but this is also true today - code compiled on OTP 20 wouldn't work on OTP 18, even though Elixir in itself supports running on both VMs.

Roadmap

The System.get_stacktrace() call should be soft-deprecated in the next release that includes the __STACKTRACE__ feature. Additionally a loud deprecation should happen after 2 releases or as soon as Elixir declares compatibility for an Erlang VM that would always return an empty list for :erlang.get_stacktrace(). This might break the general deprecation policy approach of minimum 2 releases, but this is somewhat out of our hands.

@josevalim josevalim changed the title from [Proposal] Introduce `__STACKTRACE__` to Introduce `__STACKTRACE__` Dec 8, 2017

@whatyouhide whatyouhide changed the title from Introduce `__STACKTRACE__` to Introduce __STACKTRACE__ Dec 12, 2017

@michalmuskala

This comment has been minimized.

Member

michalmuskala commented Dec 18, 2017

So I looked a bit into actually implementing this, and one problem with __STACKTRACE__ is that it can be confusing and a bit inconsistent in nested trys. For example:

__STACKTRACE__ # 0
try do
  __STACKTRACE__ # 1
  throw :foo
catch
  :foo ->
    try do
      __STACKTRACE__ # 2
      throw :bar
    catch
      :bar -> 
        __STACKTRACE__ # 3
    end
    __STACKTRACE__ # 4
end

It's clear that #0 and #1 should just fail to compile. It's also clear that #4 should return the stack for throw :foo. But now #2 should return the stack for throw :foo as well, but then in #3 it should probably return the stack for throw :bar - it suddenly changes from one stack-trace to another. The erlang solution does not have this problem as you need to explicitly bind a variable to the stacktrace.

While it's true that nested trys don't appear often, it is something we should decide how to deal with.

@josevalim

This comment has been minimized.

Member

josevalim commented Dec 18, 2017

@michalmuskala we could make #2 fail, forcing the user to explicitly set a variable before.

@michalmuskala

This comment has been minimized.

Member

michalmuskala commented Dec 18, 2017

Yeah, this also comes down to approach to compiling the thing. I initially tried to make it work similar to __CALLER__ - translating to a variable and after translation mark a flag in elixir_erl record and later read it and prepend the __STACKTRACE__ = :erlang.get_element() call. The problem with that approach is that it won't work, since the variables would conflict.

@josevalim

This comment has been minimized.

Member

josevalim commented Dec 18, 2017

@michalmuskala we can generate fresh variables every time. :)

@josevalim

This comment has been minimized.

Member

josevalim commented Dec 18, 2017

@michallepicki we could even expand our catch/rescue to have three elements, the same as Erlang, but that would be completely internal. This way __STACKTRACE__ belongs exclusively to Elixir land and exclusively to the expansion pass.

@michalmuskala

This comment has been minimized.

Member

michalmuskala commented Dec 18, 2017

You pinged the wrong Michał. 😆

But that would mean we'd need to expand rescue clauses to catch clauses in elixir, wouldn't it?

@michallepicki

This comment has been minimized.

Contributor

michallepicki commented Dec 18, 2017

that's fine, I got used to it already, happens every few months ;)

@josevalim

This comment has been minimized.

Member

josevalim commented Dec 19, 2017

Oops, sorry!

But that would mean we'd need to expand rescue clauses to catch clauses in elixir, wouldn't it?

I don't think we should do that because that is a behaviour that is fairly specific to Erlang. For example, ElixirScript probably wants to keep the high level semantics so rescue is also capable of rescuing JavaScript exceptions or something similar?

Maybe rescue should have two arguments too...

@josevalim josevalim added this to the v1.7.0 milestone Dec 23, 2017

@josevalim josevalim reopened this Feb 3, 2018

@josevalim

This comment has been minimized.

Member

josevalim commented Feb 3, 2018

I am reopening because we need to do more work to prepare for Erlang 21.

@josevalim josevalim closed this May 5, 2018

@dideler

This comment has been minimized.

dideler commented Sep 7, 2018

How should we handle situations where System.stacktrace/0 is called when an exception doesn't occur, that is, there is no rescue/catch?

For example, we have some code where we handle error tuples by collecting information, which includes the stacktrace, and send the data to a monitoring service.

def set_error(transaction, name, message, reason) do
  backtrace = ([reason] ++ System.stacktrace()) |> Enum.map(&inspect/1)
  Appsignal.Transaction.set_error(transaction, name, message, backtrace)
end
@NobbZ

This comment has been minimized.

NobbZ commented Sep 7, 2018

@dideler You should not rely on the ability to get the stacktrace outside of a rescue/catch, it might get removed from the BEAM with any major release.

@OvermindDL1

This comment has been minimized.

OvermindDL1 commented Sep 7, 2018

A workaround is just to throw an exception and then immediately catch it of course, which can be wrapped up in a function.

@dideler

This comment has been minimized.

@josevalim

This comment has been minimized.

Member

josevalim commented Sep 7, 2018

@dideler no problem at all.

Just one last note: System.stacktrace is stateful. It returns the status of the last exception. If you want to get the current stacktrace, then you should do Process.info(self(), :current_stacktrace). I will improve the docs here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment