Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have describeNativeComponentFrame account for truncation of error stack traces in the middle #26999

Closed
wants to merge 1 commit into from

Conversation

KarimP
Copy link
Contributor

@KarimP KarimP commented Jun 23, 2023

Summary

The Hermes VM currently truncates large stack traces (those over 100 lines) in the middle, as opposed to V8 which truncates from the bottom:
https://github.com/facebook/hermes/blob/df07cf713a84a4434c83c08cede38ba438dc6aca/lib/VM/JSError.cpp#L688-L699
It places a ... skipping {n} frames line the middle of the stack trace, e.g.

TypeError: Cannot read property 'children' of undefined
     at SSRSuspendOnServer (js/ssr/SSRSuspendOnServer.react.js:26:80)
     <49 more lines>
     ... skipping 1636 frames
     at renderNodeDestructiveImpl (shared/react/ReactDOMServerStreaming-dev.modern.js:11020:22)
     <49 more lines>

This results in ReactComponentStackFrame.describeNativeComponentFrame incorrectly diffing its control and sample stacks, and attributing the wrong line (usually some React internal method) to the component. This PR fixes that by first parsing out and removing the " ... skipping" line, and everything below it. This mimics the behaviour of the stack traces being truncated at the bottom, which the method correctly accounts for.

To test this out better, I extracted the code that diffs the control and sample stacks to determine the component line into another function that gets called by describeNativeComponentFrame.

How did you test this change?

Created a new jest test to confirm both the existing and new behaviour:

yarn test determineComponentFrameFromStack-test

@KarimP KarimP force-pushed the component-stack-hermes-fix branch from 3d1d374 to 12df3ee Compare June 23, 2023 18:16
@KarimP KarimP changed the title Have describeNativeComponentFrame account for truncation of error stack traces in the middle (on Hermes) Have describeNativeComponentFrame account for truncation of error stack traces in the middle Jun 23, 2023
@KarimP KarimP force-pushed the component-stack-hermes-fix branch from 12df3ee to b820528 Compare June 23, 2023 18:20
@react-sizebot
Copy link

react-sizebot commented Jun 23, 2023

Comparing: 8ec962d...b820528

Critical size changes

Includes critical production bundles, as well as any change greater than 2%:

Name +/- Base Current +/- gzip Base gzip Current gzip
oss-stable/react-dom/cjs/react-dom.production.min.js +0.10% 164.56 kB 164.73 kB +0.09% 51.83 kB 51.87 kB
oss-experimental/react-dom/cjs/react-dom.production.min.js +0.10% 171.97 kB 172.14 kB +0.14% 54.06 kB 54.14 kB
facebook-www/ReactDOM-prod.classic.js +0.15% 571.74 kB 572.61 kB +0.16% 100.83 kB 100.99 kB
facebook-www/ReactDOM-prod.modern.js +0.16% 555.52 kB 556.39 kB +0.15% 98.02 kB 98.16 kB
oss-experimental/react/cjs/react-jsx-dev-runtime.development.js +2.12% 42.93 kB 43.84 kB +3.12% 12.52 kB 12.91 kB
oss-stable-semver/react/cjs/react-jsx-dev-runtime.development.js +2.12% 42.96 kB 43.87 kB +3.10% 12.53 kB 12.92 kB
oss-stable/react/cjs/react-jsx-dev-runtime.development.js +2.12% 42.96 kB 43.87 kB +3.10% 12.53 kB 12.92 kB
facebook-www/JSXDEVRuntime-dev.modern.js +2.11% 44.76 kB 45.70 kB +3.10% 12.70 kB 13.09 kB
facebook-www/JSXDEVRuntime-dev.classic.js +2.11% 44.76 kB 45.70 kB +3.10% 12.70 kB 13.09 kB
oss-experimental/react/cjs/react-jsx-runtime.development.js +2.09% 43.53 kB 44.44 kB +3.06% 12.70 kB 13.09 kB
oss-stable-semver/react/cjs/react-jsx-runtime.development.js +2.09% 43.56 kB 44.47 kB +3.06% 12.70 kB 13.09 kB
oss-stable/react/cjs/react-jsx-runtime.development.js +2.09% 43.56 kB 44.47 kB +3.06% 12.70 kB 13.09 kB

Significant size changes

Includes any change greater than 0.2%:

Expand to show
Name +/- Base Current +/- gzip Base gzip Current gzip
oss-experimental/react/cjs/react-jsx-dev-runtime.development.js +2.12% 42.93 kB 43.84 kB +3.12% 12.52 kB 12.91 kB
oss-stable-semver/react/cjs/react-jsx-dev-runtime.development.js +2.12% 42.96 kB 43.87 kB +3.10% 12.53 kB 12.92 kB
oss-stable/react/cjs/react-jsx-dev-runtime.development.js +2.12% 42.96 kB 43.87 kB +3.10% 12.53 kB 12.92 kB
facebook-www/JSXDEVRuntime-dev.modern.js +2.11% 44.76 kB 45.70 kB +3.10% 12.70 kB 13.09 kB
facebook-www/JSXDEVRuntime-dev.classic.js +2.11% 44.76 kB 45.70 kB +3.10% 12.70 kB 13.09 kB
oss-experimental/react/cjs/react-jsx-runtime.development.js +2.09% 43.53 kB 44.44 kB +3.06% 12.70 kB 13.09 kB
oss-stable-semver/react/cjs/react-jsx-runtime.development.js +2.09% 43.56 kB 44.47 kB +3.06% 12.70 kB 13.09 kB
oss-stable/react/cjs/react-jsx-runtime.development.js +2.09% 43.56 kB 44.47 kB +3.06% 12.70 kB 13.09 kB
oss-stable-semver/react/cjs/react.shared-subset.development.js +1.08% 84.33 kB 85.24 kB +1.63% 23.37 kB 23.75 kB
oss-stable/react/cjs/react.shared-subset.development.js +1.08% 84.36 kB 85.27 kB +1.65% 23.40 kB 23.78 kB
oss-experimental/react/cjs/react.shared-subset.development.js +1.06% 85.70 kB 86.61 kB +1.71% 23.79 kB 24.19 kB
oss-stable-semver/react/cjs/react.development.js +0.92% 99.40 kB 100.31 kB +1.46% 26.83 kB 27.22 kB
oss-stable/react/cjs/react.development.js +0.92% 99.42 kB 100.33 kB +1.47% 26.86 kB 27.25 kB
oss-experimental/react/cjs/react.development.js +0.90% 101.58 kB 102.49 kB +1.46% 27.41 kB 27.81 kB
oss-stable-semver/react/umd/react.development.js +0.77% 122.95 kB 123.90 kB +1.27% 31.71 kB 32.11 kB
oss-stable/react/umd/react.development.js +0.77% 122.97 kB 123.92 kB +1.27% 31.73 kB 32.14 kB
oss-experimental/react/umd/react.development.js +0.76% 125.22 kB 126.17 kB +1.23% 32.29 kB 32.68 kB
facebook-www/React-dev.modern.js +0.74% 126.66 kB 127.61 kB +1.24% 33.45 kB 33.87 kB
facebook-www/React-dev.classic.js +0.74% 127.76 kB 128.70 kB +1.23% 33.70 kB 34.11 kB
oss-stable-semver/react-server/cjs/react-server.development.js +0.62% 146.57 kB 147.48 kB +1.10% 36.53 kB 36.93 kB
oss-stable/react-server/cjs/react-server.development.js +0.62% 146.57 kB 147.48 kB +1.10% 36.53 kB 36.93 kB
oss-experimental/react-server/cjs/react-server.development.js +0.62% 147.75 kB 148.66 kB +1.10% 36.82 kB 37.22 kB
facebook-www/ReactDOMServerStreaming-dev.modern.js +0.26% 367.98 kB 368.92 kB +0.53% 79.92 kB 80.35 kB
facebook-www/ReactART-prod.modern.js +0.26% 338.89 kB 339.76 kB +0.31% 57.62 kB 57.80 kB
oss-stable-semver/react-dom/cjs/react-dom-server.bun.development.js +0.25% 358.70 kB 359.61 kB +0.54% 79.13 kB 79.55 kB
oss-stable/react-dom/cjs/react-dom-server.bun.development.js +0.25% 358.73 kB 359.64 kB +0.54% 79.16 kB 79.58 kB
facebook-www/ReactDOMServer-dev.modern.js +0.25% 373.16 kB 374.10 kB +0.53% 81.21 kB 81.64 kB
oss-stable-semver/react-dom/cjs/react-dom-server.browser.development.js +0.25% 361.46 kB 362.37 kB +0.53% 80.04 kB 80.46 kB
oss-stable/react-dom/cjs/react-dom-server.browser.development.js +0.25% 361.48 kB 362.39 kB +0.53% 80.06 kB 80.49 kB
oss-stable-semver/react-dom/cjs/react-dom-server-legacy.browser.development.js +0.25% 361.72 kB 362.63 kB +0.52% 79.74 kB 80.15 kB
oss-stable/react-dom/cjs/react-dom-server-legacy.browser.development.js +0.25% 361.75 kB 362.66 kB +0.52% 79.77 kB 80.18 kB
oss-stable-semver/react-dom/cjs/react-dom-server.edge.development.js +0.25% 361.87 kB 362.78 kB +0.53% 80.16 kB 80.58 kB
oss-stable/react-dom/cjs/react-dom-server.edge.development.js +0.25% 361.89 kB 362.80 kB +0.53% 80.19 kB 80.61 kB
oss-stable-semver/react-dom/umd/react-dom-server.browser.development.js +0.25% 378.58 kB 379.53 kB +0.51% 80.99 kB 81.41 kB
oss-stable/react-dom/umd/react-dom-server.browser.development.js +0.25% 378.61 kB 379.56 kB +0.51% 81.02 kB 81.44 kB
oss-stable-semver/react-dom/umd/react-dom-server-legacy.browser.development.js +0.25% 378.86 kB 379.82 kB +0.52% 80.68 kB 81.10 kB
oss-stable/react-dom/umd/react-dom-server-legacy.browser.development.js +0.25% 378.89 kB 379.84 kB +0.52% 80.71 kB 81.13 kB
oss-stable-semver/react-dom/cjs/react-dom-server.node.development.js +0.25% 362.95 kB 363.87 kB +0.53% 80.08 kB 80.51 kB
oss-stable/react-dom/cjs/react-dom-server.node.development.js +0.25% 362.98 kB 363.89 kB +0.53% 80.11 kB 80.54 kB
oss-stable-semver/react-dom/cjs/react-dom-server-legacy.node.development.js +0.25% 363.54 kB 364.45 kB +0.51% 80.20 kB 80.61 kB
oss-stable/react-dom/cjs/react-dom-server-legacy.node.development.js +0.25% 363.57 kB 364.48 kB +0.51% 80.23 kB 80.64 kB
facebook-www/ReactDOMServer-dev.classic.js +0.25% 380.58 kB 381.52 kB +0.52% 82.85 kB 83.28 kB
oss-experimental/react-dom/cjs/react-dom-server.bun.development.js +0.25% 368.48 kB 369.39 kB +0.52% 81.34 kB 81.76 kB
facebook-www/ReactART-prod.classic.js +0.25% 349.89 kB 350.75 kB +0.29% 59.53 kB 59.71 kB
oss-experimental/react-dom/cjs/react-dom-static.browser.development.js +0.25% 369.96 kB 370.88 kB +0.51% 81.83 kB 82.25 kB
oss-experimental/react-dom/cjs/react-dom-static.edge.development.js +0.25% 370.37 kB 371.29 kB +0.51% 81.94 kB 82.36 kB
oss-experimental/react-dom/cjs/react-dom-server.browser.development.js +0.25% 370.66 kB 371.57 kB +0.51% 82.01 kB 82.43 kB
oss-experimental/react-dom/cjs/react-dom-server.edge.development.js +0.25% 371.07 kB 371.98 kB +0.51% 82.12 kB 82.55 kB
oss-experimental/react-dom/cjs/react-dom-server-legacy.browser.development.js +0.25% 371.51 kB 372.42 kB +0.48% 81.94 kB 82.34 kB
oss-experimental/react-dom/umd/react-dom-server.browser.development.js +0.24% 388.20 kB 389.15 kB +0.50% 83.21 kB 83.63 kB
oss-experimental/react-dom/cjs/react-dom-static.node.development.js +0.24% 372.12 kB 373.03 kB +0.51% 82.15 kB 82.57 kB
oss-experimental/react-dom/cjs/react-dom-server.node.development.js +0.24% 372.16 kB 373.07 kB +0.51% 82.06 kB 82.48 kB
oss-experimental/react-dom/umd/react-dom-server-legacy.browser.development.js +0.24% 389.06 kB 390.02 kB +0.50% 83.11 kB 83.53 kB
oss-experimental/react-dom/cjs/react-dom-server-legacy.node.development.js +0.24% 373.32 kB 374.23 kB +0.48% 82.41 kB 82.80 kB

Generated by 🚫 dangerJS against b820528

@KarimP KarimP closed this Jun 23, 2023
@KarimP KarimP deleted the component-stack-hermes-fix branch July 12, 2023 21:47
mofeiZ pushed a commit that referenced this pull request Nov 8, 2023
## Summary

There's a bug with the existing stack comparison algorithm in
`describeNativeComponentFrame` — specifically how it attempts to find a
common root frame between the control and sample stacks. This PR
attempts to fix that bug by injecting a frame that can have a guaranteed
string in it for us to search for in both stacks to find a common root.

## Brief Background/How it works now

Right now `describeNativeComponentFrame` does the following to leverage
native browser/VM stack frames to get details (e.g. script path, row and
col #s) for a single component:
1. Throwing and catching a control error in the function
2. Calling the component which should eventually throw an error (most of
the time), that we'll catch as our sample error.
3. Diffing the stacks in the control and sample errors to find the line
which should represent our component call.

## What's broken

To account for potential stack trace truncation, the stack diffing
algorithm first attempts to find a common "root" frame by inspecting the
earliest frame of the sample stack and searching for an identical frame
in the control stack starting from the bottom. However, there are a
couple of scenarios which I've hit that cause the above approach to not
work correctly.

First, it's possible that for render passes of extremely large component
trees to have a lot of repeating internal react function calls, which
can result in an incorrect common or "root" frame found. Here's a small
example from a stack trace using React Fizz for SSR.
Our control frame can look like this:
```
Error:
    at Fake (...)
    at construct (native)
    at describeNativeComponentFrame (...)
    at describeClassComponentFrame (...)
    at getStackByComponentStackNode (...)
    at getCurrentStackInDEV (...)
    at renderNodeDestructive (...)
    at renderElement (...)
    at renderNodeDestructiveImpl (...) // <-- Actual common root frame with the sample stack
    at renderNodeDestructive (...)
    at renderElement (...)
    at renderNodeDestructiveImpl (...) // <-- Incorrectly chosen common root frame
    at renderNodeDestructive (...)
```

And our sample stack can look like this:
```
Error:
    at set (...)
    at PureComponent (...)
    at call (native)
    at apply (native)
    at ErrorBoundary (...)
    at construct (native)
    at describeNativeComponentFrame (...)
    at describeClassComponentFrame (...)
    at getStackByComponentStackNode (...)
    at getCurrentStackInDEV (...)
    at renderNodeDestructive (...)
    at renderElement (...)
    at renderNodeDestructiveImpl (...) // <-- Root frame that's common in the control stack
```

Here you can see that the earliest trace in the sample stack, the
`renderNodeDestructiveImpl` call, can exactly match with multiple
`renderNodeDestructiveImpl` calls in the control stack (including file
path and line + col #s). Currently the algorithm will chose the
earliest/last frame with the `renderNodeDestructiveImpl` call (which is
the second last frame in our control stack), which is incorrect. The
actual matching frame in the control stack is the latest or first frame
(when traversing from the top) with the `renderNodeDestructiveImpl`
call. This leads to the rest of the stack diffing associating an
incorrect frame (`at getStackByComponentStackNode (...)`) for the
component.

Another issue with this approach is that it assumes all VMs will
truncate stack traces at the *bottom*, [which isn't the case for the
Hermes
VM](https://github.com/facebook/hermes/blob/df07cf713a84a4434c83c08cede38ba438dc6aca/lib/VM/JSError.cpp#L688-L699)
which **truncates stack traces in the middle**, placing a

```
    at renderNodeDestructiveImpl (...)
    ... skipping {n} frames
    at renderNodeDestructive (...)
```

line in the middle of the stack trace for all stacks that contain more
than 100 traces. This causes stack traces for React Native apps using
the Hermes VM to potentially break for large component trees. Although
for this specific case with Hermes, it's possible to account for this by
either manually grepping and removing the `... skipping` line and
everything below it (see draft PR: #26999), or by implementing the
non-standard `prepareStackTrace` API which Hermes also supports to
manually generate a stack trace that truncates from the bottom ([example
implementation](main...KarimP:react:component-stack-hermes-fix)).

## The Fix

I found different ways to go about fixing this. The first was to search
for a common stack frame starting from the top/latest frame. It's a
relatively small change ([see
implementation](main...KarimP:react:component-stack-fix-2)),
although it is less performant by being n^2 (albeit with `n`
realistically being <= 5 here). It's also a bit more buggy for class
components given that different VMs insert a different amount of
additional lines for new/construct calls...

Another fix would be to actually implement a [longest common
substring](https://en.wikipedia.org/wiki/Longest_common_substring)
algorithm, which can also be roughly n^2 time (assuming the longest
common substring between control and sample will be most of the sample
frame).

The fix I ended up going with was have the lines that throw the control
error and the lines that call/instantiate the component be inside a
distinct method under an object property
(`"DescribeNativeComponentFrameRoot"`). All major VMs (Safari's
JavaScriptCore, Firefox's SpiderMonkey, V8, Hermes, and Bun) should
display the object property name their stack trace. I've also set the
`name` and `displayName` properties for method as well to account for
minification, any advanced optimizations (e.g. key crushing), and VM
inconsistencies (both Bun and Safari seem to exclusively use the value
under `displayName` and not `name` in traces for methods defined under
an object's own property...).

We can then find this "common" frame by simply finding the line that has
our special method name (`"DescribeNativeComponentFrameRoot"`), and the
rest of the code to determine the actual component line works as
expected. If by any chance we don't find a frame with our special method
name in either control or sample stack traces, we then revert back to
the existing approach mentioned above by searching for the last line of
the sample frame in the control frame.

## How did you test this change?

1. There are bunch of existing tests that ensure a properly formatted
component trace is logged for certain scenarios, so I ensured the
existing full test suite passed
2. I threw an error in a component that's deep in the component
hierarchy of a large React app (facebook) to ensure there's stack trace
truncation, and ensured the correct component stack trace was logged for
Chrome, Safari, and Firefox, and with and without minification.
3. Ran a large React app (facebook) on the Hermes VM, threw an error in
a component that's deep in the component hierarchy, and ensured that
component frames are generated despite stack traces being truncated in
the middle.
github-actions bot pushed a commit that referenced this pull request Nov 8, 2023
## Summary

There's a bug with the existing stack comparison algorithm in
`describeNativeComponentFrame` — specifically how it attempts to find a
common root frame between the control and sample stacks. This PR
attempts to fix that bug by injecting a frame that can have a guaranteed
string in it for us to search for in both stacks to find a common root.

## Brief Background/How it works now

Right now `describeNativeComponentFrame` does the following to leverage
native browser/VM stack frames to get details (e.g. script path, row and
col #s) for a single component:
1. Throwing and catching a control error in the function
2. Calling the component which should eventually throw an error (most of
the time), that we'll catch as our sample error.
3. Diffing the stacks in the control and sample errors to find the line
which should represent our component call.

## What's broken

To account for potential stack trace truncation, the stack diffing
algorithm first attempts to find a common "root" frame by inspecting the
earliest frame of the sample stack and searching for an identical frame
in the control stack starting from the bottom. However, there are a
couple of scenarios which I've hit that cause the above approach to not
work correctly.

First, it's possible that for render passes of extremely large component
trees to have a lot of repeating internal react function calls, which
can result in an incorrect common or "root" frame found. Here's a small
example from a stack trace using React Fizz for SSR.
Our control frame can look like this:
```
Error:
    at Fake (...)
    at construct (native)
    at describeNativeComponentFrame (...)
    at describeClassComponentFrame (...)
    at getStackByComponentStackNode (...)
    at getCurrentStackInDEV (...)
    at renderNodeDestructive (...)
    at renderElement (...)
    at renderNodeDestructiveImpl (...) // <-- Actual common root frame with the sample stack
    at renderNodeDestructive (...)
    at renderElement (...)
    at renderNodeDestructiveImpl (...) // <-- Incorrectly chosen common root frame
    at renderNodeDestructive (...)
```

And our sample stack can look like this:
```
Error:
    at set (...)
    at PureComponent (...)
    at call (native)
    at apply (native)
    at ErrorBoundary (...)
    at construct (native)
    at describeNativeComponentFrame (...)
    at describeClassComponentFrame (...)
    at getStackByComponentStackNode (...)
    at getCurrentStackInDEV (...)
    at renderNodeDestructive (...)
    at renderElement (...)
    at renderNodeDestructiveImpl (...) // <-- Root frame that's common in the control stack
```

Here you can see that the earliest trace in the sample stack, the
`renderNodeDestructiveImpl` call, can exactly match with multiple
`renderNodeDestructiveImpl` calls in the control stack (including file
path and line + col #s). Currently the algorithm will chose the
earliest/last frame with the `renderNodeDestructiveImpl` call (which is
the second last frame in our control stack), which is incorrect. The
actual matching frame in the control stack is the latest or first frame
(when traversing from the top) with the `renderNodeDestructiveImpl`
call. This leads to the rest of the stack diffing associating an
incorrect frame (`at getStackByComponentStackNode (...)`) for the
component.

Another issue with this approach is that it assumes all VMs will
truncate stack traces at the *bottom*, [which isn't the case for the
Hermes
VM](https://github.com/facebook/hermes/blob/df07cf713a84a4434c83c08cede38ba438dc6aca/lib/VM/JSError.cpp#L688-L699)
which **truncates stack traces in the middle**, placing a

```
    at renderNodeDestructiveImpl (...)
    ... skipping {n} frames
    at renderNodeDestructive (...)
```

line in the middle of the stack trace for all stacks that contain more
than 100 traces. This causes stack traces for React Native apps using
the Hermes VM to potentially break for large component trees. Although
for this specific case with Hermes, it's possible to account for this by
either manually grepping and removing the `... skipping` line and
everything below it (see draft PR: #26999), or by implementing the
non-standard `prepareStackTrace` API which Hermes also supports to
manually generate a stack trace that truncates from the bottom ([example
implementation](main...KarimP:react:component-stack-hermes-fix)).

## The Fix

I found different ways to go about fixing this. The first was to search
for a common stack frame starting from the top/latest frame. It's a
relatively small change ([see
implementation](main...KarimP:react:component-stack-fix-2)),
although it is less performant by being n^2 (albeit with `n`
realistically being <= 5 here). It's also a bit more buggy for class
components given that different VMs insert a different amount of
additional lines for new/construct calls...

Another fix would be to actually implement a [longest common
substring](https://en.wikipedia.org/wiki/Longest_common_substring)
algorithm, which can also be roughly n^2 time (assuming the longest
common substring between control and sample will be most of the sample
frame).

The fix I ended up going with was have the lines that throw the control
error and the lines that call/instantiate the component be inside a
distinct method under an object property
(`"DescribeNativeComponentFrameRoot"`). All major VMs (Safari's
JavaScriptCore, Firefox's SpiderMonkey, V8, Hermes, and Bun) should
display the object property name their stack trace. I've also set the
`name` and `displayName` properties for method as well to account for
minification, any advanced optimizations (e.g. key crushing), and VM
inconsistencies (both Bun and Safari seem to exclusively use the value
under `displayName` and not `name` in traces for methods defined under
an object's own property...).

We can then find this "common" frame by simply finding the line that has
our special method name (`"DescribeNativeComponentFrameRoot"`), and the
rest of the code to determine the actual component line works as
expected. If by any chance we don't find a frame with our special method
name in either control or sample stack traces, we then revert back to
the existing approach mentioned above by searching for the last line of
the sample frame in the control frame.

## How did you test this change?

1. There are bunch of existing tests that ensure a properly formatted
component trace is logged for certain scenarios, so I ensured the
existing full test suite passed
2. I threw an error in a component that's deep in the component
hierarchy of a large React app (facebook) to ensure there's stack trace
truncation, and ensured the correct component stack trace was logged for
Chrome, Safari, and Firefox, and with and without minification.
3. Ran a large React app (facebook) on the Hermes VM, threw an error in
a component that's deep in the component hierarchy, and ensured that
component frames are generated despite stack traces being truncated in
the middle.

DiffTrain build for [88b00de](88b00de)
EdisonVan pushed a commit to EdisonVan/react that referenced this pull request Apr 15, 2024
…ook#27132)

## Summary

There's a bug with the existing stack comparison algorithm in
`describeNativeComponentFrame` — specifically how it attempts to find a
common root frame between the control and sample stacks. This PR
attempts to fix that bug by injecting a frame that can have a guaranteed
string in it for us to search for in both stacks to find a common root.

## Brief Background/How it works now

Right now `describeNativeComponentFrame` does the following to leverage
native browser/VM stack frames to get details (e.g. script path, row and
col #s) for a single component:
1. Throwing and catching a control error in the function
2. Calling the component which should eventually throw an error (most of
the time), that we'll catch as our sample error.
3. Diffing the stacks in the control and sample errors to find the line
which should represent our component call.

## What's broken

To account for potential stack trace truncation, the stack diffing
algorithm first attempts to find a common "root" frame by inspecting the
earliest frame of the sample stack and searching for an identical frame
in the control stack starting from the bottom. However, there are a
couple of scenarios which I've hit that cause the above approach to not
work correctly.

First, it's possible that for render passes of extremely large component
trees to have a lot of repeating internal react function calls, which
can result in an incorrect common or "root" frame found. Here's a small
example from a stack trace using React Fizz for SSR.
Our control frame can look like this:
```
Error:
    at Fake (...)
    at construct (native)
    at describeNativeComponentFrame (...)
    at describeClassComponentFrame (...)
    at getStackByComponentStackNode (...)
    at getCurrentStackInDEV (...)
    at renderNodeDestructive (...)
    at renderElement (...)
    at renderNodeDestructiveImpl (...) // <-- Actual common root frame with the sample stack
    at renderNodeDestructive (...)
    at renderElement (...)
    at renderNodeDestructiveImpl (...) // <-- Incorrectly chosen common root frame
    at renderNodeDestructive (...)
```

And our sample stack can look like this:
```
Error:
    at set (...)
    at PureComponent (...)
    at call (native)
    at apply (native)
    at ErrorBoundary (...)
    at construct (native)
    at describeNativeComponentFrame (...)
    at describeClassComponentFrame (...)
    at getStackByComponentStackNode (...)
    at getCurrentStackInDEV (...)
    at renderNodeDestructive (...)
    at renderElement (...)
    at renderNodeDestructiveImpl (...) // <-- Root frame that's common in the control stack
```

Here you can see that the earliest trace in the sample stack, the
`renderNodeDestructiveImpl` call, can exactly match with multiple
`renderNodeDestructiveImpl` calls in the control stack (including file
path and line + col #s). Currently the algorithm will chose the
earliest/last frame with the `renderNodeDestructiveImpl` call (which is
the second last frame in our control stack), which is incorrect. The
actual matching frame in the control stack is the latest or first frame
(when traversing from the top) with the `renderNodeDestructiveImpl`
call. This leads to the rest of the stack diffing associating an
incorrect frame (`at getStackByComponentStackNode (...)`) for the
component.

Another issue with this approach is that it assumes all VMs will
truncate stack traces at the *bottom*, [which isn't the case for the
Hermes
VM](https://github.com/facebook/hermes/blob/df07cf713a84a4434c83c08cede38ba438dc6aca/lib/VM/JSError.cpp#L688-L699)
which **truncates stack traces in the middle**, placing a

```
    at renderNodeDestructiveImpl (...)
    ... skipping {n} frames
    at renderNodeDestructive (...)
```

line in the middle of the stack trace for all stacks that contain more
than 100 traces. This causes stack traces for React Native apps using
the Hermes VM to potentially break for large component trees. Although
for this specific case with Hermes, it's possible to account for this by
either manually grepping and removing the `... skipping` line and
everything below it (see draft PR: facebook#26999), or by implementing the
non-standard `prepareStackTrace` API which Hermes also supports to
manually generate a stack trace that truncates from the bottom ([example
implementation](facebook/react@main...KarimP:react:component-stack-hermes-fix)).

## The Fix

I found different ways to go about fixing this. The first was to search
for a common stack frame starting from the top/latest frame. It's a
relatively small change ([see
implementation](facebook/react@main...KarimP:react:component-stack-fix-2)),
although it is less performant by being n^2 (albeit with `n`
realistically being <= 5 here). It's also a bit more buggy for class
components given that different VMs insert a different amount of
additional lines for new/construct calls...

Another fix would be to actually implement a [longest common
substring](https://en.wikipedia.org/wiki/Longest_common_substring)
algorithm, which can also be roughly n^2 time (assuming the longest
common substring between control and sample will be most of the sample
frame).

The fix I ended up going with was have the lines that throw the control
error and the lines that call/instantiate the component be inside a
distinct method under an object property
(`"DescribeNativeComponentFrameRoot"`). All major VMs (Safari's
JavaScriptCore, Firefox's SpiderMonkey, V8, Hermes, and Bun) should
display the object property name their stack trace. I've also set the
`name` and `displayName` properties for method as well to account for
minification, any advanced optimizations (e.g. key crushing), and VM
inconsistencies (both Bun and Safari seem to exclusively use the value
under `displayName` and not `name` in traces for methods defined under
an object's own property...).

We can then find this "common" frame by simply finding the line that has
our special method name (`"DescribeNativeComponentFrameRoot"`), and the
rest of the code to determine the actual component line works as
expected. If by any chance we don't find a frame with our special method
name in either control or sample stack traces, we then revert back to
the existing approach mentioned above by searching for the last line of
the sample frame in the control frame.

## How did you test this change?

1. There are bunch of existing tests that ensure a properly formatted
component trace is logged for certain scenarios, so I ensured the
existing full test suite passed
2. I threw an error in a component that's deep in the component
hierarchy of a large React app (facebook) to ensure there's stack trace
truncation, and ensured the correct component stack trace was logged for
Chrome, Safari, and Firefox, and with and without minification.
3. Ran a large React app (facebook) on the Hermes VM, threw an error in
a component that's deep in the component hierarchy, and ensured that
component frames are generated despite stack traces being truncated in
the middle.
bigfootjon pushed a commit that referenced this pull request Apr 18, 2024
## Summary

There's a bug with the existing stack comparison algorithm in
`describeNativeComponentFrame` — specifically how it attempts to find a
common root frame between the control and sample stacks. This PR
attempts to fix that bug by injecting a frame that can have a guaranteed
string in it for us to search for in both stacks to find a common root.

## Brief Background/How it works now

Right now `describeNativeComponentFrame` does the following to leverage
native browser/VM stack frames to get details (e.g. script path, row and
col #s) for a single component:
1. Throwing and catching a control error in the function
2. Calling the component which should eventually throw an error (most of
the time), that we'll catch as our sample error.
3. Diffing the stacks in the control and sample errors to find the line
which should represent our component call.

## What's broken

To account for potential stack trace truncation, the stack diffing
algorithm first attempts to find a common "root" frame by inspecting the
earliest frame of the sample stack and searching for an identical frame
in the control stack starting from the bottom. However, there are a
couple of scenarios which I've hit that cause the above approach to not
work correctly.

First, it's possible that for render passes of extremely large component
trees to have a lot of repeating internal react function calls, which
can result in an incorrect common or "root" frame found. Here's a small
example from a stack trace using React Fizz for SSR.
Our control frame can look like this:
```
Error:
    at Fake (...)
    at construct (native)
    at describeNativeComponentFrame (...)
    at describeClassComponentFrame (...)
    at getStackByComponentStackNode (...)
    at getCurrentStackInDEV (...)
    at renderNodeDestructive (...)
    at renderElement (...)
    at renderNodeDestructiveImpl (...) // <-- Actual common root frame with the sample stack
    at renderNodeDestructive (...)
    at renderElement (...)
    at renderNodeDestructiveImpl (...) // <-- Incorrectly chosen common root frame
    at renderNodeDestructive (...)
```

And our sample stack can look like this:
```
Error:
    at set (...)
    at PureComponent (...)
    at call (native)
    at apply (native)
    at ErrorBoundary (...)
    at construct (native)
    at describeNativeComponentFrame (...)
    at describeClassComponentFrame (...)
    at getStackByComponentStackNode (...)
    at getCurrentStackInDEV (...)
    at renderNodeDestructive (...)
    at renderElement (...)
    at renderNodeDestructiveImpl (...) // <-- Root frame that's common in the control stack
```

Here you can see that the earliest trace in the sample stack, the
`renderNodeDestructiveImpl` call, can exactly match with multiple
`renderNodeDestructiveImpl` calls in the control stack (including file
path and line + col #s). Currently the algorithm will chose the
earliest/last frame with the `renderNodeDestructiveImpl` call (which is
the second last frame in our control stack), which is incorrect. The
actual matching frame in the control stack is the latest or first frame
(when traversing from the top) with the `renderNodeDestructiveImpl`
call. This leads to the rest of the stack diffing associating an
incorrect frame (`at getStackByComponentStackNode (...)`) for the
component.

Another issue with this approach is that it assumes all VMs will
truncate stack traces at the *bottom*, [which isn't the case for the
Hermes
VM](https://github.com/facebook/hermes/blob/df07cf713a84a4434c83c08cede38ba438dc6aca/lib/VM/JSError.cpp#L688-L699)
which **truncates stack traces in the middle**, placing a

```
    at renderNodeDestructiveImpl (...)
    ... skipping {n} frames
    at renderNodeDestructive (...)
```

line in the middle of the stack trace for all stacks that contain more
than 100 traces. This causes stack traces for React Native apps using
the Hermes VM to potentially break for large component trees. Although
for this specific case with Hermes, it's possible to account for this by
either manually grepping and removing the `... skipping` line and
everything below it (see draft PR: #26999), or by implementing the
non-standard `prepareStackTrace` API which Hermes also supports to
manually generate a stack trace that truncates from the bottom ([example
implementation](main...KarimP:react:component-stack-hermes-fix)).

## The Fix

I found different ways to go about fixing this. The first was to search
for a common stack frame starting from the top/latest frame. It's a
relatively small change ([see
implementation](main...KarimP:react:component-stack-fix-2)),
although it is less performant by being n^2 (albeit with `n`
realistically being <= 5 here). It's also a bit more buggy for class
components given that different VMs insert a different amount of
additional lines for new/construct calls...

Another fix would be to actually implement a [longest common
substring](https://en.wikipedia.org/wiki/Longest_common_substring)
algorithm, which can also be roughly n^2 time (assuming the longest
common substring between control and sample will be most of the sample
frame).

The fix I ended up going with was have the lines that throw the control
error and the lines that call/instantiate the component be inside a
distinct method under an object property
(`"DescribeNativeComponentFrameRoot"`). All major VMs (Safari's
JavaScriptCore, Firefox's SpiderMonkey, V8, Hermes, and Bun) should
display the object property name their stack trace. I've also set the
`name` and `displayName` properties for method as well to account for
minification, any advanced optimizations (e.g. key crushing), and VM
inconsistencies (both Bun and Safari seem to exclusively use the value
under `displayName` and not `name` in traces for methods defined under
an object's own property...).

We can then find this "common" frame by simply finding the line that has
our special method name (`"DescribeNativeComponentFrameRoot"`), and the
rest of the code to determine the actual component line works as
expected. If by any chance we don't find a frame with our special method
name in either control or sample stack traces, we then revert back to
the existing approach mentioned above by searching for the last line of
the sample frame in the control frame.

## How did you test this change?

1. There are bunch of existing tests that ensure a properly formatted
component trace is logged for certain scenarios, so I ensured the
existing full test suite passed
2. I threw an error in a component that's deep in the component
hierarchy of a large React app (facebook) to ensure there's stack trace
truncation, and ensured the correct component stack trace was logged for
Chrome, Safari, and Firefox, and with and without minification.
3. Ran a large React app (facebook) on the Hermes VM, threw an error in
a component that's deep in the component hierarchy, and ensured that
component frames are generated despite stack traces being truncated in
the middle.

DiffTrain build for commit 88b00de.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants