Add serializer mode for test262 execution #2290
Conversation
Regarding "all native function values should be intrinsics": There was a discussion on that here #1285 --- maybe that issue got closed prematurely. I don't think this is terribly relevant, might just be triggered by some common test code path. |
That's not good. However, your examples seem to all involve classes, which wouldn't surprise me as that's an undertested feature (it's not ES5), and comes with an open issue mentioning some of the things that need to happen, possibly unrelated to the visiting/serialization bug you found: #1317. |
Hm, yes, classes... At the very least, Prepack should give up when encountering not actually supported class features. Great to see that we now have a way to detect them! |
“Lots” might be a bit of an exaggeration as you note, @NTillmann. More like “lots” relative to 5% of tests that failed. A 95% pass rate is pretty good. |
What's the change in pass rate when you exclude tests that involve classes? |
scripts/test262-runner.js
Outdated
} | ||
|
||
const context = vm.createContext({ | ||
// NOTE: Workaround since Prepack serializes code that expects a global `TypedArray` class which does not exist per |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it exists in JSC? Anyway, sounds like a bug that deserves an issue number and a TODO...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#2292 👍
return "Recover"; | ||
}, | ||
onParse: () => { | ||
// TODO(calebmer): Turn all literals into abstract values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess behind some other option(s) that would allow to different levels of fuzzing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am surprised that it only takes 5 minutes with the serializer in the loop.
95% still. I added a basic check to exclude source files that include
I also opened some issues for problems I thought were actionable and not class related. (Since class serialization is known to have issues.)
None seem super related to React code, but #2293 might be. Haven’t audited all the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@calebmer is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: I extended the `--serializer` command line argument I added in #2290 to now support `--serializer abstract-scalar`. What this mode does is it converts all boolean, string, number, and symbol literals into abstract values. I did not choose to extend this logic to object and array literals just yet since scalars alone showed some interesting results. What I really want here is a review of the results. Full suite execution results are real bad. **18%** pass rate. I dug a bit into why. ``` === RESULTS === Passes: 3356 / 17780 (18%) ES5 passes: 2276 / 12045 (18%) ES6 passes: 1080 / 5735 (18%) Skipped: 13375 Timeouts: 28 ``` I was mostly interested in the runtime failures we see since that means Prepack is serializing invalid code. However, I found ~14k failures in the Prepack stage (more on this in a bit) and ~3k failures in the runtime stage. This means ~80% of tests _fail to compile_ with this abstract transformation applied. Why are these tests failing? I took the first 4 items of the stack traces from errors thrown in the Prepack stage, sorted, and ranked them. [Here’s the result.](https://gist.github.com/calebmer/29e27613325fd99fa04be7ab4a9641c0) The top 5 with thousands of hits are: ``` 7538 of: at AbstractValue.throwIfNotConcrete (/Users/calebmer/prepack/src/values/AbstractValue.js:536:11) at ToImplementation.ToStringPartial (/Users/calebmer/prepack/src/methods/to.js:717:69) at NativeFunctionValue._index.NativeFunctionValue [as callback] (/Users/calebmer/prepack/src/intrinsics/ecma262/String.js:34:37) at NativeFunctionValue.callCallback (/Users/calebmer/prepack/src/values/NativeFunctionValue.js:121:12) 4595 of: at AbstractValue.throwIfNotConcrete (/Users/calebmer/prepack/src/values/AbstractValue.js:536:11) at NativeFunctionValue.func.defineNativeMethod [as callback] (/Users/calebmer/prepack/src/intrinsics/ecma262/Object.js:328:41) at NativeFunctionValue.callCallback (/Users/calebmer/prepack/src/values/NativeFunctionValue.js:121:12) at functionCall (/Users/calebmer/prepack/src/methods/call.js:308:26) 1454 of: at AbstractValue.throwIfNotConcrete (/Users/calebmer/prepack/src/values/AbstractValue.js:536:11) at NativeFunctionValue.func.defineNativeMethod [as callback] (/Users/calebmer/prepack/src/intrinsics/ecma262/Object.js:364:41) at NativeFunctionValue.callCallback (/Users/calebmer/prepack/src/values/NativeFunctionValue.js:121:12) at functionCall (/Users/calebmer/prepack/src/methods/call.js:308:26) 1351 of: at invariant (/Users/calebmer/prepack/src/invariant.js:18:15) at EvalPropertyNamePartial (/Users/calebmer/prepack/src/evaluators/ObjectExpression.js:59:7) at _default (/Users/calebmer/prepack/src/evaluators/ObjectExpression.js:80:21) at LexicalEnvironment.evaluateAbstract (/Users/calebmer/prepack/src/environment.js:1368:20) 1053 of: at AbstractValue.throwIfNotConcrete (/Users/calebmer/prepack/src/values/AbstractValue.js:536:11) at NativeFunctionValue.obj.defineNativeMethod [as callback] (/Users/calebmer/prepack/src/intrinsics/ecma262/ObjectPrototype.js:35:39) at NativeFunctionValue.callCallback (/Users/calebmer/prepack/src/values/NativeFunctionValue.js:121:12) at functionCall (/Users/calebmer/prepack/src/methods/call.js:308:26) ``` This means there may be some low hanging fruit. Here are my questions for you. - Did you expect results like this? - What is our ideal test262 pass rate with this transformation applied? - What happens to React Compiler or other projects when these errors are thrown? (As I understand it, we bail out and don’t optimize the code, but do optimize the code around it.) - Do you think my methodology is flawed? It’s also possible that something in my methodology is wrong, but I didn’t spend much time investigating these failures as I spent investigating the failures I found in #2290. My goal with this test suite is to build an understanding of what “correctness” for the React Compiler against all JavaScript code looks like. (Not just the few bundles we’ve selected to look at.) I don’t think these results suggest that we only safely compile 18% of the language, but it’s a data point. I’ll be looking into fixing a selection of these issues to better understand their nature or if I need to change methodologies. Pull Request resolved: #2297 Differential Revision: D9120572 Pulled By: calebmer fbshipit-source-id: b394f1e8da034c9985366010e3e63fd55fd94168
I want to better understand the bugs we have in abstract evaluation. One of my ideas to do this is to replace all the literals in test262 tests with abstract values and see what our test coverage is. The first step to do this is to serialize the test262 sources after running them in Prepack and checking if their generated JavaScript runs correctly. This PR does that by adding a
--serializer
flag.With my methodology, all test262 harnesses are serialized every time since they are in the global scope. This unfortunately seems to be unavoidable since doing otherwise would change program semantics and break the tests.
Running
time yarn test-test262
takes about 2 minutes and has a 98% pass rate. Runningtime yarn test-test262 --serializer
takes about 5 minutes and has a 95% pass rate. Here’s a diff of the two results.Here is a selection of some of the bugs in the serializer caught by running test262 with the Prepack serializer. I might open issues for these, but they can be a bit pedantic. I might fix some of them depending on how important they are to the React code we want to compile.
Notably we have:
07.md
.)01.md
.)I’ll build off this PR to get coverage for the abstract evaluator next, but I found these results interesting on their own so decided to do this in two PRs.