update test262 +test262.properties file (re)generator #930

p-bakker · 2021-06-16T13:33:47Z

As we now have Template Literal support, I wanted to update the version of the test2626 TestSuite we run against to the latest commit in master.

As I didn't want to manually update test262.properties accordingly, I set out to write some code to generate test262.properties from the results of running the TestSuite at the standard optLevels (-1, 0 and 9).

In order to make things easier, I changed the format a bit:

removed the ! in front of every .js file: if the file is listed in test262.properties, it's already an exclusion, so the ! is unneeded
This step makes the directory/file path a key in the Java Properties file format, instead of a commented line. As a result, it's now possible to add comments after the path and if your editor supports the Java Properties file format, you get nice colors
changed from # to ~ for excluding a directory
support for comments after the folder/.js file name

I've also changed the behavior a bit:

removed the logic that allowed excluding a similarly named file in multiple subfolders, by excluding the parent path
While excluding this logic made it easier to (re)generate the properties file, I think it was also very confusing behavior to begin with
on .js files that only fail in some strict-mode/optLevel combinations, details which variations fail is added as the first comment:
- strict: fails only in strict mode on all optLevels
- non-strict: fails only in non-strict mode on all optLevels
- non-interpreted: always passes in interpreted mode, but fails on all other optLevels
- {unsupported: [...]}: skipped due to depending on unsupported features
- {strict: [1,9], non-strict: [9]}: any other scenario not matching the ones above
if all tests fail for a subfolder (i.e. built-in/Array/prototype/xxxx), the .js files can be omitted and replaced by an * in front of the subfolder
all tests that fail due to depending on unsupported features are included (by default), in order to:
- have a full picture
- easier to comprehend the delta between the old and new .properties file when regenerating it after updating the version of the test262 Test Suite
on (rolled-up) folder entries stats are included as the first part in the comment # of failed tests vs. the total # of tests

Check out the updated ./testsrc/README.md for more details, but the basic gist of regenerating the test262.properties file is ./gradlew test --tests org.mozilla.javascript.tests.Test262SuiteTest --rerun-tasks -DupdateTest262properties

The only downside of regenerating the test262.properties file is that any lines that are just comments are lost. To me, that is a reasonable tradeoff though

If the changes in this PR seem overwhelming (especially comparing test262.properties against master): I did things in steps (1 through 6), see the individual commits

@rbri I hope you don't mind but I took your work in #910 as the basis of my work. I think this PR makes yours obsolete, as it adds all the relevant info directly as comments on the individual test cases in the test262.properties file.

When attempting to update to the latest commit in the main branch of test262 I discovered that since September last year, the test262 harness utilizes rest parameters, so unfortunately wasn't able to update all the way to the latest commit on their main branch.

Some open questions:

when (re)generating the test262.properties file, there are a couple of flags that affect the output (see readme). The default is to do rollup, include stats and include failing tests due to unsupported features. To me this makes the most sense, but the default (thus what we should also commit) could be changed
the stats show the % failed test. Could change that to show the % passed. Dunno which number makes us happier :-)

Next steps (can be a future PR(s)):

add rest parameter support to Rhino, so we can attempt to update the version of test262 we test against to the latest commit on their main branch
move excluded features list out of Test262TestSuite and into test262.properties
move excluded flags list out of Test262TestSuite and into test262.properties
when (re)generating the .properties file, include all test262/test// folders in the newly generated .properties file (so new folders added in the test262 repo will automatically be included)

Hope you guys see this as a good step forwards.

closes #817

rbri · 2021-06-16T15:17:27Z

removed the logic that allowed excluding a similarly named file in multiple subfolders, by excluding the parent path

Great

ll tests that fail due to depending on unsupported features are included (by default), in order to:

Yes you made my day

Hope you guys see this as a good step forwards.

From my point of view this is more than one good step forward - Thanks a lot

gbrail · 2021-06-16T20:19:03Z

This looks very promising. Can we talk for a second about how we'll use it going forward?

Does the logic to regenerate test262.properties rewrite the whole file, or does it preserve what's there currently? In particular, I see that you added a lot of helpful metadata to test262.properties about why a particular test is failing. Does all that get thrown away when the file is regenerated?

If so, I'd suggest that we'll really only use the generator once, to build the files for this PR, and then after that we'll edit the file manually in individual PRs like we do today.

gbrail · 2021-06-16T20:19:28Z

Second question -- can we use a transpiler to make the test262 files run for us even though they use rest parameters?

gbrail

Thanks! I have a few questions about the syntax that I put in with the changes to the properties file.

gbrail · 2021-06-16T20:21:50Z

testsrc/README.md

+    S15.1.3.1_A2.4_T1.js
+    S15.1.3.1_A5.2.js
+```
+Folders can be prefixed with a `~` to mark the folder to be skipped entirely, for example because it contains all tests for a feature not yet supported by Rhino or because running the tests take a long time.


If the tilde marks the whole folder to be skipped, doesn't that imply that all the files in it would also be skipped?

In other words, in the example above, wouldn't the same behavior result if the three individual file names (name.js and so on) were not in the file?

yes, but built-ins/decodeURI is excluded because of performance reasons. The three files were failing at the time when the entire folder was excluded (or so I assumed). And I decided we do not want to loose this info (as it was there in the file before I made changes

Does that mean on regeneration if the directory is marked as skipped, then it retains the files listed below it?

Yup: marking a folder to be skipped using the ~ is a manual action. If you as a dev decide to not remove the failing files of that folder, then when regenerating the file, it'll copy them over

As I mentioned in the description of the PR and the README: the only things lost when regenerating the file are lines that just contain comments (first non-whitespace character on the line being either an ! or a #)

gbrail · 2021-06-16T20:23:04Z

testsrc/README.md

 ```
+If all files in a (sub) folder are expected to fail, instead of listing all files explicitly, the folder can be marked as expected to contain only failing tests by prefixing the folder path with an `*`


A syntax that folks might understand better would be "prototype/flatMap/*". Or even use "**" to indicate multiple levels of directory. Is that a big change -- it might require creating a regex.

Fairly easy to change and I had thought about it, but I chose to go with a prefix instead of a suffix because if you glance over the file, the prefix stands out more than a suffix would.

But if people feel a suffix makes more logical sense, I can change it, np

Not sure how much weight I have, but I vote to leave it as a prefix for the reason mentioned.

I vote for suffix

So I guess @gbrail will be the deciding vote?

My mistake, but I guess the question still holds. Is there a way to skip a subset of a folder? I have no idea if this is really necessary, and it may not be.

Do you even need a symbol to mark a subfolder as failing? The files don't have one, and the subfolder is listed in the same way, correct? Does the symbol in this case only serve to identify it as a folder instead of a file?

Do you even need a symbol to mark a subfolder as failing? The files don't have one, and the subfolder is listed in the same way, correct? Does the symbol in this case only serve to identify it as a folder instead of a file?

True, didn't think of that. So we have the following options to choose from then:

! prefix (current behavior) built-ins/Array *prototype/flatMap 21/21 (100.0%)

! nothing built-ins/Array prototype/flatMap 21/21 (100.0%)

! / suffix built-ins/Array prototype/flatMap/ 21/21 (100.0%)

! /* suffix built-ins/Array prototype/flatMap/* 21/21 (100.0%)

! /** suffix built-ins/Array prototype/flatMap/** 21/21 (100.0%)

My mistake, but I guess the question still holds. Is there a way to skip a subset of a folder? I have no idea if this is really necessary, and it may not be.

ATM there are no subfolders or .js files that are being skipped in test262.properties, only what I call topLevel folders (direct children of test262/test/).
Had a look and concluded that attempting to skip subfolders or individual .js files under those would currently cause issues, both in running the test suite and in regenerating the .properties file.
So for now I'll ahead and make sure skipping folders is only supported for topLevel folders. If the need arises in the future to be able to skip on a lower level, that could be added.

Note that most subfolders that only contain failing tests are related to unsupported features, which, if properly tagged as such in test262 and properly excluded by us (we have a hardcoded list of unsupported features), they will be skipped anyway.

I would just go with the !nothing option. A prefix is not necessary. There is already a de facto "suffix" because you can tell by the absence of .js at the end that it is a folder, plus we don't have trailing slashes for the top level folders.

Will the generator create these folder exceptions instead of listing each file individually, or is this only a manual action?

I'm ok with the 'nothing' if everyone else agrees with that as well

If the generator concludes that all files in a folder (that is not marked to be skipped) fail, it'll just include the folder and not the individual files (unless you not enable 'rollup' through the command line (by default 'rollup' is enabled))

The trailing / is also an option for me, maybe a bit more readable.

Adding a trailing / to the topLevel folder entries for consistency wouldn't be difficult

gbrail · 2021-06-16T20:24:12Z

testsrc/README.md

+built-ins/Array <- include all the tests in the test262/built-ins/Array folder
+    from/calling-from-valid-1-noStrict.js <- but expect that this tests fails
+```
+If `built-ins/Array/from/calling-from-valid-1-noStrict.js` indeed fails, it wont fail the test suite. If it passes, this will be logged while running the test suite


I'm a bit confused about the default behavior. I understand that if "built-ins/Array" is included, then we run all the files in that directory except for the ones called out below. But if "built-ins/Array" is not included in the file, then do we run those tests anyway? The PR comments seemed to indicate that we do.

I guess the PR comments are then misleading: both before my changes and now with my changes, only testfiles from the test262/test/ subfolders listed in the test262.property at the top level (i.e. built-ins/Array or language/expressions/function) will be included.

At the bottom of my PR description I did mention that in a future task we should include ALL test262/test subfolders into the test262.properties file when regenerating it, so we dont miss out on newly added folders in the test262 project

I think that "nothing" is OK. If there's no .js suffix, then we can assume that it's a prefix, and if there is a .js suffix, then we can assume that it's a fail. I don't feel strongly, but if you ask me, that makes the most sense and is the simplest.

p-bakker · 2021-06-16T20:40:58Z

Does the logic to regenerate test262.properties rewrite the whole file, or does it preserve what's there currently? In particular, I see that you added a lot of helpful metadata to test262.properties about why a particular test is failing. Does all that get thrown away when the file is regenerated?

It rewrites the whole file, but looks at the existing file for user/dev provided comments to carry over. All the other stuff I added (x/y (zzz%), strict/non-strict/non-interpreted/{unsupported: [...]}/{strict: [...], non-strict: [...]} is all regenerated based on the actual test results when regenerating the file

If you check out my branch and run the command to update test262.properties, the regenerated file is identical to what it was before. If you fix some code so tests stop failing, their entry would be removed on regeneration. If you make some changes to Rhino code that result in more broken tests, the broken tests would get added as exclusions on regeneration.

Basically: if you regenerate test262.properties and run the tests afterwards, the Test Suite will pass without warnings

If so, I'd suggest that we'll really only use the generator once, to build the files for this PR, and then after that we'll edit the file manually in individual PRs like we do today.

That is not my idea: while you can manually edit the test262.properties file, you're better off generating it (guess it depends on the # of edits). Benefit of regenerating the file everytime is that the stats are up to date and if you update the version of the test262 suite, the delta between the old and regenerated version is just the delta due to the changes in test262

can we use a transpiler to make the test262 files run for us even though they use rest parameters?

While I'm sure we can, I think we're better off adding rest parameter support :-) Semantics aren't that complicated, just need to figure out where to hook it in, see #817 (comment)

Tnx for the questions @gbrail

p-bakker · 2021-06-17T15:44:27Z

Got rest params going :-)

Only tests that fail are on functions using non-simple params (rest/spread/destructuring) not allowed to have 'use strict' in the function body. The current destructuring parameters implementation seems to also fail those tests. Will see what can be done about those.

So, if we can wrap up this PR, I can create a PR to add Rest Param support and update test262 to the last commit on their main branch hopefully (haven't tested if there's another feature we're missing that prevents us from doing so)

p-bakker · 2021-06-18T09:21:29Z

In response to the whole prefix/suffix discussion: while trying to implement the nothing option today, I concluded there are some challenges with the nothing option:

Example:

built-ins/Map 7/145 (4.83%) <- topLevel folder
    Symbol.species 4/4 (100.0%) <- subfolder of the built-ins/Map topLevel folder
    iterable-calls-set.js
    iterator-is-undefined-throws.js
    proto-from-ctor-realm.js {unsupported: [Reflect, cross-realm]}

Both built-ins/Map and Symbol.species refer to folders on disk, but their meaning within test262.properties is different:

built-ins/Map refers to a topLevel folder and means: include all tests in built-ins/Map and all of its subfolders
Symbol.species refers to a subfolder of built-ins/Map and means: all tests in built-ins/Map/Symbol.species/** fail

When programatically reading the test262.properties file, the code needs to determine whether a line containing a folder path is a topLevel folder or a subfolder within a topLevel folder. If not using any prefix or suffix to distinguish the two, we need another means to determine this. I see the following options:

whitespace: could assume a line with a folder path that has no leading whitespace is a topLevel folder, otherwise a subfolder
hardcoded topLevel folder names: test262 (currently) comes with 5 topLevel folders in the /test directory (annexB, built-ins, harness, intl402 and language). Could assume that if a folder path starts with any of these 5 names, it must be a topLevel folder
folder existence testing: when encountering a line with a folder path, could test whether that path exists as a subfolder of the current topLevel folder being processed. If it doesn't exist, assume a topLevel folder

Personally, I prefer the first option (whitespace testing) as:

depending on hardcoded topLevel folder names might cause issues if subfolders with the same name occur
folder existence testing will reach the wrong conclusions if a new version of test262 has removed the subfolder

Thoughts?

If we choose one of these options (instead of using a prefix/suffix) it implies that it will not be possible to roll up all files in a topLevel folder if all tests in a topLevel folder fail (but we can mark the topLevel folder to be skipped using the ~), as a line with a topLevel folder path means 'include all tests within this folder'.

Eventually I want to read all folders from disk when regenerating the test262.properties file and include then in the regenerated file, so we don't 'forget' to include new folders being added to test262. I guess we could decide to not add new topLevel folders to test if all tests in them fail in that case or include them as excluded

tonygermano · 2021-06-19T01:23:05Z

I actually just assumed from the way the file was laid out that whitespace had meaning.

Don't kill me, but is subfolder rollup really needed? What's the problem with listing each file individually, especially if it's going to be auto-generated anyway? It's probably simpler to just do files.

Referencing your example above, what should be the appropriate action if an entire subfolder is excluded and commented and then a code change allows some of the files in the directory to pass? When the properties file is regenerated the directory needs to be replaced with the remaining failing tests, but do they inherit or ignore the directory comment?

All that being said, whatever you decide on at this point I will probably find acceptable.

p-bakker · 2021-06-19T06:51:37Z

Think the roll-up is handy (gives you the likely insight that a certain method/feature isn't implemented). If you just see a list of files, you won't know if it's all or a subset of the files in the folder

A comment on a rolled up directory would be lost if some of the tests in the directory starts passing and the file is regenerated, as the individual remaining failing tests would be listed, instead of the rolled up folder

Will implement the whitespace option next week, unless I hear from anyone who thinks I should not 🙂

Cleaning up existing test262.properties file for larger changes to come, removing wildcard matching from the 262Test runner - Removed wildcard matching: now need to be explicit: 'name.js' wont match any file in any subdirectory that ends with `name.js` anymore - added all files with full path for current "wildcard" entries and removed wildcard entries (xxxx.js, matching **/*xxxx.js) - removed exclusions of test files that depend on unsupported features (are filtered out/skipped anyway when running the tests) - made sure test files within a directory are listed in a certain order (as to be able to better compare the current test262.properties with a generated one later on) - marked folders containing only failing tests as excluded (instead of listing all failing files individually) - removed some comments

- removed unneeded ! in front of every .js file: every .js file is an exclusion regardless and by removing the !, the filePath becomes a key within the Java Properties format - use ~ instead of # for excluding directories - no need to exclude files under an already excluded directory - add generateTest262properties commandline flag to regenerate the used .properties file based on the test results of running the Test262SuiteTest

settings

+ some comments and code improvements

on unsupported features + added a few more features to the unsupported feature list

See tc39/test262@f94fc66 Unfortunately, as of the next commit the sta.js file from the harness is using rest parameters, so crashes all tests

p-bakker · 2021-06-21T19:12:46Z

Made the adjustments to not have a prefix or suffix on rolled up subfolders (and instead rely on whitespace to differentiate between topLevel folders and subfolders).

From my POV this is now ready to merge

p-bakker · 2021-06-23T09:36:48Z

@gbrail understand you've been busy getting the Promise PR ready, but could you give a quick update on where you stand regarding this PR?

Asking because I'd like to get this one out of the way before moving onto some of the other cases I'd like to tackle

gbrail · 2021-06-23T19:50:36Z

I think this is great and we should do it. However right now, at least on my machine, about 2000 tests are failing. I updated my test262 submodule using "git submodule update" to f94fc660cc3c59b1f2f9f122fc4d44b4434b935c. Can you check? I'll also re-run ci and see if anything else happened. If anything it might need you to merge the current master, which I won't push to until this is merged.

p-bakker · 2021-06-23T20:10:55Z

That is strange, don't get why you would get 2000 failing tests.

My branch is rebased on top of the current master and CI passes without failures.

I ran git checkout f94fc66 within the /test262 folder to update test262 to the proper commit

p-bakker · 2021-06-23T21:22:19Z

Are you getting 2000 failing test after checking out my branch and the proper test262 version and then running all tests?

Or 2000 new failing tests when running the tests with -DupdateTest262properties?

Either way I have no idea what could be going on though

gbrail · 2021-06-23T21:42:16Z

Well I don't know, I must have merged it wrong, but I did it again and now everything is good, so thanks! For this and subsequent PRs I'm going to start squashing commits because our commit history is full of these meaningless one-line comments, so watch out for that too.

…

On Wed, Jun 23, 2021 at 2:22 PM Paul Bakker ***@***.***> wrote: Are you getting 2000 failing test after checking out my branch and the proper test262 version and then running all tests? Or 2000 new failing tests when running the tests with -DupdateTest262properties? Either way I have no idea what could be going on though — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#930 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAD7I26BKZFCMSKAMRRN463TUJGBJANCNFSM46ZOYNPA> .

p-bakker mentioned this pull request Jun 16, 2021

Many tests excluded from the 262 test suite passing with some setup #909

Closed

rbri mentioned this pull request Jun 16, 2021

more detailed test output (see #909) #910

Closed

gbrail reviewed Jun 16, 2021

View reviewed changes

p-bakker added 14 commits June 21, 2021 20:15

Step 3: regeneration of test262.properties with default regeneration

ac39000

settings

intermediate step: making spotless happy

609bf9c

intermediate step: fix for bug processing rolled-up folders

dbc1278

+ some comments and code improvements

Step 4: change default to include tests that are skipped due to relying

e704b9e

on unsupported features + added a few more features to the unsupported feature list

intermediate step: code cleanup

1ff6771

Step 5: improved subfolder sorting a bit

7abf13c

Step 6: update test262 to last working version (Sep 15, 2020)

35fd712

See tc39/test262@f94fc66 Unfortunately, as of the next commit the sta.js file from the harness is using rest parameters, so crashes all tests

intermediate step: spotless

6c3acd5

intermediate step: small update to README.md

0750d23

intermediate step: spotless

c079fcf

Adjusting rollup format (no prefix anymore)

01ab074

Cleanup + regenerate of test262.properties after rebase

7395bda

p-bakker force-pushed the test262_prop_file_generator branch from d3f147d to 7395bda Compare June 21, 2021 18:43

p-bakker added 2 commits June 21, 2021 20:53

fix remove dependancy on Java 11 api

9a3bc29

spotless...

f972177

small update to README.md

ed82270

gbrail merged commit 263655b into mozilla:master Jun 23, 2021

p-bakker added this to the Release 1.7.14 milestone Oct 13, 2021

p-bakker added the enhancement label Oct 13, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

update test262 +test262.properties file (re)generator #930

update test262 +test262.properties file (re)generator #930

p-bakker commented Jun 16, 2021 •

edited

rbri commented Jun 16, 2021 •

edited

gbrail commented Jun 16, 2021

gbrail commented Jun 16, 2021

gbrail left a comment

gbrail Jun 16, 2021

p-bakker Jun 16, 2021

tonygermano Jun 16, 2021

p-bakker Jun 16, 2021 •

edited

gbrail Jun 16, 2021

p-bakker Jun 16, 2021 •

edited

tonygermano Jun 16, 2021

rbri Jun 16, 2021

p-bakker Jun 16, 2021

tonygermano Jun 16, 2021

p-bakker Jun 17, 2021 •

edited

tonygermano Jun 17, 2021

p-bakker Jun 17, 2021 •

edited

p-bakker Jun 17, 2021

gbrail Jun 16, 2021

p-bakker Jun 16, 2021 •

edited

gbrail Jun 17, 2021

p-bakker commented Jun 16, 2021 •

edited

p-bakker commented Jun 17, 2021

p-bakker commented Jun 18, 2021 •

edited

tonygermano commented Jun 19, 2021

p-bakker commented Jun 19, 2021 •

edited

p-bakker commented Jun 21, 2021

p-bakker commented Jun 23, 2021

gbrail commented Jun 23, 2021

p-bakker commented Jun 23, 2021

p-bakker commented Jun 23, 2021

gbrail commented Jun 23, 2021 via email

		```
		If all files in a (sub) folder are expected to fail, instead of listing all files explicitly, the folder can be marked as expected to contain only failing tests by prefixing the folder path with an `*`

update test262 +test262.properties file (re)generator #930

update test262 +test262.properties file (re)generator #930

Conversation

p-bakker commented Jun 16, 2021 • edited

rbri commented Jun 16, 2021 • edited

gbrail commented Jun 16, 2021

gbrail commented Jun 16, 2021

gbrail left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

p-bakker Jun 16, 2021 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

p-bakker Jun 16, 2021 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

p-bakker Jun 17, 2021 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

p-bakker Jun 17, 2021 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

p-bakker Jun 16, 2021 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

p-bakker commented Jun 16, 2021 • edited

p-bakker commented Jun 17, 2021

p-bakker commented Jun 18, 2021 • edited

tonygermano commented Jun 19, 2021

p-bakker commented Jun 19, 2021 • edited

p-bakker commented Jun 21, 2021

p-bakker commented Jun 23, 2021

gbrail commented Jun 23, 2021

p-bakker commented Jun 23, 2021

p-bakker commented Jun 23, 2021

gbrail commented Jun 23, 2021 via email

p-bakker commented Jun 16, 2021 •

edited

rbri commented Jun 16, 2021 •

edited

p-bakker Jun 16, 2021 •

edited

p-bakker Jun 16, 2021 •

edited

p-bakker Jun 17, 2021 •

edited

p-bakker Jun 17, 2021 •

edited

p-bakker Jun 16, 2021 •

edited

p-bakker commented Jun 16, 2021 •

edited

p-bakker commented Jun 18, 2021 •

edited

p-bakker commented Jun 19, 2021 •

edited