Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Acorn-based parsing #1820

Merged
merged 3 commits into from
Apr 11, 2024
Merged

Acorn-based parsing #1820

merged 3 commits into from
Apr 11, 2024

Conversation

adams85
Copy link
Contributor

@adams85 adams85 commented Mar 29, 2024

It was unavoidable, I had to see how my new acorn-based JS parser works with Jint... 😄

It seems that pretty well: it resolves most of the parser-related issues in test262 and brings a tiny perf. improvement in overall (see benchmark results below).

So, I'm curious to know what do you think about an acorn-based Jint? Of course, absolutely no pressure, plus this would mean some potentially annoying breaking changes, so if you'd rather stick with Esprima, I'm totally fine with that. (Even then this work won't have been done completely in vain as this is a great integration test for my parser, which uncovered some minor bugs and showed me where the AST has shortcomings currently.)

However, if you'd like to do the switch, that would make my very proud and gladly do the minor adjustments which are needed to finalize this PR!

@adams85
Copy link
Contributor Author

adams85 commented Mar 29, 2024

BenchmarkDotNet v0.13.12, Windows 10 (10.0.19045.4170/22H2/2022Update)
AMD Ryzen 7 2700X, 1 CPU, 16 logical and 8 physical cores
.NET SDK 8.0.202
[Host] : .NET 8.0.3 (8.0.324.11423), X64 RyuJIT AVX2
DefaultJob : .NET 8.0.3 (8.0.324.11423), X64 RyuJIT AVX2

Before (Esprima)

Method FileName Mean StdDev Rank Allocated
Jint array-stress 7,515.224 μs 42.4682 μs 1 7017.84 KB
Jint_ParsedScript array-stress 8,157.611 μs 115.4234 μs 2 6996.66 KB
Jint dromaeo-3d-cube 28,390.069 μs 42.0761 μs 4 6241.11 KB
Jint_ParsedScript dromaeo-3d-cube 27,196.405 μs 34.6054 μs 3 5983.8 KB
Jint dromaeo-core-eval 4,727.442 μs 13.2962 μs 3 355.79 KB
Jint_ParsedScript dromaeo-core-eval 4,475.208 μs 13.5812 μs 2 338.75 KB
Jint dromaeo-object-array 53,183.217 μs 784.0281 μs 1 100410.77 KB
Jint_ParsedScript dromaeo-object-array 56,760.258 μs 930.0847 μs 2 100370.95 KB
Jint droma(...)egexp [21] 229,861.280 μs 2,785.7601 μs 2 163402.02 KB
Jint_ParsedScript droma(...)egexp [21] 222,161.400 μs 2,183.8414 μs 1 169953.97 KB
Jint droma(...)tring [21] 404,473.279 μs 3,376.5817 μs 1 1321472.77 KB
Jint_ParsedScript droma(...)tring [21] 398,929.113 μs 5,418.5646 μs 1 1321261.68 KB
Jint droma(...)ase64 [21] 60,223.052 μs 63.6666 μs 3 6124.7 KB
Jint_ParsedScript droma(...)ase64 [21] 59,242.062 μs 120.8232 μs 2 6039.83 KB
Jint evaluation 36.788 μs 0.6763 μs 2 40.85 KB
Jint_ParsedScript evaluation 15.226 μs 0.1624 μs 1 32.28 KB
Jint linq-js 2,483.586 μs 6.9467 μs 3 1273.64 KB
Jint_ParsedScript linq-js 141.609 μs 1.8270 μs 1 224.88 KB
Jint minimal 5.802 μs 0.2169 μs 3 14.6 KB
Jint_ParsedScript minimal 3.164 μs 0.0050 μs 1 13.22 KB
Jint stopwatch 389,799.457 μs 2,555.6312 μs 4 53042.25 KB
Jint_ParsedScript stopwatch 385,872.907 μs 2,871.8493 μs 4 53017.58 KB

** After (Acornima) **

Method FileName Mean StdDev Rank Allocated
Jint array-stress 7,493.907 μs 55.6531 μs 1 7016.32 KB
Jint_ParsedScript array-stress 8,521.961 μs 111.3640 μs 3 6994.32 KB
Jint dromaeo-3d-cube 28,166.099 μs 27.7357 μs 4 6234.68 KB
Jint_ParsedScript dromaeo-3d-cube 26,540.061 μs 88.2389 μs 3 5981.46 KB
Jint dromaeo-core-eval 4,642.193 μs 22.5442 μs 3 357.3 KB
Jint_ParsedScript dromaeo-core-eval 4,494.251 μs 7.5824 μs 2 339.72 KB
Jint dromaeo-object-array 52,620.869 μs 1,219.8760 μs 1 100411 KB
Jint_ParsedScript dromaeo-object-array 53,944.009 μs 1,143.7667 μs 2 100368.6 KB
Jint droma(...)egexp [21] 233,301.670 μs 3,384.7220 μs 2 164630.45 KB
Jint_ParsedScript droma(...)egexp [21] 225,059.003 μs 2,321.0998 μs 1 165143.63 KB
Jint droma(...)tring [21] 403,190.871 μs 5,494.1274 μs 1 1321527.96 KB
Jint_ParsedScript droma(...)tring [21] 397,381.593 μs 5,916.8257 μs 1 1321348.63 KB
Jint droma(...)ase64 [21] 59,188.533 μs 142.2596 μs 2 6119.13 KB
Jint_ParsedScript droma(...)ase64 [21] 59,294.962 μs 126.4828 μs 2 6036.86 KB
Jint evaluation 32.682 μs 0.0295 μs 2 38.71 KB
Jint_ParsedScript evaluation 14.002 μs 0.0705 μs 1 29.95 KB
Jint linq-js 2,405.574 μs 2.3482 μs 3 1327.49 KB
Jint_ParsedScript linq-js 135.846 μs 2.2389 μs 1 222.54 KB
Jint minimal 4.376 μs 0.0120 μs 2 13.45 KB
Jint_ParsedScript minimal 3.029 μs 0.0046 μs 1 12.09 KB
Jint stopwatch 382,592.787 μs 3,267.6964 μs 4 53041.01 KB
Jint_ParsedScript stopwatch 377,164.633 μs 1,199.8201 μs 4 53015.24 KB

@adams85 adams85 marked this pull request as draft March 29, 2024 12:02
@adams85 adams85 force-pushed the acornima-poc branch 4 times, most recently from 799eb8a to 1639545 Compare March 29, 2024 22:14
Copy link
Owner

@sebastienros sebastienros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes are surprisingly small, this is great!

@@ -4,9 +4,9 @@
<CentralPackageTransitivePinningEnabled>false</CentralPackageTransitivePinningEnabled>
</PropertyGroup>
<ItemGroup>
<PackageVersion Include="Acornima" Version="0.9.1" />
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahah! nobody will install Jint now because "it's not a major version, blablabla" ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've got a point there, probably we can't expect them to read the release notes. 😄 Luckily, though, this can be easily solved. I plan to reach the v1.0 milestone in the upcoming weeks.

@@ -1,4 +1,4 @@
using Esprima;
using Acornima;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would have been easier to just rename Acornina to Esprima ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely, just the name wouldn't really reflect the content then... 😄

@@ -22,12 +28,13 @@ public partial class Engine
public static Script PrepareScript(string script, string? source = null, bool strict = false)
{
var astAnalyzer = new AstAnalyzer(new ScriptPreparationOptions());
var options = ParserOptions.Default with
var options = s_defaultParserOptions with
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this name so ugly? Or do you fancy this style?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it came from Jint's .editorconfig:

# Static fields are camelCase and start with s_

From what I see when looking around the codebase, this rule is not really followed. :) So, I'll change it to be consistent with the rest.

@ejsmith
Copy link
Contributor

ejsmith commented Mar 30, 2024

Would this give Jint better error messages with line and column information?

@adams85
Copy link
Contributor Author

adams85 commented Mar 31, 2024

The current parser, Esprima already provides this information (at least for syntax errors). So, if it's not reported, then it's because of Jint...

As for the new parser, I paid careful attention to error reporting. It pretty closely replicates V8's error reporting behavior, plus it detects a lot of syntax errors which Esprima is currently unable to, so it would be nice if Jint were propagate those errors properly. I may look into this in a future PR - if this one gets accepted, of course. :)

@lahma
Copy link
Collaborator

lahma commented Mar 31, 2024

Awesome work @adams85 as always! We are very interested in this option and would offer better long time support as acorn-based solution is better maintained w.r.t. features. Another question indeed is when we could make the switch, probably would require v4 and v3 only took seven years to "complete" 😉

@lahma
Copy link
Collaborator

lahma commented Mar 31, 2024

Would this give Jint better error messages with line and column information?

Jint has extensive test cases for error reporting on line/column level. There are still a lot of type errors thrown with lacking exception messages, but we take pull requests from people who need them to be improved.

@lahma

This comment was marked as resolved.

@adams85
Copy link
Contributor Author

adams85 commented Mar 31, 2024

Awesome work @adams85 as always! We are very interested in this option and would offer better long time support as acorn-based solution is better maintained w.r.t. features.

❤️

Another question indeed is when we could make the switch, probably would require v4 and v3 only took seven years to "complete" 😉

Even seven years sounds better than never... 😄 It would be nice to merge this PR, so V4 features can be created based on this, but it's nearly irrelevant to me when it gets released. So, absolutely no hurry. I'd just like to avoid constantly rebasing this one...

So it would definitely be nice to find a clever solution to reduce friction with other open PRs. Two options come to my mind:

  1. We create a v4 branch and merge this one into that and develop v4 features on top of that. Maybe retarget some open PRs which we want to release in v4.
  2. Do some implicit using / using alias magic to reduce friction and merge this one into master.

Option 1 (or some combination of the two) seems more flexible to me. But this is up to you.

Just a though, what if there was a pre-PR that would introduce global usings for Esprima and it would then make this one smaller and maybe less against v3 development before merge?

AFAICS, this is done deal :D I like the idea very much. Probably we can improve this even further.

What about (temporary) keeping the original names (Nodes, Location, etc.) with a using alias instead of changing them? This way we wouldn't introduce a lot of conflicts to other PRs.

Also, we can add further using aliases:

  • using JavaScriptParser = Acornima.Parser
  • using TokenType = Acornima.TokenKind
  • using BindingPattern = Acornima.Ast.DestructuringPattern
  • using ParserException = Acornima.ParseErrorException

@lahma

This comment was marked as resolved.

@lahma

This comment was marked as resolved.

@adams85 adams85 force-pushed the acornima-poc branch 4 times, most recently from 508bac9 to e265dcc Compare March 31, 2024 23:36
@adams85
Copy link
Contributor Author

adams85 commented Apr 1, 2024

Phew, this was no fun but finally I managed to rebase the PR. It's much much cleaner this way, no doubt about it. So it was worth it in the end :)

However, while doing this, I spotted something pretty problematic.

Acornima's tolerant parsing mode differs from Esprima's, plus the latter has tolerant parsing turned on while the former has it off by default. Jint used the defaults of Esprima, so I enabled tolerant mode for internal parsing (which I don't think is a good default - but that's another story). The main problem here is that obviously I can't do anything about pre-parsed scripts - which is a pretty sneaky BC...

On top of this, things are really starting to fall apart when Jint needs to parse some code dynamically (e.g. some eval code). At that point we don't have the information whether the containing script was parsed in tolerant or non-tolerant mode. So far Jint's just used the default parser settings. Because of the differences in the tolerant mode behavior I could make tests pass only if I disabled tolerant mode. So, it kinda works but meh...

This is, of course, not something Acornima-specific. The current Esprima version has this problem as well. (For example, end user passes a pre-parsed script to Jint which has been parsed in non-tolerant mode, but Jint will parse dynamic parts in tolerant mode...) With Acornima this issue gets even worse, as it has support for e.g. setting the language version compatibility (EcmaVersion) and fine tuning which experimental features are enabled (ExperimentalESFeatures).

So the current situation is pretty messy. It could be improved easily if Jint handled the parsing internally in all circumstances. However, since it allows end user to pass pre-parsed scripts, this can't be solved within Jint.

The only correct solution to this issue I can think of (and doesn't involve big BCs) is to modify the parser to include the ParserOptions used for parsing in the returned AST (or some struct return value). This way, Jint would know what parser options (tolerant mode, language version, etc.) to use to stay consistent when it needs to parse eval code, import a module, etc.

What do you think? It wouldn't be a big deal to add this to Acornima, I just doesn't feel like re-introducing SyntaxElement... :) I'd be more ok with something like ParseResult ParseScript(string input, string? sourceFile = null, bool strict = false), where ParseResult is a readonly struct which includes the root node of the AST and the ParserOptions used for parsing. Then, Engine.Evaluate would expect this ParseResult type instead of Script.

@adams85
Copy link
Contributor Author

adams85 commented Apr 1, 2024

After familiarizing myself a bit more with Jint's public API, I may see another way that doesn't involve changes to the parser:

The root cause of the problem is that end users are allowed to pass pre-parsed code to Jint via Evaluate(Script)/Execute(Script)/AddModule(Module), which means that Jint can't make any assumptions on what parser settings were used to produce those ASTs.

However, from what I understood, the way of using pre-parsed scripts with Jint is to call PrepareScript/PrepareModule, then to feed the return value to the methods mentioned above.

So, if my understanding is correct, we could get away with much smaller changes:

+ public readonly record struct PrepareResult<T>(T Program, ParserOptions ParserOptions) where T : Program;

- public static Script PrepareScript(string script, string? source = null, bool strict = false)
+ public static PrepareResult<Script> PrepareScript(string script, string? source = null, bool strict = false, bool tolerant = true)

- public JsValue Evaluate(Script script)
+ public JsValue Evaluate(PrepareResult<Script> script)

etc.

This way Jint could retain full control over what parser settings are used for parsing the code to execute and it can store the settings for later use when dynamic code, code of referenced modules, etc. need to be parsed.

Can this work in your opinion?

@lahma
Copy link
Collaborator

lahma commented Apr 1, 2024

This was exactly my initial thought that we could enforce input to be something that PrepareScript produces 👍🏻

Traveling today so mostly short replies and tinkering as background process 🙂

@adams85
Copy link
Contributor Author

adams85 commented Apr 1, 2024

Okay, that's enough for now, I can start working with this.

Just one more question: how about changing tolerant mode to off by default? I can only speak for myself, but I'd be surprised if I fed some script to Jint which is invalid per spec and get some results instead of SyntaxError. I mean I think relaxing the rules should be a deliberate choice, not the default. Or am I missing here something that justifies the opposite?

Of course, this would be another potentially impactful BC, but we'll have a bunch apart from this anyway...

@lahma
Copy link
Collaborator

lahma commented Apr 1, 2024

Probably would be OK to make rules stricter. Maybe return outside of functions is something needed for REPL and environments where people want script like if they were inside function context.

@adams85
Copy link
Contributor Author

adams85 commented Apr 1, 2024

Maybe return outside of functions is something needed for REPL and environments where people want script like if they were inside function context.

No problem with that as it's controlled by another switch than tolerant mode.

@adams85
Copy link
Contributor Author

adams85 commented Apr 3, 2024

Had to dive much deeper into Jint's internals than I had hoped, but I think finally I was able to clean up the issues around parser options and dynamic code parsing.

But there are still a few questions I'm unsure about:

  1. It seems I ran into a bug: I wrote a test where a shadow realm executed a nested eval and Jint crashed with NullReferenceException at this line. I could make it work with this change but I'm not sure at all that it's correct. Could you confirm it's ok?
  2. There are settings in ParserOptions which should not be tweaked by the end user, just by Jint, internally. Without a massive BC, I could come up with this solution. It works, however I'm not sure whether to silently ignore the nonsensical settings (like it's done now) or to throw an exception instead. What do you think?

One more thing worth mentioning: I included this workaround in the regexp parser, so it's not needed here any more.

@lahma
Copy link
Collaborator

lahma commented Apr 3, 2024

Had to dive much deeper into Jint's internals than I had hoped, but I think finally I was able to clean up the issues around parser options and dynamic code parsing.

It's always good to get a new fresh pair of eyes to look at years of hacks and horror 🙂

  1. It seems I ran into a bug: I wrote a test where a shadow realm executed a nested eval and Jint crashed with NullReferenceException at this line. I could make it work with this change but I'm not sure at all that it's correct. Could you confirm it's ok?

Well you're plain evil. This is the thing with JS, there's so many ways to get things in unusable state, great work on spotting this! This whole evaluation context going missing has bitten so many times already that I'm already questioning to whole idea of having it. I think the fix is fine and the regular band-aid that is being used elsewhere too (other places don't have try-finally IIRC).

  1. There are settings in ParserOptions which should not be tweaked by the end user, just by Jint, internally. Without a massive BC, I could come up with this solution. It works, however I'm not sure whether to silently ignore the nonsensical settings (like it's done now) or to throw an exception instead. What do you think?

Just to avoid answering the actual question... Should Jint actually take JintParserOptions which would have only couple sensical properties which would then be transferred to the implementation dependent richer configuration? By default I think Jint should have always all the bells and whistles enabled for parsing and trying to cope with the AST. Both Esprima and Acornima settings objects might be too rich for end user. It could be very small subset of switches which could be expanded when/if there's a need. We could already break this with 3.1 or something and narrow the configuration. I don't see a mention about semantic versioning in the readme 😉

One more thing worth mentioning: I included this workaround in the regexp parser, so it's not needed here any more.

Ah so basically a flag for Jint whether is can even try the compilation. I see that Acornima limits to < 7, but I think it's safe to use again with > 8 too. Jin does the aggressive compilation even if there's no JS regex flag for compiled like you probably know.

@adams85
Copy link
Contributor Author

adams85 commented Apr 3, 2024

Well you're plain evil. This is the thing with JS, there's so many ways to get things in unusable state, great work on spotting this!

Yep, JS is the language of surprises, so minor glitches like this are indeed hard to spot. They align so smoothly with the JS philosophy :D

I think the fix is fine and the regular band-aid that is being used elsewhere too (other places don't have try-finally IIRC).

The reason why I asked is that in other cases I've only seen this pattern:

var ownsContext = _engine._activeEvaluationContext is null;
_engine. _activeEvaluationContext ??= new EvaluationContext(_engine);
try
{
    /* ... */
}
finally
{
    if (ownsContext)
    {
        _engine._activeEvaluationContext = null!;
    }
}

This is a bit different from what I did because it kind of "joins" the evaluation context if it's present already. So, don't we actually need this also in the case in question? (Unfortunately, I can't really tell because of my limited understanding of the concept.)

Should Jint actually take JintParserOptions which would have only couple sensical properties which would then be transferred to the implementation dependent richer configuration?

Yeah, this is the other solution that I referred to as a "massive BC". At the same time this is the clean solution, so I'd prefer it too if BCs are not a concern. However, I think we should introduce this change in a subsequent PR because this one is starting to lose focus... I'll be happy to do the change after merging this.

I see that Acornima limits to < 7, but I think it's safe to use again with > 8 too. Jin does the aggressive compilation even if there's no JS regex flag for compiled like you probably know.

Indeed, I missed that the .NET runtime folks fixed this here. So we can assume that the fix will be included in .NET 9. I'll update the regexp parser accordingly.

@adams85
Copy link
Contributor Author

adams85 commented Apr 9, 2024

As far as I'm concerned, the PR is complete. Also, v1.0.0 of the parser has been released. So what do you think about merging this into some v4 branch so further development can be done in that? (Probably other ongoing PRs which are to be released in v4 should be rebased on that branch too.)

For example, I plan to address some minor todos and issues and I'll also want to look into error reporting because syntax errors reported by the parser are not wrapped in SyntaxError JS objects in some cases (e.g. here). However, I think it would be better to put those changes into seperate PRs.

@adams85 adams85 marked this pull request as ready for review April 9, 2024 23:21
@adams85 adams85 changed the title PoC for Acorn-based parsing Acorn-based parsing Apr 9, 2024
@lahma
Copy link
Collaborator

lahma commented Apr 10, 2024

My attack would probably be as follows

  • rename EsprimaExtensions to AstExtensions (rarely used by consumer code), this branch should rebase cleanly just by ignoring it and overwriting with PR's changes - parity with upcoming changes, we already are breaking things
  • bump version to 3.1
  • release 3.1
  • branch main to 3.x, maintain fixes if needed there
  • merge this to main and make main a Duke Nukem forever project again
  • profit

@adams85
Copy link
Contributor Author

adams85 commented Apr 10, 2024

That can certainly work as well! :)

rename EsprimaExtensions to AstExtensions (rarely used by consumer code), this branch should rebase cleanly just by ignoring it and overwriting with PR's changes - parity with upcoming changes, we already are breaking things

I'll do this soon, then leave the rest up to you. ;)

@adams85
Copy link
Contributor Author

adams85 commented Apr 10, 2024

Latest benchmark result with main branch and this PR updated to Acornima v1.0.0:

Before (Esprima)

Method FileName Mean StdDev Rank Allocated
Jint array-stress 7,575.246 μs 14.4506 μs 1 7016.28 KB
Jint_ParsedScript array-stress 8,020.207 μs 118.9083 μs 2 6982.47 KB
Jint dromaeo-3d-cube 28,279.776 μs 57.0615 μs 1 6240.53 KB
Jint_ParsedScript dromaeo-3d-cube 28,020.723 μs 46.3408 μs 1 5885.77 KB
Jint dromaeo-core-eval 4,584.428 μs 11.0669 μs 1 355.94 KB
Jint_ParsedScript dromaeo-core-eval 4,769.521 μs 10.6490 μs 2 333.21 KB
Jint dromaeo-object-array 50,683.045 μs 413.7656 μs 1 100409.92 KB
Jint_ParsedScript dromaeo-object-array 52,291.036 μs 316.6022 μs 2 100356.51 KB
Jint droma(...)egexp [21] 225,931.847 μs 2,904.9614 μs 2 165651.12 KB
Jint_ParsedScript droma(...)egexp [21] 174,394.089 μs 2,339.3612 μs 1 167267.43 KB
Jint droma(...)tring [21] 399,448.584 μs 8,581.2445 μs 1 1321367.52 KB
Jint_ParsedScript droma(...)tring [21] 398,350.529 μs 4,534.8665 μs 1 1321354.86 KB
Jint droma(...)ase64 [21] 59,212.169 μs 308.0397 μs 1 6123.97 KB
Jint_ParsedScript droma(...)ase64 [21] 60,402.459 μs 177.9859 μs 2 6012.02 KB
Jint evaluation 35.746 μs 0.0466 μs 2 39.39 KB
Jint_ParsedScript evaluation 12.476 μs 0.4245 μs 1 28.01 KB
Jint linq-js 2,561.289 μs 16.2517 μs 2 1272.12 KB
Jint_ParsedScript linq-js 126.997 μs 1.3896 μs 1 192.3 KB
Jint minimal 5.361 μs 0.1053 μs 2 14.66 KB
Jint_ParsedScript minimal 2.475 μs 0.0021 μs 1 12.59 KB
Jint stopwatch 378,419.050 μs 1,603.6123 μs 1 53040.73 KB
Jint_ParsedScript stopwatch 390,796.180 μs 1,204.0319 μs 2 53005.7 KB

** After (Acornima) **

Method FileName Mean StdDev Rank Allocated
Jint array-stress 7,377.878 μs 38.4145 μs 1 7015.58 KB
Jint_ParsedScript array-stress 8,453.743 μs 123.7822 μs 2 6981.33 KB
Jint dromaeo-3d-cube 28,350.037 μs 24.6356 μs 2 6233.65 KB
Jint_ParsedScript dromaeo-3d-cube 27,428.668 μs 47.7358 μs 1 5884.62 KB
Jint dromaeo-core-eval 4,686.884 μs 6.6578 μs 2 355.73 KB
Jint_ParsedScript dromaeo-core-eval 4,544.569 μs 12.6899 μs 1 332.57 KB
Jint dromaeo-object-array 50,250.130 μs 391.0685 μs 1 100409.84 KB
Jint_ParsedScript dromaeo-object-array 62,619.985 μs 321.5383 μs 2 100355.15 KB
Jint droma(...)egexp [21] 227,611.223 μs 3,594.9438 μs 2 168624.43 KB
Jint_ParsedScript droma(...)egexp [21] 177,574.193 μs 2,830.8412 μs 1 165976.58 KB
Jint droma(...)tring [21] 400,274.943 μs 9,084.8877 μs 1 1321360.98 KB
Jint_ParsedScript droma(...)tring [21] 404,169.764 μs 6,067.2517 μs 1 1321322.06 KB
Jint droma(...)ase64 [21] 57,840.141 μs 335.3168 μs 1 6119.98 KB
Jint_ParsedScript droma(...)ase64 [21] 59,283.333 μs 217.2524 μs 2 6010.57 KB
Jint evaluation 32.130 μs 0.0499 μs 2 38.45 KB
Jint_ParsedScript evaluation 10.904 μs 0.0861 μs 1 26.86 KB
Jint linq-js 2,352.298 μs 4.5143 μs 2 1307.63 KB
Jint_ParsedScript linq-js 121.317 μs 1.7267 μs 1 191.15 KB
Jint minimal 4.339 μs 0.0079 μs 2 13.49 KB
Jint_ParsedScript minimal 2.328 μs 0.0037 μs 1 11.45 KB
Jint stopwatch 381,197.513 μs 2,802.2481 μs 1 53040.95 KB
Jint_ParsedScript stopwatch 394,633.258 μs 1,135.1463 μs 2 53004.55 KB

@adams85
Copy link
Contributor Author

adams85 commented Apr 10, 2024

I made a few other runs and the regression in the case of Jint_ParsedScript + droma(...)ase64 [21] looks like just the inaccuracy of the benchmarks:

Method FileName Mean StdDev Rank Allocated
Jint dromaeo-object-array 49.78 ms 0.179 ms 1 98.06 MB
Jint_ParsedScript dromaeo-object-array 51.20 ms 0.357 ms 2 98 MB
Method FileName Mean StdDev Rank Allocated
Jint dromaeo-object-array 50.25 ms 0.204 ms 1 98.06 MB
Jint_ParsedScript dromaeo-object-array 51.72 ms 0.392 ms 2 98 MB

Probably there's some GC activity behind the differences...

@adams85
Copy link
Contributor Author

adams85 commented Apr 10, 2024

Compared to where we started from, the perf. characteristics virtually haven't changed. Looks like the new parser may bring a tiny perf. improvement in cases where parsing the input takes up a relatively larger part of the execution.

@lahma
Copy link
Collaborator

lahma commented Apr 11, 2024

Compared to where we started from, the perf. characteristics virtually haven't changed. Looks like the new parser may bring a tiny perf. improvement in cases where parsing the input takes up a relatively larger part of the execution.

This is quite expected. While Acornima is faster, the majority of time in Jint's use case is always spent in actual script execution/interpretation (performance PRs welcome! 😉). The major improvement is to have a clear(er) path to new ECMAScript features with robust parser infrastructure.

Version 3.1 is being out for a moment and some have already auto-updated like docfx without problems. I've tested against other code bases and only small tweaks needed for consumers like RavenDB. We've broken advanced APIs mostly so people don't notice..

Thanks for championing this!

@lahma lahma merged commit 6d866a2 into sebastienros:main Apr 11, 2024
3 checks passed
@sebastienros
Copy link
Owner

😢

@lahma
Copy link
Collaborator

lahma commented Apr 11, 2024

😢

First tears of joy given!

@sebastienros
Copy link
Owner

I have memories of the summer I spent migrating the Esprima codebase to dotnet.
Now the question is when to flag the repository as not maintained, because I assume nobody would, and suggest people to also move to Acornima, which will also help driving support.

@lahma
Copy link
Collaborator

lahma commented Apr 11, 2024

I have memories of the summer I spent migrating the Esprima codebase to dotnet.

I also have memories of lost summer vacations building/improving these things. "Unfortunately" it seems that Acorn won the battle maintenance-wise and is a much safer bet. Maybe we can be happy about the things we learned along the way, whether it's spec, performance or other things 🙂

Now the question is when to flag the repository as not maintained, because I assume nobody would, and suggest people to also move to Acornima, which will also help driving support.

It's an option. Esprima is a good solution which can handle many things - it's been great for Jint, it's just harder to get new features and compliance with newer features without an upstream to follow.

@adams85
Copy link
Contributor Author

adams85 commented Apr 11, 2024

No reason to be sad, we can now waste many more summer vacations on keeping Acornima up to date :D Every Esprima maintainers are invited. @lahma is already a collaborator ;) If anyone else wants to join, just drop me a message!

Besides that, a big part of Esprima will live on in Acornima. Many of the awesome performance tweaks have been applied to translated Acorn implementation. A lot of add-on features, source gen, tests also come from Esprima. So best to think about Acornima as the love child of two great parents :D

@lahma
Copy link
Collaborator

lahma commented Apr 11, 2024

Nicely summed up, thank you!

As this has been merged, main is again "API unstable", even though I don't see much required changed there (I can also be blind at this point looking at the APIs too long). Breaking changes mean It's just that the new version will require 4.0 stamp due to dependency change.

@adams85
Copy link
Contributor Author

adams85 commented Apr 11, 2024

FWIW, I put together a list of the most impactful breaking changes: https://github.com/adams85/acornima?tab=readme-ov-file#migration-from-esprimanet

Most surprises are to be expected when users are feeding ASTs which were built manually, not by the parser. Not sure how typical a use case this is, just saying that some extra attention is required in such cases.

Apart from that, parsing has become stricter because it's become more compliant with the spec, plus because tolerant mode has been changed to off by default. (Acornima's tolerant mode is different and not as permissive as Esprima's, so it wouldn't have made much sense to keep that default. (Users who relied on tolerant mode would mostly be broken anyway. Top level returns in scripts are still allowed. That behavior hasn't been changed.)

Finally, there are some changes to be aware of in error reporting too. For one, the format of syntax errors has changed significantly. (There are some issues around converting parser syntax errors to Jint SyntaxError objects. Which might be not a new issue, but I plan to look into it because it's not very nice this way...)

@lahma
Copy link
Collaborator

lahma commented Apr 11, 2024

Because adding notes to a closed PR is definitely a best practice, some things:

  • docfx: ParserException missing -> 25>PreprocessorWithResourcePool.cs(5,7): Error CS0246 : The type or namespace name 'Esprima' could not be found (are you missing a using directive or an assembly reference?)
  • EventStore: Assert.IsInstanceOf<Esprima.ParserException>(ex);
  • JavaScriptEngineSwitcher: probably most about changed namespaces/names
  • RavenDB, some manual creation of AST, requires changes to instantiated types

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants