Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from Terser to escodegen #12410

Open
RReverser opened this issue Oct 2, 2020 · 20 comments
Open

Switch from Terser to escodegen #12410

RReverser opened this issue Oct 2, 2020 · 20 comments

Comments

@RReverser
Copy link
Collaborator

Raising an issue for tracking: such switch should simplify and speed up code generation, and allow us to avoid custom patches we have in Terser.

Background (#11303 (comment)):

@kripken: We don't use terser for minification. In tools/acorn_optimizer.js it's used to convert an acorn AST to JS (so it's in support of the passes in that file, which are almost all not minification-related). Maybe there's a better way or simpler tool for that? I remember I couldn't find one at the time.

@RReverser: Ah, that's what I thought after looking at the code. Yeah, there is estools/escodegen which, among other ESTree tools in that org, was created specifically as a simple AST-to-JS generator. It should be both simpler and much faster than going via all Terser's conversions.

@RReverser
Copy link
Collaborator Author

Semi-relevant: I've noticed we also have uglify-js in third_party, which is an older version of terser. I don't suppose we need both of these either?

uglify-js seems to be used only in tools/js-optimizer.js, which, in turn, doesn't seem to be used anywhere aside from one test? Am I missing external usages or can it and uglify-js be both removed? cc @sbc100 @kripken

@kripken
Copy link
Member

kripken commented Oct 2, 2020

Would be good to experiment with!

When I implemented the initial acorn integration I wrote this:

For outputting I experimented with astring, which is small and nice, and escodegen, which looks very robust, but neither could output compact-enough JS to not regress our JS code sizes.

Perhaps things have changed since then, worth checking.

@kripken
Copy link
Member

kripken commented Oct 2, 2020

Semi-relevant: I've noticed we also have uglify-js in third_party

tools/js-optimizer.js has a bunch of optimization passes that still use uglify. Hopefully they can get rewritten for acorn eventually.

@RReverser
Copy link
Collaborator Author

For outputting I experimented with astring, which is small and nice, and escodegen, which looks very robust, but neither could output compact-enough JS to not regress our JS code sizes.

Perhaps things have changed since then, worth checking.

escodegen has options for formatting - you can either output pretty-printed code, or a compact one.

Of course, it will never match Terser's minified output, but IIUC from your comment the original issue, that's a non-goal anyway, since we use Closure Compiler for proper minification.

@RReverser
Copy link
Collaborator Author

RReverser commented Oct 2, 2020

Semi-relevant: I've noticed we also have uglify-js in third_party

tools/js-optimizer.js has a bunch of optimization passes that still use uglify. Hopefully they can get rewritten for acorn eventually.

Can it be switched to Terser at least? They should be API-compatible (Terser is strict successor), and, for now, it would at least get rid of one old-ish dependency. (UPD: looking at versions, it seems to be UglifyJS@1.x, which actually is even older, and maybe not as compatible as I thought...)

@sbc100
Copy link
Collaborator

sbc100 commented Oct 2, 2020

I'm a big fan of any kind of cleanups / simplifications that can be done in this area!

@sbc100
Copy link
Collaborator

sbc100 commented Oct 2, 2020

Specifically if we can delete another checked in module from third_party I would be very happy.

@RReverser
Copy link
Collaborator Author

escodegen has options for formatting - you can either output pretty-printed code, or a compact one.

I actually just noticed that we never had them documented on the wiki page apparently. That's odd, but just added for visibility: https://github.com/estools/escodegen/wiki/API#optionformat

@RReverser
Copy link
Collaborator Author

tools/js-optimizer.js has a bunch of optimization passes that still use uglify. Hopefully they can get rewritten for acorn eventually.

Okay, but where is it called from? I can't find a single place that would use it aside from one test...

@kripken
Copy link
Member

kripken commented Oct 2, 2020

Of course, it will never match Terser's minified output, but IIUC from your comment the original issue, that's a non-goal anyway,

Well, there is more here... See the full context in the link there, I just quoted one sentence, which was maybe not clear enough, sorry. To summarize the context too, we do expect closure to be used when code size is absolutely critical. But we don't run closure by default, due to speed and also that it may break custom code a user adds that isn't closure-safe. And the breakage issue means that some users can't use closure at all. So while we don't work to squeeze every last byte out of non-closure builds, we do want them to be tiny. So a significant regression would not be good, I think. That's why I ended up adding terser - it was easy, and it made a big difference at the time.

(UPD: looking at versions, it seems to be UglifyJS@1.x, which actually is even older, and maybe not as compatible as I thought...)

Correct, that code uses Uglify1, which is a different API. Those passes would be needed to be rewritten to use a modern AST. But after removing fastcomp we need a lot fewer of them...

@RReverser
Copy link
Collaborator Author

RReverser commented Oct 2, 2020

And the breakage issue means that some users can't use closure at all. So while we don't work to squeeze every last byte out of non-closure builds, we do want them to be tiny. So a significant regression would not be good, I think. That's why I ended up adding terser - it was easy, and it made a big difference at the time.

Makes sense. I guess worth experimenting with FORMAT_MINIFY quoted above and comparing final JS+gzip numbers on some real codebase to see how much we'd lose by leaving Terser.

As for the Closure breaking the code - it should be happening only under advanced optimisations, right? In a "simple" mode it does pretty much the same things as Terser does, so perhaps we could just switch between them.

@kripken
Copy link
Member

kripken commented Oct 2, 2020

Okay, but where is it called from?

Hmm... tools/js_optimizer.py calls tools/js-optimizer.js to run passes. And the python should be called from emcc.py. But it looks like we removed it... which was 99% correct, but I think we removed too much @sbc100 ! We should be calling it to run safeHeap, for example. That means we've lost SAFE_HEAP coverage on custom user JS code 😭

I'll look into fixing that!

@RReverser
Copy link
Collaborator Author

Heh, glad we found it I guess 😅

@sbc100
Copy link
Collaborator

sbc100 commented Oct 2, 2020

Okay, but where is it called from?

Hmm... tools/js_optimizer.py calls tools/js-optimizer.js to run passes. And the python should be called from emcc.py. But it looks like we removed it... which was 99% correct, but I think we removed too much @sbc100 ! We should be calling it to run safeHeap, for example. That means we've lost SAFE_HEAP coverage on custom user JS code

I'll look into fixing that!

But how was it possible we removed too much? Do we lack test coverage?

@kripken
Copy link
Member

kripken commented Oct 2, 2020

But how was it possible we removed too much? Do we lack test coverage?

I'm still looking, but I'm pretty sure the wasm backend path never applied safe heap on user code, and yes, we lacked a test so this was missed...

@kripken
Copy link
Member

kripken commented Oct 2, 2020

Oh, and the other place we use js_optimizer.py, which exists and is correct, is for wasm2js, in building.py (def js_optimizer etc.).

kripken added a commit that referenced this issue Oct 5, 2020
This removes some TODOs from wasm2js where we thought about
running more of the old passes. I think that's not a good idea - they
will not run as our wasm2js output gets less and less like asm.js
anyhow. We'll need to write new acorn passes there, if we want to.

This removes safeHeap which we should be running (see #12410 (comment)),
but as it would be the only thing we use the js optimizer for from emcc.py,
and that isn't a recent regression, remove that too and I'll rewrite it for
acorn.

See #11860
@jonassmedegaard
Copy link

jonassmedegaard commented Dec 6, 2020

Terser itself switched away from ESCodegen to astring - perhaps relevant to cinsider astring here as well?

@sbc100
Copy link
Collaborator

sbc100 commented Jan 26, 2021

Does anyone feel like working on this?

@stale
Copy link

stale bot commented Apr 16, 2022

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.

@stale stale bot added the wontfix label Apr 16, 2022
@RReverser
Copy link
Collaborator Author

RReverser commented Oct 22, 2023

Terser itself switched away from ESCodegen to astring - perhaps relevant to cinsider astring here as well?

Note that they switched it only in test utils, not as an implementation.

astring could be an interesting alternative but can't be used until davidbonnet/astring#203 is resolved as Emscripten relies on at least whitespace minification.

@stale stale bot removed the wontfix label Oct 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants