Skip to content

Conversation

@juj
Copy link
Collaborator

@juj juj commented Dec 8, 2019

Automatically run npm install in Emscripten root directory after installing an Emscripten tool.

@kripken
Copy link
Member

kripken commented Dec 8, 2019

Is this for all normal emsdk usage (install latest, activate latest) or just for people building from source?

@juj
Copy link
Collaborator Author

juj commented Dec 8, 2019

This is for all targets. Whenever emscripten is pulled down, the post-build install builds optimizer and then does this npm install.

@kripken
Copy link
Member

kripken commented Dec 9, 2019

Ah, that worries me, then. But I see this discussion is spread over many issues and PRs - where is the best place to get an overview of your total plan here, and to discuss it more in detail?

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this is the direction am hoping that we can move in.

We do need to support the new "fat-packages" produced the emscripten-releases (e.g. sdk, latest, upstream, etc) build as well though. In that case emscripten is in a sub-directory of the tool since its all just once big tool (I'm actually kind of sad we lost granularity thee but that's a separate issue).

@juj
Copy link
Collaborator Author

juj commented Dec 10, 2019

But I see this discussion is spread over many issues and PRs - where is the best place to get an overview of your total plan here, and to discuss it more in detail?

To summarize:

  • I want to move to include non-Java version of closure to tree so that we don't need to depend on Java to get closure ability at Unity.
  • in Is there a way to create minified bundle/amalgamation of JS closure compiler? google/closure-compiler#3515 I asked for an easy to check in version of Closure as a single file bundle, but that does not exist - npm is the only mechanism they support.
  • first I authored a PR Closure js emscripten#9962 that would npm install Closure on demand, but that is awkward, as downloading files at the time of building a project generates a silly "after first install you still need to install when you build" scheme. When SDKs like Unity ship Emscripten, we need to ensure that after we bundle Emscripten up, there will be no modifications to it at runtime when user builds Unity projects. So installation needs to be up front, preferably done with a clear 'install' command instead of needing to install+build dummy projects.
  • then I authored PRs Migrate Closure to be shipped via emsdk or manually installed via npm. emscripten#9986 and Add Closure compiler to emsdk. #403 that would make closure-compiler-20191111 a emsdk-shipped Tool that one would install via emsdk like one installs python/java/mingw/node. But perhaps simpler to avoid building abstractions and CDN burden for this in emsdk, after all npm can already do the same job for this.
  • then I authored this PR and PR Migrate Closure to be located via npm install emscripten#9989 to migrate Closure to be provided via running npm install. If one emsdk installs Emscripten, a post install step here will do the npm install step for the user, so they will not need any additional steps.
  • in the same fashion, I authored PR html-minifier-terser emscripten#9990 to migrate to using html-minifier-terser via npm instead of updating html-minifier in tree, to fix a problem with html-minifier not supporting minifying JavaScript.

@kripken
Copy link
Member

kripken commented Dec 10, 2019

Thanks for the writeup @juj!

I have some worries about using npm install in this way. Mostly it feels like adding a large dependency with some new risks. Maybe I am not seeing the solutions, though.

  • About closure only supporting npm install - sure, I get that they don't officially support anything else. But the output of npm install is some JS files that are portable, so we can just copy those, can't we, as an alternative to this?
  • If emsdk runs npm install every time a new version is installed, what do we do if an error occurs (network, or some other npm issue)? How clear would it be for users to debug that, and for us to support them when they file problems?
  • How good is npm support on all platforms? In particular, things like windows (all the different ways to run there), arm, etc.
  • Are there other non-JS projects that use npm install internally in this way? (Non-JS, because a JS workflow would have a developer that understands npm well presumably, and run it directly, and understand errors etc.)

@sbc100
Copy link
Collaborator

sbc100 commented Dec 10, 2019

I'll try to respond to your points in order:

  1. Zipping zip the "node_modules" folder is shipping it is not, IIUC, a recommend way to get node modules installed onto a target machine. That is exactly want npm install for. Checking-in or zipping-up a node_module folder is exactly what we are trying to move away from with this change.

  2. Yes, npm is subject to network errors, just like emsdk itself. But its error will most likely be better that ours since they have many more developers and users. This is a valid point that it does increase the complexity of the emsdk install phase, but this is a cost I think we should bare. npm is very well tested on thousands of machines every day.

  3. IIUC its very well supported.

  4. This is a valid concern. We are a little strange in this way I think, but to be fair we are not pushing the requirement onto our users that they understand npm, we are running it on their behalf during emsdk install. I can imagine there are many projects that at written in a different primary language that still use npm to pull in a JS tool. Imagine a website witten in python that wants to use a tool from npm.. I imagine that must happen, although I don't know of any of the top of my head.

@juj
Copy link
Collaborator Author

juj commented Dec 11, 2019

I think the big practical point here is not about if closure uses native vs java vs js, and it is not about whether we distribute via npm or emsdk, or if we are unconventional users of npm, but it is the fact that bundling these kind of tools in the Emscripten tree is extremely costly. This is because git clone clones the whole history, and these kinds of tools are large, so if closure and other tools like this are in the tree (html-minifier, uglify), whenever we update the version, git clone will still need to pull down copies of all versions we used to ever include. That is horrible and does not scale to the future.

(Currently git clone .git subdirectory of Emscripten is 296MB, whereas working tree is 138MB, so we are at a history:working tree ratio of about 2:1 at the moment. Each tool version accumulates that directly)

The only in-tree solution to that would be to use git-lfs, but these are not large files, but many files, so git-lfs would only work if we zipped up closure in the tree and used git-lfs. But that is super clumsy as well, and whenever you checked out a different emscripten git hash, you'd need to run a custom post checkout unzip command.

Hence I agree we should remove these tools from in-tree and move them to be downloaded via other means. At first I though it would be best to use emsdk hosting for that to keep the CDN dependency light, but after refactoring to use npm, I think that is more convenient for this, since their CDN is likely more stable, it is less work for us, not all Emscripten users like to use emsdk, and updating closure/html-minifier versions becomes super-easy (just edit depedencies.json and change the version number). I do think the npm install route is the best approach that we could take here.

@juj
Copy link
Collaborator Author

juj commented Dec 11, 2019

plus we will then get these kind of lovely stop-the-world messages from GitHub :)

Screen Shot 2019-12-11 at 9 45 51 AM

@kripken
Copy link
Member

kripken commented Dec 11, 2019

I'm not strongly opposed to npm. I do think this adds some risk in maintenance for emscripten developers, as we'll need to figure out npm issues for people that use the emsdk. But if people are in favor of this then let's consider it.

Talking with @sbc100 and @jgravelle-google offline, I think a good next step here might be to post to the mailing list. We should get wide feedback before such a change. In particular we should be sure to consider the docker image use cases etc. cc @trzecieu

One thing @jgravelle-google mentioned is that we should make sure we pin versions of all the things we use, and their dependencies as well. If we use X and it uses Y of version >= N, then we don't want npm to fetch newer versions at some point in time. (I don't know enough about npm to know if that's the default or not.)

@sbc100
Copy link
Collaborator

sbc100 commented Dec 11, 2019

@juj how would you feel about delaying this change until we have disussed it on the mailing list and perhaps running a experiment?

In the mean time maybe you should land the JS clusure compiler change by checking in direclyt into node_modules like we do for ws? Objviously its not a long term solution but maybe it can be short term one.

@juj
Copy link
Collaborator Author

juj commented Dec 11, 2019

we should make sure we pin versions of all the things we use

Yeah, considered the same. The PRs emscripten-core/emscripten#9989 and emscripten-core/emscripten#9990 are both authored in that form. It also changes websockets to be pinned in version, at this line: https://github.com/emscripten-core/emscripten/pull/9989/files#diff-b9cfc7f2cdf78a7f4b91a753d10865a2R4

The way pinning works is that the version number in package.json is specified without a ~ or a ^ prefix (or other prefixes or - range). The resulting package-lock.json file from npm install in the repo then pins down versions of the dependencies I believe.

@juj
Copy link
Collaborator Author

juj commented Dec 11, 2019

In the mean time maybe you should land the JS clusure compiler change by checking in direclyt into node_modules like we do for ws? Objviously its not a long term solution but maybe it can be short term one.

I'd rather not add a temp closure compiler copy in the repo, as I mentioned before, it is quite large at ~400 files and 57 MB (uncompressed), and it would bloat up the git repo needlessly. Testing out git commiting closure in the repository grows the .git subdirectory from 296MB to 323MB (+9.12% overall repository size). I'd rather not add a short-term temp solution commit that increased repo size that much.

@juj how would you feel about delaying this change until we have disussed it on the mailing list and perhaps running a experiment?

Sure, go ahead.

@trzecieu
Copy link
Collaborator

trzecieu commented Dec 11, 2019

In particular we should be sure to consider the docker image use cases etc. cc @trzecieu

This should not be a problem, as npm will be executed on image build phase. This might add some extra weight to the image itself ans node_package likes to store lots of treasures.

Once you have a test branch I'm happy to test solution and provide some data.

@juj
Copy link
Collaborator Author

juj commented Dec 11, 2019

grows the .git subdirectory from 296MB to 323MB (+9.12% overall repository size)

Or hmm actually, the 296MB included my own fork and a couple of other people's forks. A clean git clone of emscripten-core/emscripten.git is 213MB, so 213MB -> 240MB = +12.7% size increase.

@sbc100
Copy link
Collaborator

sbc100 commented Dec 11, 2019

I'd rather not add a temp closure compiler copy in the repo, as I mentioned before, it is quite large at ~400 files and 57 MB (uncompressed), and it would bloat up the git repo needlessly. Testing out git commiting closure in the repository grows the .git subdirectory from 296MB to 323MB (+9.12% overall repository size). I'd rather not add a short-term temp solution commit that increased repo size that much.

I don't understand, I just did npm install closure-compiler and I got:

du -sh node_modules/
9.9M	node_modules/

All of that looks like the jar file really:

du -sh node_modules/google-closure-compiler/*
6.1M	node_modules/google-closure-compiler/compiler.jar
3.7M	node_modules/google-closure-compiler/contrib
12K	node_modules/google-closure-compiler/COPYING
4.0K	node_modules/google-closure-compiler/package.json
16K	node_modules/google-closure-compiler/README.md

I must be missing something?

@juj
Copy link
Collaborator Author

juj commented Dec 12, 2019

Yeah, the target is npm install google-closure-compiler (https://www.npmjs.com/package/google-closure-compiler). closure-compiler is something old (https://www.npmjs.com/package/closure-compiler)

@sbc100
Copy link
Collaborator

sbc100 commented Dec 12, 2019

Bit looks like its only 56Mb:

$ du -sm node_modules/
56	node_modules/
$ du -ms node_modules/*
1	node_modules/ansi-styles
1	node_modules/chalk
1	node_modules/clone
1	node_modules/cloneable-readable
1	node_modules/clone-buffer
1	node_modules/clone-stats
1	node_modules/color-convert
1	node_modules/color-name
1	node_modules/core-util-is
1	node_modules/escape-string-regexp
7	node_modules/google-closure-compiler
11	node_modules/google-closure-compiler-java
4	node_modules/google-closure-compiler-js
34	node_modules/google-closure-compiler-linux
1	node_modules/has-flag
1	node_modules/inherits
1	node_modules/isarray
1	node_modules/minimist
1	node_modules/process-nextick-args
1	node_modules/readable-stream
1	node_modules/remove-trailing-separator
1	node_modules/replace-ext
1	node_modules/safe-buffer
1	node_modules/source-map
1	node_modules/string_decoder
1	node_modules/supports-color
1	node_modules/util-deprecate
1	node_modules/vinyl
1	node_modules/vinyl-sourcemaps-apply

And if you remove the native linux binary its only 23:

$ rm -rf node_modules/google-closure-compiler-linux/
$ du -sm node_modules/
23	node_modules/

So maybe it would be OK with checkin in the interim? Or you can just wait too if you prefer.

@juj juj force-pushed the npm_install_in_emscripten branch from 54fc96f to c9c4495 Compare December 17, 2019 18:01
Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want run with --production to avoid install any dev dependencies (currently just ws) on all users machines.. but its not huge deal.

The major blocker for this PR now is that I think it doesn't work for the "bundled" sdk such as "emsdk install latest". Here would probably need a separate function that does a similiar thing but in the "emscripten" subdirectory of a bundles SDK.

@juj juj force-pushed the npm_install_in_emscripten branch from dd0d012 to 64213b1 Compare December 20, 2019 10:11
@juj
Copy link
Collaborator Author

juj commented Dec 20, 2019

Addressed review, tests pass now.

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think we should land this and see what kind bug reports we get back.

Right now are landing this without actually depending on it yet. I propose that we don't actually start depending on the result of the npm install until we've got some feedback/results.

@kripken are you ok with this plan?

node_path = os.path.join(node_tool.installation_path(), 'bin')
npm = os.path.join(node_path, 'npm' + ('.cmd' if WINDOWS else ''))
env = os.environ.copy()
env["PATH"] = node_path + os.pathsep + env["PATH"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just out of interest, why is this needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Npm needs Node.js in PATH on Linux (on macOS and Windows it worked without).

@kripken
Copy link
Member

kripken commented Dec 20, 2019

Not sure I understand the plan - if we don't depend on it, how will we get bug reports? Maybe I don't understand which users you mean here - just us devs?

Do you think the mailing list discussion is concluded? It didn't seem done last I read it. But I see no harm in landing something that we don't depend on and is optional (but again, if it's optional I don't see where bug reports will come from).

@sbc100
Copy link
Collaborator

sbc100 commented Dec 20, 2019

Well emsdk install will still fail if npm install fails. So we should know very quickly if there are any issues with doing this.

What I mean by not depend on it, is that we can easily back it out, since we are not going to hold off on landing any emscripten-side change that depend on it (for some initial trial period).

If we get reports of failure we can simpy disable this again.

@kripken
Copy link
Member

kripken commented Dec 21, 2019

I see, thanks, lgtm then.

For landing this, maybe let's wait until after the holidays, to avoid breakage when we are not available? That will also give more time for the mailing list discussion.

@sbc100
Copy link
Collaborator

sbc100 commented Jan 8, 2020

lgtm

@juj
Copy link
Collaborator Author

juj commented Jan 9, 2020

Ping @kripken on this, it has now been quite a while since the conversation on mailing list. What do you think?

@kripken
Copy link
Member

kripken commented Jan 9, 2020

Yeah, I think we've waited enough for that discussion, so if you and @sbc100 have the time to land this and handle any fallout, lgtm.

@juj juj merged commit d30ba16 into master Jan 10, 2020
@delete-merged-branch delete-merged-branch bot deleted the npm_install_in_emscripten branch January 10, 2020 20:11
vargaz pushed a commit to vargaz/emsdk that referenced this pull request Nov 22, 2023
…net/node dotnet/cpython (emscripten-core#404)

* Update dependencies from https://github.com/dotnet/emscripten build 20230807.1

Microsoft.NETCore.Runtime.Wasm.Emscripten.Transport
 From Version 8.0.0-preview.4.23381.1 -> To Version 8.0.0-preview.4.23407.1

* Update dependencies from https://github.com/dotnet/binaryen build 20230807.1

runtime.linux-arm64.Microsoft.NETCore.Runtime.Wasm.Binaryen.Transport , runtime.linux-x64.Microsoft.NETCore.Runtime.Wasm.Binaryen.Transport , runtime.osx-arm64.Microsoft.NETCore.Runtime.Wasm.Binaryen.Transport , runtime.osx-x64.Microsoft.NETCore.Runtime.Wasm.Binaryen.Transport , runtime.win-arm64.Microsoft.NETCore.Runtime.Wasm.Binaryen.Transport , runtime.win-x64.Microsoft.NETCore.Runtime.Wasm.Binaryen.Transport
 From Version 8.0.0-preview.4.23381.1 -> To Version 8.0.0-preview.4.23407.1

* Update dependencies from https://github.com/dotnet/node build 20230807.1

runtime.linux-arm64.Microsoft.NETCore.Runtime.Wasm.Node.Transport , runtime.linux-x64.Microsoft.NETCore.Runtime.Wasm.Node.Transport , runtime.osx-arm64.Microsoft.NETCore.Runtime.Wasm.Node.Transport , runtime.osx-x64.Microsoft.NETCore.Runtime.Wasm.Node.Transport , runtime.win-arm64.Microsoft.NETCore.Runtime.Wasm.Node.Transport , runtime.win-x64.Microsoft.NETCore.Runtime.Wasm.Node.Transport
 From Version 8.0.0-preview.4.23381.1 -> To Version 8.0.0-preview.4.23407.1

* Update dependencies from https://github.com/dotnet/cpython build 20230809.1

runtime.osx-arm64.Microsoft.NETCore.Runtime.Wasm.Python.Transport , runtime.osx-x64.Microsoft.NETCore.Runtime.Wasm.Python.Transport , runtime.win-arm64.Microsoft.NETCore.Runtime.Wasm.Python.Transport , runtime.win-x64.Microsoft.NETCore.Runtime.Wasm.Python.Transport
 From Version 8.0.0-preview.4.23374.1 -> To Version 8.0.0-preview.4.23409.1

---------

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants