refactor: mmap asar files #24470

nornagon · 2020-07-08T22:56:14Z

Description of Change

In preparation for being able to load asar files from a dll/dylib, which are implicitly memory mapped.

Checklist

PR description included and stakeholders cc'd
npm test passes
PR title follows semantic commit guidelines
This is NOT A BREAKING CHANGE. Breaking changes may not be merged to master until 11-x-y is branched.

Release Notes

Notes: none

deepak1556 · 2020-07-08T23:36:53Z

First of all, interesting exploration 👍

QQ: wouldn't this increase the memory footprint of the main process considering the entire asar archive is being kept in memory ?

nornagon · 2020-07-09T18:08:26Z

QQ: wouldn't this increase the memory footprint of the main process considering the entire asar archive is being kept in memory ?

This isn't how mmap works. The OS can page out chunks of the mmap'd file.

nornagon · 2020-07-09T18:13:58Z

Some relevant commentary from Sublime: https://www.sublimetext.com/blog/articles/use-mmap-with-care

deepak1556 · 2020-07-09T18:31:15Z

thanks for the context, follow-up question why does base::MemoryMappedFile provide to load a specific region of file into mmap https://source.chromium.org/chromium/chromium/src/+/master:base/files/memory_mapped_file.h;l=102 and there are some uses of that in the codebase, what is the advantage over loading the whole file ? would it be useful for asar archive which can be separated to regions.

nornagon · 2020-07-09T18:35:49Z

Not 100% sure, but I think the main reason to map a subregion is to limit usage of virtual address space, which is important mainly on 32-bit systems. It could also be a convenience method to avoid having to pass around an offset everywhere along with the mmap'd file.

MarshallOfSound · 2020-07-09T18:48:00Z

@nornagon Have you performance tested this at all? Can we see if there's an increase / decrease in perf of reading a large amount of small files sequentially. Or reading large files. etc.

deepak1556 · 2020-07-09T20:55:39Z

Thinking a bit more, can you run this experiment behind a base::FeatureList flag, seems like a good candidate.

nornagon · 2020-07-09T20:58:26Z

Thinking a bit more, can you run this experiment behind a base::FeatureList flag, seems like a good candidate.

Hm, I'm not sure I understand why this would qualify for that. There's no change in functionality.

deepak1556 · 2020-07-09T21:11:26Z

In preparation for being able to load asar files from a dll/dylib

wouldn't this require anymore implementation changes, having both code paths then would help in comparison as well as a way to opt in.

There's no change in functionality.

Hmm but the underlying implementation has changed, this is similar to blink enabling its new rendering engine LayoutNG which had no user facing changes but was behind feature flag to experiment any regressions in perf or other stuff.

nornagon · 2020-07-09T21:34:30Z

In preparation for being able to load asar files from a dll/dylib

wouldn't this require anymore implementation changes, having both code paths then would help in comparison as well as a way to opt in.

You'll be able to opt in to loading from a dylib by putting your app in a dylib and enabling the relevant fuse. That part will always remain optional, and no feature flag will be necessary.

There's no change in functionality.

Hmm but the underlying implementation has changed, this is similar to blink enabling its new rendering engine LayoutNG which had no user facing changes but was behind feature flag to experiment any regressions in perf or other stuff.

LayoutNG was a major rewrite of a massive and extraordinarily complex system. This is not. I think the risk is substantially and meaningfully lower here.

I don't think we should ask that all refactors in Electron come with a feature flag and a deprecation cycle. If there are concrete reasons to suspect that this may cause issues, then I'd be happy to test & address them prior to merging. Otherwise, let's treat this as any other refactor: roll it out, watch for reports of breakage, and fix as necessary.

The only thing I'm cautious about is potential issues when loading from a network drive that goes away (as mentioned in the sublime post). Previously, that would have resulted in an I/O failure (with unknown effects on the app; it would be up to the app to handle it). After this change, I think it will result in SIGBUS.

There might be other issues when using mmap as opposed to fs.read(), but we won't find them until we enable this by default anyway, so putting it behind a flag will just delay the gathering of such information and force us to maintain both implementations in parallel for longer.

I think flags are best suited for large, relatively isolated functionality which app developers have a reason to want to opt into (e.g. it enables a feature for them, so they'll switch on the flag in order to use it). The value of this refactor by itself isn't obvious, so I don't see any reason a developer would choose to enable it.

shell/common/api/electron_api_asar.cc

lib/common/asar.js

shell/common/api/electron_api_asar.cc

zcbenz · 2020-07-23T01:48:52Z

In preparation for being able to load asar files from a dll/dylib, which are implicitly memory mapped.

I'm not sure what does loading asar files from dll means, can you explain a bit more about what it is and when it would be used?

Mmap is very useful for editors when editing very large files, where app has to constantly read parts of files. But in Electron apps most files in asar archive are only read for once, I'm not sure if there would be benefits on performance. So I'm a bit hesitate with a refactoring that brings little benefits and increases stability risks. It would be great if you can explain more about the new features it would enable, and probably do some benchmarks on performance.

nornagon · 2020-07-23T18:34:49Z

I'm not sure what does loading asar files from dll means, can you explain a bit more about what it is and when it would be used?

The idea is to allow protecting script sources with codesigning. Currently, .asar files are not considered "code" by codesigning verification systems in macOS and Windows:

macOS's Gatekeeper system validates non-code resources at first launch, but not thereafter. However, enabling the hardened runtime on a binary can prevent it from loading any unsigned dylibs, regardless of whether it's "first launch" or not.
- The level of protection that this would add is debatable. If an attacker can modify files on disk, they can just as easily remove the code signature and hardened runtime flags from the main executable as edit the .asar. In current versions of macOS, unsigned binaries run without any user prompt. There are, however, some minor restrictions that apply to unsigned apps: for instance, unsigned apps cannot access the keychain.
- I've heard that on modern macOS, Gatekeeper does "periodic" checks in addition to first-launch checks. I'm not sure what triggers this.
- There have been some signs from Apple that in future versions of macOS, unsigned code will not run by default.
On Windows, AppLocker can restrict which publishers' signatures are respected for all loaded DLLs. It has no system for requiring that non-DLL files are signed, though. Putting resources in a DLL on Windows would allow AppLocker to apply its restrictions to those resources.
I don't know enough about codesigning on Linux to comment as to whether this approach would be relevant on that platform. It seems there are several systems and ongoing development. e.g. IMA appraisal.

The goal here is to allow Electron apps to take advantage of the resource-integrity systems provided by the underlying operating system. Native apps have access to this by default, but given that the code of Electron apps is loaded dynamically from a non-"code" file, these systems do not effectively apply to Electron apps currently.

My plan for supporting this is roughly:

Introduce the ability to load ASAR files from a new kind of bundle file, .asar.dll (or .asar.dylib on macOS). These would be loaded with LoadLibrary / dlopen, and would expose a two well-known symbols: a pointer to a byte sequence, and the size of that byte sequence.
Add app.asar.dll to the front of the "path search" code when searching for app code.
Add a fuse which turns off the path search code and sets it to only ever load from app.asar.dll.

The ultimate goal is to provide apps with tools that allow them to do just as well as the OS will permit them to in attempting to prevent unvalidated code from running.

zcbenz · 2020-07-24T00:41:00Z

Thank for the clarification, code signing asar files would be very exciting and I think it is a fair use case. 👍

nornagon · 2020-07-28T23:44:27Z

@zcbenz please take another look.

nornagon · 2020-08-04T18:47:59Z

all 4 (!) ci failures are unrelated; merging.

release-clerk · 2020-08-04T18:48:08Z

No Release Notes

This reverts commit 01a2e23.

This reverts commit 01a2e23. Co-authored-by: Jeremy Rose <nornagon@nornagon.net>

This reverts commit 01a2e23.

gpetrov · 2021-05-18T19:39:32Z

@nornagon packaging asar files in dlls/gylibs and code signing them will be really an amazing feature! Can't wait till it is available!

I presume that the asar.dll files will be loaded by Electron only when the code signature checks out and refused otherwise.

Loading from a memory mapped files, should really give also a welcome speed boost as well, specially when dealing with tons of small node_modules files.

Also another welcome addition will be also to be able to encrypt the asar files, so next to code signing also code protection is available. When packing an asar archive we can already specify a stream transform, now we just need to at the loading as well.

gpetrov · 2021-05-19T08:03:11Z

A small but powerful addition will be also the just un-gzip asar dll files runtime on load. This will save storage as well and make asar files much smaller.

Also if the asar memory can be directly smartly mapped with the node_modules cache, so run time memory is saved. You don't want to load a 100MB asar file with node_modules, that will take then twice the memory because of the asar mmap file self and all the require modules cache... but that might be quite a technical challenge.

nornagon · 2021-05-19T16:42:15Z

I presume that the asar.dll files will be loaded by Electron only when the code signature checks out and refused otherwise.

This is up to the OS. On Windows, code signatures are not checked by default, but there are some 3rd-party tools like AppLocker which can be employed to harden the system.

Loading from a memory mapped files, should really give also a welcome speed boost as well, specially when dealing with tons of small node_modules files.

There is no appreciable speed difference between reading from an asar and reading from a memory-mapped file, as far as I can tell.

encrypt the asar files

We have no plans to add encryption of any sort.

un-gzip asar dll files runtime on load

I don't believe this is useful to implement in Electron. Enable NTFS compression if you want this.

directly smartly mapped with the node_modules cache

I'm not sure what you mean by this.

gpetrov · 2021-05-19T17:56:28Z

This is up to the OS. On Windows, code signatures are not checked by default, but there are some 3rd-party tools like AppLocker which can be employed to harden the system.

Well we can't really rely on the OS as we don't have control of it. Specially on windows indeed there is no code signing validation per default or it can be easily switched off.

So loading and validating the app and its asar file is really a responsibility of the app self. It has being asked before many times: #24260 and encryption just for that as well #24681 #2570

maybe something like the content verification with own public key, like in NWJS:
https://nwjs.readthedocs.io/en/latest/For%20Users/Advanced/Content%20Verification/

but then combined with Electron fuses @MarshallOfSound or other more easier way to embed the public key in the electron binary without the need to recompile it and also have a way from the asar content to verify back that the main binary is also not tempered and bot use the same code signing key.

un-gzip asar dll files runtime on load
I don't believe this is useful to implement in Electron. Enable NTFS compression if you want this.

Again we don't have control of the user system, so can just let them enable NTFS compression. Just like loading files from a website the browser accepts gzip and unzip on the fly - this should be also possible when loading asar files. NWJS also supports having zip file as a content package to load. But unfortunately it unzips it fully first and is not that smart as asar to do that on the fly.

directly smartly mapped with the node_modules cache > I'm not sure what you mean by this.

well the node modules code resides in the asar file, so when the asar file is loaded, it will get in memory (well maybe not completely depending on the mmap implementation) but then when a node module is required, the require builds a global cache in memory of all required modules, so those don't have to be loaded again. So you can actually the code of the same module twice - once in the asar memory file and once in the require cache https://nodejs.org/api/modules.html#modules_caching

fabiospampinato · 2021-06-08T11:22:53Z

un-gzip asar dll files runtime on load

I don't believe this is useful to implement in Electron. Enable NTFS compression if you want this.

@nornagon Could you expand on why you think this wouldn't be useful? One complaint I hear fairly often from people is that Electron apps are just big, if there's something that can deliver double-digits improvements in this area I don't think Electron should just dismiss the opportunity like that.

Also I don't think it's reasonable to say to people "Yeah we could improve on this, but just switch to a filesystem that supports compressed partitions", that's obviously not going to work, people don't even know what a filesystem is.

I have opened an issue about this here. IMHO working with compressed asar archives may speed things up too, since the bottleneck for almost all machine is going to be reading from disk rather than decompressing.

Other than the potential difficulties of switching to compressed asar archives it seems to me like the potential pros clearly outweigh the potential cons 🤷‍♂️

refactor: mmap asar files

259469e

nornagon requested a review from MarshallOfSound July 8, 2020 22:56

electron-cation bot added the new-pr 🌱 PR opened in the last 24 hours label Jul 8, 2020

remove unused method

10ad3d2

deepak1556 approved these changes Jul 9, 2020

View reviewed changes

electron-cation bot removed the new-pr 🌱 PR opened in the last 24 hours label Jul 9, 2020

zcbenz requested changes Jul 23, 2020

View reviewed changes

shell/common/api/electron_api_asar.cc Outdated Show resolved Hide resolved

shell/common/api/electron_api_asar.cc Outdated Show resolved Hide resolved

lib/common/asar.js Outdated Show resolved Hide resolved

shell/common/api/electron_api_asar.cc Outdated Show resolved Hide resolved

nornagon added 5 commits July 28, 2020 11:46

Merge remote-tracking branch 'origin/master' into asar-mmap

55bb765

review comments

d7b3bea

fix types

13da433

whoops

a7f9e10

whoops2

01382ef

nornagon requested a review from zcbenz July 28, 2020 23:44

nornagon added 3 commits August 3, 2020 14:57

Merge remote-tracking branch 'origin/master' into asar-mmap

8a51826

allocate backing store on main thread

7d85e9a

fix build

ecbeaba

zcbenz approved these changes Aug 4, 2020

View reviewed changes

nornagon merged commit 01a2e23 into master Aug 4, 2020

nornagon deleted the asar-mmap branch August 4, 2020 18:48

zarubond mentioned this pull request Mar 10, 2021

Unable to read files from asar after 10+ hours of runtime #27982

Closed

3 tasks

nornagon added a commit that referenced this pull request Mar 11, 2021

fix: revert "refactor: mmap asar files (#24470)"

7e6a939

This reverts commit 01a2e23.

nomagick mentioned this pull request Mar 15, 2021

ASAR fd not closeable on Windows #28195

Closed

3 tasks

nornagon added a commit that referenced this pull request Mar 15, 2021

fix: revert "refactor: mmap asar files (#24470)" (#28137)

d27ad0d

This reverts commit 01a2e23.

trop bot pushed a commit that referenced this pull request Mar 15, 2021

fix: revert "refactor: mmap asar files (#24470)"

167ec59

This reverts commit 01a2e23.

trop bot pushed a commit that referenced this pull request Mar 15, 2021

fix: revert "refactor: mmap asar files (#24470)"

28329e0

This reverts commit 01a2e23.

trop bot pushed a commit that referenced this pull request Mar 15, 2021

fix: revert "refactor: mmap asar files (#24470)"

247b003

This reverts commit 01a2e23.

zcbenz pushed a commit that referenced this pull request Mar 16, 2021

fix: revert "refactor: mmap asar files (#24470)" (#28201)

fb75073

This reverts commit 01a2e23. Co-authored-by: Jeremy Rose <nornagon@nornagon.net>

zcbenz pushed a commit that referenced this pull request Mar 16, 2021

fix: revert "refactor: mmap asar files (#24470)" (#28202)

44eb30f

This reverts commit 01a2e23. Co-authored-by: Jeremy Rose <nornagon@nornagon.net>

zcbenz pushed a commit that referenced this pull request Mar 16, 2021

fix: revert "refactor: mmap asar files (#24470)" (#28203)

010c6fd

This reverts commit 01a2e23. Co-authored-by: Jeremy Rose <nornagon@nornagon.net>

mlaurencin pushed a commit that referenced this pull request Mar 18, 2021

fix: revert "refactor: mmap asar files (#24470)" (#28137)

45e34e5

This reverts commit 01a2e23.

nornagon mentioned this pull request Mar 31, 2021

feat: add experimental cookie encryption support #27524

Merged

4 tasks

gpetrov mentioned this pull request Jun 8, 2021

[Feature Request]: Compressed asar archives #29556

Closed

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: mmap asar files #24470

refactor: mmap asar files #24470

nornagon commented Jul 8, 2020

deepak1556 commented Jul 8, 2020 •

edited

nornagon commented Jul 9, 2020

nornagon commented Jul 9, 2020

deepak1556 commented Jul 9, 2020

nornagon commented Jul 9, 2020

MarshallOfSound commented Jul 9, 2020

deepak1556 commented Jul 9, 2020

nornagon commented Jul 9, 2020

deepak1556 commented Jul 9, 2020

nornagon commented Jul 9, 2020

zcbenz commented Jul 23, 2020

nornagon commented Jul 23, 2020

zcbenz commented Jul 24, 2020

nornagon commented Jul 28, 2020

nornagon commented Aug 4, 2020

release-clerk bot commented Aug 4, 2020

gpetrov commented May 18, 2021

gpetrov commented May 19, 2021

nornagon commented May 19, 2021

gpetrov commented May 19, 2021

fabiospampinato commented Jun 8, 2021

refactor: mmap asar files #24470

refactor: mmap asar files #24470

Conversation

nornagon commented Jul 8, 2020

Description of Change

Checklist

Release Notes

deepak1556 commented Jul 8, 2020 • edited

nornagon commented Jul 9, 2020

nornagon commented Jul 9, 2020

deepak1556 commented Jul 9, 2020

nornagon commented Jul 9, 2020

MarshallOfSound commented Jul 9, 2020

deepak1556 commented Jul 9, 2020

nornagon commented Jul 9, 2020

deepak1556 commented Jul 9, 2020

nornagon commented Jul 9, 2020

zcbenz commented Jul 23, 2020

nornagon commented Jul 23, 2020

zcbenz commented Jul 24, 2020

nornagon commented Jul 28, 2020

nornagon commented Aug 4, 2020

release-clerk bot commented Aug 4, 2020

gpetrov commented May 18, 2021

gpetrov commented May 19, 2021

nornagon commented May 19, 2021

gpetrov commented May 19, 2021

fabiospampinato commented Jun 8, 2021

deepak1556 commented Jul 8, 2020 •

edited