refactor: mmap asar files #24470
First of all, interesting exploration 👍 QQ: wouldn't this increase the memory footprint of the main process, considering the entire asar archive is being kept in memory? |
This isn't how mmap works. The OS can page out chunks of the mmap'd file. |
Some relevant commentary from Sublime: https://www.sublimetext.com/blog/articles/use-mmap-with-care |
thanks for the context, follow-up question why does |
Not 100% sure, but I think the main reason to map a subregion is to limit usage of virtual address space, which is important mainly on 32-bit systems. It could also be a convenience method to avoid having to pass around an offset everywhere along with the mmap'd file. |
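To illustrate the subregion point above, here is a minimal sketch in Python (not Electron's actual C++ code). The `MappedRegion` class and `demo` function are hypothetical names made up for this example. It maps only the requested byte range, which keeps virtual address space usage small, and it hides the base offset so callers can address the region from zero.

```python
import mmap
import os
import tempfile


class MappedRegion:
    """Read-only view of bytes [offset, offset + length) of a file.

    Hypothetical helper for illustration only.
    """

    def __init__(self, path, offset, length):
        # The mapping offset must be a multiple of the allocation
        # granularity, so round down and remember the extra leading bytes.
        gran = mmap.ALLOCATIONGRANULARITY
        aligned = offset - (offset % gran)
        self._slack = offset - aligned
        self._length = length
        with open(path, "rb") as f:
            # Only slack + length bytes of address space are reserved,
            # not the whole file.
            self._map = mmap.mmap(
                f.fileno(),
                self._slack + length,
                access=mmap.ACCESS_READ,
                offset=aligned,
            )

    def read(self, start, size):
        # Callers use region-relative positions; the base offset is hidden.
        if start < 0 or size < 0 or start + size > self._length:
            raise ValueError("read outside mapped region")
        begin = self._slack + start
        return self._map[begin:begin + size]

    def close(self):
        self._map.close()


def demo():
    # Build a file with 70000 filler bytes, then a small payload.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"H" * 70000 + b"hello asar" + b"T" * 10)
        region = MappedRegion(path, 70000, 10)
        try:
            return region.read(0, 5)
        finally:
            region.close()
    finally:
        os.remove(path)
```

Calling `demo()` maps only the payload (plus alignment slack) and reads its first five bytes without the caller ever handling the file-level offset.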
@nornagon Have you performance tested this at all? Can we see whether there's an increase or decrease in perf when reading a large number of small files sequentially, or when reading large files, etc.? |
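One rough way to approach the comparison asked about above is a micro-benchmark like the following Python sketch. It is not the Electron implementation and does not use the real asar format: `build_archive`, `read_with_seek`, `read_with_mmap`, and `compare` are hypothetical helpers, and the "archive" is just equally sized blobs written back to back.

```python
import mmap
import os
import tempfile
import time


def build_archive(path, n_files, file_size):
    """Write n_files blobs back to back; return (offset, size) entries.

    Assumed toy layout, not the real asar format.
    """
    entries = []
    with open(path, "wb") as f:
        for i in range(n_files):
            entries.append((f.tell(), file_size))
            f.write(bytes([i % 256]) * file_size)
    return entries


def read_with_seek(path, entries):
    # Conventional path: one seek + read call per packed file.
    out = []
    with open(path, "rb") as f:
        for offset, size in entries:
            f.seek(offset)
            out.append(f.read(size))
    return out


def read_with_mmap(path, entries):
    # mmap path: map the whole file once, then slice out each entry.
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        return [m[offset:offset + size] for offset, size in entries]


def compare(n_files=2000, file_size=512):
    """Return (seek_seconds, mmap_seconds) for reading every entry once."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        entries = build_archive(path, n_files, file_size)
        t0 = time.perf_counter()
        via_seek = read_with_seek(path, entries)
        t1 = time.perf_counter()
        via_mmap = read_with_mmap(path, entries)
        t2 = time.perf_counter()
        # Both strategies must return identical bytes.
        assert via_seek == via_mmap
        return t1 - t0, t2 - t1
    finally:
        os.remove(path)
```

Because the file has just been written, it is almost certainly in the OS page cache, so this mostly measures per-read call overhead, not disk latency. A cold-cache run and a large-file case would need separate setups.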
Thinking a bit more, can you run this experiment behind a |
Hm, I'm not sure I understand why this would qualify for that. There's no change in functionality. |
Wouldn't this require any more implementation changes? Having both code paths would then help with comparison, as well as providing a way to opt in.
Hmm, but the underlying implementation has changed; this is similar to Blink enabling its new rendering engine. |
You'll be able to opt in to loading from a dylib by putting your app in a dylib and enabling the relevant fuse. That part will always remain optional, and no feature flag will be necessary.
LayoutNG was a major rewrite of a massive and extraordinarily complex system. This is not. I think the risk is substantially and meaningfully lower here. I don't think we should ask that all refactors in Electron come with a feature flag and a deprecation cycle. If there are concrete reasons to suspect that this may cause issues, then I'd be happy to test & address them prior to merging. Otherwise, let's treat this as any other refactor: roll it out, watch for reports of breakage, and fix as necessary.
The only thing I'm cautious about is potential issues when loading from a network drive that goes away (as mentioned in the Sublime post). Previously, that would have resulted in an I/O failure (with unknown effects on the app; it would be up to the app to handle it). After this change, I think it will result in SIGBUS.
There might be other issues when using mmap as opposed to fs.read(), but we won't find them until we enable this by default anyway, so putting it behind a flag will just delay the gathering of such information and force us to maintain both implementations in parallel for longer.
I think flags are best suited for large, relatively isolated functionality which app developers have a reason to want to opt into (e.g. it enables a feature for them, so they'll switch on the flag in order to use it). The value of this refactor by itself isn't obvious, so I don't see any reason a developer would choose to enable it. |
I'm not sure what loading asar files from a dll means; can you explain a bit more about what it is and when it would be used? Mmap is very useful for editors when editing very large files, where the app has to constantly read parts of files. But in Electron apps most files in the asar archive are only read once, so I'm not sure there would be performance benefits. So I'm a bit hesitant about a refactoring that brings little benefit and increases stability risks. It would be great if you could explain more about the new features it would enable, and probably do some performance benchmarks. |
The idea is to allow protecting script sources with codesigning. Currently, .asar files are not considered "code" by codesigning verification systems in macOS and Windows:
The goal here is to allow Electron apps to take advantage of the resource-integrity systems provided by the underlying operating system. Native apps have access to this by default, but given that the code of Electron apps is loaded dynamically from a non-"code" file, these systems do not effectively apply to Electron apps currently. My plan for supporting this is roughly:
The ultimate goal is to provide apps with tools that allow them to do just as well as the OS will permit them to in attempting to prevent unvalidated code from running. |
Thanks for the clarification, code signing asar files would be very exciting and I think it is a fair use case. 👍 |
@zcbenz please take another look. |
all 4 (!) ci failures are unrelated; merging. |
No Release Notes |
This reverts commit 01a2e23.
@nornagon packaging asar files in dlls/dylibs and code signing them will be a really amazing feature! Can't wait until it is available! I presume that the asar.dll files will be loaded by Electron only when the code signature checks out, and refused otherwise. Loading from memory-mapped files should also give a welcome speed boost, especially when dealing with tons of small node_modules files. Another welcome addition would be the ability to encrypt the asar files, so that code protection is available alongside code signing. When packing an asar archive we can already specify a stream transform; now we just need one at load time as well. |
A small but powerful addition would also be to un-gzip asar dll files at runtime, on load. This would save storage and make asar files much smaller. It would also help if the asar memory could be mapped directly and smartly onto the node_modules cache, so that runtime memory is saved. You don't want to load a 100MB asar file with node_modules that then takes twice the memory because of the asar mmap file itself plus the require cache of all the modules... but that might be quite a technical challenge. |
This is up to the OS. On Windows, code signatures are not checked by default, but there are some 3rd-party tools like AppLocker which can be employed to harden the system.
There is no appreciable speed difference between reading from an asar and reading from a memory-mapped file, as far as I can tell.
We have no plans to add encryption of any sort.
I don't believe this is useful to implement in Electron. Enable NTFS compression if you want this.
I'm not sure what you mean by this. |
Well, we can't really rely on the OS, as we don't have control of it. On Windows especially, there is indeed no code-signing validation by default, or it can easily be switched off. So loading and validating the app and its asar file is really the responsibility of the app itself. It has been asked for many times before: #24260, and encryption for just that purpose as well: #24681 #2570. Maybe something like content verification with one's own public key, like in NWJS: but then combined with Electron fuses @MarshallOfSound, or some other easier way to embed the public key in the Electron binary without needing to recompile it, and also a way for the asar content to verify in turn that the main binary has not been tampered with and that both use the same code-signing key.
Again, we don't have control of the user's system, so we can't just have them enable NTFS compression. Just as a browser loading files from a website accepts gzip and unzips on the fly, this should also be possible when loading asar files. NWJS also supports a zip file as a content package to load, but unfortunately it unzips it fully first and is not as smart as asar at doing that on the fly.
Well, the node modules code resides in the asar file, so when the asar file is loaded, it will be in memory (well, maybe not completely, depending on the mmap implementation). But then when a node module is required, require builds a global in-memory cache of all required modules, so those don't have to be loaded again. So you can actually have the code of the same module in memory twice: once in the memory-mapped asar file and once in the require cache https://nodejs.org/api/modules.html#modules_caching |
@nornagon Could you expand on why you think this wouldn't be useful? One complaint I hear fairly often from people is that Electron apps are just big; if there's something that can deliver double-digit improvements in this area, I don't think Electron should just dismiss the opportunity like that. Also, I don't think it's reasonable to say to people "Yeah we could improve on this, but just switch to a filesystem that supports compressed partitions". That's obviously not going to work; people don't even know what a filesystem is. I have opened an issue about this here. IMHO working with compressed asar archives may speed things up too, since the bottleneck for almost all machines is going to be reading from disk rather than decompressing. Other than the potential difficulties of switching to compressed asar archives, it seems to me that the potential pros clearly outweigh the potential cons 🤷‍♂️ |
Description of Change
In preparation for being able to load asar files from a dll/dylib, which are implicitly memory mapped.
Checklist
npm test passes
Release Notes
Notes: none