[Bug]: Main Process Can Leak Memory/File Handles/Temp Files When Renderer Requests Files From ASAR #29292
dsanders11 added a commit to dsanders11/electron that referenced this issue on May 23, 2021

This was referenced on May 23, 2021

dsanders11 added a commit to dsanders11/electron that referenced this issue on May 23, 2021
This is a really groundbreaking bug report! Awesome work @dsanders11 — great research and PR solution! We were just about to drop ASAR usage because of this memory leak and the tons of out-of-memory crashes it was causing. But now there is light on the horizon! Hope the PR gets merged quickly by the Electron team!
dsanders11 added a commit to dsanders11/electron that referenced this issue on May 31, 2021

dsanders11 added a commit to dsanders11/electron that referenced this issue on Jun 2, 2021
zcbenz pushed a commit that referenced this issue on Jun 4, 2021

* fix: change ASAR archive cache to per-process to fix leak (#29292)
* chore: address code review comments
* chore: tighten up thread-safety
* chore: better address code review comments
* chore: more code review changes
zcbenz pushed a commit that referenced this issue on Jun 4, 2021

* fix: change ASAR archive cache to per-process to fix leak (#29292)
* chore: address code review comments
* chore: tighten up thread-safety
* chore: better address code review comments
* chore: more code review changes

Co-authored-by: David Sanders <dsanders11@ucsbalum.com>

zcbenz pushed a commit that referenced this issue on Jun 4, 2021

* fix: change ASAR archive cache to per-process to fix leak (#29292)
* chore: address code review comments
* chore: tighten up thread-safety
* chore: better address code review comments
* chore: more code review changes

Co-authored-by: David Sanders <dsanders11@ucsbalum.com>
dsanders11 added a commit to dsanders11/electron that referenced this issue on Jun 5, 2021

…9293)
* fix: change ASAR archive cache to per-process to fix leak (electron#29292)
* chore: address code review comments
* chore: tighten up thread-safety
* chore: better address code review comments
* chore: more code review changes
zcbenz pushed a commit that referenced this issue on Jun 7, 2021

* fix: change ASAR archive cache to per-process to fix leak (#29292)
* chore: address code review comments
* chore: tighten up thread-safety
* chore: better address code review comments
* chore: more code review changes
Good job!
Bug Description
When a renderer sends a request for a file in an ASAR archive (including unpacked files), the main process can leak the C++ `asar::Archive` object, leaking both memory and a file handle. If any temp files were created for that archive via `asar::Archive::CopyFileOut`, those are leaked as well. This doesn't happen on every request; see "Reproduction" and "Root Cause" for more info.

Refs #28962.

Might ref #27982 and explain why the switch to memory-mapped files caused issues, since those would also be leaked by this bug. cc @nornagon.
Reproduction
To reproduce, send a handful of `fetch` requests (to guarantee reproduction, send dozens or even hundreds at once) from a renderer for a file in an ASAR. It doesn't matter whether the file is marked as unpacked, as long as you request it from a path with `app.asar` and not `app.asar.unpacked`; either case reproduces the bug. The amount of memory leaked is related to the header size of the ASAR archive (not the size of the file requested), so the bigger the archive the better: each reproduction leaks only ~10x the header size. The leak can also be confirmed by watching the open file handles for the main process (`lsof -p <pid> | grep 'app.asar$'`); there shouldn't be more than a few, and reproducing the bug will cause them to increase.

There's a timing aspect to the leak (which isn't strictly timing related; more on that in "Root Cause"), so it's best to wait 30 to 60 seconds between attempts to reproduce, leaving the app completely idle during that time.

The Signal app reliably reproduces the bug on macOS simply by opening the emoji picker in a conversation and flipping through the emoji categories, since that sends requests for dozens of emoji images, which live in the app's ASAR file. That app has a large ASAR file (~100MB), so the memory leak is substantial and noticeable.
Root Cause
The leak is due to the use of `base::ThreadLocalPointer` in `shell/common/asar/asar_util.cc` to cache `asar::Archive` objects per thread. That class does not take ownership of the pointer, so memory management needs to be handled manually (the header file comment for the class notes this).

While there is an `asar::ClearArchives` function which deletes the pointer stored in `base::ThreadLocalPointer`, it does not align well with the real-world lifetime of the threads that handle URL requests from renderers. It also appears `asar::ClearArchives` is only called in a few locations, where it executes on a main thread (main process, renderer, web worker), so it wouldn't affect the thread-local storage on worker threads. This means threads get destroyed without a call to `asar::ClearArchives`, leaking the `asar::Archive` instance. Due to the lifespan and nature of worker threads, it's not trivial to add a call to `asar::ClearArchives` that is guaranteed to run before thread exit.
which will be executed before thread exit.In particular, the pointer in
base::ThreadLocalPointer
gets populated whenasar::AsarURLLoader
is run (viaasar::GetOrCreateAsarArchive
), and that URL loader is executed on the worker thread pool, which creates and destroys threads as needed. This is what creates the “timing” aspect in the reproduction. Rather than waiting an arbitrary amount of time, you can watch the number of threads for the main process and the leak will reproduce again once threads have been destroyed, meaning new threads will be created to handle future requests. In testing, an app left entirely idle will dwindle down to ~28-30 total threads for the main process (lldb
shows 1 or 2 "ThreadPoolForegroundWorker" threads), and when hundreds of requests from the renderer are sent it increases to ~45-48 total threads (lldb
shows ~16 "ThreadPoolForegroundWorker" threads). When left idle for a few dozen seconds, it then dwindles back down to ~28-30 threads again.Is The ASAR Archive Cache Necessary?
Probably? At the moment it only benefits some subset of requests from renderers, since worker threads come and go. App startup is likely a time when many apps load a handful of files at once, which uses multiple worker threads, each of which takes a cache miss, so a common path currently isn't getting much benefit from the caching. However, testing with the Signal app (since it has a large ASAR) did show that with caching disabled each request took ~40ms longer, so the improvement from caching is not insignificant for larger apps. For real-world usage it looks a little mixed, though: there seems to be a bottleneck elsewhere in the main process code when more than a few requests are sent at once, and fixing this issue didn't seem to make any real difference in the startup time of the Signal app.