fix: change ASAR archive cache to per-process to fix leak #29293
Conversation
Force-pushed from 38a08bf to 241e24e.
Nice analysis of the issue! Putting aside the process-wide archive cache solution for a moment, regarding the call to …
It does seem like one since it's a bad access crash, yea.
I've added that change locally, and it does indeed fix the bad access crash on thread exit. (Maybe submit that fix to Chromium?) Interestingly enough, memory usage differs while using it. Here's some data on memory usage: the first table is the fix in this PR, the second table is using `base::ThreadLocalOwnedPointer`.
Here's the `base::ThreadLocalOwnedPointer` patch, for completeness:

```diff
diff --git a/shell/common/asar/asar_util.cc b/shell/common/asar/asar_util.cc
index 65ddbf284..f43d9340d 100644
--- a/shell/common/asar/asar_util.cc
+++ b/shell/common/asar/asar_util.cc
@@ -10,6 +10,7 @@
 #include "base/files/file_path.h"
 #include "base/files/file_util.h"
 #include "base/lazy_instance.h"
+#include "base/memory/ptr_util.h"
 #include "base/stl_util.h"
 #include "base/threading/thread_local.h"
 #include "base/threading/thread_restrictions.h"
@@ -21,7 +22,7 @@ namespace {

 // The global instance of ArchiveMap, will be destroyed on exit.
 typedef std::map<base::FilePath, std::shared_ptr<Archive>> ArchiveMap;
-base::LazyInstance<base::ThreadLocalPointer<ArchiveMap>>::Leaky
+base::LazyInstance<base::ThreadLocalOwnedPointer<ArchiveMap>>::Leaky
     g_archive_map_tls = LAZY_INSTANCE_INITIALIZER;

 const base::FilePath::CharType kAsarExtension[] = FILE_PATH_LITERAL(".asar");
@@ -41,7 +42,7 @@ bool IsDirectoryCached(const base::FilePath& path) {

 std::shared_ptr<Archive> GetOrCreateAsarArchive(const base::FilePath& path) {
   if (!g_archive_map_tls.Pointer()->Get())
-    g_archive_map_tls.Pointer()->Set(new ArchiveMap);
+    g_archive_map_tls.Pointer()->Set(base::WrapUnique(new ArchiveMap));
   ArchiveMap& map = *g_archive_map_tls.Pointer()->Get();

   // if we have it, return it
@@ -61,8 +62,9 @@ std::shared_ptr<Archive> GetOrCreateAsarArchive(const base::FilePath& path) {
 }

 void ClearArchives() {
-  if (g_archive_map_tls.Pointer()->Get())
-    delete g_archive_map_tls.Pointer()->Get();
+  if (g_archive_map_tls.Pointer()->Get()) {
+    g_archive_map_tls.Pointer()->Set(nullptr);
+  }
 }

 bool GetAsarArchivePath(const base::FilePath& full_path,
```
Yup, it is a good candidate. Please give upstreaming it a try.
The heap profiler from Chromium should help identify the difference here. You would need this change to start the profiler. I do like the thread-scoped archive map for its simple lifetime and lock-free code in I/O sections, but the process-wide map does show its advantage in memory usage. I'm open to either solution and will wait for other maintainers to chime in.
I was suggesting you be the one to upstream it. 😄 I just randomly stuck the call into …

Thanks for the info. I'm not inclined to spend any more time on this for now; I've already sunk a good amount of time into debugging the issue and making sure I fully understood it for the bug report write-up. Anyone else, feel free: all the info needed and the patches are here.
I like the lock-free nature of the thread-scoped map, which is why I examined quite a few alternatives before settling on this PR's final form; I would have preferred avoiding the explicit locking. However, as mentioned in the original comment, the short lifetime of the threads where the cache would be most useful (thread pool workers) significantly diminishes cache hits. The threads are constantly destroyed, meaning cache misses happen often and the ASAR needs to be parsed again on a request, so you gain nothing from the cache in those situations. As the second table in my previous comment shows (open file handles are a proxy measure for how many per-thread caches there are), after 30 or so seconds there's only one cache remaining (I even saw it drop back to 0 at one point in testing). A single request might benefit from the cache, but if a handful of requests (fairly common usage) are sent at once, only one would benefit from the cache; the others all parse the ASAR, using memory and file handles in the meantime.

It's also important to consider the code paths which use this cache; the JavaScript-land `fs` module doesn't go through it.

Finally, the distributed nature of the thread-scoped cache makes it difficult (if not impossible?) to ever provide manual control over the caching. This appears to be a use case some developers want (see #28368, and node-fs/#876), because the open file handles to the ASAR file prevent them from doing certain things. With a centralized cache there's at least hope of providing a way to free those file handles.
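To make that concrete, here is a minimal sketch of a process-wide, lock-guarded cache with a manual clearing hook. `GetOrCreateAsarArchive` and `ClearArchives` are the real names from `asar_util.cc`; the `Archive` stub and the lock/map getters are illustrative assumptions, not the PR's exact code.

```cpp
// Minimal sketch of a process-wide archive cache guarded by a lock.
#include <map>
#include <memory>

#include "base/files/file_path.h"
#include "base/no_destructor.h"
#include "base/synchronization/lock.h"

namespace asar {

// Stub; the real class in shell/common/asar/archive.h parses the header
// and keeps the .asar file handle open.
class Archive {
 public:
  explicit Archive(const base::FilePath& path) : path_(path) {}
  bool Init() { return true; }  // Real code reads and parses the header.

 private:
  base::FilePath path_;
};

using ArchiveMap = std::map<base::FilePath, std::shared_ptr<Archive>>;

base::Lock& GetArchiveCacheLock() {
  static base::NoDestructor<base::Lock> lock;
  return *lock;
}

ArchiveMap& GetArchiveCache() {
  static base::NoDestructor<ArchiveMap> map;
  return *map;
}

std::shared_ptr<Archive> GetOrCreateAsarArchive(const base::FilePath& path) {
  base::AutoLock guard(GetArchiveCacheLock());
  ArchiveMap& map = GetArchiveCache();
  auto iter = map.find(path);
  if (iter != map.end())
    return iter->second;  // Cache hit: no re-parse, no extra file handle.
  auto archive = std::make_shared<Archive>(path);
  if (!archive->Init())  // Parse the header once per process.
    return nullptr;
  map[path] = archive;
  return archive;
}

// With one centralized map, releasing the open ASAR file handles is a
// single lock-guarded call; per-thread caches can't offer this.
void ClearArchives() {
  base::AutoLock guard(GetArchiveCacheLock());
  GetArchiveCache().clear();
}

}  // namespace asar
```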
Since it does not affect the `fs` module, there should be no performance penalty. I'm good with making the cache per-process.
LGTM 👍
Tested `AsarURLLoader` thread ticks before and after this change with a batch of requests from a relatively large ASAR file; the PR didn't regress either the sync or async scenario, and timing is better with this change.
Trace code
```diff
diff --git a/shell/browser/net/asar/asar_url_loader.cc b/shell/browser/net/asar/asar_url_loader.cc
index 394d0f9db..3aca3e6ae 100644
--- a/shell/browser/net/asar/asar_url_loader.cc
+++ b/shell/browser/net/asar/asar_url_loader.cc
@@ -9,6 +9,7 @@
 #include <utility>
 #include <vector>

+#include "base/trace_event/typed_macros.h"
 #include "base/strings/stringprintf.h"
 #include "base/task/post_task.h"
 #include "base/task/thread_pool.h"
@@ -90,6 +91,7 @@ class AsarURLLoader : public network::mojom::URLLoader {
       mojo::PendingReceiver<network::mojom::URLLoader> loader,
       mojo::PendingRemote<network::mojom::URLLoaderClient> client,
       scoped_refptr<net::HttpResponseHeaders> extra_response_headers) {
+    TRACE_EVENT_BEGIN("electron", "AsarURLLoader::Start");
     auto head = network::mojom::URLResponseHead::New();
     head->request_start = base::TimeTicks::Now();
     head->response_start = base::TimeTicks::Now();
@@ -264,6 +266,7 @@ class AsarURLLoader : public network::mojom::URLLoader {
   }

   void MaybeDeleteSelf() {
+    TRACE_EVENT_END("electron");
     if (!receiver_.is_bound() && !client_.is_bound())
       delete this;
   }
```
Great, thanks @deepak1556 for the extra testing with tracing. I'd like @nornagon to have a chance to comment, since he was recently in this part of the code base with the ASAR memory-mapped file work.
Force-pushed from 241e24e to 70f15ba.
PR rebased and deleted some …
Force-pushed from 70f15ba to 4e24d9b.
Force-pushed from ebb22b9 to d1458d4.
@nornagon, ready for re-review. 👍
I have automatically backported this PR to "13-x-y", please check out #29535
I have automatically backported this PR to "14-x-y", please check out #29536
fix: change ASAR archive cache to per-process to fix leak (#29293)

* fix: change ASAR archive cache to per-process to fix leak (electron#29292)
* chore: address code review comments
* chore: tighten up thread-safety
* chore: better address code review comments
* chore: more code review changes
@dsanders11 has manually backported this PR to "12-x-y", please check out #29548
Description of Change
Closes #29292.
This PR changes the ASAR archive cache in `shell/common/asar/asar_util.cc` to be process-wide instead of stored in thread-locals, to prevent the `asar::Archive` from being leaked on worker thread exit. See issue #29292 for the technical details of the bug.

Changing the cache to be process-wide means that thread-safety had to be added to `asar::Archive` and `asar::GetOrCreateAsarArchive`. This is (unfortunately?) accomplished with locks, as `asar::GetOrCreateAsarArchive` is used in both async (`AsarURLLoader`) and sync (`NativeImage`) contexts. I know locks are frowned upon in Chromium, but using them here allows the code to continue to be used from the synchronous code where it currently lives.

Other changes include making `asar::IsDirectoryCached` thread-safe, which it wasn't before, and removing all current usages of `asar::ClearArchives`, since they were in spots that weren't really accomplishing anything (processes being torn down anyway). `asar::ClearArchives` was left in the code base, though, for potential future use in manually clearing the cache. The `asar::Archive::header` method was also removed: it is not currently used, and it could be used to modify the parsed header, which would make `asar::Archive` not thread-safe.

Alternatives Considered
Quite a few alternatives were considered, in an attempt to avoid locking for thread-safety, in no particular order:
Use `base::ThreadLocalOwnedPointer` To Fix Leak

This was initially attempted as the most concise fix for the leak, as it would destruct the cache on thread exit. However, this change led to a crash on thread exit, through the `File::Close` code path. This is due to another thread-local storage in the Chromium code base, which appears to have already been destructed by the time the archive cache is being destructed, but the code in `File::Close` needs to use that other thread-local storage. Ensuring the order of destruction of disconnected thread-local storages seems fragile and bug-prone. Per-thread caches are also redundant, and make it very difficult to clear the cache.
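A standalone illustration of that destruction-order hazard, using plain POSIX TLS rather than Chromium's `base::ThreadLocalOwnedPointer` (the key names are stand-ins, not Chromium code):

```cpp
#include <pthread.h>
#include <cstdio>

// Stand-ins: one key for the per-thread archive cache, one for the
// unrelated TLS that File::Close needs (ScopedBlockingCall's observer).
pthread_key_t observer_key;
pthread_key_t cache_key;

void DestroyObserver(void* ptr) {
  delete static_cast<int*>(ptr);
}

void DestroyCache(void* ptr) {
  // POSIX runs TLS destructors of different keys in unspecified relative
  // order. If the observer slot was already cleaned up, this sees nullptr;
  // Chromium's equivalent code path used state that was no longer valid.
  void* observer = pthread_getspecific(observer_key);
  std::printf("cache destructor: observer slot = %p\n", observer);
  delete static_cast<int*>(ptr);
}

void* ThreadMain(void*) {
  pthread_setspecific(observer_key, new int(1));
  pthread_setspecific(cache_key, new int(2));
  return nullptr;  // TLS destructors for both keys run now.
}

int main() {
  pthread_key_create(&observer_key, DestroyObserver);
  pthread_key_create(&cache_key, DestroyCache);
  pthread_t thread;
  pthread_create(&thread, nullptr, ThreadMain, nullptr);
  pthread_join(thread, nullptr);
  return 0;
}
```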
Eliminate Archive Cache Altogether

The bug could be fixed by simply eliminating the archive cache and letting each interaction with an ASAR archive load and parse the header. However, parsing the header can be rather slow; in my testing I was seeing ~40 ms with an ASAR archive header that was about 850 KB. Eliminating the cache would lead to a lot of redundant processing and more resource thrash, since each request would temporarily store the parsed header in memory and would open a new file handle.
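As a rough idea of how such a number could be measured, here is a hypothetical snippet (assuming `base::ElapsedTimer` and the `asar::Archive::Init()` entry point; not the benchmark actually used):

```cpp
// Hypothetical micro-measurement of the ASAR header parse cost.
#include <memory>

#include "base/files/file_path.h"
#include "base/logging.h"
#include "base/timer/elapsed_timer.h"
#include "shell/common/asar/archive.h"

void MeasureHeaderParse(const base::FilePath& asar_path) {
  base::ElapsedTimer timer;
  auto archive = std::make_shared<asar::Archive>(asar_path);
  const bool ok = archive->Init();  // Opens the file and parses the header.
  LOG(INFO) << "Archive::Init ok=" << ok << ", took "
            << timer.Elapsed().InMillisecondsF() << " ms";
}
```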
Don't Use Archive Cache for Sync Code Paths, Use A Sequence Instead of Locks
This has the same shortcomings as eliminating the cache, but isolated to sync code (like `NativeImage`). The sync code path is accessed from JavaScript, so it likely benefits from the cache consistently today, since all access happens on the same thread. As such, not using the archive cache would be a performance regression for those code paths.

Use `base::SequenceLocalStorageSlot` Instead of Thread-Local Storage

`base::SequenceLocalStorageSlot` is like thread-local storage, but per-sequence. That means all worker threads using the same sequence would get the same archive cache, and it would be destructed when the sequence was destructed. This approach isn't viable because `AsarURLLoader` runs on a new sequence for each request, and running all the loaders on the same sequence could be a bottleneck.

cc @nornagon, @zcbenz
Checklist

`npm test` passes

Release Notes
Notes: Fixed memory leak when requesting files in ASAR archive from renderer