Use multiple threads to import resources. #47343
I gave this PR a test rebased on top of #47370, and I'm having some issues when trying to import the current tps-demo:
Here I got a stack trace with the above steps to reproduce. It crashes importing a glTF file:
EditorProgress is suspect too.
Extracted some files from the TPS demo to make an MRP for the deadlock issue:
Seems to work pretty well now! Testing on TPS demo:
- For now everything imports multithreaded by default (should work I guess, let's test).
- Controllable per importer.

Early test benchmark. 64 large textures (importing as lossless, _not_ as VRAM) on a mobile i7, 12 threads: importing goes down from 46 to 7 seconds.

For VRAM I will change the logic to use a compressing thread in a subsequent PR, as well as implementing Betsy.
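To make the two bullet points above concrete, here is a minimal sketch of how a "threaded by default, controllable per importer" scheme could look. It is not the code from this PR: `ImportJob`, `can_import_threaded`, and `import_all` are hypothetical names used only for illustration, and plain `std::thread` workers stand in for whatever pool the editor actually uses.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for one file to import; not a Godot type.
struct ImportJob {
	bool can_import_threaded; // Per-importer opt-out (assumed name).
	std::function<void()> run; // The actual import work.
};

// Runs the thread-safe jobs on `thread_count` workers, then the rest serially.
// Returns how many jobs were handed to worker threads.
inline std::size_t import_all(const std::vector<ImportJob> &jobs, unsigned thread_count) {
	std::vector<const ImportJob *> threaded, serial;
	for (const ImportJob &job : jobs) {
		(job.can_import_threaded ? threaded : serial).push_back(&job);
	}

	// Each worker claims the next unclaimed job through a shared atomic index.
	std::atomic<std::size_t> next{0};
	auto worker = [&]() {
		for (;;) {
			const std::size_t i = next.fetch_add(1, std::memory_order_relaxed);
			if (i >= threaded.size()) {
				return;
			}
			threaded[i]->run();
		}
	};

	std::vector<std::thread> workers;
	const unsigned n = std::max(1u, thread_count);
	for (unsigned t = 0; t < n; ++t) {
		workers.emplace_back(worker);
	}
	for (std::thread &w : workers) {
		w.join();
	}

	// Importers that opted out run on the calling thread.
	for (const ImportJob *job : serial) {
		job->run();
	}
	return threaded.size();
}
```

With 64 jobs and 12 workers, as in the benchmark above, each worker keeps pulling jobs until the index runs past the end, so uneven job sizes balance out without any up-front partitioning.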
I downloaded the Windows editor artifact for Godot Engine from the GitHub Actions CI, imported tps-demo with it, and nothing hung.
The load times roughly matched Akien's numbers, but I have a faster computer.
I did an integration test only, not a style review or code review.
Without PR, importing the TPS demo on my system:
With PR, importing the TPS demo on my system:
@aaronfranke Not so much of a win in your case. What hardware specs are you running on?
My specs are an i5-6600K CPU (4c/4t), 48 GB RAM, GTX 1070 GPU, and not many open programs while running this test.
That's still a pretty big gain if you compare before/after: it's a 69.5% import time reduction (for
The base metrics do seem high for @aaronfranke, but comparatively I have an i7-8705G(*), and since this is all CPU- and I/O-bound, the difference is likely expected. Another factor might be the OS: I'm running Linux, and if @aaronfranke is on Windows, the infamously slow Windows I/O might be a big factor.
All in all, good results I think :)
(*) Full specs.
@aaronfranke oh, guess it kinda makes sense then, since you are limited by the number of cores and this is CPU-based compression (you got a 3.3x improvement, which is pretty good considering the time it takes to load/save on disk). With an 8-core, 16-thread machine the TPS demo imports in less than 10 seconds.
Thanks!
`_reimport_file` looks like it uses the code at godot/editor/editor_file_system.h (line 151 in db0816e) and godot/editor/editor_file_system.cpp (line 1728 in db0816e), and this broke importing when importing a lot of files: #49324 (comment)
@qarmin I asked reduz about
Also, when playing around with importing assets, here are two other changes that I recommend if you're still seeing hangs or crashes: the first seems to fix some corner case that can cause hangs; the second disables the progress bar, since it leads to a nested main loop which wreaks havoc on multithreaded code.
diff --git a/core/templates/thread_work_pool.h b/core/templates/thread_work_pool.h
index 9f7a692cc5..b242648bc8 100644
--- a/core/templates/thread_work_pool.h
+++ b/core/templates/thread_work_pool.h
@@ -105,7 +105,7 @@ public:
 	}
 	bool is_done_dispatching() const {
-		ERR_FAIL_COND_V(current_work == nullptr, false);
+		ERR_FAIL_COND_V(current_work == nullptr, true);
 		return index.load(std::memory_order_acquire) >= current_work->max_elements;
 	}
diff --git a/editor/progress_dialog.cpp b/editor/progress_dialog.cpp
index 0b6a3798b3..1a89ff4749 100644
--- a/editor/progress_dialog.cpp
+++ b/editor/progress_dialog.cpp
@@ -207,7 +207,7 @@ bool ProgressDialog::task_step(const String &p_task, const String &p_state, int
 		DisplayServer::get_singleton()->process_events();
 	}
-	Main::iteration(); // this will not work on a lot of platforms, so it's only meant for the editor
+	// Main::iteration(); // this will not work on a lot of platforms, so it's only meant for the editor
 	return cancelled;
 }
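One plausible reading of the first hunk, sketched below, is that a caller polls `is_done_dispatching()` in a loop: if the pool has no pending work and the check reports `false`, that loop never exits. This is a trimmed-down model, not Godot's `ThreadWorkPool`. `MiniWorkPool`, `Work`, `wait_until_dispatched`, and the `idle_result` parameter are hypothetical, and the polling caller is an assumption about how the method is used.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical description of a batch of work; not a Godot type.
struct Work {
	uint32_t max_elements;
};

struct MiniWorkPool {
	const Work *current_work = nullptr; // Null when nothing was dispatched.
	std::atomic<uint32_t> index{0}; // Next element a worker would claim.

	// `idle_result` stands for the value returned when no work is pending:
	// false before the suggested change, true after it.
	bool is_done_dispatching(bool idle_result) const {
		if (current_work == nullptr) {
			return idle_result;
		}
		return index.load(std::memory_order_acquire) >= current_work->max_elements;
	}
};

// Polls the pool at most `max_polls` times and reports whether the wait ended.
// A real caller would loop without a bound, which is where a hang could occur.
inline bool wait_until_dispatched(const MiniWorkPool &pool, bool idle_result, int max_polls) {
	for (int i = 0; i < max_polls; ++i) {
		if (pool.is_done_dispatching(idle_result)) {
			return true;
		}
	}
	return false;
}
```

On an idle pool, `wait_until_dispatched(pool, false, 1000)` gives up after all 1000 polls, while the same call with `true` returns on the first poll. Once work is set and the index has reached `max_elements`, both variants behave identically.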