Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add thirdparty library etcpak for faster imports. #47370

Merged
merged 1 commit into from
Apr 12, 2021

Conversation

fire
Copy link
Member

@fire fire commented Mar 25, 2021

Add etcpak library for faster ETC/ETC2/S3TC imports.

  • etc module was renamed to etcpak and modified to use the new library.
  • PKM importer is removed in the process. Could be re-added if still useful?
  • Old library etc2comp is removed.
  • S3TC compression no longer done via squish (but decompression still is).
  • Slight modifications to etcpak sources for MinGW compatibility,
    to fix LLVM -Wc++11-narrowing errors, and to allow using vendored or
    system libpng.

Squish and etc modules have been modified.

Modifications for mingw.

Some times from TPS sample.

ook ook ook 🍌

Please help measure if this is faster or not.

ETCPAK encode took 74 ms
Encoding format: ETC2_RA_AS_RG
ETCPAK encode took 60 ms
Encoding format: DXT5 RGBA8
ETCPAK encode took 74 ms
Encoding format: ETC2_RGB8
ETCPAK encode took 130 ms
Encoding format: DXT1 RGB8
ETCPAK encode took 7 ms
Encoding format: ETC
ETCPAK encode took 30 ms
Encoding format: FORMAT_DXT5_RA_AS_RG
ETCPAK encode took 9 ms
Encoding format: ETC2_RA_AS_RG
ETCPAK encode took 135 ms
Encoding format: DXT5 RGBA8
ETCPAK encode took 8 ms
Encoding format: ETC2_RGB8

@fire fire requested a review from reduz March 25, 2021 20:15
@fire fire force-pushed the etcpak branch 3 times, most recently from 046e3ed to 9232a58 Compare March 25, 2021 21:16
@Calinou Calinou added this to the 4.0 milestone Mar 25, 2021
@fire fire force-pushed the etcpak branch 8 times, most recently from f3e9d0b to 4dbdedf Compare March 25, 2021 22:04
@fire
Copy link
Member Author

fire commented Mar 26, 2021

From 64 8k texture load test, in a 1 minute period 37 seconds is get_pixel and 13 seconds is texture compression (this pr),

https://github.com/fire/texture-import-godot-project/tree/main/peacock

@fire fire force-pushed the etcpak branch 9 times, most recently from 8d38b14 to 5e89ec1 Compare March 27, 2021 14:44
@fire
Copy link
Member Author

fire commented Mar 27, 2021

Added some flags:

env.Prepend(CCFLAGS=["-march=skylake"])

Edited 2021-03-27:
Removed simd optimizations from osx.

VRAM ETCPAK
res://peacock/japanese_shiny_embroidered_peacock_fabric_basecolor - Copy (2) - Copy - Copy - Copy - Copy.png import took 4.275 seconds

Lossless PNG
res://peacock/japanese_shiny_embroidered_peacock_fabric_basecolor - Copy - Copy (2) - Copy - Copy.png import took 7.441 seconds

Before ETC2 and DXT5

ETC: Begin encoding, format: ETC2_RGB8
ETC: Time encoding: 22420
res://peacock/japanese_shiny_embroidered_peacock_fabric_basecolor - Copy - Copy (2) - Copy.png import took 32.629 seconds

After DXT5 and ETC2

Encoding format: DXT5 RGBA8
ETCPAK encode took 333 ms
Encoding format: ETC2_RGB8
ETCPAK encode took 632 ms
res://peacock/japanese_shiny_embroidered_peacock_fabric_basecolor - Copy - Copy (2) - Copy - Copy.png import took 3.031 seconds
Encoding format: DXT5 RGBA8
ETCPAK encode took 331 ms
Encoding format: ETC2_RGB8
ETCPAK encode took 626 ms
res://peacock/japanese_shiny_embroidered_peacock_fabric_basecolor - Copy - Copy (3) - Copy - Copy.png import took 2.848 seconds
Encoding format: DXT5 RGBA8
ETCPAK encode took 329 ms
Encoding format: ETC2_RGB8
ETCPAK encode took 625 ms
res://peacock/japanese_shiny_embroidered_peacock_fabric_basecolor - Copy - Copy (3) - Copy.png import took 2.815 seconds

@fire fire force-pushed the etcpak branch 4 times, most recently from 78fb320 to 5a7102d Compare March 27, 2021 17:35
@akien-mga
Copy link
Member

akien-mga commented Apr 12, 2021

Quick test of importing all assets of the TPS demo (note: includes some DAEs and glTFs too):

  • Before this PR: 6m30s
  • After this PR: 1m13s

Example:

  • Before:
res://enemies/red_robot/textures/red_robot_albedo.png import took 2.871 seconds
res://enemies/red_robot/textures/red_robot_emission.png import took 1.115 seconds
res://enemies/red_robot/textures/red_robot_normal.png import took 3.756 seconds
res://enemies/red_robot/textures/red_robot_orm.png import took 3.275 seconds
  • After:
res://enemies/red_robot/textures/red_robot_albedo.png import took 0.659 seconds
res://enemies/red_robot/textures/red_robot_emission.png import took 0.489 seconds
res://enemies/red_robot/textures/red_robot_normal.png import took 0.869 seconds
res://enemies/red_robot/textures/red_robot_orm.png import took 0.797 seconds

So there's already a pretty significant gain on this very unscientific test. This can likely be improved further by ensuring that we properly use the available SSE4.1 / AVX2 code paths on compatible CPUs.

- `etc` module was renamed to `etcpak` and modified to use the new library.
- PKM importer is removed in the process, it's obsolete.
- Old library `etc2comp` is removed.
- S3TC compression no longer done via `squish` (but decompression still is).
- Slight modifications to etcpak sources for MinGW compatibility,
  to fix LLVM `-Wc++11-narrowing` errors, and to allow using vendored or
  system libpng.

Co-authored-by: Rémi Verschelde <rverschelde@gmail.com>
@akien-mga
Copy link
Member

Thanks!

@fire fire deleted the etcpak branch April 12, 2021 23:10
@fire
Copy link
Member Author

fire commented Apr 13, 2021

#elif !defined(__APPLE__) && !defined(__MINGW32__ )
    pthread_setname_np( thread.native_handle(), name );
#endif

LLVM + MINGW still requires the __MINGW32__ check.

@akien-mga
Copy link
Member

#elif !defined(__APPLE__) && !defined(__MINGW32__ )
    pthread_setname_np( thread.native_handle(), name );
#endif

LLVM + MINGW still requires the __MINGW32__ check.

That doesn't seem like a correct fix. What's the error exactly?

This API is not compiler dependent, it's likely related to whether your MinGW distro is configured to use pthread or another thread API. Typically MinGW-GCC distros come with winpthreads, there might be a switch or a package you need for mingw-llvm.

@akien-mga
Copy link
Member

akien-mga commented Apr 13, 2021

This is the proper fix:

diff --git a/thirdparty/etcpak/System.cpp b/thirdparty/etcpak/System.cpp
index a09b289cb2..aee8483b36 100644
--- a/thirdparty/etcpak/System.cpp
+++ b/thirdparty/etcpak/System.cpp
@@ -1,5 +1,5 @@
 #include <algorithm>
-#ifdef _WIN32
+#ifdef _MSC_VER
 #  include <windows.h>
 #else
 #  include <pthread.h>

Edit: Well part of it, more s/_WIN32/_MSC_VER/g needed, working on it.

@akien-mga
Copy link
Member

If anyone is interested, I did some work to backport this change (and follow-ups) to the 3.x branch, which could help speed import times significantly:

https://github.com/akien-mga/godot/commits/3.x-etcpak

It's WIP and I don't plan to work further on it as it turns out etcpak doesn't support all formats needed for GLES2 support in the 3.x branch, so a hybrid solution would be needed (keeping the old importers for the formats that etcpak doesn't support).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants