Engine tree closed on Linux host_debug_impeller_vulkan #138028

Closed
bdero opened this issue Nov 7, 2023 · 5 comments · Fixed by flutter/engine#47781
Labels
engine flutter/engine repository. See also e: labels. P0 Critical issues such as a build break or regression

Comments

@bdero
Member

bdero commented Nov 7, 2023

Starting with flutter/engine#47678, all post submit CI runs for Linux host_debug_impeller_vulkan are failing.

Runs fail consistently somewhere between 8.5 and 10 minutes in, during seemingly random Vulkan playgrounds. gtest always exits with -6, with no additional output indicating a problem.

Sample:
https://logs.chromium.org/logs/flutter/buildbucket/cr-buildbucket/8765060191207742689/+/u/test:_Host_Tests_for_host_debug_impeller_vulkan/stdout
https://ci.chromium.org/ui/p/flutter/builders/prod/Linux%20Production%20Engine%20Drone/455462/overview

@bdero bdero added engine flutter/engine repository. See also e: labels. P0 Critical issues such as a build break or regression labels Nov 7, 2023
@jason-simmons
Member

Reproduced it with a local engine build and captured a core dump and a stack trace:

#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0)
    at ./nptl/pthread_kill.c:44
#1  0x00007feb21b7c15f in __pthread_kill_internal (signo=6, threadid=<optimized out>)
    at ./nptl/pthread_kill.c:78
#2  0x00007feb21b2e472 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007feb21b184b2 in __GI_abort () at ./stdlib/abort.c:79
#4  0x00007feb1b6dd6d4 in std::_LIBCPP_ABI_NAMESPACE::__throw_system_error (ev=11, 
    what_arg=0x7feb1b097025 "__thread_specific_ptr construction failed")
    at ../../third_party/libcxx/src/system_error.cpp:294
#5  0x00007feb1b6dde08 in std::_LIBCPP_ABI_NAMESPACE::__thread_specific_ptr<std::_LIBCPP_ABI_NAMESPACE::__thread_struct>::__thread_specific_ptr (
    this=0x7feb1be52e58 <std::_LIBCPP_ABI_NAMESPACE::__thread_local_data()::__b>)
    at ../../third_party/libcxx/include/thread:186
#6  0x00007feb1b6dda22 in std::_LIBCPP_ABI_NAMESPACE::__thread_local_data ()
    at ../../third_party/libcxx/src/thread.cpp:123
#7  0x00007feb1b542488 in std::_LIBCPP_ABI_NAMESPACE::__thread_proxy[abi:v15000]<std::_LIBCPP_ABI_NAMESPACE::tuple<std::_LIBCPP_ABI_NAMESPACE::unique_ptr<std::_LIBCPP_ABI_NAMESPACE::__thread_struct, std::_LIBCPP_ABI_NAMESPACE::default_delete<std::_LIBCPP_ABI_NAMESPACE::__thread_struct> >, marl::Thread::Impl::Impl(marl::Thread::Affinity&&, std::_LIBCPP_ABI_NAMESPACE::function<void ()>&&)::{lambda()#1}> >(void*) (__vp=0x1ecb2a)
    at ../../third_party/libcxx/include/thread:293
#8  0x00007feb21b7a3ec in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:444
#9  0x00007feb21bfaa4c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

@bdero
Member Author

bdero commented Nov 7, 2023

@jonahwilliams speculatively reverted flutter/engine#47678

@bdero
Member Author

bdero commented Nov 7, 2023

The tree is open

@bdero bdero closed this as completed Nov 7, 2023
@jason-simmons
Member

This test suite was failing because of a resource leak in libcxx that happens each time the test process loads and unloads SwiftShader.

Each time an Impeller Vulkan playground test case runs, it creates an impeller::ContextVK that owns a Vulkan instance. The ContextVK and its Vulkan instance are deleted when the test case completes.

The SwiftShader library will be loaded on demand when the first Vulkan instance in the process is created. If the test destroys the Vulkan instance and no other instances exist, then the SwiftShader library will be unloaded.

The SwiftShader library contains a static instance of marl::Scheduler. When the test creates a Vulkan device, the Scheduler is lazily initialized. Doing this apparently obtains a thread-local storage key. But it appears that libcxx is not deleting this key when the library is unloaded and the library's statics are cleaned up.
(see https://flutter.googlesource.com/third_party/libcxx/+/bc553e116d25590f97f545486b5739e3aea2cf7d/include/thread#187)

So a TLS key is leaked each time a test case runs and does a load/unload cycle of SwiftShader. If enough test cases run within the suite, the process reaches the PTHREAD_KEYS_MAX limit (1024 in this Linux environment) and fails.
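
To illustrate the failure mode outside of the engine, here is a minimal sketch that creates POSIX TLS keys without deleting them until pthread_key_create fails. It is not the libcxx or SwiftShader code; the names ExhaustionResult and LeakKeysUntilFailure are made up for this example. POSIX specifies EAGAIN for this failure, and on Linux EAGAIN is 11, which matches the ev=11 passed to __throw_system_error in the stack trace above.

```cpp
#include <pthread.h>

#include <cerrno>
#include <vector>

// Hypothetical helper type for this sketch only.
struct ExhaustionResult {
  int keys_created;  // keys obtained before the first failure
  int error;         // error code returned by the failing call (0 if none)
};

// Creates TLS keys without deleting them until pthread_key_create fails,
// mimicking a leak of one key per library load/unload cycle.
ExhaustionResult LeakKeysUntilFailure() {
  constexpr int kSafetyCap = 1 << 20;  // guard against an unbounded loop
  std::vector<pthread_key_t> keys;
  int error = 0;
  while (static_cast<int>(keys.size()) < kSafetyCap) {
    pthread_key_t key;
    error = pthread_key_create(&key, nullptr);
    if (error != 0) {
      break;  // the per-process key limit has been reached
    }
    keys.push_back(key);
  }
  ExhaustionResult result{static_cast<int>(keys.size()), error};
  // Delete the keys again so the rest of the process can still use TLS.
  for (pthread_key_t key : keys) {
    pthread_key_delete(key);
  }
  return result;
}
```

Calling LeakKeysUntilFailure() in a fresh process should report a key count at or slightly below the platform limit, since the runtime may already hold a few keys of its own. In the test suite the same limit was reached gradually, one leaked key per test case.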

jason-simmons added a commit to jason-simmons/flutter_engine that referenced this issue Nov 8, 2023
…vent SwiftShader from being unloaded after a test completes

Libcxx is leaking a thread-local storage key each time SwiftShader is loaded
and unloaded.  If a test's Vulkan instance is the only one in the process,
then SwiftShader will be unloaded after the test ends.  If many Vulkan
playground tests run in a suite, then eventually the leak will cause the
process to exceed its limit of TLS keys and the suite will fail.

The process can ensure that SwiftShader remains loaded by holding another
Vulkan instance that persists across all tests in the suite.

Fixes flutter/flutter#138028
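
The idea behind the fix can be sketched with a toy model. This is not Vulkan, SwiftShader, or the code from the commit: FakeDriver, Instance, and RunSuite are hypothetical names, and the only assumption modeled is the one described above, namely that the library is loaded when the first instance is created and unloaded when the last one is destroyed, leaking one TLS key per cycle.

```cpp
#include <memory>

// Hypothetical stand-in for a driver library that is loaded when the first
// instance is created and unloaded when the last instance is destroyed.
class FakeDriver {
 public:
  // Stand-in for a Vulkan instance owned by a test's context.
  class Instance {
   public:
    explicit Instance(FakeDriver& driver) : driver_(driver) {
      // The first live instance triggers a (simulated) library load.
      if (driver_.live_instances_++ == 0) {
        ++driver_.load_cycles_;
      }
    }
    ~Instance() { --driver_.live_instances_; }
    Instance(const Instance&) = delete;
    Instance& operator=(const Instance&) = delete;

   private:
    FakeDriver& driver_;
  };

  // Each load cycle models one leaked TLS key.
  int load_cycles() const { return load_cycles_; }

 private:
  int live_instances_ = 0;
  int load_cycles_ = 0;
};

// Simulates a suite of `test_count` test cases, each creating and destroying
// its own instance. Returns the number of load cycles the suite caused.
int RunSuite(int test_count, bool hold_keepalive_instance) {
  FakeDriver driver;
  std::unique_ptr<FakeDriver::Instance> keepalive;
  if (hold_keepalive_instance) {
    // Extra instance that outlives every test, so the library stays loaded.
    keepalive = std::make_unique<FakeDriver::Instance>(driver);
  }
  for (int i = 0; i < test_count; ++i) {
    FakeDriver::Instance per_test_instance(driver);
  }
  return driver.load_cycles();
}
```

In this model, RunSuite(2000, false) causes 2000 load cycles, which would exceed a 1024-key limit, while RunSuite(2000, true) causes exactly one.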
bdero pushed a commit to flutter/engine that referenced this issue Nov 8, 2023
…vent SwiftShader from being unloaded after a test completes (#47781)

Libcxx is leaking a thread-local storage key each time SwiftShader is
loaded and unloaded. If a test's Vulkan instance is the only one in the
process, then SwiftShader will be unloaded after the test ends. If many
Vulkan playground tests run in a suite, then eventually the leak will
cause the process to exceed its limit of TLS keys and the suite will
fail.

The process can ensure that SwiftShader remains loaded by holding
another Vulkan instance that persists across all tests in the suite.

Fixes flutter/flutter#138028
auto-submit bot pushed a commit to flutter/engine that referenced this issue Nov 8, 2023
Reland of #47432

Also includes:

#47617
#47637

Fixes the performance on iOS by removing blocking on compilation of shaders. From local testing this has identical before/after numbers. Additionally, ensures that we don't unnecessarily specialize vertex shaders and notes this restriction in the documentation.

Adds support for Specialization constants to Impeller for our usage in the engine. A motivating example has been added in the impeller markdown docs.

Fixes flutter/flutter#136210
Fixes flutter/flutter#119357

Investigating: flutter/flutter#138028

This thread has been automatically locked since there has not been any recent activity after it was closed. If you are still experiencing a similar issue, please open a new bug, including the output of flutter doctor -v and a minimal reproduction of the issue.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 22, 2023