-
Notifications
You must be signed in to change notification settings - Fork 625
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix lib-pthread issues #2410
Fix lib-pthread issues #2410
Conversation
Hi Wenyong, recently we observed a situation when test was running 4 threads but thread manager reported up to 5-6 of them because after pthread_join it still was in thread_manager list for some short time (that's how I got it). So we added retries on the test side. Is the second change fixing it? |
Yes, in lib-pthread mode (not wasi-threads mode), I found that when thread A is exiting in pthread_join_wrapper, it may change its node's status to THREAD_EXIT before calling wasm_cluster_exit_thread to actually exit, and if thread B is joining it and detects the change of thread A node's status, thread B will just return successfully. But at this time thread A may hasn't actually exited, this may cause some unexpected behavior, for example, if thread B is main thread, it may think that all other thread have exited, and it may destroy all resources (module instance, module, runtime, etc.) and exit, but the resources may be still needed by thread A. I just let thread B wait some time after joining to fix it. |
@Zzzabiyaka Maybe you can check whether this PR fixes your test issue and don't add the retries? |
I don't think the issue mentioned by @Zzzabiyaka can actually be fixed. Both
The thing is different for lib-pthread, because it's fully implemented in host, therefore we can confidently wait for the native thread to complete. |
they are actually destroyed to avoid unexpected behavior. */ | ||
os_mutex_lock(&exec_env->wait_lock); | ||
os_cond_reltimedwait(&exec_env->wait_cond, &exec_env->wait_lock, 1000); | ||
os_mutex_unlock(&exec_env->wait_lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this still racy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exec_env->wait_lock
is used singly or together with exec_env->wait_cond
, it won't be occupied by a thread for a long time: when a thread locks exec_env->wait_lock
, it will unlock it after short time operations, or call os_cond_wait/reltimedwait, note that the lock will be unlocked firstly inside os_cond_wait/reltimedwait. My understanding is that here the current thread should be able to acquire the lock.
- Avoid destroying module instance repeatedly in pthread_exit_wrapper and wasm_thread_cluster_exit. - Wait enough time in pthread_join_wrapper for target thread to exit and destroy its resources.
- Avoid destroying module instance repeatedly in pthread_exit_wrapper and wasm_thread_cluster_exit. - Wait enough time in pthread_join_wrapper for target thread to exit and destroy its resources.
wasm_thread_cluster_exit.
destroy its resources.