arch: POSIX: Fix race with unused threads
Fix a race that seems to have shown up very sporadically on loaded systems. It appears to have caused tests/kernel/sched/schedule_api to fail at random on native_posix.

The case is a bit convoluted:
When the kernel calls z_new_thread(), the POSIX arch saves the new thread's entry call in that new Zephyr thread's stack, together with a bit of extra info for the POSIX arch. It then spawns a new pthread (posix_thread_starter()) which will eventually, after the Zephyr kernel has swapped to it, call that entry function. (Note that in principle a thread spawned by pthreads may be arbitrarily delayed.)
The POSIX arch does not try to synchronize with that new pthread (there is no reason it should) until the first time the Zephyr kernel tries to swap to that thread. But the kernel may never try to swap to it. Therefore that thread's posix_thread_starter() may never have run before the thread was aborted and its Zephyr stack reused for something else by the Zephyr app.
As posix_thread_starter() relied on looking into that thread's stack, it may now be looking into another thread's stack or anything else.

So this commit fixes it by having posix_thread_starter() get the input it always needs not from the Zephyr stack, but from its own pthread_create() parameter, which points to a structure kept by the POSIX arch.

Note that if the thread was aborted before reaching that point, posix_thread_starter() will NOT call the Zephyr thread entry function, but will just clean up.

With this change, all "asynchronous" parts of the POSIX arch should rely only on the POSIX arch's own structures.

Signed-off-by: Alberto Escolar Piedras <firstname.lastname@example.org>
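
The pattern described above can be sketched in plain pthreads. This is only an illustration, not the code from this commit: struct arch_thread_rec, thread_starter(), demo_entry() and run_demo() are hypothetical names, and the record lives in the caller's frame here for brevity, whereas the commit keeps it in a structure owned by the POSIX arch. The point it shows is that the starter reads everything it needs from its pthread_create() parameter, and skips the entry function if the thread was aborted before it was ever swapped in.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*entry_fn_t)(void *);

/* Hypothetical arch-owned record: everything the starter needs lives here,
 * not in the (possibly reused) Zephyr thread stack. */
struct arch_thread_rec {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool released;   /* set once the thread is swapped in or aborted */
	bool aborted;    /* set if the thread was aborted before it ever ran */
	entry_fn_t entry;
	void *arg;
};

/* Stand-in for posix_thread_starter(): uses only its own parameter. */
static void *thread_starter(void *param)
{
	struct arch_thread_rec *rec = param;
	bool aborted;

	pthread_mutex_lock(&rec->lock);
	while (!rec->released) {
		pthread_cond_wait(&rec->cond, &rec->lock);
	}
	aborted = rec->aborted;
	pthread_mutex_unlock(&rec->lock);

	if (!aborted) {
		rec->entry(rec->arg);   /* normal case: run the thread entry */
	}
	return NULL;                    /* aborted case: just clean up */
}

/* Hypothetical thread entry: counts how many times it was called. */
static void demo_entry(void *arg)
{
	*(int *)arg += 1;
}

/* Spawns one starter, then either "swaps to" it or aborts it.
 * Returns how many times the entry function ran. */
int run_demo(bool abort_before_start)
{
	int calls = 0;
	struct arch_thread_rec rec = { .entry = demo_entry, .arg = &calls };
	pthread_t tid;
	int rc;

	pthread_mutex_init(&rec.lock, NULL);
	pthread_cond_init(&rec.cond, NULL);

	rc = pthread_create(&tid, NULL, thread_starter, &rec);
	assert(rc == 0);
	(void)rc;

	/* The "Zephyr stack" could be reused here without affecting the
	 * starter, since it never looks at it. */
	pthread_mutex_lock(&rec.lock);
	rec.aborted = abort_before_start;
	rec.released = true;
	pthread_cond_signal(&rec.cond);
	pthread_mutex_unlock(&rec.lock);

	pthread_join(tid, NULL);
	pthread_cond_destroy(&rec.cond);
	pthread_mutex_destroy(&rec.lock);
	return calls;
}
```

In this sketch run_demo(false) runs the entry function once, while run_demo(true) never calls it, mirroring the "aborted before reaching that point" case in the message above.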