Skip to content
Permalink
Browse files

arch: POSIX: Fix race with unused threads

Fix a race which seems to have been presenting itself
very sporadically on loaded systems.
The race seems to have caused tests/kernel/sched/schedule_api
to fail at random on native_posix.

The case is a bit convoluted:
When the kernel calls z_new_thread(), the POSIX arch saves
the new thread entry call in that new Zephyr thread stack
together with a bit of extra info for the POSIX arch.
And spawns a new pthread (posix_thread_starter()) which
will eventually (after the Zephyr kernel swapped to it),
call that entry function.
(Note that in principle a thread spawned by pthreads may
be arbitrarily delayed)
The POSIX arch does not try to synchronize to that new
pthread (because why should it) until the first time the
Zephyr kernel tries to swap to that thread.
But, the kernel may never try to swap to it.
And therefore that thread's posix_thread_starter() may never
have got to run before the thread was aborted, and its
Zephyr stack reused for something else by the Zephyr app.

As posix_thread_starter() was relaying on looking into that
thread stack, it may now be looking into another thread stack
or anything else.

So, this commit fixes it by having posix_thread_starter()
get the input it always needs not from the Zephyr stack,
but from its own pthread_create() parameter pointing to a
structure kept by the POSIX arch.

Note that if the thread was aborted before reaching that point
posix_thread_starter() will NOT call the Zephyr thread entry
function, but just cleanup.

With this change all "asynchronous" parts of the POSIX arch
should relay only on the POSIX arch own structures.

Signed-off-by: Alberto Escolar Piedras <alpi@oticon.com>
  • Loading branch information...
aescolar committed Jul 17, 2019
1 parent 1074efd commit fece6907b4852f34f67b2cd9bc58f5d97ad487ac
Showing with 19 additions and 13 deletions.
  1. +19 −13 arch/posix/core/posix_core.c
@@ -72,6 +72,8 @@ struct threads_table_el {
bool running; /* Is this the currently running thread */
pthread_t thread; /* Actual pthread_t as returned by native kernel */
int thead_cnt; /* For debugging: Unique, consecutive, thread number */
/* Pointer to the status kept in the Zephyr thread stack */
posix_thread_status_t *t_status;
};

static struct threads_table_el *threads_table;
@@ -260,11 +262,11 @@ static void posix_cleanup_handler(void *arg)
*/
static void *posix_thread_starter(void *arg)
{
posix_thread_status_t *ptr = (posix_thread_status_t *) arg;
int thread_idx = (intptr_t)arg;

PC_DEBUG("Thread [%i] %i: %s: Starting\n",
threads_table[ptr->thread_idx].thead_cnt,
ptr->thread_idx,
threads_table[thread_idx].thead_cnt,
thread_idx,
__func__);

/*
@@ -286,18 +288,20 @@ static void *posix_thread_starter(void *arg)
pthread_cleanup_push(posix_cleanup_handler, arg);

PC_DEBUG("Thread [%i] %i: %s: After start mutex (hav mut)\n",
threads_table[ptr->thread_idx].thead_cnt,
ptr->thread_idx,
threads_table[thread_idx].thead_cnt,
thread_idx,
__func__);

/*
* The thread would try to execute immediately, so we block it
* until allowed
*/
posix_wait_until_allowed(ptr->thread_idx);
posix_wait_until_allowed(thread_idx);

posix_new_thread_pre_start();

posix_thread_status_t *ptr = threads_table[thread_idx].t_status;

z_thread_entry(ptr->entry_point, ptr->arg1, ptr->arg2, ptr->arg3);

/*
@@ -306,13 +310,13 @@ static void *posix_thread_starter(void *arg)
*/
/* LCOV_EXCL_START */
posix_print_trace(PREFIX"Thread [%i] %i [%lu] ended!?!\n",
threads_table[ptr->thread_idx].thead_cnt,
ptr->thread_idx,
threads_table[thread_idx].thead_cnt,
thread_idx,
pthread_self());


threads_table[ptr->thread_idx].running = false;
threads_table[ptr->thread_idx].state = FAILED;
threads_table[thread_idx].running = false;
threads_table[thread_idx].state = FAILED;

pthread_cleanup_pop(1);

@@ -370,16 +374,18 @@ void posix_new_thread(posix_thread_status_t *ptr)
threads_table[t_slot].state = USED;
threads_table[t_slot].running = false;
threads_table[t_slot].thead_cnt = thread_create_count++;
threads_table[t_slot].t_status = ptr;
ptr->thread_idx = t_slot;

PC_SAFE_CALL(pthread_create(&threads_table[t_slot].thread,
NULL,
posix_thread_starter,
(void *)ptr));
(void *)(intptr_t)t_slot));

PC_DEBUG("created thread [%i] %i [%lu]\n",
PC_DEBUG("%s created thread [%i] %i [%lu]\n",
__func__,
threads_table[t_slot].thead_cnt,
ptr->thread_idx,
t_slot,
threads_table[t_slot].thread);

}

0 comments on commit fece690

Please sign in to comment.
You can’t perform that action at this time.