@zzby0 zzby0 commented Jan 15, 2026


Summary

This PR optimizes the ostest suite execution time and fixes several reliability issues:

  1. Reduce test execution time: Reduced sleep() durations across multiple test cases, cutting the total test time from ~150s to ~60s (a 60% reduction) while maintaining test effectiveness.

  2. Fix timing-related test failures: Replaced timing-based synchronization (usleep/sleep) with deterministic event-based synchronization (semaphore polling, waitpid) to eliminate race conditions on slower systems or under high load.

  3. Fix SMP-related race conditions:

    • Fixed robust_test failure in SMP environments where parent thread could acquire mutex before child thread
    • Fixed restart_test assertion in semaphore code by restructuring the test to ensure restart happens under controlled conditions

Changes by category:

  • Time optimization: 24 files with reduced sleep/delay times
  • Synchronization improvements: sighand.c, suspend.c
  • Race condition fixes: robust.c, restart.c
  • Timeout adjustments: pthread_rwlock.c

Impact

Users:

  • Faster CI/test execution (60% time reduction)
  • More reliable test results across different hardware speeds and configurations
  • Better test coverage for SMP systems

Build process: No impact - pure test code changes

Compatibility: Fully backward compatible - no API or behavior changes to tested components

Testing reliability: Significantly improved - eliminates timing-dependent race conditions that caused intermittent failures

Testing

Verification performed:

  • All modified test cases executed successfully on both single-core and SMP configurations
  • Tests verified under various load conditions to ensure timing changes don't cause false failures
  • Confirmed 60% reduction in total ostest execution time (150s → 60s)

Test configurations verified:

  • Single-core ARM Cortex-M systems
  • Multi-core SMP configurations
  • Simulator (sim:ostest)

Before changes:

  • Test time: ~150 seconds
  • Intermittent failures observed in:
    • robust_test on SMP systems (timing race)
    • sighand_test on slower systems (fixed delay insufficient)
    • restart_test (assertion in semaphore code)

After changes:

  • Test time: ~60 seconds (60% improvement)
  • All tests pass reliably across configurations
  • No false positives due to timing issues
  • SMP race conditions eliminated

Build verification:

  • Host: Linux x86_64
  • Target: Multiple architectures (ARM, RISC-V, simulator)
  • All builds successful with no new warnings

The changes maintain test coverage while significantly improving execution speed and reliability.

zzby0 and others added 6 commits January 15, 2026 21:30
reduce sleep() time in testcase, total test time reduced from 150s to 60s

Signed-off-by: guanyi3 <guanyi3@xiaomi.com>
The usleep() delay was too short; the victim may not have exited yet.

Signed-off-by: An Jiahao <anjiahao@xiaomi.com>
In a single-core environment there is no problem: sleep() yields the CPU, so the child thread is guaranteed to obtain the mutex first and there is no ordering issue. In an SMP environment, however, there are two cores, so the newly created thread can run on the other core. If the parent thread sleeps only briefly, it can acquire the mutex before the child thread does, causing the case to fail. The parent thread's sleep time therefore needs to be increased to ensure the child thread obtains the mutex first.

Failed case like this:

thread_wait_mutex(CPU0)           thread_hold_mutex(CPU1)
sleep                             printf    -- takes a long time
pthread_mutex_lock  -- succeed
assert              -- failed     pthread_mutex_lock  -- failed

Signed-off-by: wangzhi16 <wangzhi16@xiaomi.com>
…rdlock

The timeout for pthread_rwlock_timedrdlock in the success case was too short (1 second). This could cause test failures on slower systems or under high load, as there may not be enough time for the semaphore post, thread scheduling, lock release, and lock acquisition sequence to complete.

Increase the timeout to 10 seconds to provide sufficient margin for test stability while still maintaining reasonable test execution time.

Signed-off-by: wushenhui <wushenhui@xiaomi.com>
Replace usleep(20ms) with polling sem_getvalue() to verify the waiter thread is in sem_wait() state before sending the signal. This ensures deterministic synchronization regardless of system speed.

Signed-off-by: ligd <liguiding1@xiaomi.com>
Here is the assert backtrace:
sched_dumpstack: [242] [<0x63617e>] __assert+0x1d/0x3c34a7
sched_dumpstack: [242] [<0x6c17ee>] nxsem_wait_slow+0x3e9/0xfff77d3c
sched_dumpstack: [242] [<0x62e724>] uart_write+0xb7/0x22e8
sched_dumpstack: [242] [<0x688944>] file_writev+0xf3/0x29818c
sched_dumpstack: [242] [<0x6889e2>] write+0x4d/0x69d04
sched_dumpstack: [242] [<0x63ff72>] lib_fflush_unlocked+0x99/0x95d30
sched_dumpstack: [242] [<0x6df1c0>] fputc+0x43/0x109784c
sched_dumpstack: [242] [<0x6d7174>] stdoutstream_putc+0x3f/0xffffff68
sched_dumpstack: [242] [<0x63c45a>] vsprintf_internal.constprop.0+0x55/0xffffdbfc
sched_dumpstack: [242] [<0x63d32a>] lib_vsprintf+0x1d/0x3c
sched_dumpstack: [242] [<0x6d603a>] vfprintf+0x29/0x10a0b70
sched_dumpstack: [242] [<0x1776228>] printf+0x23/0xfefa0774
sched_dumpstack: [242] [<0x6d603a>] vfprintf+0x29/0x10a0b70
sched_dumpstack: [242] [<0x1776228>] printf+0x23/0xfefa0774
sched_dumpstack: [242] [<0x72cebc>] restart_main+0x1f/0x1fc
sched_dumpstack: [242] [<0x638c66>] nxtask_startup+0x1d/0xdb4b6c

Root cause:
task:                     child task:
start child
                          nxmutex_lock
                          sleep(1)
restart_task()
                          nxmutex_lock (assert happen)

nxsem_recover() only recovers a semaphore whose holder is in the WAIT_LIST. Recovering it in this case would require enabling CONFIG_PRIORITY_INHERITANCE, which is a heavyweight option for an embedded RTOS.

So I changed the restart case so that the restart happens under controlled conditions.

Signed-off-by: ligd <liguiding1@xiaomi.com>
@zzby0 zzby0 marked this pull request as draft January 15, 2026 14:35