Workaround bmalloc hang with RT thread priorities. #1435

Merged
magomez merged 1 commit into wpe-2.38 from pgorszkowski/2.38/fix-bmalloc-hang-with-rt-thread-priorities
Dec 11, 2024

@pgorszkowski-igalia commented Dec 11, 2024

To enable the usleep() workaround, set the environment variable WEBKIT_WPE_BMALLOC_MICROSECONDS_SLEEP to the number of microseconds to sleep.

When real-time (RT) thread priorities are used for some of the GStreamer pipeline elements, several RT threads can start spinning while trying to acquire a mutex. Most other threads are then unable to run, which hangs the system.

Sequence of events leading up to the hang:

  1. A web process thread acquires the mutex lock for the heap and is then involuntarily descheduled; it does not run again
  2. vqueue:src (RT priority) enters the lockSlowCase and starts spinning in the while loop
  3. multiqueue0:src (instance 1, RT priority) enters the lockSlowCase and starts spinning in the while loop
  4. aqueue:src (RT priority) enters the lockSlowCase and starts spinning in the while loop
  5. multiqueue0:src (instance 2, RT priority) enters the lockSlowCase and starts spinning in the while loop

Once step 5 is reached, the box is hung, because the only things that can run on a CPU core are:

  1. one of the above RT threads (aqueue, vqueue, or multiqueue)
  2. any other RT thread with a priority equal to or greater than that of the above RT threads
  3. any hardware IRQ

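Under the RT scheduling policies, sched_yield() only hands the CPU to runnable threads of the same priority, so the spinning RT threads never give way to the lock holder. Calling usleep() instead blocks the spinning thread, which lets the low-priority thread run and release the mutex lock, avoiding the hang.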
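
The idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code merged in this PR: the helper names `parseSleepMicroseconds`, `sleepMicroseconds` and `yieldOrSleep` are hypothetical, and the clamping and parsing rules are assumptions made for the example. Only the environment variable name, `usleep()` and `sched_yield()` come from the description.

```cpp
#include <cstdlib>
#include <sched.h>
#include <unistd.h>

// Hypothetical helper: turn the environment value into a sleep duration.
// Returns 0 for null, empty, or non-numeric input, which selects sched_yield().
static unsigned parseSleepMicroseconds(const char* value)
{
    if (!value || *value < '0' || *value > '9')
        return 0;
    char* end = nullptr;
    unsigned long parsed = std::strtoul(value, &end, 10);
    if (*end != '\0')
        return 0;
    // Assumption: clamp below one second, since POSIX requires the
    // usleep() argument to be less than 1,000,000.
    if (parsed > 999999UL)
        parsed = 999999UL;
    return static_cast<unsigned>(parsed);
}

// Read the variable once and cache the result for later calls.
static unsigned sleepMicroseconds()
{
    static const unsigned cached =
        parseSleepMicroseconds(std::getenv("WEBKIT_WPE_BMALLOC_MICROSECONDS_SLEEP"));
    return cached;
}

// Hypothetical stand-in for the yield step of a spin loop: sleep when a
// duration is configured, otherwise keep the old sched_yield() behavior.
static void yieldOrSleep()
{
    if (unsigned microseconds = sleepMicroseconds())
        usleep(microseconds);
    else
        sched_yield();
}
```

Reading the variable once keeps the lock slow path cheap. The trade-off is that a change to the variable after the first call is not picked up.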

Author of issue analysis and fix proposal: Steven Webster.

  • Source/bmalloc/bmalloc/Mutex.cpp:
    (bmalloc::yield):
    8782ef0
Build-Tests Layout-Tests
❌ 🛠 wpe-amd64-build ❌ 🧪 wpe-amd64-layout
❌ 🛠 wpe-arm32-build ❌ 🧪 wpe-arm32-layout

@pgorszkowski-igalia

This is a reimplementation of the PR #1408, with an added environment variable, WEBKIT_WPE_BMALLOC_MICROSECONDS_SLEEP, which sets how many microseconds are passed to usleep(). If WEBKIT_WPE_BMALLOC_MICROSECONDS_SLEEP is not set, or is set to 0, then the old implementation (sched_yield) is used.

@pgorszkowski-igalia pgorszkowski-igalia force-pushed the pgorszkowski/2.38/fix-bmalloc-hang-with-rt-thread-priorities branch from ed956e9 to 8782ef0 Compare December 11, 2024 09:53
@magomez magomez merged commit 1667f9e into wpe-2.38 Dec 11, 2024
@magomez magomez deleted the pgorszkowski/2.38/fix-bmalloc-hang-with-rt-thread-priorities branch December 11, 2024 09:59