
display: enable offloading render process to background thread #40

Conversation

danieldegrasse
Contributor

Description of the feature or fix

Enable offloading of the display_write call to a background thread for color displays. This feature is opt-in, as it may not offer significant performance gains for every display pipeline.

When enabled, display_write and lv_disp_flush_ready will be called from a background thread. This means that while the display driver waits for the hardware to write out the framebuffer, the LVGL rendering thread will not be blocked.
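For context, a minimal sketch of the pattern this PR introduces, using the names that appear in the discussion below (flush_queue, lvgl_flush_thread_entry); the flush_request layout, queue depth, and stack size are assumptions for illustration, not the PR's exact code:

```c
#include <zephyr/kernel.h>
#include <zephyr/drivers/display.h>
#include <lvgl.h>

/* Hypothetical request layout; the actual PR may differ. */
struct flush_request {
	const struct device *display;
	uint16_t x;
	uint16_t y;
	struct display_buffer_descriptor desc;
	void *data;
	lv_disp_drv_t *disp_drv;
};

K_MSGQ_DEFINE(flush_queue, sizeof(struct flush_request), 2, 4);

static void lvgl_flush_thread_entry(void *p1, void *p2, void *p3)
{
	struct flush_request req;

	while (1) {
		k_msgq_get(&flush_queue, &req, K_FOREVER);
		/* Blocks on the display hardware here, without holding
		 * up the LVGL rendering thread.
		 */
		display_write(req.display, req.x, req.y, &req.desc, req.data);
		lv_disp_flush_ready(req.disp_drv);
	}
}

/* Runs at a higher priority (lower number) than the rendering thread. */
K_THREAD_DEFINE(lvgl_flush, 1024, lvgl_flush_thread_entry,
		NULL, NULL, NULL, 0, 0, 0);
```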

@jakkra

jakkra commented Jul 2, 2023

Hi @danieldegrasse, I have tried this out but ran into some issues. How are you calling lv_task_handler in your testing code (what thread and priority)?

My issue is that I'm running lv_task_handler from the system workqueue, and I use LVGL only from the system workqueue (LVGL is not thread safe). As I understand the proposed solution, lv_task_handler has to run from a preemptible thread that is preempted directly after

k_msgq_put(&flush_queue, request, K_FOREVER);

by lvgl_flush_thread_entry. Otherwise, what I see is only a ping-pong between the thread running lv_task_handler and the new lvgl_flush_thread_entry, with no gain from double buffering (same FPS as with a single buffer).
The way I "fixed" it was by adding a k_yield(); after k_msgq_put to let lvgl_flush_thread_entry run before rendering into the next buffer, and got about 50% more FPS with this PR.
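Concretely, the workaround lands in the LVGL flush callback right after the queue put; this sketch continues the hypothetical names from the description above, and display_dev is an assumed binding, not part of the PR:

```c
/* Assumed bound elsewhere, e.g. DEVICE_DT_GET(DT_CHOSEN(zephyr_display)). */
static const struct device *display_dev;

static void lvgl_flush_cb(lv_disp_drv_t *disp_drv, const lv_area_t *area,
			  lv_color_t *color_p)
{
	struct flush_request request = {
		.display = display_dev,
		.x = area->x1,
		.y = area->y1,
		.desc = {
			.width = lv_area_get_width(area),
			.height = lv_area_get_height(area),
			.pitch = lv_area_get_width(area),
			.buf_size = lv_area_get_size(area) * sizeof(lv_color_t),
		},
		.data = color_p,
		.disp_drv = disp_drv,
	};

	k_msgq_put(&flush_queue, &request, K_FOREVER);
	/* Workaround: yield so lvgl_flush_thread_entry can start the
	 * display_write before we render into the next buffer; without
	 * this, an equal-priority caller ping-pongs with the flush
	 * thread and double buffering gains nothing.
	 */
	k_yield();
}
```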

Another thing I thought about that may be a possible issue (please correct me if I'm wrong):
Say the display driver uses SPI. For many SPI displays you need more than one SPI transaction per write. For example: https://github.com/zephyrproject-rtos/zephyr/blob/f6eeb9dc84bec63dea231a49e97992875b659614/drivers/display/display_ili9xxx.c#L111C1-L111C1
Often the driver first sets X and Y and only then sends the pixel bytes (including separate SPI transactions for each command, as Zephyr does not support hardware control of the DCX pin). So there may be 5+ SPI transactions before the large chunk of data is actually sent.
My concern is that display_write, while waiting for one of those small SPI transactions to finish, may get "preempted" (while waiting on the SPI-completion mutex), and we may get back to rendering into the other buffer before the big chunk of data is actually being transmitted. Then we lose the purpose of double buffering again.

@danieldegrasse
Contributor Author

danieldegrasse commented Jul 3, 2023

Hi @danieldegrasse, I have tried this out but ran into some issues. How are you calling lv_task_handler in your testing code (what thread and priority)?

Currently from the main thread, which should use priority 0. I am testing this with the LVGL benchmarking and widgets samples on an RT1060 EVK. I see what you mean that k_yield is needed when using a cooperative thread, but as I mention below, I think this feature will only work correctly if the thread running LVGL rendering is preemptible. Do you think adding documentation of that within the Kconfig help strings would be useful?

My concern is that display_write, while waiting for one of those small SPI transactions to finish, may get "preempted" (while waiting on the SPI-completion mutex), and we may get back to rendering into the other buffer before the big chunk of data is actually being transmitted. Then we lose the purpose of double buffering again.

display_write being preempted is OK, and frankly desirable (we want LVGL to start rendering into the second framebuffer while the first is written to the display). If using cooperative threads, the LVGL rendering thread may run an entire render before the framebuffer can be streamed out. I don't see a great solution for this besides making the thread running LVGL rendering preemptible and running the display_write thread at a higher priority. For your use case, could you use a workqueue running from a thread with a set priority, or set CONFIG_SYSTEM_WORKQUEUE_PRIORITY to a preemptible value?
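For instance, a prj.conf fragment along these lines (the flush-thread symbols follow the Kconfig names discussed in this PR and are illustrative, as are the values; in Zephyr a lower number means higher priority, and negative priorities are cooperative):

```
# Run LVGL rendering (the system workqueue here) at a preemptible,
# i.e. non-negative, priority...
CONFIG_SYSTEM_WORKQUEUE_PRIORITY=5
# ...and the flush thread above it so it can preempt rendering.
CONFIG_LV_Z_FLUSH_THREAD=y
CONFIG_LV_Z_FLUSH_THREAD_PRIO=0
```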

@jakkra

jakkra commented Jul 3, 2023

Do you think adding documentation of that within the Kconfig help strings would be useful?

Yes, I think it should be documented that LVGL rendering has to run in a preemptible thread and that LV_Z_FLUSH_THREAD_PRIO needs to be a higher priority. Otherwise this feature will not work, and people enabling it will be confused.

For your use case, could you use a workqueue running from a thread with a set priority, or set CONFIG_SYSTEM_WORKQUEUE_PRIORITY to a preemptible value?

Unfortunately, due to some Kconfig options such as BT and IPC, my CONFIG_SYSTEM_WORKQUEUE_PRIORITY has to be a cooperative priority.
And yes, I can run rendering from a separate workqueue thread with a preemptible priority, and this works. Still, it's very handy (saves lots of extra code) to be able to use LVGL from the system workqueue context, as many subsystems and drivers already deliver their callbacks in system workqueue context and the system workqueue is available from anywhere. That also ignores the extra "wasted" ~3K of RAM needed for the stack of a new workqueue thread to do rendering.

For now I think I'll just patch the k_yield into my application manually, unless you think this would be something that makes sense to add a Kconfig entry for, but maybe it's a bit "hackish".

@danieldegrasse
Contributor Author

Yes, I think it should be documented that LVGL rendering has to run in a preemptible thread and that LV_Z_FLUSH_THREAD_PRIO needs to be a higher priority. Otherwise this feature will not work, and people enabling it will be confused.

I've added some documentation to the latest push highlighting this, thanks for pointing it out.

For now I think I'll just patch the k_yield into my application manually, unless you think this would be something that makes sense to add a Kconfig entry for, but maybe it's a bit "hackish".

Truthfully, I think that adding a k_yield is harmless (the preemptible thread would already have been forced to yield if another thread of higher priority could run). I'll add it to this PR, since it seems you still see some performance gains with it.

@danieldegrasse
Contributor Author

@carlescufi are you aware of anyone else who can review this PR?

@faxe1008
Contributor

faxe1008 commented Jul 17, 2023

I tried running the stress demo on an RT1060-EVKB with an RK043FN66HS-CTG, using CONFIG_LV_USE_PERF_MONITOR. With the display refresh period changed to 10 ms, I observe roughly 50-65 FPS without the change and ~70-100 FPS with the flush thread. For anyone else testing, you'll need to adjust LV_DISP_DEF_REFR_PERIOD.
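For anyone reproducing this, the relevant prj.conf options would look something like the following (assuming the LVGL Kconfig symbols for the settings named above; the 10 ms value matches this test):

```
# Show LVGL's FPS/CPU overlay and shorten the default refresh period
# (30 ms) so the refresh timer is not the bottleneck.
CONFIG_LV_USE_PERF_MONITOR=y
CONFIG_LV_DISP_DEF_REFR_PERIOD=10
```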

@jakkra

jakkra commented Aug 14, 2023

This PR is such an amazing improvement for FPS; it would be very nice if it could get merged, either before #45 (which would be easier) or after (in which case I guess this PR needs to move to the main repo).

Since nothing has happened for a month, maybe we could ping it in the Discord "pr-help" channel to get more reviewers, if that is needed?

@fabiobaltieri
Member

fabiobaltieri commented Aug 15, 2023

@danieldegrasse I just merged #45. Let's get #61300 in; then you should be able to just move this over, and we should not have PR visibility issues on this anymore.

I'll do a sweep of the open lvgl PRs as well after that.

@fabiobaltieri
Member

zephyrproject-rtos/zephyr#61300 is in; feel free to move this upstream.

@danieldegrasse
Contributor Author

Closing this, as the changes will be integrated into the Zephyr module glue code.
