-
-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(vg_lite): add asynchronous rendering support #5398
Conversation
75f2a53
to
2cce05f
Compare
#endif | ||
|
||
LV_VG_LITE_CHECK_ERROR(vg_lite_finish()); | ||
lv_vg_lite_flush(draw_unit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if flush_count == 3
and there are no more draw tasks. What will trigger the flushing of these 3 comamnds?
lv_conf_template.h
Outdated
@@ -176,6 +176,9 @@ | |||
/* Enable VG-Lite assert. */ | |||
#define LV_VG_LITE_USE_ASSERT 0 | |||
|
|||
/* VG-Lite flush commit trigger threshold */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/* VG-Lite flush commit trigger threshold */ | |
/* VG-Lite flush commit trigger threshold. VGlLite will try to batch these many draw tasks */ |
- How can we determine the ideal value? Can it be 256 too?
- How large the command buffer is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can we determine the ideal value? Can it be 256 too?
8 is an experience value.
We plan to implement this set of logic in the underlying driver in the future. Check whether the GPU is in idle state in vg_lite_flush
: if the GPU is in idle, the rendering will be submitted immediately. If the GPU is busy, the rendering task will be queued to wait for the next time examine.
In LVGL, just simply call vg_lite_flush
.
How large the command buffer is?
Currently we configure the cmd buffer to be 64K x2. These two buffers are ping-pong buffers. When the GPU is reading one of the buffers, the user can write to the other buffer and swap at the appropriate time. The cmd buf length occupied by different drawing commands is different, ranging from tens to hundreds of bytes.
@@ -157,6 +152,7 @@ static int32_t draw_dispatch(lv_draw_unit_t * draw_unit, lv_layer_t * layer) | |||
|
|||
/* Return 0 is no selection, some tasks can be supported by other units. */ | |||
if(!t || t->preferred_draw_unit_id != VG_LITE_DRAW_UNIT_ID) { | |||
lv_vg_lite_finish(draw_unit); | |||
return -1; | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kisvegabor
If there are no more drawing tasks, it will wait for the GPU drawing to complete and clear flush_count.
Signed-off-by: pengyiqiang <pengyiqiang@xiaomi.com> Signed-off-by: FASTSHIFT <vifextech@foxmail.com>
2cce05f
to
fe44a9e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll merge it after the release.
In fact, merge now also works, does not affect the main line function :) |
@@ -157,6 +152,7 @@ static int32_t draw_dispatch(lv_draw_unit_t * draw_unit, lv_layer_t * layer) | |||
|
|||
/* Return 0 is no selection, some tasks can be supported by other units. */ | |||
if(!t || t->preferred_draw_unit_id != VG_LITE_DRAW_UNIT_ID) { | |||
lv_vg_lite_finish(draw_unit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why to wait for GPU completion here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When all the drawing tasks are finished, it will go here (usually about to be sent), and you can wait here for the GPU to finish the unrendered work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can go here also if there are tasks which are not supported by VGlite but only by the CPU. So you do not have to wait for GPU completion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We basically implement full GPU rendering, and the CPU basically does not need to participate in the rendering work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is a single thread, then it does not really matter if everything can be drawn by vglite.
The main lvgl thread will call the vglite dispatcher. If the task is supported, will be passed to vglite. Else, the CPU dispatcher will get called and CPU will take care of it.
I will try to run your implementation on my side and see how it behaves.
#endif | ||
|
||
LV_VG_LITE_CHECK_ERROR(vg_lite_finish()); | ||
lv_vg_lite_flush(draw_unit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling vg_lite_flush() after each task does not necessary mean that command gets batched. Instead, after each vg_lite_flush() you will force the other command buffer to take over. As GPU have only 2 command buffers, if a 3th vg_lite_flush() comes too fast while the first vg_lite_flush() is not yet complete - e.g. the command buffer 1 is not complete - then the 3'th vg_lite_flush() will act as a vg_lite_finish().
This mechanism you implemented does not work as you intend to.
What is the improvement with this new code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember that vg_lite_flush() will switch the two GPU command buffers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is a double encapsulation of vg_lite_flush
.
In lv_vg_lite_flush
I added the count. When lv_vg_lite_flush
is called more than a certain number of times, vg_lite_flush
is called to tell the GPU to start rendering, so that the GPU can process several small tasks together to improve efficiency.
The next time vg_lite_flush
is called, usually the GPU has already processed it and can immediately start a new rendering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand now.
But you batch by force a number of LV_VG_LITE_FLUSH_MAX_COUNT tasks! Even the first time.
In my tests (running the benchmark v9) I found that GPU gets busy for a couple of 4-5 tasks only when it has to deal with filling the screen area - big task. Otherwise the GPU completes any task right away.
So the GPU gets IDLE most of the time and you do not have to batch anymore commands. This is why I suggest the solution I implemented here:
void vglite_run(void) |
In this way I am making sure the GPU is busy as much as possible. If the GPU is idle then makes no sense to queue even more commands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we implement this logic inside vg_lite_flush
? Let the driver determine the GPU status by itself. If the GPU is busy, it will join the queue. If it is idle, it will start rendering immediately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is an API change behavior. In documentation it says it "submits the command to the GPU without waiting for completion". You can not just return because it might not be come another vg_lite_flush() to really submit it to the GPU.
There are other applications beside lvgl on top of vglite or in some cases customers can use vglite api directly to design the application. I would not ask for a change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm...that makes sense. May I ask if you added the VG_LITE_GPU_IDLE_STATE
enumeration in vglite_run
yourself? Or is it already implemented in the SDK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We asked for the API implementation to the VSI. You should see it as well in the next driver releases. I use 4.0.58 version.
vg_lite_get_parameter(VG_LITE_GPU_IDLE_STATE, 1, (vg_lite_pointer)&gpu_idle);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! I'll switch to the same plan..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also let the VGLite thread implementation with the possibility to run it single thread (v8 like).
lvgl/src/draw/nxp/vglite/lv_draw_vglite.c
Line 113 in 152dc0b
#if LV_USE_OS |
And check all the places where the
#if LV_USE_OS
is used in this file. Simply setting the LV_USE_OS to LV_OS_NONE will make it run single thread (as you have it now).
VELAPLATFO-22913 Signed-off-by: pengyiqiang <pengyiqiang@xiaomi.com> Signed-off-by: FASTSHIFT <vifextech@foxmail.com> Co-authored-by: pengyiqiang <pengyiqiang@xiaomi.com>
Help us review this PR! Anyone can approve it or request changes.
Description of the feature or fix
Add VG-Lite flush commit trigger threshold. GPU will try to batch these many draw tasks. No need to enable independent rendering threads.
Checkpoints
lv_conf_template.h
run lv_conf_internal_gen.py and update Kconfig.scripts/code-format.py
(astyle needs to be installed) and follow the Code Conventions