
Gpu multi thread #3

Merged
Merged 6 commits from gpu_multi_thread into master on Feb 2, 2022

Conversation

@Amjad50 (Owner) commented on Jan 19, 2022

Fixes #2.

This is initial work on making the GPU more performant; I think it can still be improved.

@Amjad50 Amjad50 marked this pull request as draft January 19, 2022 06:33
@Amjad50 Amjad50 marked this pull request as ready for review February 2, 2022 07:27
This just moves the previous implementation to another thread; there is no
algorithmic improvement yet, so it's still slow.

There is also an awkward case with `gpu_read`: I added a delay when
reading from the channel, since the backend sometimes isn't fast enough
to send the whole transfer before the DMA asks for it.
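A minimal sketch of what that read-back path could look like, assuming a plain `std::sync::mpsc` channel between the emulator and the GPU backend thread; the `GpuWord` alias and the retry delay are illustrative, not the actual implementation:

```rust
use std::sync::mpsc::{Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

/// Hypothetical word type returned by the GPU backend thread.
type GpuWord = u32;

/// Read one word on the emulator/DMA side. If the backend hasn't produced it
/// yet, back off briefly and retry instead of returning stale data.
fn gpu_read(rx: &Receiver<GpuWord>) -> GpuWord {
    loop {
        match rx.try_recv() {
            Ok(word) => return word,
            // The backend is still preparing the transfer; give it a little time.
            Err(TryRecvError::Empty) => thread::sleep(Duration::from_micros(10)),
            Err(TryRecvError::Disconnected) => panic!("GPU backend thread exited"),
        }
    }
}
```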
The design now keeps reusing the same buffer until any of the
following happens (flush triggers sketched below):
- `front_blit`
- drawing a textured or semi-transparent polygon/line
- a read command

These all require the latest state of the VRAM. For `front_blit` and
reads this is fine, but it would be nice to also optimize the textured
and semi-transparent drawing cases.
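One way to express that flush rule, with a hypothetical and heavily simplified `Command` enum rather than the emulator's real types:

```rust
/// Hypothetical, simplified view of the GPU commands relevant to flushing.
enum Command {
    FrontBlit,
    DrawPolygon { textured: bool, semi_transparent: bool },
    DrawLine { textured: bool, semi_transparent: bool },
    ReadVram,
    // ... other commands can keep drawing into the current buffer
    Other,
}

/// Returns true if the pending draws must be submitted (and VRAM updated)
/// before this command can be handled correctly.
fn needs_flush(cmd: &Command) -> bool {
    match cmd {
        Command::FrontBlit | Command::ReadVram => true,
        Command::DrawPolygon { textured, semi_transparent }
        | Command::DrawLine { textured, semi_transparent } => *textured || *semi_transparent,
        Command::Other => false,
    }
}
```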

This design makes emulation noticeably faster when drawing many opaque
polygons/lines, which is a nice improvement.
By copying to the back image before using it as a texture, the conflict
error is no longer triggered (not sure if that's intended behavior or a
bug in vulkano).

Even with this, stacking many commands in one command buffer is very bad.
The bottleneck is in
`vulkano::command_buffer::synced::builder::append_command`, which I
think checks for conflicts against ALL previous commands. Not sure
whether that can be improved, but limiting the number of commands in a
single command buffer helps (sketched below).
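A sketch of that cap, using a made-up `BatchedRecorder` wrapper with placeholder comments where the real vulkano builder calls would go; the point is only that the command buffer is submitted and restarted once it reaches a fixed size, so the conflict check never scans a huge history:

```rust
/// Illustrative only: the real recording is done through vulkano's builder.
struct BatchedRecorder {
    commands_in_buffer: usize,
    max_commands: usize,
}

impl BatchedRecorder {
    fn new(max_commands: usize) -> Self {
        Self { commands_in_buffer: 0, max_commands }
    }

    /// Record one draw; submit the current command buffer and begin a new one
    /// once the cap is reached, keeping per-buffer conflict checks cheap.
    fn record_draw(&mut self /*, draw params */) {
        // ... append the draw to the current command buffer here ...
        self.commands_in_buffer += 1;
        if self.commands_in_buffer >= self.max_commands {
            self.submit_and_reset();
        }
    }

    fn submit_and_reset(&mut self) {
        // ... build and execute the current command buffer, then start fresh ...
        self.commands_in_buffer = 0;
    }
}
```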
This is much faster than using `CpuAccessibleBuffer`.
This improved performance as well, mostly by using secondary command
buffers. That reduces the calls to `append_command` on the primary
command buffer, which already holds a lot of commands; I think this is
the main reason for the performance improvement.
We can use the channel as a FIFO and execute the commands and their
parameters as they come in.
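A minimal sketch of that FIFO idea using `std::sync::mpsc` and a hypothetical `GpuCommand` enum (the real code sends raw GP0/GP1 words and parameters): the emulator thread pushes commands into the channel and the backend thread pops and executes them in order.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical command representation; illustrative only.
enum GpuCommand {
    Gp0(u32),
    Gp1(u32),
}

/// Spawn the backend thread and return the sending half of the command FIFO.
fn spawn_gpu_backend() -> mpsc::Sender<GpuCommand> {
    let (tx, rx) = mpsc::channel::<GpuCommand>();
    thread::spawn(move || {
        // The channel preserves ordering, so it acts as the command FIFO:
        // commands and their parameters are executed exactly as they arrive.
        for cmd in rx {
            match cmd {
                GpuCommand::Gp0(_word) => { /* decode and run a drawing/transfer command */ }
                GpuCommand::Gp1(_word) => { /* handle a control/display command */ }
            }
        }
    });
    tx
}
```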
@Amjad50 Amjad50 merged commit e7a4e1e into master Feb 2, 2022
@Amjad50 Amjad50 deleted the gpu_multi_thread branch February 2, 2022 07:33
Development

Successfully merging this pull request may close these issues.

Multithread GPU