
Gpu multi thread #3

Merged
Merged 6 commits from gpu_multi_thread into master on Feb 2, 2022

Conversation

@Amjad50 (Owner) commented on Jan 19, 2022

Fixes #2.

This is initial work on making the GPU more performant; I think it can still be improved.

@Amjad50 Amjad50 marked this pull request as draft January 19, 2022 06:33
@Amjad50 Amjad50 marked this pull request as ready for review February 2, 2022 07:27
This just moves the previous implementation to another thread; there is no
algorithmic improvement yet, so it's still slow.

There is also an awkward case with `gpu_read`: I added a delay when
reading from the channel, since the backend sometimes isn't fast enough
to send the whole transfer before the DMA asks for it.
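A minimal sketch of what that read-back path could look like, assuming a plain `std::sync::mpsc` channel between the emulator and the GPU backend thread; the `GpuWord` alias and the retry delay are illustrative, not the actual implementation:

```rust
use std::sync::mpsc::{Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

/// Hypothetical word type returned by the GPU backend thread.
type GpuWord = u32;

/// Read one word on the emulator/DMA side. If the backend hasn't produced it
/// yet, back off briefly and retry instead of returning stale data.
fn gpu_read(rx: &Receiver<GpuWord>) -> GpuWord {
    loop {
        match rx.try_recv() {
            Ok(word) => return word,
            // The backend is still preparing the transfer; give it a little time.
            Err(TryRecvError::Empty) => thread::sleep(Duration::from_micros(10)),
            Err(TryRecvError::Disconnected) => panic!("GPU backend thread exited"),
        }
    }
}
```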
The design now keeps reusing the same buffer until any of the
following happens (flush triggers sketched below):
- `front_blit`
- drawing a textured or semi-transparent polygon/line
- a read command

These all require the latest state of the VRAM. For `front_blit` and
reads this is fine, but it would be nice to also optimize the textured
and semi-transparent drawing cases.
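One way to express that flush rule, with a hypothetical and heavily simplified `Command` enum rather than the emulator's real types:

```rust
/// Hypothetical, simplified view of the GPU commands relevant to flushing.
enum Command {
    FrontBlit,
    DrawPolygon { textured: bool, semi_transparent: bool },
    DrawLine { textured: bool, semi_transparent: bool },
    ReadVram,
    // ... other commands can keep drawing into the current buffer
    Other,
}

/// Returns true if the pending draws must be submitted (and VRAM updated)
/// before this command can be handled correctly.
fn needs_flush(cmd: &Command) -> bool {
    match cmd {
        Command::FrontBlit | Command::ReadVram => true,
        Command::DrawPolygon { textured, semi_transparent }
        | Command::DrawLine { textured, semi_transparent } => *textured || *semi_transparent,
        Command::Other => false,
    }
}
```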

This design makes emulation noticeably faster when drawing many opaque
polygons/lines, which is a nice improvement.
By copying to the back image before using it as a texture, the conflict
error is no longer triggered (not sure if that's intended behavior or a
bug in vulkano).

Even with this, stacking many commands in one command buffer is very bad.
The bottleneck is in
`vulkano::command_buffer::synced::builder::append_command`, which I
think checks for conflicts against ALL previous commands. Not sure
whether that can be improved, but limiting the number of commands in a
single command buffer helps (sketched below).
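A sketch of that cap, using a made-up `BatchedRecorder` wrapper with placeholder comments where the real vulkano builder calls would go; the point is only that the command buffer is submitted and restarted once it reaches a fixed size, so the conflict check never scans a huge history:

```rust
/// Illustrative only: the real recording is done through vulkano's builder.
struct BatchedRecorder {
    commands_in_buffer: usize,
    max_commands: usize,
}

impl BatchedRecorder {
    fn new(max_commands: usize) -> Self {
        Self { commands_in_buffer: 0, max_commands }
    }

    /// Record one draw; submit the current command buffer and begin a new one
    /// once the cap is reached, keeping per-buffer conflict checks cheap.
    fn record_draw(&mut self /*, draw params */) {
        // ... append the draw to the current command buffer here ...
        self.commands_in_buffer += 1;
        if self.commands_in_buffer >= self.max_commands {
            self.submit_and_reset();
        }
    }

    fn submit_and_reset(&mut self) {
        // ... build and execute the current command buffer, then start fresh ...
        self.commands_in_buffer = 0;
    }
}
```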
This is much faster than using `CpuAccessibleBuffer`.
This improved performance as well, mostly by using secondary command
buffers. That reduces the calls to `append_command` on the primary
command buffer, which already holds a lot of commands; I think this is
the main reason for the performance improvement.
We can use the channel as a FIFO and execute the commands and their
parameters as they come in.
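A minimal sketch of that FIFO idea using `std::sync::mpsc` and a hypothetical `GpuCommand` enum (the real code sends raw GP0/GP1 words and parameters): the emulator thread pushes commands into the channel and the backend thread pops and executes them in order.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical command representation; illustrative only.
enum GpuCommand {
    Gp0(u32),
    Gp1(u32),
}

/// Spawn the backend thread and return the sending half of the command FIFO.
fn spawn_gpu_backend() -> mpsc::Sender<GpuCommand> {
    let (tx, rx) = mpsc::channel::<GpuCommand>();
    thread::spawn(move || {
        // The channel preserves ordering, so it acts as the command FIFO:
        // commands and their parameters are executed exactly as they arrive.
        for cmd in rx {
            match cmd {
                GpuCommand::Gp0(_word) => { /* decode and run a drawing/transfer command */ }
                GpuCommand::Gp1(_word) => { /* handle a control/display command */ }
            }
        }
    });
    tx
}
```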
@Amjad50 Amjad50 merged commit e7a4e1e into master Feb 2, 2022
@Amjad50 Amjad50 deleted the gpu_multi_thread branch February 2, 2022 07:33
Development

Successfully merging this pull request may close these issues.

Multithread GPU