Choppy sound in NieR when rsx thread is too slow, only on the Vulkan backend #6882
Comments
On what game? There is no useful information or log. Vulkan uses multithreading more efficiently and is more CPU intensive than expected. The audio thread becomes "hungry". It also depends on how the game was coded. Your processor is quite weak and limited in threads; even if there are 8, they are not very powerful. There is a buffering option (Audio buffer duration) for that. It doesn't work miracles, but it helps relieve the problem; try it with 125ms or 150ms. Take Nier, for example: since your PR is about this game, I guess that's what you're testing. Nier uses SPUs to control audio, and the game is very prone to "choppy sound" even on powerful PCs. That being said, optimizations are still possible, yes, to relieve the CPU and threads.
Ah right, it’s indeed NieR, here is a log of it running on Vulkan: RPCS3.log.gz and doing approximately the same on OpenGL: RPCS3.log.gz I am fully aware that my CPU isn’t the strongest, but I’m surprised your Vulkan backend is that much slower (on the CPU) than your OpenGL backend, which uses about 65% of a core at the same area. I’ll have a look at more in-depth profiling when it isn’t 3:30am anymore. :-° Audio buffering doesn’t do much sadly, I already tested the OpenAL backend previously. |
This may be somewhat Nier specific behavior, the game itself isn't super threaded, so stalls with drawing the graphics can cause the audio to crackle. Turning the resolution scaling way up will cause crackling on my RX 570. I don't regularly use the integrated graphics on my 7700K, but last I tried, it was much faster at drawing stuff in RPCS3 with OpenGL and Mesa vs with Vulkan and ANV. This may be what you're seeing. |
The two main functions I see on the Vulkan backend are What seems strange though is that OpenGL also gets sub-20fps and ~never desyncs the sound thread. |
Yes, my expectation is that the game ends up waiting too long for your GPU to do something, and since Nier isn't well threaded, it ends up writing the audio too late. Even on my 7700K which can effortlessly run the game well above 60FPS with ~25% CPU usage, it will crackle if I push the resolution too high. EDIT: Actually, the problem is clear here, with 800% scaling and write colour buffers on, the game runs at 14fps with terrible audio. With write colour buffers off and 800% scaling, the game still runs at 14fps, but now with perfect audio. Of course this breaks the visuals, but it's clear what the game is waiting on now. |
Yes, this has always been the case on Nier: WCB on Vulkan causes audio problems. I know that KD or Ruipin explained the reason to me, but I don't remember it. The buffer option was there to relieve this problem, but apparently it's not enough for Linkmauve's CPU. Keep in mind that Nier was coded by demons, the code of this game is incomprehensible. :P
Another interesting datapoint is that the music runs perfectly when the rsx thread is entirely blocked, for instance on io. I tried to profile the Vulkan backend but I don’t have enough RAM to use renderdoc on a given game frame, despite it working properly on OpenGL, so I’m working on pure guesses so far. I tried to figure out the differences between when it runs fine and when it doesn’t, and it seems to upload a swizzled 512×512 texture from the cell to Vulkan taking quite a lot of (CPU) time, and when it stops the framerate increases back to some 30 fps. Do you know why this game is uploading a texture (the same?) every frame but only in certain areas? Have you tried deswizzling the texture on the GPU after upload instead, for instance using a compute shader? Or am I following a red herring and it’s totally unrelated? I’ve tested with write colour buffers off and I can indeed reproduce your findings. What exactly does this mean? The game is rendering something on the GPU, and then copies it to the main RAM for the CPU to do things with, but where/when does this happen? Is it a situation where we have no idea when to do the copy and have to do it all the time, or is there some way to know the CPU access patterns and such? Thanks a lot for your answers, they are greatly appreciated. :) |
The deswizzling algorithm is not well-suited for GPUs. Can it be done? Yes, but it will hurt graphics performance.
Texture modification is detected using page faults. It's usually faster to queue an upload than to hash the texture data, since on PS3 the textures are changed a lot every frame to make use of the small memory available. The hardware is designed to make this cheap.
In the texture cache, only identifiable by a page fault.
There is no innate hardware access pattern, but we do have a very competent predictor that actually initiates the transfer before we need the data. If you're seeing GPU->CPU flushes, they're likely triggering based on prediction, not actual page faults. If you start queuing DMA requests after a page fault you're going to have a very bad time. Which brings us to the main question: Why is OpenGL immune? But why not yield/sleep? Most of these stalls are "small" in threading timescales (ranging from around 40us to several milliseconds) and you cannot know beforehand how long you need to wait. Wait too long and performance drops for some people; busy wait and some weak CPUs may sometimes have audio stuttering. This is actually because the graphics and audio threads run at the same higher-than-normal priority, but the audio thread has a fixed 'tick rate' that is large on threading timescales, causing a situation where (in your case) RSX and Audio are contending with each other.
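To illustrate the busy-wait versus yield trade-off described above, here is a minimal sketch of a hybrid wait that spins for a bounded time and then starts yielding. It is not RPCS3's code: the function name `wait_spin_then_yield` and the `spin_budget` parameter are hypothetical, and the right budget would have to be measured per machine.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper, not taken from RPCS3.
// Waits until `flag` becomes true. Spins for at most `spin_budget`, then
// yields the time slice between checks so that another thread of the same
// priority (for example an audio thread) gets a chance to run.
// Returns true if the flag was observed during the spin phase.
inline bool wait_spin_then_yield(const std::atomic<bool>& flag,
                                 std::chrono::microseconds spin_budget)
{
    assert(spin_budget.count() >= 0);

    const auto deadline = std::chrono::steady_clock::now() + spin_budget;

    while (!flag.load(std::memory_order_acquire))
    {
        if (std::chrono::steady_clock::now() >= deadline)
        {
            // Spin budget exhausted: stop monopolising the core.
            while (!flag.load(std::memory_order_acquire))
            {
                std::this_thread::yield();
            }
            return false;
        }
    }

    return true;
}
```

A short budget keeps the common 40us-scale stalls cheap, while the yield phase bounds how long a waiting thread can starve its neighbours on a CPU with few cores. The cost is extra latency whenever the wait ends just after the budget runs out.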
Just for thoroughness, I may add a GPU-side deswizzling option since most people have much better GPUs than CPUs. Not sure how well it will work on a UHD chip due to the high amount of non-contiguous memory access and the deep loops which suck on a graphics card, but at the very least, it could be better than executing this on a weaker CPU. Will probably default to off though.
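For readers unfamiliar with deswizzling, here is a rough CPU-side sketch of why the access pattern is scattered. It assumes the swizzled layout is plain Morton (Z-order) for a square power-of-two texture with 32-bit texels; that layout is an assumption for illustration, and `morton_index` and `deswizzle` are hypothetical names, not RPCS3 functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleave the low 16 bits of x and y: x bits land on even positions,
// y bits on odd positions (assumed Morton / Z-order layout).
inline std::uint32_t morton_index(std::uint32_t x, std::uint32_t y)
{
    std::uint32_t index = 0;
    for (std::uint32_t bit = 0; bit < 16; ++bit)
    {
        index |= ((x >> bit) & 1u) << (2 * bit);
        index |= ((y >> bit) & 1u) << (2 * bit + 1);
    }
    return index;
}

// Convert a square size x size image of 32-bit texels from Morton order
// to row-major order. `size` is assumed to be a power of two.
inline std::vector<std::uint32_t> deswizzle(const std::vector<std::uint32_t>& swizzled,
                                            std::uint32_t size)
{
    assert(swizzled.size() == static_cast<std::size_t>(size) * size);

    std::vector<std::uint32_t> linear(swizzled.size());
    for (std::uint32_t y = 0; y < size; ++y)
    {
        for (std::uint32_t x = 0; x < size; ++x)
        {
            // Neighbouring output texels read from non-adjacent source texels.
            linear[static_cast<std::size_t>(y) * size + x] = swizzled[morton_index(x, y)];
        }
    }
    return linear;
}
```

For a 512×512 texture this is 262,144 scattered reads per upload, which is consistent with the CPU time reported earlier in the thread. A production version would likely replace the per-texel bit loop with lookup tables or bit tricks.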
@linkmauve can you recheck this one? |
I can confirm, the sound is choppy sometimes. Don't know anything about rsx threads. |
I have an architectural solution for this. Reviving. |
sorry if dumb comment. edit: |
Yeah, commenting out is obviously wrong: you're not actually doing the requested operation (WCB/WDB/GPU blit) and are instead just reading whatever junk was in memory. Sometimes this works fine, but most of the time everything gets fucked and you get flashing or flicker. Closing as fixed by #15205
I’m using rpcs3 f3ed26e, on an i5-8350U sporting a UHD 620 GPU.
When using Vulkan, the sound gets choppy as soon as the framerate drops under 30fps and the rsx thread uses 100% of a core, but this isn’t the case when using OpenGL despite the framerate being approximately identical.
I’m in the process of optimising the rsx thread so that it runs better on this laptop, but in the meantime it’d be nice to be able to play with better sound. :)