New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pulseaudio: rewrite the pa backend with the async api #20
Conversation
|
delroth: updated |
|
You might need to worry about instruction reordering and coherency issues with the m_run_thread variable. m_run_thread = false;
m_thread.join();The compiler is free to move the assignment of m_run_thread around the join(). Also, the m_run_thread variable could be set, but not be visibly set on the audio thread if it's on a separate cpu core. The variable seemed to be [incorrectly] marked as volatile before to attempt to get rid of these errors. Volatile solves the coherence, but not the ordering issue, at least on x86. I would say to use the atomic helper functions in Common/Atomic.h, specifically AtomicStoreAquire/AtomicStoreRelease, since I don't think the win32 versions of AtomicStore/AtomicLoad are correct. |
|
Idea if someone is bored: implement a proper Event class in Dolphin that we In general we're lacking a few useful multithreading constructs in Dolphin: On Mon, Feb 3, 2014 at 4:35 PM, Jordan Cristiano
Pierre "delroth" Bourdon delroth@gmail.com |
|
Updated with the AtomicStore/AtomicLoad ones, if they don't work, we should fix them. But do we need a MemoryBarrier between the two calls SizzlingCalamari has pasted? Sorry I don't know how to do threaded code. If you want, I'll change it back to the volatile one like all other audio backends. |
|
After taking another look, I'd say that the easiest, most correct way would be to use std::atomic<bool> m_run_thread;This type may be a bit slower than doing the synchronization manually, but it takes care of all the gotchas and threading tricks as far as I know. I really think this should be used until someone looks at the atomic operations and creates a fully functional interface for them, along with some documentation on their usage. The only thing we're losing by using the std::atomic types is speed. To answer your question about the memory barrier, saying that the atomic calls need a memory barrier is what I was going for in the previous comment. std::atomic types by default use a full memory barrier in both directions and synchronize with all other threads. If you want to look in to doing it manually still, I'm pretty sure that the AtomicStoreAquire/AtomicStoreRelease functions are strict enough with their memory ordering constraints to get the job done correctly. Someone chime in if this isn't correct. Last thing, about the AtomicStore/AtomicLoad functions. They seem to be misrepresented in what they can accomplish. These functions use a relaxed memory model, which only provide atomic access and give no protection against compiler/hardware instruction reordering. They are fine for some uses, but an event signal like this isn't one of them. Also, since they are template functions, they may not even be atomic in the win32 case because of their implementation just being a variable read/assign. If someone puts a 64 bit type through them, the 64 bit reads/writes could turn out non atomic by being split into two 32 bit reads/writes. |
|
Might be worth integrating https://chromium.googlesource.com/chromium/src/base/+/master/atomicops.h (+ implementation and related files) into Common? We would at least have a robust, well tested implementation at hand instead of the current API which has problems as Jordan mentioned. BSD licensed, so this should be possible without too much trouble. Might need some changes to better match our codebase (namespace changes, etc.). |
|
Of course, I still think the proper solution would be to not use atomics at all here and rely on a proper Event object that would encapsulate all that (this Event object might internally use atomics - but users shouldn't care). |
|
Another alternative could be the Mintomic lock free atomics library. It uses a BSD license as well. |
|
updated with std::atomic, so imo this branch is ready for merging. For history: The first PA backend used the simple (push based) api. This was changed to use this async api for volume control, but without using callbacks. So it was basicly the same code flow but with a more complicated api. This was reverted sometimes to use a nicer api. |
|
LGTM, would appreciate seeing some user testing results though. |
|
I did some testing with this, and on Linux it's very hard for me record/get actual data when it comes to the latency. I think it's actually slightly better than Windows xaudio2. The latency of audio is the main reason I took interest in testing this. The main thing I did was check Zelda Four Swords Adventures menus, and basically time how long it took the sound to play after I hit the button. With Zelda-HLE, it was instantaneous, which is great. But the issues with LLE/AX-HLE + Latency are still bad. Regardless, that is likely not backend specific, and wouldn't be fixed by rewriting an audio backend. Still unfortunate. As for performance in games, pulse audio outperformed all the other audio backends in Linux when it comes to latency, choppiness, among other things. I tested a bunch of GC games. Compared to the Windows, audio latency is the same, which is a huge improvement over Alsa and the other backends I tried. Now, while writing this, I actually discovered why I got such great audio latency when first testing this. Holding tab when starting a game will temporarily (until slowdown) fix the audio latency. It improves it by at least 2 - 3 frames according to my Four Swords Adventures test. I don't know how stupid that is, but, whatever. That just tells me the audio latency problem is separate from backends. This backend performs well, gets a thumbs up from me. |
| m_pa_mlapi = pa_mainloop_get_api(m_pa_ml); | ||
| m_pa_ctx = pa_context_new(m_pa_mlapi, "dolphin-emu"); | ||
| m_pa_error = pa_context_connect(m_pa_ctx, NULL, PA_CONTEXT_NOFLAGS, NULL); | ||
| pa_context_set_state_callback(m_pa_ctx, StateCallback, this); |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
|
I store the return value of every non-void call in m_pa_error. If this error flag is set, we will break both on connection with the pa server and in the mainloop. |
The default async api allow us to set some latency options. The old one (simple API) was the lazy way to go for usual audio where latency doesn't matter. This also streams audio, so it should be a bit faster then the old one.
Pulseaudio: rewrite the pa backend with the async api
The default async api allow us to set some latency options. The old one (simple API) was the lazy way to go for usual audio where latency doesn't matter.
This also streams audio, so it should be a bit faster then the old one.