Replace MemoryManager implementation with rpmalloc #3873
Cool! On Linuxmint 17.3, I get:
Edit: New to submodules. The commands that got me going were:
|
It's a submodule, you'll have to clone with the recurse option. |
Error when compiling with carla:
|
@Umcaruje that appears to be a bug with Carla, it should have imported This is easy to miss when another common header accidentally provides the include. |
Built without Carla, I can confirm I don't get stutters while playing on elementary OS Loki under both Qt4 and Qt5. Very good performance, comparable with 1.1. I'll keep using it for further testing. |
Tested. It showed very fast allocation speed and I got no crashes during tests. |
src/core/MemoryManager.cpp
MemoryHelper::alignedFree( ( *it ).m_pool );
MemoryHelper::alignedFree( ( *it ).m_free );
if (--thread_guard_depth == 0) {
	rpmalloc_thread_initialize();
You probably meant to do rpmalloc_thread_finalize here
Absolutely. Thanks!
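For anyone reading along, the fix from the review comment above comes down to keeping initialize and finalize balanced around a per-thread depth counter. Below is a minimal sketch of that pattern, not the code from this PR: `ThreadGuard`, `threadInitialize`, `threadFinalize` and the two call counters are hypothetical stand-ins so the example compiles without rpmalloc. In the real code the two stand-in functions would presumably be `rpmalloc_thread_initialize()` and `rpmalloc_thread_finalize()`.

```cpp
#include <cassert>

// Stand-ins for rpmalloc_thread_initialize() / rpmalloc_thread_finalize().
// The counters are hypothetical and exist only so the sketch can be checked.
static int g_initCalls = 0;
static int g_finalizeCalls = 0;
static void threadInitialize() { ++g_initCalls; }
static void threadFinalize() { ++g_finalizeCalls; }

// Per-thread nesting depth, named after thread_guard_depth in the diff.
static thread_local int thread_guard_depth = 0;

// RAII guard (hypothetical name): the outermost guard on a thread sets up
// the allocator's per-thread state, and that same guard tears it down.
class ThreadGuard
{
public:
	ThreadGuard()
	{
		if (thread_guard_depth++ == 0) { threadInitialize(); }
	}
	~ThreadGuard()
	{
		assert(thread_guard_depth > 0);
		// Finalize here, not initialize: this is the bug the review caught.
		if (--thread_guard_depth == 0) { threadFinalize(); }
	}
	ThreadGuard(const ThreadGuard&) = delete;
	ThreadGuard& operator=(const ThreadGuard&) = delete;
};

// Nested guards on one thread: only the outermost pair touches the allocator.
inline void nestedGuardsDemo()
{
	ThreadGuard outer;
	{
		ThreadGuard inner;
	}
}
```

With nested guards, initialize and finalize each run exactly once per thread, which is what the depth counter is there to guarantee.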
Could someone try this project? |
On my computer with a 2.2 GHz CPU, LMMS with your project open uses 14% of the CPU. |
I merged this PR into my local LMMS checkout and got this on CMake generate: CMake Error at src/3rdparty/rpmalloc/CMakeLists.txt:3 (add_library):
Tried extensions .c .C .c++ .cc .cpp .cxx .m .M .mm .h .hh .h++ .hm .hpp CMake Error: Cannot determine link language for target "rpmalloc". |
@qnebra please see #3873 (comment). This PR switches the project to using submodules, which will take some adjustment, but will greatly reduce the amount of 3rd party source code we bundle. |
Thank you. |
Let's merge it first. |
Test & merge, of course. |
@qnebra Thanks for testing. I'm running some tests on memory use. Actually half of the instruments are fine (they don't allocate extra memory) and the other half are not, mainly because of Oscillator. All oscillators are duplicated for each played note (with no new information). There must be a way to avoid that. PS1: I also rewrote BaseDetuning to avoid allocation. @softrabbit was right, a single float works fine. PS2: This memory manager doesn't seem to improve performance a lot. I wouldn't merge it as it is. The memory manager is a sensitive part that depends heavily on the OS, the OS version, the compiler, etc. It should be configurable at runtime. Right now, 5 are available (legacy, standard, rpmalloc, mine and PhysSong's one). An abstract class would work wonders. |
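To make the abstract-class suggestion concrete, here is a rough sketch of what a runtime-selectable backend could look like. Every name in it (`AllocatorBackend`, `StandardBackend`, `CountingBackend`, `makeBackend`) is hypothetical and not part of LMMS. Only a `std::malloc` backend and a counting wrapper are shown; an rpmalloc or legacy backend would be further subclasses.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <string>

// Hypothetical interface; none of these names exist in LMMS.
class AllocatorBackend
{
public:
	virtual ~AllocatorBackend() = default;
	virtual void* allocate(std::size_t size) = 0;
	virtual void deallocate(void* ptr) = 0;
	virtual std::string name() const = 0;
};

// Backend that forwards to the C library allocator.
class StandardBackend : public AllocatorBackend
{
public:
	void* allocate(std::size_t size) override { return std::malloc(size); }
	void deallocate(void* ptr) override { std::free(ptr); }
	std::string name() const override { return "standard"; }
};

// Same allocator, but tracks the number of live allocations, which is
// handy when checking which instruments allocate per note.
class CountingBackend : public AllocatorBackend
{
public:
	void* allocate(std::size_t size) override
	{
		void* p = std::malloc(size);
		if (p != nullptr) { ++m_live; }
		return p;
	}
	void deallocate(void* ptr) override
	{
		if (ptr != nullptr) { --m_live; }
		std::free(ptr);
	}
	std::string name() const override { return "counting"; }
	long live() const { return m_live; }

private:
	long m_live = 0;
};

// Picks a backend from a configuration string; unknown keys fall back
// to the standard backend.
inline std::unique_ptr<AllocatorBackend> makeBackend(const std::string& key)
{
	if (key == "counting")
	{
		return std::unique_ptr<AllocatorBackend>(new CountingBackend());
	}
	return std::unique_ptr<AllocatorBackend>(new StandardBackend());
}
```

The virtual call adds a small cost per allocation, so whether this is acceptable on the audio thread would need measuring.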
What, no new information? Frequency and phase come to mind... that being said, I'd like to see some kind of "oscillator bank" concept, where all 6 (TripleO) or 18 (Organic) Oscillators are merged into one object, preferably organizing the data and math in a way that makes it easy for compilers to optimize. (SIMD, anyone?) That might even open up some new features... |
@softrabbit Frequency is the same as for the NotePlayHandle, same for ext_phaseOffset. PhaseOffset seems to be more or less equal to ext_phaseOffset. So only phase remains (one float), which is probably computable (I think, but I may be wrong). Anyway, there is a lot of room for improvement here, especially as Oscillator is used by many, many instruments. I wouldn't group the oscillators into a single object. Instead I would pre-allocate 2048 oscillators (for example) and reuse them when needed, because all the instruments use the same Oscillator type (for now). More importantly, if it goes over 2048, the note should be skipped. As a result: no memory allocation, and CPU under control for all the instruments. |
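A rough sketch of the pre-allocation idea, under the assumption that a fixed array with an in-use flag is enough. `PooledOscillator` and `OscillatorPool` are hypothetical names, and the struct is only a stand-in for the real Oscillator state.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the per-note oscillator state.
struct PooledOscillator
{
	float phase = 0.0f;
	bool inUse = false;
};

// Fixed-capacity pool: all storage is reserved up front, so acquiring an
// oscillator at note-on never touches the heap.
template <std::size_t Capacity>
class OscillatorPool
{
public:
	// Returns the first free oscillator, or nullptr when the pool is
	// exhausted; the caller is expected to skip the note in that case.
	PooledOscillator* acquire()
	{
		for (auto& osc : m_slots)
		{
			if (!osc.inUse)
			{
				osc.inUse = true;
				osc.phase = 0.0f;
				++m_used;
				return &osc;
			}
		}
		return nullptr;
	}

	// Hands an oscillator back so a later note can reuse it.
	void release(PooledOscillator* osc)
	{
		if (osc != nullptr && osc->inUse)
		{
			osc->inUse = false;
			--m_used;
		}
	}

	std::size_t used() const { return m_used; }

private:
	std::array<PooledOscillator, Capacity> m_slots{};
	std::size_t m_used = 0;
};
```

An `OscillatorPool<2048>` would match the number suggested above. The linear scan keeps the sketch short; a free list would make `acquire()` constant-time, and access from several threads would need synchronization that is not shown here.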
I hate to dig up my own code as an example, but: #2089 That's a speedup of maybe ~2x from reorganizing arrays of structs into structs of arrays. Probably even more if compiling for wider SIMD models than SSE2. That's not to say the same applies to the Oscillator situation, but it might be worth considering. |
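For readers unfamiliar with the terms, here is a small illustration of the two layouts. It is not the code from #2089; `VoiceAoS`, `VoicesSoA` and the two field names are made up for the example.

```cpp
#include <cstddef>
#include <vector>

// Array of structs: each voice keeps its fields together, so consecutive
// phases are interleaved with increments in memory.
struct VoiceAoS
{
	float phase;
	float increment;
};

inline void advanceAoS(std::vector<VoiceAoS>& voices)
{
	for (auto& v : voices) { v.phase += v.increment; }
}

// Struct of arrays: each field is stored contiguously, which gives the
// compiler a simple element-wise loop to auto-vectorize.
struct VoicesSoA
{
	std::vector<float> phase;
	std::vector<float> increment;
};

inline void advanceSoA(VoicesSoA& voices)
{
	const std::size_t n = voices.phase.size();
	for (std::size_t i = 0; i < n; ++i)
	{
		voices.phase[i] += voices.increment[i];
	}
}
```

Both functions compute the same result. Any speedup depends on the compiler, the target SIMD width and how many fields each loop actually touches, so it would have to be measured per case.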
@softrabbit Actually I don't disagree. But my understanding was that you were talking about grouping, not restructuring. As a first step requiring few code changes, I'm suggesting grouping them all in one big array (possibly with informal subgrouping in 6/18). As for switching from "arrays of structs into structs of arrays", there are certainly gains to be had. But it is a complete rewrite of the Oscillator class and of the instruments that depend on it. And IMHO, that's something that should only be tried later, possibly in parallel (new classes: Oscillators, Kicker2, TripleOscillator2, Organic2, etc). |
I forgot something: on my PC, your project run "as is" consumes 65% of the CPU. My previous tests were incorrect because I had modified your project. Interesting behaviour with these loop points. What tools can I use to test LMMS performance? In my project, the program shows low usage of system resources according to system monitoring tools, but the sound stutters and glitches heavily. |
Are there some benchmarks for cache miss rate? It is also important for performance. |
@PhysSong I did a couple more benchmarks, again using Startup and load project:
Render project using CLI:
If KCachegrind's cycle estimation is accurate, the old Merging. |
@qnebra You are of course absolutely right here. The now updated wiki page is: |
My memory manager is about 35% faster than rpmalloc. Drawbacks: it is dedicated to LMMS and uses more memory on average. To do: make it auto-adapt to the user's style. Suggestion: make the memory manager an option configurable by the user. |
This reverts commit 8d6cb12.
* Replace MemoryManager implementation with rpmalloc Fixes LMMS#3865 * Travis: Specify OSX image for Qt5 build
Fixes #3865, by integrating rpmalloc.
Quick – probably not very meaningful – benchmark, using Skiessi-RandomProjectNumber14253: