New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Allow use of abstract ModelT #88
Conversation
Signed-off-by: falkTX <falktx@falktx.com>
|
So just to make sure I'm correctly understanding the proposed use case, the idea is that you want to have a system that can dynamically create/load a Given that this is the problem you're trying to solve. I don't think the proposed solution is the best way to go. Part of the design philosophy for The way I have done this sort of thing in the past is as follows: using ModelType1 = RTNeural::ModelT<...>;
using ModelType2 = RTNeural::ModelT<...>;
using ModelVariant = std::variant<ModelType1, ModelType2, ...>;
ModelVariant model;
void custom_model_loader (const nolohmann::json& model_json, ModelVariant& model)
{
if (is_model_type1 (model_json))
model.emplace<ModelType1>();
else if (is_model_type2 (model_json))
model.emplace<ModelType2>();
std::visit ([&model_json] (auto&& chosen_model) { chosen_model.parseJson(model_json); }, model);
}
void process_audio (float* data, int num_samples)
{
std::visit ([&data, num_samples] (auto&& chosen_model)
{
for (int n = 0; n < num_samples; ++n)
data[n] = chosen_model.forward (data[n]);
}, model);
}This strategy allows the model to be created in local memory rather than heap memory, and avoids the need for an abstract function call in the innermost processing loop. If you don't care for Anyway, if you can think of a way to make some helper functions or classes to encapsulate this logic, I'd be happy to add that to RTNeural. At the very least, I think it would be a good idea for me to add an example and some documentation to let folks know about this strategy. |
Yes, supporting as much as we can write in verbose code that would check known conditions.
For the plugin in question we actually want to do heap allocation, which is done in worker threads synchronized with the plugin host. We do a pointer swap when loading new models, the entire thing being thread-safe and lock-free as it is integrated with the host threads (on LV2 this can be done with the worker extension, CLAP has similar concepts too). Basically:
For this to work we really need to be working with direct pointers.
From what we have tested, this approach has a significant performance hit, and becomes unsuitable for the plugin (we are running it in a 1.3GHz ARM processor after all, there is not a whole lot of CPU to go around...) The virtual methods was my first initial thought of something that could work for a generic loader plugin.
Using a union of pointers (or std::variant for a more modern C++ take on it) could be an option, but then every run would need to have a big switch case to find the right model instance type to call. |
|
Okay great, it seems like we are talking about essentially the same use-case :).
Sure, so my point is that any solution built-in to That said, I think that in this case, the non-heap-allocated variant-based solution still makes the most sense. For example, if you ignore host-managed threads, you could do use using ModelType1 = RTNeural::ModelT<...>;
using ModelType2 = RTNeural::ModelT<...>;
using ModelVariant = std::variant<ModelType1, ModelType2, ...>;
ModelVariant models [2];
std::atomic_int active_model_index { 0 };
int inactive_model_index { 1 };
// on some non-real-time thread...
void custom_model_loader (const nolohmann::json& model_json)
{
auto& inactive_model = models[inactive_model_index];
if (is_model_type1 (model_json))
inactive_model.emplace<ModelType1>();
else if (is_model_type2 (model_json))
inactive_model.emplace<ModelType2>();
std::visit ([&model_json] (auto&& chosen_model) { chosen_model.parseJson(model_json); }, inactive_model);
// swap models here!
inactive_model = active_model.exchange (inactive_model);
}
// on the real-time thread...
void process_audio (float* data, int num_samples)
{
std::visit ([&data, num_samples] (auto&& chosen_model)
{
for (int n = 0; n < num_samples; ++n)
data[n] = chosen_model.forward (data[n]);
}, models[active_model.load()]);
}
Would it be possible to explain a little bit more where this performance hit is coming from? From my own testing, I haven't been able to measure any performance difference between a plain By contrast, I have found that constructing a
Sure, so I think the variant-based solution that I've proposed satisfies both these requirements. For requirement 1 (as I've shown above), it's possible to set it up so that the only data that would need to be copied between the real-time and non-real-time threads is a single array index. Or if you prefer, I guess you could use a pointer to a For requirement 2, in my experience,
Right, so that's basically what I'm proposing, although I would say it's better to use the direct Anyway, I'm a bit busy this week, but I'll see if I can get a working example up and running sometime before the end of the week. |
|
that all looks very good and promising, thanks for the detailed info! there is one crucial point though, the creation of the new model and its swap are not done on the same function. from your example code I dont see how we would create a model and have it pass through the host without needing to rely on the plugin class having it in some kind of variable for what to handle next.
I meant when creating the model in a dynamic way, defining its architecture while parsing the json file. For pointers or std::variant, performance should be pretty similar if not the same.
Can you give more information about this? From our testing the performance is pretty much the same using stack or heap allocated models.
Can you clarify this too? Do you mean that using my proposed abstract pointer solution would lead to per-sample processing? |
No problem!
Right, so doing the swap inside the model-loading function was only part of my example because my example was assuming that host-managed worker threads don't exist. In cases where they do exist, I guess the correct solution would depend on how exactly they work, which I'm not immediately familiar with. From what I do know, I think my preferred solution would be:
But like I said, there are some details here that I'm not very familiar with, so I trust that you'll do the right thing for your use case. If you do end up using heap-allocated objects, I'd suggest using something like a
Sure, so a heap-allocated
So one of the biggest performance gains between the |
|
By the way, I just put together a little example project. |
|
Bringing some news.. I ran a few tests with different approaches, and basically:
the example project was extremely helpful, thanks a lot for that! specially the python script to generate a list of model types in a variant.
would be nice if we could extend ModelT to support a few more on how to deal with LV2 worker contexts, for now I still ended up going with pointer swap technique and allocating the model on the heap. not by itself though, but along side other fields that relate to it. which allows to add more things to it as needed. As this becomes a pointer, it is safe to pass around in the worker stuff. Personally I am happy with the results. PS: Just pushed Chow Centaur to the MOD plugin store <3 |
|
Closing this ticket, the variant approach has proven to work well. Great stuff! Thanks a lot for your help and a great project! <3 |
|
Very cool! Glad to hear that this approach is working well. And thanks to you for the useful discussion :) |
Me and @KaisKermani have been testing RTNeural in the context of https://github.com/AidaDSP/aidadsp-lv2 plugin, trying to find a way to make it load json files at runtime without losing performance compared to the statically constructed model types.
One approach that proved to work is to make an abstract Model class and use that as the pointer type to which the plugin calls into, which allows stuff like this:
Compared to the more dynamic handling of the json namespace methods, this approach gives a performance equal to creating the model with all parameters set in the code (kinda expected, as it does pass all template params in a static way).
Our idea is to have a verbose/extended json parser that then creates all known architectures possible in this "static" way, thus not losing performance.
This is a first test patch, meant for general feedback.
A proper patch would at least need the abstract class to have a template for float/double/etc type.
And I am not sure what to call that new
forwardfcall, we just need it to not be "forward" as to avoid recursion and name clashes.What do you think?