Code generation for large Jacobians in nonlinear system initialization scales badly #11302
Comments
Keeping @matteodepascali in the loop.
More than 30% of the total translation time is spent in the function defined in OpenModelica/OMCompiler/Compiler/BackEnd/BackendDAEUtil.mo, lines 10288 to 10319 in 865a1cc.
I am not sure if I understood the whole thing, but basically the function [...] The check for existence in the UnorderedSet is done using the function [...]
@kabdelhak implemented the relevant code fairly recently in #10397 to fix issues introduced by #9263. Maybe he or @phannebohm can give you a more complete analysis.
Thanks @mahge for the analysis! Alas, I don't know the backend internals well enough to give any technical suggestion. What I understand, also given the reference to #9263, is that the point of this part of the code is to figure out which variables appear nonlinearly in a given strong component. These strong components are normally sparse, so there are N equations with at most M << N variables in each of them. The following is naive pseudo code to build the list of nonlinear variables:
The complexity is [...] Then, I would expect the complexity of this algorithm to be [...] Maybe the problem is using an unordered set instead of an ordered list?
Fixes OpenModelica#11302 For each SCC of the system all variables were traversed, so the time complexity was O(N*S) where N is the number of variables and S is the number of SCCs. By first collecting all nonlinear iteration vars in the same set and then marking them once it should now be O(N).
Thanks @mahge, without your measurement I would never have found the issue so quickly 🚀 The problem was that for each of the S strong components the list of all N variables in the system was traversed, so complexity was something like
Unordered sets have approximately constant lookup times because they use hashes for indexing; that's the whole point of using them over lists or trees 😄
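The difference between traversing all variables once per strong component and traversing them once overall can be illustrated with a minimal Python sketch. This is not the OpenModelica code: the function names, the list of variable names, and the way components are represented are all hypothetical, chosen only to show how the number of visited variables changes from N*S to N while the result stays the same.

```python
def mark_per_component(all_vars, components):
    """Hypothetical slow variant: traverse all N variables for each of the S components."""
    nonlinear = {name: False for name in all_vars}
    visits = 0
    for iteration_vars in components:      # S strong components
        wanted = set(iteration_vars)       # small hash set for this component only
        for name in all_vars:              # full traversal of all N variables
            visits += 1
            if name in wanted:             # roughly constant-time hash lookup
                nonlinear[name] = True
    return nonlinear, visits


def mark_once(all_vars, components):
    """Hypothetical fast variant: collect all iteration variables, then traverse once."""
    wanted = set()
    for iteration_vars in components:
        wanted.update(iteration_vars)      # one big set instead of many small ones
    nonlinear = {}
    visits = 0
    for name in all_vars:                  # single traversal of all N variables
        visits += 1
        nonlinear[name] = name in wanted
    return nonlinear, visits


# Illustrative data (assumed): 1000 variables, 100 components of 2 variables each.
all_vars = [f"x{i}" for i in range(1000)]
components = [[f"x{2 * i}", f"x{2 * i + 1}"] for i in range(100)]

slow_marks, slow_visits = mark_per_component(all_vars, components)
fast_marks, fast_visits = mark_once(all_vars, components)
```

With these assumed sizes both variants mark the same 200 variables, but the first visits 100,000 entries and the second only 1,000. With a single very large component (S = 1) the two variants do the same amount of traversal work.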
@phannebohm this looks great if you have many relatively small systems (which we also have in our power plant model at lambda = 0, thanks to smart simplifications). But what if you have only one very big strong component? That is the case in the MWE I posted in this ticket. Did you check how long simcode: create initialization part takes after your fix?
You're right and I thought about that too. But as far as I can see #11312 only saved computation time, no real trade-off except that the unordered set gets larger because it is one big set compared to many smaller sets. But that should still be no real issue since lookup is practically constant. I tested your MWE. Times went down significantly:
BTW, the NB has a completely different structure, and things like this should not happen there because we have direct pointers, so there is no traversing, only direct access. I think...
Judging from the regression report, this commit had very beneficial effects on simcode performance when dealing with large models, including those of the ClaRa library 😃
Description
I am experiencing really bad performance of simcode: create initialization part on a power plant model of about 100,000 equations, where this part takes a whopping 4500 s to generate the code, and over 10,000 s to compile it.
I think I managed to capture the issue in a much simpler MWE, that we can use to identify the issue and solve it.
Steps to Reproduce
Consider the following MWE:
Model M replicates the crucial feature of the original model, namely the need to solve a large sparse nonlinear system of equations for initialization. In this case, the initialization problem has about N initial equations, with about 2N non-zero elements in the Jacobian.
On my Windows PC, simcode: create initialization part takes 0.79 s for N = 1000, 2.42 s for N = 2000, and 12.2 s for N = 4000. Clearly, this does not scale well, given that the number of nonzero elements in the Jacobian is proportional to N.
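To quantify how far these timings are from linear scaling, one can estimate the exponent p in t ~ N^p between consecutive measurements. The following Python sketch uses only the three timings reported above; the helper name `scaling_exponents` is made up for illustration.

```python
import math

# Timings reported above for "simcode: create initialization part" (N -> seconds).
timings = {1000: 0.79, 2000: 2.42, 4000: 12.2}


def scaling_exponents(timings):
    """Return the empirical exponent p in t ~ N**p for each pair of consecutive sizes."""
    sizes = sorted(timings)
    return [
        math.log(timings[n2] / timings[n1]) / math.log(n2 / n1)
        for n1, n2 in zip(sizes, sizes[1:])
    ]


exponents = scaling_exponents(timings)
for p in exponents:
    print(f"estimated exponent: {p:.2f}")
```

The estimated exponents come out at roughly 1.6 and 2.3, so the cost grows faster than linearly and the growth rate itself increases with N. Linear scaling would give exponents close to 1.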
I checked the size of the generated code: all files scale linearly, except 06.inz. This is a bit weird: for N = 1000 there is one such file of 600 kB, for N = 2000 there are four files totalling 2.8 MB, but for N = 4000 they are actually a bit smaller, 2.4 MB. Not sure why this happens.
@mahge if you want to try profiling this test case, the model code is trivial.
@phannebohm any idea what could be the root cause?
Expected Behavior
Simcode time should scale linearly with N.