document performance of model building functions #480
Comments
If you build a model by calling Highs_addRows repeatedly (in the limit, once for each row in the model), there will be a performance penalty. A pass through the column-wise matrix is necessary to insert any number of new rows since, as you identify, data in the column-wise representation must be moved to accommodate them. However, relative to the cost of solving the subsequent LP, the overhead of building the model row-wise is negligible, so it's not worth documenting.
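To make the "one pass per call" point concrete, here is a minimal Python sketch of a column-wise (CSC) matrix. It is a toy model under stated assumptions, not HiGHS's actual data structure: the class `ColumnwiseMatrix`, its `add_rows` method, and the `entries_moved` counter are all hypothetical names introduced for illustration.

```python
class ColumnwiseMatrix:
    """Toy compressed-sparse-column matrix (hypothetical, not HiGHS code)."""

    def __init__(self, num_cols):
        self.num_rows = 0
        self.start = [0] * (num_cols + 1)  # start[j] = offset of column j
        self.index = []                    # row index of each stored entry
        self.value = []                    # value of each stored entry
        self.entries_moved = 0             # bookkeeping for this illustration only

    def add_rows(self, rows):
        """Append rows, each given as a list of (column, value) pairs."""
        num_cols = len(self.start) - 1
        # Group the new entries by the column they belong to.
        extra = [[] for _ in range(num_cols)]
        for offset, row in enumerate(rows):
            for col, val in row:
                extra[col].append((self.num_rows + offset, val))
        # One pass over the existing matrix, however many rows are added.
        new_start, new_index, new_value = [0], [], []
        for col in range(num_cols):
            lo, hi = self.start[col], self.start[col + 1]
            new_index.extend(self.index[lo:hi])
            new_value.extend(self.value[lo:hi])
            self.entries_moved += hi - lo
            for r, v in extra[col]:
                new_index.append(r)
                new_value.append(v)
            new_start.append(len(new_index))
        self.start, self.index, self.value = new_start, new_index, new_value
        self.num_rows += len(rows)


rows = [[(0, 1.0), (1, 1.0)] for _ in range(100)]  # 100 rows, 2 nonzeros each

one_by_one = ColumnwiseMatrix(num_cols=2)
for row in rows:
    one_by_one.add_rows([row])       # one pass over the matrix per row

all_at_once = ColumnwiseMatrix(num_cols=2)
all_at_once.add_rows(rows)           # a single pass for all rows
```

Both objects end up with identical storage, but in this toy model `one_by_one.entries_moved` is 9900 while `all_at_once.entries_moved` is 0, because the batched call never has to shift previously stored entries.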
In the worst case, the complexity of building a problem row by row will be O(n²) (with n the number of constraints), right? Is this always negligible compared to solving a large problem? Other solvers not only have a warning for inefficient model building, but lpsolve, for instance, even has a way to configure how the matrix is stored internally, and states:
If possible, create the LP at once: HiGHS/src/interfaces/highs_c_api.cpp Lines 75 to 79 in 74a4485
If not possible, create row-by-row. Discussion in the Julia wrapper: jump-dev/HiGHS.jl#41
I think you mean column-by-column, right? I think this should be explicitly documented. The quadratic behavior of
It depends on the modeling environment. There are two likely scenarios:
The only time a user would want to call
This is what I am suggesting to document. Your comment above should be in the documentation for the
No, I really mean Here's a typical JuMP model (although every other modeling language has something similar):

    using JuMP, HiGHS
    model = Model(HiGHS.Optimizer)
    @variable(model, x[1:5] >= 0)
    @constraint(model, sum(x) <= 1)
    @objective(model, Max, sum(x))
    optimize!(model)

If we decide to start building the model incrementally after the first At no point do we call This is not really about the difference between
These two functions, despite having similar names, have very different performance characteristics. All that I am saying is that this should be documented. You are saying that when building the model incrementally, it should be done in the way that leads to a quadratic runtime. Users should probably at least be aware of that. Some modelers are internally smarter than JuMP and would be able to leverage addRow if it were performant.
Actually, looking at the issue you link, it looks like JuMP does the exact same thing as other modelers, and uses addRow, which makes it slow with HiGHS.
Interesting discussion. The cost of adding rows one-by-one with addRow is, indeed, O(n^2) for n constraints. I'd assumed that most models would be built outside HiGHS and passed in as a HighsLp, with addRow being used by, say, a MIP solver adding cuts. It's not hard to have the empty constraint matrix defined as being "neutral", with row-wise storage used so long as addRow is called, switching to column-wise storage when run() is called, or if addCol is called. An LP defined as a HighsLp would immediately be stored column-wise. Happy to document this.
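The O(n^2) figure can be checked with a short count. This is a back-of-the-envelope sketch under two assumptions that are not stated in the thread: every row has the same number k of nonzeros, and adding one row to a column-wise matrix requires a pass over every entry already stored.

```latex
% Adding row i (i = 1, ..., n) passes over the (i-1)k entries already stored.
\[
  \sum_{i=1}^{n} (i-1)\,k \;=\; k\,\frac{n(n-1)}{2} \;=\; O(n^2)
  \quad \text{for fixed } k .
\]
% Example: n = 100 rows with k = 2 nonzeros each gives 2 * 100 * 99 / 2 = 9900 entry moves.
```

Under the same assumptions, passing all n rows in a single call needs only one pass over the existing matrix, so the quadratic term comes from the number of calls rather than from the size of the model.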
JuMP has the option, but it's currently not implemented in HiGHS.jl. This issue is to enable the method that allows JuMP to use the cache.
My branch of HiGHS now switches to row-wise storage of the LP matrix if a row is added to a model with only columns. Thus adding rows one-by-one does not incur the O(n^2) cost. I'll update this issue when the branch is merged into master.
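The idea of switching storage orientation can be sketched in a few lines of Python. This is a hypothetical illustration, not the code in the HiGHS branch: the class `OrientedMatrix` and its methods are invented names, and the rule used here (always flip to row-wise before appending a row, and back to column-wise before a column is added or the solver runs) is a deliberate simplification.

```python
class OrientedMatrix:
    """Toy matrix stored row-wise or column-wise (hypothetical, not HiGHS code)."""

    def __init__(self):
        self.orientation = "colwise"
        self.num_rows = 0
        self.num_cols = 0
        self.vectors = []    # columns when "colwise", rows when "rowwise"
        self.transposes = 0  # bookkeeping for this illustration only

    def _transpose(self, target):
        """Flip the storage orientation if it differs from the target."""
        if self.orientation == target:
            return
        count = self.num_rows if target == "rowwise" else self.num_cols
        flipped = [[] for _ in range(count)]
        for outer, vector in enumerate(self.vectors):
            for inner, value in vector:
                flipped[inner].append((outer, value))
        self.vectors = flipped
        self.orientation = target
        self.transposes += 1

    def add_col(self, entries):
        """entries: list of (row, value) pairs."""
        self._transpose("colwise")
        self.vectors.append(list(entries))
        self.num_cols += 1

    def add_row(self, entries):
        """entries: list of (column, value) pairs; appending is cheap when row-wise."""
        self._transpose("rowwise")
        self.vectors.append(list(entries))
        self.num_rows += 1

    def run(self):
        """Stand-in for the solve call, assumed to want column-wise storage."""
        self._transpose("colwise")
        return self.vectors


m = OrientedMatrix()
m.add_col([])                      # a model with only columns
m.add_col([])
m.add_row([(0, 1.0), (1, 1.0)])    # first row triggers the switch to row-wise
m.add_row([(1, 2.0)])              # later rows are plain appends
columns = m.run()                  # one transpose back before solving
```

In this sketch the whole sequence costs two transposes regardless of how many rows are appended in between, which is how row-by-row building can avoid the repeated passes over a column-wise matrix.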
The matrix orientation code seems to be in master now.
Indeed, and more so. HiGHS looks at the number of entries being added compared with the number in the current matrix and transposes it if it is advantageous. And it's all in master now.
This was fixed long ago!
Hello!
Looking at the code, it looks like the model is stored internally column-wise. Is there a performance penalty in building a model row by row using Highs_addRows? When adding rows, does HiGHS need to move large quantities of data in its internal sparse matrix representation? If so, could it be documented?