
document performance of model building functions #480

Closed
lovasoa opened this issue Feb 28, 2021 · 15 comments

@lovasoa
Contributor

lovasoa commented Feb 28, 2021

Hello!
Looking at the code, it looks like the model is stored internally column-wise. Is there a performance penalty in building a model row by row using Highs_addRows? When adding rows, does HiGHS need to move large quantities of data in its internal sparse matrix representation? If so, could this be documented?

@lovasoa lovasoa changed the title document performance of various functions document performance of model building functions Feb 28, 2021
@jajhall
Member

jajhall commented Feb 28, 2021

If you build a model by calling Highs_addRows repeatedly - in the limit, once for each row in the model - there will be a performance penalty. A pass through the column-wise matrix is necessary to insert any number of new rows since, as you identify, data in the column-wise representation must be moved to accommodate it. However, relative to the cost of solving the subsequent LP, the overhead of building the model row-wise is negligible, so it's not worth documenting.
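To make the data movement concrete, here is a small Python sketch (illustrative only, not HiGHS code) of why inserting one row into a column-wise (CSC) sparse matrix requires a pass over every column: each column's slice of the index/value arrays has to shift to make room for the new entry.

```python
# Illustration: inserting one row into a CSC matrix costs O(nnz + ncol),
# because every stored entry may need to be copied to a new position.

def csc_add_row(start, index, value, ncol, row_entries, new_row):
    """Insert `row_entries` ({col: val}) as row `new_row` into a CSC
    matrix given by (start, index, value). Returns the new arrays."""
    new_start, new_index, new_value = [0], [], []
    for col in range(ncol):
        # copy the column's existing entries
        for k in range(start[col], start[col + 1]):
            new_index.append(index[k])
            new_value.append(value[k])
        # append the new row's entry for this column, if any
        if col in row_entries:
            new_index.append(new_row)
            new_value.append(row_entries[col])
        new_start.append(len(new_index))
    return new_start, new_index, new_value

# 2x2 identity matrix stored column-wise
start, index, value = [0, 1, 2], [0, 1], [1.0, 1.0]
# add a third row with an entry in each column
start, index, value = csc_add_row(start, index, value, 2,
                                  {0: 2.0, 1: 3.0}, 2)
print(start, index, value)  # [0, 2, 4] [0, 2, 1, 2] [1.0, 2.0, 1.0, 3.0]
```

Every call rewrites the whole matrix, which is what makes one-row-at-a-time insertion expensive.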

@jajhall jajhall closed this as completed Feb 28, 2021
@lovasoa
Contributor Author

lovasoa commented Feb 28, 2021

In the worst case, the complexity of building a problem row by row will be O(n²) (with n the number of constraints), right? Is this always negligible compared to solving a large problem?

Other solvers not only warn about inefficient model building; lpsolve, for instance, even has a way to configure how the matrix is stored internally, and states:

set_add_rowmode
Specifies which add routine performs best. [...]
The speed improvement is spectacular, especially for bigger models, so it is advisable to call this routine to set the mode.
http://lpsolve.sourceforge.net/5.5/set_add_rowmode.htm
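A back-of-envelope count (Python, for illustration) shows where the quadratic term comes from: if each insertion into column-wise storage shifts every entry already stored, the total work is proportional to 1 + 2 + ... + n.

```python
# Count the entries moved when building an n-row model one addRow at a
# time, assuming a fixed number of nonzeros per row and that each
# insertion copies all previously stored entries.

def total_moves(n, nnz_per_row=3):
    moves = 0
    stored = 0
    for _ in range(n):
        moves += stored          # shift every existing entry
        stored += nnz_per_row    # then append the new row's entries
    return moves

print(total_moves(10))   # 135
print(total_moves(100))  # 14850 -- 10x the rows, 110x the work
```

The growth is quadratic in the number of rows, so for large models built this way the build time can indeed stop being negligible.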

@odow
Collaborator

odow commented Feb 28, 2021

If possible, create the LP at once:

int Highs_passLp(void* highs, const int numcol, const int numrow,
                 const int numnz, const double* colcost,
                 const double* collower, const double* colupper,
                 const double* rowlower, const double* rowupper,
                 const int* astart, const int* aindex,
                 const double* avalue) {

If not possible, create row-by-row.

Discussion in the Julia wrapper: jump-dev/HiGHS.jl#41
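For reference, Highs_passLp expects the constraint matrix in column-wise (CSC) form: astart gives each column's offset into aindex/avalue. A small Python sketch (illustrative, not HiGHS code) of assembling those arrays from a dense matrix:

```python
# Convert a dense row-major matrix (list of rows) to the CSC arrays
# (astart, aindex, avalue) that a column-wise LP load expects.

def dense_to_csc(A):
    numrow, numcol = len(A), len(A[0])
    astart, aindex, avalue = [0], [], []
    for j in range(numcol):
        for i in range(numrow):
            if A[i][j] != 0.0:
                aindex.append(i)       # row index of the entry
                avalue.append(A[i][j])
        astart.append(len(aindex))     # column j+1 starts here
    return astart, aindex, avalue

# constraints: x1 + 2*x2 <= 3  and  4*x2 <= 5
astart, aindex, avalue = dense_to_csc([[1.0, 2.0],
                                       [0.0, 4.0]])
print(astart, aindex, avalue)  # [0, 1, 3] [0, 0, 1] [1.0, 2.0, 4.0]
```

Building these arrays once and passing the whole LP avoids the repeated data movement of incremental row insertion.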

@lovasoa
Contributor Author

lovasoa commented Feb 28, 2021

If not possible, create row-by-row.

I think you mean column-by-column, right ?

I think this should be explicitly documented. The quadratic behavior of addRows is not obvious, and I had to read the code (and draw on experience with other solver libraries) to understand the drastic performance difference between addRows and addColumns. Adding it to the documentation of the function is a small change and could save users hours of performance debugging.

@odow
Collaborator

odow commented Feb 28, 2021

It depends on the modeling environment. There are two likely scenarios:

  • The user supplies the constraint matrix, variable bounds, and objective coefficients all at once from some matrix generator. If so, call Highs_passLp.
  • The user builds the model incrementally, defining variables, the objective function, and constraints, then potentially modifying the data they have already supplied. In this case, they want to call addRow.

The only time a user would want to call addColumn is if they are doing column generation.

@lovasoa
Contributor Author

lovasoa commented Feb 28, 2021

This is what I am suggesting to document. Your comment above should be in the documentation for the addRow function.
And I think you are confusing addRow and addColumn. The efficient one is addColumn, isn't it ? passLp takes a column-wise matrix.

@odow
Collaborator

odow commented Feb 28, 2021

And I think you are confusing addRow and addColumn.

No, I really mean addRow. Modellers think in terms of constraints as rows. They will not give you columns containing the non-zero constraint coefficients, because at the point they are defining their variables, they have not defined any constraints yet.

Here's a typical JuMP model (although every other modeling language has something similar):

model = Model(HiGHS.Optimizer)
@variable(model, x[1:5] >= 0)
@constraint(model, sum(x) <= 1)
@objective(model, Max, sum(x))
optimize!(model)

If we decide to start building the model incrementally after the first @variable, we can't supply the complete column because we haven't defined the constraints yet. Therefore, when we get to @constraint, we need to call addRow.
Alternatively, we can store a cache of the problem and call passLp at the start of optimize!.

At no point do we call addColumn with a list of constraint coefficients. (There is one specific use-case for addColumn, and that is column generation https://en.wikipedia.org/wiki/Column_generation.)

This is not really about the difference between addRow and addColumn. It's about the difference between should I pass everything at once, or should I build the model incrementally. If you're using an algebraic modeling language, the AML should take care of this for you (as JuMP does). If you're using HiGHS directly and you don't need the modeling convenience of addRow, you may as well just build the full A matrix and call passLp instead of calling addColumns.

@lovasoa
Contributor Author

lovasoa commented Feb 28, 2021

This is not really about the difference between addRow and addColumn

These two functions, despite having similar names, have very different performance characteristics. All that I am saying is that this should be documented.

You are saying that when building the model incrementally, it should be done in the way that leads to a quadratic runtime. Users should probably at least be aware of that.

Some modelers are internally smarter than JuMP and would be able to leverage addRow if it were performant.

@lovasoa
Contributor Author

lovasoa commented Feb 28, 2021

Actually, looking at the issue you link, it looks like JuMP does the exact same thing as other modelers, and uses addRow, which makes it slow with HiGHS.

@jajhall
Member

jajhall commented Mar 1, 2021

Interesting discussion. The cost of adding rows one-by-one with addRow is, indeed O(n^2) for n constraints. I'd assumed that most models would be built outside HiGHS and passed in as a HighsLp, with addRow being used by - say - a MIP solver adding cuts. It's not hard to have the empty constraint matrix defined as being "neutral", and row-wise storage used so long as addRow is called, switching to column-wise storage when run() is called, or if addCol is called. An LP defined as a HighsLp would immediately be stored column-wise.

Happy to document this
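The dual-orientation idea described above can be sketched in Python (hypothetical names, not the HiGHS implementation): keep the matrix row-wise while rows are being added, and do a single counting-sort transpose to column-wise storage only when the solver needs it.

```python
# Sketch of lazy orientation: addRow is O(row nnz), and the one-off
# conversion to CSC is O(nnz + ncol) via a counting-sort transpose.

class LazyMatrix:
    def __init__(self, ncol):
        self.ncol = ncol
        self.rows = []                 # row-wise: list of {col: val}

    def add_row(self, entries):        # cheap: just append the row
        self.rows.append(dict(entries))

    def to_csc(self):
        """One-off counting-sort transpose to (start, index, value)."""
        nnz_per_col = [0] * self.ncol
        for row in self.rows:
            for j in row:
                nnz_per_col[j] += 1
        start = [0]
        for c in nnz_per_col:
            start.append(start[-1] + c)
        index = [0] * start[-1]
        value = [0.0] * start[-1]
        pos = list(start[:-1])         # next free slot in each column
        for i, row in enumerate(self.rows):
            for j, v in row.items():
                index[pos[j]] = i
                value[pos[j]] = v
                pos[j] += 1
        return start, index, value

m = LazyMatrix(2)
m.add_row({0: 1.0, 1: 2.0})
m.add_row({1: 4.0})
print(m.to_csc())  # ([0, 1, 3], [0, 0, 1], [1.0, 2.0, 4.0])
```

With this strategy, building n rows incrementally costs O(nnz) overall instead of O(n²).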

@jajhall jajhall reopened this Mar 1, 2021
@odow
Collaborator

odow commented Mar 1, 2021

it looks like JuMP does the exact same thing as other modelers, and uses addRow, which makes it slow with HiGHS.

JuMP has the option, but it's currently not implemented in HiGHS.jl. This issue is to enable the method that allows JuMP to use the cache.

@jajhall
Member

jajhall commented Mar 9, 2021

My branch of HiGHS now switches to row-wise storage of the LP matrix if a row is added to a model with only columns. Thus adding rows one-by-one does not incur the O(n^2) cost. I'll update this issue when the branch is merged into master.

@feldmeier
Collaborator

The matrix orientation code seems to be in master now.
The first call to addRow still has O(nnz) complexity if the matrix had column-wise orientation previously, right? Subsequent calls to addRow will have linear complexity.

@jajhall
Member

jajhall commented Oct 21, 2021

The matrix orientation code seems to be in master now. The first call to addRow still has O(nnz) complexity if the matrix had column-wise orientation previously, right? subsequent calls to addRow will have linear complexity.

Indeed, and more so. HiGHS compares the number of entries being added with the number in the current matrix and transposes it if that is advantageous. And it's all in master now.
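A toy version (hypothetical, not HiGHS's actual rule) of that kind of heuristic: switch to row-wise storage only when the incoming batch of rows is small relative to the entries already stored, so the one-off transpose cost pays for itself.

```python
# Decide whether to transpose a column-wise matrix to row-wise before
# inserting a batch of rows. Transposing costs ~current_nnz once;
# inserting into column-wise storage costs ~current_nnz per batch, so
# switching wins when many small batches are expected.

def choose_orientation(current_nnz, incoming_nnz, currently_columnwise=True):
    """Return True to switch to row-wise storage before inserting rows."""
    if not currently_columnwise:
        return False               # already row-wise: appends are cheap
    return incoming_nnz < current_nnz

print(choose_orientation(10_000, 5))    # True: tiny row into a big matrix
print(choose_orientation(5, 10_000))    # False: bulk load dominates anyway
```

The exact threshold HiGHS uses is an implementation detail; the point is that the decision is based on comparing the two sizes.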

@jajhall
Member

jajhall commented Dec 30, 2021

This was fixed long ago!

@jajhall jajhall closed this as completed Dec 30, 2021
4 participants