Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creating 'big' models is very slow #62

Closed
SupplyChef opened this issue Aug 13, 2021 · 14 comments
Closed

Creating 'big' models is very slow #62

SupplyChef opened this issue Aug 13, 2021 · 14 comments

Comments

@SupplyChef
Copy link

Hello,

I am trying to build a large model (~100,000 variables and constraints) and the call to optimize! takes a long time (10+ min before the optimization even starts).

I have attached a partial trace (I stopped executing after a minute or so) and my understanding of it is that adding the rows and constraints is costly.

Is there a way to speed things in the Julia wrapper or is this an issue with HiGHS (or with my code...)?

Thank you.
trace.txt

@odow
Copy link
Member

odow commented Aug 14, 2021

Can you provide a reproducible example in Julia?

@SupplyChef
Copy link
Author

The following code shows the issue

using JuMP
using HiGHS

struct Product
end

struct Facility
fixed_cost
end

struct Customer
demand
end

struct Lane
facility
customer
unit_cost
minimum
end

m = Model(HiGHS.Optimizer)

customer_count = 1000
facility_count = 500

products = Set([Product()])
facilities = Set([Facility(rand(10000:50000)) for i in 1:facility_count])
customers = Set([Customer(Dict{Product, Int}(p => rand(1:100))) for i in 1:customer_count, p in products])
lanes = Set([Lane(f, c, rand(1:100), rand(0:1)) for f in facilities, c in customers])

@variable(m, opened[facilities], Bin)
@variable(m, sent[products,lanes] >= 0)
@variable(m, used[lanes], Bin)

@constraint(m, [p=products,c=customers], sum(sent[p, l] for l in lanes if l.customer == c) >= c.demand[p])
@constraint(m, [p=products,f=facilities], sum(sent[p, l] for l in lanes if l.facility == f) <= 1_000_000 * opened[f])
@constraint(m, [l=lanes], sum(sent[p, l] for p in products) >= l.minimum * used[l])
@constraint(m, [l=lanes], sum(sent[p, l] for p in products) <= 1_000_000 * used[l])

@objective(m, Min, sum(f.fixed_cost * opened[f] for f in facilities) + sum(l.unit_cost * sent[p, l] for l in lanes, p in products))

optimize!(m)

@odow
Copy link
Member

odow commented Aug 15, 2021

I'll take a look, but a few things:

  1. This isn't a 100,000 variable problem, it's a million variable and constraint problem (lanes has 500,000 elements).
  2. Type the fields of your structs. Part of the slow down is probably the untyped fields
  3. You're comparing items l.customer == c and l.facility == f, but you haven't defined == for your types. You should define Base.:(==)(x::Customer, y::Customer) and the same for Facility. You should also define Base.hash.

Once you've fixed those things, try the timing again and see if it improves things.

Also: I don't know if HiGHS is ready for million variable integer problems yet (cc @jajhall and @lgottwald how big problems have you solved?)

@lgottwald
Copy link

Its hard to say generally as models can behave very different in the solver but in principle HiGHS can already solve many big models.
E.g. looking at large models in miplib2017 HiGHS solves supportcase11 in just 20 seconds, faster than what it takes to read the MPS file. The model has ~8 million columns ~15 million rows and ~32 million nonzeros. Presolve removes most of the model very quickly though.
On the other hand there are quite small MIP models in miplib2017 that are not solved by any solver yet. E.g. pb-market-split8-70-4 has just 70 columns and ~1000 nonzeros and not even a feasible solution is listed on the MIPLIB homepage.

For both MIP and LP HiGHS can sometimes solve very large models very quickly but also its possible that you encounter performance bottlenecks that we still need to address.

@odow
Copy link
Member

odow commented Aug 15, 2021

You could also try m = direct_model(HiGHS.Optimizer()) or m = Model(HiGHS.Optimizer, bridge_constraints = false) to see if that helps. Some speed-ups on this front are coming in the next minor release of JuMP.

@odow
Copy link
Member

odow commented Aug 15, 2021

@lgottwald good to know. The problem is almost certainly on our side then.

@SupplyChef
Copy link
Author

Thank you both for your comments.

I understand that MIP are hard to solve and that it is difficult to predict up-front how long the solver will take to find a solution. My concern here was the time it took to create the model (before the actual solve).

My (very crude) understanding is that the methods used to build the model (e.g., add rows) do quite a bit of work every time they are called. I was curious if there was a way to streamline them or maybe batch the column/row creations.

I tried using a model without bridge constraints as suggested. The creation time is much faster. I had to change some of the variables (e.g., Binary variables produced an exception; I changed them to Integer variables between 0 and 1). Is there a way to know which constructs are supported without the bridge and which ones are not?

@odow
Copy link
Member

odow commented Aug 16, 2021

Binary variables produced an exception

What was the exception? This should work.

I was curious if there was a way to streamline them or maybe batch the column/row creations.

When you used bridge_constraints = false we loaded the full problem in a single call, instead of using add_rows. At the moment if JuMP sees a bridge it falls back to the slow route, even if the bridges are never used. There are some fixes coming in the next minor release of JuMP that will resolve this.

@SupplyChef
Copy link
Author

Here is the stack trace when using Binary variables:

ERROR: KeyError: key (MathOptInterface.SingleVariable, MathOptInterface.ZeroOne) not found
Stacktrace:
[1] getindex
@ .\dict.jl:482 [inlined]
[2] getindex
@ C:\Users\renau.julia\packages\MathOptInterface\YDdD3\src\Utilities\DoubleDicts.jl:149 [inlined]
[3] getindex
@ C:\Users\renau.julia\packages\MathOptInterface\YDdD3\src\Utilities\copy.jl:137 [inlined]
[4] copy_to(dest::HiGHS.Optimizer, src::MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.GenericModel{Float64, MathOptInterface.Utilities.ModelFunctionConstraints{Float64}}}; copy_names::Bool, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ HiGHS C:\Users\renau.julia\packages\HiGHS\pOIMC\src\MOI_wrapper.jl:2256
[5] attach_optimizer(model::MathOptInterface.Utilities.CachingOptimizer{MathOptInterface.AbstractOptimizer, MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.GenericModel{Float64, MathOptInterface.Utilities.ModelFunctionConstraints{Float64}}}})
@ MathOptInterface.Utilities C:\Users\renau.julia\packages\MathOptInterface\YDdD3\src\Utilities\cachingoptimizer.jl:185
[6] optimize!(m::MathOptInterface.Utilities.CachingOptimizer{MathOptInterface.AbstractOptimizer, MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.GenericModel{Float64, MathOptInterface.Utilities.ModelFunctionConstraints{Float64}}}})
@ MathOptInterface.Utilities C:\Users\renau.julia\packages\MathOptInterface\YDdD3\src\Utilities\cachingoptimizer.jl:248
[7] optimize!(model::Model, optimizer_factory::Nothing; bridge_constraints::Bool, ignore_optimize_hook::Bool, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ JuMP C:\Users\renau.julia\packages\JuMP\b3hGi\src\optimizer_interface.jl:185
[8] optimize! (repeats 2 times)
@ C:\Users\renau.julia\packages\JuMP\b3hGi\src\optimizer_interface.jl:157 [inlined]
[9] top-level scope
@ REPL[24]:1

@odow
Copy link
Member

odow commented Aug 16, 2021

Ah. I fixed that in #50 (along with some other things).

I guess I forgot to add it back to master. Let me take a look

@odow
Copy link
Member

odow commented Aug 16, 2021

The fix will be available as a new release once JuliaRegistries/General#42944 is merged.

What are the timings with the typed structs, == implemented, and bridge_constraints = false?

@SupplyChef
Copy link
Author

I changed the code to override == and hash as follows:

struct Product
name::String
end

struct Facility
name::String
fixed_cost::Float64
end

struct Customer
name::String
demand::Dict{Product, Int64}
end

struct Lane
facility::Facility
customer::Customer
unit_cost::Float64
minimum::Int64
end

Base.:(==)(x::Product, y::Product) = x.name == y.name
Base.:(==)(x::Customer, y::Customer) = x.name == y.name
Base.:(==)(x::Facility, y::Facility) = x.name == y.name
Base.:(==)(x::Lane, y::Lane) = (x.customer == y.customer) && (x.facility == y.facility)

Base.:hash(x::Product) = hash(x.name)
Base.:hash(x::Customer) = hash(x.name)
Base.:hash(x::Facility) = hash(x.name)
Base.:hash(x::Lane) = hash(x.customer, hash(x.facility))

Base.:show(io::IOContext{IOBuffer}, x::Product) = print(io, "$(x.name)")
Base.:show(io::IOContext{IOBuffer}, x::Facility) = print(io, "$(x.name)")
Base.:show(io::IOContext{IOBuffer}, x::Customer) = print(io, "$(x.name)")
Base.:show(io::IOContext{IOBuffer}, x::Lane) = print(io, "$(x.facility.name) $(x.customer.name)")

with these changes the creation takes about 20s:

@time create_model()
22.227698 seconds (49.58 M allocations: 3.300 GiB, 5.91% gc time)
A JuMP Model
Minimization problem with:
Variables: 1000500
Objective function type: AffExpr
AffExpr-in-MathOptInterface.GreaterThan{Float64}: 501000 constraints
AffExpr-in-MathOptInterface.LessThan{Float64}: 500500 constraints
VariableRef-in-MathOptInterface.GreaterThan{Float64}: 500000 constraints
VariableRef-in-MathOptInterface.ZeroOne: 500500 constraints
Model mode: AUTOMATIC
CachingOptimizer state: EMPTY_OPTIMIZER
Solver name: HiGHS
Names registered in the model: opened, sent, used

@odow
Copy link
Member

odow commented Aug 16, 2021

Your definition of hash isn't quite correct. It should be:

Base.hash(x::Product, h::UInt64) = hash(x.name, h)

Your show definitions should also just be:

Base.show(io::IO, x::Product)

But pleased to see that it's not an issue with HiGHS. Can this issue be closed then?

@SupplyChef
Copy link
Author

Yes; let's close the issue. Thank you for your help in speeding up the model construction (including my code...).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants