Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Implementation of write_to_file that use MathOptFormat. #1982

Closed
wants to merge 8 commits into from

Conversation

dourouc05
Copy link
Contributor

Follows odow/MathOptFormat.jl#70. Not mergeable right now due to MOF not being registered (but it may happen soon, as I understood).

Copy link
Member

@mlubin mlubin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good start. I added comments on some technical points that need to be addressed before merging.

src/JuMP.jl Outdated Show resolved Hide resolved
src/JuMP.jl Outdated
"""
writeLP(model::Model, fname::AbstractString)

Exports the input model as a LP file. To compress the output file, use a `.gz` extension.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CPLEX (although there are outstanding issues re naming odow/MathOptFormat.jl#67).

src/JuMP.jl Outdated Show resolved Hide resolved
src/JuMP.jl Outdated Show resolved Hide resolved
src/JuMP.jl Outdated Show resolved Hide resolved
src/JuMP.jl Outdated
function writeLP(model::Model, fname::AbstractString)
lp_model = MathOptFormat.LP.Model()
MOI.copy_to(lp_model, backend(model))
MOI.write_to_file(lp_model, fname)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do the JuMP variable names translate to variable names in the LP file? What happens if multiple variables have the same (or empty) names? What if the variable name is incompatible with LP format, e.g., "5 * x2 + x3"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open issue: odow/MathOptFormat.jl#67

There is code in LPWriter.jl to sanitize the names, but I haven't ported it yet.

src/JuMP.jl Outdated
end

"""
writeMPS(model::Model, fname::AbstractString)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments as for write_LP.

@dourouc05
Copy link
Contributor Author

Thanks for your comments! I've adapted the code as you suggested; as far as I can see, your other remarks are more about the documentation than the code, I will handle them when I have more time for it :)!

Copy link
Member

@odow odow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could potentially just have

function write_to_file(model::Model, filename::String)

and dispatch to the appropriate MathOptFormat model based on the extension.

src/JuMP.jl Outdated
"""
writeLP(model::Model, fname::AbstractString)

Exports the input model as a LP file. To compress the output file, use a `.gz` extension.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CPLEX (although there are outstanding issues re naming odow/MathOptFormat.jl#67).

@dourouc05
Copy link
Contributor Author

dourouc05 commented Jun 10, 2019

Having a simpler write_to_file function would be even simpler; would this function allow extensions from outside world? I am currently going through Pyomo's documentation, and they have a GAMS export functionality as a plugin: https://pyomo.readthedocs.io/en/latest/_modules/pyomo/repn/plugins/gams_writer.html (the format is registered by a Python decorator: @WriterFactory.register).

By the way, how expensive is the copy operation for large models? Couldn't we use the solvers' functionalities for writing models to files in order to avoid a copy? We would then have to deal with the variable syntax of LP formats, for instance (how about a keyword argument to choose the variant, resulting in either the solver being called, or MathOptFormat coming into play, or an error being thrown?).

(How about giving access to more "advanced" solver file outputs, such as solution files, IISes, or even bases, solution pools? I suspect these are not available through MOI right now, but could be highly important in some applications.)

EDIT: if the operations are handed to the solver, then it should know the name of the variables (when I looked at it, it seems the names are kept in JuMP and not passed over to the solver). Probably related to jump-dev/CPLEX.jl#225.

@dourouc05
Copy link
Contributor Author

Performance-wise, the current code is a tad slow for my needs: to output 779 models (a total of 17 MiB of data), I need roughly 25 seconds on my under-powered machine (Intel i7 6600U), while solving all these problems only takes 2.5 seconds with CPLEX (roughly, 50 constraints, 150 variables each). I never saw such slowdowns with previous versions of JuMP (pre-MOI) just due to outputting LP files.

@odow
Copy link
Member

odow commented Jun 13, 2019

It would be useful to know what proportion of the slow-down was in copying the models from JuMP to MathOptFormat.MPS.Model, and what proportion was in writing the MPS.Model.

If it's the former, switching from LQOI to native backends (e.g., jump-dev/Gurobi.jl#216) might help.

If it's the latter, then we should investigate improvements in MathOptFormat.

@odow
Copy link
Member

odow commented Jul 6, 2019

Bump. MathOptFormat is now registered. Did you take a look at the performance issue?

@dourouc05
Copy link
Contributor Author

Unfortunately, not yet, although I've been thinking of it recently. I'll try to do something soon.

@dourouc05
Copy link
Contributor Author

dourouc05 commented Jul 6, 2019

First, just to compare the cost of exporting LPs with respect to solving them, here is what I got (one pair of lines per experiment I ran, which is mostly solving a large amount of small LPs with CPLEX, just continuous programs so no B&B stuff) -- just to note, I did not add any kind of loop to compute the timings multiple times:

OPT: 2.96013895e8 ns
LPS: 6.17054099e8 ns
OPT: 1.310818802e9 ns
LPS: 3.602323612e9 ns
OPT: 1.45128e7 ns
LPS: 6.5755503e7 ns
OPT: 4.5673306e7 ns
LPS: 2.843012e8 ns
OPT: 1.627415e8 ns
LPS: 7.54438602e8 ns
OPT: 1.65859005e8 ns
LPS: 8.38074795e8 ns
OPT: 2.1386592e7 ns
LPS: 1.28045298e8 ns
OPT: 3.8551403e7 ns
LPS: 1.65060008e8 ns
OPT: 3.5069401e7 ns
LPS: 1.60229893e8 ns
OPT: 7.4416002e7 ns
LPS: 3.950145e8 ns
OPT: 7.2683992e7 ns
LPS: 3.6835911e8 ns
OPT: 7.4029499e7 ns
LPS: 3.68355501e8 ns
OPT: 6.3251402e7 ns
LPS: 3.6049119e8 ns
OPT: 9.4040605e7 ns
LPS: 4.44301411e8 ns
OPT: 8.394589e7 ns
LPS: 4.317132e8 ns
OPT: 8.952704e6 ns
LPS: 6.4694401e7 ns
OPT: 1.11839e7 ns
LPS: 6.5468701e7 ns
OPT: 1.8196702e7 ns
LPS: 6.7295803e7 ns
OPT: 1.2283594e7 ns
LPS: 7.8794004e7 ns
OPT: 1.7275297e7 ns
LPS: 9.5494896e7 ns
OPT: 1.58006e7 ns
LPS: 9.32676e7 ns
OPT: 1.03004204e8 ns
LPS: 4.43092803e8 ns
OPT: 1.0842302e7 ns
LPS: 6.8420494e7 ns
OPT: 3.5597198e7 ns
LPS: 2.62593402e8 ns
OPT: 1.4946201e7 ns
LPS: 9.2984199e7 ns
OPT: 2.2713606e7 ns
LPS: 9.3310098e7 ns
OPT: 1.0263803e7 ns
LPS: 5.25585e7 ns
OPT: 1.4173799e7 ns
LPS: 8.6237599e7 ns
OPT: 2.0447203e7 ns
LPS: 6.3067005e7 ns
OPT: 4.411899e6 ns
LPS: 3.5387294e7 ns
OPT: 8.0122e6 ns
LPS: 5.7177998e7 ns

Then, with more details, I split the LP timings into three part: creating a new MathOptFormat.LP.Model(), copying the current model into it, writing the file.

OPT: 1.959348e8 ns
LPS: 5.93981003e8 ns
LP instanciations: 530502.0
LP copies: 7.5022498e7
LP writes: 5.18166792e8
OPT: 1.304709898e9 ns
LPS: 3.560990694e9 ns
LP instanciations: 2.074896e6
LP copies: 4.47579713e8
LP writes: 3.110344601e9
OPT: 1.0345098e7 ns
LPS: 4.9579002e7 ns
LP instanciations: 174002.0
LP copies: 3.418394e6
LP writes: 4.5918502e7
OPT: 5.3183899e7 ns
LPS: 3.28032602e8 ns
LP instanciations: 483699.0
LP copies: 2.3829595e7
LP writes: 3.03491694e8
OPT: 1.74188499e8 ns
LPS: 7.54974295e8 ns
LP instanciations: 1.132793e6
LP copies: 5.9982704e7
LP writes: 6.93371901e8
OPT: 2.14658101e8 ns
LPS: 9.45808283e8 ns
LP instanciations: 1.33889e6
LP copies: 6.6768207e7
LP writes: 8.77101295e8
OPT: 2.3910697e7 ns
LPS: 1.23068298e8 ns
LP instanciations: 355106.0
LP copies: 1.7956503e7
LP writes: 1.04598198e8
OPT: 4.5122495e7 ns
LPS: 1.57952899e8 ns
LP instanciations: 485096.0
LP copies: 1.3379405e7
LP writes: 1.43886799e8
OPT: 3.5169001e7 ns
LPS: 1.20683794e8 ns
LP instanciations: 334799.0
LP copies: 9.622004e6
LP writes: 1.10587897e8
OPT: 7.3313988e7 ns
LPS: 3.68387708e8 ns
LP instanciations: 623800.0
LP copies: 2.3465604e7
LP writes: 3.44006097e8
OPT: 7.4649988e7 ns
LPS: 3.81879197e8 ns
LP instanciations: 623494.0
LP copies: 3.0126203e7
LP writes: 3.50857406e8
OPT: 9.3378911e7 ns
LPS: 4.20925402e8 ns
LP instanciations: 644000.0
LP copies: 3.2323306e7
LP writes: 3.87659393e8
OPT: 6.8824398e7 ns
LPS: 3.71507002e8 ns
LP instanciations: 597301.0
LP copies: 2.3079396e7
LP writes: 3.47566999e8
OPT: 9.27079e7 ns
LPS: 4.42489703e8 ns
LP instanciations: 601901.0
LP copies: 3.0534107e7
LP writes: 4.11097605e8
OPT: 8.4536001e7 ns
LPS: 4.29721896e8 ns
LP instanciations: 617098.0
LP copies: 3.0921496e7
LP writes: 3.97914003e8
OPT: 9.8928e6 ns
LPS: 4.7871398e7 ns
LP instanciations: 147296.0
LP copies: 3.146906e6
LP writes: 4.4517299e7
OPT: 1.1122496e7 ns
LPS: 5.4088795e7 ns
LP instanciations: 213797.0
LP copies: 3.739808e6
LP writes: 5.0071899e7
OPT: 1.3020601e7 ns
LPS: 7.4555092e7 ns
LP instanciations: 210195.0
LP copies: 1.2229598e7
LP writes: 6.20066e7
OPT: 1.1498397e7 ns
LPS: 5.9497401e7 ns
LP instanciations: 174700.0
LP copies: 3.759399e6
LP writes: 5.5486299e7
OPT: 2.0558303e7 ns
LPS: 9.7162896e7 ns
LP instanciations: 291500.0
LP copies: 7.566297e6
LP writes: 8.9142806e7
OPT: 1.7685399e7 ns
LPS: 8.5170002e7 ns
LP instanciations: 247401.0
LP copies: 5.193104e6
LP writes: 7.9585003e7
OPT: 9.75973e7 ns
LPS: 4.440695e8 ns
LP instanciations: 4.852302e6
LP copies: 2.9781094e7
LP writes: 4.09124406e8
OPT: 1.715771e7 ns
LPS: 6.0481002e7 ns
LP instanciations: 177302.0
LP copies: 3.750098e6
LP writes: 5.6463495e7
OPT: 3.5680602e7 ns
LPS: 2.44048799e8 ns
LP instanciations: 362904.0
LP copies: 2.4543194e7
LP writes: 2.18956502e8
OPT: 1.1935895e7 ns
LPS: 5.7586603e7 ns
LP instanciations: 151300.0
LP copies: 4.139397e6
LP writes: 5.3236601e7
OPT: 1.80226e7 ns
LPS: 5.4645297e7 ns
LP instanciations: 156398.0
LP copies: 4.358903e6
LP writes: 5.0067398e7
OPT: 9.7667e6 ns
LPS: 4.6783499e7 ns
LP instanciations: 141399.0
LP copies: 3.193802e6
LP writes: 4.3389503e7
OPT: 1.2195103e7 ns
LPS: 5.5930901e7 ns
LP instanciations: 159204.0
LP copies: 4.069896e6
LP writes: 5.1642401e7
OPT: 1.9800501e7 ns
LPS: 5.8073201e7 ns
LP instanciations: 201605.0
LP copies: 4.406104e6
LP writes: 5.3392296e7
OPT: 4.484998e6 ns
LPS: 2.4944801e7 ns
LP instanciations: 79701.0
LP copies: 1.541498e6
LP writes: 2.3283902e7
OPT: 8.2362e6 ns
LPS: 4.9648997e7 ns
LP instanciations: 136997.0
LP copies: 2.8211e6
LP writes: 4.6606304e7

It seems like the majority of the time spent is due to writing things to disk.

Just to be complete, I ran the tests on my Windows 10 x64 1809 laptop (with an SSD!).

julia> versioninfo()
Julia Version 1.1.1
Commit 55e36cc308 (2019-05-16 04:10 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

@odow
Copy link
Member

odow commented Jul 6, 2019

Do you have the code so I can run? What is LPS and OPT?

@dourouc05
Copy link
Contributor Author

I can send you the code privately, if you like (which email address?).

Otherwise: LPS for the complete time to write down the LP files; OPT for the duration of the calls to optimize!.

@odow
Copy link
Member

odow commented Jul 6, 2019

Can you post a gist of a simplified benchmark? Otherwise oscar.dowson at northwestern.edu

@dourouc05
Copy link
Contributor Author

dourouc05 commented Jul 17, 2019

I used the integrated profiler to generate those files (both tree and flat profiles). The procedure: first start my code (with an include), then import Profile, restart the code (@profile include…), export the output. My code has some part of graph processing (shortest paths) plus calling CPLEX.

In the meantime, I suppose the code could be merged as is, with performance improvements coming in future iterations?

profile.zip

@odow
Copy link
Member

odow commented Jul 18, 2019

Instead of copying the old JuMP syntax, how about something like:

@enum(FileFormats, MPS, LP, MOF)

function write_to_file(model::Model, io::IO; format::FileFormats = MPS, kwargs...)
    dest = if format == MPS
        MathOptFormat.MPS.Model(; kwargs...)
    elseif format == LP
        MathOptFormat.LP.Model(; kwargs...)
    else
        @assert format == MOF
        MathOptFormat.MOF.Model(; kwargs...)
    end
    MOI.copy_to(dest, backend(model))
    MOI.write_to_file(dest, io)
    return
end

"""
    write_to_file(model::Model, filename::String; format = MPS, kwargs...)

Write `model` to the file called `filename` using the format `format`. 

Valid formats are given by the enum [`FileFormats`](@ref). 

For keyword options, see [MathOptFormat.jl](https://github.com/odow/MathOptFormat.jl).
"""
function write_to_file(model::Model, filename::String; format = MPS, kwargs...)
    open(filename, "w") do io
        write_to_file(model, io; format = format, kwargs...)
    end
    return
end

This PR also needs tests and documentation before we can merge.

@dourouc05
Copy link
Contributor Author

This other API seems nice; to make it as convenient to use as the old one, what about detecting the format from the extension (if no keyword argument provides it)? The same could be used for compression (detection of an extension, plus another keyword argument to overwrite the extension).

@odow
Copy link
Member

odow commented Jul 18, 2019

We can actually hook into the compression logic in MathOptFormat (https://github.com/odow/MathOptFormat.jl/blob/39048ec3d50153b38992f8a6b38a60396fa3433a/src/MathOptFormat.jl#L143-L157).

"""
    write_to_file(model::Model, io::Union{IO, String}; format = MPS, kwargs...)

Write `model` to `io` using the format `format`.

If `io` is an `IO`, then the model will be written to that IO stream. 
If `io` is a `String`,  then the model will be written to the to file 
called `io`. If `io` is a `String` that ends in `.gz`, then the file 
will be compressed using GZip.

Valid formats are given by the enum [`FileFormats`](@ref). 

For keyword options, see [MathOptFormat.jl](https://github.com/odow/MathOptFormat.jl).
"""
function write_to_file(
        model::Model, io::Union{IO, String}; 
        format::FileFormats = MPS, kwargs...
    )
    dest = if format == MPS
        MathOptFormat.MPS.Model(; kwargs...)
    elseif format == LP
        MathOptFormat.LP.Model(; kwargs...)
    else
        @assert format == MOF
        MathOptFormat.MOF.Model(; kwargs...)
    end
    MOI.copy_to(dest, backend(model))
    MOI.write_to_file(dest, io)
    return
end

@odow
Copy link
Member

odow commented Jul 18, 2019

See: https://github.com/odow/MathOptFormat.jl/blob/39048ec3d50153b38992f8a6b38a60396fa3433a/src/MathOptFormat.jl#L165-L185

You could have the format be format::Union{Nothing, FileFormats} = nothing and automatically detect unless specified. That seems nice.

@dourouc05 dourouc05 force-pushed the dourouc05/mof-link branch 2 times, most recently from 02008a7 to 7c66b75 Compare October 29, 2019 16:12
@dourouc05
Copy link
Contributor Author

dourouc05 commented Oct 29, 2019

Here is what I came up to. Besides rebasing and implementing the proposed changes, I also added an enumeration for compression (in case you want the GZip compression, but have other requirements for the extension). I think parts of this should move into MathOptFormat, though. What is your opinion?

(By the way, the failure seems related to a wrong UUID for MOF.)

Copy link
Member

@odow odow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are failing. Do you need to add a [compat] entry for MOF?

You might want to rename export.jl to file_formats.jl. It could get confused with Julia's export keyword.

src/export.jl Outdated Show resolved Hide resolved
@dourouc05
Copy link
Contributor Author

Actually, I have not really tested this code, I was more looking for feedback on the proposed API (including the split between JuMP and MOF)

Copy link
Member

@odow odow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you looked at
https://github.com/odow/MathOptFormat.jl/blob/e2041074fa6080f00bfb8bf290ebff84f3afbe8e/src/MathOptFormat.jl#L136-L162
CodecZlib is already included in MathOptFormat.

You can probably drop the compression argument, and just infer from the filename.

You really just need

function write_to_file(model, filename; format, kwargs...)
    dest = if format == AUTOMATIC_FILE_FORMAT
        _filename_to_format(filename)
    elseif format == CBF
        MathOptFormat.CBF.Model(; kwargs...)
    elseif ...
        ...
    end
    MOI.copy_to(dest, backend(model))
    MOI.write_to_file(dest, filename)
end

Project.toml Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Oct 30, 2019

Codecov Report

Merging #1982 into master will decrease coverage by 0.65%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1982      +/-   ##
==========================================
- Coverage   91.01%   90.36%   -0.66%     
==========================================
  Files          40       39       -1     
  Lines        4350     4347       -3     
==========================================
- Hits         3959     3928      -31     
- Misses        391      419      +28
Impacted Files Coverage Δ
src/JuMP.jl 84.89% <0%> (+3.9%) ⬆️
src/file_formats.jl 0% <0%> (ø)
src/optimizer_interface.jl 81.08% <0%> (-0.74%) ⬇️
src/nlp.jl 91.93% <0%> (-0.17%) ⬇️
src/aff_expr.jl 88.54% <0%> (ø) ⬆️
src/objective.jl 91.17% <0%> (ø) ⬆️
src/quad_expr.jl 94% <0%> (ø) ⬆️
src/variables.jl 95.05% <0%> (ø) ⬆️
src/constraints.jl 91.6% <0%> (ø) ⬆️
src/indicator.jl
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 326017c...e1679eb. Read the comment docs.

@odow
Copy link
Member

odow commented Oct 30, 2019

I'd be in favor of moving the compression logic into MathOptFormat.

@dourouc05
Copy link
Contributor Author

I actually took inspiration from that part of MOF's code :)! I also included support for more than just GZip compression (which does not always do that great), but that should belong to MOF, I think.

@dourouc05
Copy link
Contributor Author

The compression logic is proposed in odow/MathOptFormat.jl#90; the latest commit is based on this other PR.

@dourouc05 dourouc05 changed the title [WIP] Implementations of writeLP and writeMPS that use MathOptFormat. [WIP] Implementation of write_to_file that use MathOptFormat. Nov 26, 2019
odow pushed a commit to odow/MathOptFormat.jl that referenced this pull request Nov 26, 2019
* Move most of the code from jump-dev/JuMP.jl#1982.

* Refactor to use a dictionary storing all the variance within compression codecs.

* Same refactoring for model types.

* Start restructuring compression format handling.

* Finalise implementation based on types.

* Add documentation for AbstractCompressionScheme.

* Clean up code.

* Add more tests.

* Comment out XZ.

* Remove dependency for commented out code.

* Comment out XZ.
@dourouc05
Copy link
Contributor Author

With MOF master, tests pass (locally).

@odow
Copy link
Member

odow commented Nov 27, 2019

Did you forget to push some commits? The code you have isn't tested, and it won't run because I removed _filename_to_format that we spoke about.

You probably want something like:

function write_to_file(
    model::Model, 
    filename::String;
    format::MathOptFormat.FileFormat = MathOptFormat.AUTOMATIC_FILE_FORMAT,
    compression::MathOptFormat.AbstractCompressionScheme = 
        MathOptFormat.AutomaticCompression(),
)
    if file_format == AUTOMATIC_FILE_FORMAT
        file_format = _detect_file_format(filename)
    end
    dest = MathOptFormats._FILE_FORMATS[file_format][2]()
    MOI.copy_to(dest, backend(model))
    MOI.write_to_file(dest, filename; compression = compression)
    return
end

@odow odow closed this in #2114 Dec 9, 2019
@dourouc05 dourouc05 deleted the dourouc05/mof-link branch September 20, 2020 22:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants