Does scipy support OpenMP #13372
In short, no, not with OpenMP - see gh-10239. We do have parallel algorithms, though, through the
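A minimal sketch of SciPy's non-OpenMP parallelism, using the `workers` argument that `scipy.fft` accepts (assuming SciPy >= 1.4, where `scipy.fft` was introduced):

```python
import numpy as np
import scipy.fft

# Transform a batch of signals; `workers` caps the number of
# threads used to parallelize over the batch dimension.
x = np.random.default_rng(0).standard_normal((64, 1024))
X_serial = scipy.fft.fft(x, workers=1)
X_parallel = scipy.fft.fft(x, workers=4)

# The results are identical; only the execution strategy differs.
print(np.allclose(X_serial, X_parallel))  # True
```

This thread-based approach sidesteps the build and packaging issues that motivated avoiding OpenMP in gh-10239.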
Another component that tries to use OpenMP is HiGHS, used by optimize.linprog.
On my 4-core Mac, with OMP_NUM_THREADS=4, this happens for any LP at all. This is just a comment -- over to you.
OpenMP is disabled, but HiGHS will still tell you about it because it thinks you might be leaving performance on the table. I think this warning should only be displayed in verbose modes -- please do send a reproducible example if this is not the case. The performance boost from parallelizing any simplex solver is minimal (as it's difficult to do), so this warning can safely be ignored. If you're experimenting, you can enable OpenMP for HiGHS by commenting out this line, but we make no guarantees about what happens after that.
Folks,
The default is chosen to be the default solver that HiGHS chooses (currently hard-coded to be dual simplex, but will probably change in the future to use some more intelligent decision logic). I think we don't want to make promises about which solver is chosen when the user doesn't specify, but we might want to change scipy over to using IPM by default if there's enough of a performance improvement. @mdhaber any thoughts about switching to use IPM as HiGHS default?
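To stay insulated from whatever default HiGHS settles on, a caller can pin the solver explicitly; a small sketch with `scipy.optimize.linprog` (assuming SciPy >= 1.6, where the `'highs-ds'` and `'highs-ipm'` method strings are available):

```python
from scipy.optimize import linprog

# Maximize x + y subject to x + y <= 1, x, y >= 0,
# written as minimizing -(x + y) since linprog minimizes.
c = [-1.0, -1.0]
A_ub = [[1.0, 1.0]]
b_ub = [1.0]

# Request dual simplex explicitly rather than relying on the
# default that HiGHS (or SciPy) happens to choose.
res = linprog(c, A_ub=A_ub, b_ub=b_ub, method='highs-ds')
print(res.status, res.fun)  # 0 -1.0
```

Swapping `method='highs-ipm'` selects the interior-point solver instead, so a change in the upstream default would not affect such code.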
Yes, this warning is produced in the C++ source code. It is a harmless warning, but we might think about asking the upstream HiGHS developers about removing it to not bother users who opt out of OpenMP. You can try creating a PR to comment it out, I see no reason we wouldn't accept a PR like that, but it doesn't bother me personally.
There is currently no mechanism in HiGHS to retrieve a partial solution, so if HiGHS does not report a solution, there is nothing to recover. For more context you can see this comment: "The only useful "partial solution" I can think of offering is in the case where an LP is primal unbounded. HiGHS could return the point at which primal unboundedness is detected. Quite how valuable this is to a user is another question."
There was a lot of discussion and testing to determine how tolerances worked (and several fixes that I don't believe have made it into SciPy yet). We hope that the tolerances are working in a reasonable way -- is it unclear from the documentation how tolerances are used? This comment clarifies that IPM tolerances may not be working as expected: "The reason why playing with the feasibility and optimality tolerances - esp setting them to large values - doesn't have much effect on behaviour of IPX is that there is another tolerance, "crossover_start", that determines when crossover starts." If there's a particular failure with dual simplex tolerances that you can reproduce, please do open a ticket so we can investigate!
@mckib2, Thanks for looking this over. Fwiw
What do you think?
@denis-bz These are really interesting results that show a big difference between dual simplex and IPM. Do you mind if we share the gist with the HiGHS developers? I think they would be interested to see it and might have comments on the disparity.
Since IPM has crossover (and is therefore just as accurate as DS), that's fine with me if it tends to be faster on all major platforms. If it's platform-dependent, we would need to figure out what causes the difference between them and decide which way to go from there. BTW, almost all of the Netlib problems are available as benchmarks in optimize_linprog.py, and they'll all run if you have an environment variable set.
I can confirm that DS tends to be slower for larger and more difficult problems. ("Failed" occurs when the benchmark takes longer than 60s, I think.) But it is basically always faster for the smaller problems - that is, all ~70 of the feasible Netlib problems included in SciPy that are not shown above. DS is clearly better at identifying infeasibility correctly: there are 28 infeasible problems (separate from the ones above); DS identifies all of them correctly, but IPM doesn't give the right status code for some of them. There is no difference in accuracy. (However, I suspect that the "correct" values listed along with the Netlib problems are incorrect in some places, or there is something slightly wrong with the MPS files or my conversion code. There are some problems for which IPM and DS agree with one another on the optimal objective but disagree with the listed value - e.g. for GREENBEA and GREENBEB the relative differences are > 1e-5.) @mckib2 could you run HiGHS directly (not via SciPy) on some of these to see if the results hold?
I did a quick run for 106 feasible and 23 infeasible NETLIB problems (some of the feasible problems crashed on HiGHS master, so I omitted those from the results). For the set of feasible problems, status codes only disagreed in 1 instance. @mdhaber For the infeasible problems, IPM fails much more often than simplex, as you said; the only real "failure" seems to be

Pinging @jajhall in case he wants to listen in to our discussion.

Feasible NETLIB Problems
Infeasible NETLIB Problems
Scripts for those who want to follow along:

```shell
# run_probs.sh
# HiGHS master, Release build, commit b8f0bb9
solvers='simplex ipm'
dirs='netlib infeas'
for d in $dirs; do
    echo "" > results.txt
    for problem in $d/*; do
        echo "====================== Problem $d/$problem ======================" >> results.txt
        for solver in $solvers; do
            echo "Solver = $solver" >> results.txt
            echo "solver = $solver" > myopts
            echo "time_limit = 60" >> myopts
            ./highs --options_file myopts $problem | grep 'Objective value\|HiGHS run time\|Model status\|HiGHS status' >> results.txt
        done
    done
    python analyze.py
done
```

```python
'''analyze.py -- rough n ready analysis of HiGHS LP runs'''
import re
import numpy as np

if __name__ == '__main__':
    regex = re.compile(r'====================== Problem (?P<DIR>\w+)\/(?P<PROBLEM>.+).mps ======================\nSolver = simplex\n((Model\s+status\s+:\s+(?P<SIMPLEX_STATUS>\w+)\n(Objective value\s+:\s+(?P<SIMPLEX_FUN>-?\d+\.\d+)e[+|-]\d+\n)?HiGHS run time\s+:\s+(?P<SIMPLEX_TIME>\d+\.\d+)\n)|(HiGHS status: (?P<SIMPLEX_HIGHS_STATUS>\w+)\n))Solver = ipm\n(Model\s+status\s+:\s+(?P<IPM_STATUS>\w+)\n((Objective value\s+:\s+(?P<IPM_FUN>-?\d+\.\d+)e[+|-]\d+\n)?HiGHS run time\s+:\s+(?P<IPM_TIME>\d+\.\d+))|(HiGHS status: (?P<IPM_HIGHS_STATUS>\w+)\n))', flags=re.MULTILINE)
    with open('results.txt', 'r') as fp:
        results = fp.read()
    matches = re.finditer(regex, results)
    print('PROBLEM | IPM_TIME | SIMPLEX_TIME | IPM_STATUS | SIMPLEX_STATUS | IPM_FUN | SIMPLEX_FUN | DS/IPM - 1 | STATUS DIFF | FUN DIFF')
    print('-- | -- | -- | -- | -- | -- | -- | -- | -- | --')
    for m in matches:
        try:
            fun_diff = np.allclose(float(m.group('SIMPLEX_FUN')), float(m.group('IPM_FUN')))
        except (IndexError, TypeError):
            fun_diff = 'None'
        try:
            ipm_status = m.group('IPM_STATUS')
            ipm_time = float(m.group('IPM_TIME'))
        except (IndexError, TypeError):
            ipm_status = m.group('IPM_HIGHS_STATUS')
            ipm_time = None
        try:
            simplex_status = m.group('SIMPLEX_STATUS')
            simplex_time = float(m.group('SIMPLEX_TIME'))
        except (IndexError, TypeError):
            simplex_status = m.group('SIMPLEX_HIGHS_STATUS')
            simplex_time = None
        if simplex_time and ipm_time:
            perc = (simplex_time / ipm_time - 1) * 100
        else:
            perc = 0
        # it ain't pretty, but it gets it done
        print('%s' % m.group('PROBLEM') + ' | '
              '%s' % ipm_time + ' | '
              '%s' % simplex_time + ' | '
              '%s | %s' % (ipm_status, simplex_status) + ' | '
              '%s | %s' % (m.group('IPM_FUN'), m.group('SIMPLEX_FUN')) + ' | '
              '%g%%' % perc + ' | '
              '%s' % (ipm_status == simplex_status) + ' | '
              '%s' % fun_diff)
```
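The parsing strategy in analyze.py hinges on `re.finditer` with named groups; here is a self-contained toy version of the same idea on a synthetic two-line log (the log format below is invented for illustration, not HiGHS output):

```python
import re

log = """Problem AFIRO time 0.01
Problem ADLITTLE time 0.02
"""

# Each match exposes its pieces by name via groupdict(),
# the same mechanism analyze.py uses on the HiGHS results file.
pattern = re.compile(r'Problem (?P<NAME>\w+) time (?P<TIME>[\d.]+)')
rows = [m.groupdict() for m in pattern.finditer(log)]
print(rows[0]['NAME'], float(rows[1]['TIME']))  # AFIRO 0.02
```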
I think we need some of the more challenging problems to know for sure, because currently in your data DS/IPM - 1 is almost always negative, indicating that DS time < IPM time. Can you run these for some of the top problems in my list - QAP12/QAP15, STOCFOR3, DFL001, TRUSS, etc.?
Fwiw, I've run some of the Mittelmann benchmarks through HiGHS-ipm, with runtimes of 5 to 30 minutes or so. Which way is up?
Thanks for all this. Here are a few observations.
Not a lot will happen with HiGHS simplex in the next couple of months, as I'm now teaching my course. The real action with HiGHS at the moment is the growing power of its MIP solver!
There must be a problem with my scripts; the negative values in the tables I provided indicate that IPM time < DS time.
They are available here: http://www.numerical.rl.ac.uk/cute/netlib.html. Although the extension is SIF, I believe they're regular MPS format.
Thanks @mdhaber, here are results from all the problems on that site. AGG, BLEND, DFL001, GFRD-PNC, and SIERRA all throw exceptions and are omitted from the table. Simplex seems a lot more competitive on this set.
Yup. Definitely more reliable. I think we should leave the behavior of
So, a number of SciPy functions have been Cythonized for performance. Would it be possible to include OpenMP and parallelize some of them?
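Even without OpenMP in the Cython layer, coarse-grained parallelism is available from Python today; a sketch using the standard library's thread pool to fan out independent numerical tasks (the `heavy` function here is just a placeholder):

```python
from concurrent.futures import ThreadPoolExecutor

def heavy(x):
    # Stand-in for an expensive, independent numerical task.
    # Compiled SciPy routines typically release the GIL, which
    # is why a thread pool can give real speedups for them.
    return x * x

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(heavy, range(8)))
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

This is the same model exposed by the `workers` argument that several SciPy APIs already accept, and it avoids OpenMP's compiler and runtime-library portability issues.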