
Optimize LinExpr multiplication by one or zero #98

Closed
bonelli opened this issue May 11, 2020 · 14 comments
Comments

@bonelli
Contributor

bonelli commented May 11, 2020

I'm working on a problem that is mathematically defined by a couple of big sparse binary matrices. I'm using quite a lot of python-mip binary variables, and I collected them into numpy tensors to get some syntactic sugar and matrix operations over them.

It's very nice but very, very slow, and it seems to me that some basic optimization, notably in the `LinExpr.__mul__()` function, could make it perform much better.

Would it be reasonable to add special cases for multiplication by 1 or 0 to speed up this kind of use?
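For context, the usage pattern can be sketched with a toy expression class standing in for python-mip's `Var`/`LinExpr` (the `ToyExpr` class and its fields here are illustrative, not the library's code). `numpy.matmul` on an `object`-dtype array dispatches element-wise through `__mul__` and `__add__`, so every scalar multiplication builds a full new expression, even for coefficients 0 and 1:

```python
import numpy as np

class ToyExpr:
    """Minimal stand-in for a linear expression: {var_name: coeff}."""
    def __init__(self, coeffs):
        self.coeffs = dict(coeffs)

    def __mul__(self, k):
        # Without special cases, every multiplication copies the dict,
        # even for k == 1 or k == 0.
        return ToyExpr({v: c * k for v, c in self.coeffs.items()})
    __rmul__ = __mul__

    def __add__(self, other):
        # matmul accumulates products pairwise, so `other` is a ToyExpr here;
        # numeric constants are ignored for brevity.
        out = dict(self.coeffs)
        if isinstance(other, ToyExpr):
            for v, c in other.coeffs.items():
                out[v] = out.get(v, 0) + c
        return ToyExpr(out)
    __radd__ = __add__

# An object-dtype vector of "variables" times a 0/1 integer matrix:
x = np.array([ToyExpr({f"x{i}": 1}) for i in range(3)], dtype=object)
D = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
y = D @ x  # numpy calls __mul__ and __add__ once per matrix entry
print(y[0].coeffs)  # → {'x0': 1, 'x1': 0, 'x2': 1}
```

Note the useless `x1` term with coefficient 0 in the result: multiplying by 0 still allocates and merges an expression, which is exactly what the proposed special cases would skip.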

@h-g-s
Contributor

h-g-s commented May 11, 2020 via email

@bonelli
Contributor Author

bonelli commented May 12, 2020

Hi @h-g-s, thanks for your answer.

I understand the situation, I'm using CPython and started profiling to better understand if I can be of any support.

Here is an extract of the profiling results for the heaviest matrix operation, a `numpy.matmul` between a numpy array containing python-mip variables and another with `dtype=int` containing 0s and 1s.

```
   ncalls          tottime  percall    cumtime  percall    filename:lineno(function)
  3099193           36.14   1.166e-05   62.22   2.008e-05  entities.py:170(__mul__)
 32640914           29.01   8.888e-07   40      1.225e-06  entities.py:310(add_var)
        1           26.79   26.79       26.79   26.79      default.py:488(connect)
145835132           24.36   1.67e-07    24.36   1.67e-07   entities.py:528(__hash__)
  2616313           20.25   7.739e-06   64.66   2.471e-05  entities.py:288(add_expr)
  5638618            7.923  1.405e-06   14.23   2.523e-06  entities.py:320(copy)
        1            4.976  4.976      144     144         foo.py:166(D)
  2622175            4.845  1.848e-06    4.845  1.848e-06  ~:0(<method 'items' of 'dict' objects>)
  2539173            4.698  1.85e-06    76.69   3.02e-05   entities.py:106(__add__)
        1            4.233  4.233       4.233   4.233      ~:0()
  5638653            3.536  6.271e-07   3.536   6.271e-07  ~:0(<method 'copy' of 'dict' objects>)
8760991/8760990      3.278  3.741e-07   5.999   6.847e-07  ~:0()
  5783502            3.234  5.593e-07   3.559   6.154e-07  entities.py:85(__init__)
    72/71            1.829  0.02576     1.831   0.02579    ~:0()
  3240098            1.461  4.51e-07    2.695   8.318e-07  abc.py:96(__instancecheck__)
  3240098            1.232  3.802e-07   1.234   3.808e-07  ~:0()
```

I cut the results off at entries with 1 second or more of tottime.

Attached is a plot (profile_viz) of the same profiling run.

@tkralphs
Member

If you already have all your input data as numpy matrices, you probably just want to use CyLP. Someone with basically the same issue just reported reducing the time to get their instance into Cbc from 10 minutes with python-mip down to 10 seconds with CyLP (see coin-or/CyLP#95).

@bonelli
Contributor Author

bonelli commented May 12, 2020

Thanks, I'll look into CyLP, but I like Python-MIP so far, so I'll give it a good try before switching.

I tried some basic optimizations against the master branch:

  • adding integer 0 returns self, without even a copy
  • multiplying by integer 1 returns self, without even a copy
  • multiplying by integer 0 returns 0
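A sketch of those special cases on a toy expression class (illustrative only, not the actual python-mip code or class names):

```python
import numbers

class ToyExpr:
    """Toy linear expression: {var_name: coeff} plus a constant term."""
    def __init__(self, coeffs, const=0):
        self.coeffs = dict(coeffs)
        self.const = const

    def __mul__(self, other):
        if isinstance(other, numbers.Real):
            if other == 1:
                return self      # multiply by 1: return self, no copy
            if other == 0:
                return 0         # multiply by 0: collapse to the scalar 0
        return ToyExpr({v: c * other for v, c in self.coeffs.items()},
                       self.const * other)
    __rmul__ = __mul__

    def __add__(self, other):
        if isinstance(other, numbers.Real) and other == 0:
            return self          # add 0: return self, no copy
        out = dict(self.coeffs)
        if isinstance(other, ToyExpr):
            for v, c in other.coeffs.items():
                out[v] = out.get(v, 0) + c
            return ToyExpr(out, self.const + other.const)
        return ToyExpr(out, self.const + other)
    __radd__ = __add__

e = ToyExpr({"x": 2}, 3)
assert (e * 1) is e      # no copy made
assert (e * 0) == 0      # collapses to a plain number
assert (e + 0) is e      # no copy made
```

Because the fast paths return `self` rather than a copy, callers that later mutate the result in place will also mutate the original expression; that trade-off comes up below.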

This significantly improves my case of `numpy.matmul` with a binary integer matrix; here are some measurements:

not optimized:
```
2020-05-13 00:55:03,827 INFO foo.bar - Starting D multiplication
2020-05-13 01:03:10,727 INFO foo.bar - Finishing D multiplication
```
= 487 secs

optimized:
```
2020-05-13 00:49:26,222 INFO foo.bar - Starting D multiplication
2020-05-13 00:50:54,799 INFO foo.bar - Finishing D multiplication
```
= 88 secs

For that same operation, in this particular case, I get a speedup of about 5.5×.

Do you think this is worth a PR?

@h-g-s
Contributor

h-g-s commented May 12, 2020

Hi @bonelli , your optimization seems good, please make a PR.

I'm thinking of some optimizations here that I can test, but I would need a code example where this bottleneck appears. Do you think you could write a simpler piece of code that exposes the same performance bottleneck? Then I could test my ideas and see the effect in an environment similar to yours.

@tkralphs, I didn't know that the matrix interface could be so much faster; I'll check how to provide a similar input in Python-MIP.

@jurasofish
Contributor

I was thinking of making the same optimisation, but wouldn't the following code result in issues? `a` would end up being multiplied by 3, correct? Haven't tested.

```python
a = m.add_var()
b = a * 1
b *= 3
```

@bonelli
Contributor Author

bonelli commented May 12, 2020

> I was thinking of making the same optimisation, but wouldn't the following code result in issues? `a` would end up being multiplied by 3, correct? Haven't tested.
>
> ```python
> a = m.add_var()
> b = a * 1
> b *= 3
> ```

If the `*=` operator works in place (I think it does), that behavior is correct: `a` would be multiplied by 3.

Frankly, for this kind of mathematical application I believe in-place operations are so misleading that they shouldn't be allowed.

Mathematical variables/constants are defined once and immutable after that; encouraging this kind of imperative programming in a math/optimization library is something I wouldn't do.
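The aliasing hazard under discussion can be demonstrated with a toy class (illustrative, not python-mip's actual classes): when `__mul__` returns `self` for the `k == 1` fast path and `__imul__` mutates in place, the original variable is silently modified through the alias.

```python
class ToyExpr:
    def __init__(self, coeff):
        self.coeff = coeff

    def __mul__(self, k):
        if k == 1:
            return self              # fast path: no copy for k == 1
        return ToyExpr(self.coeff * k)

    def __imul__(self, k):
        self.coeff *= k              # in-place: mutates this object
        return self

a = ToyExpr(1)
b = a * 1        # b is the very same object as a
b *= 3           # ...so the in-place multiply also changes a
print(a.coeff)   # → 3, not 1
```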

@bonelli
Contributor Author

bonelli commented May 12, 2020

> I'm thinking of some optimizations here that I can test, but I would need a code example where this bottleneck appears. Do you think you could write a simpler piece of code that exposes the same performance bottleneck?

@h-g-s sure, I'll make a PR and post some examples of slow numpy operations with python-mip.

@jurasofish
Contributor

> ```python
> if not isinstance(other, numbers.Real):
> ```

Yes, that's right, it's an in-place operation. Removing in-place ops would of course be a severe backwards-compatibility issue (perhaps for version 2). I'd support it, though.

@h-g-s
Contributor

h-g-s commented May 14, 2020

Hi @bonelli, @tkralphs, I'll add a matrix interface, so that rows can be queried and inserted as numpy arrays; then I think it will be as fast as it can be (with not-so-pretty code, however).

@bonelli
Contributor Author

bonelli commented May 14, 2020

> Hi @bonelli, @tkralphs, I'll add a matrix interface, so that rows can be queried and inserted as numpy arrays; then I think it will be as fast as it can be (with not-so-pretty code, however).

With all due respect, I don't think it's useful to add a matrix notation if it's going to be just syntactic sugar. I just arranged normal variables inside a numpy array and the matrix notation is already there; as a plus, I get all the matrix operations I need, already implemented in numpy.

If, on the other hand, you're thinking of implementing faster versions of the LinExpr operations that take advantage of the matrix structure (as you did with `xsum` for iterables), then I'm all for it.

Sorry I couldn't make the PR today; I didn't have time, even though it's high on my priority list.

@h-g-s
Contributor

h-g-s commented May 15, 2020

A faster LinExpr is the priority, but my idea is also to make available a low-level interface to communicate with the solver. With CFFI it is possible to allocate C arrays which can be passed directly to CBC, for example; this would be a "last resort" for those who want maximum performance.
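The low-level idea is to build a contiguous C array once and hand the raw pointer to the solver, skipping per-element Python objects entirely. A sketch using the stdlib `ctypes` (CFFI's `ffi.new("double[]", [...])` is the analogous allocation; the solver call in the comment is shown for illustration only, not a confirmed binding):

```python
import ctypes

n = 4
# Allocate a contiguous C double[4], the layout a C solver API expects.
coeffs = (ctypes.c_double * n)(1.0, 2.5, 0.0, -3.0)

# A real binding would now pass the pointer straight to the solver, e.g.
#   lib.Cbc_addRow(model, ..., ctypes.cast(coeffs, ctypes.POINTER(ctypes.c_double)))
# (illustrative call only).
print(len(coeffs), list(coeffs))
```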

@bonelli
Contributor Author

bonelli commented May 18, 2020

Here is the PR, for reference: #100.
Once released, it solves this issue for my part of the problem.

@bonelli
Contributor Author

bonelli commented May 23, 2020

Release 1.9.0 resolves this issue for me; I'd close it.
