ENH: add `mdot`: fast dot with multiple arguments. #4977

sotte · 2014-08-20T10:07:53Z

This is not ready to be merged!

mdot makes it easy to chain dot. Think of mdot as this:

def mdot(*args): return reduce(np.dot, args)

However, mdot uses the fact that matrix multiplication is associative and
finds the optimal order of multiplications with dynamic programing before
doing the mutiplication. This can speed up the multiplication quite a bit.

The algorithm follows:
Cormen, "Introduction to Algorithms", Chapter 15.2, p. 370-378

Note: currently mdot only works with arrays which have ndim == 2 to avoid
ambiguities of dot. Also see the discussion in #4311.

TODO:

Decide if ndim != 2 should be ok
Add unittests
decide on name. possible names are:
- mdot: think "multi dot"
- multidot: as above :)
- multi_dot: as above :)
- chain_dot
- fdot: think "fast dot" (should the parameter optimize be optional then? I don't think so.)
- ddot: think "multiple d for multiple dot"
- cdot: think "chain dot"
- odot: think "optimal do

seberg · 2014-08-20T11:15:24Z

About your todo's. The first thing, more then ndim == 2 is OK, if it stick to the new __matmul__ rule (not current dot outer broadcasting) in my opinion. But, I would say that it could be added later on. The only thing I am a bit unsure on is the exact name, because I am not sure if we will add (and what) other names to do with __matmul__ introduction.

sotte · 2014-08-20T11:36:29Z

@seberg I agree, we can add that later.

Possible names if you want to reserve mdot for __matmul__:

fdot: think "fast dot" (should the parameter optimize be optional then? I don't think so.)
ddot: think "multiple d for multiple dot"
cdot: think "chain dot"
odot: think "optimal dot"

I think I prefer cdot of these...

juliantaylor · 2014-08-20T11:53:52Z

I think an explicit chain_dot is best, a single letter is easily confused with lowlevel blas dots, e.g. sdot, ddot, zdot etc.

njsmith · 2014-08-20T12:10:50Z

Yeah, for ndim > 2 I think that leaving it out for now is fine, and we
certainly shouldn't implement legacy dot's behaviour.

For ndim = 1, I think this should be allowed for the first and last entries
in the chain.

On Wed, Aug 20, 2014 at 12:53 PM, Julian Taylor notifications@github.com
wrote:

I think an explicit chain_dot is best, a single letter is easily confused
with lowlevel blas dots, e.g. sdot, ddot, zdot etc.

—
Reply to this email directly or view it on GitHub
#4977 (comment).

Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

jaimefrio · 2014-08-20T13:54:12Z

numpy/linalg/linalg.py

+
+    """
+    for array in args:
+        if array.ndim != 2:


Most numpy functions do not assume that their inputs are ndarrays, only that they are "array-like". You probably want an args = [np.asanyarray(args) for arg in args] before this check.

In most places (and all places in linalg) I looked at np.asarray is used. Any reason why I should use np.asanyarray here?

jaimefrio · 2014-08-20T14:15:33Z

I think this is a cool addition as is. Aside from the unittests, everything else in your TODO list can be added afterwards.

jaimefrio · 2014-08-20T14:20:38Z

numpy/linalg/linalg.py

+    # The cost for multiplying AB is then: 10 * 100 * 5
+    p = [arg.shape[0] for arg in args]
+    p.append(args[-1].shape[1])
+


This may be premature optimization, but since the future plans will likely include taking stacks of arrays, once the @ functionality is in place, checking for shape[-2] and shape[-1], instead ofshape[0]andshape[1]` may be appropriate.

Yeah, let's handle this once @ is in place. I don't think there is even a prototype yet (see #4464).

sotte · 2014-08-21T09:45:00Z

@juliantaylor chain_dot is kinda long. I didn't even know about ddot and sdot. I don't think most people care about the underlying blas function. There is vdot already: http://docs.scipy.org/doc/numpy/reference/generated/numpy.vdot.html. So why not have cdot?

sotte · 2014-08-21T10:08:57Z

I wonder if there is a place for "benchmark unittest". I mean I want to make sure that mdot with optimize=True is faster than the not optimal version.

I also wonder if I should handle cases where there are less than 3 arrays? Should I just throw an Error? Currently I'm using my personal version of mdot everywhere where I used dot before. So it's useful to be able to handle 2 arguments.

sotte · 2014-08-21T12:25:47Z

I'm just realizing that reduce is not part of python 3 anymore. There is functools.reduce which is available in python 2.6 and 3.x. Since the current version of numpy wants 2.6 and above I'll use functools.reduce.

larsmans · 2014-08-23T16:32:33Z

I called a similar function multidot once. I still like that name, it was easy to remember and type.

sotte · 2014-08-25T08:39:52Z

@njsmith would you broadcats a 1D vector into a row matrix (1 x n) for the first in the chain and column matrix (m x 1) for the last in the chain? In that case we don't have to modify the non optimized reduce part and the dynamic programming part also works without changes.

njsmith · 2014-08-25T08:43:11Z

Technically this should be called "promotion" or something rather than
broadcasting, but yes, that's how dot works. (Try it!)
On 25 Aug 2014 09:39, "Stefan Otte" notifications@github.com wrote:

@njsmith https://github.com/njsmith would you broadcats a 1D vector
into a row matrix (1 x n) for the first in the chain and column matrix (m x

for the last in the chain? In that case we don't have to modify the non
optimized reduce part and the dynamic programming part also works without
changes.

—
Reply to this email directly or view it on GitHub
#4977 (comment).

argriffing · 2014-08-27T15:51:59Z

+1 for the mdot name and for allowing 1d first and last args

I wonder if there is a place for "benchmark unittest".

Have you checked python runtests.py --bench? The analogous script in scipy runs benchmarks that are separate from the unit tests but which are organized similarly. Setting up something like vbench could be interesting but would be more work.

mhvk · 2014-08-27T19:47:18Z

numpy/linalg/linalg.py

+    elif n == 2:
+        return dot(arrays[0], arrays[1])
+
+    arrays = [asarray(a) for a in arrays]


Just happened across this: would suggest to always use asanyarray rather than asarray, since there would seem to be no reason here not to preserve a possible subclass.

Separately, should this line be moved up? It seems illogical that the type of output array might depend on the number of arguments passed in.

asarray is used throughout linalg.py and I wanted to preserve it. I'll change it.

The second argument: mdot behaves like dot if there are only one or two arguments. No need to do anything that is unique to mdot. That's why I moved arrays = [asarray(a) for a in arrays] down.

sotte · 2014-08-28T12:06:37Z

@argriffing I'll take a look at python runtests.py --bench. Thanks for the pointer!

sotte · 2014-08-31T14:37:09Z

I added tests and documentation for the ndim==1 cases.

The name of the function is not clear yet. Some for, some against mdot. If there is no crazy opposition against mdot I'd keep the name.

I think we can merge it soon.

(I'll still add some benchmark tests later. Btw, @argriffing thanks for the pointer to vbench. It's a cool tool!)

juliantaylor · 2014-09-02T21:17:33Z

I still prefer chain_dot or multi_dot if its too long users can always change its name in the import, but if I'm the only one mdot is fine.

Can you still write a small section in the release notes for 1.10? this is a really nice feature which should also have a summary line in the highlights.

njsmith · 2014-09-02T21:49:24Z

I also vote for a more obvious name. Reading is harder than writing etc.
On 2 Sep 2014 17:17, "Julian Taylor" notifications@github.com wrote:

I still prefer chain_dot or multi_dot if its too long users can always
change its name in the import, but if I'm the only one mdot is fine.

Can you still write a small section in the release notes for 1.10? this is
a really nice feature which should also have a summary line in the
highlights.

—
Reply to this email directly or view it on GitHub
#4977 (comment).

argriffing · 2014-09-02T22:31:46Z

Those longer names are OK too, I only +1'd the name mdot because I'd coincidentally used this same name for the same function.

sotte · 2014-09-03T09:38:36Z

Ok, I can live with a longer name (I'll import it as mdot anyway :)). multi_dot it is.

@juliantaylor I edited the release notes.

I wonder if I should mention multi_dot in linalg/info.py and linalg/__init__.py. info.py and __init__.py contain almost the same docstring (also in different subpackages) but not quite the same. Is there a reason for this?

sotte · 2014-09-03T11:23:31Z

Do you normally squash the commits before you merge?

seberg · 2014-09-03T11:51:39Z

If it gets a lot then typically yes (and here it seems like quite a lot ;)).

sotte · 2014-09-03T12:05:33Z

Once I get a final OK I'll squash the commits.

jaimefrio · 2014-09-04T06:02:36Z

numpy/linalg/linalg.py

+    of the matrices [1]_ [2]_. Depending on the shape of the matrices
+    this can speed up the multiplication a lot.
+
+    The first and last argument can have `ndim==1` and are treated as


Looking around the docstrings of other functions, see e.g. np.dot, 1-D and 2-D seem to be the preferred ways of indicating the dimensionality of the input arrays. You should try to stick to that convention.

sotte · 2014-11-03T12:58:54Z

@shoyer the documentation mentions the constraints regarding the dimensionality:

The first and last argument can be 1-D and are treated respectively as
row and column vector. The other arguments must be 2-D.

I could make it more explicit.

abalkin · 2014-11-03T13:02:15Z

You do realize that solving the dynamic programming problem at all for a set of square matrices is a waste of effort?

Yes, and this is another reason in favor of multi_dot(x) signature. A set of conforming square matrices is easier to detect.

I think the signature taking *args serves as a useful nudge to signal to users that using a 3D matrix as input is probably not what they want want to be doing ...

Nudging the users is not what a well-designed library should be doing. When NumPy can select the best algorithm by itself, it should just do it. It is perfectly fine to have different code paths for multi_dot of square matrices and rectangular ones. Nudging the users into not using a convenience function in the probably most common case is not.

shoyer · 2014-11-03T13:08:12Z

Well, I'm definitely +1 on skipping the dynamic programming if all matrices have the same shape (at least if there are enough matrices for the speed of the calculating the optimal ordering to matter). That is indeed a better idea than telling users to use reduce/np.dot.

sotte · 2014-11-03T13:23:03Z

The dynamic programming optimization does not help in case of square matrices. So we could skip it of course. Checking the shape of arrays does not really depend on how we supply arguments though.

As for the 3-D arrays: I'm not really interested in it, but of course one can add it. I don't see how this influences the choice of *args, but I'm willing to be convinced :)

abalkin · 2014-11-03T14:00:09Z

Checking the shape of arrays does not really depend on how we supply arguments though.

Yes, it does. If args is a 3-d array you know that all shapes are the same, but you loose this information if you pass args in as *args. So you get an O(1) vs. O(n) difference in this case.

sotte · 2014-11-03T14:24:13Z

I agree. But if you want to circumvent the DP optimization for a list of 2-D square arrays you still have to check every element. The runtime difference (O(1) vs O(n)) is probably neglectable compared to the real multiplication.

jaimefrio · 2014-11-03T14:50:56Z

In case I haven't made this clear, I think *args is a dreadful idea, because I really hope and wish for this function to have a more complete functionality and go on to need the ugly **kwargs (I still don't like it). And if there are ever going to be kwargs, the benefit of *args is exactly two keystrokes, a saving which hardly justifies anything.

That said, the 3D stack of arrays argument is not a very solid one: if a is the 3D array, you could always do multi_dot(*a), which turns your array into a tuple of 2D views into the original 3D array, so it should be a fast operation. And anyway, worrying about performance degradation because of this, or in shape checking, when matrix multiplication is at best an O(n^2.x) algorithm seems pointless.

If there really is a strong liking for *args, I think that we should take the opportunity to discuss what numpy's policy regarding this should be, so that we can at least spare ourselves having this discussion again.

Personally, and as a general rule, I don't think that *args, **kwargs is ok, and I don't buy the Python 3 argument either: support for Python 2.x is not going to be dropped in a very long time. If there really is a strong liking for *args, I think the better option would be to eventually have two functions in the API: one with all the bells and whistles, that took a sequence of arrays as first argument, and whatever kwargs are deemed necessary afterwards, and a second, simplified version of this, that took a variable list of arguments and had predefined options for the kwargs.

This is, by the way, what I think concatenate and the family of hstack, vstack, dstack and the like should have done.

abalkin · 2014-11-03T15:26:52Z

That said, the 3D stack of arrays argument is not a very solid one: if a is the 3D array, you could always do multi_dot(*a), which turns your array into a tuple of 2D views into the original 3D array, so it should be a fast operation. And anyway, worrying about performance degradation because of this, or in shape checking, when matrix multiplication is at best an O(n^2.x) algorithm seems pointless.

Note that in the problems where you need to multiply many square matrices these matrices tend to be small. Oftentimes they are 2x2 or 3x3.

sotte · 2014-11-06T11:16:56Z

My main point was that the interface from the user's perspective would be a bit nicer if we used *args. But I don't have a very strong opinion. If we won't change it to *args we can merge the pull request.

Nodd · 2014-11-07T13:37:52Z

@jaimefrio Just my 2 cents: using *args is not about saving two keystrokes but increasing readability. The more parenthesis you have, the more confusing it is to see where each argument/call/tuple begins and ends. Code is read way more often than it is written...

jaimefrio · 2014-11-07T14:45:37Z

We could go on for ever giving qrguments and counter arguments: one could argue that, because of its very nature, this function is geared towards runtime generated lists of matrices, not with a hardcoded list of them. e.g.it would likely more often be used as something like:

mats = [get_me_a_matrix() for j in range(n)]
multi_dot(*mats)

rather than:

multi_dot(m0, m1, m2, ..., mn)

and that having that extra *, which is not one of the most prominent features of the language, similarly hurts readability.

And don't get me wrong, I find the extra parenthesis as annoying as anyone else, but I much prefer this minor annoyance than being shown an f(*args, **kwargs) signature by an IDE while coding. But if that's not how the community feels, and we can come to an agreement on what the general numpy rule should be, I'll happily give in. I'll try to find some time to articulate the options and send a mail to the discussion list so that we can have a vote there.

sotte · 2014-11-07T15:11:14Z

@jaimefrio maybe a discussion about *args (independent of this concrete pull request) would be good simply to have guidelines for the future. 👍

argriffing · 2014-11-07T15:25:46Z

On one hand I've only ever used the mdot utility for hardcoded lists of arrays and I'd implemented my version using *args, but on the other hand I dislike *args because for whatever reason it tends to spread, turning the signature of every related function into f(*args, **kwargs). For example, it has metastasized throughout the scipy.optimize.minimize framework making that code less than readable to me.

seberg · 2014-11-07T15:52:03Z

Frankly, to me this is more about whether the function would be used as

a @ b @ c @ d  -->  multi_dot(a, b, c, d)

or

reduce(np.dot, arrays)  --> multi_dot(arrays)

and less about whether or not *args, **kwargs is horrible or not. There are a lot of horrible *args, **kwargs uses, but there is no passing through.
If the second one is a common use case, I think we should probably not use the *args signature, if it is rare enough, I guess then depending on how horrible you think *args, **kwargs is, you would choose differently...

argriffing · 2014-11-07T16:09:26Z

@seberg if your question is purely to gather evidence about use cases, I've used your case (1) many times and never case (2). This is the opposite of @jaimefrio's intuition.

sotte · 2014-11-07T16:15:18Z

I used both cases.

juliantaylor · 2014-11-09T16:09:43Z

would need a rebase, probably due to the release notes

`np.linalg.multi_dot` computes the dot product of two or more arrays in a single function call, while automatically selecting the fastest evaluation order. The algorithm for selecting the fastest evaluation order uses dynamic programming and closely follows: Cormen, "Introduction to Algorithms", Chapter 15.2, p. 370-378

sotte · 2014-11-10T10:59:15Z

I rebased onto the current master.

sotte · 2014-11-13T09:38:54Z

So, what are the next steps?

juliantaylor · 2014-11-13T10:08:29Z

list of arrays is consistent with already existing api so its probably not worth bikesheding more about it.
Thanks, this is a very nice new feature.

ENH: add `mdot`: fast dot with multiple arguments.

sotte · 2014-11-13T10:17:32Z

Thanks everybody who was involved in this and gave feedback!

solarjoe · 2017-03-01T12:26:13Z

Though np.linalg.multi_dot was added in Numpy 1.10 a long, long time ago
it is not listed in the Numpy docs. Search engines only find other pages with examples.
Does anyone know how to fix this?

eric-wieser · 2017-03-01T12:31:57Z

@solarjoe: One line in :/doc/source/reference/routines.linalg.rst should do it

solarjoe · 2017-03-01T22:27:06Z

Ok, line added.
#8723

sotte mentioned this pull request Aug 20, 2014

Function to chain multiplications / dot for many arrays #4311

Closed

jaimefrio reviewed Aug 20, 2014
View reviewed changes

mhvk reviewed Aug 27, 2014
View reviewed changes

jaimefrio reviewed Sep 4, 2014
View reviewed changes

sotte force-pushed the mdot branch from 4ffeb68 to 1b12c39 Compare November 10, 2014 10:57

juliantaylor added a commit that referenced this pull request Nov 13, 2014

Merge pull request #4977 from sotte/mdot

f14d5e1

ENH: add `mdot`: fast dot with multiple arguments.

juliantaylor merged commit f14d5e1 into numpy:master Nov 13, 2014

dgasmith mentioned this pull request Jan 23, 2015

ENH: add contract: optimizing numpy's einsum expression #5488

Merged

eric-wieser mentioned this pull request Mar 1, 2017

ENH: Add matmul equivalent of multi_dot #8719

Open

ENH: add mdot: fast dot with multiple arguments. #4977

ENH: add mdot: fast dot with multiple arguments. #4977

Conversation

sotte commented Aug 20, 2014

seberg commented Aug 20, 2014

sotte commented Aug 20, 2014

juliantaylor commented Aug 20, 2014

njsmith commented Aug 20, 2014

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jaimefrio commented Aug 20, 2014

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sotte commented Aug 21, 2014

sotte commented Aug 21, 2014

sotte commented Aug 21, 2014

larsmans commented Aug 23, 2014

sotte commented Aug 25, 2014

njsmith commented Aug 25, 2014

argriffing commented Aug 27, 2014

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sotte commented Aug 28, 2014

sotte commented Aug 31, 2014

juliantaylor commented Sep 2, 2014

njsmith commented Sep 2, 2014

argriffing commented Sep 2, 2014

sotte commented Sep 3, 2014

sotte commented Sep 3, 2014

seberg commented Sep 3, 2014

sotte commented Sep 3, 2014

Choose a reason for hiding this comment

sotte commented Nov 3, 2014

abalkin commented Nov 3, 2014

shoyer commented Nov 3, 2014

sotte commented Nov 3, 2014

abalkin commented Nov 3, 2014

sotte commented Nov 3, 2014

jaimefrio commented Nov 3, 2014

abalkin commented Nov 3, 2014

sotte commented Nov 6, 2014

Nodd commented Nov 7, 2014

jaimefrio commented Nov 7, 2014

sotte commented Nov 7, 2014

argriffing commented Nov 7, 2014

seberg commented Nov 7, 2014

argriffing commented Nov 7, 2014

sotte commented Nov 7, 2014

juliantaylor commented Nov 9, 2014

sotte commented Nov 10, 2014

sotte commented Nov 13, 2014

juliantaylor commented Nov 13, 2014

sotte commented Nov 13, 2014

solarjoe commented Mar 1, 2017

eric-wieser commented Mar 1, 2017

solarjoe commented Mar 1, 2017

ENH: add `mdot`: fast dot with multiple arguments. #4977

ENH: add `mdot`: fast dot with multiple arguments. #4977