ENH: add `mdot`: fast dot with multiple arguments. #4977
About your TODOs. The first thing, more than ndim == 2, is OK if it sticks to the new |
@seberg I agree, we can add that later. Possible names if you want to reserve
I think I prefer |
I think an explicit |
Yeah, for ndim > 2 I think that leaving it out for now is fine, and we
For ndim = 1, I think this should be allowed for the first and last entries
On Wed, Aug 20, 2014 at 12:53 PM, Julian Taylor notifications@github.com
Nathaniel J. Smith |
    """
    for array in args:
        if array.ndim != 2:
Most numpy functions do not assume that their inputs are `ndarray`s, only that they are "array-like". You probably want an `args = [np.asanyarray(arg) for arg in args]` before this check.
In most places I looked at (and all places in linalg), `np.asarray` is used. Any reason why I should use `np.asanyarray` here?
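For readers following the `asarray` versus `asanyarray` question, here is a minimal sketch of the difference. `TaggedArray` is a hypothetical subclass invented only for this illustration; it is not part of the PR.

```python
import numpy as np


class TaggedArray(np.ndarray):
    """Hypothetical ndarray subclass, used only for this illustration."""


x = np.arange(4.0).reshape(2, 2).view(TaggedArray)

as_base = np.asarray(x)       # subclass is dropped, result is a base ndarray
as_any = np.asanyarray(x)     # subclass instance is passed through
from_list = np.asanyarray([[1, 2], [3, 4]])  # array-likes still become ndarrays

print(type(as_base).__name__, type(as_any).__name__, type(from_list).__name__)
```

Both functions accept array-likes such as nested lists, so the only behavioural difference relevant here is whether a subclass survives the conversion.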
I think this is a cool addition as is. Aside from the unittests, everything else in your TODO list can be added afterwards. |
    # The cost for multiplying AB is then: 10 * 100 * 5
    p = [arg.shape[0] for arg in args]
    p.append(args[-1].shape[1])
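To make the role of the dimension list `p` in the snippet above concrete, here is a small pure-Python sketch. The shapes of `A` and `B` match the comment in the diff (10 x 100 and 100 x 5); the shape of `C` and the helper names `chain_dims` and `cost` are assumptions made for illustration only.

```python
def chain_dims(shapes):
    """Build p so that matrix i has shape (p[i], p[i + 1])."""
    p = [rows for rows, _ in shapes]
    p.append(shapes[-1][1])
    return p


def cost(rows, inner, cols):
    """Scalar multiplications needed for a (rows x inner) @ (inner x cols) product."""
    return rows * inner * cols


shapes = [(10, 100), (100, 5), (5, 50)]   # A, B, C (C's shape is assumed)
p = chain_dims(shapes)                    # [10, 100, 5, 50]

cost_ab_c = cost(p[0], p[1], p[2]) + cost(p[0], p[2], p[3])   # (AB)C
cost_a_bc = cost(p[1], p[2], p[3]) + cost(p[0], p[1], p[3])   # A(BC)

print(cost_ab_c, cost_a_bc)
```

With these assumed shapes the two orderings differ by a factor of ten, which is the kind of gap the ordering search is meant to exploit.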
This may be premature optimization, but since the future plans will likely include taking stacks of arrays, once the `@` functionality is in place, checking for `shape[-2]` and `shape[-1]`, instead of `shape[0]` and `shape[1]`, may be appropriate.
Yeah, let's handle this once `@` is in place. I don't think there is even a prototype yet (see #4464).
@juliantaylor |
I wonder if there is a place for a "benchmark unittest". I mean I want to make sure that
I also wonder if I should handle cases where there are fewer than 3 arrays. Should I just throw an error? Currently I'm using my personal version of `mdot` everywhere I used `dot` before, so it's useful to be able to handle 2 arguments. |
I'm just realizing that |
I called a similar function |
@njsmith would you broadcast a 1D vector into a row matrix (1 x n) for the first in the chain and a column matrix (m x 1) for the last in the chain? In that case we don't have to modify the non optimized |
Technically this should be called "promotion" or something rather than
|
+1 for the mdot name and for allowing 1d first and last args
Have you checked |
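As a rough sketch of the 1-D handling discussed above, the following compares an explicit row/column promotion with a call to `np.linalg.multi_dot` (the name under which this function was eventually released, per the release note later in this thread). The example arrays are arbitrary values chosen for illustration.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])        # 1-D, treated as a 1 x 3 row vector
A = np.arange(12.0).reshape(3, 4)
w = np.ones(4)                       # 1-D, treated as a 4 x 1 column vector

row = np.atleast_2d(v)               # shape (1, 3)
col = np.atleast_2d(w).T             # shape (4, 1)
explicit = row.dot(A).dot(col)       # shape (1, 1)

# The 1-D first and last arguments are promoted internally.
promoted = np.linalg.multi_dot([v, A, w])

print(np.ndim(promoted), np.isclose(promoted, explicit[0, 0]))
```

The promoted dimensions are removed again from the result, so with 1-D arguments at both ends the output is a scalar rather than a 1 x 1 matrix.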
    elif n == 2:
        return dot(arrays[0], arrays[1])

    arrays = [asarray(a) for a in arrays]
Just happened across this: would suggest to always use `asanyarray` rather than `asarray`, since there would seem to be no reason here not to preserve a possible subclass.
Separately, should this line be moved up? It seems illogical that the type of output array might depend on the number of arguments passed in.
`asarray` is used throughout linalg.py and I wanted to preserve it. I'll change it.
The second argument: `mdot` behaves like `dot` if there are only one or two arguments. No need to do anything that is unique to `mdot`. That's why I moved `arrays = [asarray(a) for a in arrays]` down.
@argriffing I'll take a look at |
I added tests and documentation for the The name of the function is not clear yet. Some for, some against I think we can merge it soon. (I'll still add some benchmark tests later. Btw, @argriffing thanks for the pointer to vbench. It's a cool tool!) |
I still prefer Can you still write a small section in the release notes for 1.10? This is a really nice feature which should also have a summary line in the highlights. |
I also vote for a more obvious name. Reading is harder than writing etc.
|
Those longer names are OK too, I only +1'd the name mdot because I'd coincidentally used this same name for the same function. |
Ok, I can live with a longer name (I'll import it as @juliantaylor I edited the release notes. I wonder if I should mention |
Do you normally squash the commits before you merge? |
If it gets a lot then typically yes (and here it seems like quite a lot ;)). |
Once I get a final OK I'll squash the commits. |
    of the matrices [1]_ [2]_. Depending on the shape of the matrices
    this can speed up the multiplication a lot.

    The first and last argument can have `ndim==1` and are treated as
Looking around the docstrings of other functions, see e.g. `np.dot`, 1-D and 2-D seem to be the preferred ways of indicating the dimensionality of the input arrays. You should try to stick to that convention.
@shoyer the documentation mentions the constraints regarding the dimensionality:
I could make it more explicit. |
Yes, and this is another reason in favor of multi_dot(x) signature. A set of conforming square matrices is easier to detect.
Nudging the users is not what a well-designed library should be doing. When NumPy can select the best algorithm by itself, it should just do it. It is perfectly fine to have different code paths for multi_dot of square matrices and rectangular ones. Nudging the users into not using a convenience function in the probably most common case is not. |
Well, I'm definitely +1 on skipping the dynamic programming if all matrices have the same shape (at least if there are enough matrices for the speed of calculating the optimal ordering to matter). That is indeed a better idea than telling users to use reduce/np.dot. |
The dynamic programming optimization does not help in case of square matrices. So we could skip it of course. Checking the shape of arrays does not really depend on how we supply arguments though. As for the 3-D arrays: I'm not really interested in it, but of course one can add it. I don't see how this influences the choice of |
Yes, it does. If args is a 3-D array you know that all shapes are the same, but you lose this information if you pass args in as *args. So you get an O(1) vs. O(n) difference in this case. |
I agree. But if you want to circumvent the DP optimization for a list of 2-D square arrays you still have to check every element. The runtime difference (O(1) vs. O(n)) is probably negligible compared to the real multiplication. |
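A minimal sketch of the shortcut being debated here, assuming a list of arrays as input. The names `all_same_square` and `chain_dot` are hypothetical and this is not the code in the PR; it only illustrates the O(n) shape check followed by a plain left-to-right product.

```python
from functools import reduce

import numpy as np


def all_same_square(arrays):
    """True if every array is 2-D, square, and has the same shape (O(n) check)."""
    first = arrays[0].shape
    return (len(first) == 2 and first[0] == first[1]
            and all(a.shape == first for a in arrays))


def chain_dot(arrays):
    """Hypothetical wrapper: skip the ordering search when it cannot help."""
    arrays = [np.asanyarray(a) for a in arrays]
    if len(arrays) < 3 or all_same_square(arrays):
        # Every parenthesization costs the same, so multiply left to right.
        return reduce(np.dot, arrays)
    return np.linalg.multi_dot(arrays)


squares = [2.0 * np.eye(2)] * 3
rects = [np.ones((4, 3)), np.ones((3, 2)), np.ones((2, 5))]
print(chain_dot(squares))
print(chain_dot(rects).shape)
```

For a stacked 3-D input the same check would reduce to inspecting a single shape tuple, which is the O(1) case mentioned above.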
In case I haven't made this clear, I think That said, the 3D stack of arrays argument is not a very solid one: if If there really is a strong liking for Personally, and as a general rule, I don't think that This is, by the way, what I think |
Note that in the problems where you need to multiply many square matrices these matrices tend to be small. Oftentimes they are 2x2 or 3x3. |
My main point was that the interface from the user's perspective would be a bit nicer if we used |
@jaimefrio Just my 2 cents: using *args is not about saving two keystrokes but about increasing readability. The more parentheses you have, the more confusing it is to see where each argument/call/tuple begins and ends. Code is read way more often than it is written... |
We could go on forever giving arguments and counterarguments: one could argue that, because of its very nature, this function is geared towards runtime-generated lists of matrices, not hardcoded lists of them, e.g. it would likely more often be used as something like:
rather than:
and that having that extra And don't get me wrong, I find the extra parentheses as annoying as anyone else, but I much prefer this minor annoyance to being shown an |
@jaimefrio maybe a discussion about |
On one hand I've only ever used the |
Frankly, to me this is more about whether the function would be used as
or
and less about whether or not |
@seberg if your question is purely to gather evidence about use cases, I've used your case (1) many times and never case (2). This is the opposite of @jaimefrio's intuition. |
I used both cases. |
would need a rebase, probably due to the release notes |
`np.linalg.multi_dot` computes the dot product of two or more arrays in a single function call, while automatically selecting the fastest evaluation order. The algorithm for selecting the fastest evaluation order uses dynamic programming and closely follows: Cormen, "Introduction to Algorithms", Chapter 15.2, p. 370-378
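A short usage sketch for the release note above, comparing a hand-chained `dot` with `np.linalg.multi_dot`. The array shapes are arbitrary values chosen for illustration.

```python
import numpy as np

A = np.ones((50, 5))
B = np.ones((5, 50))
C = np.ones((50, 2))

chained = A.dot(B).dot(C)                     # fixed left-to-right order
optimized = np.linalg.multi_dot([A, B, C])    # order chosen from the shapes

print(optimized.shape, np.allclose(chained, optimized))
```

Both calls return the same product; only the order in which the intermediate products are formed may differ.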
I rebased onto the current master. |
So, what are the next steps? |
A list of arrays is consistent with the already existing API, so it's probably not worth bikeshedding more about it. |
ENH: add `mdot`: fast dot with multiple arguments.
Thanks everybody who was involved in this and gave feedback! |
Though `np.linalg.multi_dot` was added in NumPy 1.10 a long, long time ago |
@solarjoe: One line in :/doc/source/reference/routines.linalg.rst should do it |
Ok, line added. |
This is not ready to be merged!
`mdot` makes it easy to chain `dot`. Think of `mdot` as this:
However, `mdot` uses the fact that matrix multiplication is associative and finds the optimal order of multiplications with dynamic programming before doing the multiplication. This can speed up the multiplication quite a bit.
The algorithm follows:
Cormen, "Introduction to Algorithms", Chapter 15.2, p. 370-378
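For reference, here is a pure-Python sketch of the textbook matrix-chain-order recurrence that the description refers to. It is not the implementation in this PR; the function names and the example dimensions are assumptions chosen for illustration.

```python
def matrix_chain_order(p):
    """Return (min_cost, split) for matrices with shapes (p[i], p[i + 1])."""
    n = len(p) - 1
    # m[i][j]: minimal scalar multiplications for the product A_i ... A_j
    m = [[0] * n for _ in range(n)]
    # s[i][j]: index k of the best split (A_i ... A_k)(A_k+1 ... A_j)
    s = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            m[i][j] = float("inf")
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i] * p[k + 1] * p[j + 1]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m[0][n - 1], s


def parenthesize(s, i, j, names):
    """Render the optimal order for A_i ... A_j as a string."""
    if i == j:
        return names[i]
    k = s[i][j]
    return "(" + parenthesize(s, i, k, names) + parenthesize(s, k + 1, j, names) + ")"


# Assumed example: A is 10 x 100, B is 100 x 5, C is 5 x 50.
min_cost, split = matrix_chain_order([10, 100, 5, 50])
order = parenthesize(split, 0, 2, "ABC")
print(min_cost, order)
```

The table fill is O(n^3) in the number of matrices, which is why the thread discusses skipping it when all shapes are equal.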
Note: currently `mdot` only works with arrays which have `ndim == 2` to avoid ambiguities of `dot`. Also see the discussion in #4311.
TODO:
`ndim != 2` should be ok
`mdot`: think "multi dot"
`multidot`: as above :)
`multi_dot`: as above :)
`chain_dot`
`fdot`: think "fast dot" (should the parameter optimize be optional then? I don't think so.)
`ddot`: think "multiple d for multiple dot"
`cdot`: think "chain dot"
`odot`: think "optimal do