
*(a, b, c, xs...) should have methods for matrices and vectors #12065

Closed
andreasnoack opened this issue Jul 8, 2015 · 21 comments
Labels
linear algebra Linear algebra performance Must go faster

Comments

@andreasnoack
Member

There are two problems right now. First, it would be great to have A*B*x = A*(B*x) instead of (A*B)*x, because the former is Θ(n^2)-ish whereas the latter is Θ(n^3)-ish. The second problem is that the optimized transposed multiplication methods are not called for e.g. A'*B*C. Right now, A is transposed before the multiplication, so typically (A'*B)*C is much faster than A'*B*C.
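For concreteness, a minimal sketch of the cost difference (sizes are illustrative; the two groupings agree only up to floating-point roundoff):

```julia
n = 500
A, B = randn(n, n), randn(n, n)
x = randn(n)

y1 = (A * B) * x   # forms the n×n product first: O(n^3) work
y2 = A * (B * x)   # two matrix-vector products: O(n^2) work each

# Same result up to roundoff:
isapprox(y1, y2; rtol = 1e-8)
```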

@StefanKarpinski
Sponsor Member

As a generalization of this, we could use the classic dynamic programming
algorithm to decide what order to multiply matrices in. This could be
memoized, or it might be fast enough to compute every time. I'd guess that
it is since the sizes of arrays needed to compute this would generally
dwarf those required for the new matrix.
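The classic dynamic programming algorithm can be sketched as follows (a minimal illustration; `chain_order` is a hypothetical name, not anything in Base, and `dims[i] × dims[i+1]` is the size of the i-th factor):

```julia
# O(k^3)-time, O(k^2)-space matrix-chain ordering.
function chain_order(dims::Vector{Int})
    k = length(dims) - 1                 # number of factors
    cost  = zeros(Int, k, k)             # cost[i,j]: min flops for factors i..j
    split = zeros(Int, k, k)             # split[i,j]: best split point
    for len in 2:k, i in 1:(k - len + 1)
        j = i + len - 1
        cost[i, j] = typemax(Int)
        for s in i:(j - 1)
            c = cost[i, s] + cost[s+1, j] + dims[i] * dims[s+1] * dims[j+1]
            if c < cost[i, j]
                cost[i, j], split[i, j] = c, s
            end
        end
    end
    return cost, split
end

# For sizes 10×5, 5×100, 100×2, left-to-right evaluation costs
# 10*5*100 + 10*100*2 = 7000 flops, but A1*(A2*A3) costs only
# 5*100*2 + 10*5*2 = 1100 flops, which the DP finds:
cost, split = chain_order([10, 5, 100, 2])
cost[1, 3]  # 1100
```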


@kshyatt kshyatt added performance Must go faster linear algebra Linear algebra labels Jul 9, 2015
@andreasnoack andreasnoack added this to the 0.5 milestone Aug 11, 2015
@tpapp
Contributor

tpapp commented Oct 27, 2015

Matrix multiplication with floats is not associative, so technically A*(B*x) is not equal to (A*B)*x. My understanding is that the proposal would make Julia select the grouping with the best performance, but then A op B op C would evaluate as either (A op B) op C or A op (B op C) depending on the values of A, B, and C for op=*. I don't know if there is a name for what this violates... not type stability, but something similar. That said, I can only think of very contrived examples for when this matters in practice.
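A quick illustration of the point (sizes arbitrary): the two groupings agree to within roundoff, so `≈` holds even where bitwise `==` typically would not.

```julia
A, B = rand(3, 3), rand(3, 3)
x = rand(3)

(A * B) * x ≈ A * (B * x)     # true: equal up to roundoff
# (A * B) * x == A * (B * x)  # typically false bitwise, because
                              # floating-point * is not associative
```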

@jiahao
Member

jiahao commented Oct 27, 2015

Everyone knows that floating point multiplication is not associative, but it rarely matters in practice because the nonassociativity error is bounded by essentially two ulps. See #12375 (comment) for a reference to TAoCP.

@LMescheder

I'm new to Julia and took this as an exercise to get used to it. I created a simple package that uses the dynamic programming algorithm mentioned by Stefan. I think it works quite well and gives huge speedups when the default ordering is suboptimal. Maybe you find it useful!

@kshyatt
Contributor

kshyatt commented Dec 2, 2015

Hi @LMescheder, this looks cool. I don't see your package in https://github.com/JuliaLang/METADATA.jl - have you registered it yet?

@StefanKarpinski
Sponsor Member

Yes, very cool! I would note that the left-to-right multiplication result is not necessarily more correct, in fact, I would guess that the method that uses fewer operations would tend to be more accurate, but I'm sure that people like @alanedelman, @stevengj, @andreasnoack and @jiahao could say better than I.

@jiahao
Member

jiahao commented Dec 2, 2015

Well, ostensibly the reason for our special parsing of *(varargs...) was precisely to enable tricks of the sort @LMescheder has played with, so I'm glad that someone has been able to make use of it! In theory, reordering is unsafe because of the nonassociativity of floating point roundoff, but in practice it doesn't seem to be a big problem. And @alanedelman has yet to see a case where reordering is problematic in real problems.

@StefanKarpinski
Sponsor Member

You can always force a particular association with parentheses.

@jiahao
Member

jiahao commented Dec 2, 2015

Yes, that seems to be what is done in practice, rather than being left ambiguous for the reader to deduce from parsing rules.

@LMescheder

What I really like about the idea is that it is very reasonable to interpret the absence of parentheses as "I don't care about association". In fact, I think this interpretation is far more natural than the usual "act as though I wanted multiplication from the left" interpretation (even though most programming languages and frameworks interpret it the second way).
@kshyatt I've just created a pull request to METADATA.jl

@LMescheder

However, it should be noted that the dynamic programming algorithm's running time is cubic and its required space quadratic in the number of factors. So there should be some heuristic preventing the algorithm from running if someone e.g. wants to multiply 10000 2x2 matrices...

@Jutho
Contributor

Jutho commented Dec 3, 2015

I guess nobody will explicitly write the multiplication of 10000 matrices as
A1 * A2 * ... * A10000, i.e. the heuristic might be that the algorithm is worth running for as many multiplications as people can actually write out one after another. Unless of course some kind of code generation is used, but that certainly seems like bad practice.

@andreasnoack
Member Author

Excluding square matrices seems like a good conservative heuristic. Have you ever looked at the Hu and Shing (1981) algorithm under the "More Efficient Algorithms" section on Wikipedia? Apparently it's n log n instead of n^3.

@LMescheder

@andreasnoack Thanks for the hint; no, I was not aware of it. In their paper (part 1 and part 2) they state that empirically the algorithm runs faster than the dynamic programming algorithm for 7 or more factors. Probably the best approach would be to implement both and select the algorithm depending on the problem size. There also seems to be an O(n) heuristic algorithm whose solution is never worse than 1.155 times the optimal one... In their paper they also cite some other heuristic approaches with different error ratios.

@andreasnoack
Member Author

Thanks for summarizing the paper. I've never read it. Seems like dynamic programming with an upper limit to the number of factors considered is a good solution. Later on, someone might want to add the fancy version for long chains.

@StefanKarpinski
Sponsor Member

What about

  1. Always using the O(n) heuristic, or
  2. Using the exact algorithm up to 6 factors and then switching to the O(n) heuristic?

@stevengj
Member

stevengj commented Dec 3, 2015

+1 for something like @StefanKarpinski's option 2, or even doing the naive algorithm for > MIN factors (or grouping it into clumps of consecutive MIN terms and applying the optimal algorithm to each clump independently). In practice, we really only care about being optimal for a limited number of factors. For products of hundreds or thousands of factors, which can really only arise from metaprogramming, we should just do something reasonable — if someone is doing metaprogramming, they should probably think about their problem enough to figure out a good order themselves (and most likely all the matrices are the same size anyway if there are that many factors).
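The clumping idea could be sketched like this (`clumped_prod` and the clump size are illustrative; for brevity each clump is reduced with `prod` here, whereas a real implementation would apply the optimal ordering within each clump):

```julia
# Split a long chain into clumps of at most `clump` consecutive factors,
# reduce each clump, then multiply the clump results left to right.
function clumped_prod(mats::Vector{<:AbstractMatrix}; clump::Int = 6)
    partials = [prod(mats[i:min(i + clump - 1, end)])
                for i in 1:clump:length(mats)]
    return foldl(*, partials)
end
```

This keeps the exact algorithm's cubic cost bounded per clump while still handling chains of arbitrary length.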

@LMescheder

What's wrong with *(arrays...) for arrays an array of matrices? That's a way of multiplying a large number of matrices without metaprogramming, isn't it?

@StefanKarpinski
Sponsor Member

It unduly stresses the type system since we dispatch on the type of all n arguments to *. It would be better to use prod to do a reduction. Having a reducer for matrices that cleverly tries to minimize work is an entirely different problem from optimizing a fixed, limited number of matrix multiplies.
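For instance (sizes illustrative), a reduction dispatches `*` on only two arguments at a time:

```julia
mats = [rand(4, 4) for _ in 1:100]
P = prod(mats)   # reduces the vector with binary *, no n-argument dispatch
# equivalent, up to floating-point association, to foldl(*, mats)
size(P)          # (4, 4)
```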

@JeffBezanson
Sponsor Member

dup of #3223?

@JeffBezanson JeffBezanson modified the milestones: 0.5.x, 0.5.0 Mar 9, 2016
@tkelman tkelman removed this from the 0.5.x milestone Jul 28, 2016
vtjnash pushed a commit that referenced this issue Jun 7, 2021
This addresses the simplest part of #12065 (optimizing * for optimal matrix order), by adding some methods for * with 3 arguments, where this can be done more efficiently than working left-to-right.

Co-authored-by: Daniel Karrasch <daniel.karrasch@posteo.de>
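The idea behind 3-argument methods of this kind can be sketched by flop counting (`mul3` is an illustrative name, not the actual Base implementation):

```julia
# Choose the cheaper association for a 3-term product by counting flops.
function mul3(A::AbstractMatrix, B::AbstractMatrix, C::AbstractVecOrMat)
    p, q = size(A)
    r, s = size(B, 2), size(C, 2)
    cost_left  = p*q*r + p*r*s    # (A*B)*C
    cost_right = q*r*s + p*q*s    # A*(B*C)
    return cost_left <= cost_right ? (A * B) * C : A * (B * C)
end
```

When C is a vector, s = 1 and the right association degenerates into two cheap matrix-vector products, recovering the A*(B*x) case from the top of this issue.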
@vtjnash
Sponsor Member

vtjnash commented Jun 7, 2021

Closed by #37898

@vtjnash vtjnash closed this as completed Jun 7, 2021
shirodkara pushed a commit to shirodkara/julia that referenced this issue Jun 9, 2021
This addresses the simplest part of JuliaLang#12065 (optimizing * for optimal matrix order), by adding some methods for * with 3 arguments, where this can be done more efficiently than working left-to-right.

Co-authored-by: Daniel Karrasch <daniel.karrasch@posteo.de>
johanmon pushed a commit to johanmon/julia that referenced this issue Jul 5, 2021
This addresses the simplest part of JuliaLang#12065 (optimizing * for optimal matrix order), by adding some methods for * with 3 arguments, where this can be done more efficiently than working left-to-right.

Co-authored-by: Daniel Karrasch <daniel.karrasch@posteo.de>