
convert arguments to MatrixBase<T> #62

Closed
syclik opened this issue Jul 6, 2015 · 29 comments

@syclik (Member) commented Jul 6, 2015

From @jpritikin on January 10, 2015 14:17

Per http://eigen.tuxfamily.org/dox/TopicFunctionTakingEigenTypes.html, it is more flexible if all Eigen arguments are declared as MatrixBase. This issue exists to track the conversion.

Copied from original issue: stan-dev/stan#1209

@jpritikin (Contributor) commented Jan 29, 2016

Any progress here? Have the templates been improved to make this conversion easier?

@syclik (Member, Author) commented Jan 29, 2016

What tests are sufficient for converting to MatrixBase<T>?

This was brought up a while ago and I don't know what the right answer is. Bob and I have both struggled with making things more general because some of the types Eigen returns don't fit into MatrixBase even when they might be compatible with Matrix.

For action to be taken, @jpritikin please clarify what testing would be sufficient to verify something is general enough for use and that it functions properly.

@jpritikin (Contributor) commented Jan 29, 2016

> @syclik: What tests are sufficient for converting to MatrixBase<T>?

In my mind, it's just a matter of converting the types as needed. The way templates work, T could be literally anything so you need to decide which types we actually care about. I suggest we care about the types that we have tests for, not the other way around.

@syclik (Member, Author) commented Mar 11, 2016

@jpritikin, just pinging you again. I sent an example via email where MatrixBase doesn't work -- is there a set of closed types we should test with to make sure it'll work with OpenMX? I could list the types for Stan.

@syclik (Member, Author) commented Mar 11, 2016

(if you think it's not going to hold up generally, I'd rather close this issue and we can address the MatrixBase issue again if it is ever dealt with more consistently from Eigen)

@jpritikin (Contributor) commented Mar 11, 2016

@syclik Yeah, I looked at the email you forwarded. I guess the use case for OpenMx is different compared to Stan's. OpenMx does not generate code. We just use the Stan math library as a regular C++ library. So there is never ambiguity about what types you pass into Stan functions, since a human wrote the code. With Stan, you're generating code, so I grant that you do have to know which types work.

Let's consider an example. Suppose you have a function that takes a Matrix<S,R,C>. Stan generates code that compiles against this function prototype. Now suppose we change the function to take MatrixBase<T>. We know that Stan generated code will continue to work because we can test with MatrixBase<Matrix<S,R,C> >, right? And OpenMx developers get happy too because now we can also pass in Map< ... >. What am I missing?

@syclik (Member, Author) commented Mar 11, 2016

If you look at the example I sent, it shows code that compiles but does the wrong thing at runtime when using MatrixBase, though not when using Matrix directly. Please let me know when you've taken a look at the program in the email and can verify that MatrixBase and Matrix are not completely compatible... we can then talk about how to work around that, if it's possible.


@jpritikin (Contributor) commented Mar 11, 2016

Yeah, I looked at it. The u*v in foo(u*v, 0, 0); does not perform a multiply but creates an expression template, because the first argument of foo is a MatrixBase<T> instead of a Matrix<S,R,C>. My point was that an OpenMx developer will never pass something like u*v into a MatrixBase argument because OpenMx is written by humans. Stan developers need to be more careful because you are writing a code generator. If you already generate code like u*v and expect it to work, then changing the type of a function like foo is going to wreak havoc until you teach your code generator to generate (u*v).eval() where needed. Hm, I guess this change request is more complex than I realized.

@syclik "is there a set of closed types we should test with to make sure it'll work with OpenMX?" -- No, the testing needed for OpenMx is modest.

@syclik I could list the types for Stan.

Can you? Can you easily point to all the callers and potentially generated callers of foo (for a given foo)?

@syclik (Member, Author) commented Mar 11, 2016

Regarding OpenMx, I'm not sure I buy that a human won't ever write (u*v). Why wouldn't a human write that if you really wanted to pass the product into a function?

For Stan, it's:

  • Matrix
  • Block
  • Map
  • any expression of those three things returned from the operations of Eigen (*, /, and whatever else Eigen has available)

Regarding generating code, sure, we could evaluate... or even have all our internal functions return Matrix instead of the expression type, but I think we lose a lot of the power of Eigen's expression templates if we do. Of course, we might have already lost it by not taking MatrixBase, but at least I know that things are safe.

That said, let's close this issue. Can you email stan-dev and outline which functions you'd like to see extended to use MatrixBase, and we can see what we can do to make it happen? I don't think we should try to tackle the whole code base due to this expression problem, but we could tackle a few functions if you wanted to expose them.


@jpritikin (Contributor) commented Mar 11, 2016

> Regarding generating code, sure, we could evaluate... or even have all our internal functions return Matrix instead of the expression type, but I think we lose a lot of the power of Eigen's expression templates if we do. Of course, we might have already lost it by not taking MatrixBase, but at least I know that things are safe.

I suggest that not knowing is a concern. Is there an easy way (besides looking at the assembly code) to understand whether the compiler is creating unnecessary temporary copies of everything? It seems like the testing should be focused more specifically on what we think the compiler might do wrong. I mean, if an addition expression template works then subtraction will certainly work. It would be a bit silly to test both of them.

@bob-carpenter (Contributor) commented Mar 12, 2016

On Mar 11, 2016, at 6:02 PM, Joshua Pritikin notifications@github.com wrote:

>> Regarding generating code, sure, we could evaluate... or even have all our internal functions return Matrix instead of the expression type, but I think we lose a lot of the power of Eigen's expression templates if we do. Of course, we might have already lost it by not taking MatrixBase, but at least I know that things are safe.
>
> I suggest that not knowing is a concern. Is there an easy way (besides looking at the assembly code) to understand whether the compiler is creating unnecessary temporary copies of everything?

Not that I know of. But we know that whenever we have a function that returns Matrix, then we have to copy if the internal result is a template expression. Same for arguments to functions.

So

MatrixXd plus(const MatrixXd& a, const MatrixXd& b) {
  return a + b;
}

would copy rather than return the sum template expression.

And if we did this:

plus(a + b, c + d);

then the arguments get copied rather than passed as template expressions.

> It seems like the testing should be focused more specifically on what we think the compiler might do wrong.

Testing pretty much needs to be exhaustive to make sure things don't go wrong. But in some sense, we can't really test every floating point input to every function, so some judgement is required.

> I mean, if an addition expression template works then subtraction will certainly work. It would be a bit silly to test both of them.

certainly -> probably

@jpritikin (Contributor) commented Mar 13, 2016

> But we know that whenever we have a function that returns Matrix, then we have to copy if the internal result is a template expression.

So you don't care very much about obtaining extra performance using template expressions. You'd rather just have correctness and fewer C++ subtleties?

@bob-carpenter (Contributor) commented Mar 13, 2016

I'd say that we insist on correctness first. Given we can maintain correctness, extra performance is good.

I'm OK with C++ subtlety, but all else being equal, straightforward code is better because many people have to maintain it.

To summarize, the heart of the problem with correctness and template expressions here is:

  1. Eigen returns fundamentally different kinds of template expressions for arithmetic operations and blocking operations.
  2. MatrixBase deals with one and Ref deals with the other; Ref is nice and general, but type inference won't work through it.

The only solution I see is to either

A. rewrite the library to accept both inputs (lots of code), or

B. rewrite the library to accept just MatrixBase and make sure we never return block() operations (trades one kind of efficiency for the other).

So I'm not clear on what you think we should do.

There are at least three other ways we can speed up basic matrix math:

  • better memory locality blocking algorithms for the matrix autodiff functions (potential order of magnitude speedup here)
  • CPU parallelism with multithreading in both matrix libs and in ODE solvers and in our density calculations
  • GPU callouts from matrix ops (potential order of magnitude or more speedup here, conditioned on doing the above first so everything's not memory bound)

Bob


@jpritikin (Contributor) commented Mar 15, 2016

Any solution here seems so complicated that I give up. I suspect C++ (even C++17) just isn't up to the task.

Has anybody spent some time looking closely at Rust, https://www.rust-lang.org/ ?

@bgoodri (Contributor) commented Mar 15, 2016

This:

> B. rewrite the library to accept just MatrixBase and make sure we never return block() operations (trades one kind of efficiency for the other)

seems better for Stan than option A, but I don't know whether OpenMx ever returns a block().

@jpritikin (Contributor) commented Mar 15, 2016

> but I don't know if OpenMx is ever returning a block()

For OpenMx, MatrixBase is preferable. That's how this whole discussion got started.

@bob-carpenter (Contributor) commented Aug 18, 2016

I want to reserve issues for when we have a concrete plan of what to implement, so I'm closing this. The top-line summary is that:

  • MatrixBase<T> would allow us to pass template expressions, thus avoiding copies and improving performance.
  • What's preventing us from doing this is that not every Eigen expression is assignable to MatrixBase<T>. Specifically, the blocking operations aren't assignable.
@bgoodri (Contributor) commented Aug 18, 2016

The manual has an example of assigning to the top left block:

http://eigen.tuxfamily.org/dox/classEigen_1_1Block.html

@bob-carpenter (Contributor) commented Aug 18, 2016

Exactly what Daniel and I keep trying to say---when you apply the block method, you get an instance of Eigen::Block<Derived>, which can't be assigned to Eigen::MatrixBase<Derived2>.

I'd be very happy to be wrong about this, but won't be convinced until I see working code for a function with an Eigen::MatrixBase<Derived2> argument that can accept the result of a block operation without evaluating first.

@bgoodri (Contributor) commented Aug 18, 2016

I am not understanding what doesn't work. Does the generated code for this "evaluate m first"? If so, what is the problem with that?

goodrich@T540p:/tmp$ clang++-3.9 -I/usr/include/eigen3 -o test.o test.cc 
goodrich@T540p:/tmp$ ./test.o 
m = 
1 2
5 6
goodrich@T540p:/tmp$ cat test.cc 

#include <Eigen/Dense>
#include <iostream>

template<typename Derived>
void foo(const Eigen::MatrixBase<Derived>& m) {
  std::cout << "m = " << std::endl << m << std::endl;
}

int main()
{
  Eigen::MatrixXd m(4,4);
  m <<  1, 2, 3, 4,
        5, 6, 7, 8,
        9,10,11,12,
       13,14,15,16;
  foo(m.topLeftCorner(2,2));
  return 0;
}
@bob-carpenter (Contributor) commented Aug 18, 2016

I meant "eval" in the expression template sense: copy it out of an expression template into the base representation type. That happens implicitly on assigning to a MatrixXd or passing to a function as a MatrixXd argument.

I verified that code works for me and with the actual block() call in Eigen.

@syclik Wasn't it the block() operations that were causing us problems before with assigning to MatrixBase?

Assuming @syclik and I were just confused, the next step would be figuring out what the return types would have to be for all of our functions. As @jpritikin keeps pointing out, this could lead to a substantial reduction in copies (not that those are likely to be a bottleneck compared to the autodiff, but it can't hurt).

Changes can be staged in two steps:

  1. generalizing all the arguments
  2. generalizing the return types

The first will have to be done all in one go, whereas the second can be done piecemeal.

It will make our math lib doxygen doc as incomprehensible as Eigen's, but I think that's probably OK.

Is there going to be a way to specialize for vectors?

@syclik (Member, Author) commented Aug 18, 2016

I'll try to dig up a reproducible example. It didn't have to do with assignment. It had to do with some things not working as advertised if you used MatrixBase instead of Matrix.


@jpritikin (Contributor) commented Aug 18, 2016

@syclik If I recall correctly, the problem was that when you use expression templates then you might unintentionally pass some big complex expression into some other function that is written to assume, from a performance perspective, that it is getting a flat, simple matrix with elements that can be addressed in constant time. Since the current code base uses Matrix<> everywhere, nobody is careful about where it is important to eval() expression templates into a simple Matrix. @bob-carpenter is skeptical that data copying is a major performance hit. Well, I bet it is. My experience is that CPU/memory bandwidth is often the limiting factor for numerical algorithms.

Changing the whole codebase would be a huge project. What I would encourage is some exploration of whether an incremental approach of small steps is feasible.

@jpritikin (Contributor) commented Aug 18, 2016

I tried to find the troublesome example cited by @syclik or @bob-carpenter in my email archive, but all I could find was this:

On Sun, Mar 13, 2016 at 12:38:44PM -0700, Bob Carpenter wrote:

> To summarize, the heart of the problem with correctness and template expressions here is:
>
> 1. Eigen returns fundamentally different kinds of template expressions for arithmetic operations and blocking operations
> 2. MatrixBase deals with one and Ref deals with the other; Ref is nice and general, but type inference won't work through it.
>
> The only solution I see is to either A. rewrite the library to accept both inputs (lots of code) or B. rewrite the library to accept just MatrixBase and make sure we never return block() operations (trades one kind of efficiency for the other). So I'm not clear on what you think we should do.
>
> There are at least three other ways we can speed up basic matrix math:
>
> • better memory locality blocking algorithms for the matrix autodiff functions (potential order of magnitude speedup here)
> • CPU parallelism with multithreading in both matrix libs and in ODE solvers and in our density calculations
> • GPU callouts from matrix ops (potential order of magnitude or more speedup here, conditioned on doing the above first so everything's not memory bound)

I'm not sure that I 100% believe what Bob said, but I agree it is a very complex issue.

@syclik (Member, Author) commented Aug 18, 2016

I posted an example to stan-dev.

@jpritikin: I'd love to see an example where there is a performance hit. If you can produce a reasonable example (one that might arise from the Stan language and where we can clearly see the performance hit, consistent on at least two platforms), then I'm really happy to think about how to avoid the copying.


@jpritikin (Contributor) commented Aug 18, 2016

@syclik I'm not sure how I would produce an example showing the performance difference between copying and not copying in the Stan language, because stan-math copies. Where am I going to get an implementation that doesn't copy? The reason I suspect that copying impacts performance is my experience with programming mathematical algorithms: CPU/memory bandwidth is often the limiting factor.

@jpritikin (Contributor) commented Aug 18, 2016

> I posted an example to stan-dev.

Yeah, linked here. So what I am saying is that you'd have to find all the places where you might generate a multiplication (or similar non-addressable expression template) and add an eval() before passing it into a function.

@bob-carpenter (Contributor) commented Aug 18, 2016

@syclik (Member, Author) commented Aug 18, 2016

Yes, I came to the same conclusion. But... it would still be helpful to know which operations don't work for template expressions passed into MatrixBase. Is it just accessing elements? Or is there more?

I really dislike that it triggers an assertion failure (in the current Eigen). We can't catch assertions and do something sensible with them; it'll just stop everything.

@syclik modified the milestone: v2.12.0 (Sep 7, 2016)
