-
-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds views for var_value<Eigen::Matrix> class #2024
Conversation
…4.1 (tags/RELEASE_600/final)
…4.1 (tags/RELEASE_600/final)
…4.1 (tags/RELEASE_600/final)
stan/math/rev/core/vari.hpp
Outdated
* @param j Column index | ||
*/ | ||
inline const auto coeff(Eigen::Index i, Eigen::Index j) const { | ||
return vari_value<double>(val_(i, j), adj_(i, j)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making just a copy of the value and adjoint is not the correct thing to do here, as setting adjoint of the copy will not propagate it back to the matrix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that's intented since you can't assign to subsets of a var_value<>
. That's also why these are all returning const types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately there still is a problem. Even if you only use a value in a rev operation (and not change it), that rev operation still needs to update the adjoint and that adjoint needs to propagate. I think we need a scalar var view with separate pointers to value and adjoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes so I agree with this, what I'm trying to figure out right now is that for the var to decide if it holds a view or an actual vari_value
I'm doing
using vari_type = std::conditional_t<is_plain_type<value_type>::value,
vari_value<value_type>, vari_view<T>>;
So this is fine and good for Block
types and other views, but for views into coefficients value_type
/T
will be a double
which is the plain type so this will become a vari_value<double>
. We don't want to explicitly have a second template parameter to set the vari_type
to, can you think of another way to handle this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would make a separate class var_view
(not vari
). This also avoids two pointer dereferences (one to get vari and second to get value/adjoint) )when operationg on such a scalar. Also since this class is only used for scalars we don't need to template it.
We could also make this object be implicitly convertible to var_value<double>
, registering the callback to propagate adjoint on the stack. Although for performance reason we could also not (or at least make it explicit) and explicitly use var_view
wherever this is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting! Okay I'll work on this today
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I had a bit of a hard time getting what you suggested to work. In var<mat>.coeff
I added a callback to propogate the adjoint of the var_value<double>
back to the original var<mat>
. I think this makes since assuming that the two cases this will be used 99% of the time are
- assigning a cell of a static matrix to a var
real a = -B[i];
- iterating over a static matrix to assign to the cells in a dynamic matrix
matrix[N, M] A; // assume static
matrix[N, M] B; // assume dynamic
for (i in 1:N) {
B[i] = -A[i]; // very inefficient but a thing a user might want to do
}
so for both of these we will need to make a var anyway so having a var_view
would not help. Are there normal cases I'm missing that a var_view
would be more efficient? If you have any time to take a crack at this I'd be interested in seeing what you have in mind!
…4.1 (tags/RELEASE_600/final)
Ugh I think we updated doxygen and it's catching a new doc error. Gonna look into it |
Ack this might be a doxygen bug. @serban-nicusor-toptal do you know what version of doxygen Jenkin's uses? If it's from 1.8.14-1.8.17 it might be the bug below. |
Hey @SteveBronder
Should I update it ? Latest or any specific version ? |
It looks like the amici folks were able to work around this and I just asked them what the problem was for them so we could potentially fix it here. I'm not sure updating doxygen would work. I'm going to set doxygen warnings to not error out for a bit. I'm on a debian buster setup right now that only has access to doxygen 1.8.13 so can't replicate the error but I think I can replicate this on my home desktop |
Jenkins Console Log Machine informationProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010CPU: G++: Clang: |
…into feature/vari-eigen-slices
stan/math/prim/meta/plain_type.hpp
Outdated
@@ -70,7 +70,7 @@ struct plain_type<T, require_t<is_detected<T, internal::plain_type_check_t>>> { | |||
*/ | |||
template <typename T> | |||
struct plain_type<T, | |||
require_t<bool_constant< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@t4c1 I had to add this so that it would use the default for kernel expressions. Or should we add a PlainObject
to kernel expressions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or should we add a PlainObject to kernel expressions?
No. Adding a require for eigen types here is fine. If/when we do need a plain_type
for kernel expressions we should add a specialization, not PlainObject
.
@serban-nicusor-toptal it looks like the distrubtion tests are failing because the jenkins machine ran out of disk storage, do you know how to fix that? |
reverse_pass_callback([vari_coeff, this, i]() mutable { | ||
this->vi_->adj_(i) += vari_coeff->adj_; | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like having to do a copy for the reverse pass. I guess it works, so we can leave it for now, but I sure am revisiting this soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's a bummer, but I think given the use case it make sense no? Like if most of the time we are using .coeff()
to assign to a var then I think we need to do something like this right? But I'm fine with it for now so I can go start the plumbing up in the stan repo so we can kick of work on the compiler stuff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you wait a few days with changes in stan repo and compiler? I have an idea I would like to try on this views. I think we can make them assignable, so there is no need for thius type of matrices to be static. This would allow us to eventuallly completely replace matrix<var>
with var<matrix>
.
Jenkins Console Log Machine informationProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010CPU: G++: Clang: |
@t4c1 when you get a min can you take a look at this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, there is just some cleanup left to do.
Jenkins Console Log Machine informationProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010CPU: G++: Clang: |
Summary
This adds the views
block()
,row()
,col()
,operator()()
,coeff()
segment()
,head()
,tail()
tovar_value
andvari_value
.Some things to note
const
.Strides
so this adds a new template parameter tovari_value
that allows the underlyingEigen::Map
to have anInnerStride<>
orOuterStride<>
.const
and we need to specify a stride for thevari_type
inside ofvar_value
there's an additional templateVariType
forvar_value
that allows for the underlying vari_type to be overridden. So for example in theblock()
function we havewhere
vari_sub::vari_type
is anvari_value<Eigen::MatrixXd, Eigen::OuterStride<>>
I'd really like to get rid of the extra templates here but I'm not sure how else to handle the
InnerStride<>
andOuterStride<>
for the underlyingEigen::Map
without them.Tests
Tests were added to
var_test
andvari_test
to check whether subsets are taken correctly.Side Effects
Not a side effect but some todo's include
var_value<Matrix>
overrides for the functionshead()
etc for the stan language to accessvar_value<SparseMatrix>
. I wanted to wait on that to make sure the pattern here looked okay before I wrote it out for sparse as well.Release notes
Replace this text with a short note on what will change if this pull request is merged in which case this will be included in the release notes.
Checklist
Math issue How to add static matrix? #1805
Copyright holder: Steve Bronder
The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
the basic tests are passing
./runTests.py test/unit
)make test-headers
)make test-math-dependencies
)make doxygen
)make cpplint
)the code is written in idiomatic C++ and changes are documented in the doxygen
the new changes are tested