Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eigen type caster improvements #215

Merged
merged 14 commits into from Jun 7, 2023
Merged

Eigen type caster improvements #215

merged 14 commits into from Jun 7, 2023

Conversation

WKarel
Copy link
Contributor

@WKarel WKarel commented May 16, 2023

  • new: is_ndarray_scalar_v, used to limit type_caster<Eigen> to those scalars that it can actually handle. This gives users the chance to add their own specializations for other scalars.
  • de-bug: correct handling of Eigen/ndarray strides
  • enhanced: fully support Eigen's dynamic strides; request contiguous memory from ndarray only if really needed.
  • de-bug: prevent nb::cast<Eigen::Map> and nb::cast<Eigen::Ref<mutable>> from pointing to invalid memory due to conversions.
  • enhanced: renamed a few test functions to better reflect their signature.

As discussed, type_caster<Eigen::Map> and type_caster<Eigen::Ref<mutable>> do not support conversions any longer, unless a cleanup_list is passed. This prevents nb::cast from returning Eigen types that point into destroyed memory.

Still, I suggest these 2 type casters to not support conversions at all, even if they handle bound function arguments. I believe that the intention of binding a function that expects an Eigen::Map is to modify the passed memory (Map<mutable>) or to avoid copies (Map<const>). Silent conversions may hide errors and show surprising behaviour instead, see e.g. https://github.com/WKarel/nanobind/blob/f339ec0df6ce0d2d185e82df03e4e1905c04a183/tests/test_eigen.py#L99 where the argument remains unmodified.

… scalars that it can actually handle.

- de-bug: correct handling of Eigen/ndarray strides
- enhanced: fully support Eigen's dynamic strides; request contiguous memory from ndarray only if really needed.
- de-bug: prevent nb::cast<Eigen::Map> and nb::cast<Eigen::Ref<mutable>> from pointing to invalid memory due to conversions.
- enhanced: renamed a few test functions to better reflect their signature.
…und function argument, and ensure that it owns the data it maps.

- re-format to project coding style.
@wjakob
Copy link
Owner

wjakob commented May 22, 2023

(FYI, I am working towards a conference deadline and will get back to reviewing PRs in a few days -- sorry for the delay)

Copy link
Owner

@wjakob wjakob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A first set of feedback from a quick pass over the PR.

@@ -28,6 +28,28 @@ NAMESPACE_BEGIN(detail)
template <typename T>
constexpr int NumDimensions = bool(T::IsVectorAtCompileTime) ? 1 : 2;

template<typename T>
struct StrideExtr
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very minor: in nanobind, opening braces are on the same line.

using Type = Eigen::Stride<0, 0>;
};

template <typename T, int Options, typename StrideType>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

template<typename T>
using Stride = typename StrideExtr<T>::Type;

template<typename T>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment explaining the different cases being considered here. I looked at it and couldn't understand the whole thing.

Please rename top-level constexpr to snake_case (requires_contig_memory)

>
>;
f_contig>,
any_contig>>;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part wasn't clear to me. Why any_contig if requiresContigMemory==false?

@@ -150,4 +178,18 @@ NB_MODULE(test_eigen_ext, m) {
nb::class_<ClassWithEigenMember>(m, "ClassWithEigenMember")
.def(nb::init<>())
.def_rw("member", &ClassWithEigenMember::member);

m.def("castToMapVXi", [](nb::object obj) -> Eigen::Map<Eigen::VectorXi>
{
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix indentation/brace, ditto for lines below


if constexpr (T::IsRowMajor)
std::swap(inner, outer);

if constexpr (IS == 0) {
assert(inner == 1);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't include cassert, so I don't think it's generally valid to call that here. Do you think it's really necessary that these assertions are part of the type caster, or were these merely helpful to debug all cases on your end?

if (std::max<int64_t>(1, StrideType::InnerStrideAtCompileTime) != (NumDimensions<T> == 1 || !T::IsRowMajor ? caster.value.stride(0) : caster.value.stride(1)))
return false;
if constexpr (NumDimensions<T> == 2 && StrideType::OuterStrideAtCompileTime != Eigen::Dynamic) {
int64_t outerStrideExpected;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

camel_case.

return false;
if constexpr (!requiresContigMemory<Map>) {
if constexpr (StrideType::InnerStrideAtCompileTime != Eigen::Dynamic)
if (std::max<int64_t>(1, StrideType::InnerStrideAtCompileTime) != (NumDimensions<T> == 1 || !T::IsRowMajor ? caster.value.stride(0) : caster.value.stride(1)))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't include <algorithm> in type casters, so you can't call std::max (even if convenient). The fact that it happens to compile is probably not good to rely on.

if (!caster.from_python(src, flags, cleanup))
return false;
if constexpr (!requiresContigMemory<Map>) {
if constexpr (StrideType::InnerStrideAtCompileTime != Eigen::Dynamic)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic of this function is quite intricate -- as with the constexpr expression at the top, it would be good to document the cases in readable language.

Comments to clarify the type_casters for Eigen types.
Formatting.
@wjakob
Copy link
Owner

wjakob commented May 25, 2023

Hi @WKarel — I saw that you posted some changes. Please let me know once I should re-review.

@WKarel
Copy link
Contributor Author

WKarel commented May 25, 2023

Dear @wjakob ,

I had prepared this comment, but forgot to post it :-D

I hope to have addressed all your remarks in my latest commit.

I'm a friend of comments in code, but I noticed that they are rather sparse in nanobind, and so I initially did not introduce any. Concerning the formatting rules, are they documented somewhere? I was confused by some pre-existing code like

constexpr int NumDimensions ...

right in dense.h, and the rules are still a bit unclear to me.

I have commented out the assert statements. But once more out of curiosity, what is the motivation for not including <cassert>? I see assertions as a more concise alternative to verbose text comments, with the additional benefit of them being enforced in debug builds (if not commented out). Like comments, they are not really necessary, of course.

Along the way, I discovered a bug when mapping Eigen::RowVector, which I resolved, together with introducing a respective test.

The failed tests on Ubuntu should be gone.

Still, I suggest these 2 type casters to not support conversions at all, even if they handle bound function arguments. I believe that the intention of binding a function that expects an Eigen::Map is to modify the passed memory (Map) or to avoid copies (Map). Silent conversions may hide errors and show surprising behaviour instead, see e.g. https://github.com/WKarel/nanobind/blob/f339ec0df6ce0d2d185e82df03e4e1905c04a183/tests/test_eigen.py#L99 where the argument remains unmodified.

What is your opinion on the above?

@wjakob
Copy link
Owner

wjakob commented May 25, 2023

Hi Karel,

a few comments:

  • Formatting rules: they are not 100% consistent, sorry. Usually upper case CamelCase for classes, lowercase snake_case for methods and other globals. Some constexpr members have the former, some the latter. But we don't have any lower case camelCase, which is why I commented.

  • Commenting: this is basically my hobby project. When I accept a PR, I take on a responsibility to maintain that code in the future. I can usually understand my old code pretty easily. But when I read somebody else's code, I might not follow and ask for more comments. So the commenting requirement is a little asymmetric but needed for the continued maintainability of the project.

  • Nanobind has a policy of truly minimal include requirements (see the nanobind documentation on "why another library" regarding why this was a problem in pybind11). So that means much of the STL isn't available, including assert(). I think that you were able to call it because of the Eigen headers, which include cassert somewhere.

    An alternative would be an if-test followed by nanobind::detail::fail() or similar. But my question was: what are cases where you would expect these assertions to fail? Can it actually happen? Is an assertion truly the right error criterion?

  • What is your opinion on the above?

I agree that it is confusing and would be willing to support such a change.

@WKarel
Copy link
Contributor Author

WKarel commented May 25, 2023

Dear Wenzel,

thanks for your explanations. I do not expect any of the assertions to fail under any circumstances, unless there is a not yet discovered bug, of course. It seems that I had not inspected <cassert> for a long time, somehow thinking that it really only contained the definition of a single macro, which is far from true at least for VC++. Still, I think the assertions document the expectations well, and so I'd leave them commented out, if you agree.

Since you agree that Eigen::Map and Eigen::Ref<mutable T> should no longer support conversions, I'll push another according commit. Only Eigen::Ref<const T> will continue to support conversions in general, since this is what it is made for.

Cheers,
Willi.

type_caster<Eigen::Ref<T>>: prohibit conversions if T is mutable.
@WKarel
Copy link
Contributor Author

WKarel commented May 30, 2023

As discussed, Eigen::Map and Eigen::Ref<mutable T> no longer support conversions in my latest commit.

@wjakob
Copy link
Owner

wjakob commented Jun 4, 2023

Hi @WKarel,

I did a pass over the PR (great work!) with some minor stylistic changes and plan to merge it shortly with those edits. However, there was one part of your changes that I did not understand.

The type caster for Eigen::Ref<const T> attempts to invoke two nested Map type casters in sequence. The first tries to find a compatible representation, while the second one (dcaster) accepts a wider set of inputs, which will then likely involve a further copy being made.

What confuses me is that I don't understand how Ref(dcaster.operator DMap()) can be a valid operation. The text says that this creates a Ref that owns the data it maps, but I did not see anywhere in the Eigen docs how such ownership works. The Eigen documentation of Ref only mentions that temporaries may be created when calling a function (with Ref argument) that stay alive until the function call is done (which is basically the "normal" behavior of implicit conversions done during C++ function calls). But things are different here, because we are returning the Ref instance, so the usual lifetime policies of function argument construction don't apply.

Could the following could happen?

  1. cast via caster fails (e.g., because of incompatible strides and incompatible dtype)
  2. cast via dcaster succeeds with an implicit conversion to change the dtype (e.g., from float32 to float64). However, dcaster accepts any (dynamic) strides, and so it doesn't change the strides as part of the conversion. Ownership of the newly created temporary belongs to the cleanup stack.
  3. The operation Ref(dcaster.operator DMap()) performs another implicit conversion to conform the strides to the requirement of Ref. This temporary exists until this expression has finished evaluating, but it is freed by the time the return completes.
  4. The function taking a Ref accesses its contents, which is now undefined behavior. Two temporaries were created, but only the first one is still alive, while the Ref wraps the second.

Maybe I am understanding some of the details of how Ref works, and I would be curious about your thoughts on this.

Unrelated: On top of your changes, I will be adding one more tweak to potentially create ndarray with a const Scalar instead of Scalar. That makes it possible to call Eigen type bindings with read-only numpy arrays.

@wjakob
Copy link
Owner

wjakob commented Jun 5, 2023

(updated my above comment with a more precise question)

- support scalar type conversions only if a cleanup_list is passed, or if Ref will surely map its own data.
- support dynamic strides only if they can surely map Ref's contiguous memory.
@WKarel
Copy link
Contributor Author

WKarel commented Jun 6, 2023

Dear @wjakob ,

thanks!

The documentation on Eigen::Ref is quite sparse. I think the reason for using the term temporary is the use case for which this class was introduced, namely to use Eigen::Ref<T const> as function argument type. This makes it easier to define non-template functions that accept Eigen instances without making copies all the time.
If Ref<T const> is constructed from a suitable instance, then it just maps that instance's data. Otherwise, it maps its own member variable, having copy-converted it from that instance. If used as function argument, then this copy only exists during the function call, which may justify to call it 'temporary' in this case. A function may safely return a Ref, however. One only needs to consider its unusual copy constructor that does not copy its member variable. But that is of no concern in nanobind's type_caster, because it constructs the Ref it returns in-place (prvalue), and so the copy constructor is not involved here. For these reasons, your point 3 is of no concern: the returned instance owns the data it maps.

But indeed, there was still a flaw when scalar conversion took place, but no stride conversion! I have therefore pushed another commit that supports scalar type conversions only if either a cleanup_list is passed, or if the Ref is guaranteed to own the data it maps.

Along the way, I have discovered a feature/behaviour/bug in Eigen that makes Ref fail silently if it cannot map its own data. Therefore, stride conversions are now restricted to the common case that Ref's StrideType can actually describe the contiguous memory of Ref's member variable.

I've tried to document all the considerations while not making them too lengthy.

Cheers,
Willi.

…eness, without affecting existing code.

- avoid MSVC compiler warnings about unreachable code.
@wjakob
Copy link
Owner

wjakob commented Jun 6, 2023

Hi @WKarel,

thank you for those changes. I am confused about one last thing:

The trait Eigen::internal::traits<Ref>::template match<DMap>::type::value suggests to me that it represents whether Ref can represent data mapped using DMap.

Should this not be a precondition for calling Ref(dcaster.operator DMap())? (In the sense that that whole code block should be wrapped not just in if constexpr (std::is_const_v<T>) but also take this compatibility into account?

I also don't understand how this flag is combined with the presence of a cleanup list. I remembered that the ndarray caster doesn't even use the cleanup stack.

@wjakob
Copy link
Owner

wjakob commented Jun 6, 2023

diff.txt

(Here is my current diff for this PR. It's mostly a consistency/naming pass, and I shortened the commentary as well -- having the detailed version there was really useful to figure out what's going on though.)

@wjakob
Copy link
Owner

wjakob commented Jun 6, 2023

The only thing that still confuses me is the role of the CompatibleWithDMap part at the end.

@WKarel
Copy link
Contributor Author

WKarel commented Jun 6, 2023

Dear @wjakob ,

The constructors of Ref<T const> use one of two methods for construction, depending on the value of Eigen::internal::traits<Ref>::template match<DMap>::type::value. If it is false, then Ref<T const> will own the data it maps, see the relevant constructor, which then calls the method that copy-constructs Ref<T const>::m_object before mapping it.
If the trait is true, then the constructors do not necessarily touch m_object, but they may directly map the constructor argument.
Thus, the trait does not decide on whether the Ref can be constructed from a DMap, but if it is false, then the Ref is guaranteed to own its data.

However, it is true that the construction of a Ref<T const> from an Eigen type may silently fail at run-time, because while it always manages to copy-convert the argument fine, it may fail to map its own (contiguous) data considering the dimensions of the constructor argument and the given StrideType. If have posted 2 respective issues in the Eigen repo:

  1. already the compilation should fail if it is clear at compile-time that the mapping will fail
  2. Ref<T const> should provide a move constructor. type_caster<Ref<T const>> could then own a member variable of type Ref, assign to it in from_python and directly return false if its creation failed, and operator Ref() could then move out the member variable.

So far, type_caster<Ref<T const>> ensures that the creation of Ref will succeed by duplicating internal Eigen logic using can_map_contig_memory, which refuses to call dcaster if the creation of Ref may fail, based on compile time information.

It may be necessary for type_caster<Ref<T const>> to return a Ref that owns the data it maps. That is certainly not the case if a cleanup_list is passed, because that is only the case if the handle to be casted is a bound function argument, which in turn means that the type_caster will stay alive until that function has returned. This guarantees that the memory that Ref maps remains valid even if Ref does not own it, and if type_caster<ndarray> creates a copy due to scalar type conversion or because contiguous memory was requested, because type_caster<Ref> owns both its type_caster<Map> which in turn own a type_caster<ndarray>, which finally owns an ndarray as member variable value, which is assigned by type_caster<ndarray>::from_python. Indeed, the cleanup_list is not touched at all, but it serves as an indicator that Ref does not need to own the data it maps.

If no cleanup_list is passed, then Ref<T const> must own the data it maps, because it will outlive its type_caster. It would be ideal to have some means to force Ref<T const> into constructing itself such that it owns the data it maps, independent of its specialization, and independent of the constructor argument. Unfortunately, there is none, and Ref<T const> guarantees to own the data it maps only if its traits say so - depending on its own specialization, and on the constructor argument type.

Cheers,
Willi.

@wjakob wjakob merged commit df9e138 into wjakob:master Jun 7, 2023
16 checks passed
@wjakob
Copy link
Owner

wjakob commented Jun 7, 2023

Hi @WKarel -- many thanks for your hard word on this, I just merged the commit. 7b74d2f contains a pass from me over the code (I hope this looks okay to you, and that I didn't break anything).

@wjakob wjakob changed the title https://github.com/wjakob/nanobind/discussions/196 Eigen type caster improvements Jun 7, 2023
@WKarel
Copy link
Contributor Author

WKarel commented Jun 7, 2023

Dear @wjakob,

great, and you are welcome. Looking at dense.h, I think that https://github.com/wjakob/nanobind/blob/08b363aaeac0f487261e06d1d2437dec88a06cc2/include/nanobind/eigen/dense.h#LL400C5-L401C69

static constexpr auto Name =
        const_name<MaybeConvert>(MapCaster::Name, DMapCaster::Name);

should be replaced with

static constexpr auto Name =
        const_name<MaybeConvert>(DMapCaster::Name, MapCaster::Name);

@wjakob
Copy link
Owner

wjakob commented Jun 7, 2023

Good catch, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants