Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pybind::array and pybind::array_t universal references #497

Closed
SylvainCorlay opened this issue Nov 13, 2016 · 28 comments
Closed

pybind::array and pybind::array_t universal references #497

SylvainCorlay opened this issue Nov 13, 2016 · 28 comments

Comments

@SylvainCorlay
Copy link
Member

Variadic template methods of pybind::array and pybind::array_t used for accessing data, namely

data, mutable_data, offset_at, index_at, get_byte_offset , at, mutable_at

should take arguments by value instead of universal reference.

The semantic in the standard library for functions taking integral or iterator arguments is to take them by value. (For example std::vector<T>::operator[] takes its argument by value).

@SylvainCorlay
Copy link
Member Author

^@aldanor

@SylvainCorlay SylvainCorlay changed the title Move semantic issue in operator() for pybind::array and pybind::array_t pybind::array and pybind::array_t universal references Nov 13, 2016
@jagerman
Copy link
Member

The issue is that, although all arguments get converted to size_t eventually, the variable-arguments-through-a-template-parameter-pack approach being used here doesn't allow that implicit conversion to take place.

For stl functions like vector's operator[] I get implicit conversion because the argument is a size_t, which means that I can call it with anything that is implicitly convertible to size_t, e.g.:

class SomeNonCopyableClass {
    // ...
    operator size_t() const { return /* ... */; }
};

// ...

SomeNonCopyableClass a(2);
std::cout << myvec[a] << "\n";

With pass-by-value semantics without the type being in the function signature, this fails because I can't copy my a object to pass it along: the deduced argument type will be SomeNonCopyableClass, not size_t. But the current code fails here, too, because the various argument chains actually are passing by value rather than forwarding arguments. This should be fixed.

So I think, to have an stl-like interface that takes anything convertible to a size_t, the code should keep the && arguments, but needs to also use perfect forwarding, e.g. by changing the calls such as next_method(index...) to either next_method(std::forward<Ix>(index)...) (to pass the conversion down the line), or next_method(size_t(index)...) to do the conversion immediately.

On a side note: there's an index_at method that current takes its arguments by lvalue reference, which looks like a typo (i.e. it's missing the second &).

@SylvainCorlay
Copy link
Member Author

SylvainCorlay commented Nov 13, 2016

So I think, to have an stl-like interface that takes anything convertible to a size_t, the code should keep the && arguments, but needs to also use perfect forwarding, e.g. by changing the calls such as next_method(index...) to either next_method(std::forward(index)...) (to pass the conversion down the line), or next_method(size_t(index)...) to do the conversion immediately.

Can you think of a realistic case where people would be indexing with non-integral types convertible to size_t?

@jagerman
Copy link
Member

Sure; perhaps I encode something else in my indices--a row title, for instance--and so my index objects are:

struct myindex {
    size_t i;
    std::string name;
    operator size_t() const { return i; }
};

Sure, there are other ways I could do this, but what is the compelling reason to not allow arbitrary types that are implicitly convertible to size_t to be used? (In other words, what does pass-by-value gain over perfect forwarding?)

@SylvainCorlay
Copy link
Member Author

SylvainCorlay commented Nov 13, 2016

Sure, there are other ways I could do this, but what is the compelling reason to not allow arbitrary types that are implicitly convertible to size_t to be used? (In other words, what does pass-by-value gain over perfect forwarding?)

Not much except that it is the most used semantic for integral types, so universal references go against the principle of least surprise.

@SylvainCorlay
Copy link
Member Author

SylvainCorlay commented Nov 13, 2016

In some unlikely cases it can have border effects, such as an object being moved unexpectedly. For example if you have a function foo that transform the index, but sometimes only returns it unchanged as an rvalue (effectively equivalently to std::move). Then myvec[foo(i)] will cause i to be moved and not usable afterwards.

@SylvainCorlay
Copy link
Member Author

Regarding your concern regarding non-copyable things, I think that it is a good assumptions / requirement for and index to have a value semantic.

@jagerman
Copy link
Member

jagerman commented Nov 13, 2016

I agree: the ideal interface here is (size_t i_0, size_t i_1, size_t i_2, ...) where the number of arguments is variable. But I don't see an easy way to do that here (two things that occur to me: making each of the calls a recursive template so that all arguments really do have to be size_t; or using va_args). Neither is all that nice, but perhaps one or the other is worth doing anyway.

@dean0x7d
Copy link
Member

In some unlikely cases it can have border effects, such as an object being moved unexpectedly. For example if you have a function foo that transform the index, but sometimes only returns it unchanged as an rvalue. Then myvec[foo(i)] will cause i to be moved and not usable afterwards.

That would be a fault in foo, not in myvec::operator[]. There's really no way to accidentally move out an lvalue -- std::move is required and very loud. (Pure rvalues are moved silently, but there's nothing unexpected about that.)

The only downside of forwarding references is that they result in extra overhead (pointers to ints) in case the template function isn't inlined (unlikely for short templates, but it does happen). There's no difference for fully inlined templates.

It would probably be best to keep things simple and optimize for the most common use case. These are pretty much always going to be simple integers so value semantics seems preferable.

@JohanMabille
Copy link
Contributor

JohanMabille commented Nov 14, 2016

@dean0x7d said:
It would probably be best to keep things simple and optimize for the most common use case. These are pretty much always going to be simple integers so value semantics seems preferable.

I strongly agree with @SylvainCorlay and @dean0x7d on that; passing by value is preferable, and follows the conventions of the C++ standard library.

@jagerman said:
I agree: the ideal interface here is (size_t i_0, size_t i_1, size_t i_2, ...) where the number of arguments is variable. But I don't see an easy way to do that here (two things that occur to me: making each of the calls a recursive template so that all arguments really do have to be size_t; or using va_args). Neither is all that nice, but perhaps one or the other is worth doing anyway.

The recursive approach is fine, it is what we use in xtensor. It is more efficient, no additional array on the stack is required, and the compiler can inline it so the assembly is just exactly the computation of the index in the underlying buffer, nothing more. And you force conversion to size_t for free.

@SylvainCorlay
Copy link
Member Author

@wjakob what do you think of the above?

@wjakob
Copy link
Member

wjakob commented Nov 15, 2016

It sounds like a fairly low level change -- either is fine with me I guess. This piece of code "belongs" to @aldanor, who has done an excellent job designing these APIs -- so I ultimately defer to him.

@SylvainCorlay
Copy link
Member Author

There is a significant performance difference too.

Regarding the dimension checks, I would like to remove them all and only check at the top-level calls to at and data.

@dean0x7d
Copy link
Member

Significant performance difference for what?

Are the checks being duplicated somewhere or do you just want to get rid of them altogether? The latter was discussed previously and went in favor of checking.

@wjakob
Copy link
Member

wjakob commented Nov 15, 2016

I would assume all of this gets inlined for any decent C++ compiler (the implementations of these functions are all trivial), so I'm surprised by this statement.

@SylvainCorlay
Copy link
Member Author

I would assume all of this gets inlined for any decent C++ compiler (the implementations of these functions are all trivial), so I'm surprised by this statement.

The performance issue comes from the stack array that is created in get_byte_offset which we don't need in the case of the recursive implementation.

@SylvainCorlay
Copy link
Member Author

Are the checks being duplicated somewhere or do you just want to get rid of them altogether? The latter was discussed previously and went in favor of checking.

I want the checks to only be at the top-level functions (at, data) and not in get_byte_offset which we want to use without performance penalty in xtensor.

@SylvainCorlay
Copy link
Member Author

PR coming in a minute.

@aldanor
Copy link
Member

aldanor commented Nov 15, 2016

Hi all, a bit late to the party.

Re: universal refs vs by-value, I'm personally fine with whichever way it is. Do we have a consensus on whether Ix&& should be converted Ix in array methods? Note that for literal ints and ints passed by-value the assembly would already be identical, you can easily check that. The only difference would be when you pass e.g. a const size_t& (why?..) and now it basically has to do a dereference. If we leave it be as universal refs, could do proper forwarding as noted above.

Re: checks, it's probably fine to remove the check from the lowest-level method (get_byte_offset()?) as long all higher-level methods remain bound-checked to avoid surprises. Or, alternatively, just provide a get_byte_offset_unchecked() -- that's a very common idiom e.g. in Rust (example).

@wjakob
Copy link
Member

wjakob commented Nov 15, 2016

Actually, I would suggest to do what Eigen and similar libraries do here:

Simply

#if !defined(NDEBUG)
....
#endif

those parts. It makes a lot of sense to have checks for debugging, but they should probably be disabled for release builds.

@SylvainCorlay
Copy link
Member Author

I like this idea even better.

@aldanor
Copy link
Member

aldanor commented Nov 15, 2016

This one I'm not sure I fully agree with, tying bounds checking with a commonly used ndebug flag is very implicit; you switch for a release build and suddenly out of bounds exceptions become bad memory access. There's basically no way to re-enable bound checks in release builds then.

@dean0x7d
Copy link
Member

py::array is essentially a convenient C++ wrapper for np.array and Python doesn't have an unchecked release mode. Translating Python's runtime checks/exception into C++ undefined behavior would be pretty surprising.

@SylvainCorlay
Copy link
Member Author

@aldanor did you see the point about avoiding the temporary array (performance)?

The following produces simpler assembly

https://github.com/QuantStack/xtensor/blob/master/include/xtensor/xcontainer.hpp#L238

@aldanor
Copy link
Member

aldanor commented Nov 15, 2016

There's really two use cases here, I guess, hence all the bikeshedding. One is NumPy-like interface with NumPy-like behaviour which implies bound checks and no UB. The other is downstream library code like xtensor that chooses to manage bound checks and other things on its own. Hence my suggestion of providing "unchecked" variants of a few methods.

@aldanor
Copy link
Member

aldanor commented Nov 15, 2016

@SylvainCorlay yes, recursive would result in a simpler assembly indeed, that's something we should probably do

@SylvainCorlay
Copy link
Member Author

#500

@SylvainCorlay
Copy link
Member Author

SylvainCorlay commented Nov 15, 2016

So the checks are moved out of byte_offset so that xtensor can use it unchecked.

We should still decide which top-level functions should have the checks and which not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants