For discussion: use dedicated shape objects #374
Conversation
  template <class C, std::size_t N, layout_type L = DEFAULT_LAYOUT>
  xtensor_adaptor<C, N, L>
- xadapt(C& container, const std::array<typename C::size_type, N>& shape, layout_type l = L);
+ xadapt(C& container, const stat_shape<typename C::size_type, N>& shape, layout_type l = L);
Argh, I can't comment on the lines above, but there is still a std::enable_if_t<!detail::is_array<SC>::value... which obviously should be replaced. Maybe we can have something like detail::is_static_container or detail::is_static_size?
This currently prevents building of the xadapt tests with GCC.
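For illustration, a minimal sketch of what such a trait could look like. The name `is_static_container` follows the suggestion above; the commented-out `tiny_array` specialization is hypothetical, not the PR's actual code:

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

namespace xt { namespace detail
{
    // By default, a container is not considered statically sized ...
    template <class T>
    struct is_static_container : std::false_type
    {
    };

    // ... except std::array ...
    template <class T, std::size_t N>
    struct is_static_container<std::array<T, N>> : std::true_type
    {
    };

    // ... and the PR's tiny_array / stat_shape could be registered the same
    // way (left as a comment because tiny_array is not defined in this sketch):
    // template <class T, std::size_t N>
    // struct is_static_container<tiny_array<T, N>> : std::true_type {};
}}

// The SFINAE guard in xadapt could then read, for example:
// std::enable_if_t<!xt::detail::is_static_container<SC>::value, ...>
```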
Sorry, I just discovered your new overload further down. Unfortunately it doesn't seem to be picked up. I'd love to investigate more, but lack the time (it's on Ubuntu 16.04 / GCC 5.4).
  template <class C, std::size_t N>
  xtensor_adaptor<C, N, layout_type::dynamic>
- xadapt(C& container, const std::array<typename C::size_type, N>& shape, const std::array<typename C::size_type, N>& strides);
+ xadapt(C& container, const stat_shape<typename C::size_type, N>& shape, const stat_shape<typename C::size_type, N>& strides);
In terms of naming, I would prefer xshape, and then decide on it being static vs. dynamic based on the template arguments. Is it necessary to have dyn_shape vs. stat_shape?
I.e. xshape<3> == stat_shape, and xshape() == dyn_shape...
On that note, I also think we shouldn't use the size_t template everywhere. It would be much better to define xshape<xt::index_t> in a single place and use it consistently across the library, so there is only one place to change it.
I agree that xshape<3> and xshape<> are better alias types in the long run. During the experimentation phase, the present definition has the advantage of making the old and new shape types exchangeable.
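As a rough illustration of the `xshape<3>` / `xshape<>` idea, here is a sketch that uses `std::array` and `std::vector` as stand-ins for the static and dynamic cases (in the PR the underlying type would presumably be `tiny_array`, which unifies the two APIs). The names `index_t` and `runtime_size` are assumptions for this sketch, not existing xtensor identifiers:

```cpp
#include <array>
#include <cstddef>
#include <vector>

namespace xt_sketch
{
    using index_t = std::ptrdiff_t;              // assumption: one central index type
    constexpr std::ptrdiff_t runtime_size = -1;  // sentinel for "size known only at runtime"

    // Select a static or dynamic container from the template argument alone,
    // so that xshape<3> is stack-based and xshape<> is heap-based.
    template <std::ptrdiff_t N = runtime_size>
    struct shape_impl
    {
        using type = std::array<index_t, static_cast<std::size_t>(N)>;
    };

    template <>
    struct shape_impl<runtime_size>
    {
        using type = std::vector<index_t>;
    };

    template <std::ptrdiff_t N = runtime_size>
    using xshape = typename shape_impl<N>::type;
}

// xt_sketch::xshape<3> s1 = {2, 3, 4};   // static shape
// xt_sketch::xshape<>  s2 = {2, 3, 4};   // dynamic shape
```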
  /// Don't initialize memory that gets overwritten anyway.
  enum skip_initialization_tag { dont_init };

  /// Copy-construct array in reversed order.
I know it's nitpicky, but it would be cool to have a similar style everywhere. We don't use these indented triple-slash comments. :)
Triple slashes are simply doxygen's C++ variant of /** for one-line documentation. I don't see the necessity of enforcing a convention here, but I will change the code if you insist.
Moreover, I consider the code more readable when the documentation is indented more than the code.
  /// Copy-construct array in reversed order.
  enum reverse_copy_tag { copy_reversed };

  namespace tags {
And we always put a newline in front of opening braces.
So do I, with this single exception: I usually don't indent at namespace level. If you prefer, I can indent the code.
I think it's a really great addition to xtensor! Thanks for contributing. However, I wonder whether it would be better to remove some features for now and add them at a later stage. If we concentrate on the TinyArray being a shape type, do we really need all the math functions defined for it, e.g. the trigonometric stuff? Also, it might be confusing to have another kind of view in xtensor which doesn't adhere to the same interface as the existing views. For example, I am not sure if we really need … What do you think?

I hope this doesn't come across as too negative. I think this is really important work and I personally really want to see it in xtensor.
Is it possible to make the shape an expression?
Luckily we can define the shape type of the shape type to be something else ;) So the xshape container's shape could be a std::array<xt::index_t, 1>! (Which is easy to assume, as shapes will always be 1D.) The question is whether this carries any overhead (which I can't answer).
Each xexpression has a shape type. In order to make the shape an xexpression, we only need to give it a particular shape type that would not be an xexpression (that could be an …).

Edit: @wolfv you beat me to it!
OK, but if the ShapeType is an expression, and the shape of that expression is again something very special, then what's the point of having the ShapeType shape?
You need to perform mathematical operations on the shapes, not on the size of the shape (which would become the shape of the shape). Making the shape_type an xexpression is just a way to avoid code duplication and reuse the implementation of the math functions. The special type used as the shape_type of the shape has no importance; it's only meant to stop the template recursion.
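A bare-bones sketch of that idea (the class and member names are illustrative assumptions, not xtensor API): the shape behaves like a 1-D expression, but its own `shape_type` is a plain `std::array<std::size_t, 1>`, which is where the recursion stops.

```cpp
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// An expression-like shape: it exposes a shape() of its own, but that inner
// shape is a plain 1-element std::array, so the recursion ends here.
class shape_expression
{
public:
    using shape_type = std::array<std::size_t, 1>;  // shape of the shape: plain container

    explicit shape_expression(std::vector<std::size_t> values)
        : m_values(std::move(values))
    {
    }

    shape_type shape() const
    {
        return {{ m_values.size() }};  // 1-D: just the number of dimensions
    }

    std::size_t operator[](std::size_t i) const
    {
        return m_values[i];
    }

private:
    std::vector<std::size_t> m_values;
};
```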
I was of the same opinion initially (a decade or so ago 😄), but functions make all the difference. The most important lesson I learned over the years is "Users love convenience functions." Just consider …

Since I always forgot which functions were available and which weren't, I adopted the easy-to-remember rule "support everything from …".
True, but … (OK, this doesn't special-case singleton dimensions ATM).
I don't know the details of …
I am definitely not arguing against having all the syntactic sugar; I think it's awesome. But I think it will be much easier to merge a bunch of small commits, and we need to be careful to integrate the ideas from xtiny into the overall framework. It might be tough to add a lot of functions now, only to decide to e.g. rename some of them later. Couldn't we do a PR with only the tinyarray and the arithmetic functions first, and then do separate PRs for views and more functionality? At least that's what I think would be easier to review and figure everything out.
I'm running into what seems to be a weird compiler bug: when I change the …

Edit: this is on GCC 5.4 / Ubuntu 16.04. I'll try later with GCC 7 on Fedora 26.
With clang 3.8 it builds if I remove the constexpr on line 1723 in xtiny.hpp, as the return type is not a LiteralType.
GCC 7 seems to work, with the exception of xadapt. I think I found the issue: the templates for static shapes / xtensor are using …
I wouldn't mind getting rid of …, but this doesn't look terribly elegant either.

Edit: see #374 (comment) below for a better idea.
I can split the PR into several parts, but not before the end of August (I'm heading off on vacation tomorrow 😀). And of course, once the tests succeed on all platforms, there are a ton of warnings waiting to be fixed. Moreover, if the PR gets accepted, one might want to change some of the existing shape manipulation code to take advantage of the new syntactic sugar.
Another important point that needs to be discussed: right now, the shape's …
How about the following idea to eliminate tiny_array_view and tiny_array_base? …
@ukoethe, after reading your proposal I have a better understanding of where you want to go. I wanted to say that this is really sophisticated work, and it also shows that you have a real mastery of the kind of metaprogramming that is in play here. I would like to propose a plan to move forward with these ideas. However, rather than integrating a massive change like this as a first PR, I would like to take a more progressive approach.

1) Static and dynamic shapes

One feature of both … A first way to reduce the gap between stack-allocated shapes and heap-allocated shapes would be to use a dynamically-sized container type that … Such a container would have the same performance as … A good place to start the implementation of this new container would be to start from our …
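For illustration only, a toy version of the kind of container hinted at here: dynamically sized, but keeping small shapes in an inline buffer so that typical low-dimensional shapes never touch the heap. Every name is a placeholder, and real small-buffer containers also handle copies, moves, and growth, which this sketch deliberately omits.

```cpp
#include <algorithm>
#include <cstddef>

// Sizes up to BufferSize live in the inline buffer; larger sizes fall back
// to a heap allocation.
template <class T, std::size_t BufferSize = 4>
class small_shape
{
public:
    explicit small_shape(std::size_t size)
        : m_size(size),
          m_data(size <= BufferSize ? m_buffer : new T[size])
    {
        std::fill(m_data, m_data + m_size, T(0));
    }

    ~small_shape()
    {
        if (m_data != m_buffer)
        {
            delete[] m_data;  // only the heap case owns a separate allocation
        }
    }

    small_shape(const small_shape&) = delete;             // kept minimal: no copy/move
    small_shape& operator=(const small_shape&) = delete;

    std::size_t size() const { return m_size; }
    T& operator[](std::size_t i) { return m_data[i]; }
    const T& operator[](std::size_t i) const { return m_data[i]; }

private:
    std::size_t m_size;
    T m_buffer[BufferSize];
    T* m_data;
};
```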
2) Update to the …

Currently in xtensor, if we do …

Instead of having …

Then, downstream projects could use their custom shape types, which would be recognized as static.
3) Arithmetic and extra operations on shapes

…

4) Conclusion

Apologies for taking so much time to give you proper feedback. We should probably talk viva voce (over Skype, Hangouts, or appear.in) sometime. Let me know if you like this plan! Items …

… to item …
That would be good!
I think it is beneficial for our discussion when you and others can look at the PR in its entirety. Actual integration - if agreed upon - can then proceed more gradually.
The most important benefit of my design is that static and dynamic shapes have the same API and can be used interchangeably in generic code. In my experience, this leads to huge simplifications in functionality like promote_shape, dynamic ROI definition etc. Shape arithmetic in this PR is implemented such that static shapes are used whenever the resulting dimension can be determined at compile time, and dynamic shapes otherwise (this happens automatically, no additional template magic is needed).
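As a hedged illustration of that rule, using `std::array` and `std::vector` as stand-ins for the PR's static and dynamic `tiny_array` variants: ordinary overload resolution already yields a static result when both operands are static, and a dynamic result as soon as one operand is runtime-sized.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// static + static -> static: the result size is known at compile time.
template <class T, std::size_t N>
std::array<T, N> add(const std::array<T, N>& lhs, const std::array<T, N>& rhs)
{
    std::array<T, N> result;
    for (std::size_t i = 0; i < N; ++i)
    {
        result[i] = lhs[i] + rhs[i];
    }
    return result;
}

// static + dynamic -> dynamic: one runtime-sized operand forces a
// runtime-sized result (sizes are assumed to match in this sketch).
template <class T, std::size_t N>
std::vector<T> add(const std::array<T, N>& lhs, const std::vector<T>& rhs)
{
    std::vector<T> result(N);
    for (std::size_t i = 0; i < N; ++i)
    {
        result[i] = lhs[i] + rhs[i];
    }
    return result;
}
```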
I'm not sure that allocation of the shape storage is really the bottleneck. In my experience, the most critical operation is the transformation of index objects into memory addresses (i.e. the computation of the scalar product between index and strides), which regularly occurs in inner loops. The compiler can optimize this much, much better if the dimension is known at compile time, but this optimization is independent of the allocation mechanism. If benchmarking experiments demonstrate that heap allocation is indeed too slow, I can implement the mixed allocation mode you proposed (stack for small N, heap for bigger N) or a custom allocator that calls …
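To make the hot-path argument concrete, here is a generic sketch (not xtensor's actual code) of the index-to-offset computation: with a compile-time dimension the loop can be fully unrolled, regardless of whether the shape and stride storage is stack- or heap-allocated.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Compile-time dimension: the trip count is a constant, so the compiler can
// unroll the loop completely.
template <std::size_t N>
std::size_t flat_offset(const std::array<std::size_t, N>& strides,
                        const std::array<std::size_t, N>& index)
{
    std::size_t offset = 0;
    for (std::size_t i = 0; i < N; ++i)
    {
        offset += strides[i] * index[i];
    }
    return offset;
}

// Runtime dimension: the trip count is unknown, so the loop stays a loop.
inline std::size_t flat_offset(const std::vector<std::size_t>& strides,
                               const std::vector<std::size_t>& index)
{
    std::size_t offset = 0;
    for (std::size_t i = 0; i < strides.size(); ++i)
    {
        offset += strides[i] * index[i];
    }
    return offset;
}
```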
I'm proposing to replace …
I found over the years that dynamic tensors are more generally useful. For example, functions which are separable over dimensions (e.g. Gaussian filters, distance transforms, data I/O) don't run faster if the dimensionality is known at compile time, whereas the benefits of dynamic dimensions are substantial: compilation times are greatly reduced and binaries get much smaller, because the compiler needs to instantiate these templates only once instead of repeatedly for every dimension.
In general, I agree that redundancy in code should be avoided. However, I'm not so sure about the present case: …
I am coming back from JupyterCon on Wednesday. Johan and I would both be available pretty much anytime Thursday or Friday, or next week.
Awesome, I think that we are on the same page.
I agree with you about the cost of index-based indexing, but the iterators that we have in …
We have a very strong use case for supporting user-provided shape types, namely the adaptors from other data structures (e.g. in xtensor-python, the inner shape type is a buffer adaptor). However, I hear your need for richer arithmetic on shapes, so I have a proposal which should address this use case.
So that user-defined expressions adapting existing data structures don't need to recursively define expressions for the shapes etc...
👍 I agree with that 100%
I think that the adaptor approach proposed above bridges the gap by avoiding code duplication, while providing a rich API for shape types and making shape types interoperable with expressions.
Gotcha. I don't think that is a big issue.
Closing this one, which was split into multiple discussions.
This PR offers an implementation that replaces the current `std::array` and `std::vector` shapes with a dedicated shape class `tiny_array`. The PR is mainly intended for discussion, although it might eventually be merged if people happen to like the proposed shape class 😃. I successfully executed the full xtensor test suite using Visual Studio, but did not yet test much on other compilers. Changes come in three groups:

Preliminaries

The PR adds the headers `xconcepts.hpp` (concept checking and type inference), `xmathutil.hpp` (additional mathematical functions), and `xtags.hpp` (simple keyword argument support). Concept checking can now be done via …, which is arguably more readable than `enable_if_t` applied to the return type.
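The snippet showing the new syntax did not survive extraction here. As a hedged sketch of the general pattern (the alias name below is a placeholder; the PR's actual `XTENSOR_REQUIRE` in `xconcepts.hpp` may be spelled differently), the constraint moves from the return type into the template parameter list:

```cpp
#include <type_traits>

// Placeholder constraint helper standing in for the PR's XTENSOR_REQUIRE.
template <bool Condition>
using require_t = std::enable_if_t<Condition, int>;

// Old style: the constraint is buried in the return type.
template <class T>
std::enable_if_t<std::is_integral<T>::value, T>
twice_old(T value)
{
    return static_cast<T>(2 * value);
}

// REQUIRE style: the constraint sits in the template parameter list,
// leaving the return type readable.
template <class T, require_t<std::is_integral<T>::value> = 0>
T twice_new(T value)
{
    return static_cast<T>(2 * value);
}
```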
Shape Class

The files `xtiny.hpp` and `test_xtiny.cpp` implement and test the classes `tiny_array` (arrays that own their memory) and `tiny_array_view` (arrays that don't own their memory), along with a large set of arithmetic, logical, and algebraic functions. `tiny_array<size_t, 3>` corresponds to `std::array<size_t, 3>`, and `tiny_array<size_t, runtime_size>` corresponds to `std::vector<size_t>`. The shape classes are designed to resemble the behavior of built-in arithmetic types. In particular, operations usually return new objects rather than modifying existing ones in place. This is key to the unification of static and dynamic shapes. IMHO, this design decision won't cause performance problems, because shape calculations are rarely done in the inner loop.

It is open for debate whether it would be preferable to implement the mathematical functions for `tiny_array` in terms of the existing xexpression machinery. A possible advantage of the present design is that the compiler will unroll loops automatically for static shapes. (As an aside, I had to add `XTENSOR_REQUIRE<xexpression_concept<E>::value>` to some functions in `xmath.hpp` because these templates matched too greedily and thus dominated the corresponding functions in `xtiny.hpp`.)
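To illustrate the value-returning style described above with something self-contained, here is a sketch using `std::array` as a stand-in for a static `tiny_array` (assumption: the PR's shape operations behave analogously; `reversed` is a made-up name echoing the `copy_reversed` tag seen in the diff earlier in this thread):

```cpp
#include <array>
#include <cstddef>

// A value-returning shape operation: produces a reversed copy instead of
// reversing the argument in place.
template <class T, std::size_t N>
std::array<T, N> reversed(const std::array<T, N>& shape)
{
    std::array<T, N> result;
    for (std::size_t i = 0; i < N; ++i)
    {
        result[i] = shape[N - 1 - i];
    }
    return result;
}

// std::array<std::size_t, 3> s{{2, 3, 4}};
// auto r = reversed(s);   // r == {4, 3, 2}; s is unchanged
```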
Changes to the Existing Code Base

In the original code, declarations of shape types are spread all over the place (I found >30 occurrences in the headers and another ~60 in the tests). I first implemented a single point of reference for these declarations by defining the aliases `stat_shape` and `dyn_shape` in `xutils.hpp` and replacing hard-coded declarations of `std::array` and `std::vector` as appropriate. I then added overloads of the required utility functions and metafunctions for `tiny_array`, and finally replaced the types in `stat_shape` and `dyn_shape` with `tiny_array`. Enjoy!

A major advantage of the present design is that `xarray` and `xtensor` essentially become the same class -- all differences can potentially be handled by `tiny_array`.
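A hedged sketch of the intermediate refactoring step described above: central aliases that initially wrap the existing types and can later be switched to `tiny_array` in a single place. The actual definitions in `xutils.hpp` may differ in detail.

```cpp
#include <array>
#include <cstddef>
#include <vector>

namespace xt
{
    // Single point of reference for shape declarations.
    template <class T, std::size_t N>
    using stat_shape = std::array<T, N>;    // later: tiny_array<T, N>

    template <class T>
    using dyn_shape = std::vector<T>;       // later: tiny_array<T, runtime_size>
}
```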