added xtiny and tests #591
Conversation
cool! I'll check it out later, and on Monday.
include/xtensor/xtiny.hpp
Outdated
using base_type::assign;

template <class T>
void assign(std::initializer_list<T> v);
The initializer list should only take value_type and not template T, otherwise implicit conversion costs 2ns :) I spent a day wondering about a benchmark and it was that, hehe.
Having it templated on T is important to avoid type conversion warnings in code like this:
xtiny<unsigned, 3> vi{1, 2, 3}; // T = int
xtiny<float, 3> vf{1.0, 2.0, 3.0}; // T = double

However, we can provide an overload with initializer_list<value_type>. Or does the compiler automatically convert the initializer list arguments?
I'm pretty sure that std::vector etc. do not have a templated initializer_list constructor, so I don't think we should have one. And the conversion warnings in this case would be legit, so I think it's fine to just remove it.
In my test, the compiler basically used the initializer_list() overload, and then did the type conversion upon assignment (instead of statically at compile time) and that costs time in the benchmark.
Dropping the templated variant would be very unfortunate:
template <class REAL>
void foo(xtiny<REAL, 2> & v)
{
v.assign({-0.5, 0.5}); // doesn't compile when REAL=float
v.assign({static_cast<REAL>(-0.5), static_cast<REAL>(0.5)}); // UGLY!!
}
that's really weird. E.g. this compiles fine with clang:
std::vector<float> a({1.,2.,3.});
Is it a compiler setting or MSVC?
Ah, I made a mistake. Never mind.
void deallocate();

allocator_type m_allocator;
size_type m_size;
did you have the chance to benchmark this against storing begin/end pointers? At least llvm's small vector stores begin and end pointers, and it might be faster in certain cases.
I didn't benchmark this specific code. But I've consistently found that loops over integers are as fast (or even faster) as loops over iterators. For example, one would think that
auto iter = v.begin();
for(size_t k=0; k < v.size(); ++k, ++iter)

is slower than

for(auto iter = v.begin(); iter != v.end(); ++iter)

because one increments two variables. But actually the two versions are equally fast, or the first is faster because compilers seem to be better at optimizing integer loops. However, please confirm these statements by independent measurements.
BTW, I think the design can also accommodate a fixed vector:

template <index_t ... M>
class xtiny_impl<index_t, (index_t)sizeof...(M), fixed_impl<M...>>
    : public fixed_impl<M...>
    , tags::xtiny_tag
{
};

template <index_t ... M>
using xshape_fixed = xtiny<index_t, (index_t)sizeof...(M), fixed_impl<M...>>;
include/xtensor/xtiny.hpp
Outdated
using index_t = std::ptrdiff_t;

static const index_t runtime_size = -1;
I think replacing static const with constexpr static would be nice.
changed.
wolfv
left a comment
I've looked through the xtiny.hpp file and added my comments!
template <class V1, index_t N1, class R1, class V2, index_t N2, class R2>
inline bool
operator==(xtiny<V1, N1, R1> const & l,
I think the following functions should be modified such that the loops are replaced with std::equal and instead of re-implementing almost the same functionality replace with a call to == e.g. != -> !operator==(a, b) etc.
Have you ever benchmarked if !operator==(a, b) is as fast as the explicit implementation? I haven't, but the negation is clearly an extra operation that the compiler may or may not optimize away.
The same applies to the call std::equal(l.begin(), l.end(), r.begin(), r.end()): Checking if both sequences have equal length requires extra work when std::equal is not explicitly specialized for random access iterators. Moreover, my code version will optimize out this check entirely if the array sizes are known at compile time.
We have this version in xstorage (uvector).
I can do benchmarks tomorrow.
I'll wait for your benchmarks before taking action.
Another reason for the explicit loop in operator==(xtiny, xtiny) is symmetry with operator==(xtiny, scalar). AFAIK, there is no variant of std::equal that compares an array with a scalar, and I wanted the two functions to look similar.
I've uploaded the benchmark here: https://gist.github.com/wolfv/a64526ceb94dd37b4a5f2f5dbcf76779 maybe you have time to check the results on MSVC?
I've modified the benchmark so that it modifies a random index each loop in an otherwise equal array to create some variance in the distance where the arrays are not equal. This incurs a heavy cost though ...
These are the latest results:
--------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------
bench_compare_eq1/4 77 ns 77 ns 8853946
bench_compare_eq1/8 78 ns 78 ns 8857423
bench_compare_eq1/64 91 ns 91 ns 7479546
bench_compare_eq1/512 183 ns 183 ns 3750081
bench_compare_eq1/1000 273 ns 273 ns 2567153
bench_compare_eq2/4 77 ns 77 ns 8866009
bench_compare_eq2/8 79 ns 79 ns 8777499
bench_compare_eq2/64 112 ns 112 ns 6227843
bench_compare_eq2/512 350 ns 350 ns 2027685
bench_compare_eq2/1000 588 ns 588 ns 1163340
Great! I'll run your benchmarks later today.
Just fyi I found this implementation
auto ita = a.begin();
auto itb = b.begin();
auto resit = result.begin();
auto n = std::distance(ita, a.end());
for (std::size_t i = 0; i < n; ++i)
{
*resit = (*ita + *itb);
++resit;
++ita;
++itb;
}
on std::arrays to be faster than the integer-index based version (gcc 7.2). The speed difference was 3 ns vs 1 ns.
Here are the benchmark results on Windows:
Run on (8 X 1993 MHz CPU s), Visual Studio 2015
SIMD SIZE: 2
----------------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------------
bench_compare_eq1/4 507 ns 479 ns 1725827
bench_compare_eq1/8 390 ns 389 ns 1725827
bench_compare_eq1/64 415 ns 413 ns 1547293
bench_compare_eq1/512 829 ns 834 ns 897430
bench_compare_eq1/1000 1506 ns 1502 ns 498572
bench_compare_eq2/4 347 ns 348 ns 1794860
bench_compare_eq2/8 399 ns 392 ns 1869646
bench_compare_eq2/64 544 ns 534 ns 1547293
bench_compare_eq2/512 1204 ns 1189 ns 498572
bench_compare_eq2/1000 1844 ns 1877 ns 373929
bench_compare_fill/4 7 ns 7 ns 112178768
bench_compare_fill/8 10 ns 10 ns 64102153
bench_compare_fill/64 58 ns 57 ns 11217877
bench_compare_fill/512 307 ns 307 ns 2136738
bench_compare_fill/1000 579 ns 577 ns 1000000
bench_compare_fill2/4 14 ns 15 ns 44871507
bench_compare_fill2/8 15 ns 15 ns 44871507
bench_compare_fill2/64 41 ns 41 ns 18696461
bench_compare_fill2/512 273 ns 273 ns 2804469
bench_compare_fill2/1000 508 ns 515 ns 1000000
----------------------------------------------------------------
SIMD SIZE: 4
----------------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------------
bench_compare_eq1/4 416 ns 365 ns 1794860
bench_compare_eq1/8 371 ns 368 ns 1950935
bench_compare_eq1/64 407 ns 409 ns 1869646
bench_compare_eq1/512 791 ns 782 ns 897430
bench_compare_eq1/1000 1382 ns 1391 ns 560894
bench_compare_eq2/4 357 ns 352 ns 2039614
bench_compare_eq2/8 366 ns 368 ns 1950935
bench_compare_eq2/64 449 ns 441 ns 1661908
bench_compare_eq2/512 1063 ns 1046 ns 641022
bench_compare_eq2/1000 1638 ns 1606 ns 407923
bench_compare_fill/4 7 ns 7 ns 112178768
bench_compare_fill/8 10 ns 10 ns 74785845
bench_compare_fill/64 56 ns 57 ns 11217877
bench_compare_fill/512 268 ns 269 ns 2492862
bench_compare_fill/1000 511 ns 515 ns 1000000
bench_compare_fill2/4 15 ns 15 ns 44871507
bench_compare_fill2/8 14 ns 14 ns 49857230
bench_compare_fill2/64 37 ns 38 ns 18696461
bench_compare_fill2/512 242 ns 245 ns 2991434
bench_compare_fill2/1000 533 ns 528 ns 1447468
changed
include/xtensor/xtiny.hpp
Outdated
template <class T, std::size_t M>
inline
xtiny<V, N, R>::xtiny(std::array<T, M> const & v)
    : base_type(v.begin(), v.end())
maybe use cbegin / cend everywhere?
Where's the difference?
It is more expressive. We use cbegin, cend wherever possible across the code base. When we find an instance where it is not the case, we change it.
changed
include/xtensor/xtiny.hpp
Outdated
    ? static_size-1
    : runtime_size;
xtiny<value_type, res_size> res(size()-1, dont_init);
for(size_type k=0; k<m; ++k)
Could probably be replaced with a combination of std::copy and std::move_backward
I generally find one-line loops more readable than calls to standard functions. I also have more trust in the compiler's optimization abilities when size() is known at compile time (but didn't measure recently if there is any difference).
changed
inline void
xtiny_impl<V, N, R>::assign(IT begin, IT end)
{
    std::ignore = end;
nice trick! this is to avoid unused parameter warnings, right?
Yes.
include/xtensor/xtiny.hpp
Outdated
xtiny_impl<V, runtime_size, R>::assign(size_type n, const value_type& v)
{
    XTENSOR_ASSERT_MSG(n == size(), "xtiny_impl::assign(n, v): size mismatch.");
    for(size_type k=0; k<size(); ++k)
use std::fill?
my benchmarks show little difference between the iterator-based std::fill and a for loop (number 1 is std::fill, number 2 is the for loop):
----------------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------------
bench_compare_fill/4 2 ns 2 ns 402986315
bench_compare_fill/8 3 ns 3 ns 215037132
bench_compare_fill/64 19 ns 18 ns 36827484
bench_compare_fill/512 141 ns 141 ns 4734883
bench_compare_fill/1000 270 ns 270 ns 2587655
bench_compare_fill2/4 3 ns 3 ns 243319874
bench_compare_fill2/8 3 ns 3 ns 232418915
bench_compare_fill2/64 28 ns 28 ns 24789837
bench_compare_fill2/512 141 ns 141 ns 4891461
bench_compare_fill2/1000 272 ns 271 ns 2550597
changed
include/xtensor/xtiny.hpp
Outdated
template <class V, class R>
constexpr inline auto
xtiny_impl<V, runtime_size, R>::shape() const -> index_type
I think the shape function is confusing and unnecessary (at the current iteration of xtiny).
removed
I implemented your suggestions and uploaded a revised version.
{
    if(!may_use_uninitialized_memory)
    {
        for(size_type k=0; k<m_size; ++k)
maybe we could reuse detail::safe_destroy_deallocate from xstorage here?
I considered this, but then found that I'd first like to change safe_init_allocate():
// rename current version
// (*_init_* is a misnomer, because the function doesn't init trivial types)
safe_allocate_no_init(A& alloc, typename A::size_type size)
// add new version with custom initial value
safe_allocate_init(A& alloc, typename A::size_type size,
                   typename A::value_type const & v = typename A::value_type())

Then I could use safe_allocate_* and safe_destroy_deallocate symmetrically.
I looked up where safe_init_allocate is called in the current code base, and found this:
pointer tmp = safe_init_allocate(al, rhs.m_size);
if (xtrivially_default_constructible<value_type>::value)
{
std::uninitialized_copy(rhs.m_data.get(), rhs.m_data.get() + rhs.m_size, tmp);
}
else
{
std::copy(rhs.m_data.get(), rhs.m_data.get() + rhs.m_size, tmp);
}

This code is awkward, because std::uninitialized_copy works for arbitrary types, not just trivial ones. The above could simply be replaced with:

pointer tmp = al.allocate(rhs.m_size);
std::uninitialized_copy(rhs.m_data.get(), rhs.m_data.get() + rhs.m_size, tmp);
I didn't write that code, but maybe @JohanMabille has some insights here?
uninitialized_copy calls the constructor, which has already been called by safe_init_allocate when the type is not trivially default constructible (hence the safe in the name). So using uninitialized_copy everywhere may end with a double call to the constructor with all the bad consequences it can have.
I don't suggest to use uninitialized_copy everywhere. But in the particular code snippet above, non-trivial objects are needlessly initialized twice, first with placement new, then with copy assignment. In contrast, alloc.allocate(size); followed by uninitialized_copy() always calls placement new exactly once, as it should be.
Sorry for the word "everywhere", I meant in both branches of the "if" of the previous code snippet. I agree that the code can be simplified. We lose the "symmetric" appearance of the code (with the use of safe_destroy_deallocate), but the code is cleaner.
However, I don't understand why you would like to have two functions; you would have to test if the type is trivially default constructible to know which method to call, while the idea in safe_init_allocate is to automatically handle that and default construct the values only if needed. The "safe" here is related to the initialization, not the allocation, and it is safe not to initialize trivial types.
it is safe to not initialize trivial types.
Exactly that's why I thought that a name containing "no" (safe_no_init_allocate or safe_allocate_no_init or allocate_safe_no_init) would more explicitly express that trivial types are safely not initialized 😄 .
why you would like to have two functions
When I free memory with safe_destroy_deallocate (which I would like to do), I also want to allocate it with a corresponding allocate function (general rule: always allocate and free memory with corresponding function pairs). Since this must apply to all supported xtiny constructors, I actually need three allocate variants, one for safe (non-)initialization, value initialization, and copy initialization. In the hypothetical case that someone modifies the allocation/deallocation mechanism in xstorage, all functions in that file will hopefully be changed together, whereas a subsequent change in a seemingly unrelated file like xtiny is easily overlooked. Alternatively, xstorage could explicitly document a guarantee that incompatible changes of the allocation/deallocation mechanism will never occur.
wolfv
left a comment
I've found the following statement on stackoverflow regarding storing pointers vs. a pointer + size/capacity:
In essence, storing pointers means optimizing for iterators. As we're trying to follow STL conventions I would say we should go for the same. What do you think?
            "xtiny_impl::assign(begin, end): size mismatch.");
for(size_type k=0; k<N; ++k, ++begin)
{
    (*this)[k] = static_cast<value_type>(*begin);
i think we should use iterators here, too.
And I am unsure about the static_cast. Is that correct behavior compared with e.g. std::vector?
i think we should use iterators here, too.
I don't mind as long as the compiler manages to unroll these loops (or applies equally powerful optimization) when N is known at compile time.
And I am unsure about the static_cast. Is that correct behavior compared with e.g. std::vector?
In principle, std::copy could be called here, but it issues warnings when decltype(*begin) doesn't match value_type. Since xtiny is meant to be usable beyond shapes (e.g. for RGB pixels), mixed-type expressions are not rare.
The warning policies of current compilers w.r.t. automatic type conversion amount to the implicit rule "automatic type conversion is deprecated". From the image processing point of view, I don't agree with this rule, and add explicit casts where I think automatic conversion should be allowed.
xtiny_impl<V, runtime_size, xbuffer_adaptor<CP, O, A>>::assign(size_type n, const value_type& v)
{
    std::ignore = n;
    XTENSOR_ASSERT_MSG(n == size(), "xtiny_impl::assign(n, v): size mismatch.");
I think assign is supposed to take care of resizing the container (this is the behavior from std::vector).
The following works:
std::vector<double> test;
test.assign({1,2,3,4,5});
assert(test.size() == 5);
Yes, but recall that xtiny exists in order to unify the APIs of static and dynamic arrays. Since a static array's assign cannot resize, I had to stick to this behavior for dynamic arrays as well.
For the same reason, insert() and erase() don't work in-place, but return a new xtiny.
VIGRA was initially built on an iterator-based design. After several years, I realized that all data structures people care about support random access, and that users strongly prefer indexing. Here are some reasons:
This doesn't mean that iterators are not needed. But I think the desire to follow STL conventions is not enough to justify their primacy.
using base_type::rbegin;
constexpr const_reverse_iterator crbegin() const;
reverse_iterator rend();
constexpr const_reverse_iterator rend() const;
I tried this, but I am quite confident it doesn't compile with C++14 (constexpr reverse iterator will be a C++17 feature).
It worked on all platforms covered in your test suite. (The current red light seems to be due to failing setup code in the gcc 6 environment -- the test doesn't even get to the point of compilation.)
yep. because constexpr variables are probably not covered by the test.
constexpr xtiny<...> test({1,2,3});
constexpr auto it = test.rbegin();
will likely not compile.
anyways, for the constexpr to have any effect we probably need to make some constructors constexpr as well ...
I accidentally found an interesting example of a hybrid constexpr class today. Might be a source for some inspiration for potential future improvements :)
https://github.com/akrzemi1/Optional/blob/master/optional.hpp#L319-L346
This already fails in the constructor line. I guess, when the constructor goes through, rbegin() will compile as well.
I mainly defined constexpr ... rbegin() to potentially accommodate your fixed_vector.
I'll make some experiments today to see how constexpr and base classes can be combined. Maybe I'll have a proposal ready tonight.
So far I can't get anything nice to work. I'm unsure if it's possible to get a constexpr std::array-like type to work ... the problem is that aggregate initialization with initializer list doesn't work when inheriting (as it's not an aggregate anymore).
I suggest to try the fixed_vector adapter I described here: #591 (comment). Then you only need a default constructor.
Hi Ulrich,

Sorry for the radio silence. We are still 100 percent committed to making xtensor the perfect library for VIGRA. We had another look at the PR today, and we came to the conclusion that we should probably not move forward with integrating tiny_array in xtensor. It boils down to the design decisions that were necessary to reconcile dynamic and static sizes in a consistent fashion. For example, the

Also, it is really unfortunate that you spent so much time on a PR that is ultimately not getting merged. We apologize for that. On the other side, we have not had many third-party contributions proposing changes with so much depth and sophistication. This discussion has been very helpful to figure out performance problems in the library. The idea of stack-allocated dynamic containers for small sizes, and the enabling of arithmetics on shapes also came out of these discussions.

Would you be ok to collaborate on the two other PRs regarding fixed shape and
Hi Sylvain,

if this is your final decision, I'll probably drop out. I spent an enormous amount of time attempting to find out what you want via trial and error, apparently without success. I'm still willing to implement concrete and constructive suggestions, but your feedback so far was mainly negative. It is an acceptable decision that you don't want to break the STL interface for

I had many ideas about contributions to xtensor. I've partly considered xtiny also as a test if and how collaboration with you was possible. It's sad that it didn't work out so far.
How about renaming |
I agree that a push_back that returns a copy is surprising, but I cannot see any problem with the renaming.
How about I spend the day tomorrow using your PR and making sure it works well as a template parameter for the shape type? |
Sorry to "butt in" to this thread, but FWIW: I think signed shape types are very nice (perhaps even a necessity) for ND image processing, for the same reasons that Ullrich has mentioned elsewhere. Also, a unified API for dynamic and non-dynamic shapes would also make xtensor more approachable, I think.

Disclaimer: Perhaps I'm biased -- I'm a VIGRA user and occasional collaborator with both Ullrich and Thorsten. That said, I'm cheering from the sidelines for the possibility that
Hi @ukoethe, so I did some experiments (very basic ones), and all that was necessary was adding some free function overloads for xtiny types:

#include "xtensor/xio.hpp"
#include "xtensor/xarray.hpp"
#include "xtensor/xtiny.hpp"
using namespace xt;
template <class T, layout_type L = layout_type::row_major>
using xspecial = xarray_container<xt::uvector<T>, L, xtiny<std::size_t>>;
namespace xt
{
template <class T, ptrdiff_t I>
inline bool resize_container(xtiny<T, I>& v, typename xtiny<T, I>::size_type s)
{
return I == s;
}
template <class T>
inline bool resize_container(xtiny<T, -1>& v, typename xtiny<T, -1>::size_type s)
{
v.resize(s);
return true;
}
template <class I, ptrdiff_t L, class... S>
struct xview_shape_type<xtiny<I, L>, S...>
{
using type = xtiny<I, L - integral_count<S...>() + newaxis_count<S...>()>;
};
template <class I, class... S>
struct xview_shape_type<xtiny<I, -1>, S...>
{
using type = xtiny<I, -1>;
};
}
template <class T, std::size_t I, layout_type L = layout_type::row_major>
using xspecialfixed = xarray_container<xt::uvector<T>, L, xtiny<std::size_t, I>>;
int main()
{
xspecial<double> a = {{1,2,3}, {4,5,6}};
std::cout << a << std::endl;
a.reshape({6});
std::cout << a << std::endl;
xspecialfixed<double, 2> b = {{1,2,3}, {4,5,6}};
std::cout << b << std::endl;
b.reshape({3, 2});
std::cout << b << std::endl;
std::cout << "\n\nPrinting shapes: " << std::endl;
std::cout << b.shape() << std::endl;
std::cout << a.shape() << std::endl;
}

And it seems to work beautifully! :)
Functions also work; however, I think
Quick note on |
Are these specializations supposed to go into |
I think so. Or, if you would want to separate xtiny, and xtensor<-> xtiny specific stuff, maybe adding another file xtensor_xtiny_compat.hpp would be good? There is a chance we could also do some simplifications regarding the interface. Using decltype and detecting members we could for example add an overload which detects a
Now we do have
The |
Thanks for considering the PR. I guess our discussions did have some influence on your design of statically-shaped tensors and |
Hi Ullrich, of course it influenced a lot and got us thinking about this optimization and, as you know, we looked at your PR a lot for inspiration. I also mentioned that in this blog post: https://medium.com/@wolfv/faster-xtensor-87fcbe7f5293

We still need to add the operations on shapes using xadaptors.
Finally, I was able to finish the PR for tiny arrays. It doesn't yet contain any of the arithmetic or algebraic functions, just the class xtiny and its base classes. Five types of tiny arrays are currently implemented, among them a stack-based variant (called small_vector here) and one on top of xbuffer_adaptor. All types of xtiny support the STL interface, are fully interoperable with each other, and can be constructed from std::vector and std::array. I hope you like it.