Eytzinger search #376
Thanks for the patch. This changes a lot of things that are unrelated to the feature. Please split this patch and put the cosmetic changes (wrapping min/max etc.) in another PR.
Don't make the data structure a template argument. It is fine to replace the standard implementation, because the speed is essentially the same and the additional memory overhead is small. The core logic of the Eytzinger search should be implemented in a separate struct or class, as it is now, so that the core algorithm can be tested in isolation (and potentially reused). Perhaps the vector should be merged into our eytzinger class, although this may have unwanted side effects for the design.
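For readers unfamiliar with the technique, here is a minimal sketch of what an isolated, separately testable Eytzinger search could look like. It is not the code from this PR: the class name `eytzinger_sketch`, the use of `double`, and the `upper_bound` interface are assumptions made for illustration. The sorted input is stored in breadth-first (Eytzinger) order, together with a second array that maps each slot back to its position in the original order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration, not the class from the PR.
// Slot k has its children at 2k and 2k+1; slot 0 is unused.
class eytzinger_sketch {
public:
  // `sorted` is assumed to be in ascending order.
  explicit eytzinger_sketch(const std::vector<double>& sorted)
      : values_(sorted.size() + 1), index_(sorted.size() + 1) {
    std::size_t next = 0;
    build(sorted, next, 1);
  }

  std::size_t size() const { return values_.size() - 1; }

  // Position (in the original sorted order) of the first element strictly
  // greater than x, or size() if there is none, like std::upper_bound.
  std::size_t upper_bound(double x) const {
    const std::size_t n = size();
    std::size_t result = n;
    std::size_t k = 1;
    while (k <= n) {
      if (x < values_[k]) {
        result = index_[k];  // candidate; smaller candidates are to the left
        k = 2 * k;
      } else {
        k = 2 * k + 1;
      }
    }
    return result;
  }

private:
  // In-order traversal of the implicit tree assigns the sorted values, so
  // the tree is a valid binary search tree stored in breadth-first order.
  void build(const std::vector<double>& sorted, std::size_t& next,
             std::size_t k) {
    if (k >= values_.size()) return;
    build(sorted, next, 2 * k);
    values_[k] = sorted[next];
    index_[k] = next++;
    build(sorted, next, 2 * k + 1);
  }

  std::vector<double> values_;      // values in Eytzinger order
  std::vector<std::size_t> index_;  // original position of each slot
};
```

Because the search lives in its own class, it can be unit-tested against `std::upper_bound` on the original sorted sequence without involving any axis type.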
249558b
to
43721fa
Compare
@HDembinski Thank you for simplifying all this! Now I can get rid of the four special members and the extra |
The cosmetic changes (wrapping min/max etc.) are split into #377.
@@ -45,6 +45,12 @@ constexpr E* data(std::valarray<E>& v) noexcept {
  return std::begin(v);
}

#if __cpp_lib_nonmember_container_access >= 201411L
This is also a cosmetic change, please revert.
Got it. Reverted.
(The original intent of this change was to use unqualified `size` inside include\boost\histogram\detail\eytzinger_search.hpp, so that `size` resolves to a user-provided overload (if there is one) through ADL.)
- Before C++17 there is no `std::size`, so I `#include <boost/histogram/detail/nonmember_container_access.hpp>`.
- From C++17 on, `std::size` conflicts with `detail::size` when I write unqualified `size`, so I added the `#if`.
I checked the docs just now. It seems that `size` is intended to be a customization point, so if I use `detail::size`, there is no ADL and the customizability of `size` is lost.
https://en.cppreference.com/w/cpp/iterator/size says:
Custom overloads of size may be provided for classes and enumerations that do not expose a suitable size() member function, yet can be detected.
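The conflict described above can be reproduced in a small standalone sketch. Everything here is hypothetical and only mirrors the idea: `sketch_detail::size` stands in for the library's pre-C++17 fallback, `user::edges` is an invented type with a custom `size` overload, and `count_elements` is an invented helper. The `#if` selects exactly one `size` to bring into scope, so the unqualified call stays unambiguous for standard containers while ADL can still find a user overload.

```cpp
#include <cstddef>
#include <iterator>  // std::size and the feature-test macro (C++17)
#include <vector>

namespace sketch_detail {
// Hypothetical pre-C++17 fallback, standing in for a library-provided size.
template <class C>
constexpr auto size(const C& c) -> decltype(c.size()) {
  return c.size();
}
}  // namespace sketch_detail

namespace user {
// Hypothetical user type without a size() member function.
struct edges {
  double data[3];
};
// Custom overload, found through ADL by an unqualified call to size.
constexpr std::size_t size(const edges&) noexcept { return 3; }
}  // namespace user

template <class Range>
std::size_t count_elements(const Range& r) {
  // Bring exactly one generic size into scope. Using both would make the
  // call ambiguous for std containers once std::size exists.
#if __cpp_lib_nonmember_container_access >= 201411L
  using std::size;
#else
  using sketch_detail::size;
#endif
  return size(r);  // unqualified on purpose, so ADL can pick user::size
}
```

Calling `sketch_detail::size(r)` with a qualified name instead would compile for `std::vector` but fail for `user::edges`, which is the loss of customizability mentioned above.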
allocator_type alloc = {})
    : metadata_base(std::move(meta)), vec_(std::move(alloc)) {
    : metadata_base(std::move(meta))
    , vec_([&] {
        if (std::distance(begin, end) < 2)
          BOOST_THROW_EXCEPTION(std::invalid_argument("bins > 0 required"));

        vector_type vec(std::move(alloc));

        vec.reserve(std::distance(begin, end));
        vec.emplace_back(*begin++);
        bool strictly_ascending = true;
        for (; begin != end; ++begin) {
          strictly_ascending &= vec.back() < *begin;
          vec.emplace_back(*begin);
        }
        if (!strictly_ascending)
          BOOST_THROW_EXCEPTION(
              std::invalid_argument("input sequence must be strictly ascending"));
        return vec;
      }())
    , eytzinger_layout_and_eytzinger_binary_search_(vec_) {
The trick with the lambda is a bit too clever for my taste. I think it is more readable if we keep the code in the body of the constructor. This also does not provide the strong exception guarantee yet. Once the ctor provides the guarantee, that should also be stated in the docstring. `metadata_type` would have to become a forwarding reference for this to work.
Ok, since it is not trivial to achieve, we also break that into a separate PR. Let's keep this PR focussed on implementing eytzinger search.
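As a rough illustration of keeping the validation in the constructor body instead of an immediately invoked lambda, consider the simplified sketch below. It is not the axis class from this PR: `variable_sketch` is a hypothetical stand-in with no metadata, no allocator, and plain `throw` in place of `BOOST_THROW_EXCEPTION`. It assumes forward iterators, as the original code does when it calls `std::distance` before reading the range.

```cpp
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical, stripped-down stand-in for an axis that stores bin edges.
class variable_sketch {
public:
  template <class It>
  variable_sketch(It begin, It end) {
    if (std::distance(begin, end) < 2)
      throw std::invalid_argument("bins > 0 required");

    // Fill a local vector first and commit it only after validation passed.
    std::vector<double> tmp;
    tmp.reserve(static_cast<std::size_t>(std::distance(begin, end)));
    tmp.emplace_back(*begin++);
    for (; begin != end; ++begin) {
      if (!(tmp.back() < *begin))
        throw std::invalid_argument(
            "input sequence must be strictly ascending");
      tmp.emplace_back(*begin);
    }
    vec_ = std::move(tmp);
  }

  // Number of bins is one less than the number of edges.
  std::size_t size() const { return vec_.size() - 1; }

private:
  std::vector<double> vec_;
};
```

The trade-off is that any member depending on `vec_` has to be set up after this point rather than in the member initializer list.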
The initialization of `eytzinger_layout_and_eytzinger_binary_search_` needs an initialized `vec_`.
If we move the initialization code of `vec_` into the constructor body, then `eytzinger_layout_and_eytzinger_binary_search_` will be default-initialized into a dummy state and then initialized again using `eytzinger_layout_and_eytzinger_binary_search_t& assign(const Range& r)`. Is that acceptable?
We don't allow direct access to the array, so it is fine if we internally change the data structure from vector_type to something else. However, users of auto bin(index_type idx) const noexcept { return interval_view<variable>(*this, idx); } A generic iterator and begin() and end() are generated through mixin classes, which call this interface. In most cases, users will iterate over the values in their original order, so if we need to make performance trade-offs, this is the case we should focus on. Ideally, we keep only the eytzinger-sorted array and not the original array.
@HDembinski According to https://www.hyrumslaw.com/ the users of |
That's all true, however:
I am more concerned about the increase in memory from holding two arrays. We already need two arrays instead of one in the current implementation, to hold the eytzinger-sorted values and the indices. Adding yet another array makes the implementation unfavorable in my view. The performance increase of eytzinger is not so dramatic that tripling the amount of required memory seems worth it. If the cons outweigh the pros, I would abandon this change.
@jhcarl0814 I have been thinking about this change and, after all, the costs do seem to outweigh the benefits. Eytzinger uses more storage, iteration over the axis will become slower, and the benefits are only substantial for a very large axis of a kind that we do not encounter in real life. In many cases, the axis should fit into the L1 cache, and then the Eytzinger layout should not provide any benefits.
@HDembinski You are right.
I should have done more study before suggesting this thing. Sorry for taking up your time.
No need to apologize. I could equally apologize to you, since I asked you to prepare a PR too early. We could have figured out these caveats without the PR. Thank you anyway, this was very interesting for me.
design decisions (very very questionable):
- `vec_` is outside of `sorted_array_and_binary_search` and `eytzinger_layout_and_eytzinger_binary_search` to make it easier for outsiders to get `vec_` without knowing the type of `data_structure_` used
- `data_structure_` is needed whenever `vec_` constructs or changes (forgetting to add copy constructor, move constructor, copy assignment operator and move assignment operator causes hard-to-track bugs), and `sorted_array_and_binary_search` and `eytzinger_layout_and_eytzinger_binary_search` each gets an extra `constructor` and an extra `assign` to make `variable`'s move constructor and move assignment operator `noexcept`. `sorted_array_and_binary_search`'s does re-construction from `vec_` and ignores `rhs` because it's shallow; `eytzinger_layout_and_eytzinger_binary_search`'s does `std::move` from `rhs` and ignores `vec_` because it has state and the function needs to be `noexcept`.