-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Segmented ranges #1582
Segmented ranges #1582
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some of my notes. My biggest concern is indexed|enumerated_iterator
, they are too similar IMO. We could do with just enumerated_iterator
. With some extra storage we could also enable it->
properly in that case.
TBH I find the iterator hierarchy difficult to follow. It would probably be much easier if the type aliases would be more direct.
|
||
|
||
template <typename IndexType> | ||
class integer_iterator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this usually called a counting iterator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this even an iterator? Usually, iterators give access to something. This class only handles an integer and the multiplication with a stride. It doesn't refer to anything (see your own reference
type definition).
integer_iterator<int> iterator(12, 2);
for (auto&& iter : iterator) {
*iter = 3; // This isn't doing anything and should result in a compiler warning
}
An alternative name might be: index_generator
, index_generator_iterator
, or maybe even index_iterator
(even though I still don't think it is a real iterator, but it has most of the properties of one).
You could even put this into the detail
namespace, as I don't expect it to be used directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the name iterator is fine. There are different classes of iterators, the one you described would be an output iterator, but there are also input iterators, which you can only read from. Other libraries, e.g. thrust, also call this an iterator, so it's not uncommon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Counting iterator implies unit stride, this iterator can have arbitrary strides, so I think it's not 100% accurate to call it that. But I agree that index_
would also be appropriate
template <typename ValueIterator, | ||
typename IndexIterator = integer_iterator< | ||
typename std::iterator_traits<ValueIterator>::difference_type>> | ||
class enumerating_indexed_iterator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH I don't understand why there is both indexed_iterator
and enumerating_indexed_iterator
. For me enumerated
and indexed
are (mostly) equivalent. (One difference might be that enumerated always uses the indices [0, n)
, while indexed could use any indices.)
From the implementation, the only difference is how you access the value and the index. I'm not sure if this warrants two separate implementations for two nearly identical types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO the enumerating case is the exception, the indexed iterator is the usual case. Take a Csr SpMV, that can be implemented using an indexed_iterator<zip_iterator<IndexType*, ValueType*>>
:
for (auto [column, value]: range) {
sum += in[column] * value;
}
I can only think of a small number of cases where we actually need the nonzero location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think the names indexed
and enumerated
are too similar. For me they both say that I can iterate over a range and I get both the value and the corresponding index.
And your example, isn't that more of an zip operation than either indexed or enumerated? You are iterating both over the columns array and the values array.
|
||
|
||
template <typename ValueIterator, | ||
typename IndexIterator = integer_iterator< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do maybe want to restrict it to always use integer_iterator
? Otherwise it might be a bit more difficult to understand how it will work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use this to implement a permuting_iterator
, but that's probably beyond the scope of this PR.
using base_value_type = typename base::value_type; | ||
using index_type = typename base::value_type; | ||
|
||
struct enumerated { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest storing a value of this in the iterator to enable it->value()
, it->index()
. But in that case, I would also suggest to not store the index and value directly, but pointers, and use accessor functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I put this into a struct is that then we can use it in structured bindings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you add a special struct for something that could be a std::pair
? The operators ==
and !=
are defined by std::pair
as well.
Are you worried about the other operators (<
, >
, ...) defined by std::pair
? They compare the first argument and only consider the second if the first are identical.
Or do you simply want access to index
and value
by those names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think std::pair would also not work on the device, some functions there are not marked constexpr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without C++17 support, the default way to access the iterator entry is like
for (auto entry : range) {
std::get<0>(entry);
std::get<1>(entry);
entry.index;
entry.value;
}
the latter is much more clear to me, and less error-prone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, maybe we should push for c++17 with this PR? This is not related to anything that openCARP needs, so we could use it.
{ | ||
std::vector<int> v; | ||
|
||
for (auto i : gko::irange<int>(1, 4)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: might also provide make_irange(...)
so that the template parameter doesn't need to be defined, while waiting for c++17
} | ||
|
||
|
||
TEST(IRangeStrided, KnowsItsProperties) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
size test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, size might be ambiguous. For this case it could be either 2 (how often can you increment before reaching the end), or 6 (the end of the stored half-interval).
} | ||
|
||
|
||
#define EXPECT_ASSERT_FAILURE(_expression, ...) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EXPECT_EXIT
is also used elsewhere, do those also need to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not the biggest fan of the _iterator
suffix for most of the classes because they don't really reference any underlying data.
The _range
suffix makes sense (classes with begin()
and end()
, somewhat similar to python range
).
The enumerating
prefix isn't descriptive enough. To me, those are pair ranges (instead of returning an index or value iterator, they return a pair of both).
Additionally, I would like to have documentation for all new classes (and maybe examples for the more complicated classes like segmented_range
).
/** | ||
* Implements all `random_access_iterator` operations for _iterator in terms of | ||
* the already implemented advance `operator +=(difference_type)`, the | ||
* difference operator `operator-(_iterator, _iterator)` and the deference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo
* difference operator `operator-(_iterator, _iterator)` and the deference | |
* difference operator `operator-(_iterator, _iterator)` and the dereference |
|
||
|
||
template <typename IndexType> | ||
class integer_iterator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this even an iterator? Usually, iterators give access to something. This class only handles an integer and the multiplication with a stride. It doesn't refer to anything (see your own reference
type definition).
integer_iterator<int> iterator(12, 2);
for (auto&& iter : iterator) {
*iter = 3; // This isn't doing anything and should result in a compiler warning
}
An alternative name might be: index_generator
, index_generator_iterator
, or maybe even index_iterator
(even though I still don't think it is a real iterator, but it has most of the properties of one).
You could even put this into the detail
namespace, as I don't expect it to be used directly.
template <typename Group> | ||
constexpr irange_strided<index_type> striped(Group g) const | ||
{ | ||
return striped(detail::group_traits<Group>::get_local_id(g), | ||
detail::group_traits<Group>::get_size(g)); | ||
} | ||
|
||
constexpr irange_strided<index_type> striped(index_type local_index, | ||
index_type group_size) const | ||
{ | ||
assert(local_index >= 0); | ||
assert(local_index < group_size); | ||
return irange_strided<index_type>{begin_index() + local_index, | ||
end_index(), group_size}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume these 2 functions are the reason why you separated irange
from irange_strided
. Are they targeting SYCL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are targeting SIMD/coalescing memory accesses in general. My aim is to turn the first loop into the second:
for (auto i = begin + warp.thread_rank(); i < end; i += warp.size());
for (auto i : irange{begin, end}.striped(warp))
I will remove the templated version though, since that will be part of a separate PR.
using base_value_type = typename base::value_type; | ||
using index_type = typename base::value_type; | ||
|
||
struct enumerated { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you add a special struct for something that could be a std::pair
? The operators ==
and !=
are defined by std::pair
as well.
Are you worried about the other operators (<
, >
, ...) defined by std::pair
? They compare the first argument and only consider the second if the first are identical.
Or do you simply want access to index
and value
by those names?
using index_iterator = IndexIterator; | ||
|
||
private: | ||
using value_traits = std::iterator_traits<index_iterator>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy-paste error:
using value_traits = std::iterator_traits<index_iterator>; | |
using value_traits = std::iterator_traits<value_iterator>; |
Closing in favor of #1601 |
This PR adds a bunch of utility abstractions for simplifying common iteration patterns in Ginkgo.
irange
: Similar to the Pythonrange([start, ] stop[, step])
function, theirange
provides a range that can be used in range-for loops (and the corresponding iterators in standard library algorithms):The range-for loop has the added advantage that the iteration variable can be made
const
, which prevents accidentally incrementing the wrong variable, e.g. in a nested loop. The compiler generates 100% identical code for this thanks to inlining, constant folding and common subexpression elimination.(enumerating_)indexed_iterator
is pretty similar topermuting_iterator
, but its intended usage is slightly different - it is mainly intended to provide strided access to an array, e.g. when stepping through a Csr matrix row with a whole warp, assigning nonzeros in a striped fashion. Theenumerating_
variant returns a struct(index, value)
. The type is unlikely to be used directly, but used as additional functionality in the following segmented range abstraction.segmented_((enumerating_)value_)range
provides a range-of-ranges abstraction that can cover most of our typical Csrrow_ptrs
access patterns. The final result is intended to look as follows:Additionally, I am planning to make coalesced/striped work assignment more clear with some small helpers that turn regular ranges into strided ranges based on the warp/subgroup lane and size:
Similarly, we can provide a
blocked(subwarp)
function, which assigns a consecutive block of indices to each thread, when that is useful. The inspiration for this approach comes from cub's striped and blocked work assignment strategiesTODO: