Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmented ranges #1582

Closed
wants to merge 6 commits into from
Closed

Segmented ranges #1582

wants to merge 6 commits into from

Conversation

upsj
Copy link
Member

@upsj upsj commented Apr 1, 2024

This PR adds a bunch of utility abstractions for simplifying common iteration patterns in Ginkgo.

  1. irange: Similar to the Python range([start, ] stop[, step]) function, the irange provides a range that can be used in range-for loops (and the corresponding iterators in standard library algorithms):
for (int i = 0; i < size; i++)
for (auto j = begin; j < end; j += step)
// becomes
for (auto i : irange<int>(size))
for (auto j : irange<int>(begin, end, step))
// or with C++17 CTAD
for (auto i : irange(size))
for (auto j : irange(begin, end, step))

The range-for loop has the added advantage that the iteration variable can be made const, which prevents accidentally incrementing the wrong variable, e.g. in a nested loop. The compiler generates 100% identical code for this thanks to inlining, constant folding and common subexpression elimination.

  1. (enumerating_)indexed_iterator is pretty similar to permuting_iterator, but its intended usage is slightly different - it is mainly intended to provide strided access to an array, e.g. when stepping through a Csr matrix row with a whole warp, assigning nonzeros in a striped fashion. The enumerating_ variant returns a struct (index, value). The type is unlikely to be used directly, but used as additional functionality in the following segmented range abstraction.

  2. segmented_((enumerating_)value_)range provides a range-of-ranges abstraction that can cover most of our typical Csr row_ptrs access patterns. The final result is intended to look as follows:

const auto begin = row_ptrs[row];
const auto end = row_ptrs[row + 1];
for (auto nz = begin; nz < end; nz++) {
    const auto col = cols[nz];
    const auto val = vals[nz];
    // ...
}
// with C++17 CTAD and structured bindings
csr_view{row_ptrs, cols, vals, num_rows}; // the actual type will probably differ
for (auto [nz, col, value] : csr_view.enumerate(row)) {
    // ...
}

Additionally, I am planning to make coalesced/striped work assignment more clear with some small helpers that turn regular ranges into strided ranges based on the warp/subgroup lane and size:

const auto begin = row_ptrs[row];
const auto end = row_ptrs[row + 1];
for (auto nz = begin + subwarp.thread_rank(); nz < end; nz += subwarp.size()) {
    const auto col = cols[nz];
    const auto val = vals[nz];
    // ...
}
// with C++17 CTAD and structured bindings
csr_view{row_ptrs, cols, vals, num_rows}; // the actual type will probably differ
for (auto [nz, col, value] : csr_view.enumerate(row).striped(subwarp)) {
    // ...
}

Similarly, we can provide a blocked(subwarp) function, which assigns a consecutive block of indices to each thread, when that is useful. The inspiration for this approach comes from cub's striped and blocked work assignment strategies

TODO:

  • more comprehensive tests
  • device tests to check compilation
  • strided ranges with group support

@upsj upsj self-assigned this Apr 1, 2024
@ginkgo-bot ginkgo-bot added reg:build This is related to the build system. reg:testing This is related to testing. mod:core This is related to the core module. labels Apr 1, 2024
@upsj upsj mentioned this pull request Apr 3, 2024
6 tasks
@MarcelKoch MarcelKoch modified the milestone: Ginkgo 1.8.0 Apr 5, 2024
Copy link
Member

@MarcelKoch MarcelKoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some of my notes. My biggest concern is indexed|enumerated_iterator, they are too similar IMO. We could do with just enumerated_iterator. With some extra storage we could also enable it-> properly in that case.

TBH I find the iterator hierarchy difficult to follow. It would probably be much easier if the type aliases would be more direct.



template <typename IndexType>
class integer_iterator {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this usually called a counting iterator?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this even an iterator? Usually, iterators give access to something. This class only handles an integer and the multiplication with a stride. It doesn't refer to anything (see your own reference type definition).

integer_iterator<int> iterator(12, 2);
for (auto&& iter : iterator) {
    *iter = 3; // This isn't doing anything and should result in a compiler warning
}

An alternative name might be: index_generator, index_generator_iterator, or maybe even index_iterator (even though I still don't think it is a real iterator, but it has most of the properties of one).
You could even put this into the detail namespace, as I don't expect it to be used directly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the name iterator is fine. There are different classes of iterators, the one you described would be an output iterator, but there are also input iterators, which you can only read from. Other libraries, e.g. thrust, also call this an iterator, so it's not uncommon.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Counting iterator implies unit stride, this iterator can have arbitrary strides, so I think it's not 100% accurate to call it that. But I agree that index_ would also be appropriate

core/base/integer_range.hpp Show resolved Hide resolved
template <typename ValueIterator,
typename IndexIterator = integer_iterator<
typename std::iterator_traits<ValueIterator>::difference_type>>
class enumerating_indexed_iterator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I don't understand why there is both indexed_iterator and enumerating_indexed_iterator. For me enumerated and indexed are (mostly) equivalent. (One difference might be that enumerated always uses the indices [0, n), while indexed could use any indices.)

From the implementation, the only difference is how you access the value and the index. I'm not sure if this warrants two separate implementations for two nearly identical types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the enumerating case is the exception, the indexed iterator is the usual case. Take a Csr SpMV, that can be implemented using an indexed_iterator<zip_iterator<IndexType*, ValueType*>>:

for (auto [column, value]: range) {
    sum += in[column] * value;
}

I can only think of a small number of cases where we actually need the nonzero location.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think the names indexed and enumerated are too similar. For me they both say that I can iterate over a range and I get both the value and the corresponding index.

And your example, isn't that more of an zip operation than either indexed or enumerated? You are iterating both over the columns array and the values array.



template <typename ValueIterator,
typename IndexIterator = integer_iterator<
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do maybe want to restrict it to always use integer_iterator? Otherwise it might be a bit more difficult to understand how it will work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use this to implement a permuting_iterator, but that's probably beyond the scope of this PR.

using base_value_type = typename base::value_type;
using index_type = typename base::value_type;

struct enumerated {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest storing a value of this in the iterator to enable it->value(), it->index(). But in that case, I would also suggest to not store the index and value directly, but pointers, and use accessor functions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I put this into a struct is that then we can use it in structured bindings

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you add a special struct for something that could be a std::pair? The operators == and != are defined by std::pair as well.
Are you worried about the other operators (<, >, ...) defined by std::pair? They compare the first argument and only consider the second if the first are identical.
Or do you simply want access to index and value by those names?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think std::pair would also not work on the device, some functions there are not marked constexpr.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without C++17 support, the default way to access the iterator entry is like

for (auto entry : range) {
    std::get<0>(entry);
    std::get<1>(entry);
    entry.index;
    entry.value;
}

the latter is much more clear to me, and less error-prone

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, maybe we should push for c++17 with this PR? This is not related to anything that openCARP needs, so we could use it.

core/base/segmented_range.hpp Show resolved Hide resolved
core/base/segmented_range.hpp Show resolved Hide resolved
{
std::vector<int> v;

for (auto i : gko::irange<int>(1, 4)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: might also provide make_irange(...) so that the template parameter doesn't need to be defined, while waiting for c++17

}


TEST(IRangeStrided, KnowsItsProperties)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size test?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, size might be ambiguous. For this case it could be either 2 (how often can you increment before reaching the end), or 6 (the end of the stored half-interval).

}


#define EXPECT_ASSERT_FAILURE(_expression, ...) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EXPECT_EXIT is also used elsewhere, do those also need to change?

Copy link
Member

@thoasm thoasm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not the biggest fan of the _iterator suffix for most of the classes because they don't really reference any underlying data.
The _range suffix makes sense (classes with begin() and end(), somewhat similar to python range).
The enumerating prefix isn't descriptive enough. To me, those are pair ranges (instead of returning an index or value iterator, they return a pair of both).
Additionally, I would like to have documentation for all new classes (and maybe examples for the more complicated classes like segmented_range).

/**
* Implements all `random_access_iterator` operations for _iterator in terms of
* the already implemented advance `operator +=(difference_type)`, the
* difference operator `operator-(_iterator, _iterator)` and the deference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo

Suggested change
* difference operator `operator-(_iterator, _iterator)` and the deference
* difference operator `operator-(_iterator, _iterator)` and the dereference

core/base/integer_range.hpp Show resolved Hide resolved
core/base/integer_range.hpp Show resolved Hide resolved


template <typename IndexType>
class integer_iterator {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this even an iterator? Usually, iterators give access to something. This class only handles an integer and the multiplication with a stride. It doesn't refer to anything (see your own reference type definition).

integer_iterator<int> iterator(12, 2);
for (auto&& iter : iterator) {
    *iter = 3; // This isn't doing anything and should result in a compiler warning
}

An alternative name might be: index_generator, index_generator_iterator, or maybe even index_iterator (even though I still don't think it is a real iterator, but it has most of the properties of one).
You could even put this into the detail namespace, as I don't expect it to be used directly.

Comment on lines +237 to +251
template <typename Group>
constexpr irange_strided<index_type> striped(Group g) const
{
return striped(detail::group_traits<Group>::get_local_id(g),
detail::group_traits<Group>::get_size(g));
}

constexpr irange_strided<index_type> striped(index_type local_index,
index_type group_size) const
{
assert(local_index >= 0);
assert(local_index < group_size);
return irange_strided<index_type>{begin_index() + local_index,
end_index(), group_size};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume these 2 functions are the reason why you separated irange from irange_strided. Are they targeting SYCL?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are targeting SIMD/coalescing memory accesses in general. My aim is to turn the first loop into the second:

for (auto i = begin + warp.thread_rank(); i < end; i += warp.size());
for (auto i : irange{begin, end}.striped(warp))

I will remove the templated version though, since that will be part of a separate PR.

using base_value_type = typename base::value_type;
using index_type = typename base::value_type;

struct enumerated {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you add a special struct for something that could be a std::pair? The operators == and != are defined by std::pair as well.
Are you worried about the other operators (<, >, ...) defined by std::pair? They compare the first argument and only consider the second if the first are identical.
Or do you simply want access to index and value by those names?

using index_iterator = IndexIterator;

private:
using value_traits = std::iterator_traits<index_iterator>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy-paste error:

Suggested change
using value_traits = std::iterator_traits<index_iterator>;
using value_traits = std::iterator_traits<value_iterator>;

This was referenced Apr 29, 2024
@MarcelKoch MarcelKoch removed this from the Ginkgo 1.8.0 milestone May 3, 2024
@tcojean tcojean added this to the Ginkgo 1.9.0 milestone May 3, 2024
@upsj
Copy link
Member Author

upsj commented Jul 15, 2024

Closing in favor of #1601

@upsj upsj closed this Jul 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mod:core This is related to the core module. reg:build This is related to the build system. reg:testing This is related to testing.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants