Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<array>: Should array<T, N> iterators depend on N? #211

Open
StephanTLavavej opened this issue Oct 25, 2019 · 14 comments
Open

<array>: Should array<T, N> iterators depend on N? #211

StephanTLavavej opened this issue Oct 25, 2019 · 14 comments
Labels
throughput Must compile faster vNext Breaks binary compatibility

Comments

@StephanTLavavej
Copy link
Member

StephanTLavavej commented Oct 25, 2019

Long ago, we made most of our iterators SCARY (see WG21-N2911 and WG21-N2980). This means that they don't depend on allocators, comparators, etc.

array<T, N> iterators are an interesting case, as they're still templated on N:

STL/stl/inc/xutility

Lines 2759 to 2760 in 6b0238d

template <class _Ty, size_t _Size>
class _Array_const_iterator

I believe that for IDL=0 it would be better to avoid depending on N. That would reduce instantiations and improve throughput. For IDL=2 I'm less certain - if we don't template on N, we have to store it at runtime (which probably isn't a big deal, IDL=2 is already costly). If we do template on N (like today), then we'd be setting up an unusual situation where iterator types observably change depending on IDL (not just their representations).

vNext note: Resolving this issue will require breaking binary compatibility. We won't be able to accept pull requests for this issue until the vNext branch is available. See #169 for more information.

@StephanTLavavej StephanTLavavej added enhancement Something can be improved vNext Breaks binary compatibility throughput Must compile faster labels Oct 25, 2019
@miscco
Copy link
Contributor

miscco commented Oct 28, 2019

I would like to add that this also applies to _Span_iterator for static spans.

Moreover when I wrote _Span_iterator I actually thought about creating a common _Random_Access_Iterator base class as all the machinery is basically the same and they only ever differ by constructors. However, in the end I went with minimal changes, cleanup later.

Also the pointer based implementation suggested by @CaseyCarter is much clearer than the {pointer , size, size} implementation used here.

I have to check what _String_view_iterator does but I would guess one could base that one also on a generic implementation.

@MikeGitb
Copy link

Just wondering: Have you folks considered making a common base class (template) for iterators that are effectively pointer-like (vector, array, span, string, string_view)? Or are there too many differences in the details.

Regarding the original question:
If you made aray's iterator not dependent on size (which I would prefer), would wou really need to store ptr_to_array + offset + size for the IDL=2 version? Couldn't you just represent it as ptr_to_element + distance to end? Personally I'd rather pay the cost of an additional decrement during iteration than the increase in size.

Or do you actually need a pointer to the container for debugging purposes?

@miscco
Copy link
Contributor

miscco commented Oct 28, 2019

I can only speak from my experinence with implementing span. So I am just a "dude".

The first thing is that one needs two additional data members for IDL = 2. The reason is that we need to validate three things. begin, data and end. These are 3 independent values.

In principle it would be possible to create a generalized base _Continuous_iterator that simply holds a reference to its parent object. With that we could perform all range checks against base._Unchecked_begin() and base._Unchecked_end() only string a single reference.

That would have the advantage that we could abstract away all the differences between the different iterators of the contiguous_iterator concept. However, that comes with a considerable cost of an additional indirection when debugging and an additional cost when compiling in debug. Also if one uses a reference it would not be default constructible anymore.

Is that worth the simplified code? I dont know. In my own pet projects and even at work I would definitely say yes. For something as important and widely used as the STL maybe not?

For reference what I thought about would look like

// STRUCT TEMPLATE _Contiguous_iterator
template <class _Base>
struct _Contiguous_iterator{
#ifdef __cpp_lib_concepts
    using iterator_concept = contiguous_iterator_tag;
#endif // __cpp_lib_concepts
    using iterator_category = random_access_iterator_tag;
    using value_type        = typename _Base::value_type;
    using difference_type   = ptrdiff_t;
    using pointer           = typename _Base::pointer;
    using reference         = typename _Base::reference;

#if _ITERATOR_DEBUG_LEVEL == 0
    constexpr _Contiguous_iterator() = default;
#endif // _ITERATOR_DEBUG_LEVEL == 0

#if _ITERATOR_DEBUG_LEVEL >= 1
    constexpr _Contiguous_iterator(const pointer _Pointer, const _Base& _Cont) noexcept
        : _Myptr(_Pointer), _Mycontainer(_Cont) {}
#else // ^^^ _ITERATOR_DEBUG_LEVEL >= 1 ^^^ // vvv _ITERATOR_DEBUG_LEVEL == 0 vvv
    constexpr explicit _Contiguous_iterator(const pointer _Pointer) noexcept : _Myptr(_Pointer) {}
#endif // _ITERATOR_DEBUG_LEVEL

    _NODISCARD constexpr reference operator*() const noexcept {
#if _ITERATOR_DEBUG_LEVEL >= 1
        _STL_VERIFY(_Mycontainer._Unchecked_begin(), "cannot dereference value-initialized iterator");
        _STL_VERIFY(_Myptr < _Mycontainer._Unchecked_end(), "cannot dereference end iterator");
#endif // _ITERATOR_DEBUG_LEVEL >= 1
        return *_Myptr;
    }

    _NODISCARD constexpr pointer operator->() const noexcept {
#if _ITERATOR_DEBUG_LEVEL >= 1
        _STL_VERIFY(_Mycontainer._Unchecked_begin(), "cannot dereference value-initialized iterator");
        _STL_VERIFY(_Myptr < _Mycontainer._Unchecked_end(), "cannot dereference end iterator");
#endif // _ITERATOR_DEBUG_LEVEL >= 1
        return _Myptr;
    }

    constexpr _Contiguous_iterator& operator++() noexcept {
#if _ITERATOR_DEBUG_LEVEL >= 1
        _STL_VERIFY(_Mycontainer._Unchecked_begin(), "cannot increment value-initialized iterator");
        _STL_VERIFY(_Myptr < _Mycontainer._Unchecked_end(), "cannot increment iterator past end");
#endif // _ITERATOR_DEBUG_LEVEL >= 1
        ++_Myptr;
        return *this;
    }

    constexpr _Contiguous_iterator operator++(int) noexcept {
        _Contiguous_iterator _Tmp{*this};
        ++*this;
        return _Tmp;
    }

    constexpr _Contiguous_iterator& operator--() noexcept {
#if _ITERATOR_DEBUG_LEVEL >= 1
        _STL_VERIFY(_Mycontainer._Unchecked_begin(), "cannot decrement value-initialized iterator");
        _STL_VERIFY(_Mycontainer._Unchecked_begin()< _Myptr, "cannot decrement iterator before begin");
#endif // _ITERATOR_DEBUG_LEVEL >= 1
        --_Myptr;
        return *this;
    }

    constexpr _Contiguous_iterator operator--(int) noexcept {
        _Contiguous_iterator _Tmp{*this};
        --*this;
        return _Tmp;
    }

    constexpr void _Verify_offset([[maybe_unused]] const difference_type _Off) const noexcept {
#if _ITERATOR_DEBUG_LEVEL >= 1
        if (_Off != 0) {
            _STL_VERIFY(_Mycontainer._Unchecked_begin(), "cannot seek value-initialized iterator");
        }

        if (_Off < 0) {
            _STL_VERIFY(_Myptr - _Mycontainer._Unchecked_begin() >= -_Off, "cannot seek iterator before begin");
        }

        if (_Off > 0) {
            _STL_VERIFY(_Mycontainer._Unchecked_end() - _Myptr >= _Off, "cannot seek iterator after end");
        }
#endif // _ITERATOR_DEBUG_LEVEL >= 1
    }

    constexpr _Contiguous_iterator& operator+=(const difference_type _Off) noexcept {
#if _ITERATOR_DEBUG_LEVEL >= 1
        _Verify_offset(_Off);
#endif // _ITERATOR_DEBUG_LEVEL >= 1
        _Myptr += _Off;
        return *this;
    }

    _NODISCARD constexpr _Contiguous_iterator operator+(const difference_type _Off) const noexcept {
        _Contiguous_iterator _Tmp{*this};
        _Tmp += _Off;
        return _Tmp;
    }

    constexpr _Contiguous_iterator& operator-=(const difference_type _Off) noexcept {
#if _ITERATOR_DEBUG_LEVEL >= 1
        if (_Off != 0) {
            _STL_VERIFY(_Mycontainer._Unchecked_begin(), "cannot seek value-initialized iterator");
        }

        if (_Off > 0) {
            _STL_VERIFY(_Myptr - _Mycontainer._Unchecked_begin() >= _Off, "cannot seek iterator before begin");
        }

        if (_Off < 0) {
            _STL_VERIFY(_Mycontainer._Unchecked_end() - _Myptr >= -_Off, "cannot seek iterator after end");
        }
#endif // _ITERATOR_DEBUG_LEVEL >= 1
        _Myptr -= _Off;
        return *this;
    }

    _NODISCARD constexpr _Contiguous_iterator operator-(const difference_type _Off) const noexcept {
        _Contiguous_iterator _Tmp{*this};
        _Tmp -= _Off;
        return _Tmp;
    }

    _NODISCARD constexpr difference_type operator-(const _Contiguous_iterator& _Right) const noexcept {
#if _ITERATOR_DEBUG_LEVEL >= 1
        _STL_VERIFY(
            _Mycontainer._Unchecked_begin() == _Right._Mycontainer._Unchecked_begin() && _Mycontainer._Unchecked_end() == _Right._Mycontainer._Unchecked_end(), "cannot subtract incompatible iterators");
#endif // _ITERATOR_DEBUG_LEVEL >= 1
        return _Myptr - _Right._Myptr;
    }

    _NODISCARD constexpr reference operator[](const difference_type _Off) const noexcept {
        return *(*this + _Off);
    }

    _NODISCARD constexpr bool operator==(const _Contiguous_iterator& _Right) const noexcept {
#if _ITERATOR_DEBUG_LEVEL >= 1
        _STL_VERIFY(_Mycontainer._Unchecked_begin() == _Right._Mycontainer._Unchecked_begin() && _Mycontainer._Unchecked_end() == _Right._Mycontainer._Unchecked_end(),
            "cannot compare incompatible iterators for equality");
#endif // _ITERATOR_DEBUG_LEVEL >= 1
        return _Myptr == _Right._Myptr;
    }

    _NODISCARD constexpr bool operator!=(const _Contiguous_iterator& _Right) const noexcept {
        return !(*this == _Right);
    }

    _NODISCARD constexpr bool operator<(const _Contiguous_iterator& _Right) const noexcept {
#if _ITERATOR_DEBUG_LEVEL >= 1
        _STL_VERIFY(
            _Mycontainer._Unchecked_begin() == _Right._Mycontainer._Unchecked_begin() && _Mycontainer._Unchecked_end() == _Right._Mycontainer._Unchecked_end(), "cannot compare incompatible iterators");
#endif // _ITERATOR_DEBUG_LEVEL >= 1
        return _Myptr < _Right._Myptr;
    }

    _NODISCARD constexpr bool operator>(const _Contiguous_iterator& _Right) const noexcept {
        return _Right < *this;
    }

    _NODISCARD constexpr bool operator<=(const _Contiguous_iterator& _Right) const noexcept {
        return !(_Right < *this);
    }

    _NODISCARD constexpr bool operator>=(const _Contiguous_iterator& _Right) const noexcept {
        return !(*this < _Right);
    }

#if _ITERATOR_DEBUG_LEVEL >= 1
    friend constexpr void _Verify_range(const _Contiguous_iterator& _First, const _Contiguous_iterator& _Last) {
        _STL_VERIFY(_First._Mycontainer._Unchecked_begin() == _Last._Mycontainer._Unchecked_begin() && _First._Mycontainer._Unchecked_end() == _Last._Mycontainer._Unchecked_end(),
            "iterators from different containers do not form a range");
        _STL_VERIFY(_First._Myptr <= _Last._Myptr, "iterator range transposed");
    }
#endif // _ITERATOR_DEBUG_LEVEL >= 1

    using _Prevent_inheriting_unwrap = _Contiguous_iterator;

    _NODISCARD constexpr pointer _Unwrapped() const noexcept {
        return _Myptr;
    }

    static constexpr bool _Unwrap_when_unverified = _ITERATOR_DEBUG_LEVEL == 0;

    constexpr void _Seek_to(const pointer _It) noexcept {
        _Myptr = _It;
    }

    pointer _Myptr = nullptr;
#if _ITERATOR_DEBUG_LEVEL >= 1
    const _Base& _Mycontainer;
#endif // _ITERATOR_DEBUG_LEVEL >= 1
};

@MikeGitb
Copy link

These are 3 independent values.

Right. sorry, I always forget that you also have to check the lower bound of the array.

@StephanTLavavej
Copy link
Member Author

In principle it would be possible to create a generalized base _Continuous_iterator that simply holds a reference to its parent object.

When containers are swappable/movable while preserving iterators, you have to update those parent pointers. We currently have a "proxy object" solution for this, which is problematic (it requires dynamic memory allocation). In vNext, we're going to eliminate the proxy object and simply walk through all outstanding iterators (which requires each iterator to be chained into a linked list; currently singly linked, in vNext it will be doubly linked). Note that there is no way to update parent pointers if you don't have a proxy object and don't chain all of the iterators together - because how could the parent container know where its iterators are, in order to update them?

This doesn't apply to array (which doesn't have vector-like indirection).

@miscco
Copy link
Contributor

miscco commented Oct 29, 2019

So If I understand you correctly in IDL==2 each instance of a container with continuous heap memory (vector/deque/string) will have a std::list with all iterators that have been constructed from it via the member methods begin, end ... Then inside the move operator it will walk that list after the move and update all iterators to its new position?

In that case one would use a _Base* rather than a reference and all is good?

@StephanTLavavej
Copy link
Member Author

each instance of a container with continuous heap memory

All non-array containers, because they all need to deal with iterator invalidation.

(vector/deque/string)

Aside: deque is non-contiguous (yet random-access).

will have a std::list with all iterators that have been constructed from it

It's not a std::list. Instead it's an intrusive linked list, where each iterator itself contains a pointer to the "next" iterator. (In the current repo, this is a singly-linked list. In vNext, it will be a doubly-linked list.)

Here is the "next" pointer in the current repo:

_Iterator_base12* _Mynextiter;

Then inside the move operator it will walk that list after the move and update all iterators to its new position?

That's correct for vNext. (We have to hold the debug lock while doing so.)

In that case one would use a _Base* rather than a reference and all is good?

Yes, in vNext we can maintain a direct parent pointer.

@miscco
Copy link
Contributor

miscco commented Nov 6, 2019

If you are looking at std::vector this is already the case

    _NODISCARD pointer operator->() const {
#if _ITERATOR_DEBUG_LEVEL != 0
        const auto _Mycont = static_cast<const _Myvec*>(this->_Getcont());
        _STL_VERIFY(_Ptr, "can't dereference value-initialized vector iterator");
        _STL_VERIFY(
            _Mycont->_Myfirst <= _Ptr && _Ptr < _Mycont->_Mylast, 
            "can't dereference out of range vector iterator");
#endif // _ITERATOR_DEBUG_LEVEL != 0
        return _Ptr;
    }

However, I am just a dude and that was just a fast idea.

@StephanTLavavej
Copy link
Member Author

StephanTLavavej commented Nov 7, 2019

Contiguous iterators must support to_address(end_iterator), https://eel.is/c++draft/iterator.concept.contiguous .

We implement this by providing specializations of pointer_traits for our contiguous iterators. Examples:

STL/stl/inc/vector

Lines 207 to 232 in d9cf06e

#if _HAS_CXX20
template <class _Myvec>
struct pointer_traits<_Vector_const_iterator<_Myvec>> {
using pointer = _Vector_const_iterator<_Myvec>;
using element_type = const typename pointer::value_type;
using difference_type = typename pointer::difference_type;
_NODISCARD static constexpr element_type* to_address(const pointer _Iter) noexcept {
#if _ITERATOR_DEBUG_LEVEL != 0
// A value-initialized iterator is in the domain of to_address. An invalidated end iterator for a vector with
// capacity() of 0 is not. This function cannot distinguish those two cases, so it incorrectly does not diagnose
// the latter. In practice, this isn't a significant problem since to_address returns nullptr for such an
// iterator.
const auto _Mycont = static_cast<const _Myvec*>(_Iter._Getcont());
if (_Mycont) {
_STL_VERIFY(_Mycont->_Myfirst <= _Iter._Ptr && _Iter._Ptr <= _Mycont->_Mylast,
"can't convert out-of-range vector iterator to pointer");
} else {
_STL_VERIFY(!_Iter._Ptr, "can't convert invalid vector iterator to pointer");
}
#endif // _ITERATOR_DEBUG_LEVEL != 0
return _STD to_address(_Iter._Ptr);
}
};
#endif // _HAS_CXX20

STL/stl/inc/xutility

Lines 3030 to 3041 in d9cf06e

#if _HAS_CXX20
template <class _Ty, size_t _Size>
struct pointer_traits<_Array_const_iterator<_Ty, _Size>> {
using pointer = _Array_const_iterator<_Ty, _Size>;
using element_type = const _Ty;
using difference_type = ptrdiff_t;
_NODISCARD static constexpr element_type* to_address(const pointer _Iter) noexcept {
return _Iter._Unwrapped();
}
};
#endif // _HAS_CXX20

These specializations are then detected by non-member to_address() and are used instead of operator->(), https://eel.is/c++draft/pointer.conversion .

(Standardese citations as of WG21-N4835.)

@StephanTLavavej StephanTLavavej added the decision needed We need to choose something before working on this label Nov 16, 2019
@StephanTLavavej
Copy link
Member Author

For span, we decided to erase the extent. This improves release throughput, at the cost of debug iterator size.

@miscco
Copy link
Contributor

miscco commented Feb 4, 2020

I think there is a very interesting decision to make. Do we go for

template <class _Ty>
struct iterator {
    using pointer = _Ty*;

    pointer _Mydata;
#if _ITERATOR_DEBUG_LEVEL >= 1
    pointer _Mybegin;
    pointer _Myend;
#endif // _ITERATOR_DEBUG_LEVEL >= 1
}

or

template <class _Containter>
struct iterator {
    using pointer = _Containter::pointer;

    pointer _Mydata;
#if _ITERATOR_DEBUG_LEVEL >= 1
    _Containter* _Mycontainer;
#endif // _ITERATOR_DEBUG_LEVEL >= 1
}

The former has the advantage that it is closer to the memory model of a continuous range and there are no indirections. However, it requires twice as much memory and updating the iterators requires two stores.

The latter is smaller in footprint and simpler to update, However, everything will go through an _Mycontainer->{begin, end}() indirection and we have the explicit dependency on the container type.

Tradeoffs...

I would like to note that removing the extent from span::iterator has the additional benefit that this iterator could be used as a blueprint for _Continuous_iterator unifying all thoise continous_range iterators

@MikeGitb
Copy link

MikeGitb commented Feb 4, 2020

I'd definetely prefer option 1 - if only to make error messages more readable. I also doubt (but may be wrong) that storing a large nummer of iterators (where the size would matter) is a common usecase.

@miscco
Copy link
Contributor

miscco commented Feb 4, 2020

I'd definetely prefer option 1 - if only to make error messages more readable. I also doubt (but may be wrong) that storing a large nummer of iterators (where the size would matter) is a common usecase.

For DEBUG every iterator alive of every container is stored in a linked list so it might accumulate but on the other hand it is a debug build.

@StephanTLavavej StephanTLavavej removed the enhancement Something can be improved label Feb 5, 2020
@StephanTLavavej StephanTLavavej removed the decision needed We need to choose something before working on this label Mar 8, 2023
@StephanTLavavej
Copy link
Member Author

We talked about this at the weekly maintainer meeting and decided to follow span's example here - in vNext, we would like to make the types of array iterators independent of their N.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
throughput Must compile faster vNext Breaks binary compatibility
Projects
None yet
Development

No branches or pull requests

4 participants
@miscco @StephanTLavavej @MikeGitb and others