Sequence, iterator, and coordinate protocols

Blei edited this page Dec 27, 2011 · 10 revisions

Basic sequences and iterators

A sequence is an object usable as the subject of a for loop, that is, as the seq in for (x in seq) { ... }. An iterator object provides single-pass sequential iteration over the contents of the sequence. If seq is an lvalue sequence, the iterator allows for in-place modification of the sequence's elements as they are iterated. Iterating an iterator after it is exhausted causes undefined behavior. Iterators are invalidated if the sequence has elements added or removed, or if the sequence is destroyed by moving out of scope or being deallocated.

An iterator is all that is required to use a sequence in a for loop, or to use many of the sequence functions in the standard library, such as mapped and filtered. Sequences may optionally provide additional features by following the additional protocols documented later on this page, which may allow for more efficient implementations of some sequence operations.

Required functions:

  // For sequence type S, iterator type I, element type(s) ...E
  iterator(s: S) : I; // returns iterator object

  hasNext?(i: I) : Bool; // are more elements remaining to be iterated?
  next(i: I) : ..E; // returns next element(s)

Derived functions:

  [S] Sequence?(static S) = CallDefined?(iterator, S);
  [S] SequenceElementType(static S) = ..E;
  [S] SequenceIteratorType(static S) = I;
  [S] LValueSequence?(static S) : Bool; // true if next() can be assigned to

  [I] Iterator?(static I) = CallDefined?(hasNext?, I) and CallDefined?(next, I);
  [I] IteratorTargetType(static I) = ..E;

Reversible sequences

A sequence may provide a reverseIterator to efficiently support the reversed library function.

Required functions:

  reverseIterator(s: S) : I;

Sized sequences

If a sequence has a defined size, it should overload the size function.

Required functions:

  size(s: S) : SizeT;

Derived functions:

  [S] SizedSequence?(static S) = Sequence?(S) and CallDefined?(size, S);

Random access sequences

If a sequence allows cheap random access, it should overload index to implement the seq[n] operator.

Required functions:

  [N | Integer?(N)]
  index(s: S, n: N) : ..E; // returns nth (0-based) element of s (overload for s[n])

Derived functions:

  [S] RandomAccessSequence?(static S) = Sequence?(S) and CallDefined?(index, S, Int);

Coordinate sequences

Coordinates allow for richer traversal of sequences than iterators. They are modeled after what the C++ STL calls "bidirectional iterators". A coordinate represents a position in a sequence, and can be dereferenced, moved backward or forward, and compared for equality with other coordinates referencing the same sequence. Lvalue coordinate sequences may have their elements modified in-place through dereferenced coordinates. Comparing coordinates from different sequences and dereferencing a coordinate moved past the beginning or end of a sequence are undefined. Like iterators, coordinates are generically invalidated if elements are added or removed to the original sequence or if the sequence is destroyed. Specific sequence types may make stronger guarantees about the validity of their coordinates after modification.

Required functions:

  // For sequence S, coordinate C, element types ...E
  begin(s: S) : C; // coordinate referencing the beginning of the sequence
  end(s: S) : C; // coordinate referencing *past* the end of the sequence—not valid to dereference
  
  dereference(c: C) : ..E; // access element from sequence (overload for c^)
  inc(c: C); // move coordinate forward one element
  dec(c: C); // move coordinate backward one element
  equals?(c1: C, c2: C) : Bool; // true if the coordinates reference the same element in the same sequence
      // (overload for c1 == c2)

Derived functions:

  [S] CoordinateSequence?(static S) = Sequence?(S) and CallDefined?(begin, S) and CallDefined?(end, S);
  [S] SequenceCoordinateType(static S) = C;

  [C] Coordinate?(static C) = CallDefined?(dereference, C);
  [C] CoordinateTargetType(static C) = ..E;
  [C] LValueCoordinate?(static C) : Bool; // true if dereference() can be assigned to

  [C] coordinateRange(begin: C, end: C) : I;
      // returns a "coordinate range" that iterates over the elements between two coordinates
      // can be used as a sequence or iterator

  [C] notEquals?(c1: C, c2: C) = not equals?(begin, end); // overload for c1 != c2

Random access coordinates

Random access coordinates, like STL's "random access iterators", extend bidirectional iterators by allowing arbitrary movement of coordinates, coordinate arithmetic, and ordered comparison of coordinates within the same sequence. Comparing or performing arithmetic between coordinates from different sequences is undefined.

Required functions:

  // for coordinate C, signed integer type N, element type E
  add(c: C, n: N) : C; // new coordinate moved by n elements (overload for c + n)
  subtract(c: C, n: N) : C; // new coordinate moved by -n elements (overload for c - n)
  subtract(c1: C, c2: C) : N; // number of elements between c1 and c2 (overload for c1 - c2)
  lesser?(c1: C, c2: C) : Bool; // true if c1 comes before c2 in the sequence (overload for c1 < c2)

Derived functions:

  [C] RandomAccessCoordinate?(static C) =
      Coordinate?(C) and
      CallDefined?(add, C, Int) and
      CallDefined?(subtract, C, Int) and
      CallDefined?(subtract, C, C);

  lesserEquals?(c1: C, c2: C) Bool;
  greater?(c1: C, c2: C) Bool;
  greaterEquals?(c1: C, c2: C) Bool; // other comparison operators defined in terms of lesser?
       // overloads for c1 <= c2, c1 > c2, c1 >= c2