Skip to content
Permalink
Branch: master
Find file Copy path
Find file Copy path
4 contributors

Users who have contributed to this file

@DougGregor @airspeedswift @benrimmington @wongzigii
137 lines (111 sloc) 6.44 KB

Introduce withContiguous{Mutable}StorageIfAvailable methods

Introduction

This proposal introduces two new methods, on Sequence and MutableCollection. These methods will allow generic code to make use of the withUnsafe{Mutable}BufferPointer idiom, as well as provide fast paths in the standard library for adopting types.

Swift-evolution thread: Contiguous Collection Protocols

Motivation

Almost every feature of Array is made available via one of the protocols in the standard library, and so most code written against Array can be rewritten generically as an extension of one or more protocols.

The exceptions to this are the operations withUnsafeBufferPointer and withUnsafeMutableBufferPointer, which are only available on the concrete types. Given the usefulness of these methods, they should also be made available generically.

In addition, it is common to be able to provide an optimized fast path for many operations that are generic over Sequence or MutableCollection, when an unsafe buffer is available. For example, initializing a Data from a [UInt8] should be a memcpy when calling the generic initializer from a Sequence where Element == UInt8. sort currently is implemented as a merge sort, and the movements to/from its auxiliary storage can be done using memory moves for non-trivial types when the sorted collection is contiguously stored.

Proposed solution

Introduce two new methods, providing access to the with-unsafe capabilities of Array & co when operating generically on protocols:

protocol Sequence {
  /// Call `body(p)`, where `p` is a pointer to the collection's
  /// contiguous storage.  If no such storage exists, it is
  /// first created.  If the collection does not support an internal
  /// representation in a form of contiguous storage, `body` is not
  /// called and `nil` is returned.
  ///
  /// A `Collection` that provides its own implementation of this method
  /// must also guarantee that an equivalent buffer of its `SubSequence` 
  /// can be generated by advancing the pointer by the distance to the
  /// slice's `startIndex`.
  func withContiguousStorageIfAvailable<R>(
    _ body: (UnsafeBufferPointer<Element>) throws -> R
  ) rethrows -> R?
}

protocol MutableCollection {
  /// Call `body(p)`, where `p` is a pointer to the collection's
  /// mutable contiguous storage.  If no such storage exists, it is
  /// first created.  If the collection does not support an internal
  /// representation in a form of mutable contiguous storage, `body` is not
  /// called and `nil` is returned.
  ///
  /// A `Collection` that provides its own implementation of this method
  /// must also guarantee that an equivalent buffer of its `SubSequence` 
  /// can be generated by advancing the pointer by the distance to the
  /// slice's `startIndex`.
  public mutating func withContiguousMutableStorageIfAvailable<R>(
    _ body: (inout UnsafeMutableBufferPointer<Element>) throws -> R
  ) rethrows -> R?
}

Existing types (such as Array or ArraySlice) that currently support withUnsafe{MutableBuffer}Pointer will forward on to this method. The default implementations will return nil. Additionally, Unsafe{Mutable}BufferPointer will respond usingself, and Slice<T> will provide an implementation that forwards on to its Base.

A customization point already exists with an underscore in the standard library for the mutable version (it just returns nil by default), and should be exposed to general users. In addition, there exist underscored customization points that could be replaced by the immutable variant.

There are no guarantees made by the mutable version about the state left behind if the closure throws during mutation. The updates made may or may not be reflected in the collection (which might have given a direct pointer to its internal storage, or could have handed out a temporary buffer that it then does not write back after the error is thrown). The closure should always perform any cleanup it thinks is necessary itself.

It should be documented that successive calls to withUnsafe{Mutable}BufferPointer are not guaranteed to give you the same pointer with each call. For example, a packed small string implementation may be giving you a pointer to that string temporarily expanded to a buffer.

Use of this entry point can provide significant speedups in some algorithms, e.g. our current sort which needs to move elements of a collection back and forth between some storage.

Source compatibility

These are additive changes and do not affect source compatibility.

Effect on ABI stability

These are additive changes and so can be done without affecting ABI stability. However, some existing underscored entry points could be altered as a result if done before ABI stability is declared.

Alternatives considered

This proposal originally introduced two new protocols: ContiguouslyStored and MutableContiguouslyStored. The introduction of customization points lower in the stack mean that these protocols are not necessary. It may be that valid use cases for asserting contiguity at compile time exist, in which case these protocols could be re-proposed.

Some collections are not fully contiguous, but instead consist of multiple contiguous regions (for example, a ring buffer is one or two separate contiguous regions). Protocols or methods that handle this situation are left to subsequent proposals.

The inout argument to the closure in the mutating variant is debatable. It does imply the user can change the buffer to a totally different one. Nonetheless, this is better handled in documentation, since the improved ergonomics of the inout version are considerable. It would also be a source-breaking change to alter Array's implementation at this point.

You can’t perform that action at this time.