
Borrowable container prototype #411

Open · wants to merge 16 commits into base: future
Conversation

@lorentey lorentey commented Aug 6, 2024

This is a very early draft of an in-memory container construct that supports noncopyable elements and provides direct borrowing access to its storage.

  • The Container protocol is a noncopyable amalgamation of multi-pass Sequence types and a limited subset of Collection. The same construct provides both iteration and indexing.
  • Iteration is done by exposing Spans over in-memory storage. Indices are better integrated with iterators -- we can start an iterator at a given index, and retrieve the index for the current position of any iterator.
  • Container has a BorrowingIteratorProtocol sister protocol. This may get merged into Container to reduce the number of conformances needed, but initial drafts indicated that doing so would not lead to a satisfying user experience.
  • Container has no equivalent for Indices nor SubSequence. If stored borrows become a thing, then these may come back, although perhaps not as customizable associated types.
  • We have BidirectionalContainer and RandomAccessContainer replicating the old collection hierarchy. I decided against doing clever associatedtype BidirectionalIndex trickery to reduce the number of protocols; although that idea solves some vexing issues with conditional conformances, initial drafts indicated that it obfuscates things too much.
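The span-based iteration model described above can be sketched in miniature. This is a hypothetical simplification, not the PR's actual API: `nextChunk(maximumCount:)` and `startBorrowingIteration(from:)` are invented names, `UnsafeBufferPointer` stands in for `Span`, and elements are left copyable so the sketch compiles on current toolchains.

```swift
// Iteration hands out borrowed views over contiguous in-memory storage;
// an empty buffer signals the end of iteration.
protocol BorrowingIteratorProtocol {
    associatedtype Element
    mutating func nextChunk(maximumCount: Int) -> UnsafeBufferPointer<Element>
}

// Minimal concrete model: a container backed by one contiguous buffer.
// (The buffer is leaked in this toy; a real container manages its lifetime.)
struct TinyContainer<Element> {
    private let storage: UnsafeMutableBufferPointer<Element>

    init(_ elements: [Element]) {
        storage = .allocate(capacity: elements.count)
        _ = storage.initialize(from: elements)
    }

    struct Iterator: BorrowingIteratorProtocol {
        fileprivate var base: UnsafeBufferPointer<Element>
        fileprivate var position: Int

        mutating func nextChunk(maximumCount: Int) -> UnsafeBufferPointer<Element> {
            let end = Swift.min(position + maximumCount, base.count)
            defer { position = end }
            return UnsafeBufferPointer(rebasing: base[position..<end])
        }
    }

    // Iteration can start at an arbitrary index...
    func startBorrowingIteration(from index: Int) -> Iterator {
        Iterator(base: UnsafeBufferPointer(storage), position: index)
    }
    // ...and the index for an iterator's current position is recoverable.
    func index(at iterator: Iterator) -> Int { iterator.position }
}

let container = TinyContainer([1, 2, 3, 4, 5])
var it = container.startBorrowingIteration(from: 1)
var sum = 0
while true {
    let chunk = it.nextChunk(maximumCount: 2)
    if chunk.isEmpty { break }
    sum += chunk.reduce(0, +)   // visits 2, 3, 4, 5
}
```

The real protocols would suppress copyability (`~Copyable`) on both the container and its elements, which this sketch omits.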

This PR also adds some new types that conform to the new protocols:

  • The new HypoArray type is a noncopyable, heap-allocated, not implicitly resizing (but explicitly resizable) variant of Array that supports noncopyable elements. ("Hypo-" hints at "under" or "bottom". It pains me to capitalize the name like this; we may return to Hypoarray later.)
  • The new DynoArray type is a noncopyable, heap-allocated, implicitly resizing variant of Array that supports noncopyable elements. (This roughly corresponds to std::vector in C++.) ("Dyn-" hints at "force" or "power".)
  • The new HypoDeque and DynoDeque types implement the same variants for Deque.
  • For flat containers, the preexisting unsafe handle types are sort of equivalent to non-owning, unsafe variants of the hypo types. (HypoDeque is in fact directly using its unsafe handle as its storage representation.)
  • It may make sense to convert these to nonescapable, safe(r) constructs later, shared by all variants of the same data structure. (However, this nonescapable type cannot serve as storage representation, which is going to be annoying.) It may even make sense to make these nonescapable handle types public -- for example, in Array's case, this handle type is roughly the same as OutputSpan. However, Array is a special case; it is unclear if doing this would make sense for other data structures.
  • The preexisting Deque is reformulated to use the same unsafe handle type as HypoDeque and DynoDeque.
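The hypo/dyno distinction can be illustrated with a toy (not the PR's actual API; names, class-based storage, and copyable elements are all simplifications): a "hypo"-style array never grows on its own and must be resized explicitly, while a "dyno"-style array layers std::vector-like amortized growth on top.

```swift
// "Hypo": fixed capacity unless explicitly resized.
final class ToyHypoArray<Element> {
    private var storage: UnsafeMutableBufferPointer<Element>
    private(set) var count = 0
    var capacity: Int { storage.count }

    init(capacity: Int) {
        storage = .allocate(capacity: Swift.max(capacity, 1))
    }
    deinit {
        storage.baseAddress!.deinitialize(count: count)
        storage.deallocate()
    }

    // Appending past capacity is a programmer error, not a reallocation.
    func append(_ element: Element) {
        precondition(count < capacity, "full; call resize(to:) explicitly")
        (storage.baseAddress! + count).initialize(to: element)
        count += 1
    }

    // Resizing is an explicit operation that reallocates storage.
    func resize(to newCapacity: Int) {
        precondition(newCapacity >= count)
        let new = UnsafeMutableBufferPointer<Element>
            .allocate(capacity: Swift.max(newCapacity, 1))
        new.baseAddress!.moveInitialize(from: storage.baseAddress!, count: count)
        storage.deallocate()
        storage = new
    }
}

// "Dyno": implicit, amortized growth built on the explicit primitive.
final class ToyDynoArray<Element> {
    private let base = ToyHypoArray<Element>(capacity: 4)
    var count: Int { base.count }

    func append(_ element: Element) {
        if base.count == base.capacity {
            base.resize(to: base.capacity * 2)   // double when full
        }
        base.append(element)
    }
}

let dyno = ToyDynoArray<Int>()
for i in 0..<100 { dyno.append(i) }
```

In the PR itself, both layers are noncopyable structs rather than classes, and the hypo types are the shared building block for the implicitly growing ones.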

The immediate next item on my todo list is to investigate the last item's impact on Deque's performance.

Checklist

  • I've read the Contribution Guidelines
  • My contributions are licensed under the Swift license.
  • I've followed the coding style of the rest of the project.
  • I've added tests covering all new code paths my change adds to the project (if appropriate).
  • I've added benchmarks covering new functionality (if appropriate).
  • I've verified that my change does not break any existing tests or introduce unexplained benchmark regressions.
  • I've updated the documentation if necessary.

…xtracting(last:)

Also generalize some members to support ~Copyable elements
- protocol BorrowingIteratorProtocol
- protocol Container
- protocol BidirectionalContainer
- protocol RandomAccessContainer
- Bump to Xcode 16
- Update file lists
- Switch to Swift 6 language mode and enable a bunch of experimental features
@lorentey lorentey requested review from glessard and Azoy August 6, 2024 18:30

lorentey commented Aug 6, 2024

@swift-ci test


lorentey commented Aug 6, 2024

@swift-ci test

Instead, just have a regular class holding a HypoDeque. This requires two allocations per deque, but it makes representation a lot more straightforward.

I do not have a workable solution for empty singletons, though.

lorentey commented Aug 7, 2024

Welp, replacing ManagedBuffer had a significant performance impact. Comparing results between f64aded and 626e24f:

  Score   Sum     Improvements Regressions  Name
  7.286   0.137   1.000(#0)    0.1372(#76)  Deque<Int> prepend, reserving capacity (*)
  6.860   0.145   1.000(#0)    0.1458(#76)  Deque<Int> append, reserving capacity (*)
  5.731   0.174   1.000(#2)    0.1745(#74)  Deque<Int> removeFirst (discontiguous) (*)
  5.478   0.182   1.000(#3)    0.1825(#73)  Deque<Int> removeLast (discontiguous) (*)
  5.444   0.183   1.000(#2)    0.1837(#74)  Deque<Int> mutate through subscript (contiguous) (*)
  5.258   0.190   1.000(#3)    0.1902(#73)  Deque<Int> mutate through subscript (discontiguous) (*)
  5.031   0.198   1.000(#4)    0.1988(#72)  Deque<Int> removeLast (contiguous) (*)
  4.923   0.203   1.000(#3)    0.2031(#73)  Deque<Int> removeFirst (contiguous) (*)
  4.636   0.215   1.000(#2)    0.2157(#74)  Deque<Int> random swaps (contiguous) (*)
  4.470   0.223   1.000(#0)    0.2237(#76)  Deque<Int> append (*)
  4.141   0.241   1.000(#0)    0.2415(#76)  Deque<Int> prepend (*)
  3.814   0.262   1.000(#3)    0.2622(#73)  Deque<Int> random swaps (discontiguous) (*)
  2.649   0.377   1.033(#1)    0.3446(#75)  Deque<Int> sequential iteration (contiguous, indices) (*)
  2.558   0.390   1.000(#7)    0.3909(#69)  Deque<Int> partitioning around middle (discontiguous) (*)
  2.287   0.437   1.000(#10)   0.4372(#66)  Deque<Int> partitioning around middle (contiguous) (*)
  1.773   1.773   1.918(#13)   0.8553(#63)  Deque<Int> sequential iteration (discontiguous, iterator) (*)
  1.718   0.581   1.027(#25)   0.5551(#51)  Deque<Int> init from unsafe buffer (*)
  1.710   0.584   1.152(#5)    0.4329(#71)  Deque<Int> sequential iteration (discontiguous, indices) (*)
  1.567   0.638   1.000(#4)    0.6383(#72)  Deque<Int> sort (discontiguous) (*)
  1.401   1.401   1.510(#19)   0.8912(#57)  Deque<Int> sequential iteration (contiguous, iterator) (*)
  1.369   0.730   1.000(#4)    0.7304(#68)  Deque<Int> random insertions (*)
  1.267   0.789   1.000(#8)    0.7891(#64)  Deque<Int> random removals (discontiguous) (*)
  1.251   0.799   1.001(#21)   0.7979(#55)  Deque<Int> subscript get, random offsets (contiguous) (*)
  1.242   0.805   1.000(#10)   0.8049(#62)  Deque<Int> random removals (contiguous) (*)
  1.196   0.835   1.000(#23)   0.8356(#53)  Deque<Int> subscript get, random offsets (discontiguous) (*)
  1.062   0.941   1.000(#21)   0.9414(#55)  Deque<Int> sort (contiguous)

That 7.3x slowdown for prepend is incredibly high. Something has gone badly wrong, and it's worth investigating.

[Benchmark chart: Deque<Int> prepend, reserving capacity]

…age class

It turns out the major slowdowns in individual element add/remove operations were mostly caused by runtime exclusivity checking on `Deque._Storage._value`.
…unchecked:)

Also, use these to spare some unnecessary instructions.

lorentey commented Aug 8, 2024

The 7x performance regression was mostly caused by runtime exclusivity checks on Deque._Storage._value. Taking those out reduced the gap to ~30%, which is a lot more acceptable.
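The pattern that triggers these checks can be reproduced in miniature (a sketch with illustrative names; the actual Deque code differs): when a struct keeps its contents behind a class reference, every mutation of the class's stored property goes through swift_beginAccess/swift_endAccess at runtime, because the compiler cannot statically prove exclusive access to class storage.

```swift
final class _ToyStorage {
    // Each mutation of `_value` through a class reference is dynamically
    // checked for exclusivity unless that check is explicitly disabled.
    var _value: [Int] = []
}

struct MiniDeque {
    var storage = _ToyStorage()

    mutating func prepend(_ x: Int) {
        // A dynamic exclusivity check on `storage._value` fires on every
        // call here. Fixes include marking the property with an attribute
        // like `@exclusivity(unchecked)` or routing mutations through
        // addressors that bypass the check.
        storage._value.insert(x, at: 0)
    }
}

var deque = MiniDeque()
for i in 1...3 { deque.prepend(i) }
```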

[Benchmark chart: Deque<Int> prepend]

Compare this with the Deque<Int> prepend, reserving capacity results:

[Benchmark chart: Deque<Int> prepend, reserving capacity]

There is some new constant(ish) overhead that contributes to the trouble here; it also appears to affect append and Deque.init(UnsafeBufferPointer):

[Benchmark charts: Deque<Int> append, reserving capacity · Deque<Int> init from unsafe buffer]

The UBP initializer used to be as fast as Array's, but it can now be as much as ~8 times slower, until the effect disappears at ~8k items.

I'm not yet sure what this is. We're now doing two allocations per Deque instance, but this smells different. This is worth a look, too.


lorentey commented Aug 8, 2024

> We're now doing two allocations per Deque instance, but this smells different.

Maybe not -- these benchmarks were taken on a Linux VM, and I don't have a good instinct on the complexity of the default allocator there.
