Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5.0] DataProtocol ContiguousCollection and new inline Data #21292

Merged

Conversation

itaiferber
Copy link
Contributor

Cherry-pick of #20225
(cherry picked from commit d030354)

This is an implementation of DataProtocol and a new smaller and leaner struct Data. The major change here is that we now promise that all Data's are contiguous. In order to enforce this requisite we added a new protocol to define contiguous collections (aptly named ContiguousCollection). This protocol is a draft of the work in progress proposed API for the standard library. Additionally it offers a direct byte accessor if safe on sequence to unify the sequence initialization to allow for fast paths.

Data itself now is represented in 16 bytes(on 64 bit platforms). This allows it to be stored in existentials without needing to heap allocate. Additionally during this refactor to shrink the representation we were able to make similar wins to the small string optimization so that we can store Data's on the stack when under a platform limit. This makes small Data structures never have to heap allocate. For example: Data() has zero heap allocations and Data([0xd, 0xa]) wont heap allocate for the storage of the data either since small representations can be stored completely on the stack efficiently. Larger data or slices however will still have a reference type storing the backing allocation as they did before. Our hope is that this will reduce the traffic on malloc which will reduce memory fragmentation and be considerably more efficient for small buffers.

@itaiferber
Copy link
Contributor Author

Please test with following pull request:
apple/swift-lldb#1138

@swift-ci Please test

@itaiferber itaiferber force-pushed the 5.0-inline_data_and_dataprotocol branch from 856b5c8 to 4bbf44d Compare December 13, 2018 17:54
@itaiferber
Copy link
Contributor Author

I've gone ahead and incorporated the changes from #21290 as well here.

@itaiferber
Copy link
Contributor Author

Please test with following pull request:
apple/swift-lldb#1138

@swift-ci Please test

@swift-ci
Copy link
Collaborator

Build failed
Swift Test OS X Platform
Git Sha - 856b5c8e773d70fb2573a0963c0808ab74ee3738

@swift-ci
Copy link
Collaborator

Build failed
Swift Test Linux Platform
Git Sha - 856b5c8e773d70fb2573a0963c0808ab74ee3738

@itaiferber itaiferber force-pushed the 5.0-inline_data_and_dataprotocol branch from 4bbf44d to 482e87d Compare December 13, 2018 21:00
@itaiferber
Copy link
Contributor Author

Please test with following pull request:
apple/swift-lldb#1138

@swift-ci Please test

@swift-ci
Copy link
Collaborator

Build failed
Swift Test Linux Platform
Git Sha - 4bbf44d4d17927d5c95350670db45ed2b2761bbc

@swift-ci
Copy link
Collaborator

Build failed
Swift Test OS X Platform
Git Sha - 4bbf44d4d17927d5c95350670db45ed2b2761bbc

@eeckstein
Copy link
Member

@itaiferber Did someone review this already?

@itaiferber
Copy link
Contributor Author

itaiferber commented Dec 13, 2018

@eeckstein Not in a formal capacity yet, no.

@eeckstein
Copy link
Member

@airspeedswift Can you please review this?

@@ -90,7 +78,7 @@ internal final class _DataStorage {
}
}

@usableFromInline
@inlinable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this needs to be inlinable? It's not a generic function.

@inline(__always)
public init<S: Sequence>(_ elements: S) where S.Element == UInt8 {
// If the sequence is already contiguous, access the underlying raw memory directly.
if let contiguous = elements as? ContiguousBytes {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious: why is this needed? Are there sequences of UInt8s which cannot implement withContiguousStorageIfAvailable but can conform to ContiguousBytes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes: any sequence of UInt8's which does not represent typed memory cannot implement withContiguousStorageIfAvailable without handing over a copy, which defeats what we need to do here. Data and UnsafeRawBufferPointer are both examples of such sequences — both have an element type of UInt8, but represent raw memory, and thus cannot implement withContiguousStorageIfAvailable.

Copy link
Member

@airspeedswift airspeedswift left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments mostly are for future clean-up, though I would suggest inlinability could do with another pass to be certain it's justified everywhere. In some places the internals seem to be unnecessarily exposed in functions that don't seem critical to inline (apologies if experimentation has shown otherwise).

block(UnsafeBufferPointer(start: bytePtr, count: effectiveLength - regionAdjustment), region.location + regionAdjustment - _offset, &stopv)
if stopv {
stop.pointee = true
@inlinable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is pretty huge, and not generic, so seems like there's not a strong reason to inline it. Did you see a noticeable perf difference from marking it inlined? Also it seems like it's the only reason some other functions need to be inlinable/usableFromInline.

guard otherData.count > 0 else { return }
otherData.withUnsafeBytes {
append($0.baseAddress!, length: $0.count)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indenting nit

}

@_fixed_layout
public struct Data : ReferenceConvertible, Equatable, Hashable, RandomAccessCollection, MutableCollection, RangeReplaceableCollection, MutableDataProtocol, ContiguousBytes {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend breaking this up some time into individual extensions for each conformance. We've found this helps readability of the code a fair bit in the std lib.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree — I didn't want to take on the task right now, but we should do this in the future.

@inlinable
var count: Int {
get {
return numericCast(length)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No real need to use numericCast when you know both types in various places here. This should be Int(length). Being explicit about what you're converting to is clearer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change this too

return numericCast(length)
}
set(newValue) {
precondition(newValue <= MemoryLayout<Buffer>.size)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is internal, should it be an assert?

}

@inlinable
var hashValue: Int {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the goal here having Data's hashValue match NSData's? If not, you should pass the value through Swift's hashing mechanism to ensure it gets randomly seeded.

//===--- DataProtocol Extensions ------------------------------------------===//

extension DataProtocol {
public func firstRange<D: DataProtocol>(of data: D) -> Range<Index>? {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are generic and entirely protocol-based helpers so should be inlinable.

if range.upperBound == 0 {
self = .empty
} else if InlineData.canStore(count: range.upperBound) {
precondition(range.lowerBound <= endIndex, "index \(range.lowerBound) is out of bounds of \(startIndex)..<\(endIndex)")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I missing why the preconditions apply here but not below? Also could they be hoisted, they all seem the same.

#endif

#if DEPLOYMENT_RUNTIME_SWIFT

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar comment to other hashValue, ideally this would use Swift's randomization

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why GitHub decided to make my comment here about a deleted line instead of the code on the rhs...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weird, it looks ok on the code tab

var upperBound: Int { return range.upperBound }

@inlinable
var count: Int { return range.upperBound - range.lowerBound }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to replacerange.upperBound - range.lowerBound with range.count in a number of places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: this and range.isEmpty — we found during testing that because range does not implement count itself (but instead gets the implementation from a default implementation on Collection), calculating this ourselves was orders of magnitude faster than calling range.count (dispatching through to the protocol implementation added a lot of overhead), and this was making a big difference in our unit tests.

It does appear that Range implements isEmpty directly, so we should be able to fix .isEmpty above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you file a bug for the optimizer for that? We should try and ensure they're equivalent if possible. It might also be something that inlining too much is causing as the inliner might be giving up on a huge function that needs a specialized count the middle.

}

public func lastRange<D: DataProtocol>(of data: D) -> Range<Index>? {
return self.firstRange(of: data, in: self.startIndex ..< self.endIndex)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be lastRange

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

count = range.upperBound
}
Swift.withUnsafeMutableBytes(of: &bytes) { rawBuffer in
bzero(rawBuffer.baseAddress?.advanced(by: range.lowerBound), range.upperBound - range.lowerBound)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bzero is deprecated and doesn't exist on some platforms (e.g. windows). Use memset(3) instead.

http://pubs.opengroup.org/onlinepubs/009696699/functions/bzero.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation is only for Darwin platforms so bzero is fine, but I'll make this change so things are easier to copy over to swift-corelibs-foundation wholesale.

@karwa
Copy link
Contributor

karwa commented Dec 14, 2018

This is an interesting design and the performance improvements are welcome, but I'd nonetheless like to register my disappointment that this major change to the Swift API never went through swift-evolution.

@itaiferber
Copy link
Contributor Author

@karwa The changes here are Foundation API, and the corelibs process is separate from swift-evolution. If we decide at any point to integrate this work into the standard library directly, we'll certainly go through swift-evolution, just like everyone else.

@itaiferber
Copy link
Contributor Author

@airspeedswift Thanks for the review! I'm going to make some of the functional changes here that actually affect the implementation — as you mention, we can do the inlinability fixes in another pass very soon. We'd like to get the implementation landed first for testing and we have a little wiggle room to re-evaluate inlinability.

In general, we are aggressively inlining to give the compiler as much room as possible to make performance optimizations, which we've seen are largely necessary to make this work compelling. We can evaluate on a case-by-case basis if we see this having negative effects.

@airspeedswift
Copy link
Member

In general, we are aggressively inlining to give the compiler as much room as possible to make performance optimizations

A word of warning – this can backfire. Marking things inlinable sometimes slows code down unexpectedly too because it limits what assumptions the compiler can make about internal uses of the inlinable function. It also bloats user binary size.

@itaiferber
Copy link
Contributor Author

Please test with following pull request:
apple/swift-lldb#1138

@swift-ci Please test

@swift-ci
Copy link
Collaborator

Build failed
Swift Test Linux Platform
Git Sha - 482e87d

@swift-ci
Copy link
Collaborator

Build failed
Swift Test OS X Platform
Git Sha - 482e87d

@itaiferber
Copy link
Contributor Author

itaiferber commented Dec 14, 2018

@airspeedswift Do you have any further blocking concerns at this time? If not, we'll need to merge this and apple/swift-lldb#1138 in concert. (I can merge the lldb change but don't have permission to merge this.)

@airspeedswift
Copy link
Member

I'm OK to merge this. Can you add some tests for the parts that had bugs that didn't get test failures in a follow-up PR?


// ... and append the rest byte-wise.
while let element = iter.next() {
Swift.withUnsafeBytes(of: element) {
Copy link
Collaborator

@palimondo palimondo Jan 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a terrible performance pitfall to me... No benchmark currently covers this, but in my local test, this is 32x!!! slower than hitting the happy _copyContents path.

I see no API on _Representation to work with individual bytes, so this probably needs to be buffered locally here.

Copy link
Contributor Author

@itaiferber itaiferber Jan 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@palimondo Sorry for the delay in responding — just got back from vacation.
Just to clarify, do you mean by this comment that we should slurp all of the elements out of the iterator into an array, then append to _representation, or just that hitting this code path is slow? (Because the latter is somewhat expected — we've got a sequence here which is not ContiguousBytes, doesn't have an implementation of withContiguousStorageIfAvailable, and doesn't implement the _copyContents optimization, AKA, the worst-case scenario.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first. I don't think such slowdown is acceptable. Here's benchmark I got:


  BenchmarkInfo(name: "Data.append.Sequence.ExactCount",
    runFunction: { append($0*100,
      sequence: repeatElement(UInt8(0xA0), count: 809), to: medium) }, tags: d),
  BenchmarkInfo(name: "Data.append.Sequence.UnderestimatedCount",
    runFunction: { append($0*100,
      sequence: repeatElementSeq(809), to: medium) }, tags: d),



let repeatElementSeq = { count in
  return sequence(state: count) { (i: inout Int) -> UInt8? in
    defer { i = i &- 1 }; return i > 0 ? UInt8(0xA0) : nil
  }
}

@inline(never)
func append<S: Sequence>(_ N: Int, sequence: S, to data: Data)
  where S.Element == UInt8 {
  for _ in 1...N {
    var copy = data
    copy.append(contentsOf: sequence)
  }
}

Copy link
Collaborator

@palimondo palimondo Jan 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and just to clarify, we probably don't need to slurp all of the elements out of the iterator in one go, just amortize the costs of going over the UnsafeBufferPointer by slurping more elements in batches periodically append to the _representation. We could try reusing InlineData as the buffer, or test some moderately sized ContiguousArray

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@palimondo Sure, this is a reasonable change we can and should make. Are you interested in taking this on?

(We should make a similar change in append<S : Sequence>(_: S) as well.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not at the moment. I want to first finish my cleanup pass of the Swift Benchmark Suite, applying the legacyFactor.

Also, when I have played with this locally (I was println debugging Data implementation to see which branch get's exercised), I had trouble getting the build to work properly. Maybe it was just that most of the impl was inlined and the DataBenchmarks were't getting rebuilt? It was a mess.

So I cannot promise I could get back to this sooner than in two weeks...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@palimondo No worries, I'll take care of this, possibly in #21754

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants