New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[stdlib] [WIP] Add UnsafeRawBufferPointer.initialize(as:from:) #5718
[stdlib] [WIP] Add UnsafeRawBufferPointer.initialize(as:from:) #5718
Conversation
@swift-ci Please test |
Hi @atrick – as requested on your review of the other change. Thinking about it, I think the check of |
Build failed |
// TODO: Optimize where `C` is a `ContiguousArrayBuffer`. | ||
public func initializeMemory<S: Sequence>( | ||
as: S.Iterator.Element.Type, from source: S | ||
) -> (S.Iterator, Index) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is using labels in return tuples considered correct? When the types leave it ambiguous? Not knowing our conventions here, (unwritten: S.Iterator, bytesWritten: Index)
makes the signature a lot more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, you're right, that would be clearer. Though it's an index, so it'd be writtenUpTo:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bytesWritten is clearer to me than writtenUpTo. They're of course equivalent, but bytesWritten is totally ambiguous while writtenUpTo leaves lingering questions of inclusivity and what the unit is (elements or bytes?). It's clear in the docs, but won't be clear to someone reading some code that happens to use this function.
But I'll defer to someone who knows the API guidelines better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type is an Index
not a byte count. Maybe this is splitting hairs in this case because the index on a buffer is also the byte count... but it's not good to rely on that, because sometimes slices of collections end up not having zero-based indices.
let underestimate = source.underestimatedCount | ||
guard let base = self.baseAddress else { | ||
_precondition(underestimate == 0, "no memory available to initialize from source") | ||
return (source.makeIterator(),startIndex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: space after comma
|
||
for p in stride(from: base, | ||
// only advance to as far as the last element that will fit | ||
to: base + count - (count % MemoryLayout<S.Iterator.Element>.stride), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this equivalent to base + count
? This is basically rounding count down to the nearest multiple of stride, but since we're incrementing by stride we'll always stop at a multiple of stride. It doesn't randomly include to
if that's not a multiple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you need to stride over the starting points for the elements up to the last starting point that will fit another element. to: base+count
would stride up to the starting point of the last element that won't fit i.e. if it were 10 byte buffer and 8 byte elements, the last stride(from: 0, to: 10, by: 8)
returns 8 but you can't write another element there. I've no doubt this could be expressed more clearly than what I've written here tho.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right. base + count - stride + 1
? (really just want to avoid modulo)
quick check:
(count: 10, stride: 8 ) = 0..<3 = [0]
(count: 10, stride: 10) = 0..<1 = [0]
(count: 10, stride: 5 ) = 0..<6 = [0, 5]
(count: 11, stride: 6 ) = 0..<6 = [0]
(count: 10, stride: 4 ) = 0..<7 = [0, 4]
guard let x = it.next() else { break } | ||
p.initializeMemory(as: S.Iterator.Element.self, to: x) | ||
formIndex(&idx, offsetBy: MemoryLayout<S.Iterator.Element>.stride) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimizing this kind of loop well (so that it lowers to a memcpy when it should) has historically been a really big struggle. Slightly different because it reallocates, but this is the best version the Rust devs have been able to produce, if you're interested in aggressive micro-optimizations: https://doc.rust-lang.org/src/collections/up/src/libcollections/vec.rs.html#1550-1570
If we're serious about it, it might be worth even having an IRGen test verifying that e.g. passing Array<Int8>
to this lowers to memcpy. (not a blocker, just some idle thoughts)
formIndex(&idx, offsetBy: MemoryLayout<S.Iterator.Element>.stride) | ||
} | ||
|
||
return (it,idx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style: space after comma.
// the spare capacity of an Array buffer | ||
guard let x = it.next() else { break } | ||
p.initializeMemory(as: S.Iterator.Element.self, to: x) | ||
formIndex(&idx, offsetBy: MemoryLayout<S.Iterator.Element>.stride) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you use stride 3 times, I'd cache it into a variable, if only for code cleanliness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A variable I'll need to call something other than stride
:(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, stride function. Gross. elem(ent)Stride
?
@@ -294,7 +294,9 @@ public struct Unsafe${Mutable}RawPointer : Strideable, Hashable, _Pointer { | |||
/// - Postcondition: The `T` values at `self..<self + source.count * | |||
/// MemoryLayout<T>.stride` are initialized. | |||
/// | |||
/// TODO: Optimize where `C` is a `ContiguousArrayBuffer`. | |||
// TODO: Optimize where `C` is a `ContiguousArrayBuffer`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably if this shouldn't be used, optimizing it isn't interesting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair point!
defer { buffer.deallocate() } | ||
let source: [Int64] = [5,4,3,2,1] | ||
expectCrashLater() | ||
var (it,idx) = buffer.initializeMemory(as: Int64.self, from: source) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style: space after comma (and in the array, and in the above function)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh: is expectCrashLater()
capable of distinguishing an explicit assertion (safe) from a segfault (UB)?
var (it,idx) = buffer.initializeMemory(as: Int64.self, from: source) | ||
} | ||
|
||
|
||
// Directly test the byte Sequence produced by withUnsafeBytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should have tests verifying both of the null pointer behaviours.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should have a test verifying the case where the iterator is exhausted with slack space. (the final sequence is empty)
Possibly a test that verifies that the "exactly enough space" case works (off by ones will getcha!).
([5,4,3] as [Int64]).withUnsafeBytes { | ||
expectEqualSequence($0,buffer[0..<idx]) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this testing the situation where the underEstimate is actually too small? (Not sure how accurate stride is)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but that's allowed (it is an underestimate after all). Your point about testing exactly-rightly-sized buffers too is taken tho.
/// MemoryLayout<T>.stride` is bound to type `T`. | ||
/// | ||
/// - Postcondition: The `T` values at `self..<self + source.count * | ||
/// MemoryLayout<T>.stride` are initialized. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This postcondition isn't accurate any more, since count
isn't available on a sequence and the buffer only initializes items until it runs out of room—hard to figure out how to write this concisely.
@@ -348,6 +348,56 @@ public struct Unsafe${Mutable}RawBufferPointer | |||
return 0 | |||
} | |||
|
|||
% if mutable: | |||
/// Initializes memory in the buffer with the elements of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's crazy, but GYB control blocks shouldn't indent the Swift code they wrap.
@swift-ci Please test |
Build failed |
/// - Postcondition: The memory at `self[startIndex..<returned index] | ||
/// is bound to type `T`. | ||
/// | ||
/// - Postcondition: The `T` values at `self[startIndex..<returned index]` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it fair game to refer to initializedUpTo here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise looks great!
@@ -75,6 +75,59 @@ UnsafeRawBufferPointerTestSuite.test("initFromArray") { | |||
expectEqual(array2, array1) | |||
} | |||
|
|||
UnsafeRawBufferPointerTestSuite.test("initializeMemory(as:from:).underflow") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused by the underflow terminology here. This sequence still overflows the buffer right? Isn't it an overflow with underestimated count?
I don't see a test case for actual underflow!
Closing in order to raise a fresh PR combining this and the change to UnsafeMutablePointer. |
Along similar lines to the changes in this PR.
Deprecates
UnsafeRawPointer.initializeMemory(as:from:)
in favor ofUnsafeRawBufferPointer.initializeMemory(as:from:)
, which performs the same task but 1) takes a sequence rather than a collection, 2) has the precondition that the buffer be at least as big as to accommodateunderestimatedCount
, and 3) returns an iterator with the unwritten elements, as well as an index one past the last byte written.This is because calling
UnsafeRawPointer.initializeMemory(as:from:)
is unsafe to use in a generic context, as there is no way to guarantee that the appended collection'scount
isn't an overestimate.