Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Foundation] Convert Data’s SubSequence type to be Data #7155

Merged
merged 1 commit into from
Feb 15, 2017

Conversation

phausler
Copy link
Contributor

Data can encapsulate it’s own sub-sequence type by housing a range of the slice in the structural type for Data. By doing this it avoids the API explosion of supporting all APIs that take Data would need overloads to take a slice of Data. This does come at a small conceptual cost: any index based iteration should always account for the startIndex and endIndex of the Data (which was an implicit requirement previously by being a Collection). Moreover this prevents the requirement of O(n) copies of Data if it is never mutated while parsing sub sequences; so more than an API amelioration this also could offer a more effecient code-path for applications to use.

@phausler
Copy link
Contributor Author

@swift-ci please test

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test OS X Platform
Git Commit - bd14906ffcf2303f0e38b8c00e5ef5f3a0dba26c
Test requested by - @phausler

@phausler
Copy link
Contributor Author

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test Linux Platform
Git Commit - bd14906ffcf2303f0e38b8c00e5ef5f3a0dba26c
Test requested by - @phausler

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test OS X Platform
Git Commit - bd14906ffcf2303f0e38b8c00e5ef5f3a0dba26c
Test requested by - @phausler

@phausler
Copy link
Contributor Author

@jrose-apple Do you know if these errors are just transient and just need a re-build? or is it a failure from something I have changed?

******************** TEST 'Swift(macosx-x86_64) :: ClangImporter/MixedSource/broken-modules.swift' FAILED ********************
...
/Users/buildnode/jenkins/workspace/swift-PR-osx/branch-master/swift/test/ClangImporter/MixedSource/broken-modules.swift:37:11: error: expected string not found in input

@jrose-apple
Copy link
Contributor

They're failing on master too. We're trying to track them down now!

@phausler
Copy link
Contributor Author

ok, good to know - fyi this cannot be merged until it passes API review since it is modifying the types exposed by Foundation.

Copy link
Contributor

@dtorres dtorres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typos

/// - parameter repeatedValue: A byte to initialize the pattern
/// - parameter count: The number of bytes the data initially contains initialized to the repeatedValue
/// - parameter repeatedValue: A byte to initialze the pattern
/// - parameter count: The number of bytes the data initially contains initialzed to the repeatedValue
public init(repeating repeatedValue: UInt8, count: Int) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ This. why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The merge for this was a merge from swift-corelibs-foundation and the overlay to the temporary work, it looks like this was pulled in incorrectly; I will revert that back to the corrected spelling before merging (once we can verify functionality). Ideally we should keep all changes even spelling corrections in sync across swift-corelibs-foundation and the overlay.

@phausler phausler force-pushed the Data_as_own_SubSequence branch 2 times, most recently from 6c8446f to f8afeba Compare January 31, 2017 23:42
@phausler
Copy link
Contributor Author

@swift-ci please test

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test OS X Platform
Git Commit - 1e365fa6c9e101b9c9b184b54f29ee1fdb85308c
Test requested by - @phausler

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test Linux Platform
Git Commit - 1e365fa6c9e101b9c9b184b54f29ee1fdb85308c
Test requested by - @phausler

Data can encapsulate it’s own sub-sequence type by housing a range of the slice in the structural type for Data. By doing this it avoids the API explosion of supporting all APIs that take Data would need overloads to take a slice of Data. This does come at a small conceptual cost: any index based iteration should always account for the startIndex and endIndex of the Data (which was an implicit requirement previously by being a Collection). Moreover this prevents the requirement of O(n) copies of Data if it is never mutated while parsing sub sequences; so more than an API amelioration this also could offer a more effecient code-path for applications to use.
@phausler
Copy link
Contributor Author

phausler commented Feb 1, 2017

@swift-ci Please test

@swift-ci
Copy link
Contributor

swift-ci commented Feb 1, 2017

Build failed
Jenkins build - Swift Test Linux Platform
Git Commit - f8afebaa5218501f0096c5f88b1f7c1a13e0c5c0
Test requested by - @phausler

@swift-ci
Copy link
Contributor

swift-ci commented Feb 1, 2017

Build failed
Jenkins build - Swift Test OS X Platform
Git Commit - f8afebaa5218501f0096c5f88b1f7c1a13e0c5c0
Test requested by - @phausler

@@ -83,7 +83,7 @@ public final class _DataStorage {
var dest = dest_
var source = source_
var num = num_
if _DataStorage.vmOpsThreshold <= num && ((unsafeBitCast(source, to: Int.self) | unsafeBitCast(dest, to: Int.self)) & (NSPageSize() - 1)) == 0 {
if _DataStorage.vmOpsThreshold <= num && ((unsafeBitCast(source, to: Int.self) | Int(bitPattern: dest)) & (NSPageSize() - 1)) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems surprising to me that source and dest would be handled differently in this expression

if range.lowerBound == 0 && range.upperBound == _length {
return _DataStorage(mutableReference: d.mutableCopy() as! NSMutableData)
} else {
return _DataStorage(mutableReference: d.subdata(with: NSRange(location: range.lowerBound, length: range.count))._bridgeToObjectiveC().mutableCopy() as! NSMutableData)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is subdata(with:) guaranteed to return a struct Data if it's not overridden? Paying for allocating and throwing away an NSSubrangeData here would be unfortunate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless I thunk out to objc this is the only way to express it. I considered adding an additional thunk for this call but decided it probably wasn't worth it

return try work(d)
case .immutable(let d):
return try work(d)
if range.lowerBound == 0 && range.upperBound == _length {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this repeated pattern still correct if you get a subrange of a subrange? The initial subrange's bounds wouldn't be 0..<_length, then, no?

If I'm not off base there, maybe it'd be best to do if range == _sliceRange

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subrange composed with subrange is homomorphic to a subrange of the original

@@ -939,38 +907,47 @@ public struct Data : ReferenceConvertible, Equatable, Hashable, RandomAccessColl
/// - parameter count: The number of bytes to copy.
public init(bytes: UnsafeRawPointer, count: Int) {
_backing = _DataStorage(bytes: bytes, length: count)
_sliceRange = 0..<count
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This initialization gets repeated a lot. What about moving it to the declaration of _sliceRange, and just setting it differently in the few init()s where it's not 0..<count?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the declaration of _sliceRange needs to be initialized with the parameter count not the property count, so the initial value is dependent upon the initialized value of the backing

@@ -1280,9 +1276,10 @@ public struct Data : ReferenceConvertible, Equatable, Hashable, RandomAccessColl
public mutating func append<SourceType>(_ buffer : UnsafeBufferPointer<SourceType>) {
if buffer.count == 0 { return }
if !isKnownUniquelyReferenced(&_backing) {
_backing = _backing.mutableCopy()
_backing = _backing.mutableCopy(_sliceRange)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not something that needs to happen now, but there's an opportunity to make this more efficient by allocating the new backing at the right size initially, rather than copy + resize.

@@ -1298,9 +1295,6 @@ public struct Data : ReferenceConvertible, Equatable, Hashable, RandomAccessColl
public mutating func append<S : Sequence>(contentsOf newElements: S) where S.Iterator.Element == Iterator.Element {
let estimatedCount = newElements.underestimatedCount
var idx = count
if !isKnownUniquelyReferenced(&_backing) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale for removing this bit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was duplicate code actually to what will happen immediately after this in the mutation of count

_backing.replaceBytes(in: nsRange, with: buffer.baseAddress, length: bufferCount)
let resultingLength = currentLength - nsRange.length + bufferCount
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not the same as just asking for _backing.length again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, I thought this might be clearer about what is going on; that the mutation is about what is being done, not what has been done.

@@ -726,71 +758,6 @@ public final class _DataStorage {
}
}

public static func ==(_ lhs: _DataStorage, _ rhs: _DataStorage) -> Bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see where this code went; just the diff being confusing, or was this actually deleted?

indexType: Int.self,
indexDistanceType: Int.self,
indicesType: CountableRange<Int>.self)
}

DataTestSuite.test("Data SubSequence") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get tests for slices of slices too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkRandomAccessCollection creates slices of slices iirc, I would prefer to add more tests in the unit tests and not the validation; did you have a specific test in mind?
e.g. something to do with something like this? Data(bytes: [1, 2, 3, 4, 5, 6, 7, 8, 9])[0..<9][1..<4]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like that yeah. If it's already covered, no worries :)

@@ -678,34 +678,66 @@ public final class _DataStorage {
}

@inline(__always)
public func mutableCopy() -> _DataStorage {
public func mutableCopy(_ range: Range<Int>) -> _DataStorage {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm now wondering if this function is worth inlining; it is pretty large.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the inline case that is interesting here is the swift case, which is quite small. I wish there was a way to only inline part of the switch here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can check for the swift case in the inline, but call out to a non-inlined function for the rest?

@phausler phausler merged commit 525ffcc into swiftlang:master Feb 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants