Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Foundation] Performance improvements for IndexPath bridging and comparison #9339

Merged
merged 4 commits into from May 6, 2017

Conversation

phausler
Copy link
Member

@phausler phausler commented May 5, 2017

The previous implementation of IndexPath would cause a malloc of the underlying array buffer upon bridging from ObjectiveC. This impacts graphical APIs (such as UICollectionView or AppKit equivalents) when calling delegation patterns. Since IndexPath itself can be a tagged pointer and most often just a pair of elements it can be represented as an enum of common cases. Those common cases of empty, single, or pair can be represented respectively as no associated value, a single Int, and a tuple of Ints. These cases will be exclusively stack allocations, which is markably faster than the allocating code-path. IndexPaths that have a count greater than 2 will still fall into the array storage case. As an added performance benefit, accessing count and subscripting is now faster by aproximately 30% due to more tightly coupled inlining potential under whole module optimizations. Accessing count is also faster since it has better cache-line effeciency (lesson learned: the branch predictor is more optimized than pointer indirection chasing).

Benchmarks performed on x86_64, arm and arm64 still pending results but should be applicable across the board.

Resolves the following issues:
https://bugs.swift.org/browse/SR-3655
https://bugs.swift.org/browse/SR-2769

Resolves the following radars:
rdar://problem/28207534
rdar://problem/28209456

This was re-opened since the CI server decided it didn't like the previous PR

…ions

The previous implementation of IndexPath would cause a malloc of the underlying array buffer upon bridging from ObjectiveC. This impacts graphical APIs (such as UICollectionView or AppKit equivalents) when calling delegation patterns. Since IndexPath itself can be a tagged pointer and most often just a pair of elements it can  be represented as an enum of common cases. Those common cases of empty, single, or pair can be represented respectively as no associated value, a single Int, and a tuple of Ints. These cases will be exclusively stack allocations, which is markably faster than the allocating code-path. IndexPaths that have a count greater than 2 will still fall into the array storage case. As an added performance benefit, accessing count and subscripting is now faster by aproximately 30% due to more tightly coupled inlining potential under whole module optimizations. Accessing count is also faster since it has better cache-line effeciency (lesson learned: the branch predictor is more optimized than pointer indirection chasing).

Benchmarks performed on x86_64, arm and arm64 still pending results but should be applicable across the board.

Resolves the following issues:
https://bugs.swift.org/browse/SR-3655
https://bugs.swift.org/browse/SR-2769

Resolves the following radars:
rdar://problem/28207534
rdar://problem/28209456
@phausler
Copy link
Member Author

phausler commented May 5, 2017

@swift-ci please test

@phausler
Copy link
Member Author

phausler commented May 5, 2017

@swift-ci please test

@swift-ci
Copy link
Collaborator

swift-ci commented May 5, 2017

Build failed
Jenkins build - Swift Test Linux Platform
Git Commit - 823c498
Test requested by - @phausler

@swift-ci
Copy link
Collaborator

swift-ci commented May 5, 2017

Build failed
Jenkins build - Swift Test OS X Platform
Git Commit - 823c498
Test requested by - @phausler

@phausler
Copy link
Member Author

phausler commented May 6, 2017

StringIndex should not have any interaction with IndexPath

[ RUN      ] StringTests.ForeignIndexes/UnexpectedCrash (XFAIL: [Always(reason: <rdar://problem/18029290> String.Index caches the grapheme cluster size, but it is not always correct to use)])
stderr>>> fatal error: Can't form a Character from a String containing more than one extended grapheme cluster: file /Users/buildnode/jenkins/workspace/swift-PR-osx/branch-master/swift/stdlib/public/core/Character.swift, line 183
stderr>>> Current stack trace:
stderr>>> 0    libswiftCore.dylib                 0x000000010ad0e930 reportNow(unsigned int, char const*) + 125
stderr>>> 1    libswiftCore.dylib                 0x000000010ad587b0 _swift_stdlib_reportFatalErrorInFile + 100
stderr>>> 2    libswiftCore.dylib                 0x000000010aa3b2a0 closure #1 in closure #1 in closure #1 in _assertionFailure(_:_:file:line:flags:) + 320
stderr>>> 3    libswiftCore.dylib                 0x000000010aa3b270 closure #1 in closure #1 in closure #1 in _fatalErrorMessage(_:_:file:line:flags:) + 35
stderr>>> 4    libswiftCore.dylib                 0x000000010acece70 partial apply for closure #1 in closure #1 in closure #1 in _fatalErrorMessage(_:_:file:line:flags:) + 97
stderr>>> 5    libswiftCore.dylib                 0x000000010aa37570 specialized StaticString.withUTF8Buffer<A>(_:) + 204
stderr>>> 6    libswiftCore.dylib                 0x000000010acecdb0 partial apply for closure #1 in closure #1 in _assertionFailure(_:_:file:line:flags:) + 146
stderr>>> 7    libswiftCore.dylib                 0x000000010acecd90 partial apply for closure #1 in closure #1 in _fatalErrorMessage(_:_:file:line:flags:) + 23
stderr>>> 8    libswiftCore.dylib                 0x000000010aa37570 specialized StaticString.withUTF8Buffer<A>(_:) + 204
stderr>>> 9    libswiftCore.dylib                 0x000000010ac56660 partial apply for closure #1 in _fatalErrorMessage(_:_:file:line:flags:) + 159
stderr>>> 10   libswiftCore.dylib                 0x000000010ac56600 partial apply for closure #1 in _fatalErrorMessage(_:_:file:line:flags:) + 23
stderr>>> 11   libswiftCore.dylib                 0x000000010aa37570 specialized StaticString.withUTF8Buffer<A>(_:) + 204
stderr>>> 12   libswiftCore.dylib                 0x000000010aa37470 _fatalErrorMessage(_:_:file:line:flags:) + 126
stderr>>> 13   libswiftCore.dylib                 0x000000010ac32930 specialized String.subscript.getter + 1523
stderr>>> 14   libswiftCore.dylib                 0x000000010ab25a40 String.subscript.getter + 9
stderr>>> 15   a.out                              0x000000010a991410 closure #9 in  + 168
stderr>>> 16   libswiftStdlibUnittest.dylib       0x000000010b991e40 specialized TestSuite._runTest(name:parameter:) + 550
stderr>>> 17   libswiftStdlibUnittest.dylib       0x000000010b918150 _childProcess() + 1614
stderr>>> 18   libswiftStdlibUnittest.dylib       0x000000010b91f070 runAllTests() + 688
stderr>>> 19   a.out                              0x000000010a985560 main + 36603
stderr>>> 20   libdyld.dylib                      0x00007fff8f2c4254 start + 1
stderr>>> CRASHED: SIGILL

@phausler
Copy link
Member Author

phausler commented May 6, 2017

@swift-ci please test

@swift-ci
Copy link
Collaborator

swift-ci commented May 6, 2017

Build failed
Jenkins build - Swift Test OS X Platform
Git Commit - 2bc7b6f
Test requested by - @phausler

@swift-ci
Copy link
Collaborator

swift-ci commented May 6, 2017

Build failed
Jenkins build - Swift Test Linux Platform
Git Commit - 2bc7b6f
Test requested by - @phausler

@phausler phausler merged commit ffc594b into apple:master May 6, 2017
let totalBits = MemoryLayout<Int>.size * 8
let lengthBits = 8
let firstIndexBits = (totalBits - lengthBits) / 2
return count + (first << lengthBits) + (last << (lengthBits + firstIndexBits))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this have been:

return count &+ (first << lengthBits) &+ (last << (lengthBits + firstIndexBits))

i.e. if the first/last value are large, you can overflow.

phausler added a commit to phausler/swift that referenced this pull request May 16, 2017
…arison (apple#9339)

* [Foundation] Refactor the backing of IndexPath to favor stack allocations

The previous implementation of IndexPath would cause a malloc of the underlying array buffer upon bridging from ObjectiveC. This impacts graphical APIs (such as UICollectionView or AppKit equivalents) when calling delegation patterns. Since IndexPath itself can be a tagged pointer and most often just a pair of elements it can  be represented as an enum of common cases. Those common cases of empty, single, or pair can be represented respectively as no associated value, a single Int, and a tuple of Ints. These cases will be exclusively stack allocations, which is markably faster than the allocating code-path. IndexPaths that have a count greater than 2 will still fall into the array storage case. As an added performance benefit, accessing count and subscripting is now faster by aproximately 30% due to more tightly coupled inlining potential under whole module optimizations. Accessing count is also faster since it has better cache-line effeciency (lesson learned: the branch predictor is more optimized than pointer indirection chasing).

Benchmarks performed on x86_64, arm and arm64 still pending results but should be applicable across the board.

Resolves the following issues:
https://bugs.swift.org/browse/SR-3655
https://bugs.swift.org/browse/SR-2769

Resolves the following radars:
rdar://problem/28207534
rdar://problem/28209456

* [Foundation] remove temp IndexPath hashing that required bridging to ref types

* [Foundation] IndexPath does not guarentee hashing to be the same as objc

# Conflicts:
#	stdlib/public/SDK/Foundation/IndexPath.swift
tkremenek added a commit that referenced this pull request May 17, 2017
[Foundation] Performance improvements for IndexPath bridging and comparison (#9339)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants