-
Notifications
You must be signed in to change notification settings - Fork 10.6k
Avoid StringUTF16View dispatch overhead for some bridged String methods #83529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid StringUTF16View dispatch overhead for some bridged String methods #83529
Conversation
@swift-ci please test |
@swift-ci please Apple Silicon benchmark |
1 similar comment
@swift-ci please Apple Silicon benchmark |
That's very weird. The benchmark run consistently dies at -Onone only, the optimized ones pass. |
|
||
@inline(__always) | ||
final internal var utf16: String.UTF16View { | ||
String.UTF16View(_StringGuts(self)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Theoretically this should all inline out, but this skips going through the String initializer
_uncheckedBounds: (aRange.location, aRange.location+aRange.length)) | ||
let str = asString | ||
unsafe str._copyUTF16CodeUnits( | ||
unsafe utf16._nativeCopy( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of the theme of "move all the smarts into StringUTF16View.swift instead of having them scattered in 3-4 different files"
"String index is out of bounds") | ||
return unsafe UInt16((start + offset).pointee) | ||
} else { | ||
return utf16[nativeNonASCIIOffset: offset] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of the theme of "move all the smarts into StringUTF16View.swift instead of having them scattered in 3-4 different files"
return _foreignSubscript(position: idx) | ||
} | ||
|
||
internal subscript(nativeNonASCIIOffset offset: Int) -> UTF16.CodeUnit { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty much just pulling the relevant bits out of the code it used to go through
_precondition(alignedRange.lowerBound._encodedOffset < _guts.count && | ||
alignedRange.upperBound._encodedOffset < _guts.count, | ||
"String index is out of bounds") | ||
unsafe _nativeCopy( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should do a fast path in this for blocks of ascii (already have one for the entire thing being known-ascii), but I was having trouble getting it to generate reasonable looking code
} | ||
|
||
@inlinable | ||
@inlinable @inline(__always) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was actually generating a retain-release around this before?? ~18% overhead in my simple test
unsafe d.initialize(from: s, count: n) | ||
return (unsafe Iterator(_position: s + n, _end: s + count), n) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the remaining changes are kinda unrelated, but I was getting frustrated with trivial pointer operations generating calls
Local -characterAtIndex: benchmark gives ASCII: ~2s -> ~0.63s |
@swift-ci please test |
@swift-ci please Apple Silicon benchmark |
For a shorter (3 character) non-ASCII string we go from ~2s to ~1.25s, which indicates that most of the win is removing constant overhead, as expected. |
lol for ASCII we're now very slightly faster than NSCFString |
@swift-ci please test |
@swift-ci please Apple Silicon benchmark |
@swift-ci please benchmark |
I should look into what's up with StringWalk, but overall this is looking pretty decent
|
The Linux failure is funny. The expected error is moved to compile time so the test fails to build
|
@swift-ci please test |
@swift-ci please test |
@swift-ci please Apple Silicon benchmark |
let emptyAllocated = UnsafeMutablePointer<Float>.allocate(capacity: 0) | ||
defer { emptyAllocated.deallocate() } | ||
|
||
// Make sure we can't emit the error at compile time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder… does converting a runtime trap into an error like this count as a source break?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have converted runtime traps to compilation errors before, without considering them source breaks. What it amounts to is speeding up the discovery of an incorrect program to compilation time.
StringWalk appears to be extremely sensitive to minor inlining differences. I've filed a followup bug for the optimizer folks about a missed optimization that occurs both before and after, and I'm hoping will more than recover the difference. |
This removes a bunch of overhead on the UTF16 paths in String, as well as consolidating the complicated bits of the logic in one file.
Fixes rdar://157500258