-
Notifications
You must be signed in to change notification settings - Fork 10.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[stdlib] performance optimizations in Array.replaceSubrange
#66160
Conversation
@swift-ci please test |
1 similar comment
@swift-ci please test |
@swift-ci please benchmark Significant changes with the original commit: Performance (x86_64): -O
Code size: -O
Performance (x86_64): -Osize
Code size: -Osize
Performance (x86_64): -Onone
Code size: -swiftlibs |
@swift-ci please benchmark |
Adding the extra branches to skip Performance (x86_64): -O
Code size: -O
Performance (x86_64): -Osize
Code size: -Osize
Performance (x86_64): -Onone
|
@swift-ci please benchmark Another commit, another set of benchmarks: Performance (x86_64): -O
Code size: -O
Performance (x86_64): -Osize
Code size: -Osize
Performance (x86_64): -Onone
|
// place the values into the hole we created | ||
if newCount > 0 { | ||
let done: Void? = newValues.withContiguousStorageIfAvailable { | ||
holeStart.initialize(from: $0.baseAddress!, count: newCount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's so annoying that withContiguousStorageIfAvailable is allowed to pass a buffer with a nil baseAddress rather than just returning nil without executing the closure in that case. Otherwise we could use .unsafelyUnwrappedUnchecked here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Actually what IS supposed to happen in that case? Should it just be the same as if newValues was empty?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, it says that if there is no contiguous representation, it should ignore body and just return nil?
and in this case, newValues.count
/ newCount
should be the same (and from above, > 0
) for a well-implemented collection?
I'm not sure if its worth throwing in an extra assert for $0.count == newCount, or removing !
for .unsafelyUnwrappedUnchecked
either.
@swift-ci please benchmark |
@swift-ci please benchmark |
@swift-ci please benchmark Apple Silicon |
@swift-ci Please Apple Silicon benchmark The AS results are less noisy, but also highlight a different regression (???) Performance (arm64): -O
Code size: -O
Performance (arm64): -Osize
Code size: -Osize
Performance (arm64): -Onone
|
@swift-ci please benchmark |
@swift-ci please apple silicon benchmark |
[removed one of the three branches that could have potentially introduced a regression in the Suffix benchmarks - working on better understanding for optimizer behavior on the other two to potentially fix the regression] |
@swift-ci please benchmark |
@swift-ci please apple silicon benchmark |
@swift-ci please benchmark |
@swift-ci please apple silicon benchmark |
@swift-ci please benchmark |
@swift-ci please apple silicon benchmark |
Test again, same results: Performance (arm64): -O
Code size: -O
Performance (arm64): -Osize
Code size: -Osize
Performance (arm64): -Onone
Code size: -swiftlibsHow to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
extension CollectionOfOne: Sequence { | ||
// `mutating` since `withUnsafePointer` needs inout. | ||
@inlinable // trivial-implementation | ||
public mutating func withContiguousStorageIfAvailable<R>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll need to get rid of this for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
@swift-ci please test |
1 similar comment
@swift-ci please test |
@swift-ci please test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
@swift-ci Please clean test Linux platform |
I haven't been able to reproduce the test failures locally on a Ubuntu 20.04 Intel container that ran the same It doesn't appear to be flakiness, since multiple runs on multiple platforms had the test pass locally but the test still fails in CI. |
@swift-ci please test |
I think something in the past ~half year fixed it - I rebased onto |
@swift-ci please test EDIT: I finally know what tests are failing in release mode and why:
|
435a333
to
08586c5
Compare
Add an explanation about why UMBP.initialize(fromContentsOf:) can't be used.
@swift-ci please test |
@swift-ci please benchmark |
@swift-ci Please Apple Silicon benchmark |
new performance results are in! the Suffix* benchmarks no longer have major regressions, but there are a few things I might want to investigate: Performance (x86_64): -O
Code size: -O
Performance (x86_64): -Osize
Code size: -Osize
Performance (x86_64): -Onone
Code size: -swiftlibs
|
Oh, Apple Silicon benchmarks have a clearer cut case: Performance (arm64): -O
Code size: -O
Performance (arm64): -Osize
Code size: -Osize
Performance (arm64): -Onone
Code size: -swiftlibs
|
Yeah these results look great :) nicely done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once again: 👍🏽
This PR is intended to generally improve performance for replaceSubrange.
Currently, it simplifies the main codepath to avoid unneeded branches.
Additional plans: