Skip to content

Conversation

meg-gupta
Copy link
Contributor

@meg-gupta meg-gupta commented Sep 4, 2020

ARCSequenceOpts with loop support works on regions inside out. While processing an outer region, it uses summary from
the inner loop region to compute RefCountState transitions used for CodeMotionSafe optimization.

We do not compute the loop summary by iterating over it twice or iterating until a fixed point is reached.
Instead loop summary is just a list of refcount affecting instructions. And BottomUpRefCountState::updateForDifferentLoopInst and TopDownRefCountState::updateForDifferentLoopInst are used to compute the RefCountState transition when processing the inner loop region. These functions are very conservative and the flow sensitive nature of the summarized instructions do no matter.
This PR adds a verifying assert to confirm this.

@meg-gupta meg-gupta requested a review from atrick September 4, 2020 22:30
@meg-gupta
Copy link
Contributor Author

@swift-ci test

@meg-gupta
Copy link
Contributor Author

@swift-ci benchmark

@meg-gupta
Copy link
Contributor Author

@swift-ci Please Test Source Compatibility

@swift-ci
Copy link
Contributor

swift-ci commented Sep 4, 2020

Performance: -O

Regression OLD NEW DELTA RATIO
FlattenListFlatMap 5112 5592 +9.4% 0.91x (?)
EqualSubstringSubstring 35 38 +8.6% 0.92x (?)
EqualSubstringSubstringGenericEquatable 35 38 +8.6% 0.92x (?)
 
Improvement OLD NEW DELTA RATIO
NSStringConversion.UTF8 737 673 -8.7% 1.10x (?)
InsertCharacterEndIndex 143 133 -7.0% 1.08x (?)

Code size: -O

Performance: -Osize

Regression OLD NEW DELTA RATIO
Dictionary4 226 270 +19.5% 0.84x
Dictionary4OfObjects 311 359 +15.4% 0.87x
EqualSubstringSubstring 35 38 +8.6% 0.92x
EqualStringSubstring 35 38 +8.6% 0.92x (?)
EqualSubstringSubstringGenericEquatable 35 38 +8.6% 0.92x
EqualSubstringString 35 38 +8.6% 0.92x
 
Improvement OLD NEW DELTA RATIO
PrefixAnyCollection 163 147 -9.8% 1.11x
DropFirstAnySeqCRangeIter 163 147 -9.8% 1.11x
DropFirstAnySeqCntRange 163 147 -9.8% 1.11x
DropFirstAnySeqCRangeIterLazy 223 202 -9.4% 1.10x
DropFirstAnySeqCntRangeLazy 223 202 -9.4% 1.10x (?)
DropLastAnyCollection 64 58 -9.4% 1.10x (?)
DropFirstAnyCollection 163 148 -9.2% 1.10x (?)
PrefixAnySeqCRangeIterLazy 174 158 -9.2% 1.10x
PrefixAnySeqCntRangeLazy 174 158 -9.2% 1.10x
DropWhileAnySeqCRangeIterLazy 242 221 -8.7% 1.10x
DropWhileAnySeqCntRangeLazy 242 221 -8.7% 1.10x (?)
SuffixAnyCollection 58 53 -8.6% 1.09x
DropWhileAnyCollectionLazy 248 227 -8.5% 1.09x (?)
DropWhileAnyCollection 179 164 -8.4% 1.09x
PrefixWhileAnyCollectionLazy 172 158 -8.1% 1.09x (?)
InsertCharacterEndIndex 144 133 -7.6% 1.08x (?)
PrefixWhileAnyCollection 211 195 -7.6% 1.08x
StrComplexWalk 4640 4310 -7.1% 1.08x (?)
Data.append.Sequence.809B.Count.RE.I 100 93 -7.0% 1.08x (?)
StringWalk 2440 2280 -6.6% 1.07x (?)

Code size: -Osize

Regression OLD NEW DELTA RATIO
Suffix.o 20078 20371 +1.5% 0.99x

Performance: -Onone

Improvement OLD NEW DELTA RATIO
NSStringConversion.MutableCopy.UTF8 1369 1274 -6.9% 1.07x (?)

Code size: -swiftlibs

How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 8-Core Intel Xeon E5
  Processor Speed: 3 GHz
  Number of Processors: 1
  Total Number of Cores: 8
  L2 Cache (per Core): 256 KB
  L3 Cache: 25 MB
  Memory: 64 GB

#ifndef NDEBUG
// Verify updateForDifferentLoopInst is conservative enough that the flow
// sensitive native of the loop summarized instructions does not matter.
if (verifyARCLoopSummary) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this verification inside the loop over SummarizedInterestingInsts? Shouldn't it be outside? Otherwise it's unnecessarily quadratic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I fixed it.

@meg-gupta
Copy link
Contributor Author

@swift-ci test

ARCSequenceOpts with loop support works on regions inside out.
While processing an outer region, it uses summary from
the inner loop region to compute RefCountState transitions used for
CodeMotionSafe optimization.

We do not compute the loop summary by iterating over it twice or
iterating until a fixed point is reached.
Instead loop summary is just a list of refcount effecting instructions.
And BottomUpRefCountState::updateForDifferentLoopInst and
TopDownRefCountState::updateForDifferentLoopInst are used to compute the
RefCountState transition when processing the inner loop region. These
functions are very conservative and the flow sensitive nature of the
summarized instructions do no matter.
This PR adds a verifying assert to confirm this.
Copy link
Contributor

@atrick atrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. But I think you should kick off another build of the tests, benchmarks and SCK after the latest change in case that changes the semantics.

@meg-gupta
Copy link
Contributor Author

@swift-ci test

@meg-gupta
Copy link
Contributor Author

@swift-ci Please Test Source Compatibility

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - e1cc16eb5f3b8f2918fbae9e9100f041c2f03aa3

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - e1cc16eb5f3b8f2918fbae9e9100f041c2f03aa3

@meg-gupta
Copy link
Contributor Author

Errors in sck are all known SILGen bugs.

@meg-gupta
Copy link
Contributor Author

Removed commit for testing

@meg-gupta
Copy link
Contributor Author

@swift-ci smoke test

@meg-gupta
Copy link
Contributor Author

@swift-ci smoke test Linux platform

@meg-gupta
Copy link
Contributor Author

@swift-ci smoke test OSX platform

@meg-gupta
Copy link
Contributor Author

@swift-ci smoke test OS X platform

1 similar comment
@meg-gupta
Copy link
Contributor Author

@swift-ci smoke test OS X platform

@meg-gupta meg-gupta merged commit 581f611 into swiftlang:master Sep 15, 2020
dabrahams pushed a commit that referenced this pull request Sep 15, 2020
* 'master' of github.com:apple/swift:
  [docs] Fix broken links in the documentation index (#33829)
  [Profiler] Increment closure body count prior to the prolog (#33946)
  ARCSequenceOpts: Add LoopSummary verifier (#33810)
  [build-script] Cleanup source directory layout help
  Address review comment.
  ABI checker: when invoking via build system, explicitly mention ABI breakge in diagnostics
  stdlib: Remove unused, unsafe helper function _withUninitializedString (#33704)
  test: Replace _silgen_name w/ _cdecl in CommandLineStressTest
  [AutoDiff] [Docs] Clarify 'Differentiable' derived conformances conditions.
  Revert "build-script: remove dead CMake options for Swift"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants