Fix KnownSafety optimization bugs in ARCSequenceOpts #33722

meg-gupta · 2020-08-31T18:25:36Z

ARCSequenceOpts does KnownSafety optimization where 'KnownSafe' retain/release pairs are deleted.
A retain is 'KnownSafe' if during top down dataflow, it is preceded by another retain or non-arc use on the same RCIdentity.
A release is 'KnownSafe' if during bottom up dataflow, it is preceded by another release or non-arc use on the same RCIdentity.
This PR fixes the following two bugs in KnownSafety optimization.

a) KnownSafety bug in ARCSequenceOpts without loop support

1: retain_value S
2: retain_value S

3: strong_release S.subobject
4: strong_retain S.subobject
5: release_value S
6: release_value S

2 and 5 are marked as 'KnownSafe'. And currently they get paired and optimized away. 2 and 5 pairing is incorrect, this can be seen clearly expanding retain_value and release_value.

1e: strong_retain S.subobject
2e: strong_retain S.subobject
3: strong_release S.subobject
4: strong_retain S.subobject
5e: strong_release S.subobject
6e: strong_release S.subobject

The correct pairings are 2e and 3, 4 and 5e. Since RCIdentity distinguishes between S and S.subobject we can end
up with incorrect pairings and optimization with KnownSafety. This is true for any aliasing RCIdentities.

b) KnownSafety bug in ARCSequenceOpts with loop support

bb0:
   1: strong_retain S
   2: strong_retain S
loop:
    3: strong_release S
    4: strong_retain S
    5: cond_br undef, loop, bb2
bb2:
   6: strong_release S
   7: strong_release S

2 and 6 are marked as 'KnownSafe'. And currently get paired and optimized away. This is incorrect and the correct pairings are 2 and 3 and 4 and 6. ARCLoopOpts computes loop summaries and uses it to only compute refcount states for CodeMotionSafe optimization in ARCSequenceOpts. The instructions in the loop have no effect on the already computed 'KnownSafe' states outside the loop.

With this PR, while doing bottom up dataflow, if we encounter an unmatched retain instruction, that can pair with a 'KnownSafe' already visited release instruction, we turn off KnownSafety if the two
RCIdentities mayAlias.
Similarly, during top down dataflow, if we encounter an unmatched release instruction, that can pair with a 'KnownSafe' already visited retain instruction, we turn off KnownSafety if the two RCIdentities mayAlias. This fixes the bugs in both ARCSequenceOpts without loop support and with loop support.

Fixes rdar://67720198 and rdar://27775427

gottesmm

There is a bunch of refactoring going on here. Can you split that out in a separate PR and land that separately?

gottesmm · 2020-08-31T18:58:10Z

lib/SILOptimizer/ARC/ARCRegionState.cpp

+      SetFactory);
+
+  auto II = this->summarizedinterestinginsts_rbegin();
+  auto IE = this->summarizedinterestinginsts_rend();


Please remove the usage of this.

gottesmm · 2020-08-31T19:08:15Z

Just reading the description, this is incorrect. We only require bottom up that there is a post-dominating use that requires liveness that will not be removed.

meg-gupta · 2020-08-31T20:06:02Z

@gottesmm I edited the description of 'KnownSafety' to also include non-arc use.

meg-gupta · 2020-09-01T21:59:10Z

@swift-ci test

meg-gupta · 2020-09-01T21:59:25Z

@swift-ci benchmark

swift-ci · 2020-09-01T22:02:08Z

Build failed
Swift Test Linux Platform
Git Sha - 1540e28ff8a4584406b30d44195490ff11bea9ef

swift-ci · 2020-09-01T22:03:02Z

Build failed
Swift Test OS X Platform
Git Sha - 1540e28ff8a4584406b30d44195490ff11bea9ef

swift-ci · 2020-09-01T23:01:38Z

Performance: -O

Regression	OLD	NEW	DELTA	RATIO
FlattenListFlatMap	2556	3859	+51.0%	0.66x (?)
UTF8Decode_InitFromCustom_contiguous	143	169	+18.2%	0.85x
UTF8Decode_InitDecoding	143	168	+17.5%	0.85x (?)
ArrayLiteral2	67	78	+16.4%	0.86x
Set.isStrictSubset.Int.Empty	45	52	+15.6%	0.87x (?)
Set.isSubset.Int.Empty	43	49	+14.0%	0.88x (?)
Set.isStrictSubset.Seq.Int.Empty	108	120	+11.1%	0.90x (?)
BucketSort	137	148	+8.0%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
ObjectiveCBridgeStubDateAccess	152	130	-14.5%	1.17x (?)
SuffixAnySequence	1691	1572	-7.0%	1.08x (?)

Code size: -O

Performance: -Osize

Regression	OLD	NEW	DELTA	RATIO
Set.subtracting.Empty.Box	7	9	+28.6%	0.78x
UTF8Decode_InitFromCustom_contiguous	143	169	+18.2%	0.85x (?)
UTF8Decode_InitDecoding	145	168	+15.9%	0.86x
Set.isStrictSubset.Int.Empty	47	53	+12.8%	0.89x (?)
DataCreateEmptyArray	800	900	+12.5%	0.89x (?)
StringUTF16Builder	210	230	+9.5%	0.91x (?)
UTF8Decode_InitFromCustom_noncontiguous	263	288	+9.5%	0.91x (?)
Set.isStrictSubset.Seq.Int.Empty	112	122	+8.9%	0.92x (?)
ArrayLiteral2	98	106	+8.2%	0.92x (?)

Improvement	OLD	NEW	DELTA	RATIO
FlattenListFlatMap	2903	2466	-15.1%	1.18x (?)
RemoveWhereFilterInts	26	24	-7.7%	1.08x (?)
DistinctClassFieldAccesses	200	186	-7.0%	1.08x (?)
Chars2	3650	3400	-6.8%	1.07x (?)
StringBuilderLong	1060	990	-6.6%	1.07x (?)

Code size: -Osize

Performance: -Onone

Regression	OLD	NEW	DELTA	RATIO
UTF8Decode_InitDecoding	154	174	+13.0%	0.89x (?)
UTF8Decode_InitFromCustom_contiguous	157	177	+12.7%	0.89x
DictionaryKeysContainsCocoa	27	30	+11.1%	0.90x (?)

Improvement	OLD	NEW	DELTA	RATIO
SubstringFromLongString	10	9	-10.0%	1.11x (?)

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: 6-Core Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

atrick

I'm still trying to prove to myself it's correct and efficient.

I was initially worried that this would fail to optimize a more "normal" case like this:

1: retain_value S
2: retain_value S
3: strong_retain S.subobject
4: strong_release S.subobject
5: release_value S
6: release_value S

Maybe that case will actually sill be optimized with your change? Anyway, I won't argue strongly that this pass needs to handle all cases since SemanticARCOpts can handle it eventually.

Will we correctly handle a case like this?

1: retain_value S
2: retain_value S
3: strong_release S.subobject
4: release_value S
5: strong_retain S.subobject
6: release_value S

meg-gupta · 2020-09-02T00:55:35Z

@atrick We will still optimize the "normal" case. Test $matched_rr_subobject in arcsequenceopts_knownsafetybugs.sil has this pattern.

meg-gupta · 2020-09-02T00:59:13Z

In:

1: retain_value S
2: retain_value S
3: strong_release S.subobject
4: release_value S
5: strong_retain S.subobject
6: release_value S

2 and 4 get paired and optimized away due to CodeMotionSafe. This fix turns off KnownSafety on 2 when it sees 3.

atrick · 2020-09-02T19:38:24Z

I'm trying to understand why this known safety fix is looking for retains. Let's say code-motion safe doesn't apply. So what about this?:

1: retain_value S
2: retain_value S
3: strong_release S.subobject
4: use(S)
5: release_value S
6: strong_retain S.subobject
7: release_value S

meg-gupta · 2020-09-02T23:28:27Z

@atrick Similar to previous example, 3 turns off KnownSafety on 2. And nothing gets optimized if we consider only 'KnownSafety'.
In bottom up traversal, the fix only looks at unmatched retains that could potentially match with already visited KnownSafe RCIdentity.

atrick · 2020-09-03T01:21:59Z

For efficiency, RefCountInstToMatchedMap can just be a set of unmatched instructions.

atrick · 2020-09-03T01:23:11Z

When we summarize the loop in the RefCountInstToMatchedMap. It assumes that the IncToDecStateMap finds any unmatched pair on any paths through the loop.

LoopARCSequenceDataflowEvaluator::mergeSuccessors clears the data flow state at any non-local successor, so it should be conservative.

This only time this doesn't happen is with "allowsleaks" exist, which reach an unreachable before any side-effect instructions.

To be safe, add tests cases for when the release is on the loop tail; add a test:

loop:
  retain
  cond_br exit
  release
  br loop

loop:
  retain
  cond_br exit-to-unreachable
  release
  br loop

atrick · 2020-09-03T01:23:53Z

When computing the loop summary, the code assumes that dataflow can be applied to each "interesting" instruction in order to produce a summary. In general this is not sound because loops iterate and the order is meaningless. This means that we're relying on updateForDifferentLoopInst to apply data flow so conservatively that the order doesn't matter. It's not clear that's actually happening. We should clearly comment that requirement and verify the assumption under some verification flag:

In ARCRegionState::processLoopBottomUp

  // For each state that we are currently tracking, apply our summarized
  // instructions to it.
  for (auto &OtherState : getBottomupStates()) {
    if (!OtherState.hasValue())
      continue;

    for (auto *I : State->getSummarizedInterestingInsts()) {
      OtherState->second.updateForDifferentLoopInst(I, AA);
      OtherState->second.checkAndResetKnownSafety(
          I, OtherState->first, checkIfRefCountInstIsMatched, RCIA, AA);
    }
#ifndef NDEBUG
    if (verifyLoopARCSummary) {
      auto newOtherState = OtherState
      for (auto *I : State->getSummarizedInterestingInsts()) {
        OtherState->second.updateForDifferentLoopInst(I, AA);
        OtherState->second.checkAndResetKnownSafety(
            I, OtherState->first, checkIfRefCountInstIsMatched, RCIA, AA);
      }
      assert(newOtherState == OtherState);
    }
#endif
  }

atrick

I asked for some test cases and comments. Otherwise this looks good to me. Thanks!

Regarding the loop data-flow verification, that's a separate issue from this PR so can be done in a follow-up.

While doing bottom up dataflow, if we encounter an unmatched retain instruction, that can pair with a 'KnownSafe' already visited release instruction, we turn off KnownSafety if the two RCIdentities mayAlias. This is done in BottomUpRefCountState::checkAndResetKnownSafety. In order to determine if a retain is umatched, we look at IncToDecStateMap. If a retain was matched during bottom up dataflow, it is always found in IncToDecStateMap with value of the matched release's BottomUpRefCountState. Similarly, during top down dataflow, if we encounter an unmatched release instruction, that can pair with a 'KnownSafe' already visited retain instruction, we turn off KnownSafety if the two RCIdentities mayAlias. This is done in TopDownRefCountState::checkAndResetKnownSafety. In order to determine if a release is umatched, we look at DecToIncStateMap. If a release was matched during top down dataflow, it is always found in DecToIncStateMap with value of the matched retain's TopDownRefCountState. For ARCLoopOpts, during bottom up and top down traversal of a region with a nested loop, we find if the retain/release in the loop summary was matched or not by looking at the persistent RefCountInstToMatched map. This map is populated when processing the nested loop region from the IncToDecStateMap/DecToStateMap which gets thrown away after the loop region is processed. This fixes the bugs in both ARCSequenceOpts without loop support and with loop support.

meg-gupta · 2020-09-03T08:19:43Z

@swift-ci test

meg-gupta · 2020-09-03T08:20:44Z

@swift-ci Please Test Source Compatibility

swift-ci · 2020-09-03T08:22:42Z

Build failed
Swift Test Linux Platform
Git Sha - 52bb0ef94cf8633a0c6c1120a751dfdfbcd90e61

swift-ci · 2020-09-03T08:23:02Z

Build failed
Swift Test OS X Platform
Git Sha - 52bb0ef94cf8633a0c6c1120a751dfdfbcd90e61

meg-gupta · 2020-09-03T17:31:27Z

@swift-ci Please Test Source Compatibility

meg-gupta · 2020-09-03T22:20:35Z

Addressed review comments.

meg-gupta · 2020-09-04T01:17:25Z

sck debug failures are pre-existing

meg-gupta requested review from gottesmm and atrick August 31, 2020 18:25

gottesmm reviewed Aug 31, 2020

View reviewed changes

meg-gupta force-pushed the asochanges branch 3 times, most recently from 7cd553e to 52bb0ef Compare September 1, 2020 21:58

atrick reviewed Sep 2, 2020

View reviewed changes

atrick approved these changes Sep 3, 2020

View reviewed changes

meg-gupta added 4 commits September 3, 2020 00:51

ARCSequenceOpts: Improve debug print

fe6d456

ARCSequenceOpts: Remove unused args in several functions

ce669ca

ARCSequenceOpts: Minor code consolidation

132d49f

meg-gupta force-pushed the asochanges branch from 52bb0ef to 586de0a Compare September 3, 2020 08:19

meg-gupta merged commit 1372414 into swiftlang:master Sep 4, 2020

Fix KnownSafety optimization bugs in ARCSequenceOpts #33722

Fix KnownSafety optimization bugs in ARCSequenceOpts #33722

Uh oh!

Conversation

meg-gupta commented Aug 31, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gottesmm left a comment

Choose a reason for hiding this comment

Uh oh!

gottesmm Aug 31, 2020

Choose a reason for hiding this comment

Uh oh!

gottesmm commented Aug 31, 2020

Uh oh!

meg-gupta commented Aug 31, 2020

Uh oh!

meg-gupta commented Sep 1, 2020

Uh oh!

meg-gupta commented Sep 1, 2020

Uh oh!

swift-ci commented Sep 1, 2020

Uh oh!

swift-ci commented Sep 1, 2020

Uh oh!

swift-ci commented Sep 1, 2020

Performance: -O

Code size: -O

Performance: -Osize

Code size: -Osize

Performance: -Onone

Code size: -swiftlibs

Uh oh!

atrick left a comment

Choose a reason for hiding this comment

Uh oh!

meg-gupta commented Sep 2, 2020

Uh oh!

meg-gupta commented Sep 2, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

atrick commented Sep 2, 2020

Uh oh!

meg-gupta commented Sep 2, 2020

Uh oh!

atrick commented Sep 3, 2020

Uh oh!

atrick commented Sep 3, 2020

Uh oh!

atrick commented Sep 3, 2020

Uh oh!

atrick left a comment

Choose a reason for hiding this comment

Uh oh!

meg-gupta commented Sep 3, 2020

Uh oh!

meg-gupta commented Sep 3, 2020

Uh oh!

swift-ci commented Sep 3, 2020

Uh oh!

swift-ci commented Sep 3, 2020

Uh oh!

meg-gupta commented Sep 3, 2020

Uh oh!

meg-gupta commented Sep 3, 2020

Uh oh!

meg-gupta commented Sep 4, 2020

Uh oh!

Uh oh!

meg-gupta commented Aug 31, 2020 •

edited

Loading

meg-gupta commented Sep 2, 2020 •

edited

Loading