New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dead interpolation elimination improvements #20317

Closed
wants to merge 9 commits into
base: master
from

Conversation

Projects
None yet
3 participants
@brentdax
Collaborator

brentdax commented Nov 4, 2018

The optimizer doesn't understand the effects of appendLiteral(_:) and appendInterpolation(_:) calls, which only modify self. Because of this, it can't tell that an interpolation can be eliminated if the resulting string is unused, which causes a (no-assertions-only) test failure and some code size regressions.

This is an attempt to help DeadCodeElimination along. The first revision benchmarked poorly when I tried it a few days ago; I'm just pushing it now to get a baseline, and then I'll experiment a little more from there. Currently pretty hacky, too.

This could be a dead end. If it isn't, it should resolve SR-9008, rdar://45294605, and rdar://45788348.

@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 4, 2018

@swift-ci please smoke benchmark

@swift-ci

This comment has been minimized.

Contributor

swift-ci commented Nov 4, 2018

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Improvement
IterateData 1790 1617 -9.7% 1.11x

Code size: -O

TEST OLD NEW DELTA RATIO
Improvement
DeadArray.o 1940 1596 -17.7% 1.22x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Improvement
DictionaryKeysContainsNative 36 29 -19.4% 1.24x
IterateData 1842 1643 -10.8% 1.12x (?)
CStringLongAscii 3528 3285 -6.9% 1.07x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Improvement
DeadArray.o 1906 1578 -17.2% 1.21x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Improvement
Histogram 6711 5654 -15.8% 1.19x
CStringLongAscii 3662 3373 -7.9% 1.09x
QueueGeneric 20417 19034 -6.8% 1.07x

Code size: Swift libraries

TEST OLD NEW DELTA RATIO
Regression
libswiftFoundation.dylib 1683456 1708032 +1.5% 0.99x
Improvement
libswiftSwiftReflectionTest.dylib 53248 45056 -15.4% 1.18x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false alarms. Unexpected regressions which are marked with '(?)' are probably noise. If you see regressions which you cannot explain you can try to run the benchmarks again. If regressions still show up, please consult with the performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 5, 2018

Wow, much better than expected. Let's do a longer run.

@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 5, 2018

@swift-ci please benchmark

@swift-ci

This comment has been minimized.

Contributor

swift-ci commented Nov 5, 2018

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Improvement
StringComparison_latin1 722 642 -11.1% 1.12x
IterateData 1787 1616 -9.6% 1.11x (?)

Code size: -O

TEST OLD NEW DELTA RATIO
Improvement
DeadArray.o 1940 1596 -17.7% 1.22x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Improvement
IterateData 1840 1640 -10.9% 1.12x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Improvement
DeadArray.o 1906 1578 -17.2% 1.21x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Regression
ArrayOfPOD 780 856 +9.7% 0.91x (?)
Improvement
Dictionary3 709 591 -16.6% 1.20x

Code size: Swift libraries

TEST OLD NEW DELTA RATIO
Regression
libswiftFoundation.dylib 1683456 1708032 +1.5% 0.99x
Improvement
libswiftSwiftReflectionTest.dylib 53248 45056 -15.4% 1.18x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false alarms. Unexpected regressions which are marked with '(?)' are probably noise. If you see regressions which you cannot explain you can try to run the benchmarks again. If regressions still show up, please consult with the performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

brentdax added some commits Nov 1, 2018

Rename new semantic to interpolation.append
@eeckstein pointed out that it’s not strictly correct to declare a guarantee that a function wrapping a call to write(to:) on an arbitrary type has no side effects except on self. However, we do currently eliminate these calls anyway and we want to keep doing so, so this is best thought of as a quirk of interpolation behavior, not a result of the effects system.

Result: We’re going to stick with using an @_semantics attribute, but rename it to avoid talking about effects.
Add new elimination test for non-small string
If an interpolation is estimated to contain more than 15 characters, we will generate code which is harder to correctly eliminate. Make sure we handle this case.
Make DefaultStringInterpolation.init easier to eliminate
The optimizer has a hard time understanding that _StringGuts.init(_initialCapacity:) is fully eliminatable, particularly when we end up choosing the large string path. This change allows us to eliminate the interpolation before we do that inlining. With it, even large string allocations can be quickly and completely eliminated.
@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 9, 2018

@swift-ci please test

@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 9, 2018

@swift-ci please test macOS platform

@swift-ci

This comment was marked as resolved.

Contributor

swift-ci commented Nov 9, 2018

Build failed
Swift Test OS X Platform
Git Sha - face186

@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 9, 2018

@swift-ci please benchmark

@swift-ci

This comment was marked as resolved.

Contributor

swift-ci commented Nov 9, 2018

!!! Couldn't read commit file !!!

@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 9, 2018

@swift-ci please benchmark

@swift-ci

This comment has been minimized.

Contributor

swift-ci commented Nov 9, 2018

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Improvement
IterateData 1672 1445 -13.6% 1.16x
Dictionary2 1088 986 -9.4% 1.10x

Code size: -O

TEST OLD NEW DELTA RATIO
Regression
DictionaryKeysContains.o 11707 13371 +14.2% 0.88x
WordCount.o 45460 46964 +3.3% 0.97x
Improvement
DeadArray.o 1802 1474 -18.2% 1.22x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Improvement
Dictionary2 1078 958 -11.1% 1.13x (?)
IterateData 1515 1356 -10.5% 1.12x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Regression
DictionaryKeysContains.o 11459 12867 +12.3% 0.89x
WordCount.o 41708 43244 +3.7% 0.96x
SequenceAlgos.o 23148 23516 +1.6% 0.98x
TwoSum.o 5277 5341 +1.2% 0.99x
Improvement
DeadArray.o 1842 1458 -20.8% 1.26x
ByteSwap.o 1634 1614 -1.2% 1.01x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Regression
MapReduceLazyCollection 21239 25803 +21.5% 0.82x
MapReduceLazyCollectionShort 31806 38388 +20.7% 0.83x

Code size: -swiftlibs

TEST OLD NEW DELTA RATIO
Regression
libswiftFoundation.dylib 1593344 1609728 +1.0% 0.99x
Improvement
libswiftSwiftReflectionTest.dylib 49152 45056 -8.3% 1.09x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB
--------------
@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 12, 2018

I'm reasonably satisfied with these results. Comments on some surprises:

  • Smaller-than-expected libswiftSwiftReflectionTest.dylib improvement: The baseline changed. On Sunday, master was building a 53k dylib; on Thursday, it built a 49k dylib. In both runs, this PR built a 45k version.
  • DictionaryKeysContains.o size regression: The inliner seems to make different decisions about Dictionary.init<A>(uniqueKeysWithValues:)—in the branch, it doesn't inline _NativeDictionary.merge<A>(_:isUnique:uniquingKeysWith:) but instead keeps it as a separate specialized function. There's a string interpolation in the merge method which generates a fatalError() message, and it looks like it's not inlining calls to DefaultStringInterpolation.init, so perhaps that's changing the inlining cost enough so that it makes a different decision? (In any case, I noticed some slightly questionable code in some of these members, so it might be changing soon...)

Edit: The odd code does make sense after all. Hmm...

@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 12, 2018

The WordCount.o, SequenceAlgos.o, and TwoSum.o tests are heavy on calls to CheckResults(_:), a function with a string interpolation in its (never-taken) failure branch. These are all sensitive to changes in how much interpolation code we inline. They're probably not directly related to the DictionaryKeysContains.o regression.

@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 13, 2018

@swift-ci please benchmark

@brentdax brentdax changed the title from [WIP] Dead interpolation elimination improvements to Dead interpolation elimination improvements Nov 13, 2018

@brentdax brentdax requested a review from eeckstein Nov 13, 2018

@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 13, 2018

@eeckstein DictionaryKeysContains.o is showing a ~13% code size regression. Based on the inliner debug output, it looks like interpolation cost estimates are now lower. Perhaps the inliner is underestimating the cost of certain functions because it doesn't realize that interpolation methods skipped during early inlining will still be inlined later.

Do you have any thoughts about this? And if there's no obvious fix, could you review the PR?

@swift-ci

This comment has been minimized.

Contributor

swift-ci commented Nov 13, 2018

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Improvement
NopDeinit 52690 45455 -13.7% 1.16x
StringEqualPointerComparison 628 571 -9.1% 1.10x

Code size: -O

TEST OLD NEW DELTA RATIO
Regression
DictionaryKeysContains.o 11707 13371 +14.2% 0.88x
WordCount.o 44276 45780 +3.4% 0.97x
Improvement
DeadArray.o 1802 1474 -18.2% 1.22x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Improvement
StringEqualPointerComparison 628 571 -9.1% 1.10x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Regression
DictionaryKeysContains.o 11459 12867 +12.3% 0.89x
WordCount.o 40596 42132 +3.8% 0.96x
SequenceAlgos.o 23148 23516 +1.6% 0.98x
TwoSum.o 5277 5341 +1.2% 0.99x
Improvement
DeadArray.o 1842 1458 -20.8% 1.26x
ByteSwap.o 1634 1614 -1.2% 1.01x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Regression
MapReduceLazyCollection 21388 27748 +29.7% 0.77x
MapReduceLazyCollectionShort 32376 38872 +20.1% 0.83x
StringHashing_latin1 760 839 +10.4% 0.91x
Improvement
CharIteration_utf16_unicodeScalars 174200 122109 -29.9% 1.43x
CharIteration_japanese_unicodeScalars 230668 173062 -25.0% 1.33x
CharIteration_tweet_unicodeScalars 372034 280993 -24.5% 1.32x
CharIteration_ascii_unicodeScalars 187249 142154 -24.1% 1.32x
CharIteration_korean_unicodeScalars 185056 140973 -23.8% 1.31x
CharIteration_russian_unicodeScalars 158268 120705 -23.7% 1.31x
CharIteration_chinese_unicodeScalars 144884 110514 -23.7% 1.31x
CharIteration_punctuatedJapanese_unicodeScalars 33219 25603 -22.9% 1.30x
CharIteration_punctuated_unicodeScalars 41740 32313 -22.6% 1.29x
CharIteration_japanese_unicodeScalars_Backwards 388800 322928 -16.9% 1.20x
CharIndexing_japanese_unicodeScalars 515669 435723 -15.5% 1.18x
CharIteration_punctuatedJapanese_unicodeScalars_Backwards 50823 43938 -13.5% 1.16x
CharIteration_korean_unicodeScalars_Backwards 280889 247100 -12.0% 1.14x
CharIteration_ascii_unicodeScalars_Backwards 283159 249568 -11.9% 1.13x
DictionaryCompactMapValuesOfCastValue 67745 59713 -11.9% 1.13x
CharIteration_tweet_unicodeScalars_Backwards 565668 499583 -11.7% 1.13x
CharIteration_russian_unicodeScalars_Backwards 238634 210880 -11.6% 1.13x
CharIteration_punctuated_unicodeScalars_Backwards 63136 55854 -11.5% 1.13x
CharIteration_chinese_unicodeScalars_Backwards 216581 192349 -11.2% 1.13x
CharIndexing_chinese_unicodeScalars 311011 276246 -11.2% 1.13x
CharIndexing_russian_unicodeScalars_Backwards 385447 345589 -10.3% 1.12x
CharIteration_utf16_unicodeScalars_Backwards 240414 218925 -8.9% 1.10x
CharIndexing_ascii_unicodeScalars 394405 359898 -8.7% 1.10x
CharIndexing_punctuated_unicodeScalars 87738 80159 -8.6% 1.09x
CharIndexing_punctuatedJapanese_unicodeScalars 68947 63011 -8.6% 1.09x
CharIndexing_ascii_unicodeScalars_Backwards 451114 414136 -8.2% 1.09x
CharIndexing_utf16_unicodeScalars_Backwards 368460 339446 -7.9% 1.09x

Code size: -swiftlibs

TEST OLD NEW DELTA RATIO
Regression
libswiftFoundation.dylib 1601536 1617920 +1.0% 0.99x
Improvement
libswiftSwiftReflectionTest.dylib 49152 45056 -8.3% 1.09x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB
--------------
@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 13, 2018

@swift-ci please test

@swift-ci

This comment was marked as resolved.

Contributor

swift-ci commented Nov 13, 2018

Build failed
Swift Test OS X Platform
Git Sha - 32f043d

@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 13, 2018

That's the CI's usual "bad sha" failure. Tests are running here for macOS and here for Linux.

@swift-ci

This comment was marked as resolved.

Contributor

swift-ci commented Nov 13, 2018

Build failed
Swift Test Linux Platform
Git Sha - 32f043d

return true;
// We can remove loads of addresses within the allocation.
if (auto *Load = dyn_cast<LoadInst>(Inst))

This comment has been minimized.

@eeckstein

eeckstein Nov 13, 2018

Member

I'm wondering why we can't remove load instructions in general, i.e. remove the "!Inst->mayReadFromMemory()" from the condition above.

This comment has been minimized.

@brentdax

brentdax Nov 14, 2018

Collaborator

Changing mayReadFromMemory() to mayWriteFromMemory() breaks a bunch of tests, particularly tests for key paths. I think the problem is that the pass would eliminate code which mutates objects outside of the allocation, such as:

 %0 = ...something producing $Int...
 %1 = alloc_stack $MyStruct
 // ...initialization omitted...
%42 = struct_element_addr %1 : $*MyStruct , #MyStruct.someStructPtr
%43 = load %42 : $**SomeStruct
%44 = struct_element_addr %44 : $*SomeStruct, #SomeStruct.property
store %0 to %44 : $*Int

Limiting it to loads of addresses within the allocation avoids this problem.

Edit: Reading over that again…if you don't think this test is enough to guard against that problem, I think you're probably right. This works for DefaultStringInterpolation in the code we generate because its string storage is never shared, but I'm not convinced it's correct in the general case.

// memory management. We don't do that here (that's what
// DeadObjectAnalysis below is supposed to do), but we know we don't
// need to for DefaultStringInterpolation, because we know that the
// initialization of its sole property will be eliminated too.

This comment has been minimized.

@eeckstein

eeckstein Nov 13, 2018

Member

I don't understand this comment. Can you give an example?

This comment has been minimized.

@brentdax

brentdax Nov 13, 2018

Collaborator

I may be a little confused here. What I'm trying to say is that in the general case, if we eliminate a load and a release, we should look for corresponding stores and replace them with releases of the values that would have been stored into them. We don't need to do that for DefaultStringInterpolation because we know that if we eliminate its stack allocation, we will also eliminate anything that could have allocated or retained the string storage object that would have been inside it, because that only happens inside eliminatable calls to appendLiteral(_:), appendInterpolation(_:), or init(literalCapacity:interpolationCount:).

This comment has been minimized.

@eeckstein

eeckstein Nov 13, 2018

Member

I see. Can you change the comment to this explanation?

@eeckstein

This comment has been minimized.

Member

eeckstein commented Nov 13, 2018

DictionaryKeysContains.o

I think we can live with the code size regression for now.

brentdax added some commits Nov 15, 2018

wip
Making nontrivial dead object elimination work.
WIP: Switch to nontrivial dead object elim
Rather than trying to hack trivial dead object elimination to handle certain nontrivial objects, I’m now trying to use the nontrivial dead object elimination tools. These ought to be adequate for our use case; the comments indicate that their main weakness is destructor analysis, which I don’t think should be a factor for structs, and certainly isn’t for interpolations (which always use native strings). So far this passes all interpolation tests, but breaks when applied to arbitrary structs, so I don’t fully trust it yet.
@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 16, 2018

@swift-ci please benchmark

@swift-ci

This comment has been minimized.

Contributor

swift-ci commented Nov 16, 2018

Build comment file:

Code size: -O

TEST OLD NEW DELTA RATIO
Regression
DictionaryKeysContains.o 11767 13431 +14.1% 0.88x
WordCount.o 44228 45732 +3.4% 0.97x
Improvement
DeadArray.o 1892 1532 -19.0% 1.23x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Regression
CharacterLiteralsLarge 100 111 +11.0% 0.90x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Regression
DictionaryKeysContains.o 11455 12895 +12.6% 0.89x
WordCount.o 40516 41924 +3.5% 0.97x
Improvement
DeadArray.o 1890 1506 -20.3% 1.25x
DictTest4.o 20855 20631 -1.1% 1.01x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Regression
ArrayOfPOD 777 858 +10.4% 0.91x (?)
Improvement
ExclusivityIndependent 82 70 -14.6% 1.17x

Code size: -swiftlibs

TEST OLD NEW DELTA RATIO
Regression
libswiftFoundation.dylib 1519616 1536000 +1.1% 0.99x
Improvement
libswiftSwiftReflectionTest.dylib 40960 32768 -20.0% 1.25x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB
--------------
@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 16, 2018

@swift-ci please test compiler performance

@swift-ci

This comment has been minimized.

Contributor

swift-ci commented Nov 16, 2018

Build comment file:


@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 16, 2018

What CI meant to say was:

Summary for master full

Unexpected test results, excluded stats for Tagged, NonEmpty, ProcedureKitCloud, GRDB, ModelAssistant, Wordy

Regressions found (see below)

Debug-batch

debug-batch brief

Regressed (1)
name old new delta delta_pct
Frontend.NumInstructionsExecuted 21,299,318,627,006 23,930,268,593,075 2,630,949,966,069 12.35% ⛔️
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (2)
name old new delta delta_pct
LLVM.NumLLVMBytesOutput 911,949,140 911,727,304 -221,836 -0.02%
time.swift-driver.wall 2246.9s 2256.4s 9.5s 0.42%

debug-batch detailed

Regressed (1)
name old new delta delta_pct
Frontend.NumInstructionsExecuted 21,299,318,627,006 23,930,268,593,075 2,630,949,966,069 12.35% ⛔️
Improved (2)
name old new delta delta_pct
Driver.NumDriverPipePolls 304,783 292,005 -12,778 -4.19%
Driver.NumDriverPipeReads 345,376 331,797 -13,579 -3.93%
Unchanged (delta < 1.0% or delta < 100.0ms) (92)
name old new delta delta_pct
AST.NumASTBytesAllocated 45,359,895,288 45,451,505,008 91,609,720 0.2%
AST.NumDecls 65,849 65,849 0 0.0%
AST.NumDependencies 152,683 152,689 6 0.0%
AST.NumImportedExternalDefinitions 960,730 960,730 0 0.0%
AST.NumInfixOperators 24,640 24,640 0 0.0%
AST.NumLinkLibraries 0 0 0 0.0%
AST.NumLoadedModules 180,845 180,845 0 0.0%
AST.NumLocalTypeDecls 113 113 0 0.0%
AST.NumObjCMethods 12,528 12,528 0 0.0%
AST.NumPostfixOperators 13 13 0 0.0%
AST.NumPrecedenceGroups 12,687 12,687 0 0.0%
AST.NumPrefixOperators 71 71 0 0.0%
AST.NumReferencedDynamicNames 101 101 0 0.0%
AST.NumReferencedMemberNames 3,046,567 3,046,567 0 0.0%
AST.NumReferencedTopLevelNames 210,981 210,981 0 0.0%
AST.NumSourceBuffers 288,503 288,503 0 0.0%
AST.NumSourceLines 2,107,547 2,107,547 0 0.0%
AST.NumSourceLinesPerSecond 1,167,653 1,160,805 -6,848 -0.59%
AST.NumTotalClangImportedEntities 3,562,456 3,565,535 3,079 0.09%
AST.NumUsedConformances 185,274 185,274 0 0.0%
Driver.ChildrenMaxRSS 68,565,780,480 68,615,297,024 49,516,544 0.07%
Driver.DriverDepCascadingDynamic 0 0 0 0.0%
Driver.DriverDepCascadingExternal 0 0 0 0.0%
Driver.DriverDepCascadingMember 0 0 0 0.0%
Driver.DriverDepCascadingNominal 0 0 0 0.0%
Driver.DriverDepCascadingTopLevel 0 0 0 0.0%
Driver.DriverDepDynamic 0 0 0 0.0%
Driver.DriverDepExternal 0 0 0 0.0%
Driver.DriverDepMember 0 0 0 0.0%
Driver.DriverDepNominal 0 0 0 0.0%
Driver.DriverDepTopLevel 0 0 0 0.0%
Driver.NumDriverJobsRun 13,529 13,529 0 0.0%
Driver.NumDriverJobsSkipped 0 0 0 0.0%
Driver.NumProcessFailures 0 0 0 0.0%
Frontend.MaxMallocUsage 352,809,515,144 353,114,816,472 305,301,328 0.09%
Frontend.NumProcessFailures 0 0 0 0.0%
IRModule.NumIRAliases 92,680 92,680 0 0.0%
IRModule.NumIRBasicBlocks 3,587,318 3,587,289 -29 -0.0%
IRModule.NumIRComdatSymbols 0 0 0 0.0%
IRModule.NumIRFunctions 1,632,326 1,632,326 0 0.0%
IRModule.NumIRGlobals 1,857,354 1,857,354 0 0.0%
IRModule.NumIRIFuncs 0 0 0 0.0%
IRModule.NumIRInsts 42,089,475 42,077,011 -12,464 -0.03%
IRModule.NumIRNamedMetaData 66,177 66,177 0 0.0%
IRModule.NumIRValueSymbols 3,120,194 3,120,194 0 0.0%
LLVM.NumLLVMBytesOutput 911,949,140 911,727,304 -221,836 -0.02%
Parse.NumFunctionsParsed 2,170,339 2,170,339 0 0.0%
Parse.NumIterableDeclContextParsed 865,730 865,730 0 0.0%
SILModule.NumSILGenDefaultWitnessTables 0 0 0 0.0%
SILModule.NumSILGenFunctions 1,286,682 1,286,682 0 0.0%
SILModule.NumSILGenGlobalVariables 26,821 26,821 0 0.0%
SILModule.NumSILGenVtables 10,171 10,171 0 0.0%
SILModule.NumSILGenWitnessTables 37,360 37,360 0 0.0%
SILModule.NumSILOptDefaultWitnessTables 0 0 0 0.0%
SILModule.NumSILOptFunctions 1,170,065 1,171,362 1,297 0.11%
SILModule.NumSILOptGlobalVariables 27,550 27,550 0 0.0%
SILModule.NumSILOptVtables 16,355 16,355 0 0.0%
SILModule.NumSILOptWitnessTables 72,507 72,507 0 0.0%
Sema.AccessLevelRequest 1,912,680 1,913,194 514 0.03%
Sema.DefaultAndMaxAccessLevelRequest 45,840 45,843 3 0.01%
Sema.EnumRawTypeRequest 13,205 13,205 0 0.0%
Sema.ExtendedNominalRequest 2,675,157 2,675,431 274 0.01%
Sema.InheritedDeclsReferencedRequest 84,613,324 84,668,705 55,381 0.07%
Sema.InheritedTypeRequest 441,224 441,287 63 0.01%
Sema.IsDynamicRequest 1,535,733 1,535,733 0 0.0%
Sema.IsObjCRequest 1,333,835 1,333,978 143 0.01%
Sema.NamedLazyMemberLoadFailureCount 18,161 18,184 23 0.13%
Sema.NamedLazyMemberLoadSuccessCount 13,018,294 13,012,199 -6,095 -0.05%
Sema.NominalTypeLookupDirectCount 24,981,204 24,991,188 9,984 0.04%
Sema.NumConformancesDeserialized 3,879,389 3,882,556 3,167 0.08%
Sema.NumConstraintScopes 13,045,831 13,046,856 1,025 0.01%
Sema.NumConstraintsConsideredForEdgeContraction 29,316,344 29,316,587 243 0.0%
Sema.NumDeclsDeserialized 30,827,826 30,873,244 45,418 0.15%
Sema.NumDeclsValidated 1,645,646 1,645,641 -5 -0.0%
Sema.NumFunctionsTypechecked 920,531 920,531 0 0.0%
Sema.NumGenericSignatureBuilders 872,637 872,284 -353 -0.04%
Sema.NumLazyGenericEnvironments 6,157,722 6,157,904 182 0.0%
Sema.NumLazyGenericEnvironmentsLoaded 165,400 165,424 24 0.01%
Sema.NumLazyIterableDeclContexts 4,932,562 4,933,380 818 0.02%
Sema.NumLeafScopes 8,944,710 8,945,570 860 0.01%
Sema.NumTypesDeserialized 10,882,131 10,888,374 6,243 0.06%
Sema.NumTypesValidated 1,078,137 1,078,132 -5 -0.0%
Sema.NumUnloadedLazyIterableDeclContexts 3,500,966 3,499,675 -1,291 -0.04%
Sema.OverriddenDeclsRequest 4,103,315 4,121,991 18,676 0.46%
Sema.RequirementRequest 54,623 54,619 -4 -0.01%
Sema.SelfBoundsFromWhereClauseRequest 50,408,559 50,467,490 58,931 0.12%
Sema.SetterAccessLevelRequest 108,626 108,626 0 0.0%
Sema.SuperclassDeclRequest 66,573,449 66,597,424 23,975 0.04%
Sema.SuperclassTypeRequest 30,277 30,277 0 0.0%
Sema.TypeDeclsFromWhereClauseRequest 26,254 26,257 3 0.01%
Sema.USRGenerationRequest 6,141,318 6,181,319 40,001 0.65%
Sema.UnderlyingTypeDeclsReferencedRequest 2,296,019 2,296,093 74 0.0%

Release

release brief

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (3)
name old new delta delta_pct
Frontend.NumInstructionsExecuted 24,042,045,391,202 24,042,730,935,196 685,543,994 0.0%
LLVM.NumLLVMBytesOutput 811,141,394 811,010,570 -130,824 -0.02%
time.swift-driver.wall 4466.2s 4465.0s -1.2s -0.03%

release detailed

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (23)
name old new delta delta_pct
AST.NumImportedExternalDefinitions 178,485 178,485 0 0.0%
AST.NumLoadedModules 11,404 11,404 0 0.0%
AST.NumTotalClangImportedEntities 602,601 602,601 0 0.0%
AST.NumUsedConformances 186,004 186,004 0 0.0%
IRModule.NumIRBasicBlocks 2,985,101 2,985,023 -78 -0.0%
IRModule.NumIRFunctions 1,338,631 1,338,723 92 0.01%
IRModule.NumIRGlobals 1,465,764 1,465,655 -109 -0.01%
IRModule.NumIRInsts 28,671,707 28,672,463 756 0.0%
IRModule.NumIRValueSymbols 2,620,363 2,620,358 -5 -0.0%
LLVM.NumLLVMBytesOutput 811,141,394 811,010,570 -130,824 -0.02%
SILModule.NumSILGenFunctions 561,090 561,090 0 0.0%
SILModule.NumSILOptFunctions 695,820 696,113 293 0.04%
Sema.NumConformancesDeserialized 1,574,881 1,572,972 -1,909 -0.12%
Sema.NumConstraintScopes 11,500,306 11,500,306 0 0.0%
Sema.NumDeclsDeserialized 3,928,331 3,928,720 389 0.01%
Sema.NumDeclsValidated 855,460 855,460 0 0.0%
Sema.NumFunctionsTypechecked 450,388 450,388 0 0.0%
Sema.NumGenericSignatureBuilders 144,163 144,163 0 0.0%
Sema.NumLazyGenericEnvironments 792,879 792,879 0 0.0%
Sema.NumLazyGenericEnvironmentsLoaded 15,999 15,999 0 0.0%
Sema.NumLazyIterableDeclContexts 531,886 531,906 20 0.0%
Sema.NumTypesDeserialized 2,136,447 2,139,197 2,750 0.13%
Sema.NumTypesValidated 401,820 401,820 0 0.0%
@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 16, 2018

Also, ModelAssistant UPASSed. So, that happened.

@brentdax

This comment has been minimized.

Collaborator

brentdax commented Nov 17, 2018

So, this passes all tests and doesn't seem to cause any problems as long as NONTRIVIAL_ALLOC_STACK_FOR_EVERYONE is not defined. But with it defined—which allows the new code to operate on any struct, not just DefaultStringInterpolation/_StringGuts/_StringObject—it seems to introduce incorrect releases in some tests, which makes me worried there's an underlying issue. Reduction for one of these (pop it into test/stdlib and run it with lit; make sure you have a release-mode, no-assertion stdlib):

// RUN: %target-run-simple-swift
// REQUIRES: executable_test
// REQUIRES: objc_interop

//
// Tests for the NSString APIs as exposed by String
//

import Foundation

_ = "абвг" == "абВГ".lowercased(with: Locale.current)

As far as I can tell, the failing tests all involve bridged NSStrings, and interpolation buffers should ~never be NSStrings. But it's still a concern, and this patch is probably getting too large to land in swift-5.0-branch anyway.

One other bit of knowledge I want to preserve: @_semantics("interpolation.append") in this PR exempts a method from early inlining, and that seems to cause regressions in certain benchmarks. It's something to keep an eye on and perhaps figure out how to mitigate eventually.

@brentdax brentdax closed this Nov 17, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment