Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stdlib][DNM] Remove customization points from Sequence and Collection #19769

Closed

Conversation

airspeedswift
Copy link
Member

@airspeedswift airspeedswift commented Oct 8, 2018

This removes some unnecessary customization points on standard library protocols:

map and forEach on Sequence

It is unrealistic that a collection would be able to implement map (which takes a transformation closure) to create a mapped array faster than the default implementation.

While it is possible that a type's forEach implementation might be able to eek out a small performance benefit (for example, to avoid the reference count bump of putting self into an iterator), it is generally harmful to encourage this kind of "maybe forEach could be faster" micro-optimization (for example, see here where error control flow was used in order to break out of the for-each early).

first on Collection and last on BidirectionalCollection

It's important to remove these in order for collections of move-only types to be able to implement these protocols. Returning a single element of the collection inside an optional would otherwise need to consume the entire collection.

While it is conceivable that faster implementations of first for particular collections could exist, it's unlikely to be material and experience in benchmarking so far suggests that where they did exist, those implementations were actually slower.

@airspeedswift
Copy link
Member Author

@swift-ci please smoke benchmark

@airspeedswift
Copy link
Member Author

@swift-ci please smoke test compiler performance

@airspeedswift
Copy link
Member Author

@swift-ci please test

@swift-ci
Copy link
Collaborator

swift-ci commented Oct 8, 2018

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Regression
MapReduceAnyCollection 398 6748 +1595.5% 0.06x
MapReduceAnyCollectionShort 2035 10519 +416.9% 0.19x
IterateData 1769 1909 +7.9% 0.93x (?)
CStringLongAscii 3301 3554 +7.7% 0.93x
Improvement
StringWordBuilderReservingCapacity 1214 1090 -10.2% 1.11x (?)
Array2D 7505 6910 -7.9% 1.09x
MapReduce 396 368 -7.1% 1.08x

Code size: -O

TEST OLD NEW DELTA RATIO
Regression
MapReduce.o 25781 27621 +7.1% 0.93x
Improvement
NibbleSort.o 14456 13976 -3.3% 1.03x
COWTree.o 14884 14724 -1.1% 1.01x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Regression
MapReduceAnyCollection 436 11494 +2536.2% 0.04x
MapReduceAnyCollectionShort 2329 15428 +562.4% 0.15x
Improvement
StringWalk 2008 1701 -15.3% 1.18x
StringWordBuilderReservingCapacity 1248 1090 -12.7% 1.14x (?)
Array2D 7208 6611 -8.3% 1.09x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Regression
MapReduce.o 22365 23557 +5.3% 0.95x
Improvement
NibbleSort.o 19504 18944 -2.9% 1.03x
COWTree.o 14062 13902 -1.1% 1.01x
LuhnAlgoLazy.o 15904 15744 -1.0% 1.01x
LuhnAlgoEager.o 15906 15746 -1.0% 1.01x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Regression
MapReduceAnyCollection 25072 43746 +74.5% 0.57x
MapReduceAnyCollectionShort 38051 60490 +59.0% 0.63x
PrefixArrayLazy 31200 35976 +15.3% 0.87x
DropFirstArrayLazy 31389 35895 +14.4% 0.87x
Improvement
ArrayOfPOD 859 781 -9.1% 1.10x (?)
StringWordBuilderReservingCapacity 1431 1321 -7.7% 1.08x

Code size: Swift libraries

TEST OLD NEW DELTA RATIO
Improvement
libswiftCore.dylib 3969024 3907584 -1.5% 1.02x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false alarms. Unexpected regressions which are marked with '(?)' are probably noise. If you see regressions which you cannot explain you can try to run the benchmarks again. If regressions still show up, please consult with the performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@swift-ci
Copy link
Collaborator

swift-ci commented Oct 8, 2018

Build comment file:

Summary for master smoketest

Unexpected test results, excluded stats for Kingfisher, ReactiveCocoa

No regressions above thresholds

Debug

debug brief

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (3)
name old new delta delta_pct
Frontend.NumInstructionsExecuted 183,183,666,227 183,268,040,707 84,374,480 0.05%
LLVM.NumLLVMBytesOutput 9,593,348 9,588,752 -4,596 -0.05%
time.swift-driver.wall 23.5s 23.6s 9.0ms 0.04%

debug detailed

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (23)
name old new delta delta_pct
AST.NumImportedExternalDefinitions 4,559 4,559 0 0.0%
AST.NumLoadedModules 1,578 1,578 0 0.0%
AST.NumTotalClangImportedEntities 17,921 17,921 0 0.0%
AST.NumUsedConformances 1,352 1,352 0 0.0%
IRModule.NumIRBasicBlocks 32,760 32,744 -16 -0.05%
IRModule.NumIRFunctions 16,675 16,659 -16 -0.1%
IRModule.NumIRGlobals 14,608 14,592 -16 -0.11%
IRModule.NumIRInsts 444,841 444,769 -72 -0.02%
IRModule.NumIRValueSymbols 29,966 29,934 -32 -0.11%
LLVM.NumLLVMBytesOutput 9,593,348 9,588,752 -4,596 -0.05%
SILModule.NumSILGenFunctions 7,843 7,829 -14 -0.18%
SILModule.NumSILOptFunctions 10,695 10,665 -30 -0.28%
Sema.NumConformancesDeserialized 23,230 23,210 -20 -0.09%
Sema.NumConstraintScopes 70,308 70,304 -4 -0.01%
Sema.NumDeclsDeserialized 229,260 228,462 -798 -0.35%
Sema.NumDeclsValidated 16,641 16,641 0 0.0%
Sema.NumFunctionsTypechecked 6,090 6,090 0 0.0%
Sema.NumGenericSignatureBuilders 7,320 7,310 -10 -0.14%
Sema.NumLazyGenericEnvironments 52,371 52,072 -299 -0.57%
Sema.NumLazyGenericEnvironmentsLoaded 2,363 2,363 0 0.0%
Sema.NumLazyIterableDeclContexts 43,925 43,924 -1 -0.0%
Sema.NumTypesDeserialized 100,312 99,697 -615 -0.61%
Sema.NumTypesValidated 14,185 14,185 0 0.0%

Release

release brief

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (3)
name old new delta delta_pct
Frontend.NumInstructionsExecuted 223,126,461,832 222,848,405,976 -278,055,856 -0.12%
LLVM.NumLLVMBytesOutput 10,708,980 10,704,312 -4,668 -0.04%
time.swift-driver.wall 41.3s 41.4s 127.9ms 0.31%

release detailed

Regressed (0)
name old new delta delta_pct
Improved (3)
name old new delta delta_pct
SILModule.NumSILOptFunctions 9,041 8,947 -94 -1.04% ✅
Sema.NumConformancesDeserialized 15,078 14,883 -195 -1.29% ✅
Sema.NumTypesDeserialized 26,687 26,209 -478 -1.79% ✅
Unchanged (delta < 1.0% or delta < 100.0ms) (20)
name old new delta delta_pct
AST.NumImportedExternalDefinitions 1,111 1,111 0 0.0%
AST.NumLoadedModules 105 105 0 0.0%
AST.NumTotalClangImportedEntities 4,764 4,764 0 0.0%
AST.NumUsedConformances 1,354 1,354 0 0.0%
IRModule.NumIRBasicBlocks 34,692 34,676 -16 -0.05%
IRModule.NumIRFunctions 15,176 15,160 -16 -0.11%
IRModule.NumIRGlobals 13,962 13,946 -16 -0.11%
IRModule.NumIRInsts 334,236 334,186 -50 -0.01%
IRModule.NumIRValueSymbols 28,044 28,012 -32 -0.11%
LLVM.NumLLVMBytesOutput 10,708,980 10,704,312 -4,668 -0.04%
SILModule.NumSILGenFunctions 6,076 6,066 -10 -0.16%
Sema.NumConstraintScopes 69,604 69,600 -4 -0.01%
Sema.NumDeclsDeserialized 45,627 45,493 -134 -0.29%
Sema.NumDeclsValidated 10,362 10,362 0 0.0%
Sema.NumFunctionsTypechecked 4,162 4,162 0 0.0%
Sema.NumGenericSignatureBuilders 2,033 2,029 -4 -0.2%
Sema.NumLazyGenericEnvironments 10,065 10,020 -45 -0.45%
Sema.NumLazyGenericEnvironmentsLoaded 201 201 0 0.0%
Sema.NumLazyIterableDeclContexts 5,861 5,861 0 0.0%
Sema.NumTypesValidated 7,103 7,103 0 0.0%

@airspeedswift airspeedswift force-pushed the too-much-customization branch 2 times, most recently from f0d0c2e to e02c925 Compare October 8, 2018 23:21
@airspeedswift
Copy link
Member Author

@swift-ci please smoke benchmark

@airspeedswift
Copy link
Member Author

@swift-ci please smoke test

@swift-ci
Copy link
Collaborator

swift-ci commented Oct 9, 2018

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Regression
CStringLongAscii 3294 3556 +8.0% 0.93x
Improvement
StringWordBuilderReservingCapacity 1214 1091 -10.1% 1.11x (?)
Array2D 7505 6908 -8.0% 1.09x
MapReduce 397 369 -7.1% 1.08x
MapReduceAnyCollection 398 370 -7.0% 1.08x

Code size: -O

TEST OLD NEW DELTA RATIO
Improvement
NibbleSort.o 14456 13976 -3.3% 1.03x
COWTree.o 14884 14724 -1.1% 1.01x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Improvement
StringWordBuilderReservingCapacity 1214 1090 -10.2% 1.11x (?)
Array2D 7208 6612 -8.3% 1.09x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Improvement
NibbleSort.o 19504 18944 -2.9% 1.03x
COWTree.o 14062 13902 -1.1% 1.01x
LuhnAlgoLazy.o 15904 15744 -1.0% 1.01x
LuhnAlgoEager.o 15906 15746 -1.0% 1.01x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Improvement
ArrayOfPOD 860 780 -9.3% 1.10x (?)
StringWordBuilderReservingCapacity 1431 1321 -7.7% 1.08x

Code size: Swift libraries

TEST OLD NEW DELTA RATIO
Improvement
libswiftCore.dylib 3969024 3907584 -1.5% 1.02x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false alarms. Unexpected regressions which are marked with '(?)' are probably noise. If you see regressions which you cannot explain you can try to run the benchmarks again. If regressions still show up, please consult with the performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@airspeedswift
Copy link
Member Author

Closing in favor of #19995

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants