PerformanceInliner: enable generic inlining of co-routines #27088

eeckstein · 2019-09-09T14:01:43Z

Co-routines (e.g. Array.subscript.read) are so expensive to call that it makes sense to enable generic inlining for them.
This significantly speeds up array iteration (e.g. for elem in array { }) in a generic context.
Another example is ManagedBuffer.header.read, which gets much faster.
In both cases, the speedup comes mainly from no longer calling malloc.
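
For illustration, here is a minimal sketch of the kind of generic code the description refers to. The function names countMatches and readHeader are hypothetical and not taken from this PR. Whether a particular call is actually inlined depends on the optimizer's heuristics and the build settings.

```swift
// Hypothetical example: iterating an array whose element type is generic.
// Inside this function T is not known, so element access cannot be
// specialized to a concrete type; it goes through Array's generic accessor.
func countMatches<T>(in array: [T], where predicate: (T) -> Bool) -> Int {
    var count = 0
    for elem in array {
        if predicate(elem) { count += 1 }
    }
    return count
}

// Hypothetical example: reading a ManagedBuffer header in a generic context.
func readHeader<Element>(of buffer: ManagedBuffer<Int, Element>) -> Int {
    return buffer.header
}

let evens = countMatches(in: [1, 2, 3, 4]) { $0 % 2 == 0 }
let buffer = ManagedBuffer<Int, String>.create(minimumCapacity: 1) { _ in 42 }
print(evens, readHeader(of: buffer))
```

Both functions are generic over the element type, which is the situation where, per the description above, the read accessors previously were not inlined.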

https://bugs.swift.org/browse/SR-11231
rdar://problem/53777612

eeckstein · 2019-09-09T14:01:59Z

@swift-ci test

eeckstein · 2019-09-09T14:02:09Z

@swift-ci benchmark

swift-ci · 2019-09-09T14:26:45Z

Performance: -O

Regression	OLD	NEW	DELTA	RATIO
PrefixWhileAnySeqCntRangeLazy	34	40	+17.6%	0.85x (?)

Improvement	OLD	NEW	DELTA	RATIO
Dictionary4	198	156	-21.2%	1.27x
Dictionary4OfObjects	225	197	-12.4%	1.14x

Code size: -O

Regression	OLD	NEW	DELTA	RATIO
BucketSort.o	11083	11339	+2.3%	0.98x

Performance: -Osize

Regression	OLD	NEW	DELTA	RATIO
MapReduce	218	262	+20.2%	0.83x (?)
RandomShuffleLCG2	368	416	+13.0%	0.88x
RemoveWhereSwapInts	31	35	+12.9%	0.89x
Array2D	3696	4144	+12.1%	0.89x (?)
MapReduceAnyCollection	239	261	+9.2%	0.92x
RemoveWhereFilterInts	23	25	+8.7%	0.92x (?)
SubstringFromLongStringGeneric	12	13	+8.3%	0.92x (?)
ArraySetElement	262	283	+8.0%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
FlattenListLoop	2791	2416	-13.4%	1.16x (?)
ObjectiveCBridgeStubNSDateRefAccess	196	174	-11.2%	1.13x (?)

Code size: -Osize

Regression	OLD	NEW	DELTA	RATIO
StringRemoveDupes.o	4529	4641	+2.5%	0.98x
BucketSort.o	11169	11393	+2.0%	0.98x

Performance: -Onone

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.
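
The DELTA and RATIO columns are not defined in the report. The rows above are consistent with DELTA = (NEW - OLD) / OLD and RATIO = OLD / NEW. This is an inference from the table values, not the documented behavior of the benchmark script. A small sketch with hypothetical helper names:

```swift
// Assumed formulas, inferred from the table rows (not from the benchmark tooling).

/// Relative change of NEW versus OLD, in percent. Positive means slower or larger.
func deltaPercent(old: Double, new: Double) -> Double {
    return (new - old) / old * 100
}

/// OLD divided by NEW. Values above 1 indicate an improvement.
func ratio(old: Double, new: Double) -> Double {
    return old / new
}

// Values from the Dictionary4 row: OLD = 198, NEW = 156.
let delta = (deltaPercent(old: 198, new: 156) * 10).rounded() / 10
let speedup = (ratio(old: 198, new: 156) * 100).rounded() / 100
print(delta, speedup)
```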

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

swift-ci · 2019-09-09T15:06:07Z

Build failed
Swift Test Linux Platform
Git Sha - 8c04047435d7fa5539afc9be0a6ad540c4f826c0

swift-ci · 2019-09-09T15:08:36Z

Build failed
Swift Test OS X Platform
Git Sha - 8c04047435d7fa5539afc9be0a6ad540c4f826c0

eeckstein · 2019-09-09T17:22:41Z

@swift-ci test

eeckstein · 2019-09-09T17:24:34Z

@swift-ci test

swift-ci · 2019-09-09T18:46:20Z

Build failed
Swift Test Linux Platform
Git Sha - d07593b

swift-ci · 2019-09-09T19:26:01Z

Build failed
Swift Test OS X Platform
Git Sha - d07593b

eeckstein · 2019-09-10T06:10:57Z

@swift-ci test

eeckstein force-pushed the inline-generic-coroutines branch from 8c04047 to d07593b Compare September 9, 2019 17:22

eeckstein merged commit c949261 into apple:master Sep 10, 2019

eeckstein deleted the inline-generic-coroutines branch September 10, 2019 08:33

This was referenced Sep 10, 2019

[SR-11231] ManagedBufferPointer's suggested deinit allocates #53632

Closed

[SR-11262] Array.subscript.read / Collection.subscript.read allocate #53663

Open
