stdlib: Code size improvements for Dictionary for -Osize #21306

Merged
eeckstein merged 2 commits into apple:master from the dictionary-code-size branch on Dec 14, 2018


eeckstein
Member

The first change is to remove some @inline(__always) attributes. Those were added before we had the guaranteed-by-default calling convention. They are not necessary anymore.

The second change is to not specialize some slow-path functions. As a result, no specialized code for these functions is generated on the client side; instead, the functions are called directly in libswiftCore.
Note that Key-related hashing and equality comparisons are still specialized, because otherwise the performance hit for -Osize would be too big.

Some Dictionary benchmarks regress a bit with -Osize, but the code size wins are big.

rdar://problem/46534453

@eeckstein
Member Author

@swift-ci benchmark

@swift-ci
Collaborator

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Regression
DictionaryGroup 229 259 +13.1% 0.88x
CountAlgoString 1725 1940 +12.5% 0.89x
Improvement
SortAdjacentIntPyramids 1257 990 -21.2% 1.27x
DataSetCountSmall 154 137 -11.0% 1.12x (?)
Histogram 535 476 -11.0% 1.12x
WordCountHistogramASCII 4308 3989 -7.4% 1.08x

Code size: -O

TEST OLD NEW DELTA RATIO
Regression
DictionaryKeysContains.o 8615 10719 +24.4% 0.80x
DictionaryGroup.o 16963 18395 +8.4% 0.92x
WordCount.o 43171 44315 +2.6% 0.97x
HashQuadratic.o 5508 5644 +2.5% 0.98x
DictionaryOfAnyHashableStrings.o 11053 11229 +1.6% 0.98x
DictionaryCopy.o 7773 7885 +1.4% 0.99x
Improvement
DictTest3.o 24459 22475 -8.1% 1.09x
ReduceInto.o 18201 17145 -5.8% 1.06x
DictionarySwap.o 27750 26478 -4.6% 1.05x
DictTest4.o 25055 24034 -4.1% 1.04x
DictTest4Legacy.o 26513 25540 -3.7% 1.04x
DriverUtils.o 149581 144669 -3.3% 1.03x
StringRemoveDupes.o 7728 7520 -2.7% 1.03x
DictionaryCompactMapValues.o 19294 18950 -1.8% 1.02x
ObjectiveCBridging.o 43320 42696 -1.4% 1.01x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Regression
DictionaryLiteral 2856 7032 +146.2% 0.41x
DictionaryRemove 3498 5820 +66.4% 0.60x
FrequenciesUsingReduce 4018 6634 +65.1% 0.61x
DictionarySubscriptDefaultMutation 269 361 +34.2% 0.75x
DictionaryCopy 54175 72500 +33.8% 0.75x
DictionaryRemoveOfObjects 23229 29555 +27.2% 0.79x
FrequenciesUsingReduceInto 891 1104 +23.9% 0.81x
TwoSum 1167 1444 +23.7% 0.81x
DictionarySubscriptDefaultMutationArray 609 727 +19.4% 0.84x
DictionaryGroup 321 362 +12.8% 0.89x
Dictionary2OfObjects 2245 2510 +11.8% 0.89x
DictionarySwap 997 1108 +11.1% 0.90x
Dictionary2 882 970 +10.0% 0.91x
Dictionary 534 587 +9.9% 0.91x
CountAlgoString 1730 1900 +9.8% 0.91x
Dictionary3 229 251 +9.6% 0.91x
StringEnumRawValueInitialization 846 923 +9.1% 0.92x
StringRemoveDupes 339 369 +8.8% 0.92x
Improvement
DataSetCountSmall 157 140 -10.8% 1.12x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Improvement
Histogram.o 4016 1920 -52.2% 2.09x
HashQuadratic.o 5164 2880 -44.2% 1.79x
TwoSum.o 5357 3109 -42.0% 1.72x
DictTest4Legacy.o 23880 14336 -40.0% 1.67x
StringRemoveDupes.o 7561 4761 -37.0% 1.59x
DictTest3.o 21233 13945 -34.3% 1.52x
DictionaryCompactMapValues.o 18126 12358 -31.8% 1.47x
DictTest2.o 14145 9833 -30.5% 1.44x
DictionaryRemove.o 15746 11028 -30.0% 1.43x
DictionarySubscriptDefault.o 27545 19601 -28.8% 1.41x
DictionaryLiteral.o 1509 1075 -28.8% 1.40x
DictTest.o 17233 12521 -27.3% 1.38x
ReduceInto.o 13403 9955 -25.7% 1.35x
DictionaryOfAnyHashableStrings.o 10757 8013 -25.5% 1.34x
DictTest4.o 20854 16894 -19.0% 1.23x
ReversedCollections.o 11842 9746 -17.7% 1.22x
DictionarySwap.o 26886 22214 -17.4% 1.21x
DictionaryCopy.o 7241 6177 -14.7% 1.17x
RGBHistogram.o 27566 23974 -13.0% 1.15x
DictionaryGroup.o 16143 14175 -12.2% 1.14x
DictOfArraysToArrayOfDicts.o 30614 27094 -11.5% 1.13x
DictionaryBridgeToObjC.o 5981 5419 -9.4% 1.10x
Prims.o 39308 37212 -5.3% 1.06x
PrimsSplit.o 39360 37264 -5.3% 1.06x
DictionaryKeysContains.o 8815 8359 -5.2% 1.05x
DriverUtils.o 130357 124053 -4.8% 1.05x
WordCount.o 40660 39532 -2.8% 1.03x
Hash.o 20367 19841 -2.6% 1.03x
ObjectiveCBridging.o 40695 40143 -1.4% 1.01x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Improvement
DataSetCountSmall 217 200 -7.8% 1.08x (?)

Code size: -swiftlibs

TEST OLD NEW DELTA RATIO
Improvement
libswiftAVFoundation.dylib 65536 61440 -6.2% 1.07x
libswiftXCTest.dylib 81920 77824 -5.0% 1.05x
libswiftAppKit.dylib 81920 77824 -5.0% 1.05x
libswiftFoundation.dylib 1654784 1626112 -1.7% 1.02x
libswiftStdlibUnittest.dylib 380928 376832 -1.1% 1.01x

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.
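
The DELTA and RATIO columns are not defined in the comment itself. The rows above appear consistent with DELTA = (NEW - OLD) / OLD and RATIO = OLD / NEW, so a minimal sketch under that assumption looks like the following. The `Row` type and `isReported` helper are hypothetical names used only for illustration.

```swift
/// One row of a benchmark table: a test name with its old and new measurement.
/// (Hypothetical type, not part of the benchmark tooling.)
struct Row {
    let test: String
    let old: Double
    let new: Double

    /// Relative change in percent; positive means the value grew.
    var deltaPercent: Double {
        return (new - old) / old * 100
    }

    /// OLD divided by NEW; a value above 1 means the new value is smaller.
    var ratio: Double {
        return old / new
    }
}

/// True if the change exceeds the reporting threshold
/// (8% for performance, 1% for code size, per the text above).
func isReported(_ row: Row, thresholdPercent: Double) -> Bool {
    return abs(row.deltaPercent) > thresholdPercent
}

// Values copied from the "Performance: -O" and "Code size: -Osize" tables.
let dictionaryGroup = Row(test: "DictionaryGroup", old: 229, new: 259)
let histogramObject = Row(test: "Histogram.o", old: 4016, new: 1920)

for row in [dictionaryGroup, histogramObject] {
    // Round to the precision used in the tables.
    let delta = (row.deltaPercent * 10).rounded() / 10
    let ratio = (row.ratio * 100).rounded() / 100
    print(row.test, delta, ratio)
}
```

Under this reading, a positive DELTA with a RATIO below 1 is a regression for both run time and code size, and a negative DELTA with a RATIO above 1 is an improvement.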

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB
--------------

@eeckstein
Member Author

@swift-ci test

@swift-ci
Collaborator

Build failed
Swift Test Linux Platform
Git Sha - 16fcc3e7946cf74f3a13a4012417191d27c321c8

@swift-ci
Collaborator

Build failed
Swift Test OS X Platform
Git Sha - 16fcc3e7946cf74f3a13a4012417191d27c321c8

…ion when compiled with -Osize

It's @_semantics("optimize.sil.specialize.generic.size.never")

It is similar to "optimize.sil.specialize.generic.partial.never", but it only prevents specialization when the optimization mode is size (-Osize).
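
As a rough illustration of how such an annotation might be placed, the sketch below marks a slow-path method of a small generic hash table. `TinyTable` and all of its members are hypothetical and are not the stdlib's Dictionary implementation. The attribute is underscored and unofficial, so its effect outside the standard library may vary between compiler versions.

```swift
// Hypothetical open-addressing table, for illustration only.
struct TinyTable<Key: Hashable, Value> {
    private var slots: [(key: Key, value: Value)?]
    private(set) var count = 0

    init(capacity: Int = 4) {
        slots = [(key: Key, value: Value)?](repeating: nil, count: max(capacity, 4))
    }

    // Fast path: depends on Key's hashing and equality, so it is left
    // unannotated and can still be specialized for concrete key types.
    private func slotIndex(for key: Key) -> Int {
        var index = abs(key.hashValue % slots.count)
        // Linear probing: stop at the matching key or at the first empty slot.
        while let entry = slots[index], entry.key != key {
            index = (index + 1) % slots.count
        }
        return index
    }

    // Slow path: runs rarely, so it is marked to stay unspecialized
    // when the code is compiled with -Osize.
    @_semantics("optimize.sil.specialize.generic.size.never")
    private mutating func grow() {
        let old = slots
        slots = [(key: Key, value: Value)?](repeating: nil, count: old.count * 2)
        for case let entry? in old {
            slots[slotIndex(for: entry.key)] = entry
        }
    }

    mutating func insert(_ value: Value, for key: Key) {
        // Keep the table at most half full so probing always finds an empty slot.
        if (count + 1) * 2 > slots.count { grow() }
        let index = slotIndex(for: key)
        if slots[index] == nil { count += 1 }
        slots[index] = (key: key, value: value)
    }

    func value(for key: Key) -> Value? {
        return slots[slotIndex(for: key)]?.value
    }
}
```

The split mirrors the description in this PR: the rarely executed resize is the part that gives up specialization, while the key lookup keeps it.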
@eeckstein
Member Author

@swift-ci test


@lorentey
Member

Hm; I wonder if some of these would better be implemented by moving methods that don't depend on Key/Value to a non-generic _CoreNativeDictionary struct.
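
One way to picture this suggestion is the sketch below: bucket bookkeeping that never mentions Key or Value lives in a non-generic struct, and a thin generic wrapper keeps only the key comparison and storage. `CoreTable` and `WrappedTable` are hypothetical names; this is not the `_CoreNativeDictionary` proposed here, which does not exist in the source.

```swift
// Hypothetical non-generic core: nothing here mentions Key or Value,
// so only one copy of this code is ever needed.
struct CoreTable {
    let bucketCount: Int
    private(set) var occupied: [Bool]
    private(set) var count = 0

    init(bucketCount: Int) {
        self.bucketCount = max(bucketCount, 2)
        self.occupied = [Bool](repeating: false, count: self.bucketCount)
    }

    func idealBucket(forHashValue hashValue: Int) -> Int {
        return abs(hashValue % bucketCount)
    }

    func bucket(after bucket: Int) -> Int {
        return (bucket + 1) % bucketCount
    }

    mutating func markOccupied(_ bucket: Int) {
        occupied[bucket] = true
        count += 1
    }
}

// Generic wrapper: only the parts that need Key and Value stay generic.
struct WrappedTable<Key: Hashable, Value> {
    private var core: CoreTable
    private var keys: [Key?]
    private var values: [Value?]

    var count: Int { return core.count }

    init(bucketCount: Int) {
        let core = CoreTable(bucketCount: bucketCount)
        self.core = core
        self.keys = [Key?](repeating: nil, count: core.bucketCount)
        self.values = [Value?](repeating: nil, count: core.bucketCount)
    }

    // Probe until the key or an empty bucket is found.
    private func find(_ key: Key) -> Int {
        var bucket = core.idealBucket(forHashValue: key.hashValue)
        while core.occupied[bucket], keys[bucket] != key {
            bucket = core.bucket(after: bucket)
        }
        return bucket
    }

    mutating func insert(_ value: Value, for key: Key) {
        let bucket = find(key)
        if !core.occupied[bucket] {
            // Fixed capacity in this sketch: one bucket always stays empty
            // so that probing terminates.
            precondition(core.count < core.bucketCount - 1, "table is full")
            core.markOccupied(bucket)
        }
        keys[bucket] = key
        values[bucket] = value
    }

    func value(for key: Key) -> Value? {
        return values[find(key)]
    }
}
```

As the reply below points out, the functions annotated in this PR depend on the Value type, so a split like this would only cover a different set of methods.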

@eeckstein
Member Author

As I see it, all of the functions I annotated depend on the Value type.
In general, it's an interesting idea, because you probably could make such functions resilient without hurting performance.

@aroben

aroben commented Dec 14, 2018

Any idea what's causing the two -O performance regressions? Is it the removal of @inline(__always)?

And the -O code size regressions seem particularly surprising, given that this change should produce less code, not more.

@eeckstein
Member Author

The -O regressions could be noise (I didn't see them when running the benchmarks locally), e.g. because of different code alignment.
The -Osize regressions are because we now execute generic code instead of specialized code.
I experimented a lot to keep those regressions in an acceptable range. For example, Key-type related functions (e.g. lookup) are still specialized. Otherwise the regressions would be really bad.

@aroben

aroben commented Dec 14, 2018

Any idea what's going on with the "Code size: -O" regressions? For example, DictionaryKeysContains.o got about 2KB bigger.

@eeckstein
Member Author

This is a result of different inlining decisions.

@eeckstein eeckstein merged commit 18e1905 into apple:master Dec 14, 2018
@eeckstein eeckstein deleted the dictionary-code-size branch December 14, 2018 18:05