Skip to content

Conversation

lorentey
Copy link
Member

@lorentey lorentey commented Sep 27, 2018

(This part 2 of a resubmission of #19537; part 1 with the boring parts has landed already in #19574.)

Set and Dictionary invalidate all previously vended indices whenever one of these two operations occur:

  • The removal of any element
  • Resizing of the hash table

Currently, we attempt to detect if someone reuses an invalid index or uses an index from an unrelated collection by simply verifying that it points to an occupied bucket. This isn't nearly enough.

We can detect invalid indices with a much higher degree of confidence by including a mutation counter in the storage header of every set/dictionary, and inside every native index.

  • The hash value of the storage object identifier is a nice way to generate an initial value for the counter.
  • The counter in the header needs to be changed whenever the above operations occur.
  • For validation, the counter in the index needs to be compared to the one in the header.

Obviously this isn't a 100% foolproof method -- mutation counters can wrap around, or they can just happen to be initialized to the same value. However, checking the counter is a cheap and reasonably reliable way to catch accidental misuse:

var set = Set(0..<100)
let index = set.index(of: 42)!
set.remove(7)

// Before:
print(set[index]) // May print 42, some other number, or trap

// After:
print(set[index]) // ⚡️ "Attempting to access Dictionary elements using an invalid Index"

rdar://problem/18191576

The mutation count will allow us to recognize and trap on invalid indices more reliably. (However, it won’t be foolproof — some invalid indices may pass through the checks.)

- Change _scale to Int8 to make space for an extra Int32 without enlarging the storage header.
- Add an _age field inside the new gap.
- Initialize the age to a scrambled version of the storage address.
- Generate a new counter for every newly allocated storage instance, except for identical copy-on-write copies.
- Increment the mutation counter after every removal.
(For symmetry with Dictionary, which already has this check.)
Also add static_assert for Set/Dictionary storage header size.
Set and Dictionary will start using this shortly as their native index types.
@lorentey
Copy link
Member Author

@swift-ci please test

@lorentey
Copy link
Member Author

@swift-ci please benchmark

@lorentey
Copy link
Member Author

@swift-ci please test source compatibility

@lorentey lorentey changed the title [stdlib] Set, Dictionary: Radically improved index invalidation [stdlib] Set, Dictionary: Radically improved index validation Sep 27, 2018
@lorentey
Copy link
Member Author

Note: This is really only about native indices. Cocoa index validation and the interaction between Cocoa and native indices could use some improvements, too -- that's the next task on my list.

@swift-ci
Copy link
Contributor

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Regression
DictionarySwapAt 900 1094 +21.6% 0.82x
Improvement
DictionarySwapAtOfObjects 9704 7511 -22.6% 1.29x

Code size: -O

TEST OLD NEW DELTA RATIO
Regression
WordCount.o 68062 69262 +1.8% 0.98x
Improvement
SetTests.o 61793 60665 -1.8% 1.02x
DictionarySwap.o 30435 30043 -1.3% 1.01x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Regression
SetUnionBox0 851 955 +12.2% 0.89x
SetUnion_OfObjects 8453 9485 +12.2% 0.89x
FloatingPointPrinting_Float_description_small 5249 5662 +7.9% 0.93x
Improvement
DictionarySwapAtOfObjects 10016 7993 -20.2% 1.25x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Regression
WordCount.o 57359 59183 +3.2% 0.97x
Histogram.o 4368 4496 +2.9% 0.97x
CountAlgo.o 14568 14760 +1.3% 0.99x
TestsUtils.o 20258 20498 +1.2% 0.99x
Improvement
DictTest4Legacy.o 23286 22726 -2.4% 1.02x
DictionarySwap.o 26907 26619 -1.1% 1.01x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Improvement
DictionarySwapAt 7105 3286 -53.8% 2.16x
PrefixArrayLazy 33783 30918 -8.5% 1.09x
DropFirstArrayLazy 33793 30966 -8.4% 1.09x
CharIndexing_ascii_unicodeScalars 380330 349322 -8.2% 1.09x
CharIndexing_korean_unicodeScalars 369472 340572 -7.8% 1.08x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false alarms. Unexpected regressions which are marked with '(?)' are probably noise. If you see regressions which you cannot explain you can try to run the benchmarks again. If regressions still show up, please consult with the performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@lorentey
Copy link
Member Author

lorentey commented Sep 27, 2018

20:52:22 FAIL: fluent, 4.2, 270b6f, Swift Package

That's weird -- this is the same package that started failing after #19500 landed.

It failed only in the debug build, so this may actually be an unrelated a Hashable violation (i.e., a bug in the package). The detection of those is nondeterministic, so the failure would randomly come and go.

@lorentey
Copy link
Member Author

fluent had some failures in PR testing over the last few days, but it's unclear to me if #19500 was still in play. I'll take a closer look.

@lorentey
Copy link
Member Author

Hm, I can't seem to reproduce the failure with my local toolchain build.

@swift-ci please test source compatibility

…-level indices

Native sets and (especially!) native dictionaries must support indexing with Cocoa indices — indices must be preserved when a Cocoa set/dictionary is converted to native.

This is especially the case for Dictionaries that were converted because of a mutation restricted to values — such as those done through the Values view.
…sages

Capital Index seems too nitpicky, somehow
This makes it a little easier to follow validation logic.
Like before, allow the use of Cocoa indices to access native sets/dictionaries, but approximate the same mutation count-based check as we do for native indices.

- Ensure that native collections that were converted from Cocoa have an age generated from the address of the original Cocoa object.
- To get the age of a Cocoa index, regenerate one from the object embedded in it.
- Compare self.age to index.age and trap if it doesn’t match.

# Conflicts:
#	stdlib/public/core/HashTable.swift
Affected operations:

subscript(index:)
remove(at:)

Note that index(after:) is intentionally not on this list.
…tionaries

Affected operations:

subscript(index:)
keys.subscript(index:)
values.subscript(index:) (setter and _modify)
remove(at:)
swapAt(_:, _:)

Note that index(after:) is intentionally not on this list.
@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 46475d0

@lorentey
Copy link
Member Author

@swift-ci please test macOS platform

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 46475d0

@lorentey
Copy link
Member Author

Build failed: Swift Test Linux Platform

DictionaryStorage.swift:387:14: error: use of undeclared type '_CocoaDictionary'

There is a missing compile-time conditional; I'll push a fix once the other tests finish.

@lorentey
Copy link
Member Author

The source compat suite failed again. #19594 produced the same failure, which indicates it's an unrelated regression from earlier. (Quite probably caused by a Set/Dictionary bug, though.)

I'm working to reproduce & fix it. However, I think it might be best to land this so that I don't have to split my brain between two separate codebases.

@lorentey
Copy link
Member Author

@swift-ci please smoke test

@lorentey
Copy link
Member Author

@swift-ci please test linux platform

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - eb51801

@lorentey
Copy link
Member Author

@swift-ci please test linux platform

@lorentey
Copy link
Member Author

@swift-ci please smoke test

@lorentey
Copy link
Member Author

@swift-ci please smoke test compiler performance

@swift-ci
Copy link
Contributor

Build comment file:

Summary for master smoketest

Unexpected test results, excluded stats for Kingfisher, ReactiveCocoa

Regressions found (see below)

Debug

debug brief

Regressed (1)
name old new delta delta_pct
time.swift-driver.wall 26.0s 26.4s 369.6ms 1.42% ⛔
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (2)
name old new delta delta_pct
Frontend.NumInstructionsExecuted 239,506,293,312 239,561,225,385 54,932,073 0.02%
LLVM.NumLLVMBytesOutput 9,520,248 9,520,232 -16 -0.0%

debug detailed

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (23)
name old new delta delta_pct
AST.NumImportedExternalDefinitions 5,914 5,914 0 0.0%
AST.NumLoadedModules 1,510 1,510 0 0.0%
AST.NumTotalClangImportedEntities 18,273 18,273 0 0.0%
AST.NumUsedConformances 1,348 1,348 0 0.0%
IRModule.NumIRBasicBlocks 32,510 32,510 0 0.0%
IRModule.NumIRFunctions 16,738 16,738 0 0.0%
IRModule.NumIRGlobals 14,495 14,495 0 0.0%
IRModule.NumIRInsts 445,439 445,437 -2 -0.0%
IRModule.NumIRValueSymbols 29,953 29,953 0 0.0%
LLVM.NumLLVMBytesOutput 9,520,248 9,520,232 -16 -0.0%
SILModule.NumSILGenFunctions 7,845 7,845 0 0.0%
SILModule.NumSILOptFunctions 10,598 10,598 0 0.0%
Sema.NumConformancesDeserialized 33,653 33,647 -6 -0.02%
Sema.NumConstraintScopes 70,890 70,890 0 0.0%
Sema.NumDeclsDeserialized 270,048 270,354 306 0.11%
Sema.NumDeclsValidated 16,641 16,641 0 0.0%
Sema.NumFunctionsTypechecked 7,044 7,044 0 0.0%
Sema.NumGenericSignatureBuilders 11,099 11,099 0 0.0%
Sema.NumLazyGenericEnvironments 61,680 61,670 -10 -0.02%
Sema.NumLazyGenericEnvironmentsLoaded 7,183 7,183 0 0.0%
Sema.NumLazyIterableDeclContexts 45,494 45,634 140 0.31%
Sema.NumTypesDeserialized 115,709 115,729 20 0.02%
Sema.NumTypesValidated 14,185 14,185 0 0.0%

Release

release brief

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (3)
name old new delta delta_pct
Frontend.NumInstructionsExecuted 233,799,795,007 233,504,064,039 -295,730,968 -0.13%
LLVM.NumLLVMBytesOutput 10,577,480 10,572,888 -4,592 -0.04%
time.swift-driver.wall 43.3s 43.3s -1.9ms -0.0%

release detailed

Regressed (1)
name old new delta delta_pct
Sema.NumDeclsDeserialized 51,417 52,169 752 1.46% ⛔
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (22)
name old new delta delta_pct
AST.NumImportedExternalDefinitions 1,383 1,383 0 0.0%
AST.NumLoadedModules 100 100 0 0.0%
AST.NumTotalClangImportedEntities 4,743 4,761 18 0.38%
AST.NumUsedConformances 1,350 1,350 0 0.0%
IRModule.NumIRBasicBlocks 35,307 35,197 -110 -0.31%
IRModule.NumIRFunctions 15,055 15,055 0 0.0%
IRModule.NumIRGlobals 13,861 13,861 0 0.0%
IRModule.NumIRInsts 337,201 336,702 -499 -0.15%
IRModule.NumIRValueSymbols 27,852 27,852 0 0.0%
LLVM.NumLLVMBytesOutput 10,577,480 10,572,888 -4,592 -0.04%
SILModule.NumSILGenFunctions 6,074 6,074 0 0.0%
SILModule.NumSILOptFunctions 8,997 9,004 7 0.08%
Sema.NumConformancesDeserialized 16,908 16,906 -2 -0.01%
Sema.NumConstraintScopes 69,488 69,488 0 0.0%
Sema.NumDeclsValidated 10,362 10,362 0 0.0%
Sema.NumFunctionsTypechecked 4,360 4,360 0 0.0%
Sema.NumGenericSignatureBuilders 2,525 2,525 0 0.0%
Sema.NumLazyGenericEnvironments 11,143 11,146 3 0.03%
Sema.NumLazyGenericEnvironmentsLoaded 1,430 1,430 0 0.0%
Sema.NumLazyIterableDeclContexts 5,939 5,959 20 0.34%
Sema.NumTypesDeserialized 29,516 29,556 40 0.14%
Sema.NumTypesValidated 7,103 7,103 0 0.0%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants