
[DO NOT MERGE] Atomic experiments #27229

Closed · wants to merge 48 commits

Conversation

lorentey (Member) commented Sep 18, 2019

This PR explores some possible API design variations for exposing a limited set of atomic operations in the stdlib.

We need a stable address to perform atomic operations, so we won't be able to add the obvious atomic types (e.g. a safe AtomicInt) until we introduce support for move-only types in the language. However, we should still allow people to experiment with atomics by providing a limited set of atomic operations through unsafe pointer types. To make this more practical, we should investigate ways to reliably get at the address of a stored property of an object instance.

  1. Add FieldAccessor, a key-path based library construct to reliably get pointers to stored properties of a reference type.

    final public class AtomicCounter {
      var _countStorage: Int = 0

      static let _countField = FieldAccessor(for: \AtomicCounter._countStorage)

      public func increment() -> Int {
        return AtomicCounter._countField.withUnsafeMutablePointer(in: self) { ptr in
          UnsafeAtomicInt(ptr).fetchAndAdd(1, ordering: .relaxed)
        }
      }
    }
    
  2. Expose some (underscored) builtin atomic operations on UnsafeMutableRawPointer. (These all assume the pointer's alignment is suitable for such operations, and they all work with Int-sized values.) The memory ordering is baked into the method names, which makes these back-deployable to any version of the stdlib. Each method wraps a single Builtin atomic primitive operation.

    let raw: UnsafeMutableRawPointer = ...
    raw._atomicRelaxedLoadWord()
    raw._atomicLoadWord() // acquiring
    
    raw._atomicRelaxedStoreWord(42)
    raw._atomicStoreWord(42) // releasing
    
    _ = raw._atomicRelaxedExchangeWord(42)
    _ = raw._atomicAcquiringExchangeWord(42)
    _ = raw._atomicReleasingExchangeWord(42)
    _ = raw._atomicExchangeWord(42) // acquiring and releasing
    
    var expected = 0
    if raw._atomicRelaxedCompareExchangeWord(expected: &expected, desired: 42) { ... }
    if raw._atomicAcquiringCompareExchangeWord(expected: &expected, desired: 42) { ... }
    if raw._atomicReleasingCompareExchangeWord(expected: &expected, desired: 42) { ... }
    if raw._atomicCompareExchangeWord(expected: &expected, desired: 42) { ... } // acquiring and releasing
    
    _ = raw._atomicRelaxedFetchThenWrappingAddWord(1)
    _ = raw._atomicAcquiringFetchThenWrappingAddWord(1)
    _ = raw._atomicReleasingFetchThenWrappingAddWord(1)
    _ = raw._atomicFetchThenWrappingAddWord(1) // acquiring and releasing
    
    _ = raw._atomicRelaxedFetchThenWrappingSubtractWord(1)
    _ = raw._atomicAcquiringFetchThenWrappingSubtractWord(1)
    _ = raw._atomicReleasingFetchThenWrappingSubtractWord(1)
    _ = raw._atomicFetchThenWrappingSubtractWord(1) // acquiring and releasing
    
    _ = raw._atomicRelaxedFetchThenBitwiseAndWord(1)
    _ = raw._atomicAcquiringFetchThenBitwiseAndWord(1)
    _ = raw._atomicReleasingFetchThenBitwiseAndWord(1)
    _ = raw._atomicFetchThenBitwiseAndWord(1) // acquiring and releasing
    
    _ = raw._atomicRelaxedFetchThenBitwiseOrWord(1)
    _ = raw._atomicAcquiringFetchThenBitwiseOrWord(1)
    _ = raw._atomicReleasingFetchThenBitwiseOrWord(1)
    _ = raw._atomicFetchThenBitwiseOrWord(1) // acquiring and releasing
    
    _ = raw._atomicRelaxedFetchThenBitwiseXorWord(1)
    _ = raw._atomicAcquiringFetchThenBitwiseXorWord(1)
    _ = raw._atomicReleasingFetchThenBitwiseXorWord(1)
    _ = raw._atomicFetchThenBitwiseXorWord(1) // acquiring and releasing
  3. Introduce a new MemoryOrdering struct, representing LLVM's memory ordering levels, and add some convenience methods that take them. The struct works like a non-frozen enum, with the additional restriction that the representation of its cases won't ever change. (This allows better performance in debug builds.)

    Unfortunately MemoryOrdering is a new type, so it isn't directly back-deployable; however, code reads a lot better with the levels as arguments, so it makes sense to use it in higher level APIs, which will come with availability declarations anyway.

    @frozen
    @available(macOS 9999, iOS 9999, watchOS 9999, tvOS 9999, *)
    public struct MemoryOrdering {
      let _rawValue: Int
    
      /// Guarantees the atomicity of the specific operation on which it is applied,
      /// but imposes no ordering constraints on any other reads or writes.
      static var relaxed: MemoryOrdering { get }
    
      /// An acquiring load prevents subsequent load and store operations on the
      /// current thread from being reordered before it.
      static var acquiring: MemoryOrdering { get }
    
      /// A releasing store prevents previous load and store operations on the
      /// current thread from being reordered after it.
      static var releasing: MemoryOrdering { get }
    
      /// An acquiring-and-releasing operation prevents neighboring load and store
      /// operations on the current thread from being reordered across it.
      static var acquiringAndReleasing: MemoryOrdering { get }
    }
    
    _ = raw._atomicLoadWord(ordering: .acquiring)
    raw._atomicStoreWord(42, ordering: .releasing)
    _ = raw._atomicExchangeWord(23, ordering: .acquiringAndReleasing)
    _ = raw._atomicCompareExchangeWord(expected: &expected, desired: 42, ordering: .acquiring)
    _ = raw._atomicFetchThenWrappingAddWord(1, ordering: .relaxed)
    _ = raw._atomicFetchThenWrappingSubtractWord(1, ordering: .relaxed)
    _ = raw._atomicFetchThenBitwiseXorWord(42, ordering: .acquiringAndReleasing)
    _ = raw._atomicBitwiseAndThenFetchWord(93, ordering: .sequentiallyConsistent)
    _ = raw._atomicBitwiseXorThenFetchWord(42, ordering: .acquiringAndReleasing)
  4. Introduce a level of type safety and a tiny bit of abstraction by adding a handful of memory-unsafe atomic types. These have the same memory management concerns as an UnsafeMutablePointer; in fact they are simply wrappers around a raw pointer type.

    UnsafeAtomicInt      // load, store, exchange, CAS, &+, &-, &, |, ^
    UnsafeAtomicUInt     // load, store, exchange, CAS, &+, &-, &, |, ^
    UnsafeAtomicUnmanaged<T>      // load, store, exchange, CAS
    UnsafeAtomicUnsafeMutablePointer<T>      // load, store, exchange, CAS
    UnsafeAtomicInitializableReference<T>      // atomic init, load

Other variations are possible, too.

lorentey (Member, Author)

@swift-ci test

swiftlang deleted a comment from swift-ci Sep 20, 2019
swiftlang deleted a comment from swift-ci Sep 20, 2019
lorentey (Member, Author) commented Oct 8, 2019

The latest commit replaces MemoryOrdering with ordering views on the UnsafeAtomicUInt type:

let counter: UnsafeAtomicUInt = …

// TIRED
counter.increment(by: 23, ordering: .releasing)

// WIRED
counter.releasing.increment(by: 23)

This gets rid of switch statements, and reads pretty well, but it considerably increases API surface area.

lorentey (Member, Author)

While ordering views look promising, there are some major problems with them:

  1. There is at least one useful atomic operation that needs two distinct orderings: compare and exchange with a separate ordering for the failure case. While specifying one ordering looks okay (nice, even), listing two of them looks rather weird to me:

    if counter.acquiringAndReleasing.releasingOnFailure.compareExchange(
        expected: &expected, 
        desired: 42
    ) { ... }
  2. The API surface area for ordering views is unjustifiably large. The gyb-based implementation above generates 27 public structs with 175 public funcs. This compares unfavorably with the earlier MemoryOrdering struct (6 structs, 53 funcs). As we add more operations, the difference will grow even further, since the views basically act as an API multiplier.

    This can be partially mitigated by reintroducing MemoryOrdering as a protocol, and expressing the ordering views as generics over it. However, that approach has its own problems, especially around API design complexity. (One example: it is desirable that ordering views only provide operations that make sense for them (e.g., counter.releasing.load() shouldn't compile), but to express this in the type system, we'd need to set up a whole new protocol hierarchy.)

    Additionally, making ordering views generic effectively replaces the earlier (hopefully compiler-evaluated) switch statements with (hopefully compiler-evaluated) generic dispatch -- and I'm not sure this would count as an overall improvement.

  3. Extensibility is an issue. People should be able to define their own atomic constructs outside of the Standard Library, with APIs that match the stdlib's. Implementing ordering views is complicated and boilerplate-heavy, even when they're built around generics. Meanwhile, passing around an ordering value is easy to understand and to implement.

lorentey (Member, Author) commented Oct 18, 2019

Things are back in motion. The last commit makes four major changes, with an eye towards getting this ready to be pitched.


First, I returned to representing memory orderings with regular arguments. However, we now have three separate ordering types based on the nature of the operation:

struct AtomicLoadOrdering {
  static var relaxed: AtomicLoadOrdering { get }
  static var acquiring: AtomicLoadOrdering { get }
}
struct AtomicStoreOrdering {
  static var relaxed: AtomicStoreOrdering { get }
  static var releasing: AtomicStoreOrdering { get }
}
struct AtomicUpdateOrdering {
  static var relaxed: AtomicUpdateOrdering { get }
  static var acquiring: AtomicUpdateOrdering { get }
  static var releasing: AtomicUpdateOrdering { get }
  static var acquiringAndReleasing: AtomicUpdateOrdering { get }
}

This lets us enforce that each operation can only be called with orderings it can support, which was one of the two major advantages of ordering views. (The other advantage was not relying on the optimizer's constant folding to eliminate the switch statements in the implementation; unfortunately, those are back now.)


Second, the four high-level atomic structures now include not only an unsafe pointer to the memory location that stores their value, but also an anchor reference to an object that keeps it alive. This makes them entirely safe, so they lose the Unsafe prefix and we are left with AtomicInt, AtomicUInt, AtomicUnsafeMutablePointer and AtomicUnmanaged.

Note that using these names now may turn out to be unfortunate when and if we get non-copyable types, which would potentially provide a better implementation.


Third, we introduced an Anchored protocol for things that represent values through their address, as well as an associated Anchoring property wrapper:

protocol Anchored {
  associatedtype Value

  static var defaultInitialValue: Value { get }
  init(at address: UnsafeMutablePointer<Value>, in anchor: AnyObject)
}

@propertyWrapper
struct Anchoring<Thing: Anchored> {
  var _storage: Thing.Value

  init() {
    self._storage = Thing.defaultInitialValue
  }

  init(_ value: Thing.Value) {
    self._storage = value
  }

  static subscript<Anchor: AnyObject>(
    _enclosingInstance anchor: Anchor,
    wrapped wrappedKeyPath: ReferenceWritableKeyPath<Anchor, Thing>,
    storage storageKeyPath: ReferenceWritableKeyPath<Anchor, Self>
  ) -> Thing {
    _read {
      let keyPath = storageKeyPath.appending(path: \._storage)
      let p = keyPath._directAddress(in: anchor)!
      yield Thing(at: p, in: anchor)
    }
  }
}

struct AtomicInt: Anchored {...}
struct AtomicUInt: Anchored {...}
struct AtomicUnsafeMutablePointer<Pointee>: Anchored {...}
struct AtomicUnmanaged<Instance: AnyObject>: Anchored {...}

Anchored isn't specific to atomics -- it can be implemented by anything that needs to have a stable address:

struct UnfairLock: Anchored {
  typealias Value = os_unfair_lock_s

  let _anchor: AnyObject
  let _ptr: UnsafeMutablePointer<Value>

  static var defaultInitialValue: Value { os_unfair_lock_s() }
  init(at address: UnsafeMutablePointer<Value>, in anchor: AnyObject) {
    _anchor = anchor
    _ptr = address
  }

  func lock() { os_unfair_lock_lock(_ptr) }
  func unlock() { os_unfair_lock_unlock(_ptr) }
}

The point of all this is that use sites become a lot more pleasant to read and write:

class Foo {
  @Anchoring(42) var counter: AtomicInt
  @Anchoring var lock: UnfairLock
}

func doSomething(foo: Foo) {
  foo.counter.wrappingIncrement()
  foo.lock.lock()
  defer { foo.lock.unlock() }
  print("I'm holding the lock right now")
}

Unfortunately, currently this has terrible performance, and it needs some compiler work to make it practical.

  • Constructing the key path to the underlying storage variable and extracting its address needs to be a compile-time operation. (We could previously hide the cost of doing this at runtime in FieldAccessor's initializer; Anchoring currently does it on every single access.)
  • New with Anchored: storing a new reference to the anchor class in a temporary anchored value must not result in any additional retain/release traffic, unless the temporary is copied out into its own variable. (The _read accessor above is intended to make this easier.)
  • The Anchoring wrapper has some serious limitations:
    • It uses the non-public subscript(_enclosingInstance:wrapped:storage:) feature.
    • The enclosing type must be a class, but this is only enforced with a runtime trap. Trying to use it outside of a class needs to be diagnosed as a compile-time error.
    • We should be statically requiring the storage key path to point to a stored (as opposed to computed) property. (Although currently I don't think this requirement can be violated when Anchored is used as a wrapper.)

Finally, I added some convenience operations to increment/decrement atomic integers.

extension Atomic[U]Int {
  func wrappingIncrement(
    by delta: [U]Int = 1,
    ordering: AtomicUpdateOrdering = .acquiringAndReleasing
  ) {...}

  func wrappingDecrement(
    by delta: [U]Int = 1,
    ordering: AtomicUpdateOrdering = .acquiringAndReleasing
  ) {...}
}

The intent here is that counter.wrappingIncrement() will be somewhat easier to understand than counter.fetchThenWrappingAdd(1).

Add FieldAccessor, a construct that enables pointer operations on stored properties inside class instances.

The idea is to initialize a field accessor struct once for each atomic stored property. Once initialized, the field accessor provides efficient atomic operations. (It simply caches the offset of the stored property within the class, then uses it to derive a direct pointer to the stored property.)

This works okay, but it’s quite boilerplatey. Also, leaving the raw non-atomic stored properties visible is a bad idea.

final class Foo {
  private var _v: Int = 0
  private var _w: Int = 0

  private static let _vField = FieldAccessor<Foo>(for: \Foo._v)
  private static let _wField = FieldAccessor<Foo>(for: \Foo._w)
}

func test(_ foo: Foo) {
  Foo._vField.withUnsafeMutablePointer(in: foo) { ptr in
    ...
  }
}
UnsafeAtomicInt
UnsafeAtomicUInt
UnsafeAtomicBool
UnsafeAtomicUnmanaged
UnsafeAtomicUnsafeMutablePointer
UnsafeAtomicInitializableReference
- Use UInt for Word-based atomics
- Clarify atomic add operation by calling it “wrapping add”
- Remove “atomic” prefix from members of UnsafeAtomic* types
- Re-word AtomicMemoryOrdering docs
Convert the AtomicMemoryOrdering enum to a frozen struct with transparent static properties for the old cases. This enables DCE to kick in during debug builds, allowing these switch statements to compile down to the specific case.
It is attractive, but it is too expensive for practical use, and may not be supported on all architectures.

This leaves us with the following levels:

- relaxed
- acquiring (default for loads)
- releasing (default for stores)
- acquiring and releasing (default for read-modify-write operations)
It’s not layout-compatible with an actual Bool type, so this formulation is obviously broken. (D’oh.)

We could switch to single byte atomics, but it doesn’t seem worth the complexity. (Also, it’s unclear to me if that would have unusual alignment expectations or if it would add interference issues.)
…s rather than using an enum

Instead of

    let counter: UnsafeAtomicUInt = …
    counter.increment(by: 23, ordering: .releasing)

we now have this:

    counter.releasing.increment(by: 23)

This gets rid of switch statements, and reads pretty well, but it considerably increases API surface.
Let’s try keeping things explicit.

Requiring an explicit ordering may either be an overall readability improvement, or it may be too much noise. We’ll see with practice.
wrappingAdd(_:) → wrappingIncrement(by:)
wrappingSubtract(_:) → wrappingDecrement(by:)

Always name the ordering on UnsafeMutableRawPointer APIs
lorentey (Member, Author)

@MadCoder This is terrific feedback, thanks. Reverting the removal of .sequentiallyConsistent is easy enough; it's nice to see confirmation that there are real life use cases for it.

I'll try replacing compareExchange with the following two methods:

func compareAndStore(expected: Int, desired: Int, ordering: AtomicUpdateOrdering) -> Bool
func compareAndExchange(expected: Int, desired: Int, ordering: AtomicUpdateOrdering) -> (swapped: Bool, original: Int)

(The names aren't great; I expect we'll have ample opportunity to come up with a better naming scheme on the forums.)

Interestingly, I originally went with the inout version specifically because it felt less annoying in toy examples, despite the obvious inconsistency with func exchange. But now I'd argue the inout variant actually reads much worse:

extension UnsafeAtomicInt {
  // BEFORE
  func myIncrement(ordering: AtomicUpdateOrdering) {
    var expected = load(ordering: .relaxed)
    while !compareExchange(
      expected: &expected, 
      desired: expected + 1, 
      ordering: ordering) { }
  }

  // AFTER
  func myIncrement(ordering: AtomicUpdateOrdering) {
    var done = false
    var value = load(ordering: .relaxed)
    while !done {
      (done, value) = compareAndExchange(
        expected: value, 
        desired: value + 1,
        ordering: ordering)
    }
  }
}

glessard (Contributor)

I'd nitpick the label for the returned Bool: why name it "swapped" when the name of the operation is "exchange"?

glessard (Contributor)

I'm unsure of the appropriateness of having the 'load' operation on the Unmanaged atomic wrapper. This would lead to people reproducing the pre-Swift-3 weak reference bug where one thread lowers the reference count to zero while another thread is trying to increment it. Anything with a refcount of 1 shouldn't be interacted with from more than one thread at a time, and the way to encourage that with an atomic unmanaged is to restrict the operations to exchanges.

lorentey (Member, Author)

@glessard The reason I don't think this is as big a problem as it first appears is that extracting a strong reference from an UnsafeAtomicUnmanaged requires two very explicit steps:

let _ref: UnsafeAtomicUnmanaged<Foo> = ...
let ref = _ref.load(ordering: .acquiring).takeUnretainedValue()

To me it seems obvious that the two steps aren't going to happen in a single atomic transaction.

Now of course, the UnsafeAtomicUnmanaged construct does not help at all with the problem of memory reclamation, and that is a major, major limitation. (UnsafeAtomicMutablePointer has the same issue.) These two types have essentially no room for any additional bits to implement inline refcounts or version stamps. (So they provide no ABA mitigation, either.)

My view is that the initial wave of atomics will be mostly about establishing the basics -- memory orderings and a naming scheme. A subsequent second wave will flesh out the feature by introducing actually useful atomic types (stamped pointers, atomic strong references, etc.) built around double-wide atomics.

The reason I think we need a delay here is that I expect the double-wide atomic types won't maintain a direct match between the logical value held by the atomic struct (say, a strong reference) and the in-memory representation (say, some encoding of a (reference, inline refcount, stamp) tuple), so trying to model them with unsafe pointer APIs would be unnecessarily complicated. I'd much rather wait for move-only types -- I expect those will provide a more appropriate model.

lorentey (Member, Author) commented Feb 19, 2020

(Not to mention that it isn't clear to me exactly what set of double-wide atomic types we would need to add. I also don't know yet if a universal atomic strong reference would scale well enough to be more practical than a regular strong reference + an unfair lock.)

Introduce a new UnsafeAtomic<Value> generic struct; use it to define atomic operations on Int8, Int16, Int32, Int64, UInt8, UInt16, UInt32, UInt64 in addition to the existing Int/UInt.

Replace UnsafeAtomic[U]Int with UnsafeAtomic<[U]Int>.
lorentey marked this pull request as draft April 9, 2020 22:44
shahmishal (Member)

Please update the base branch to main by Oct 5th otherwise the pull request will be closed automatically.

  • How to change the base branch: (Link)
  • More detail about the branch update: (Link)

lorentey (Member, Author) commented Oct 2, 2020

Closing -- these experiments have culminated in the release of the Atomics package, at least for now.

lorentey closed this Oct 2, 2020