Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom Hashable implementation in protocol extension causes infinite loop when used by RawRepresentable types #59780

Open
retendo opened this issue Jun 29, 2022 · 10 comments
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. compiler The Swift compiler itself derived conformances Feature → protocol → conformances: derived conformances aka synthesized conformances type checker Area → compiler: Semantic analysis

Comments

@retendo
Copy link

retendo commented Jun 29, 2022

(From https://forums.swift.org/t/possible-bug-in-swift-5-6-stdlib-custom-hashable-implementation-in-protocol-extension-causes-infinite-loop-when-used-by-rawrepresentable-types/58536)

Describe the bug
A custom Hashable implementation in a protocol extension causes an infinite loop when it is used by types that also conform to RawRepresentable.

To Reproduce
Steps to reproduce the behavior:

  1. Create a new interface builder based iOS app
  2. Navigate to ViewController.swift
  3. Paste the following code and run the app on an iOS 15.5 device/simulator:
import UIKit

protocol P: Hashable {
    var name: String { get }
}
extension P {
    func hash(into hasher: inout Hasher) {
        hasher.combine(name)
    }
}

enum E: String, P {
    case a
    var name: String {
        rawValue
    }
}

struct S: RawRepresentable, P {
    let rawValue: String
    init?(rawValue: String) {
        self.rawValue = rawValue
    }
    var name: String {
        rawValue
    }
}

struct SNonRaw: P {
    let rawValue: String
    init?(rawValue: String) {
        self.rawValue = rawValue
    }
    var name: String {
        rawValue
    }
}

class ViewController: UIViewController {

    override func viewDidLoad() {
        super.viewDidLoad()
        // Do any additional setup after loading the view.
        
        print("BEFORE NON RAW STRUCT")
        
        let _ = Set<SNonRaw>([SNonRaw(rawValue: "a")!, SNonRaw(rawValue: "a")!, SNonRaw(rawValue: "a")!, SNonRaw(rawValue: "a")!, SNonRaw(rawValue: "a")!])
        
        print("AFTER NON RAW STRUCT")
        
        print("BEFORE STRUCT")
        
        let _ = Set<S>([S(rawValue: "a")!, S(rawValue: "a")!, S(rawValue: "a")!, S(rawValue: "a")!, S(rawValue: "a")!])
        
        print("AFTER STRUCT")
        
        print("BEFORE ENUM")

        let _ = Set<E>([.a, .a, .a, .a, .a, .a, .a, .a, .a, .a, .a])

        print("AFTER ENUM")
    }
}
  1. Observation: The app hangs and the output will be the following
BEFORE NON RAW STRUCT
AFTER NON RAW STRUCT
BEFORE STRUCT
  1. If you stop the app you will get a Problem Report window that gives you the following stacktrace:
Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libswiftCore.dylib            	    0x7fff309e094a RawRepresentable<>._rawHashValue(seed:) + 74
1   HashableRawRepresentable      	       0x109753bd4 protocol witness for Hashable._rawHashValue(seed:) in conformance S + 52
2   HashableRawRepresentable      	       0x1097539fd S.hash(into:) + 109
3   HashableRawRepresentable      	       0x109753b91 protocol witness for Hashable.hash(into:) in conformance S + 17
4   libswiftCore.dylib            	    0x7fff309e0974 RawRepresentable<>._rawHashValue(seed:) + 116
5   HashableRawRepresentable      	       0x109753bd4 protocol witness for Hashable._rawHashValue(seed:) in conformance S + 52
6   HashableRawRepresentable      	       0x1097539fd S.hash(into:) + 109
7   HashableRawRepresentable      	       0x109753b91 protocol witness for Hashable.hash(into:) in conformance S + 17
8   libswiftCore.dylib            	    0x7fff309e0974 RawRepresentable<>._rawHashValue(seed:) + 116
9   HashableRawRepresentable      	       0x109753bd4 protocol witness for Hashable._rawHashValue(seed:) in conformance S + 52
10  HashableRawRepresentable      	       0x1097539fd S.hash(into:) + 109
11  HashableRawRepresentable      	       0x109753b91 protocol witness for Hashable.hash(into:) in conformance S + 17
12  libswiftCore.dylib            	    0x7fff309e0974 RawRepresentable<>._rawHashValue(seed:) + 116
13  HashableRawRepresentable      	       0x109753bd4 protocol witness for Hashable._rawHashValue(seed:) in conformance S + 52
14  HashableRawRepresentable      	       0x1097539fd S.hash(into:) + 109
15  HashableRawRepresentable      	       0x109753b91 protocol witness for Hashable.hash(into:) in conformance S + 17
16  libswiftCore.dylib            	    0x7fff309e0974 RawRepresentable<>._rawHashValue(seed:) + 116
17  HashableRawRepresentable      	       0x109753bd4 protocol witness for Hashable._rawHashValue(seed:) in conformance S + 52
... (continues forever...)

Expected behavior
I would expect the app to continue running and the console output to be:

BEFORE NON RAW STRUCT
AFTER NON RAW STRUCT
BEFORE STRUCT
AFTER STRUCT
BEFORE ENUM
AFTER ENUM

Environment (please complete the following information):

  • OS: iOS 15.5
  • Xcode Version/Tag/Branch: 13.4.1

Additional context

  • If you run this on a iOS 15.2 device/simulator everything works as expected.
  • Another workaround to make this run as expected is to basically revert f8d72f8 by adding this somewhere in your project:
extension RawRepresentable where RawValue: Hashable, Self: Hashable {
    var hashValue: Int {
        return rawValue.hashValue
    }
}
@retendo retendo added the bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. label Jun 29, 2022
@lorentey lorentey added type checker Area → compiler: Semantic analysis standard library Area: Standard library umbrella labels Jun 30, 2022
@lorentey
Copy link
Member

Fascinating!

IIUC, the problem appears to be that struct S gets two separate default implementations of hash(into:): one from protocol P in the code above, the other from RawRepresentable within the stdlib. For some reason, the type checker selects neither one of these two -- instead, the hash(into:) requirement gets synthesized from the hashValue implementation in RawRepresentable, which ultimately ends up calling hash(into:), leading to an infinite recursion.

The compiler does emit a warning that indicates that this happened, but it's a bit subtle:

Swift.RawRepresentable:2:27: warning: 'Hashable.hashValue' is deprecated as a protocol requirement; conform type 'E' to 'Hashable' by implementing 'hash(into:)' instead
    @inlinable public var hashValue: Int { get }
                          ^

Cc @hborla & @slavapestov for help -- I can't explain why the compiler prefers to synthesize hash(into:) instead of selecting one of its two default implementations. 🤔

(enum E has the same behavior.)

Meanwhile, for protocols that refine Hashable and provide default implementations for its requirements, I strongly recommend always implementing all three of static func ==, func hash(into:) and var hashValue -- omitting any of these three will generally cause problems.

extension P {
  static func ==(left: Self, right: Self) -> Bool {
    left.name == right.name
  }

  func hash(into hasher: inout Hasher) {
    hasher.combine(name)
  }

  var hashValue: Int {
    name.hashValue
  }
}

Weirdly, doing this seems to resolve the issue, at least superficially. (Which indicates that there is no underlying ambiguity, which makes it even more mysterious to me why the compiler synthesizes hash(into:).)

Evidently with these changes, the compiler uses P.== for the Equatable conformance, but RawRepresentable.hash(into:) and RawRepresentable.hashValue for Hashable, which definitely isn't right -- all three implementations should come from a consistent source.

For reference, as of Swift 5.5, the stdlib has:

@inlinable // trivial-implementation
public func == <T: RawRepresentable>(lhs: T, rhs: T) -> Bool
  where T.RawValue: Equatable {
  return lhs.rawValue == rhs.rawValue
}

extension RawRepresentable where RawValue: Hashable, Self: Hashable {
  @inlinable // trivial
  public var hashValue: Int {
    // Note: in Swift 5.5 and below, this used to return `rawValue.hashValue`.
    // The current definition matches the default `hashValue` implementation,
    // so that RawRepresentable types don't need to implement both `hashValue`
    // and `hash(into:)` to customize their hashing.
    _hashValue(for: self)
  }

  @inlinable // trivial
  public func hash(into hasher: inout Hasher) {
    hasher.combine(rawValue)
  }

  @inlinable // trivial
  public func _rawHashValue(seed: Int) -> Int {
    // In 5.0, this used to return rawValue._rawHashValue(seed: seed).  This was
    // slightly faster, but it interfered with conforming types' ability to
    // customize their hashing. The current definition is equivalent to the
    // default implementation; however, we need to keep the definition to remain
    // ABI compatible with code compiled on 5.0.
    //
    // Note that unless a type provides a custom hash(into:) implementation,
    // this new version returns the same values as the original 5.0 definition,
    // so code that used to work in 5.0 remains working whether or not the
    // original definition was inlined.
    //
    // See https://bugs.swift.org/browse/SR-10734
    var hasher = Hasher(_seed: seed)
    self.hash(into: &hasher)
    return hasher._finalize()
  }
}

The inconsistency with == likely stems from the fact that func == above is a top-level function. (RawRepresentable isn't explicitly refining Equatable nor Hashable, it merely provides default implementations for their requirements, in a misguided effort to be "helpful".)

At this point, I think I'm going to stop tweaking RawRepresentable's Hashable implementations in the Standard Library -- every couple of releases we add something new to it to patch up a bug, and the fix later turns out to open two brand new corner cases. 😝

@lorentey
Copy link
Member

lorentey commented Jun 30, 2022

(Which indicates that there is no underlying ambiguity, which makes it even more mysterious to me why the compiler synthesizes hash(into:).)

Ah, that isn't right -- the two competing default implementations of hash(into:) have equal priority, so there is an ambiguity. The same isn't true for hashValue, for which there is a single implementation.

It appears that the ambiguity for hash(into:) leads the compiler to choose to synthesize it instead, which feels wrong to me -- we should instead get a diagnostic about the multiple matching functions found for that requirement. The synthesis logic is probably in need of a fix. 😞

If we moved compiler synthesis out of the picture, we do get the expected error:

protocol H {
  func foo() -> String
}

protocol P: H {
  var name: String { get }
}

extension P {
  func foo() -> String { name }
}

protocol R {
  var rawValue: String { get }
}

extension R {
  func foo() -> String { rawValue }
}

struct S: P, R {
  var rawValue: String
  var name: String { rawValue }
}
i.swift:21:8: error: type 'S' does not conform to protocol 'H'
struct S: P, R {
       ^
i.swift:2:8: note: multiple matching functions named 'foo()' with type '() -> String'
  func foo() -> String
       ^
i.swift:18:8: note: candidate exactly matches
  func foo() -> String { rawValue }
       ^
i.swift:10:8: note: candidate exactly matches
  func foo() -> String { name }
       ^

@lorentey
Copy link
Member

Reduced reproducer for the original issue:

protocol P: Equatable {}
protocol Q: Equatable {}

extension P {
  static func ==(left: Self, right: Self) -> Bool {
    print("P.==")
    return true
  }
}


extension Q {
  static func ==(left: Self, right: Self) -> Bool {
    print("Q.==")
    return true
  }
}

struct S: P, Q {}

print(S() == S())

My expectation is that this would fail to compile due to the ambiguous definition for ==; however, struct S gets a synthesized implementation instead.

@xwu
Copy link
Collaborator

xwu commented Jun 30, 2022

Mmm, this is arguably defensible behavior (I will explain below) and I bet there are at least a number of people whose code depends on it. Rather than making it fail to compile, I think a warning with different options to silence (I will explain below) would be a way to square the circle, so to speak:

In your reduced example with struct S: P, Q, if Swift had no compiler synthesis for Equatable, then the way to resolve an error arising from two equally good and therefore ambiguous default implementations of == inherited from P and Q would be to write your own concrete implementation of S.==.

Compiler synthesis takes place for that concrete implementation when (a) an implementation is required to make code compile and (b) it is not written out explicitly. The scenario above meets these criteria.

If we were implementing things from scratch, an argument could be made that the compiler never synthesizes an implementation for the concrete conforming type when there is more than one equally good default implementation. However:

  1. I can guarantee per Hyrum's Law that there are people who rely on this for their code to compile. The number of people for whom the compiler synthesis feature as currently implemented causes an infinite loop is going to be (by necessity and construction) a very small minority of those for whom the feature causes their code to compile and behave "normally," who wouldn't like it very much if their code stopped compiling. Since an infinite loop halts execution and is detectable, I think pulling out the rug entirely would be too extreme.

  2. It may not even be a good choice to cause a compilation failure in such cases of ambiguity, even if we were starting from scratch. This is because the standard library (in this case) and likely many other libraries consider adding default implementations in protocol extensions to be generally benign. Now consider if P and Q were third-party protocols in libraries from different vendors, each refining Equatable (which are safe to vend), but P initially has a default implementation for == and Q does not. It should be safe as an app developer using both libraries to conform your own type S to both P and Q. Therefore, it would be awkward if the vendor of Q adds a default implementation for == only in a later version of the library and it then causes their users' source code to stop compiling—but only if they also conform their type to P (or another protocol that refines Equatable and provides default implementations). Neither library vendors nor the consumer of the library have individually done anything "wrong" or unsupported, but the compiler would stop compiling.

Instead, I wonder if we could consider the following:

  • When the compiler synthesizes a concrete implementation to break the tie between two or more ambiguous inherited default implementations, issue a warning stating that fact explicitly.
  • Give two fix-its to silence the warning: (a) insert an extension explicitly restating the concrete type's conformance to Equatable (or whichever other protocol conformance is being synthesized) with no implementation, in order to have the compiler continue synthesizing a conformance while silencing the compiler warning; or (b) insert an extension with each tie-breaking implementation skeletonized with placeholders so the user can implement these manually for themselves.

@AnthonyLatsis
Copy link
Collaborator

AnthonyLatsis commented Jun 30, 2022

It should be safe as an app developer using both libraries to conform your own type S to both P and Q.

I concur with Xiaodi in the intermodular case, except I would rather make it an error (same fix-it) if at least one of the ambiguous candidates is declared in the current module.

@AnthonyLatsis AnthonyLatsis added conformances Feature → protocol: protocol conformances derived conformances Feature → protocol → conformances: derived conformances aka synthesized conformances and removed conformances Feature → protocol: protocol conformances standard library Area: Standard library umbrella labels Jun 30, 2022
@lorentey
Copy link
Member

lorentey commented Jun 30, 2022

This seems a pretty clear cut case of a missing diagnostic to me. Member requirements are only ever intended to be synthesized when there is no implementation visible -- most certainly not when there are multiple competing choices.

(Source: I implemented one of the iterations of hash(into:) synthesis. This behavior is a bug.)

The right way to fix this issue is the same as with any other bug of this nature: keep the existing behavior but emit a compiler warning in Swift 5.x, then later upgrade the warning to an error in Swift 6 language mode, once we have such a thing.

Please let's not overthink this.

@xwu
Copy link
Collaborator

xwu commented Jul 1, 2022

My concern is that the end user writing the concrete type should be able to get a synthesized protocol conformance somehow in Swift 6. I don’t think having two or more viable defaults should mean that this useful functionality should be suppressed with no way to get it back.

I understand that shipping such functionality was not intended in the first place, but now that it’s been shipped, those relying on it for their code to compile shouldn’t be punished for doing so where the fault isn’t theirs and where the synthesized code doesn’t cause problems.

Sympathetic with keeping the fix as simple as possible, though, for sure.

@lorentey
Copy link
Member

lorentey commented Jul 1, 2022

Yep, we should have syntax to explicitly request a synthesized implementation for a type member, no matter what default implementations are available. (And/or, to have the same implementation generated as explicit code by a refactoring tool in the development environment.)

But the fix for this need not wait until this gets implemented -- it would indeed be nice to have a fix-it to get the current behavior for cases when that is desirable, but it's by no means essential.

(Personally, I'd prefer if we just forced people to spell out the conforming implementation rather than relying on heuristics such as the protocol being explicitly included on the type's conformance list. The stakes are pretty high here, and the current behavior is broken and very surprising.)

@xwu
Copy link
Collaborator

xwu commented Jul 5, 2022

For sure, I'm in no way suggesting a need to implement the general case of explicitly requesting a synthesized implementation in order to fix a bug.

@AnthonyLatsis AnthonyLatsis added the compiler The Swift compiler itself label Dec 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. compiler The Swift compiler itself derived conformances Feature → protocol → conformances: derived conformances aka synthesized conformances type checker Area → compiler: Semantic analysis
Projects
None yet
Development

No branches or pull requests

4 participants