Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generated inner constructor should have parameters constrained. #49

Open
gafter opened this issue Sep 8, 2023 · 10 comments
Open

Generated inner constructor should have parameters constrained. #49

gafter opened this issue Sep 8, 2023 · 10 comments

Comments

@gafter
Copy link
Member

gafter commented Sep 8, 2023

According to the Julia documentation at https://docs.julialang.org/en/v1/manual/constructors/#man-inner-constructor-methods

If any inner constructor method is defined, no default constructor method is provided: it is presumed that you have supplied yourself with all the inner constructors you need. The default constructor is equivalent to writing your own inner constructor method that takes all of the object's fields as parameters (constrained to be of the correct type, if the corresponding field has a type), and passes them to new, returning the resulting object:

But @auto_hash_equals generates the constructor with unconstrained parameter types. Those parameters should be constrained.

@gafter
Copy link
Member Author

gafter commented Sep 8, 2023

This is a slightly breaking change. The following no longer runs:

@auto_hash_equals_cached struct S268
    x::Int
end
@assert S268(2.0).x === 2 # MethodError: no method matching Main.A.S268(::Float64)

@auto_hash_equals_cached struct S269{T <: Any}
    x::T
end
@assert S269{Int}(2.0).x === 2 # MethodError: no method matching Main.A.S269{Int64}(::Float64)
@assert S269(2.0).x === 2.0

So we would have to bump the major version number.

@mcmcgrath13
Copy link
Collaborator

mcmcgrath13 commented Sep 8, 2023

I tried playing with this as it surprised me a bit and I don't see Julia enforcing that behavior with the default constructor. Are we sure we want to do this?

julia> struct Foo{T}
       a::T
       end

julia> Foo(2.0)
Foo{Float64}(2.0)

julia> Foo{Int}(2.0)
Foo{Int64}(2)
julia> struct Bar
       a::Int
       end

julia> Bar(2.0)
Bar(2)

@gafter
Copy link
Member Author

gafter commented Sep 8, 2023

@mcmcgrath13 The repl has different behavior than the language elsewhere. I think this is documented somewhere. To see the "proper" Julia behavior, you need to put the code in a source file and include it.

(which I "corrected" to)

Actually, the Julia documentation cited in the OP is not correct. Julia does NOT put type constraints on the parameters of the implicit constructor:

struct S1
    x::Int
end
S1(2.0) # OK

struct S2
    x::Int
    S2(x::Int) = new(x)
end
S2(2.0) # error!

This is the same in or out of the REPL.

After further research, I wrote JuliaLang/julia#51253

@gafter
Copy link
Member Author

gafter commented Sep 9, 2023

I have offered to fix Julia's documentation at JuliaLang/julia#51253.

Whatever the correct specification of the correct behavior of Julia is, this package should

  1. Imitate that behavior when no inner constructor is provided, and
  2. Imitate the default behavior of the language when an inner constructor is provided.

I suspect that fixing this issue may permit this package to interoperate with other macros better.

@timholy
Copy link

timholy commented Sep 13, 2023

I don't see anything particularly wrong with the constructor:

julia> macroexpand(Main, quote
       @auto_hash_equals cache=true struct S269{T<:Integer}
           x::T
       end
       end)
quote
    #= REPL[9]:2 =#
    begin
        #= REPL[9]:2 =#
        struct S269{T <: Integer}
            #= REPL[9]:3 =#
            x::T
            _cached_hash::UInt
            function S269{T}(x) where T <: Integer
                #= /home/tim/.julia/packages/AutoHashEquals/8dq3O/src/impl.jl:256 =#
                #= /home/tim/.julia/packages/AutoHashEquals/8dq3O/src/impl.jl:257 =#
                new(x, (Base).hash(x, 0xa63963209f14699e))
            end
        end
...

Maybe it could be improved like this:

function S269{T}(x) where T <: Integer
    y = convert(T, x)
    new(y, (Base).hash(y, 0xa63963209f14699e))
end

That would eliminate a corner case: T(x1) === T(x2) but typeof(x1) !== typeof(x2) and the two hash differently (which would indeed be bad).

@timholy
Copy link

timholy commented Sep 14, 2023

In fact, it doesn't even need to be two different types:

julia> @auto_hash_equals cache=true struct S{T}
           x::T
       end
S

julia> s1, s2 = S{Float32}(1.0), S{Float32}(1.0+eps())
(S{Float32}(1.0f0), S{Float32}(1.0f0))

julia> s1 == s2
false

julia> s1.x == s2.x
true

This happens because 1.0 and 1.0+eps() both round to 1.0f0, but the Float64-based hash value is different. If you performed the conversion first, and then hashed, things would work correctly. Likewise, omitting the cache=true causes things to work properly.

This is why a good mental model for default constructors is the following (where all constructors are explicit):

struct S{T}
    x::T

    S{T}(x::T) where T = new{T}(x)        # constructor 1
end
S{T}(x) where T = S{T}(convert(T, x)::T)  # constructor 2
S(x) = S{typeof(x)}(x)                    # constructor 3

It's not exactly what we do (we effectively inline constructor 1 into constructor 2) but it's the right way to think about it.

Moreover, this design would fix the bug here: compute the cached hash only in the inner constructor S{T}(x::T), and everything else is an outer constructor which calls this one.

Presumably this is essentially what you mean by "should have parameters constrained," but you also need the convert constructors to keep everything working. Or you can keep the current design but convert before hashing.

@gafter
Copy link
Member Author

gafter commented Sep 14, 2023

@timholy What is wrong with the current behavior is that it defines one constructor method rather than two. The default constructor has two methods. convert is already performed by the new invocation and doesn't need to appear in the expansion. Adding a convert invocation makes no difference whatsoever to the generated code.

If there is no source equivalent to the set of constructor methods automatically provided, the documentation should say that.

As far as I can tell, the default constructors are equivalent to this for a parametric type:

struct S{T}
    x::T

    S{T}(x::T) where T = new{T}(x)
end
S{T}(x) where T = S{T}(convert(T, x)::T)
S(x::T) where T = S{T}(x)

The hand-inlined type inference in your definition of S(x) would only be correct for the simplest cases of precisely one field being precisely of the type of a type parameter. The last method, above, seems to handle all cases correctly.

However, this only applies to parametric types. Non-parametric types have a much simpler expansion, documented in JuliaLang/julia#51253 . That PR doesn't provide the expansion for parametric types, which is already (incorrectly) documented in a later section of the document modified by that PR.

I'll modify that PR to better handle the case of parametric types, including the required conversion on the second constructor method, above.

@timholy
Copy link

timholy commented Sep 14, 2023

The bug illustrated in the first code-block of #49 (comment) is fixed by the suggestion in the second code-block of #49 (comment),

function S269{T}(x) where T <: Integer
    y = convert(T, x)
    new(y, (Base).hash(y, 0xa63963209f14699e))
end

Try it; you'll see that adding a convert indeed makes a difference. What's crucial is that the conversion happens before the hash computation, not after; it doesn't much matter whether you achieve this by arranging the order in which constructors get called (if you have multiple constructors), or just get the logic right in a single constructor. That is to say, either you could have one constructor (the one in the code block above this paragraph) or several (the one in the final code-block of #49 (comment)).

Focusing on counting the number of methods doesn't seem so important:

f(x) = 1
f(x::Int) = 1
f(x::Float64) = 1

This is three methods, but one would be forgiven for suggesting that one could delete the last two (and it's probably better if one does).

@timholy
Copy link

timholy commented Sep 14, 2023

For example, let's try the implementation I think you're advocating:

julia> struct CachedHash{T}
           x::T
           cachedhash::UInt

           CachedHash{T}(x::T) where T = new{T}(x, hash(x, 0xa63963209f14699e))
           CachedHash{T}(x) where T = new{T}(x, hash(x, 0xa63963209f14699e))
       end

julia> s1, s2 = CachedHash{Float32}(1.0), CachedHash{Float32}(1.0 + eps());

julia> dump(s1)
CachedHash{Float32}
  x: Float32 1.0f0
  cachedhash: UInt64 0x691e5307da57bbf4

julia> dump(s2)
CachedHash{Float32}
  x: Float32 1.0f0
  cachedhash: UInt64 0xc58b93505afe7c50

Oops. You have s1.x == s2.x but the cached hashes are different. Just having two methods and calling new from both of them doesn't magically fix a logic bug.

@gafter
Copy link
Member Author

gafter commented Sep 14, 2023

@timholy The current behavior of AutoHashEquals is incorrect because it doesn't provide a set of constructors that are semantically equivalent to the default ones. This issue asks that to be corrected. I do not expect to make the error you suggest above in computing the hash code in the implementation of AutoHashEquals.jl, though I appreciate having it called to my attention so I can be sure to add tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants