Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Presumed dispatch error on tuple types #9536

Closed
timholy opened this issue Jan 1, 2015 · 16 comments
Closed

Presumed dispatch error on tuple types #9536

timholy opened this issue Jan 1, 2015 · 16 comments
Labels
bug Indicates an unexpected problem or unintended behavior

Comments

@timholy
Copy link
Member

timholy commented Jan 1, 2015

I have no idea how reproducible this is, but I'm filing it while I have all the output.

I was testing #9493 and starting julia as JULIA_CPU_CORES=1 julia --code-coverage=all --inline=no, after deleting sys.so, and then include("runtests.jl") in the test/ folder. Tests completed successfully up to resolve.jl, which gave the following error:

...
     * version
     * resolve
exception on 1: ERROR: test error in expression: sanity_tst(deps_data)
BoundsError()
 in getindex at tuple.jl:6
 in hash at tuple.jl:90
 in hash at tuple.jl:88
 in hash at version.jl:159
 in ht_keyindex at dict.jl:501
 in haskey at dict.jl:657
 in deps_from_data at resolve.jl:65
 in sanity_tst at resolve.jl:88
 in sanity_tst at resolve.jl:98
 in anonymous at test.jl:85
 in do_test at test.jl:47
 in include at boot.jl:242
 in runtests at /home/tim/src/julia/test/testdefs.jl:5
 in anonymous at multi.jl:642
 in run_work_thunk at multi.jl:603
 in remotecall_fetch at multi.jl:676
 in remotecall_fetch at multi.jl:691
 in anonymous at task.jl:1614
while loading resolve.jl, in expression starting on line 436
ERROR: test error in expression: sanity_tst(deps_data)
BoundsError()
 in getindex at tuple.jl:6
 in hash at tuple.jl:90
 in hash at tuple.jl:88
 in hash at version.jl:159
 in ht_keyindex at dict.jl:501
 in haskey at dict.jl:657
 in deps_from_data at resolve.jl:65
 in sanity_tst at resolve.jl:88
 in sanity_tst at resolve.jl:98
 in anonymous at test.jl:85
 in do_test at test.jl:47
 in include at boot.jl:242
 in runtests at /home/tim/src/julia/test/testdefs.jl:5
 in anonymous at multi.jl:642
 in run_work_thunk at multi.jl:603
 in remotecall_fetch at multi.jl:676
 in remotecall_fetch at multi.jl:691
 in anonymous at task.jl:1614
while loading resolve.jl, in expression starting on line 436
while loading /home/tim/src/julia/test/runtests.jl, in expression starting on line 42

Here are the key lines. I infer that it must have called line 90 when the tuple did not have sufficient length. (Looking forward to #9534!)

@timholy
Copy link
Member Author

timholy commented Jan 1, 2015

It also appears to me that x[2] gets hashed twice, because tail only discards the first item. Is that really what we want?

@timholy
Copy link
Member Author

timholy commented Jan 1, 2015

I got it to fail again the same way, as long as I ran the whole sequence; just doing using Base.Test; include("resolve.jl") does not trigger the error. I also inserted a @show vn in the test, and it turns out the version number that's triggering the problem is v"2.0.0-rc.1+bld". To me that suggests tuple.jl: 88 should be calling the hash method for the empty tuple (line 87) rather than the one for tuples longer than 2 elements (line 90).

I wonder if this could be #8631?

@vtjnash
Copy link
Member

vtjnash commented Jan 1, 2015

does my proposed patch help?

diff --git a/src/jltypes.c b/src/jltypes.c
index 7d4d53c..50a838e 100644
--- a/src/jltypes.c
+++ b/src/jltypes.c
@@ -751,10 +751,12 @@ static int tuple_to_Type(jl_tuple_t *a, jl_tuple_t **ptemp)
     for(i=0; i < alen; i++) {
         jl_value_t *el = jl_tupleref(a, i);
         if (jl_is_type_type(el)) {
-            jl_tupleset(*ptemp, i, jl_tparam0(el));
+            jl_value_t *T = jl_type_intersection(jl_tparam0(el), (jl_value_t*)jl_any_type); // removes Undef
+            jl_tupleset(*ptemp, i, T);
         }
         else if (i==alen-1 && jl_is_vararg_type(el) && jl_is_type_type(jl_tparam0(el))) {
-            jl_tupleset(*ptemp, i, jl_wrap_vararg(jl_tparam0(jl_tparam0(el))));
+            jl_value_t *T = jl_type_intersection(jl_tparam0(jl_tparam0(el)), (jl_value_t*)jl_any_type); // removes Undef
+            jl_tupleset(*ptemp, i, jl_wrap_vararg(T));
         }
         else {
             *ptemp = NULL;

@timholy
Copy link
Member Author

timholy commented Jan 1, 2015

I wish it did, but unfortunately it didn't seem to.

@timholy
Copy link
Member Author

timholy commented Jan 3, 2015

@simonster debugged this further (#9565), and a little further work comes up with this quite-minimal test case:

$ julia --inline=no

julia> hash((BigFloat, 256, 7))
0x24d1cc479e693f05

julia> hash(v"2.0.0-rc.1")
0x2dc29d0d282a3c23

julia> hash(v"2.0.0-rc.1+bld")
ERROR: BoundsError
 in hash at tuple.jl:90
 in hash at tuple.jl:88
 in hash at ./version.jl:159
 in hash at hashing.jl:3
 in eval at no file

@simonster simonster added the bug Indicates an unexpected problem or unintended behavior label Jan 3, 2015
@timholy
Copy link
Member Author

timholy commented Jan 7, 2015

Interestingly, I am unable to replicate the bug with

myhash(x::Any) = myhash(x, zero(UInt))
# myhash(w::WeakRef, h::UInt) = myhash(w.value, h)

myhash(x::ANY, h::UInt) = myhash(object_id(x), h)
myhash(x::UInt64,  h::UInt) = Base.hx(x, float64(x), h)
myhash(x::Int64,   h::UInt) = Base.hx(reinterpret(UInt64,abs(x)), float64(x), h)

const tuplehash_seed = UInt === UInt64 ? 0x77cfa1eef01bca90 : 0xf01bca90
myhash(::(), h::UInt) = h + tuplehash_seed
myhash(x::(Any,), h::UInt)    = myhash(x[1], myhash((), h))
myhash(x::(Any,Any), h::UInt) = myhash(x[1], myhash(x[2], myhash((), h)))
myhash(x::Tuple, h::UInt)     = myhash(x[1], myhash(x[2], myhash(Base.tail(x), h)))

function myhash(v::VersionNumber, h::UInt)
    h += 0x8ff4ffdb75f9fede % UInt
    h = myhash(v.major, h)
    h = myhash(v.minor, h)
    h = myhash(v.patch, h)
    h = myhash(v.prerelease, ~h)
    h = myhash(v.build, ~h)
end

and replacing those hash calls with myhash. I'll see what hash functions are already compiled.

@timholy
Copy link
Member Author

timholy commented Jan 7, 2015

OK, with this diff:

diff --git a/src/codegen.cpp b/src/codegen.cpp
index a9f2e38..e518df6 100644
--- a/src/codegen.cpp
+++ b/src/codegen.cpp
@@ -3511,9 +3511,18 @@ static Function *gen_jlcall_wrapper(jl_lambda_info_t *lam, jl_expr_t *ast, Funct
     return w;
 }

+extern "C" void jl_(void *jl_value);
+
 // cstyle = compile with c-callable signature, not jlcall
 static Function *emit_function(jl_lambda_info_t *lam, bool cstyle)
 {
+    if (strncmp(lam->name->name, "hash", 4) == 0 ||
+        strncmp(lam->name->name, "__hash", 6) == 0 ||
+        strncmp(lam->name->name, "myhash", 4) == 0 ||
+        strncmp(lam->name->name, "__myhash", 6) == 0) {
+        jl_(lam->name);
+        jl_(lam->specTypes);
+    }
     // step 1. unpack AST and allocate codegen context for this function
     jl_expr_t *ast = (jl_expr_t*)lam->ast;
     jl_tuple_t *sparams = NULL;

and starting like this:

tim@diva:~/src/julia$ rm usr/lib/julia/sys.so 
tim@diva:~/src/julia$ julia --inline=no

I get the following output:

:hash_64_64
(UInt64,)
:hashindex
(Int64, Int64)
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.0-dev+2539 (2015-01-07 06:00 UTC)
 _/ |\__'_|_|_|\__'_|  |  teh/pkg_test/a527adf* (fork: 1 commits, 0 days)
|__/                   |  x86_64-linux-gnu

:hashindex
(Symbol, Int64)
:hash
(Base.RemoteRef, UInt64)
:hash
((Any, Any), UInt64)
:hash
(Int64, UInt64)
:hash
(ASCIIString, UInt64)
:hash
(Char,)
:hashindex
(Base.LineEdit.Prompt, Int64)
:hashindex
(Base.LineEdit.HistoryPrompt{Base.REPL.REPLHistoryProvider}, Int64)
:hashindex
(Base.LineEdit.PrefixHistoryPrompt{Base.REPL.REPLHistoryProvider}, Int64)
:hash
(Base.LineEdit.PrefixHistoryPrompt{Base.REPL.REPLHistoryProvider},)
:hash
(Base.LineEdit.PrefixHistoryPrompt{Base.REPL.REPLHistoryProvider}, UInt64)
:hash
(UInt64, UInt64)
julia> hash((BigFloat, 256, 7))
:hash
((Any...,),)
:hash
((Any...,), UInt64)
:hash
((Int64, Int64), UInt64)
:hash
(Type{Base.MPFR.BigFloat}, UInt64)
0xb77958fdc56759c2

julia> hash(v"2.0.0-rc.1")
:hash
(Base.VersionNumber,)
:hash
(Base.VersionNumber, UInt64)
:hash
(Type{()}, UInt64)
0x2dc29d0d282a3c23

julia> hash(v"2.0.0-rc.1+bld")
:hash
((ASCIIString,), UInt64)
ERROR: BoundsError: attempt to access ()
  at index [1]
 in getindex at tuple.jl:6
 in hash at tuple.jl:90
 in hash at tuple.jl:88
 in hash at version.jl:159
 in hash at hashing.jl:3
 in eval at no file

julia> 

So then I created this test file:

include("hashbug2.jl")   #this is the script in the previous comment

z = zero(UInt)
r = RemoteRef()
println("RemoteRef")
myhash(r, z)
println("2-tuple")
myhash(("hello",5), z)
println("char")
myhash('c')
# Missing the REPLHistoryProvider stuff
println("UInt64")
myhash(z, z)

println("\n\nHere's the test that should give an error:\n")
myhash((BigFloat, 256, 7))
myhash(v"2.0.0-rc.1")
myhash(v"2.0.0-rc.1+bld")

and got this:

tim@diva:/tmp$ julia --inline=no
:hash_64_64
(UInt64,)
:hashindex
(Int64, Int64)
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.0-dev+2539 (2015-01-07 06:00 UTC)
 _/ |\__'_|_|_|\__'_|  |  teh/pkg_test/a527adf* (fork: 1 commits, 0 days)
|__/                   |  x86_64-linux-gnu

:hashindex
(Symbol, Int64)
:hash
(Base.RemoteRef, UInt64)
:hash
((Any, Any), UInt64)                                      # label
:hash
(Int64, UInt64)
:hash
(ASCIIString, UInt64)
:hash
(Char,)
:hashindex
(Base.LineEdit.Prompt, Int64)
:hashindex
(Base.LineEdit.HistoryPrompt{Base.REPL.REPLHistoryProvider}, Int64)
:hashindex
(Base.LineEdit.PrefixHistoryPrompt{Base.REPL.REPLHistoryProvider}, Int64)
:hash
(Base.LineEdit.PrefixHistoryPrompt{Base.REPL.REPLHistoryProvider},)
:hash
(Base.LineEdit.PrefixHistoryPrompt{Base.REPL.REPLHistoryProvider}, UInt64)
:hash
(UInt64, UInt64)
julia> include("run_hashbug.jl")
RemoteRef
:myhash
(Base.RemoteRef, UInt64)
:myhash
(UInt64, UInt64)
2-tuple
:myhash
((Any, Any), UInt64)             # compare against "label" above; it doesn't have ((), UInt64) as its next call
:myhash
((), UInt64)
:myhash
(Int64, UInt64)
:myhash
(ASCIIString, UInt64)
char
:myhash
(Char,)
:myhash
(Char, UInt64)
UInt64


Here's the test that should give an error:

:myhash
((Any...,),)
:myhash
((Any...,), UInt64)
:myhash
((Int64, Int64), UInt64)
:myhash
(Type{Base.MPFR.BigFloat}, UInt64)
:myhash
(Base.VersionNumber,)
:myhash
(Base.VersionNumber, UInt64)
:myhash
((ASCIIString,), UInt64)
0x52f417c4f2be1289

(All this was using deleted sys.so.)

To me, the most conspicuous difference is what I marked with "label" above; for some reason myhash gets an extra compilation of myhash((), zero(UInt)) that's missing for hash.

@timholy
Copy link
Member Author

timholy commented Jan 7, 2015

Here's something else I'm not sure I understand: from either hash(()) or hash(tuple()) I get this:

:hash
(Type{()},)
:hash
(Type{()}, UInt64)

which is different. How do I force hash to compile for ()?

@MichaelHatherly
Copy link
Member

As suggested by @timholy in #9722, the following appears to be a more direct way to cause this bug:

$ julia-0.4 --inline=no
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.0-dev+2607 (2015-01-11 06:43 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 02c0501 (1 day old master)
|__/                   |  x86_64-unknown-linux-gnu

julia> f(args...) = map(identity, args)
f (generic function with 1 method)

julia> f()
ERROR: BoundsError: attempt to access ()
  at index [1]
 in f at none:1
 in eval at no file

@ihnorton
Copy link
Member

Inlining looks like a red-herring here. It is triggering a deeper issue. When inlining is on, we get constant propagation and end up with:

julia> code_typed(f,())
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:(args::(top(apply_type))(Vararg,Any)::Any::Any)], Any[Any[],Any[Any[:args,(),0]],Any[]], :(begin  # none, line 1:
        return ()
    end::()))))

When inlining is off we get:

julia> code_typed(f,())
1-element Array{Any,1}:
Warning: could not attach metadata for @simd loop.
 :($(Expr(:lambda, Any[:(args::(top(apply_type))(Vararg,Any)::Any::Any)], Any[Any[],Any[Any[:args,(),0]],Any[]], :(begin  # none, line 1:
        return $(Expr(:call1, :map, :identity, :(args::())))
    end::()))))

Obviously this is a valid call and should still work. When I look at a backtrace from jl_throw (with LLVM3.5) it is clear that the wrong version of map is being called:

(gdb) bt 3
#0  jl_throw (e=0x4770068) at task.c:745
#1  0x00007ffff799344d in jl_bounds_error_int (v=0x68ce30, i=<optimized out>) at builtins.c:152
#2  0x00007ffff3e5fb0c in julia_map_71218 () at tuple.jl:46
(More stack frames follow...)
(gdb) f
#2  0x00007ffff3e5fb0c in julia_map_71218 () at tuple.jl:46
46  map(f::Callable, t::Tuple)                = tuple(f(t[1]), map(f,tail(t))...)

We should be calling the more specific version on line 42:

map(f::Callable, t::())                   = ()

The issue here appears to boil down to a mismatch in the signature the compiler chooses in jl_get_specialization. The following is useful for debugging purposes (detailed notes for the benefit of future-me as much as anyone else):

julia> macro ident(arg)
           return esc(eval(arg))
       end
julia> m() = @ident Expr(:call1, :map, :identity, :(()::()))

Running m() the first time and stepping through the specializations, we see the following:

Breakpoint 1, jl_get_specialization (f=0x21fd2b0, types=0x3d035d0) at gf.c:1407
1407    {
(gdb) break cache_match_by_type 
Breakpoint 2 at 0x7ffff78fe57a: file gf.c, line 40.
(gdb) c
Continuing.

Breakpoint 2, cache_match_by_type (types=0x3d035e0, n=2, sig=0x21fd6b0, va=0) at gf.c:40
40  {
(gdb) p jl_(sig)
(Function, (Any,))
$1 = void
(gdb) c
Continuing.

Breakpoint 2, cache_match_by_type (types=0x3d035e0, n=2, sig=0x21fd490, va=0) at gf.c:40
40  {
(gdb) p jl_(sig)
(Function, Type{()})
$2 = void
(gdb) c
Continuing.

Breakpoint 2, cache_match_by_type (types=0x3d035e0, n=2, sig=0x21fd730, va=0) at gf.c:40
40  {
(gdb) p jl_(sig)
(Function, (Any, Any))
$3 = void
(gdb) c
Continuing.

Breakpoint 2, cache_match_by_type (types=0x3d035e0, n=2, sig=0x21fd7d0, va=0) at gf.c:40
40  {
(gdb) p jl_(sig)
(Function, (Any, Any, Any))
$4 = void
(gdb) c
Continuing.

Breakpoint 2, cache_match_by_type (types=0x3d035e0, n=2, sig=0x21fd950, va=0) at gf.c:40
40  {
(gdb) p jl_(sig)
(Function, (Any...,))
$5 = void
(gdb) c
Continuing.

The second signature (Function, Type{()}) should, presumably, be matched. Instead, we match (Function, (Any...,)), which accounts for using map(f::Callable, t::Tuple) instead of map(f::Callable, t::()).

Here is a speculative patch which fixes this particular issue but breaks several tests:

diff --git a/src/jltypes.c b/src/jltypes.c
index dc5fe56..3f5457a 100644
--- a/src/jltypes.c
+++ b/src/jltypes.c
@@ -1548,6 +1548,8 @@ static int type_eqv_(jl_value_t *a, jl_value_t *b)
                     return 0;
             }
             return 1;
+        } else if (jl_is_type_type(b)) {
+            return a == jl_t0(((jl_datatype_t*)b)->parameters);
         }
         return 0;
     }

(this seems very similar to #8631)

@timholy
Copy link
Member Author

timholy commented Jan 19, 2015

This looks like progress! Does your patch also fix #8631, even though it breaks other things?

CCing @vtjnash (compare to your fix?) and @JeffBezanson. These bugs are pretty scary, and it would be lovely to squash them.

@jakebolewski
Copy link
Member

It would be good to fix the regression, but I feel that the fundamental fix for this will come with #8974 and #8470.

@timholy
Copy link
Member Author

timholy commented Jan 19, 2015

It's likely not a regression, just hard to trigger on 0.3. (#8631 is "convertalypse," which annoyed many and was also the most infamous of Dan Luu's gripes about bugs in julia; Color.jl was triggering that bug for many users on julia 0.3, and only stopped being visible when a few bandaids got applied to Color.jl.)

If there's any fix prior to #8974 (which I imagine is not imminent), it would be great to have.

@ihnorton
Copy link
Member

I tried some of Jake's examples in #8631 and the patch does not seem to help -- but it is really just a (very specific) band-aid.

Questions I don't know the answer to:

julia> l = lminfo(map, (Base.Callable, ()))
(AST(:($(Expr(:lambda, Any[:f,:t], Any[Any[],Any[Any[:f,:Any,0],Any[:t,:Any,0]],Any[]], :(begin  # tuple.jl, line 42:
        return (top(tuple))()::Any
    end::Any))))),(Function,()),())

julia> (tree,ty) = Base.typeinf(l...)
(UInt8[0x14,0x07,0x03,0x1e,0x18,0x0f,0x88,0xd6,0x85,0x2b  …  0x6c,0x65,0x2e,0x6a,0x6c,0x07,0x01,0x20,0x18,0x14],())

julia> showast(l[1],tree)
:($(Expr(:lambda, Any[:f,:t], Any[Any[],Any[Any[:f,Function,0],Any[:t,Type{()},0]],Any[]], :(begin  # tuple.jl, line 42:
        return ()
    end::()))))

Also, I just re-read this thread and found your comment here to be interesting: I had actually tried to do the same thing with a minimal set of map-like (renamed) methods for tuples, and could not trigger the error either.

@ihnorton
Copy link
Member

I believe all of the test cases here are fixed (presumably by 4e9bf7a).

@ihnorton
Copy link
Member

@mbauman's example in #10134 is also fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

No branches or pull requests

6 participants