GC error on 1.10 #52363

Closed
Liozou opened this issue Dec 1, 2023 · 26 comments
Labels: compiler:codegen (Generation of LLVM IR and native code), GC (Garbage collector), kind:bug (Indicates an unexpected problem or unintended behavior), kind:rr trace included

Comments


@Liozou
Member

Liozou commented Dec 1, 2023

Here is a reproducer for a GC crash that occurs on the latest commit (b497f44) of the backports-release-1.10 branch.

This is the minimized version of the issue that prompted #52256, as well as #52184 and #51774 (it took me two months to minimize). One crucial difference is that the crash occurs on a normal (debug) build of julia with no particular flag set in the build process. It also crashes on julia master but not on v1.9.

The other crucial difference is that it crashes on a single thread (i.e. launched with -t1). Throughout the minimization process it only crashed when using multiple threads, so this is a surprise.

reproducer.jl
using StaticArrays: SVector, SMatrix
using Printf

mutable struct System
    somelist::Vector{SVector{3,Float64}}
    mat::SMatrix{3,3,Float64,9}
end

function System(n)
    somelist = [zero(SVector{3,Float64}) for _ in 1:n]
    System(somelist, SMatrix{3,3,Float64,9}(25, 0, 0, 0, 25, 0, 0, 0, 25))
end

function output(o::System)
    mktemp() do _, io
        for s in ("", "")
            for (i, x) in enumerate(('a', 'b', 'c'))
                @printf io "%19.12f%19.12f%19.12f" o.mat[1,i] o.mat[2,i] o.mat[3,i]
            end
        end
        poss = zero(SVector{3,Float64})
        for opos in poss
            pos = o.mat*opos
            @printf io "%19.12f%19.12f%19.12f" pos[1] pos[2] pos[3]
        end
    end
    nothing
end

const CHANNEL = Channel{System}(Inf)
for _ in 1:5
    Base.Threads.@spawn while true
        x = take!($CHANNEL)
        output(x)
    end
end

function run_main(syst::System)
    for _ in 1:10
        put!(CHANNEL, deepcopy(syst))
        yield()
    end
    nothing
end


for irange in 1:10000
    step = System(80)
    run_main(step)
end

I then simply launch it with

/LionelSSDext4/liozou/julia-test-1.10/usr/bin/julia-debug -t1 --startup-file=no --bug-report=rr,chaos /tmp/reproducer.jl

Here is the rr trace: https://julialang-dumps.s3.amazonaws.com/reports/2023-12-01T10-48-13-Liozou.tar.zst
And the full output (including the dump):

output
GC error (probable corruption)
Allocations: 1520729 (Pool: 1519177; Big: 1552); GC: 1
Array{StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}, (80,)}[StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), 
StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), 
StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0)), StaticArraysCore.SArray{Tuple{3}, Float64, 1, 3}(data=(0, 0, 0))]

thread 0 ptr queue:
~~~~~~~~~~ ptr queue top ~~~~~~~~~~
Task(next=nothing, queue=Base.IntrusiveLinkedList{Task}(head=Task(next=Task(next=Task(next=Task(next=<circular reference @-6>, queue=<circular reference @-5>, storage=nothing, donenotify=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), result=nothing, logstate=nothing, code=Main.var"#5#6"{Base.Channel{Main.System}}(##225=Base.Channel{Main.System}(cond_take=Base.GenericCondition{Base.ReentrantLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.ReentrantLock(locked_by=nothing, reentrancy_cnt=0x00000000, havelock=0x00, cond_wait=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), _=(0, 0, 0))), cond_wait=Base.GenericCondition{Base.ReentrantLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.ReentrantLock(locked_by=nothing, reentrancy_cnt=0x00000000, havelock=0x00, cond_wait=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), _=(0, 0, 0))), cond_put=Base.GenericCondition{Base.ReentrantLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.ReentrantLock(locked_by=nothing, reentrancy_cnt=0x00000000, havelock=0x00, cond_wait=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), _=(0, 0, 0))), state=:open, excp=nothing, data=Array{Main.System, (0,)}[], n_avail_items=0, sz_max=9223372036854775807)), rngState0=0xe3398f4220ac9d3c, rngState1=0x899f35f795947f90, rngState2=0x5698c3cbf37a0156, rngState3=0xf106cddea4b9b659, rngState4=0xe358e4a012eb8710, _state=0x00, sticky=false, _isexception=false, priority=0x0000), queue=<circular reference @-4>, storage=nothing, 
donenotify=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), result=nothing, logstate=nothing, code=Main.var"#5#6"{Base.Channel{Main.System}}(##225=Base.Channel{Main.System}(cond_take=Base.GenericCondition{Base.ReentrantLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.ReentrantLock(locked_by=nothing, reentrancy_cnt=0x00000000, havelock=0x00, cond_wait=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), _=(0, 0, 0))), cond_wait=Base.GenericCondition{Base.ReentrantLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.ReentrantLock(locked_by=nothing, reentrancy_cnt=0x00000000, havelock=0x00, cond_wait=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), _=(0, 0, 0))), cond_put=Base.GenericCondition{Base.ReentrantLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.ReentrantLock(locked_by=nothing, reentrancy_cnt=0x00000000, havelock=0x00, cond_wait=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), _=(0, 0, 0))), state=:open, excp=nothing, data=Array{Main.System, (0,)}[], n_avail_items=0, sz_max=9223372036854775807)), rngState0=0x1aac3ff1d8f43034, rngState1=0x14802f94b541ef20, rngState2=0x7942d7c173e44ced, rngState3=0x8152d63da3f75d6a, rngState4=0x4f888f2347910313, _state=0x00, sticky=false, _isexception=false, priority=0x0000), queue=<circular reference @-3>, storage=nothing, donenotify=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), result=nothing, logstate=nothing, 
code=Main.var"#5#6"{Base.Channel{Main.System}}(##225=Base.Channel{Main.System}(cond_take=Base.GenericCondition{Base.ReentrantLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.ReentrantLock(locked_by=nothing, reentrancy_cnt=0x00000000, havelock=0x00, cond_wait=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), _=(0, 0, 0))), cond_wait=Base.GenericCondition{Base.ReentrantLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.ReentrantLock(locked_by=nothing, reentrancy_cnt=0x00000000, havelock=0x00, cond_wait=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), _=(0, 0, 0))), cond_put=Base.GenericCondition{Base.ReentrantLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.ReentrantLock(locked_by=nothing, reentrancy_cnt=0x00000000, havelock=0x00, cond_wait=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), _=(0, 0, 0))), state=:open, excp=nothing, data=Array{Main.System, (0,)}[], n_avail_items=0, sz_max=9223372036854775807)), rngState0=0xa11288352b970af6, rngState1=0xe67f45959c2ec6aa, rngState2=0x5193977f5f8481df, rngState3=0xbae5a560fccd6eaa, rngState4=0x0b6cedcb01b03a4a, _state=0x00, sticky=false, _isexception=false, priority=0x0000), queue=<circular reference @-2>, storage=nothing, donenotify=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), result=nothing, logstate=nothing, code=Main.var"#5#6"{Base.Channel{Main.System}}(##225=Base.Channel{Main.System}(cond_take=Base.GenericCondition{Base.ReentrantLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), 
lock=Base.ReentrantLock(locked_by=nothing, reentrancy_cnt=0x00000000, havelock=0x00, cond_wait=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), _=(0, 0, 0))), cond_wait=Base.GenericCondition{Base.ReentrantLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.ReentrantLock(locked_by=nothing, reentrancy_cnt=0x00000000, havelock=0x00, cond_wait=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), _=(0, 0, 0))), cond_put=Base.GenericCondition{Base.ReentrantLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.ReentrantLock(locked_by=nothing, reentrancy_cnt=0x00000000, havelock=0x00, cond_wait=Base.GenericCondition{Base.Threads.SpinLock}(waitq=Base.IntrusiveLinkedList{Task}(head=nothing, tail=nothing), lock=Base.Threads.SpinLock(owned=0)), _=(0, 0, 0))), state=:open, excp=nothing, data=Array{Main.System, (0,)}[], n_avail_items=0, sz_max=9223372036854775807)), rngState0=0x61b03f67d6f89086, rngState1=0x1ee87ddebe294103, rngState2=0xb93d13df070e25f6, rngState3=0x3eacce10f111b1c0, rngState4=0xa05f80d86cbdc2e5, _state=0x00, sticky=false, _isexception=false, priority=0x0000), tail=<circular reference @-2>), storage=Base.IdDict{Any, Any}(ht=Array{Any, (32,)}[
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  :SOURCE_PATH,
  "/tmp/reproducer.jl",
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>,
  #<null>], count=1, ndel=6), donenotify=nothing, result=nothing, logstate=nothing, code=#<null>, rngState0=0x3bdd6b7492f99322, rngState1=0xc820e13302084685, rngState2=0x3393658f2559db6d, rngState3=0xc524859e7b717044, rngState4=0xe358e4a012eb8710, _state=0x00, sticky=true, _isexception=false, priority=0x0000)
==========
"�= z�0H�9MLP�PP"
==========
(0, 0, 0)
==========
0
==========
0
==========
0
==========
~~~~~~~~~~ ptr queue bottom ~~~~~~~~~~

[33573] signal (6.-6): Aborted
in expression starting at /tmp/reproducer.jl:47
pthread_kill at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
raise at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
gc_dump_queue_and_abort at /LionelSSDext4/liozou/julia-test-1.10/src/gc.c:1813
gc_mark_outrefs at /LionelSSDext4/liozou/julia-test-1.10/src/gc.c:2508 [inlined]
gc_mark_loop_serial_ at /LionelSSDext4/liozou/julia-test-1.10/src/gc.c:2677
gc_mark_loop_serial at /LionelSSDext4/liozou/julia-test-1.10/src/gc.c:2700
gc_mark_loop at /LionelSSDext4/liozou/julia-test-1.10/src/gc.c:2813
_jl_gc_collect at /LionelSSDext4/liozou/julia-test-1.10/src/gc.c:3137
ijl_gc_collect at /LionelSSDext4/liozou/julia-test-1.10/src/gc.c:3434
maybe_collect at /LionelSSDext4/liozou/julia-test-1.10/src/gc.c:935
jl_gc_pool_alloc_inner at /LionelSSDext4/liozou/julia-test-1.10/src/gc.c:1291
jl_gc_pool_alloc_noinline at /LionelSSDext4/liozou/julia-test-1.10/src/gc.c:1348
jl_gc_alloc_ at /LionelSSDext4/liozou/julia-test-1.10/src/julia_internal.h:477
jl_gc_alloc at /LionelSSDext4/liozou/julia-test-1.10/src/gc.c:3486
ijl_string_to_array at /LionelSSDext4/liozou/julia-test-1.10/src/array.c:289
unsafe_wrap at ./strings/string.jl:100 [inlined]
StringVector at ./iobuffer.jl:32 [inlined]
format at /LionelSSDext4/liozou/julia-test-1.10/usr/share/julia/stdlib/v1.10/Printf/src/Printf.jl:932
#3 at /tmp/reproducer.jl:24
mktemp at ./file.jl:738
mktemp at ./file.jl:736 [inlined]
output at /tmp/reproducer.jl:15 [inlined]
#5 at /tmp/reproducer.jl:34
unknown function (ip: 0x1f5f4ef5fa52)
_jl_invoke at /LionelSSDext4/liozou/julia-test-1.10/src/gf.c:2894
ijl_apply_generic at /LionelSSDext4/liozou/julia-test-1.10/src/gf.c:3076
jl_apply at /LionelSSDext4/liozou/julia-test-1.10/src/julia.h:1982
start_task at /LionelSSDext4/liozou/julia-test-1.10/src/task.c:1238
Allocations: 1520729 (Pool: 1519177; Big: 1552); GC: 1

Across the minimization I have seen other kinds of crashes such as segmentation faults with backtraces pointing to gc_mark_outrefs, and segfaults without backtraces.

Might or might not be related to other reported crashes occurring on 1.10, i.e. #50705, #52032, #51792, #51800, #52200.

@Liozou added the kind:bug, GC and kind:rr trace included labels Dec 1, 2023
@ufechner7

ufechner7 commented Dec 1, 2023

I could reproduce your bug with Julia 1.10.0-rc1 launched without any parameters on Ubuntu 22.04. Furthermore the code executes without any error on 1.9.4.

@KristofferC added this to the 1.10 milestone Dec 1, 2023
@d-netto
Member

d-netto commented Dec 2, 2023

dbd82a4dbab0582a345679eb83b2d99d40c0356a is the first bad commit
commit dbd82a4dbab0582a345679eb83b2d99d40c0356a
Author: pchintalapudi <34727397+pchintalapudi@users.noreply.github.com>
Date:   Wed Jun 7 04:40:07 2023 +0000

    Update newpm pass pipeline (#49747)
    
    Co-authored-by: Gabriel Baraldi <baraldigabriel@gmail.com>

 src/pipeline.cpp | 44 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 35 insertions(+), 9 deletions(-)

CC: @pchintalapudi, @gbaraldi.

@maleadt
Member

maleadt commented Dec 2, 2023

Further reduced:

abstract type StaticArray{Tuple, T, N} <: AbstractArray{T, N} end
tuple_svec(::Type{T}) where T = T.parameters
tuple_tuple(::Type{T}) where T = (tuple_svec(T)...,)
struct SArray{S, T, N, L} <: StaticArray{S, T, N}
    data
end
SMatrix{S1, S2} = SArray{Tuple{S1}}
mutable struct MArray{S , T, N, L} <: StaticArray{S, T, N}
    function MArray{S,T,N,L}(UndefInitializer) where {S,T,N,L}
        new()
    end
end
struct Size{S} end
Size() = ()
Size(::Type{T}) where T = Size{tuple_tuple(T)}()
Size(::T) where T<:StaticArray{S} where S = @isdefined(S) ? Size(S) : Union
length_val(Size) = Val{length}
Base.getindex(v::SArray, i) = getfield(v,:data)
function Base.setindex!(v::MArray, val, Int)
    T = eltype(v)
    if isbitstype(T)
        unsafe_store!(Base.unsafe_convert(Ptr{T}, pointer_from_objref(v)), ( val))
    end
end
struct Args end
construct_type(::Type{SA}, x) where SA= adapt_eltype(adapt_size(SA, x), x)
adapt_size(::Type{SA}, x) where SA = typeintersect(SA, StaticArray)
adapt_eltype(::Type{SA}, x) where SA= SA
(::Type{SA})(x...) where SA = construct_type(SA, Args)(x)
Base.length(::StaticArray) = prod(())Int
Base.axes(s::StaticArray) = _axes(Size(s))
_axes(::Size{sizes}) where sizes = map(x->(), sizes)
Base.IndexStyle(::T) where T<:StaticArray = IndexLinear()
mutable_similar_type(::Type{T}, Size, ::Type{D}) where {T,D} = MArray{Tuple{},T,D,prod}
Base.similar(Type, ::Type{T}, s) where T =
    isbitstype(T) ?
    mutable_similar_type(T, s, length_val(s))(undef) : nothing
zeros(::Type{SA}) where SA = ((), )
Base.:(*)(a, Number) = map(c->c, a)
zero(::SA) where SA = zeros(SA)
function output(mat)
    mktemp() do _, io
        poss = zero(SArray)
        for opos in poss
            mat*opos
        end
    end
end
CHANNEL = Channel()
for _ in 5
    @async while true
        x = take!(CHANNEL)
        output(x)
    end
end
function run_main(mat)
    for _ in 10
        put!(CHANNEL, mat)
    end
end
for irange in 1:10000
    step = SMatrix{3,3,Float64,9}(1)
    run_main(step)
end

@Liozou
Member Author

Liozou commented Dec 2, 2023

EDIT: sorry nope, I thought I found a better minimizer but my build of julia was rotten

@Liozou
Member Author

Liozou commented Dec 2, 2023

@maleadt I'm not sure your minimization is correct: your definition of an MArray is

mutable struct MArray{S <: Tuple, T, N, L} <: StaticArray{S, T, N}
    function MArray() where {}
    end
    function MArray{S,T,N,L}(::UndefInitializer) where {S<:Tuple,T,N,L}
        new()
    end
end

so it does not store any data. But then it's not really surprising that a call to

function Base.setindex!(v::MArray, val, i::Int)
    T = eltype(v)
    if isbitstype(T)
        GC.@preserve v unsafe_store!(Base.unsafe_convert(Ptr{T}, pointer_from_objref(v)), convert(T, val), i)
    end
end

should crash, because the unsafe_store! is definitely invalid here.
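
To make the point concrete, here is a minimal sketch (the type names NoStorage and WithStorage and the helper store_first! are hypothetical, made up for illustration, and are not the StaticArrays definitions). A mutable struct without fields has no payload bytes, so any unsafe_store! through its object pointer writes past the object, whereas the same store is in bounds when the struct actually carries the tuple:

```julia
# Hypothetical types for illustration only; not the StaticArrays definitions.
mutable struct NoStorage end             # like the reduced MArray: no fields at all

mutable struct WithStorage
    data::NTuple{3,Float64}              # three Float64 values stored inline
end

size_empty = sizeof(NoStorage)           # payload size of the field-less struct
size_full = sizeof(WithStorage)          # payload size of the struct with a tuple

# Store into the first Float64 slot of the object, keeping it rooted meanwhile.
# This is only in bounds because WithStorage really has inline storage.
function store_first!(m::WithStorage, val::Float64)
    GC.@preserve m begin
        p = convert(Ptr{Float64}, pointer_from_objref(m))
        unsafe_store!(p, val, 1)
    end
    return m
end

m = store_first!(WithStorage((0.0, 0.0, 0.0)), 25.0)
first_value = m.data[1]
```

With the field-less type, sizeof reports zero bytes, so there is no valid index for unsafe_store! at all; a crash from that reduction would therefore not necessarily be the same bug as the original GC error.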

@maleadt
Member

maleadt commented Dec 2, 2023

Yeah, that's possible. I just reduced (mechanically) preserving the GC error I initially observed.

@vchuravy
Sponsor Member

vchuravy commented Dec 2, 2023

Reproduced locally and reduced to:

using StaticArrays: SVector
using Printf

mutable struct System
    somelist::Vector{Float64}
    mat::SVector{3, Float64}
end

function System()
    somelist = [0.0]
    System(somelist, SVector{3, Float64}(25, 0, 0))
end

function output(o::System)
    mktemp() do _, io
        for _ in (1, 1)
            for (i, _) in enumerate((1,1,1))
                @printf io "%19.12f" o.mat[i]
            end
        end
        for _ in 1:3
            @printf io "%19.12f" o.mat[1]
        end
    end
    nothing
end

const CHANNEL = Channel{System}(Inf)
function run_main(syst::System)
    put!(CHANNEL, deepcopy(syst))
    x = take!(CHANNEL)
    output(x)
    put!(CHANNEL, deepcopy(syst))
    x = take!(CHANNEL)
    output(x)
    yield()
    nothing
end


for irange in 1:10000
    step = System()
    run_main(step)
end

My hunch, based on the bisect and the curious fact that the pointer we are marking seems to be the arr->data pointer, is that we may accidentally root a derived pointer, but nailing this down is going to be challenging.
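
For readers less familiar with the distinction, a small sketch of what "derived pointer" means here, assuming only documented Base functions: the address of the Array object itself and the address of its element data are different, and it is the former that the GC expects to find in a root slot. The variable names are illustrative.

```julia
a = [0.0]

# Capture both addresses while `a` is guaranteed to stay alive.
obj_addr, data_addr = GC.@preserve a begin
    (UInt(pointer_from_objref(a)),   # address of the Array object itself
     UInt(pointer(a)))               # address of its first element (the data pointer)
end

# The two addresses differ: the data pointer is derived from the object and
# does not point at the start of a GC-managed object.
addresses_differ = obj_addr != data_addr
```

If a value like data_addr ends up in a GC root slot instead of obj_addr, the marking code would presumably interpret whatever bytes precede the data as an object header, which would be consistent with the "GC error (probable corruption)" abort above.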

@Liozou
Copy link
Member Author

Liozou commented Dec 2, 2023

Ah that's a great reduction! Here is a way to get rid of the StaticArrays dependency and reduce the somelist::Vector{Float64} field to a simple empty mutable struct:

using Printf

struct TupleWrapper
    data::NTuple{3,Float64}
end
Base.getindex(v::TupleWrapper, i::Int) = getfield(v,:data)[i]

mutable struct EmptyMutable end
mutable struct System
    _::EmptyMutable
    mat::TupleWrapper
end

function System()
    System(EmptyMutable(), TupleWrapper((25.0, 0.0, 0.0)))
end

function output(o::System)
    mktemp() do _, io
        for _ in (1, 1)
            for (i, _) in enumerate((1,1,1))
                @printf io "%19.12f" o.mat[i]
            end
        end
        for _ in 1:3
            @printf io "%19.12f" o.mat[1]
        end
    end
    nothing
end

const CHANNEL = Channel{System}(Inf)
function run_main(syst::System)
    put!(CHANNEL, deepcopy(syst))
    x = take!(CHANNEL)
    output(x)
    put!(CHANNEL, deepcopy(syst))
    x = take!(CHANNEL)
    output(x)
    yield()
    nothing
end


for irange in 1:100000
    step = System()
    run_main(step)
end

@vchuravy
Sponsor Member

vchuravy commented Dec 2, 2023

# in gc_mark_stack
(pernosco) p jl_(new_obj)
# crash
(pernosco) p jl_(jl_typeof(new_obj)) 
Array{Float64, (1,)}[0]

So the interior pointer did indeed end up on the GC stack-frame.

The stack slot ends up being written in:

Old value = 139833946344472
New value = 139833946344464
0x00007f2dab905a62 in julia_#1_67 (io=...) at /home/vchuravy/src/julia/repr.jl:22
22                  @printf io "%19.12f" o.mat[1]

(rr) bt

#0  0x00007f2dab905a62 in julia_#1_67 (io=...) at /home/vchuravy/src/julia/repr.jl:22
#1  0x00007f2dab905cda in julia_mktemp_64 (fn=..., parent=<error reading variable: Cannot access memory at address 0x0>) at file.jl:738
#2  0x00007f2dab9071f5 in mktemp () at file.jl:736
#3  output () at /home/vchuravy/src/julia/repr.jl:15
#4  julia_run_main_53 (syst=...) at /home/vchuravy/src/julia/repr.jl:32

I think this is the right IR, i.e. the IR before optimizations:

; ModuleID = '#1'
source_filename = "#1"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128-ni:10:11:12:13"
target triple = "x86_64-unknown-linux-gnu"

@_j_const7 = private unnamed_addr constant [3 x i64] [i64 1, i64 1, i64 1], align 8
@_j_const8 = private unnamed_addr constant [2 x i64] [i64 1, i64 1], align 8

; Function Attrs: sspstrong
define swiftcc void @"julia_#1_67"({}*** nonnull swiftself %0, [1 x {} addrspace(10)*] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(8) %1, {} addrspace(10)* %2, {} addrspace(10)* noundef nonnull align 8 dereferenceable(48) %3) #0 !dbg !6 {
top:
  %io = alloca {} addrspace(10)*, align 8
  %aggregate_load_box = alloca [1 x [3 x double]], align 8
  %aggregate_load_box25 = alloca [1 x [3 x double]], align 8
  %4 = call {}*** @julia.get_pgcstack()
  store {} addrspace(10)* null, {} addrspace(10)** %io, align 8
  %5 = bitcast {}*** %4 to {}**
  %current_task = getelementptr inbounds {}*, {}** %5, i64 -14
  %6 = bitcast {}** %current_task to i64*
  %world_age = getelementptr inbounds i64, i64* %6, i64 15
  call void @llvm.dbg.declare(metadata {} addrspace(10)** %io, metadata !28, metadata !DIExpression(DW_OP_deref)), !dbg !29
  call void @llvm.dbg.declare(metadata [1 x {} addrspace(10)*] addrspace(11)* %1, metadata !27, metadata !DIExpression(DW_OP_deref)), !dbg !29
  store {} addrspace(10)* %3, {} addrspace(10)** %io, align 8
  %7 = bitcast {}*** %4 to {}**
  %current_task1 = getelementptr inbounds {}*, {}** %7, i64 -14
  %ptls_field = getelementptr inbounds {}*, {}** %current_task1, i64 16
  %ptls_load = load {}*, {}** %ptls_field, align 8, !tbaa !30
  %ptls = bitcast {}* %ptls_load to {}**
  %8 = bitcast {}** %ptls to i64**
  %9 = getelementptr inbounds i64*, i64** %8, i64 2
  %safepoint = load i64*, i64** %9, align 8, !tbaa !34, !invariant.load !11
  fence syncscope("singlethread") seq_cst
  call void @julia.safepoint(i64* %safepoint), !dbg !29
  fence syncscope("singlethread") seq_cst
  br i1 false, label %L53, label %top.L2_crit_edge, !dbg !29

top.L2_crit_edge:                                 ; preds = %top
  br label %L2, !dbg !29

L2:                                               ; preds = %top.L2_crit_edge, %L52
  %value_phi = phi i64 [ 2, %top.L2_crit_edge ], [ %value_phi20, %L52 ]
  br i1 false, label %L37, label %L2.L4_crit_edge, !dbg !36

L2.L4_crit_edge:                                  ; preds = %L2
  br label %L4, !dbg !36

L4:                                               ; preds = %L2.L4_crit_edge, %L36
  %value_phi2 = phi i64 [ 1, %L2.L4_crit_edge ], [ %value_phi14, %L36 ]
  %value_phi3 = phi i64 [ 2, %L2.L4_crit_edge ], [ %value_phi13, %L36 ]
  %value_phi4 = phi i64 [ 2, %L2.L4_crit_edge ], [ %value_phi12, %L36 ]
  %getfield_addr = getelementptr inbounds [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(11)* %1, i32 0, i32 0, !dbg !37
  %getfield = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %getfield_addr unordered, align 8, !dbg !37, !tbaa !34, !invariant.load !11, !alias.scope !38, !noalias !41, !nonnull !11, !dereferenceable !46, !align !47
  %10 = addrspacecast {} addrspace(10)* %getfield to {} addrspace(11)*, !dbg !48
  %11 = bitcast {} addrspace(11)* %10 to i8 addrspace(11)*, !dbg !48
  %12 = getelementptr inbounds i8, i8 addrspace(11)* %11, i64 8, !dbg !48
  %13 = bitcast i8 addrspace(11)* %12 to [1 x [3 x double]] addrspace(11)*, !dbg !48
  %14 = getelementptr inbounds [1 x [3 x double]], [1 x [3 x double]] addrspace(11)* %13, i32 0, i32 0, !dbg !48
  %15 = getelementptr inbounds [1 x [3 x double]], [1 x [3 x double]]* %aggregate_load_box, i32 0, i32 0, !dbg !48
  %16 = bitcast [3 x double]* %15 to i8*, !dbg !48
  %17 = bitcast [3 x double] addrspace(11)* %14 to i8 addrspace(11)*, !dbg !48
  call void @llvm.memcpy.p0i8.p11i8.i64(i8* align 8 %16, i8 addrspace(11)* %17, i64 24, i1 false), !dbg !48, !tbaa !52, !alias.scope !53, !noalias !54
  %18 = getelementptr inbounds [1 x [3 x double]], [1 x [3 x double]]* %aggregate_load_box, i32 0, i32 0, !dbg !55
  %19 = sub i64 %value_phi2, 1, !dbg !58
  %boundscheck = icmp ult i64 %19, 3, !dbg !58
  br i1 %boundscheck, label %pass, label %fail, !dbg !58

L14:                                              ; preds = %pass
  %20 = icmp sle i64 %value_phi4, 3, !dbg !61
  %21 = zext i1 %20 to i8
  br label %L17, !dbg !64

L16:                                              ; preds = %pass
  br label %L17, !dbg !64

L17:                                              ; preds = %L16, %L14
  %value_phi5 = phi i8 [ %21, %L14 ], [ 0, %L16 ]
  %22 = trunc i8 %value_phi5 to i1, !dbg !64
  %23 = xor i1 %22, true, !dbg !64
  br i1 %23, label %L22, label %L19, !dbg !64

L19:                                              ; preds = %L17
  %24 = sub i64 %value_phi4, 1, !dbg !70
  %boundscheck6 = icmp ult i64 %24, 3, !dbg !70
  br i1 %boundscheck6, label %pass8, label %fail7, !dbg !70

L22:                                              ; preds = %L17
  br label %L23, !dbg !64

L23:                                              ; preds = %L22, %pass8
  %value_phi9 = phi i8 [ 0, %pass8 ], [ 1, %L22 ]
  %value_phi10 = phi i64 [ %66, %pass8 ], [ undef, %L22 ]
  %value_phi11 = phi i8 [ 1, %L22 ], [ undef, %pass8 ]
  %25 = trunc i8 %value_phi9 to i1, !dbg !71
  %26 = xor i1 %25, true, !dbg !71
  br i1 %26, label %L28, label %L27, !dbg !71

L27:                                              ; preds = %L23
  br label %L30, !dbg !71

L28:                                              ; preds = %L23
  %27 = add i64 %value_phi3, 1, !dbg !72
  br label %L30, !dbg !71

L30:                                              ; preds = %L28, %L27
  %value_phi12 = phi i64 [ %value_phi10, %L28 ], [ undef, %L27 ]
  %value_phi13 = phi i64 [ %27, %L28 ], [ undef, %L27 ]
  %value_phi14 = phi i64 [ %value_phi3, %L28 ], [ undef, %L27 ]
  %value_phi15 = phi i8 [ %value_phi11, %L27 ], [ 0, %L28 ]
  %28 = trunc i8 %value_phi15 to i1, !dbg !69
  %29 = xor i1 %28, true, !dbg !69
  %30 = xor i1 %29, true, !dbg !69
  br i1 %30, label %L37, label %L36, !dbg !69

L36:                                              ; preds = %L30
  br label %L4, !dbg !69

L37:                                              ; preds = %L30, %L2
  %31 = icmp sle i64 1, %value_phi, !dbg !75
  %32 = xor i1 %31, true, !dbg !76
  br i1 %32, label %L41, label %L39, !dbg !76

L39:                                              ; preds = %L37
  %33 = icmp sle i64 %value_phi, 2, !dbg !75
  %34 = zext i1 %33 to i8, !dbg !71
  br label %L42, !dbg !71

L41:                                              ; preds = %L37
  br label %L42, !dbg !71

L42:                                              ; preds = %L41, %L39
  %value_phi16 = phi i8 [ %34, %L39 ], [ 0, %L41 ]
  %35 = trunc i8 %value_phi16 to i1, !dbg !76
  %36 = xor i1 %35, true, !dbg !76
  br i1 %36, label %L47, label %L44, !dbg !76

L44:                                              ; preds = %L42
  %37 = sub i64 %value_phi, 1, !dbg !78
  %boundscheck17 = icmp ult i64 %37, 2, !dbg !78
  br i1 %boundscheck17, label %pass19, label %fail18, !dbg !78

L47:                                              ; preds = %L42
  br label %L48, !dbg !76

L48:                                              ; preds = %L47, %pass19
  %value_phi20 = phi i64 [ %68, %pass19 ], [ undef, %L47 ]
  %value_phi21 = phi i8 [ 0, %pass19 ], [ 1, %L47 ]
  %38 = trunc i8 %value_phi21 to i1, !dbg !77
  %39 = xor i1 %38, true, !dbg !77
  %40 = xor i1 %39, true, !dbg !77
  br i1 %40, label %L53, label %L52, !dbg !77

L52:                                              ; preds = %L48
  br label %L2, !dbg !77

L53:                                              ; preds = %L48, %top
  br i1 false, label %L70, label %L53.L54_crit_edge, !dbg !79

L53.L54_crit_edge:                                ; preds = %L53
  br label %L54, !dbg !76

L54:                                              ; preds = %L53.L54_crit_edge, %L69
  %value_phi22 = phi i64 [ 1, %L53.L54_crit_edge ], [ %value_phi27, %L69 ]
  %getfield_addr23 = getelementptr inbounds [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(11)* %1, i32 0, i32 0, !dbg !80
  %getfield24 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %getfield_addr23 unordered, align 8, !dbg !80, !tbaa !34, !invariant.load !11, !alias.scope !38, !noalias !41, !nonnull !11, !dereferenceable !46, !align !47
  %41 = addrspacecast {} addrspace(10)* %getfield24 to {} addrspace(11)*, !dbg !81
  %42 = bitcast {} addrspace(11)* %41 to i8 addrspace(11)*, !dbg !81
  %43 = getelementptr inbounds i8, i8 addrspace(11)* %42, i64 8, !dbg !81
  %44 = bitcast i8 addrspace(11)* %43 to [1 x [3 x double]] addrspace(11)*, !dbg !81
  %45 = getelementptr inbounds [1 x [3 x double]], [1 x [3 x double]] addrspace(11)* %44, i32 0, i32 0, !dbg !81
  %46 = getelementptr inbounds [1 x [3 x double]], [1 x [3 x double]]* %aggregate_load_box25, i32 0, i32 0, !dbg !81
  %47 = bitcast [3 x double]* %46 to i8*, !dbg !81
  %48 = bitcast [3 x double] addrspace(11)* %45 to i8 addrspace(11)*, !dbg !81
  call void @llvm.memcpy.p0i8.p11i8.i64(i8* align 8 %47, i8 addrspace(11)* %48, i64 24, i1 false), !dbg !81, !tbaa !52, !alias.scope !53, !noalias !54
  %49 = getelementptr inbounds [1 x [3 x double]], [1 x [3 x double]]* %aggregate_load_box25, i32 0, i32 0, !dbg !82
  %50 = getelementptr inbounds [3 x double], [3 x double]* %49, i32 0, i32 0, !dbg !83
  %51 = load {} addrspace(10)*, {} addrspace(10)** %io, align 8, !dbg !80, !nonnull !11, !dereferenceable !84, !align !47
  %unbox26 = load double, double* %50, align 8, !dbg !80, !tbaa !85, !alias.scope !87, !noalias !88
  call swiftcc void @julia_format_70({}*** nonnull swiftself %4, {} addrspace(10)* %51, { [1 x {} addrspace(10)*], {} addrspace(10)*, [1 x { i8, i8, i8, i8, i8, i64, i64, i8, i8 }], i64 } addrspace(11)* nocapture readonly addrspacecast ({ [1 x {} addrspace(10)*], {} addrspace(10)*, [1 x { i8, i8, i8, i8, i8, i64, i64, i8, i8 }], i64 }* inttoptr (i64 139833966590608 to { [1 x {} addrspace(10)*], {} addrspace(10)*, [1 x { i8, i8, i8, i8, i8, i64, i64, i8, i8 }], i64 }*) to { [1 x {} addrspace(10)*], {} addrspace(10)*, [1 x { i8, i8, i8, i8, i8, i64, i64, i8, i8 }], i64 } addrspace(11)*), double %unbox26), !dbg !80
  %52 = icmp eq i64 %value_phi22, 3, !dbg !89
  %53 = xor i1 %52, true, !dbg !92
  br i1 %53, label %L63, label %L62, !dbg !92

L62:                                              ; preds = %L54
  br label %L65, !dbg !92

L63:                                              ; preds = %L54
  %54 = add i64 %value_phi22, 1, !dbg !96
  br label %L65, !dbg !92

L65:                                              ; preds = %L63, %L62
  %value_phi27 = phi i64 [ %54, %L63 ], [ undef, %L62 ]
  %value_phi28 = phi i8 [ 1, %L62 ], [ 0, %L63 ]
  %55 = trunc i8 %value_phi28 to i1, !dbg !95
  %56 = xor i1 %55, true, !dbg !95
  %57 = xor i1 %56, true, !dbg !95
  br i1 %57, label %L70, label %L69, !dbg !95

L69:                                              ; preds = %L65
  br label %L54, !dbg !76

L70:                                              ; preds = %L65, %L53
  ret void, !dbg !95

fail:                                             ; preds = %L4
  %58 = addrspacecast [3 x double]* %18 to [3 x double] addrspace(11)*, !dbg !58
  %59 = bitcast [3 x double] addrspace(11)* %58 to i8 addrspace(11)*, !dbg !58
  call void @ijl_bounds_error_unboxed_int(i8 addrspace(11)* %59, {}* inttoptr (i64 139833764433472 to {}*), i64 %value_phi2), !dbg !58
  unreachable, !dbg !58

pass:                                             ; preds = %L4
  %60 = bitcast [3 x double]* %18 to double*, !dbg !58
  %61 = getelementptr inbounds double, double* %60, i64 %19, !dbg !58
  %62 = load {} addrspace(10)*, {} addrspace(10)** %io, align 8, !dbg !37, !nonnull !11, !dereferenceable !84, !align !47
  %unbox = load double, double* %61, align 8, !dbg !37, !tbaa !85, !alias.scope !87, !noalias !88
  call swiftcc void @julia_format_70({}*** nonnull swiftself %4, {} addrspace(10)* %62, { [1 x {} addrspace(10)*], {} addrspace(10)*, [1 x { i8, i8, i8, i8, i8, i64, i64, i8, i8 }], i64 } addrspace(11)* nocapture readonly addrspacecast ({ [1 x {} addrspace(10)*], {} addrspace(10)*, [1 x { i8, i8, i8, i8, i8, i64, i64, i8, i8 }], i64 }* inttoptr (i64 139833966590032 to { [1 x {} addrspace(10)*], {} addrspace(10)*, [1 x { i8, i8, i8, i8, i8, i64, i64, i8, i8 }], i64 }*) to { [1 x {} addrspace(10)*], {} addrspace(10)*, [1 x { i8, i8, i8, i8, i8, i64, i64, i8, i8 }], i64 } addrspace(11)*), double %unbox), !dbg !37
  %63 = icmp sle i64 1, %value_phi4, !dbg !61
  %64 = xor i1 %63, true, !dbg !64
  br i1 %64, label %L16, label %L14, !dbg !64

fail7:                                            ; preds = %L19
  call void @ijl_bounds_error_int({} addrspace(12)* addrspacecast ({}* inttoptr (i64 139833948378416 to {}*) to {} addrspace(12)*), i64 %value_phi4), !dbg !70
  unreachable, !dbg !70

pass8:                                            ; preds = %L19
  %65 = getelementptr inbounds i64, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @_j_const7, i32 0, i32 0), i64 %24, !dbg !70
  %66 = add i64 %value_phi4, 1, !dbg !98
  br label %L23, !dbg !64

fail18:                                           ; preds = %L44
  call void @ijl_bounds_error_int({} addrspace(12)* addrspacecast ({}* inttoptr (i64 139833818732992 to {}*) to {} addrspace(12)*), i64 %value_phi), !dbg !78
  unreachable, !dbg !78

pass19:                                           ; preds = %L44
  %67 = getelementptr inbounds i64, i64* getelementptr inbounds ([2 x i64], [2 x i64]* @_j_const8, i32 0, i32 0), i64 %37, !dbg !78
  %68 = add i64 %value_phi, 1, !dbg !99
  br label %L48, !dbg !76
}

; Function Attrs: noinline optnone
define nonnull {} addrspace(10)* @"jfptr_#1_68"({} addrspace(10)* %function, {} addrspace(10)** noalias nocapture noundef readonly %args, i32 %nargs) #1 {
top:
  %0 = call {}*** @julia.get_pgcstack()
  %1 = bitcast {} addrspace(10)* %function to [1 x {} addrspace(10)*] addrspace(10)*
  %2 = addrspacecast [1 x {} addrspace(10)*] addrspace(10)* %1 to [1 x {} addrspace(10)*] addrspace(11)*
  %3 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)** %args, i32 0
  %4 = load {} addrspace(10)*, {} addrspace(10)** %3, align 8, !tbaa !34, !invariant.load !11, !alias.scope !38, !noalias !41, !nonnull !11
  %5 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)** %args, i32 1
  %6 = load {} addrspace(10)*, {} addrspace(10)** %5, align 8, !tbaa !34, !invariant.load !11, !alias.scope !38, !noalias !41, !nonnull !11, !dereferenceable !84, !align !47
  call swiftcc void @"julia_#1_67"({}*** nonnull swiftself %0, [1 x {} addrspace(10)*] addrspace(11)* nocapture readonly %2, {} addrspace(10)* %4, {} addrspace(10)* %6)
  ret {} addrspace(10)* addrspacecast ({}* inttoptr (i64 139833989128200 to {}*) to {} addrspace(10)*)
}

declare {}*** @julia.get_pgcstack()

; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
declare void @llvm.dbg.declare(metadata %0, metadata %1, metadata %2) #2

; Function Attrs: inaccessiblemem_or_argmemonly
declare void @julia.safepoint(i64* %0) #3

; Function Attrs: argmemonly nocallback nofree nounwind willreturn
declare void @llvm.memcpy.p0i8.p11i8.i64(i8* noalias nocapture writeonly %0, i8 addrspace(11)* noalias nocapture readonly %1, i64 %2, i1 immarg %3) #4

; Function Attrs: noreturn
declare void @ijl_bounds_error_unboxed_int(i8 addrspace(11)* %0, {}* %1, i64 %2) #5

declare swiftcc void @julia_format_70({}*** nonnull swiftself %0, {} addrspace(10)* %1, { [1 x {} addrspace(10)*], {} addrspace(10)*, [1 x { i8, i8, i8, i8, i8, i64, i64, i8, i8 }], i64 } addrspace(11)* nocapture readonly %2, double %3) #6

; Function Attrs: noreturn
declare void @ijl_bounds_error_int({} addrspace(12)* %0, i64 %1) #5

attributes #0 = { sspstrong "frame-pointer"="all" "probe-stack"="inline-asm" }
attributes #1 = { noinline optnone "frame-pointer"="all" "probe-stack"="inline-asm" }
attributes #2 = { nocallback nofree nosync nounwind readnone speculatable willreturn }
attributes #3 = { inaccessiblemem_or_argmemonly }
attributes #4 = { argmemonly nocallback nofree nounwind willreturn }
attributes #5 = { noreturn }
attributes #6 = { "frame-pointer"="all" "probe-stack"="inline-asm" }

!llvm.module.flags = !{!0, !1, !2, !3}
!llvm.dbg.cu = !{!4}

!0 = !{i32 2, !"Dwarf Version", i32 4}
!1 = !{i32 2, !"Debug Info Version", i32 3}
!2 = !{i32 1, !"stack-protector-guard", !"global"}
!3 = !{i32 2, !"julia.debug_level", i32 2}
!4 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !5, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, nameTableKind: GNU)
!5 = !DIFile(filename: "julia", directory: ".")
!6 = distinct !DISubprogram(name: "#1", linkageName: "julia_#1_67", scope: null, file: !7, line: 16, type: !8, scopeLine: 16, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !4, retainedNodes: !26)
!7 = !DIFile(filename: "/home/vchuravy/src/julia/repr.jl", directory: ".")
!8 = !DISubroutineType(types: !9)
!9 = !{!10, !12, !17, !18}
!10 = !DICompositeType(tag: DW_TAG_structure_type, name: "Nothing", align: 8, elements: !11, runtimeLang: DW_LANG_Julia, identifier: "139833833884128")
!11 = !{}
!12 = !DICompositeType(tag: DW_TAG_structure_type, name: "#1#2", size: 64, align: 64, elements: !13, runtimeLang: DW_LANG_Julia, identifier: "139833954709520")
!13 = !{!14}
!14 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !15, size: 64, align: 64)
!15 = !DICompositeType(tag: DW_TAG_structure_type, name: "jl_value_t", file: !16, line: 71, align: 64, elements: !13)
!16 = !DIFile(filename: "julia.h", directory: "")
!17 = !DIDerivedType(tag: DW_TAG_typedef, name: "String", baseType: !14)
!18 = !DICompositeType(tag: DW_TAG_structure_type, name: "IOStream", size: 384, align: 64, elements: !19, runtimeLang: DW_LANG_Julia, identifier: "139833761778880")
!19 = !{!20, !21, !21, !24, !21, !25}
!20 = !DIBasicType(name: "Ptr", size: 64, encoding: DW_ATE_unsigned)
!21 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !22, size: 64, align: 64)
!22 = !DICompositeType(tag: DW_TAG_structure_type, name: "jl_value_t", file: !16, line: 71, align: 64, elements: !23)
!23 = !{!21}
!24 = !DIBasicType(name: "Int64", size: 64, encoding: DW_ATE_unsigned)
!25 = !DIBasicType(name: "Bool", size: 8, encoding: DW_ATE_unsigned)
!26 = !{!27, !28}
!27 = !DILocalVariable(name: "#self#", arg: 1, scope: !6, file: !7, line: 16, type: !12)
!28 = !DILocalVariable(name: "io", arg: 3, scope: !6, file: !7, line: 16, type: !18)
!29 = !DILocation(line: 16, scope: !6)
!30 = !{!31, !31, i64 0}
!31 = !{!"jtbaa_gcframe", !32, i64 0}
!32 = !{!"jtbaa", !33, i64 0}
!33 = !{!"jtbaa"}
!34 = !{!35, !35, i64 0, i64 1}
!35 = !{!"jtbaa_const", !32, i64 0}
!36 = !DILocation(line: 17, scope: !6)
!37 = !DILocation(line: 18, scope: !6)
!38 = !{!39}
!39 = !{!"jnoalias_const", !40}
!40 = !{!"jnoalias"}
!41 = !{!42, !43, !44, !45}
!42 = !{!"jnoalias_gcframe", !40}
!43 = !{!"jnoalias_stack", !40}
!44 = !{!"jnoalias_data", !40}
!45 = !{!"jnoalias_typemd", !40}
!46 = !{i64 32}
!47 = !{i64 8}
!48 = !DILocation(line: 37, scope: !49, inlinedAt: !37)
!49 = distinct !DISubprogram(name: "getproperty;", linkageName: "getproperty", scope: !50, file: !50, type: !51, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !4, retainedNodes: !11)
!50 = !DIFile(filename: "Base.jl", directory: ".")
!51 = !DISubroutineType(types: !11)
!52 = !{!32, !32, i64 0}
!53 = !{!44, !43}
!54 = !{!42, !45, !39}
!55 = !DILocation(line: 62, scope: !56, inlinedAt: !37)
!56 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !57, file: !57, type: !51, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !4, retainedNodes: !11)
!57 = !DIFile(filename: "/home/vchuravy/.julia/packages/StaticArrays/cZ1ET/src/SArray.jl", directory: ".")
!58 = !DILocation(line: 31, scope: !59, inlinedAt: !55)
!59 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !60, file: !60, type: !51, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !4, retainedNodes: !11)
!60 = !DIFile(filename: "tuple.jl", directory: ".")
!61 = !DILocation(line: 514, scope: !62, inlinedAt: !64)
!62 = distinct !DISubprogram(name: "<=;", linkageName: "<=", scope: !63, file: !63, type: !51, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !4, retainedNodes: !11)
!63 = !DIFile(filename: "int.jl", directory: ".")
!64 = !DILocation(line: 72, scope: !65, inlinedAt: !66)
!65 = distinct !DISubprogram(name: "iterate;", linkageName: "iterate", scope: !60, file: !60, type: !51, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !4, retainedNodes: !11)
!66 = !DILocation(line: 206, scope: !67, inlinedAt: !69)
!67 = distinct !DISubprogram(name: "iterate;", linkageName: "iterate", scope: !68, file: !68, type: !51, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !4, retainedNodes: !11)
!68 = !DIFile(filename: "iterators.jl", directory: ".")
!69 = !DILocation(line: 19, scope: !6)
!70 = !DILocation(line: 31, scope: !59, inlinedAt: !64)
!71 = !DILocation(line: 207, scope: !67, inlinedAt: !69)
!72 = !DILocation(line: 87, scope: !73, inlinedAt: !74)
!73 = distinct !DISubprogram(name: "+;", linkageName: "+", scope: !63, file: !63, type: !51, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !4, retainedNodes: !11)
!74 = !DILocation(line: 208, scope: !67, inlinedAt: !69)
!75 = !DILocation(line: 514, scope: !62, inlinedAt: !76)
!76 = !DILocation(line: 72, scope: !65, inlinedAt: !77)
!77 = !DILocation(line: 20, scope: !6)
!78 = !DILocation(line: 31, scope: !59, inlinedAt: !76)
!79 = !DILocation(line: 21, scope: !6)
!80 = !DILocation(line: 22, scope: !6)
!81 = !DILocation(line: 37, scope: !49, inlinedAt: !80)
!82 = !DILocation(line: 62, scope: !56, inlinedAt: !80)
!83 = !DILocation(line: 31, scope: !59, inlinedAt: !82)
!84 = !{i64 48}
!85 = !{!86, !86, i64 0}
!86 = !{!"jtbaa_stack", !32, i64 0}
!87 = !{!43}
!88 = !{!42, !44, !45, !39}
!89 = !DILocation(line: 521, scope: !90, inlinedAt: !92)
!90 = distinct !DISubprogram(name: "==;", linkageName: "==", scope: !91, file: !91, type: !51, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !4, retainedNodes: !11)
!91 = !DIFile(filename: "promotion.jl", directory: ".")
!92 = !DILocation(line: 901, scope: !93, inlinedAt: !95)
!93 = distinct !DISubprogram(name: "iterate;", linkageName: "iterate", scope: !94, file: !94, type: !51, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !4, retainedNodes: !11)
!94 = !DIFile(filename: "range.jl", directory: ".")
!95 = !DILocation(line: 23, scope: !6)
!96 = !DILocation(line: 87, scope: !73, inlinedAt: !97)
!97 = !DILocation(line: 902, scope: !93, inlinedAt: !95)
!98 = !DILocation(line: 87, scope: !73, inlinedAt: !64)
!99 = !DILocation(line: 87, scope: !73, inlinedAt: !76)

@vchuravy
Sponsor Member

vchuravy commented Dec 2, 2023

Outlining the closure also still works.

using Printf

struct TupleWrapper
    data::NTuple{3,Float64}
end
Base.getindex(v::TupleWrapper, i::Int) = getfield(v,:data)[i]

mutable struct EmptyMutable end
mutable struct System
    _::EmptyMutable
    mat::TupleWrapper
end

function System()
    System(EmptyMutable(), TupleWrapper((25.0, 0.0, 0.0)))
end

struct Lambda <: Base.Function
    o::System
end
@noinline function (l::Lambda)(_, io)
    for _ in (1, 1)
        for (i, _) in enumerate((1,1,1))
            @printf io "%19.12f" l.o.mat[i]
        end
    end
    for _ in 1:3
        @printf io "%19.12f" l.o.mat[1]
    end
end

function output(o::System)
    l = Lambda(o)
    mktemp(l)
    nothing
end

const CHANNEL = Channel{System}(Inf)
function run_main(syst::System)
    put!(CHANNEL, deepcopy(syst))
    x = take!(CHANNEL)
    output(x)
    put!(CHANNEL, deepcopy(syst))
    x = take!(CHANNEL)
    output(x)
    yield()
    nothing
end


for irange in 1:10000
    step = System()
    run_main(step)
end

@vchuravy
Sponsor Member

vchuravy commented Dec 3, 2023

And getting rid of the Channel + mktemp

using Printf

struct TupleWrapper
    data::NTuple{3,Float64}
end
Base.getindex(v::TupleWrapper, i::Int) = getfield(v,:data)[i]

mutable struct EmptyMutable end
mutable struct System
    _::EmptyMutable
    mat::TupleWrapper
end

function System()
    System(EmptyMutable(), TupleWrapper((25.0, 0.0, 0.0)))
end

struct Lambda
    o::System
end
const stdout = Base.stdout
@noinline function (l::Lambda)()
    for _ in (1,1)
        for (i, _) in enumerate((1,1))
            @printf stdout "%19.12f" l.o.mat[i]
        end
    end
    for _ in 1:2
        @printf stdout "%19.12f" l.o.mat[1]
        GC.gc()
    end
end

for irange in 1:10000
    step = System()
    Lambda(System())()
end

@vchuravy
Sponsor Member

vchuravy commented Dec 3, 2023

This looks rather fishy. Attached is the output of JULIA_LLVM_ARGS="-print-after-all -print-module-scope" for julia_Lambda.

Lambda.log
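
A minimal sketch of how a dump like this can be captured, assuming the reproducer is saved as a script. The helper name `llvm_dump_cmd`, the script name and the log name are placeholders and not taken from the original run. The log covers everything the child process compiles, so it still has to be searched for the function of interest.

```julia
# Hypothetical helper: builds a command that runs `script` in a fresh Julia
# process with the LLVM flags quoted above passed through JULIA_LLVM_ARGS.
function llvm_dump_cmd(script::AbstractString)
    flags = "-print-after-all -print-module-scope"
    return addenv(`$(Base.julia_cmd()) $script`, "JULIA_LLVM_ARGS" => flags)
end

cmd = llvm_dump_cmd("repr.jl")  # placeholder script name

# LLVM is expected to print the per-pass IR on stderr, so redirect it to a file.
# `ignorestatus` keeps `run` from throwing on a non-zero exit status.
# Uncomment to actually launch the child process:
# run(pipeline(ignorestatus(cmd); stderr="Lambda.log"))
```
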

In `*** IR Dump After LateLowerGCPass on julia_Lambda_13 ***` we have:

define swiftcc void @julia_Lambda_13({}*** nonnull swiftself %0, [1 x {} addrspace(10)*] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(8) %1) #0 !dbg !6 {
top:
...
  %5 = bitcast [1 x {} addrspace(10)*] addrspace(11)* %1 to i8 addrspace(10)* addrspace(11)*
...
L48:                                              ; preds = %L37
...
  %getfield34.1 = load atomic i8 addrspace(10)*, i8 addrspace(10)* addrspace(11)* %5 unordered, align 8
  %15 = getelementptr i8, i8 addrspace(10)* %getfield34.1, i64 8
...
L54.preheader:                                    ; preds = %L48
  %.lcssa = phi i8 addrspace(10)* [ %15, %L48 ]
...
  %22 = call {} addrspace(10)** @julia.get_gc_frame_slot({} addrspace(10)** %gcframe, i32 0)
  %23 = bitcast i8 addrspace(10)* %.lcssa to {} addrspace(10)*
  store {} addrspace(10)* %23, {} addrspace(10)** %22, align 8

Here %1 is the ::Lambda, so %getfield34.1 should be the System. We then derive a pointer from %getfield34.1 to the first element of mat, but mat is allocated inline in System...

Now the problem is that %15 is Tracked and not Derived, so we determine that it needs to be stored in the GCFrame, where the GC later finds it and crashes, because the type tag it reads for that slot is really the first field of System (the EmptyMutable reference).

So we have a provenance issue... IIUC %15 should have been addrspace(11) and not addrspace(10).
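
For anyone following along, here is a small sketch of why the `getelementptr ..., i64 8` lands inside the object. It is not taken from the log: it reuses the struct definitions from the reduced reproducer above and assumes a 64-bit build.

```julia
# Type definitions copied from the reduced reproducer in this thread.
struct TupleWrapper
    data::NTuple{3,Float64}
end

mutable struct EmptyMutable end
mutable struct System
    _::EmptyMutable
    mat::TupleWrapper
end

# TupleWrapper is an immutable bits type, so it is stored inline in System
# rather than as a separate heap object with its own type tag.
@assert isbitstype(TupleWrapper)

# Byte offset of `mat` inside a System. The first field is a reference to an
# EmptyMutable (one machine word), so `mat` starts right after it.
mat_offset = Int(fieldoffset(System, 2))
println("offset of mat inside System: ", mat_offset)
```

So a pointer 8 bytes into a System points at inline field data, not at an object the GC can treat as a root on its own, which is why it has to stay a Derived (addrspace(11)) pointer.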

@vchuravy
Sponsor Member

vchuravy commented Dec 3, 2023

After SROAPass:

L54:                                              ; preds = %L54, %L48
  %value_phi22 = phi i64 [ 1, %L48 ], [ %value_phi27, %L54 ]
  %getfield_addr23 = getelementptr inbounds [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(11)* %1, i32 0, i32 0, !dbg !68
  %getfield24 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %getfield_addr23 unordered, align 8, !dbg !68, !tbaa !24, !invariant.load !11, !alias.scope !33, !noalias !36, !nonnull !11, !dereferenceable !41, !align !42
  %33 = addrspacecast {} addrspace(10)* %getfield24 to {} addrspace(11)*, !dbg !68
  %34 = bitcast {} addrspace(11)* %33 to i8 addrspace(11)*, !dbg !68
  %35 = getelementptr inbounds i8, i8 addrspace(11)* %34, i64 8, !dbg !68
  %36 = bitcast i8 addrspace(11)* %35 to [1 x [3 x double]] addrspace(11)*, !dbg !68

After instcombine

  %getfield34 = load atomic i8 addrspace(10)*, i8 addrspace(10)* addrspace(11)* %5 unordered, align 8, !dbg !28, !tbaa !24, !invariant.load !11, !alias.scope !33, !noalias !36, !nonnull !11, !dereferenceable !41, !align !42
  %6 = getelementptr inbounds i8, i8 addrspace(10)* %getfield34, i64 8, !dbg !28

So it looks like instcombine decides it can remove:

  %33 = addrspacecast {} addrspace(10)* %getfield24 to {} addrspace(11)*, !dbg !68

That seems pretty disastrous, and indeed https://godbolt.org/z/58zq6aE5d

@vchuravy
Sponsor Member

vchuravy commented Dec 3, 2023

@Keno or @vtjnash could you quickly confirm that https://godbolt.org/z/ovfr46d79 is indeed an incorrect transformation?

At least my understanding of https://docs.julialang.org/en/v1/devdocs/llvm/#GC-root-placement is that we can't fold an addrspace 10 -> 11 cast, but LLVM seems to have done so as far back as LLVM 5? (I might also be barking up the wrong tree.)

@vchuravy vchuravy assigned gbaraldi and unassigned d-netto Dec 3, 2023
@vchuravy vchuravy added the compiler:codegen Generation of LLVM IR and native code label Dec 3, 2023
@vchuravy
Sponsor Member

vchuravy commented Dec 4, 2023

Fixed by JuliaLang/llvm-project#23

@Keno
Member

Keno commented Dec 4, 2023

Yes, the getelementptr is not legal in addrspace 10. There was some prior discussion on this somewhere, but I think the conclusion was that this isn't even legal with vanilla integral address spaces.

@ufechner7

ufechner7 commented Dec 4, 2023

What are the next steps? A pull request for the master branch of Julia?

@vchuravy
Sponsor Member

vchuravy commented Dec 5, 2023

@Liozou could you give #52405 a try and see if it squashes your issue in all its variations?

@Liozou
Member Author

Liozou commented Dec 5, 2023

I will, but I don't have access to my computer until tomorrow evening, so you'll have to wait a day.

@Liozou

This comment was marked as off-topic.

@vtjnash

This comment was marked as off-topic.

@Liozou

This comment was marked as off-topic.

@vchuravy

This comment was marked as off-topic.

@vchuravy

This comment was marked as off-topic.

@vchuravy

This comment was marked as off-topic.

@Liozou
Member Author

Liozou commented Dec 7, 2023

Thanks for splitting that into a separate issue.

Regarding the matter at hand, I do believe #52405 is the fix, thanks a lot! I checked both the initial buggy version of my code and the current version, and neither crashes anymore.

@vchuravy vchuravy closed this as completed Dec 7, 2023