Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finalizers cause exception_access_violation while using HDF5 from a thread. #990

Closed
mkitti opened this issue Jul 14, 2022 · 9 comments
Closed

Comments

@mkitti
Copy link
Member

mkitti commented Jul 14, 2022

When running long jobs, I'm occasionally running into this traces like this. What I think is happening is that the finalizer is running to close out a property while another thread is trying to access the HDF5 library. This can be be mitigated by running GC.gc() when between jobs where threads are used.

Note I'm doing this while using HDF5.jl on a single spawned thread.

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x7ffb692e3cac -- H5FL_reg_calloc at C:\Users\simview\.julia\artifacts\a285947f59f2fa90c9ac36239548c5af147d0da5\bin\libhdf5-0.dll (unknown line)
in expression starting at REPL[1]:1
H5FL_reg_calloc at C:\Users\simview\.julia\artifacts\a285947f59f2fa90c9ac36239548c5af147d0da5\bin\libhdf5-0.dll (unknown line)
H5CX_push at C:\Users\simview\.julia\artifacts\a285947f59f2fa90c9ac36239548c5af147d0da5\bin\libhdf5-0.dll (unknown line)
H5Iis_valid at C:\Users\simview\.julia\artifacts\a285947f59f2fa90c9ac36239548c5af147d0da5\bin\libhdf5-0.dll (unknown line)
h5i_is_valid at C:\Users\simview\.julia\packages\HDF5\7zvRl\src\api\functions.jl:1354
isvalid at C:\Users\simview\.julia\packages\HDF5\7zvRl\src\properties.jl:19 [inlined]
close at C:\Users\simview\.julia\packages\HDF5\7zvRl\src\properties.jl:11
jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
run_finalizer at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:278
jl_gc_run_finalizers_in_list at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:365
run_finalizers at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:394 [inlined]
run_finalizers at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:372
jl_gc_collect at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:3267
maybe_collect at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:882 [inlined]
jl_gc_pool_alloc at /cygdrive/c/buildbot/worker/package_win64/build/src\gc.c:1209
jl_gc_alloc_ at /cygdrive/c/buildbot/worker/package_win64/build/src\julia_internal.h:339 [inlined]
_new_array_ at /cygdrive/c/buildbot/worker/package_win64/build/src\array.c:137 [inlined]
_new_array at /cygdrive/c/buildbot/worker/package_win64/build/src\array.c:193 [inlined]
jl_alloc_array_3d at /cygdrive/c/buildbot/worker/package_win64/build/src\array.c:455
Array at .\boot.jl:461 [inlined]
Array at .\boot.jl:468 [inlined]
Array at .\boot.jl:474
unknown function (ip: 0000000063a9c088)
macro expansion at E:\Mirror\FND red stitching test\chunked\downsampling.jl:178 [inlined]
macro expansion at .\timing.jl:220 [inlined]
macro expansion at E:\Mirror\FND red stitching test\chunked\downsampling.jl:177 [inlined]
macro expansion at .\task.jl:399 [inlined]
downsample_to_memory_multithreaded_channel at E:\Mirror\FND red stitching test\chunked\downsampling.jl:136
@simonbyrne
Copy link
Collaborator

I guess the solution here is to close all properties objects when they are done?

simonbyrne added a commit to simonbyrne/HDF5.jl that referenced this issue Aug 21, 2022
- Clarifies and adds docs
- Closes properties after completion (helps JuliaIO#990)
- Removes custom exception for create_group (fixes JuliaIO#829)
simonbyrne added a commit to simonbyrne/HDF5.jl that referenced this issue Aug 21, 2022
- Clarifies and adds docs
- Closes properties after completion (helps JuliaIO#990)
- Removes custom exception for create_group (fixes JuliaIO#829)
simonbyrne added a commit to simonbyrne/HDF5.jl that referenced this issue Aug 21, 2022
- Clarifies and adds docs
- Closes properties after completion (helps JuliaIO#990)
- Removes custom exception for create_group (fixes JuliaIO#829)
simonbyrne added a commit to simonbyrne/HDF5.jl that referenced this issue Aug 21, 2022
- Clarifies and adds docs
- Closes properties after completion (helps JuliaIO#990)
- Removes custom exception for create_group (fixes JuliaIO#829)
mkitti added a commit that referenced this issue Aug 22, 2022
* Cleanup group code

- Clarifies and adds docs
- Closes properties after completion (helps #990)
- Removes custom exception for create_group (fixes #829)

* Update src/groups.jl

Co-authored-by: Mark Kittisopikul <mkitti@users.noreply.github.com>

* add deprecation

Co-authored-by: Mark Kittisopikul <mkitti@users.noreply.github.com>
@simonbyrne
Copy link
Collaborator

simonbyrne commented Oct 25, 2022

Following up on #1013 (comment), how about we define a function

function with(fn, args...)
    try
        fn(args...)
    finally
        for arg in args
            finalize(arg)
        end
    end
end

so e.g.

HDF5.jl/src/groups.jl

Lines 24 to 35 in 6022be0

function create_group(parent::Union{File,Group}, path::AbstractString; pv...)
lcpl = _link_properties(path)
gcpl = GroupCreateProperties()
try
pv = setproperties!(lcpl, gcpl; pv...)
isempty(pv) || error("invalid keyword options $pv")
return create_group(parent, path, lcpl, gcpl)
finally
close(lcpl)
close(gcpl)
end
end

could be written as

function create_group(parent::Union{File,Group}, path::AbstractString; pv...)
    with(_link_properties(path), GroupCreateProperties()) do lcpl, gcpl
        pv = setproperties!(lcpl, gcpl; pv...)
        isempty(pv) || error("invalid keyword options $pv")
        return create_group(parent, path, lcpl, gcpl)
    end
end

or perhaps a macro to do the same, e.g.:

function create_group(parent::Union{File,Group}, path::AbstractString; pv...)
    @with lcpl=_link_properties(path) gcpl=GroupCreateProperties() begin
        pv = setproperties!(lcpl, gcpl; pv...)
        isempty(pv) || error("invalid keyword options $pv")
        return create_group(parent, path, lcpl, gcpl)
    end
end

@mkitti
Copy link
Member Author

mkitti commented Oct 25, 2022

I was thinking something similar, but calling close rather than finalize.

@musm
Copy link
Member

musm commented Oct 25, 2022

@simonbyrne that sounds a lot better than the current setup and remind me of something similar in the D language for finalizers. The function form seems simpler than going with a macro, unless there is an intrinsic benefit of the macro.

@mkitti
Copy link
Member Author

mkitti commented Oct 25, 2022

Does calling finalize run the finalizers on the current task/thread or does it schedule it on another task/thread?

@simonbyrne
Copy link
Collaborator

I'm pretty sure it runs on the current task (i.e. it calls it immediately). finalize also has the advantages of working on any object (even immutable ones), and being cleared once called (i.e. the finalizers from the object aren't later called again by the GC), e.g.:

julia> mutable struct Foo
       end

julia> close(f::Foo) = println("done")
close (generic function with 1 method)

julia> f = Foo()
Foo()

julia> finalizer(close, f)
Foo()

julia> close(f)
done

julia> f = nothing; GC.gc() # close called again
done

julia> f = Foo()
Foo()

julia> finalizer(close, f)
Foo()

julia> finalize(f)
done

julia> f = nothing; GC.gc() # close not called again

@simonbyrne
Copy link
Collaborator

I'm pretty sure it runs on the current task (i.e. it calls it immediately)

https://github.com/JuliaLang/julia/blob/8512dd2a570c1f181df96529d73512e8c766e33c/base/gcutils.jl#L72-L73

@mkitti
Copy link
Member Author

mkitti commented Oct 26, 2022

This seems more general than HDF5.jl. While I'm happy to experiment with it here, perhaps we should consider putting it in a small package at some point? Maybe we should incubate it as a subdirectory package or at least put it within it's own module.

The @with macro looks quite like a let statement. Perhaps the macro should act on a let statement rather than a begin block?

@simonbyrne
Copy link
Collaborator

Closed by #1021

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants