Distributed.jl worker processes crash on Windows when importing ReverseDiff.jl #153

@MilesCranmer

Distributed.jl worker processes crash when importing ReverseDiff.jl, and only on Windows. I haven't been able to find any explanation in ReverseDiff.jl itself, so I was hoping I could receive some assistance or clues here.

This bug has been observed on Julia 1.5 through 1.8. It only occurs on Windows (including windows-2019, windows-2022, and windows-latest). Ubuntu and macOS are unaffected, although see $^{[1]}$.

Other posts on this issue:

This bug can be reproduced with the following code, which dynamically allocates some worker processes, activates the current environment on each, and then imports a given package on each worker:

using Pkg, Distributed

"""Try to dynamically create workers, and import the package."""
function test(package_name)
    procs = addprocs(4)
    project_path = splitdir(Pkg.project().path)[1]
    # Import package on head worker:
    Base.MainInclude.eval(
        quote
            import $(Symbol(package_name))
        end
    )
    # Import package on worker:
    @everywhere procs begin
        Base.MainInclude.eval(
            quote
                using Pkg
                Pkg.activate($$project_path)
                import $(Symbol($package_name))
            end,
        )
    end
    rmprocs(procs)
end

packages_to_test = [
    "Distributed",  "JSON3", "LineSearches", "LinearAlgebra",
    "LossFunctions", "Optim", "Printf", "Random",
    "Reexport", "SpecialFunctions", "Zygote", "ReverseDiff",
]
for package_name in packages_to_test
    println("Testing $(package_name)...")
    test(package_name)
    println("Success!")
end

The only reliable $^{[1]}$ failure case here is Windows + ReverseDiff.jl. All other combinations of packages and operating systems work fine. I also note that the first import, the one on the head worker, must occur for the crash to happen: if the package is imported only on the worker processes, and not on the head worker, the error does not occur.
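For comparison, here is a minimal sketch of the worker-only variant described above. It is adapted from the reproduction script by dropping the head-worker import; the function name `test_workers_only` is hypothetical and only used for illustration.

```julia
using Pkg, Distributed

"""
Import the package on the workers only, skipping the head-worker import.
`test_workers_only` is a hypothetical name; the body mirrors `test` above.
"""
function test_workers_only(package_name)
    procs = addprocs(4)
    project_path = splitdir(Pkg.project().path)[1]
    # No import on the head worker here: this is the only change from `test`.
    @everywhere procs begin
        Base.MainInclude.eval(
            quote
                using Pkg
                Pkg.activate($$project_path)
                import $(Symbol($package_name))
            end,
        )
    end
    rmprocs(procs)
end
```

Calling `test_workers_only("ReverseDiff")` on Windows should, going by the behavior reported here, complete without crashing the workers, whereas `test("ReverseDiff")` does not.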

You can see an example of this error here: https://github.com/MilesCranmer/SymbolicRegression.jl/runs/7957291344?check_suite_focus=true#step:6:296. All packages are successfully imported, except when it comes to ReverseDiff.jl, and only on Windows.

cc @rikhuijzer @ChrisRackauckas @mohamed82008

$^{[1]}$ For the first time, I also saw this occur on an Ubuntu test, also for ReverseDiff.jl. That failure is not consistent, though.
