Skip to content

Hang with multithreaded reading #817

@ericphanson

Description

@ericphanson

With the following script

using Printf, CSV, Dates

function write_data(n)
    f = open("test.csv", "w")
    for i=1:n
        @printf(f, ",,,%g,%g\n", randn(), randn())
    end
    close(f)
end

write_data(100)

@info "Info" VERSION Threads.nthreads() now()
@time file = CSV.File("test.csv"; header=false, threaded=false)
@time file = CSV.File("test.csv"; header=false, threaded=false)
@time file = CSV.File("test.csv"; header=false, threaded=true)

I get an indefinite hang when I run it with 8 threads

julia> include("mwe.jl");
┌ Info: Info
│   VERSION = v"1.6.0-rc2"
│   Threads.nthreads() = 8now() = 2021-03-18T23:32:33.745
  5.278531 seconds (19.67 M allocations: 961.109 MiB, 4.88% gc time, 95.44% compilation time)
  0.000274 seconds (183 allocations: 17.312 KiB)
...hanging for > 15 minutes

(It's been 15 minutes now, but a few days ago it ran for like half an hour at least I think, when it first came up in https://discourse.julialang.org/t/making-string-to-float-conversion-faster/57146/12). Looks like 1 core is maxed out a 100% CPU load and the rest are idle.

This is on an M1 macbook pro with 8 cores:

julia> versioninfo()
Julia Version 1.6.0-rc2
Commit 4b6b9fe4d7 (2021-03-11 07:05 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin19.6.0)
  CPU: Apple M1
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, westmere)

However, if I run with any fewer than 8 threads, then it completes in a few seconds with no issue.

Runs with 1-7 threads
julia> include("mwe.jl");
┌ Info: Info
│   VERSION = v"1.6.0-rc2"
│   Threads.nthreads() = 1now() = 2021-03-18T23:29:42
  0.000364 seconds (183 allocations: 17.312 KiB)
  0.000222 seconds (183 allocations: 17.312 KiB)
┌ Warning: `threaded=true` but `tasks=1`; to support threaded parsing, pass `tasks=N` where `N > 1`; `tasks` defaults to `Threads.nthreads()`, so you may consider starting Julia with multiple threads
└ @ CSV ~/.julia/packages/CSV/CJfFO/src/file.jl:253
  0.000382 seconds (336 allocations: 28.078 KiB)

julia> include("mwe.jl");
┌ Info: Info
│   VERSION = v"1.6.0-rc2"
│   Threads.nthreads() = 2now() = 2021-03-18T23:30:00.737
  5.235543 seconds (19.67 M allocations: 961.109 MiB, 4.64% gc time, 95.37% compilation time)
  0.000273 seconds (183 allocations: 17.312 KiB)
  1.636271 seconds (4.66 M allocations: 267.129 MiB, 2.96% gc time, 97.81% compilation time)

julia> include("mwe.jl");
┌ Info: Info
│   VERSION = v"1.6.0-rc2"
│   Threads.nthreads() = 3now() = 2021-03-18T23:30:26.200
  5.332556 seconds (19.67 M allocations: 961.109 MiB, 4.60% gc time, 95.40% compilation time)
  0.000245 seconds (183 allocations: 17.531 KiB)
  1.696665 seconds (4.77 M allocations: 273.649 MiB, 3.06% gc time, 94.83% compilation time)

julia> include("mwe.jl");
┌ Info: Info
│   VERSION = v"1.6.0-rc2"
│   Threads.nthreads() = 4now() = 2021-03-18T23:31:03.480
  5.275898 seconds (19.67 M allocations: 961.217 MiB, 4.78% gc time, 95.42% compilation time)
  0.000305 seconds (185 allocations: 20.500 KiB)
  1.730575 seconds (4.89 M allocations: 280.301 MiB, 2.77% gc time, 94.77% compilation time)

julia> include("mwe.jl");
┌ Info: Info
│   VERSION = v"1.6.0-rc2"
│   Threads.nthreads() = 5now() = 2021-03-18T23:31:33.826
  5.243164 seconds (19.67 M allocations: 961.109 MiB, 4.71% gc time, 95.49% compilation time)
  0.000289 seconds (183 allocations: 17.531 KiB)
  1.781002 seconds (5.01 M allocations: 287.013 MiB, 2.72% gc time, 93.84% compilation time)

julia> include("mwe.jl");
┌ Info: Info
│   VERSION = v"1.6.0-rc2"
│   Threads.nthreads() = 6now() = 2021-03-18T23:31:53.600
  5.260127 seconds (19.67 M allocations: 961.109 MiB, 4.74% gc time, 95.45% compilation time)
  0.000255 seconds (183 allocations: 17.312 KiB)
  1.872338 seconds (5.04 M allocations: 289.046 MiB, 2.61% gc time, 95.01% compilation time)

julia> include("mwe.jl");
┌ Info: Info
│   VERSION = v"1.6.0-rc2"
│   Threads.nthreads() = 7now() = 2021-03-18T23:32:13.523
  5.279015 seconds (19.67 M allocations: 961.217 MiB, 4.60% gc time, 95.42% compilation time)
  0.000283 seconds (185 allocations: 20.500 KiB)
  1.846532 seconds (5.08 M allocations: 291.157 MiB, 2.73% gc time, 93.09% compilation time)

(accidentally filed as apache/arrow-julia#155)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions