Cannot round-trip a file (read, write, read) in some circumstances #1140

@TimG1964

Refer to this discussion on the JuliaLang Discourse, where the following reply was posted:

Can you file an issue against CSV.jl on GitHub? There’s probably a bug when the cut point to attribute parts of the file to tasks is in a particular position.

The error described there is:

┌ Warning: thread = 1 warning: only found 15 / 16 columns around data row: 210003. Filling remaining columns with `missing`
└ @ CSV C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\file.jl:586
┌ Warning: thread = 1 warning: only found 15 / 16 columns around data row: 210003. Filling remaining columns with `missing`
└ @ CSV C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\file.jl:586
┌ Warning: thread = 1 warning: only found 15 / 16 columns around data row: 210003. Filling remaining columns with `missing`
└ @ CSV C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\file.jl:586
┌ Warning: thread = 1 warning: only found 15 / 16 columns around data row: 210003. Filling remaining columns with `missing`
└ @ CSV C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\file.jl:586
ERROR: LoadError: TaskFailedException

    nested task error: CSV.Error("thread = 2 fatal error, encountered an invalidly quoted field while parsing around row = 175539, col = 3: \"\"I will undertake a research trip hosted by Michele Bryd-McPhee curator of ‘Ladies of Hip-Hop Festival’ in New York City in March and July 2018 with 3 fundamental areas of enquiry; \n\", error=INVALID: OK | QUOTED | EOF | INVALID_QUOTED_FIELD , check your `quotechar` arguments or manually fix the field in the file itself")
    Stacktrace:
     [1] fatalerror(buf::Vector{UInt8}, pos::Int64, len::Int64, code::Int16, row::Int64, col::Int64)
       @ CSV C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\file.jl:590
     [2] parsevalue!(::Type{String}, buf::Vector{UInt8}, pos::Int64, len::Int64, row::Int64, rowoffset::Int64, i::Int64, col::CSV.Column, ctx::CSV.Context)
       @ CSV C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\file.jl:798
     [3] parserow
       @ C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\file.jl:640 [inlined]
     [4] parsefilechunk!(ctx::CSV.Context, pos::Int64, len::Int64, rowsguess::Int64, rowoffset::Int64, columns::Vector{CSV.Column}, ::Type{Tuple{}})
       @ CSV C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\file.jl:550
     [5] multithreadparse(ctx::CSV.Context, pertaskcolumns::Vector{Vector{CSV.Column}}, rowchunkguess::Int64, i::Int64, rows::Vector{Int64}, wholecolumnslock::ReentrantLock)
       @ CSV C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\file.jl:360
     [6] (::CSV.var"#34#39"{CSV.Context, Vector{Vector{CSV.Column}}, Int64, Int64, Vector{Int64}, ReentrantLock})()
       @ CSV C:\Users\TGebbels\.julia\packages\WorkerUtilities\ey0fP\src\WorkerUtilities.jl:384
Stacktrace:
  [1] sync_end(c::Channel{Any})
    @ Base .\task.jl:448
  [2] macro expansion
    @ .\task.jl:480 [inlined]
  [3] CSV.File(ctx::CSV.Context, chunking::Bool)
    @ CSV C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\file.jl:240
  [4] File
    @ C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\file.jl:227 [inlined]
  [5] #File#32
    @ C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\file.jl:223 [inlined]
  [6] CSV.File(source::String)
    @ CSV C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\file.jl:162
  [7] read(source::String, sink::Type; copycols::Bool, kwargs::@Kwargs{})
    @ CSV C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\CSV.jl:117
  [8] read(source::String, sink::Type)
    @ CSV C:\Users\TGebbels\.julia\packages\CSV\cwX2w\src\CSV.jl:113
  [9] top-level scope
    @ c:\Users\TGebbels...\Documents\DCMS Database\CompareCsv.jl:361
 [10] include(fname::String)
    @ Base.MainInclude .\client.jl:489
 [11] run(debug_session::VSCodeDebugger.DebugAdapter.DebugSession, error_handler::VSCodeDebugger.var"#3#4"{String})
    @ VSCodeDebugger.DebugAdapter c:\Users\TGebbels\.vscode\extensions\julialang.language-julia-1.105.2\scripts\packages\DebugAdapter\src\packagedef.jl:126
 [12] startdebugger()
    @ VSCodeDebugger c:\Users\TGebbels\.vscode\extensions\julialang.language-julia-1.105.2\scripts\packages\VSCodeDebugger\src\VSCodeDebugger.jl:45
 [13] top-level scope
    @ c:\Users\TGebbels\.vscode\extensions\julialang.language-julia-1.105.2\scripts\debugger\run_debugger.jl:12
 [14] include(mod::Module, _path::String)
    @ Base .\Base.jl:495
 [15] exec_options(opts::Base.JLOptions)
    @ Base .\client.jl:318
 [16] _start()
    @ Base .\client.jl:552
in expression starting at c:\Users\TGebbels\...\Documents\DCMS Database\CompareCsv.jl:361
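
For reference, the round trip in the title (write a table, then read it back) can be sketched as below. This is a minimal illustration only: the table contents and the temporary file path are hypothetical, and a file this small will not be split across tasks, so it does not reproduce the failure above. Passing `ntasks=1` to `CSV.File` requests single-task parsing; whether that avoids the error on the original file is an assumption that has not been confirmed here.

```julia
using CSV

# Hypothetical data: a text column whose fields contain embedded newlines,
# commas and quote characters, similar in kind to the field named in the error.
notes = ["line one\nline two, with a comma",
         "she said \"hello\"\nthen left",
         "plain"]
table = (id = [1, 2, 3], notes = notes)

# Write the table to a temporary CSV file (path is illustrative).
path = tempname() * ".csv"
CSV.write(path, table)

# Read it back. ntasks=1 asks CSV.jl to parse with a single task, so the
# file is not divided into chunks for concurrent parsing.
roundtrip = CSV.File(path; ntasks=1)

# The round trip succeeds if every column comes back unchanged.
roundtrip_ok = collect(roundtrip.id) == table.id &&
               collect(roundtrip.notes) == table.notes
```

If `roundtrip_ok` is `true` with `ntasks=1` on the real file but the default settings fail, that would be consistent with the chunk cut point suggestion quoted above.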
