DataFrame(CSV.File(...)) broken with header-only #702

Closed
omus opened this issue Jul 31, 2020 · 1 comment · Fixed by JuliaData/DataFrames.jl#2341

omus commented Jul 31, 2020

Starting with CSV.jl 0.7.5, a file containing only a header is treated as if it were empty when converted with DataFrame(CSV.File(...)):

julia> using CSV, DataFrames

julia> names(DataFrame(CSV.File(IOBuffer("a,b,c"))))
0-element Array{String,1}

julia> names(DataFrame(CSV.File(IOBuffer("a,b,c\n1,2,3"))))
3-element Array{String,1}:
 "a"
 "b"
 "c"

Calling CSV.read works as expected.

quinnj added a commit to JuliaData/DataFrames.jl that referenced this issue Aug 1, 2020
Fixes JuliaData/CSV.jl#702. A DataFrame
constructor was removed from CSV.jl in its latest release (0.7.6) as
part of new deprecations for decoupling the two packages. What I forgot
was that we were relying on that constructor because of an ambiguity in
the generic DataFrame constructor fallback that mostly uses Tables.jl to
try and turn anything into a DataFrame. The problem is "mostly": one of
the checks is whether the input is an `AbstractVector` of
`AbstractVector`s, which it turns out `CSV.File` is! This is because it's
considered an `AbstractVector` of `Row`s, which themselves are
`<: AbstractVector`. So that fallback constructor was basically taking
each row of a `CSV.File` and treating it as a column.

While it was proposed to put the DataFrame constructor back in CSV.jl
for now, that's pretty unsatisfying because we know we're going to
remove the DataFrames deprecation at some point, and when we do, we
want `DataFrame(CSV.File())` to work out of the box.

The fix here is just to add a check up front in the fallback
constructor for whether the input satisfies `Tables.istable(x)`. If so,
we skip these extra corner case checks and just let the Tables.jl
machinery do the work.
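
The ordering problem described in the commit message can be sketched with a toy model in plain Julia. Everything here is a hypothetical stand-in: `ToyRow`, `ToyFile`, `toy_istable`, `toy_columns`, `columns_before`, and `columns_after` are made-up names for illustration, not the CSV.jl, Tables.jl, or DataFrames.jl API, and the real constructors may differ in detail. The sketch only assumes what the message states: the file type is an `AbstractVector` whose elements are themselves `AbstractVector`s.

```julia
# Toy stand-in for a row type that subtypes AbstractVector (hypothetical).
struct ToyRow <: AbstractVector{Any}
    values::Vector{Any}
end
Base.size(r::ToyRow) = size(r.values)
Base.getindex(r::ToyRow, i::Int) = r.values[i]

# Toy stand-in for a parsed file: a vector of rows plus column names (hypothetical).
struct ToyFile <: AbstractVector{ToyRow}
    names::Vector{Symbol}
    rows::Vector{ToyRow}
end
Base.size(f::ToyFile) = size(f.rows)
Base.getindex(f::ToyFile, i::Int) = f.rows[i]

# Hypothetical trait playing the role of a "is this a table?" check.
toy_istable(x) = false
toy_istable(::ToyFile) = true

# Hypothetical table path: build name => column pairs from the header.
toy_columns(f::ToyFile) =
    [n => Any[r[j] for r in f.rows] for (j, n) in enumerate(f.names)]

# Shared vector-of-vectors test used by both fallbacks.
is_vector_of_vectors(x) = x isa AbstractVector && all(v -> v isa AbstractVector, x)

# Fallback with the vector-of-vectors check first: each element becomes a column.
function columns_before(x)
    if is_vector_of_vectors(x)
        return [Symbol("x", i) => collect(v) for (i, v) in enumerate(x)]
    end
    return toy_columns(x)
end

# Fallback with the table check up front, as the fix describes.
function columns_after(x)
    toy_istable(x) && return toy_columns(x)
    if is_vector_of_vectors(x)
        return [Symbol("x", i) => collect(v) for (i, v) in enumerate(x)]
    end
    error("unsupported input")
end

# A header-only file: three column names, zero rows.
header_only = ToyFile([:a, :b, :c], ToyRow[])

names_before = first.(columns_before(header_only))  # header is lost
names_after = first.(columns_after(header_only))    # header is kept
```

In this toy, the header-only input passes the vector-of-vectors check vacuously, because `all` over an empty collection returns `true`, so `columns_before` yields no columns at all. Checking the table trait first sends the same input down the path that reads the header.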
quinnj commented Aug 1, 2020

Alternative fix proposed at JuliaData/DataFrames.jl#2341
