CSV.read returned DataFrame is different? #3154

Closed
liupgd opened this issue Sep 16, 2022 · 3 comments · Fixed by #3155

liupgd commented Sep 16, 2022

Hi there, I tried this code and got an exception:

using DataFrames, CSV
df = CSV.read("xxx.csv", DataFrame)
filter(:SomeColumn=>x->x>0, df)

which raised this error:

ERROR: MethodError: no method matching Int64(::SentinelArrays.ChainedVectorIndex{Vector{Bool}})
Closest candidates are:
(::Type{T})(::T) where T<:Number at boot.jl:772
(::Type{T})(::AbstractChar) where T<:Union{Int32, Int64} at char.jl:51
(::Type{T})(::AbstractChar) where T<:Union{AbstractChar, Number} at char.jl:50
...
Stacktrace:
[1] nextind(#unused#::SentinelArrays.ChainedVector{Bool, Vector{Bool}}, i::SentinelArrays.ChainedVectorIndex{Vector{Bool}})
@ Base ./abstractarray.jl:181
[2] _copyto_bitarray!(B::BitVector, A::SentinelArrays.ChainedVector{Bool, Vector{Bool}})
@ Base ./bitarray.jl:525
[3] BitVector(A::SentinelArrays.ChainedVector{Bool, Vector{Bool}})
@ Base ./bitarray.jl:509
[4] convert(T::Type{BitVector}, a::SentinelArrays.ChainedVector{Bool, Vector{Bool}})
@ Base ./bitarray.jl:580
[5] _filter_helper(f::Function, cols::SentinelArrays.ChainedVector{String, Vector{String}})
@ DataFrames ~/.julia/packages/DataFrames/a6np0/src/abstractdataframe/abstractdataframe.jl:1110
[6] #filter#87
@ ~/.julia/packages/DataFrames/a6np0/src/abstractdataframe/abstractdataframe.jl:1092 [inlined]
[7] filter(::Pair{Symbol, var"#49#50"}, df::DataFrame)
@ DataFrames ~/.julia/packages/DataFrames/a6np0/src/abstractdataframe/abstractdataframe.jl:1087

However, this code works fine:

using DataFrames, CSV, Chain
df = @chain CSV.File("xxx.csv") DataFrame
filter(:SomeColumn=>x->x>0, df)

Version Info:

  • julia-1.8.0
  • DataFrames v1.3.5
  • CSV v0.10.4

Is this a bug? Thanks.

bkamins (Member) commented Sep 16, 2022

This is a bug.
Fixed in #3155

bkamins added this to the 1.4 milestone Sep 16, 2022
bkamins added the bug label Sep 16, 2022

bkamins (Member) commented Sep 16, 2022

For now, the recommendation is to pass ntasks=1 (i.e. disable multi-threading in CSV.read).
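
A minimal sketch of that workaround, assuming the error comes from the chunked (ChainedVector) columns that multi-threaded parsing produces, as the stack trace suggests. The temporary file and its contents are made up for illustration; a file this small would probably not be parsed with multiple tasks anyway, so the sketch only shows where the keyword goes.

```julia
using CSV, DataFrames

# Hypothetical stand-in for "xxx.csv": a small throwaway file.
path = tempname() * ".csv"
write(path, "SomeColumn,Other\n-1,a\n2,b\n3,c\n")

# ntasks=1 asks CSV.jl to parse the file in a single task, so the columns
# should not be assembled from per-task chunks.
df = CSV.read(path, DataFrame; ntasks=1)

# The same filter call as in the original report.
positive = filter(:SomeColumn => x -> x > 0, df)
```

With a real, large file only the path changes; the ntasks=1 keyword is the relevant part.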

quinnj (Member) commented Sep 20, 2022

This has been fixed and released upstream (in the SentinelArrays.jl package); running ]up should bring in the fix.
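
For anyone who prefers a script over the Pkg REPL mode, a rough equivalent of ]up using the Pkg API might look like the following. Restricting the update to SentinelArrays is an assumption made here for illustration; a plain ]up updates every package in the active environment.

```julia
using Pkg

# Update only SentinelArrays in the active environment (needs network access).
Pkg.update("SentinelArrays")

# Print the installed version to confirm that a newer release was picked up.
Pkg.status("SentinelArrays")
```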
