ERROR: BoundsError: attempt to access Core.SimpleVector when converting to DataFrames.DataFrame #68

Closed
xgdgsc opened this issue Feb 18, 2020 · 5 comments · Fixed by #85

@xgdgsc

xgdgsc commented Feb 18, 2020

I get the following error when trying to convert a Pandas.DataFrame to a DataFrames.DataFrame:

julia> oDf = DataFrames.DataFrame(pDf)
ERROR: BoundsError: attempt to access Core.SimpleVector
  at index [1]
Stacktrace:
 [1] getindex(::Core.SimpleVector, ::Int64) at ./essentials.jl:582
 [2] Array(::Series) at /home/gsc/.julia/packages/Pandas/rAPmB/src/Pandas.jl:81
 [3] (::Pandas.var"#430#433"{Pandas.DataFrame})(::String) at ./none:0
 [4] iterate(::Base.Generator{Array{String,1},Pandas.var"#430#433"{Pandas.DataFrame}}) at ./generator.jl:47
 [5] get_columns_copy_using_missing(::Pandas.DataFrame) at /home/gsc/.julia/packages/Pandas/rAPmB/src/tabletraits.jl:24
 [6] columns at /home/gsc/.julia/packages/Tables/FXXeK/src/fallbacks.jl:175 [inlined]
 [7] #DataFrame#451(::Bool, ::Type{DataFrame}, ::Pandas.DataFrame) at /home/gsc/.julia/packages/DataFrames/uPgZV/src/other/tables.jl:32
 [8] DataFrame(::Pandas.DataFrame) at /home/gsc/.julia/packages/DataFrames/uPgZV/src/other/tables.jl:23
 [9] top-level scope at REPL[77]:1

The table is a simple one, obtained through PyCall from another Python package:

julia> pDf =  Pandas.DataFrame(e_data)
        date     open    close     high     low    volume        amount
0 2020-02-17  100.000  100.000  100.001  99.999  957942.0  9.579393e+07
1 2020-02-18   99.999  100.001  100.001  99.998  887600.0  8.875952e+07

It only happens with this table obtained through PyCall. When I save it to CSV and load it again with Pandas.jl, the conversion oDf = DataFrames.DataFrame(pDf) does not error.
Any clue how to work around this?

@malmaud
Collaborator

malmaud commented Feb 18, 2020

Hmm I’ll look into it.

@xgdgsc
Author

xgdgsc commented Feb 22, 2020

I compared the eltype of the columns in both tables, converted the date column from Any to String, and the conversion works now.

@malmaud malmaud self-assigned this Feb 24, 2020
@bgroenks96

@malmaud Any progress on this? Still appears to be an issue in the current release.

@Toundra

Toundra commented Apr 7, 2021

It also breaks on datetime64 columns. Here's the code to reproduce:

using PyCall, DataFrames
import Pandas as pd

py"""
import pandas as pd

def get_df():
    df = pd.DataFrame({
        "a":pd.to_datetime(["2021.01.15","2021.01.15","2020.04.06"])
    })
    return df
"""

df = py"get_df"() |> pd.DataFrame
df2 = DataFrame(df)
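
The workaround reported earlier in the thread was to turn the date column into strings before converting. A minimal sketch of doing that on the Python side, before the frame is handed to Pandas.jl, might look like the following. The helper name `stringify_datetime_columns`, the `fmt` parameter, and the extra column `b` are hypothetical and only for illustration; whether this avoids the BoundsError in a given Pandas.jl release has not been verified here.

```python
import pandas as pd


def stringify_datetime_columns(df: pd.DataFrame, fmt: str = "%Y-%m-%d") -> pd.DataFrame:
    """Return a copy of df with every datetime64 column rendered as strings.

    Hypothetical helper, not part of pandas or Pandas.jl.
    """
    out = df.copy()
    for name in out.columns:
        # Only datetime64 columns are touched; all others pass through unchanged.
        if pd.api.types.is_datetime64_any_dtype(out[name]):
            out[name] = out[name].dt.strftime(fmt)
    return out


# Same dates as the reproduction above; column "b" is added only to show
# that non-datetime columns are left alone.
raw = pd.DataFrame({
    "a": pd.to_datetime(["2021.01.15", "2021.01.15", "2020.04.06"], format="%Y.%m.%d"),
    "b": [1.0, 2.0, 3.0],
})
safe = stringify_datetime_columns(raw)

print(raw.dtypes)           # column "a" is a datetime64 dtype here
print(safe["a"].tolist())   # the same dates as plain strings
```

In the reproduction, the helper would be called inside `get_df` just before `return df`. The trade-off is that the Julia side receives strings rather than dates, so they would need to be parsed again there if date arithmetic is required.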

@malmaud
Collaborator

malmaud commented Aug 28, 2021

Should be fixed now.
