df.dtypes fails with Arrow #3709

Closed
tvst opened this issue Aug 16, 2021 · 1 comment · Fixed by #3836
Labels: package:arrow (Related to Arrow), type:bug (Something isn't working)

Comments

Contributor

tvst commented Aug 16, 2021

Summary

Using df.dtypes to show the datatypes for a dataframe fails when you're using the Arrow codepath.

Steps to reproduce

Code snippet:

import streamlit as st
import pandas as pd

df = pd.read_json("https://cdn.jsdelivr.net/npm/vega-datasets@2/data/penguins.json")

st.write(df.dtypes)

Expected behavior:

The app should show a table listing the datatype of each column in the given dataframe.

Actual behavior:

The app shows this exception:


Exception text

ArrowInvalid: ("Could not convert dtype('O') with type numpy.dtype[object_]: did not recognize Python value type when inferring an Arrow data type", 'Conversion failed for column 0 with type object')
Traceback:
File "/home/tvst/.local/share/virtualenvs/streamlit-vega-lite-demo-Se5mEOd4/lib/python3.9/site-packages/streamlit/script_runner.py", line 350, in _run_script
    exec(code, module.__dict__)
File "/media/storage/Projects/streamlit/streamlit-vega-lite-demo/demo.py", line 15, in <module>
    df.dtypes
File "/home/tvst/.local/share/virtualenvs/streamlit-vega-lite-demo-Se5mEOd4/lib/python3.9/site-packages/streamlit/__init__.py", line 478, in _transparent_write
    write(*args)
File "/home/tvst/.local/share/virtualenvs/streamlit-vega-lite-demo-Se5mEOd4/lib/python3.9/site-packages/streamlit/elements/write.py", line 182, in write
    self.dg.dataframe(arg)
File "/home/tvst/.local/share/virtualenvs/streamlit-vega-lite-demo-Se5mEOd4/lib/python3.9/site-packages/streamlit/elements/dataframe_selector.py", line 85, in dataframe
    return self.dg._arrow_dataframe(data, width, height)
File "/home/tvst/.local/share/virtualenvs/streamlit-vega-lite-demo-Se5mEOd4/lib/python3.9/site-packages/streamlit/elements/arrow.py", line 82, in _arrow_dataframe
    marshall(proto, data, default_uuid)
File "/home/tvst/.local/share/virtualenvs/streamlit-vega-lite-demo-Se5mEOd4/lib/python3.9/site-packages/streamlit/elements/arrow.py", line 160, in marshall
    proto.data = type_util.data_frame_to_bytes(df)
File "/home/tvst/.local/share/virtualenvs/streamlit-vega-lite-demo-Se5mEOd4/lib/python3.9/site-packages/streamlit/type_util.py", line 371, in data_frame_to_bytes
    table = pa.Table.from_pandas(df)
File "pyarrow/table.pxi", line 1561, in pyarrow.lib.Table.from_pandas
File "/home/tvst/.local/share/virtualenvs/streamlit-vega-lite-demo-Se5mEOd4/lib64/python3.9/site-packages/pyarrow/pandas_compat.py", line 594, in dataframe_to_arrays
    arrays = [convert_column(c, f)
File "/home/tvst/.local/share/virtualenvs/streamlit-vega-lite-demo-Se5mEOd4/lib64/python3.9/site-packages/pyarrow/pandas_compat.py", line 594, in <listcomp>
    arrays = [convert_column(c, f)
File "/home/tvst/.local/share/virtualenvs/streamlit-vega-lite-demo-Se5mEOd4/lib64/python3.9/site-packages/pyarrow/pandas_compat.py", line 581, in convert_column
    raise e
File "/home/tvst/.local/share/virtualenvs/streamlit-vega-lite-demo-Se5mEOd4/lib64/python3.9/site-packages/pyarrow/pandas_compat.py", line 575, in convert_column
    result = pa.array(col, type=type_, from_pandas=True, safe=safe)
File "pyarrow/array.pxi", line 302, in pyarrow.lib.array
File "pyarrow/array.pxi", line 83, in pyarrow.lib._ndarray_to_array
File "pyarrow/error.pxi", line 99, in pyarrow.lib.check_status
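
A possible workaround, sketched under assumptions rather than taken from this issue: the exception message says Arrow could not convert `dtype('O')` values, and `df.dtypes` is a pandas Series of dtype `object` whose elements are NumPy dtype objects. Casting that Series to strings before display avoids handing those objects to Arrow. The small dataframe below is a hypothetical stand-in for the penguins dataset so the example needs no network access.

```python
import pandas as pd

# Hypothetical stand-in for the penguins dataset (not the real data).
df = pd.DataFrame({
    "Species": ["Adelie", "Gentoo"],
    "Body Mass (g)": [3750.0, 5000.0],
    "Year": [2007, 2008],
})

# df.dtypes is an object-dtype Series; each value is a dtype object,
# which is what the Arrow conversion in the traceback fails to infer.
dtypes = df.dtypes

# Cast every dtype to its string name so the Series holds plain text.
dtypes_as_text = dtypes.astype(str)
print(dtypes_as_text)
```

In the app from the report, this would mean calling `st.write(df.dtypes.astype(str))` instead of `st.write(df.dtypes)`. That call has not been verified against Streamlit 0.86.0 here.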

Is this a regression?

Yes: this works on the legacy codepath but fails on the Arrow codepath.

Debug info

  • Streamlit version: 0.86.0

tvst added the type:bug, package:arrow, and status:needs-triage labels on Aug 16, 2021

joelbs commented Aug 23, 2021

Similar issue. Anyone know anything about this?

vdonato removed the status:needs-triage label on Aug 26, 2021
kantuni self-assigned this on Sep 20, 2021
kantuni added a commit to kantuni/streamlit that referenced this issue Sep 23, 2021
kantuni added a commit that referenced this issue Sep 24, 2021
…onversion exception (#3836)

* Closes #3709

* Remove unused import

* Better error message