hl.Table.to_pandas() generates a dtype=string dataframe which is still experimental #11738

Closed
mkanai opened this issue Apr 6, 2022 · 2 comments · Fixed by #12735

Comments

@mkanai
Contributor

mkanai commented Apr 6, 2022

I encountered an error in an external package when I passed it a Hail-generated pandas DataFrame. The cause is the dtype pandas.StringDtype, which that package does not support.
pyranges/pyranges#264

Given that it's still experimental in pandas, can we have an option to generate a DataFrame whose string columns have dtype=object? Or maybe we should make dtype=object the default.

hail/hail/python/hail/table.py

Lines 3345 to 3346 in c4b0995

    if hl_dtype == hl.tstr:
        pd_dtype = 'string'
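
Until such an option exists, one caller-side stopgap is to cast the affected columns back to object after the conversion. The snippet below is a minimal sketch using only pandas; the helper name `string_columns_to_object` and the sample frame are made up for illustration and are not part of Hail.

```python
import pandas as pd


def string_columns_to_object(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every StringDtype column cast to object.

    Hypothetical helper for illustration; not part of Hail or pandas.
    """
    string_cols = [
        name for name, dtype in df.dtypes.items()
        if isinstance(dtype, pd.StringDtype)
    ]
    # astype with a per-column mapping leaves all other columns untouched.
    return df.astype({name: object for name in string_cols})


# Stand-in for a frame like the one to_pandas() returns for a tstr column.
df = pd.DataFrame({
    "chrom": pd.array(["1", "2"], dtype="string"),
    "pos": [100, 200],
})
converted = string_columns_to_object(df)
```

The cast returns a new frame, so the original stays available if other code relies on the string dtype.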

@danking
Contributor

danking commented Jun 2, 2022

Great suggestion Masa! We can provide a types argument to to_pandas which allows the user to override the type for a subset of columns. I've marked this help wanted. If someone on the team has some spare cycles they might pick it up. We also welcome PRs to make this change!
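
The override idea can be sketched in plain Python as follows. This is only an illustration of the proposal, not Hail's implementation: the function `resolve_pd_dtypes`, the use of type names as strings, and every default other than the `'string'` entry quoted from table.py above are assumptions.

```python
from typing import Dict, Optional

# Only the "str" -> "string" entry comes from the table.py excerpt in this
# issue; the remaining entries are illustrative assumptions.
DEFAULT_PD_DTYPES: Dict[str, str] = {
    "str": "string",
    "int32": "Int32",
    "float64": "Float64",
}


def resolve_pd_dtypes(
    column_hl_types: Dict[str, str],
    types: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Pick a pandas dtype per column, letting `types` override the default.

    Hypothetical helper: `types` maps a column name to the dtype the user
    wants, and may cover only a subset of the columns.
    """
    overrides = dict(types or {})
    unknown = set(overrides) - set(column_hl_types)
    if unknown:
        raise KeyError(f"types refers to unknown columns: {sorted(unknown)}")

    resolved = {}
    for column, hl_type in column_hl_types.items():
        if column in overrides:
            resolved[column] = overrides[column]
        else:
            # Fall back to object for types without a known default.
            resolved[column] = DEFAULT_PD_DTYPES.get(hl_type, "object")
    return resolved


resolved = resolve_pd_dtypes(
    {"gene": "str", "pos": "int32"},
    types={"gene": "object"},
)
```

Keeping the defaults and applying overrides per column means existing callers see no change, while users of packages that reject StringDtype can opt out for just the columns they need.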

@danking
Contributor

danking commented Mar 13, 2023

Closing the loop: this was released into 0.2.110!
