[query] Add to_pandas(types={}) argument to specify user-supplied pandas dtypes#12735

Merged
danking merged 5 commits into hail-is:main from mkanai:custom_pd_types
Mar 5, 2023

mkanai (Contributor) commented Feb 27, 2023

This PR fixes #11738. Users can now specify arbitrary type conversions between Hail types and pandas dtypes via:

ht.to_pandas(types={"col1": "int32", "col2": np.float64, hl.tstring: "object"})

This maps col1 and col2 to int32 and np.float64, respectively, and all hl.tstring fields to object.
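
To make the mixed-key lookup concrete, here is a minimal pure-Python sketch of how a `types` mapping keyed by either column name or Hail type could be resolved per column. It is not the code in this PR: `resolve_dtypes` is a hypothetical helper, placeholder strings stand in for real Hail type objects so the example runs without Hail, and the rule that a column-name entry wins over a Hail-type entry is an assumption. The only default shown (string fields falling back to the pandas `string` dtype) is taken from the title of the linked issue.

```python
def resolve_dtypes(schema, types, defaults):
    """Pick a pandas dtype for each column (hypothetical helper, not Hail's API).

    schema:   column name -> Hail type (placeholder strings in this sketch)
    types:    user overrides, keyed by column name or by Hail type
    defaults: fallback dtypes, keyed by Hail type
    """
    resolved = {}
    for column, hail_type in schema.items():
        if column in types:
            # Assumption: a column-name entry takes precedence.
            resolved[column] = types[column]
        elif hail_type in types:
            # Otherwise an override for the column's Hail type applies.
            resolved[column] = types[hail_type]
        elif hail_type in defaults:
            # No override at all: use the default for this Hail type.
            resolved[column] = defaults[hail_type]
        # Columns with no match are left out so pandas can infer the dtype.
    return resolved


# Placeholder type names; real code would use Hail type objects as keys.
schema = {"col1": "tint32", "col2": "tfloat64", "col3": "tstring"}
defaults = {"tstring": "string"}
types = {"col1": "int32", "col2": "float64", "tstring": "object"}

resolved = resolve_dtypes(schema, types, defaults)
print(resolved)
```

With the overrides above, `col3` resolves to `object`; with an empty `types` it falls back to `string`. Because the sketch uses strings for both column names and types, a column literally named like a type would collide with a type key, which would not happen when the type keys are Hail type objects.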

One design question is whether column-name and Hail-type specifications should be separate arguments. Any thoughts? cc: @danking

Also, I don't think the current type check would work for np.float64-like numpy dtype specifications...

danking (Contributor) commented Feb 27, 2023

Amazing! I'll look into this this week. Thank you Masa!

danking assigned patrick-schultz and unassigned danking Mar 1, 2023

danking (Contributor) commented Mar 1, 2023

@patrick-schultz I added some tests and fixed behavior as a result. I think this is a valuable change, but since I've edited it directly, it seems like someone else should review it as well.

patrick-schultz (Member) left a comment

Looks good to me!

danking merged commit eb60cdd into hail-is:main Mar 5, 2023

Development

Successfully merging this pull request may close these issues.

hl.Table.to_pandas() generates a dtype=string dataframe which is still experimental

3 participants