GDS Projection from polars / pandas dataframe or arrow table #654

Closed
Mintactus opened this issue May 30, 2024 · 5 comments

@Mintactus

Mintactus commented May 30, 2024

The new multi-engine Polars DataFrame library is absolutely a must in the data industry.
After using it for months, I find the performance benefits enormous. Adios pandas, your time has come.

Allowing GDS to export to, and create projections from, Polars DataFrames would be natural today (at least once Polars is out of alpha).

Even better, since GDS is based on Apache Arrow, I think it would make sense for GDS to create projections directly from an Arrow table. This would make it agnostic to the engine processing the data.

@Mats-SX
Contributor

Mats-SX commented Jun 3, 2024

Duplicate of #653

Mats-SX closed this as not planned on Jun 3, 2024
Mats-SX transferred this issue from neo4j/graph-data-science on Jun 3, 2024
@Mats-SX
Contributor

Mats-SX commented Jun 3, 2024

While projection and export are two distinct features in GDS and in the GDS Python Client, the question of which DataFrame libraries they accept is treated as a single, global integration concern. If we added support for Polars, it should apply to both export and projection.

For now, the same workaround of converting to/from pandas DataFrames will support workflows based on Polars. My discussion in the other issue applies similarly to projection, where we make use of Table.from_pandas() in pyarrow, but there is no Table.from_polars().

@Mintactus
Author

Since Polars, pandas (at least recent versions), GDS, and so much else on the market all share one thing in common, the Apache Arrow format, would it be a solution to simply import and export (optionally or as the default behavior) in Arrow table format?

This would make GDS agnostic to the engine processing the data before it is shipped into or out of GDS.

Thanks

@Mats-SX
Contributor

Mats-SX commented Jun 11, 2024

It is a possibility. But the pyarrow.Table type is not as ubiquitous as the pandas.DataFrame type. It is nice to have DataFrame in the API.

But we can have a polymorphic parameter set, and allow passing in pyarrow.Table objects directly. It would be some work to accomplish, but it would be possible.

@Mintactus
Author

Mintactus commented Jun 24, 2024 via email
