ENH: Make _to_dataframe faster for extension array columns after pandas fix #8950

Open
ilan-gold opened this issue Apr 17, 2024 · 0 comments

Comments

@ilan-gold
Contributor

What is your issue?

Once pandas-dev/pandas#57676 is completed, we should be able to do the joins in the _to_dataframe method faster (we need to be able to handle the singleton case, which is the issue with pandas):

xarray/xarray/core/dataset.py

Lines 7170 to 7177 in 239309f

    def _to_dataframe(self, ordered_dims: Mapping[Any, int]):
        columns = [k for k in self.variables if k not in self.dims]
        data = [
            self._variables[k].set_dims(ordered_dims).values.reshape(-1)
            for k in columns
        ]
        index = self.coords.to_index([*ordered_dims])
        return pd.DataFrame(dict(zip(columns, data)), index=index)

see discussion here
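
For readers unfamiliar with the quoted method, here is a minimal sketch of the same pattern using only NumPy and pandas: broadcast every variable to the full set of ordered dimensions, flatten it, and build one DataFrame over a product index. This is not xarray's implementation. The toy data and the `broadcast_flat` helper are hypothetical, and the sketch assumes each variable's dimensions appear in the same relative order as `dims`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for a Dataset's dims, coords and variables.
dims = {"x": 2, "y": 3}
coords = {"x": [10, 20], "y": ["a", "b", "c"]}
variables = {
    "temp": (("x", "y"), np.arange(6.0).reshape(2, 3)),
    "scale": (("x",), np.array([1.0, 2.0])),
}


def broadcast_flat(var_dims, values, ordered_dims):
    """Broadcast `values` to the full `ordered_dims` shape, then flatten.

    Hypothetical helper playing the role of `set_dims(...).values.reshape(-1)`.
    Assumes `var_dims` is ordered consistently with `ordered_dims`.
    """
    # Use a length-1 axis for every dimension the variable does not have.
    expanded = tuple(n if d in var_dims else 1 for d, n in ordered_dims.items())
    full_shape = tuple(ordered_dims.values())
    return np.broadcast_to(values.reshape(expanded), full_shape).reshape(-1)


# One row per combination of coordinate values, in C (row-major) order.
index = pd.MultiIndex.from_product([coords[d] for d in dims], names=list(dims))
df = pd.DataFrame(
    {name: broadcast_flat(d, v, dims) for name, (d, v) in variables.items()},
    index=index,
)
```

Going through `.values` and `reshape` like this only works for plain NumPy data, which is presumably why extension array columns need a separate, join-based path.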

@ilan-gold ilan-gold added the needs triage Issue that has not been reviewed by xarray team member label Apr 17, 2024
@max-sixty max-sixty added upstream issue and removed needs triage Issue that has not been reviewed by xarray team member labels Apr 28, 2024