Explicitly not support codes and levels property for MultiIndex #952
Codecov Report
@@           Coverage Diff           @@
##           master     #952   +/-   ##
=======================================
  Coverage   95.13%   95.13%
=======================================
  Files          34       34
  Lines        6765     6765
=======================================
  Hits         6436     6436
  Misses        329      329
databricks/koalas/indexes.py
Outdated
i += 1

sdf = sdf.orderBy("__order__").select(sdf.colRegex("`__code_.`"))
return DataFrame(_InternalFrame(sdf=sdf)).astype('int').to_numpy().T.tolist()
Hmm, this collects everything on the driver side, which goes against the design principles. I think we should rather explicitly not support this, and let users call to_pandas() directly. cc @ueshin
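To illustrate the concern, here is a minimal pure-Python sketch of what a `codes`/`levels` pair looks like for a MultiIndex. The helper name `levels_and_codes` is hypothetical and this is not the Koalas or pandas implementation; it only mirrors the shape of the result. The point is that `codes` holds one integer per row per level, so returning it as a plain list means every row has to be materialized in one place.

```python
from typing import Hashable, List, Sequence, Tuple


def levels_and_codes(
    index_tuples: Sequence[Tuple[Hashable, ...]],
) -> Tuple[List[list], List[List[int]]]:
    """Hypothetical helper: derive per-level unique values and integer codes.

    Assumes every tuple has the same length and that the values within a
    level are sortable.
    """
    if not index_tuples:
        return [], []
    n_levels = len(index_tuples[0])
    # One sorted list of distinct values per level.
    levels = [sorted({t[i] for t in index_tuples}) for i in range(n_levels)]
    # Map each value to its position inside its level.
    lookups = [{value: pos for pos, value in enumerate(level)} for level in levels]
    # One code per row per level: this is the part that grows with the data.
    codes = [[lookups[i][t[i]] for t in index_tuples] for i in range(n_levels)]
    return levels, codes


levels, codes = levels_and_codes([("a", "x"), ("a", "y"), ("b", "x")])
print(levels)
print(codes)
```

Each inner list in `codes` has as many entries as the index has rows, which is why producing it as a local Python object implies collecting the whole index.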
Ah, in that case, we might also rather explicitly not support levels.
Hmm, I see. If so, what do you suggest we do next? @ueshin @HyukjinKwon
In my work experience, most of the time the methods/properties related to the index are used on columns, and pandas should be enough.
It requires collecting all the data into the driver, so we shouldn't implement it. @charlesdong1991 do you use this property and levels often? If not, let's explicitly not support them. You can refer to #744 to see how to explicitly mark something as unsupported.
Sorry for my late reply, I didn't have time these days @HyukjinKwon
Tbh, I rarely use levels or codes, but this can differ across people and work. I will drop support for levels as well.
Softagram Impact Report for pull/952 (head commit: d3fd4c0)
'levels',
reason="'levels' requires to collect all data into the driver which is against the "
"design principle of Koalas. Alternatively, you could call 'to_pandas()' and"
" use 'levels' property in pandas.")
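For readers unfamiliar with the "explicitly not supported" pattern that the `reason=` excerpt above belongs to, here is a minimal self-contained sketch of the idea. All names in it (`UnsupportedPropertyError`, `unsupported_property_sketch`, `MissingMultiIndexSketch`) are hypothetical and chosen for illustration; this is not the Koalas helper itself, whose exact signature and error type are not shown in this thread.

```python
class UnsupportedPropertyError(NotImplementedError):
    """Hypothetical error type raised when an unsupported property is accessed."""


def unsupported_property_sketch(class_name: str, property_name: str, reason: str = ""):
    """Build a property that always raises, with a message explaining why.

    Hypothetical stand-in for a library's 'explicitly unsupported' helper.
    """

    def getter(self):
        message = "The property `{}.{}` is not supported.".format(
            class_name, property_name
        )
        if reason:
            message += " " + reason
        raise UnsupportedPropertyError(message)

    return property(getter)


class MissingMultiIndexSketch:
    # Accessing either property fails fast with guidance instead of
    # silently collecting all data on the driver.
    levels = unsupported_property_sketch(
        "MultiIndex",
        "levels",
        reason="'levels' requires collecting all data into the driver. "
        "Alternatively, you could call 'to_pandas()' and use 'levels' in pandas.",
    )
    codes = unsupported_property_sketch(
        "MultiIndex",
        "codes",
        reason="'codes' requires collecting all data into the driver. "
        "Alternatively, you could call 'to_pandas()' and use 'codes' in pandas.",
    )


try:
    MissingMultiIndexSketch().levels
except UnsupportedPropertyError as error:
    print(error)
```

The design choice is to fail loudly with a pointer to `to_pandas()`, so the user opts in to collecting the data rather than having it happen implicitly.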
@ueshin let me merge this one for now, but we can always flip this decision if I missed anything.
Thanks @charlesdong1991
No description provided.