Implement Series.item #1502
Conversation
Codecov Report
```diff
@@            Coverage Diff             @@
##           master    #1502      +/-   ##
==========================================
- Coverage   93.93%   93.90%    -0.04%
==========================================
  Files          36       36
  Lines        8445     8448        +3
==========================================
  Hits         7933     7933
- Misses        512      515        +3
```

Continue to review full report at Codecov.
```python
item_top_two = self[:2]
if len(item_top_two) != 1:
    raise ValueError("can only convert an array of size 1 to a Python scalar")
return item_top_two[0]
```
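For context, the contract this diff implements mirrors pandas' `Series.item`: return the single element of a size-1 series, and otherwise raise `ValueError`. A minimal pure-Python sketch of that contract (the standalone `item` helper below is hypothetical, not koalas code):

```python
def item(values):
    # Hypothetical stand-in for Series.item: look at no more than the
    # first two elements, and succeed only when exactly one exists.
    top_two = values[:2]
    if len(top_two) != 1:
        raise ValueError("can only convert an array of size 1 to a Python scalar")
    return top_two[0]


print(item([42]))  # a size-1 input yields its scalar

try:
    item([1, 2, 3])  # more than one element raises
except ValueError as exc:
    print(exc)
```

Slicing to two elements (rather than materializing the whole series) is enough to distinguish "exactly one" from "zero or many".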
Seems like this still runs Spark jobs twice? We should explicitly call `to_pandas()` or `collect()`?
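To illustrate the reviewer's point with a toy model (the `CountingSeries` class below is a hypothetical stand-in, not Spark or koalas internals): when the sliced series stays distributed, the `len()` check and the `[0]` access can each launch a job, whereas collecting the first rows once, e.g. via `to_pandas()` or `collect()`, pays for a single job:

```python
class CountingSeries:
    """Toy stand-in for a Spark-backed series: every fetch counts as one 'job'."""

    def __init__(self, data):
        self._data = data
        self.jobs = 0

    def fetch(self, n):
        # In a real Spark-backed series, this would launch a distributed job.
        self.jobs += 1
        return self._data[:n]


def item_two_jobs(series):
    # Naive version: the length check and the element access each fetch.
    if len(series.fetch(2)) != 1:
        raise ValueError("can only convert an array of size 1 to a Python scalar")
    return series.fetch(1)[0]


def item_one_job(series):
    # Fixed version: collect the first two rows once, then use the local copy.
    top_two = series.fetch(2)
    if len(top_two) != 1:
        raise ValueError("can only convert an array of size 1 to a Python scalar")
    return top_two[0]


naive = CountingSeries([7])
assert item_two_jobs(naive) == 7
print(naive.jobs)  # 2

fixed = CountingSeries([7])
assert item_one_job(fixed) == 7
print(fixed.jobs)  # 1
```

Both versions return the same scalar; the difference is only in how many times the backend is asked for data.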
Ah, I'll check and fix it. Thanks!
@HyukjinKwon
Yeah, I should have considered it much more carefully.
Thanks for reminding me once again, @HyukjinKwon ! I must keep that in mind.
As noted in the comment databricks/koalas#1502 (comment), fixed `Series.item` to run a Spark job once instead of twice.
This PR proposes `Series.item`.