Implements tail() for DataFrame & Series #1055

Closed
wants to merge 2 commits into from

Contributor

@itholic itholic commented Nov 19, 2019

Implements pandas.DataFrame.tail() (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.tail.html#pandas.DataFrame.tail) and pandas.Series.tail() (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.tail.html#pandas.Series.tail).

For DataFrame

>>> df = ks.DataFrame({'animal':['alligator', 'bee', 'falcon', 'lion',
...                    'monkey', 'parrot', 'shark', 'whale', 'zebra']})
>>> df
      animal
0  alligator
1        bee
2     falcon
3       lion
4     monkey
5     parrot
6      shark
7      whale
8      zebra

>>> df.tail(3)
  animal
8  zebra
7  whale
6  shark

For Series

>>> df = ks.DataFrame({'animal':['alligator', 'bee', 'falcon', 'lion']})
>>> df.animal.tail(2)
3      lion
2    falcon
Name: animal, dtype: object

codecov-io commented Nov 19, 2019

Codecov Report

Merging #1055 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1055      +/-   ##
==========================================
+ Coverage   95.13%   95.13%   +<.01%     
==========================================
  Files          34       34              
  Lines        6765     6767       +2     
==========================================
+ Hits         6436     6438       +2     
  Misses        329      329
Impacted Files                        Coverage Δ
databricks/koalas/missing/frame.py    100% <ø> (ø) ⬆️
databricks/koalas/missing/series.py   100% <ø> (ø) ⬆️
databricks/koalas/series.py           96.42% <100%> (ø) ⬆️
databricks/koalas/frame.py            96.56% <100%> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 221b5a2...07d67b9.

@HyukjinKwon
Member

Looks okay.

@softagram-bot

Softagram Impact Report for pull/1055 (head commit: 07d67b9)

💡 Insights

  • Co-change Alert: You modified frame.py. Often test_dataframe.py (koalas/tests) is modified at the same time.

This function returns the last `n` rows for the object based
on position. It is useful for quickly verifying data,
for example, after sorting or appending rows.
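
For reference, the positional semantics described in this docstring can be sketched with a plain Python list standing in for the rows. This is only an illustration of pandas-style `tail` behavior, not the Koalas implementation from this PR; the helper name `tail` and the `animals` list are made up for the example. Note that pandas keeps the selected rows in their original order, and the sketch does the same.

```python
def tail(rows, n=5):
    """Return the last `n` items of `rows` by position (pandas-style semantics)."""
    if n == 0:
        # Special case: rows[-0:] would return everything, so return an empty slice.
        return rows[:0]
    # n > 0: the last n items; n < 0: everything except the first |n| items.
    return rows[-n:]


# Hypothetical data mirroring the example in the PR description.
animals = ['alligator', 'bee', 'falcon', 'lion', 'monkey',
           'parrot', 'shark', 'whale', 'zebra']

print(tail(animals, 3))   # ['shark', 'whale', 'zebra']
print(tail(animals, -7))  # ['whale', 'zebra']
print(tail(animals, 0))   # []
```

On a distributed DataFrame there is no cheap equivalent of a list slice, which is presumably why the discussion below turns to adding `tail` to Spark itself.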

Member

FYI, I plan to propose a new `tail` API in Spark itself. If we manage to add it, we could implement this efficiently.

Member

Can we hold on for a while? We could leave a note about that if we add it in Spark.

Contributor Author

@HyukjinKwon ah, that sounds great!! 👍
Okay, then just let me know so I can fix this once tail is added to Spark.

Member

PR submitted at apache/spark#26809. Let's see how it goes.

Contributor Author

@HyukjinKwon awesome! thanks for reminding me 👍

@HyukjinKwon
Member

Apache Spark 3.0 will now have a DataFrame.tail API. Let's revisit this when Spark 3.0 is out.

@itholic itholic deleted the s_tail branch February 6, 2020 15:06

4 participants