Fix window functions to resolve. #2090

Merged: 3 commits merged on Mar 8, 2021

ueshin (Collaborator) commented Mar 5, 2021

After applying a window function, we should always resolve the InternalFrame; otherwise the window is applied later (for example, after a subsequent filter such as `.loc`), which gives unexpected results:

>>> kdf = ks.DataFrame({"Col1": [10, 20, 15, 30, 45], "Col2": [13, 23, 18, 33, 48], "Col3": [17, 27, 22, 37, 52]})
>>> kdf
   Col1  Col2  Col3
0    10    13    17
1    20    23    27
2    15    18    22
3    30    33    37
4    45    48    52
>>> kdf['Col1'].shift().loc[kdf['Col1'] == 20]
1   NaN
Name: Col1, dtype: float64

The result should match pandas:

>>> pdf = kdf.to_pandas()
>>> pdf['Col1'].shift().loc[pdf['Col1'] == 20]
1    10.0
Name: Col1, dtype: float64
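
The difference between the two outputs comes down to evaluation order. The following is a minimal pure-Python sketch of that ordering problem, not Koalas or Spark internals; the helper names `shift` and `keep` are hypothetical and exist only for this illustration.

```python
# Toy model of the evaluation-order problem; this is NOT how Koalas is implemented.
rows = [(0, 10), (1, 20), (2, 15), (3, 30), (4, 45)]  # (index, Col1)


def shift(pairs):
    """Shift values down by one position, keeping the index labels in place."""
    indexes = [i for i, _ in pairs]
    values = [None] + [v for _, v in pairs[:-1]]
    return list(zip(indexes, values))


def keep(pairs, wanted):
    """Keep only the rows whose index label is in `wanted`."""
    return [(i, v) for i, v in pairs if i in wanted]


# Index labels of the rows where Col1 == 20, computed on the original data.
mask = {i for i, v in rows if v == 20}

# Window resolved first, then filtered: matches the pandas output.
eager = keep(shift(rows), mask)

# Filter first, window applied afterwards: matches the buggy output.
deferred = shift(keep(rows, mask))

print(eager)     # [(1, 10)]
print(deferred)  # [(1, None)]
```

When the shift runs over the already-filtered rows, row 1 has no predecessor left, so it gets a missing value instead of 10.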

Resolves #2078.

ueshin requested a review from HyukjinKwon on March 5, 2021 23:49
codecov-io commented Mar 6, 2021

Codecov Report

Merging #2090 (20120e3) into master (becd789) will decrease coverage by 0.01%.
The diff coverage is 95.45%.

@@            Coverage Diff             @@
##           master    #2090      +/-   ##
==========================================
- Coverage   94.55%   94.53%   -0.02%     
==========================================
  Files          57       57              
  Lines       13227    13236       +9     
==========================================
+ Hits        12507    12513       +6     
- Misses        720      723       +3     
Impacted Files                        Coverage Δ
databricks/koalas/groupby.py          91.63% <ø> (ø)
databricks/koalas/spark/accessors.py  94.64% <90.00%> (-0.33%) ⬇️
databricks/koalas/base.py             97.53% <100.00%> (ø)
databricks/koalas/frame.py            96.61% <100.00%> (ø)
databricks/koalas/series.py           96.78% <100.00%> (ø)
databricks/conftest.py                93.75% <0.00%> (-3.13%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update becd789...20120e3.

HyukjinKwon merged commit 44bbdb6 into databricks:master on Mar 8, 2021
ueshin deleted the resolve_window_functions branch on March 8, 2021 17:53
Successfully merging this pull request may close these issues.

Out of Synchronization operations with shift