Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug when iloc slice terminates at before-the-zero position #7277

Merged
merged 9 commits into from
Feb 5, 2021

Conversation

isVoid
Copy link
Contributor

@isVoid isVoid commented Feb 1, 2021

Closes #7246

This PR fixes a bug in Dataframe.iloc. When the slice provided to iloc, is decrementing and also terminates at before-the-zero position, such as slice(2, -1, -1) or slice(4, None, -1), the terminal position still gets wrapped around.

Frame._slice is moved to DataFrame._slice to resolve typing issue.

@isVoid isVoid added bug Something isn't working 3 - Ready for Review Ready for review by team Python Affects Python cuDF API. non-breaking Non-breaking change labels Feb 1, 2021
@isVoid isVoid self-assigned this Feb 1, 2021
@isVoid isVoid requested a review from a team as a code owner February 1, 2021 22:55
kkraus14
kkraus14 previously approved these changes Feb 2, 2021
@kkraus14
Copy link
Collaborator

kkraus14 commented Feb 2, 2021

@isVoid can you retarget this to 0.18? This is a nice bugfix with minimal changes that would be good to get in.

@isVoid isVoid changed the base branch from branch-0.19 to branch-0.18 February 2, 2021 02:38
@isVoid
Copy link
Contributor Author

isVoid commented Feb 2, 2021

[Updated] Moving ._slice from Frame.py to DataFrame.py to resolve for a typing issue.

@@ -831,6 +833,66 @@ def __sizeof__(self):
index = self._index.__sizeof__()
return columns + index

def _slice(self: T, arg: slice) -> T:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this not get used underneath Series.iloc[slice(...)] as well?

Copy link
Contributor Author

@isVoid isVoid Feb 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Series.iloc seems to depend on ColumnBase.__getitem__, which later calls ColumnBase.slice.

data = self._sr._column[arg]

I also noticed the following:

>>> s = cudf.Series([1,2,3,4,5])
>>> s.iloc[4:-1:-1]
Series([], dtype: int64)

which is likely the same issue, investigating.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the same issue in ColumnBase.slice.

if stop < 0:
stop = stop + len(self)

Submitted a quick fix. Although I think the best fix here is to consolidate Series.iloc and DataFrame.iloc to share the common code path under Frame._slice.

@kkraus14 kkraus14 added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team 5 - Ready to Merge Testing and reviews complete, ready to merge labels Feb 2, 2021
@kkraus14 kkraus14 dismissed their stale review February 2, 2021 21:58

Additional changes based on Series.slice

@isVoid
Copy link
Contributor Author

isVoid commented Feb 3, 2021

rerun tests

@kkraus14 kkraus14 added the 5 - Ready to Merge Testing and reviews complete, ready to merge label Feb 4, 2021
@kkraus14
Copy link
Collaborator

kkraus14 commented Feb 4, 2021

@gpucibot merge

@codecov
Copy link

codecov bot commented Feb 4, 2021

Codecov Report

Merging #7277 (05a8979) into branch-0.18 (8860baf) will increase coverage by 0.10%.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.18    #7277      +/-   ##
===============================================
+ Coverage        82.09%   82.19%   +0.10%     
===============================================
  Files               97      100       +3     
  Lines            16474    16954     +480     
===============================================
+ Hits             13524    13936     +412     
- Misses            2950     3018      +68     
Impacted Files Coverage Δ
python/cudf/cudf/__init__.py 100.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/avro.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/csv.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/fuzzer.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/io.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/json.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/main.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/orc.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/parquet.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/utils.py 0.00% <ø> (ø)
... and 74 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e2f6952...05a8979. Read the comment docs.

@isVoid
Copy link
Contributor Author

isVoid commented Feb 4, 2021

rerun tests

@galipremsagar
Copy link
Contributor

@isVoid There seem to be some conflicts, could you resolve them?

@rapids-bot rapids-bot bot merged commit 0410a36 into rapidsai:branch-0.18 Feb 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge bug Something isn't working non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] iloc not working properly
3 participants