BUG: Fixed assign failure when with Copy-on-Write #60941

Draft
wants to merge 7 commits into
base: main

Conversation

chilin0525
Contributor

@chilin0525 chilin0525 commented Feb 16, 2025

@chilin0525
Contributor Author

Since there has been no feedback for two weeks, just a friendly ping @mroeschke @rhshadrach @WillAyd, thanks 🙏

Member

@rhshadrach rhshadrach left a comment

Thanks for the PR!

Comment on lines 1470 to 1474
# Check that view is modified correctly
expected_view = DataFrame(
    {"B": [2, 2, 2, 2], "C": [3, 2, 1, 2]}, index=df.index
)
tm.assert_frame_equal(df, expected_view)
Member

Nit: Once the assignment is done, this is no longer a view. Can you change it to just "df" instead of "view"?

Contributor Author

Sure, solved in fcffbf2

@@ -37,6 +37,7 @@ Other enhancements
updated to work correctly with NumPy >= 2 (:issue:`57739`)
- :meth:`Series.str.decode` result now has ``StringDtype`` when ``future.infer_string`` is True (:issue:`60709`)
- :meth:`~Series.to_hdf` and :meth:`~DataFrame.to_hdf` now round-trip with ``StringDtype`` (:issue:`60663`)
- The :meth:`DataFrame.iloc` now works correctly with ``copy_on_write`` option (:issue:`60309`)
Member

Can you give a little more detail here? Perhaps adding:

after subsetting the columns of a DataFrame and using a slice

Contributor Author

Sure, solved in 725b41b

Comment on lines +576 to +580
if isinstance(indexer[1], slice) and indexer[1] == slice(None):
    col_indexer = slice(None)
else:
    col_indexer = np.arange(len(blk_loc))
self.blocks[0].setitem((indexer[0], col_indexer), value)
Member

It's not clear to me why it's correct to pass through slice(None) but not other cases, e.g. slice(0, 3, 2) or [0, 1].
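
One way to see the concern is to expand each column indexer into explicit positions. The sketch below is plain Python, not pandas internals; `expand_col_indexer` is a hypothetical helper introduced only for illustration. It suggests that, of the three indexers mentioned, `slice(None)` is the only one that selects every column, so replacing the other two with "all columns" would target different cells.

```python
def expand_col_indexer(indexer, ncols):
    """Hypothetical helper: expand a column indexer into explicit positions."""
    if isinstance(indexer, slice):
        # slice.indices resolves start/stop/step against the given length
        return list(range(*indexer.indices(ncols)))
    if isinstance(indexer, int):
        return [indexer]
    # Assumption: any other indexer is an iterable of integer positions
    return [int(i) for i in indexer]


ncols = 3
full = expand_col_indexer(slice(None), ncols)
stepped = expand_col_indexer(slice(0, 3, 2), ncols)
listed = expand_col_indexer([0, 1], ncols)

print(full)     # [0, 1, 2]
print(stepped)  # [0, 2]
print(listed)   # [0, 1]
```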

Contributor Author

I think you're right. My current changes fail on the following test case. I'll mark the PR as a draft until I've handled the bug properly, and I'll add more test cases as well. Thanks!

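import pandas as pd

dftest3 = pd.DataFrame(
    {"A": [1, 4, 1, 5], "B": [2, 5, 2, 6], "C": [3, 6, 1, 7], "D": [8, 9, 10, 11]}
)
# Subset the columns, then assign through a list of rows and a stepped column slice
df3 = dftest3[["B", "C", "D"]]
df3.iloc[[1, 3], 0:3:2] = [[2, 2], [2, 2]]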
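
For reference, the values this assignment should produce can be worked out without pandas. The sketch below mirrors the three selected columns as plain lists and applies the same positional update (rows 1 and 3, columns 0 and 2 of the subset). It only illustrates the expected outcome; the names `data`, `expected`, `row_positions`, and `col_positions` are made up here and do not come from the PR.

```python
# Columns B, C, D of dftest3, i.e. the subset selected into df3
columns = ["B", "C", "D"]
data = {
    "B": [2, 5, 2, 6],
    "C": [3, 6, 1, 7],
    "D": [8, 9, 10, 11],
}

row_positions = [1, 3]
# 0:3:2 over three columns selects positions 0 and 2, i.e. "B" and "D"
col_positions = list(range(*slice(0, 3, 2).indices(len(columns))))
value = [[2, 2], [2, 2]]

# Work on a copy so the original column data stays untouched
expected = {name: list(col) for name, col in data.items()}
for i, row in enumerate(row_positions):
    for j, col in enumerate(col_positions):
        expected[columns[col]][row] = value[i][j]

print(expected)
# {'B': [2, 2, 2, 2], 'C': [3, 6, 1, 7], 'D': [8, 2, 10, 2]}
```

Column "C" is skipped by the step of 2, which is the case a blanket "all columns" indexer would get wrong. Under Copy-on-Write, the parent `dftest3` would also be expected to stay unchanged.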

@chilin0525 chilin0525 marked this pull request as draft March 2, 2025 08:46
Development

Successfully merging this pull request may close these issues.

BUG: assignment fails with copy_on_write = True
2 participants