Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CoW: Return read-only array in Index.values #53704

Merged
merged 9 commits into from
Jun 20, 2023

Conversation

phofl
Copy link
Member

@phofl phofl commented Jun 16, 2023

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Comment on lines 63 to 65
if using_copy_on_write:
with pytest.raises(ValueError, match="assignment"):
idx.values[0] = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This defeats a bit the purpose of the original test (and I don't think we need to test here that assigning into a read-only numpy array gives an error).
So maybe just remove it? (we already check np.shares_memory) Or manually set the writeable flag to True and then assign the value.

But actually, now that we keep track of references to Index data as well, the original setitem doesn't really need to do a copy, I think? (for another issue/PR)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah let's rip it out

Comment on lines 350 to 352
if not using_copy_on_write:
arr[0] = val
assert new_index[0] != val
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here, we want to test that the copy=True keyword is honored (I assume, based on the comment). So maybe add a copy() to the line creating arr above: arr = index.values.copy(). Then arr is writeable, and the test can be done as normal (ensuring that arr was actually copied when creating the Index.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense

check_names = engine != "fastparquet"
if using_copy_on_write and engine == "fastparquet":
request.node.add_marker(
pytest.mark.xfail(reason="fastparquet write into index")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does fastparquet exactly? (it tries to write into the array it gets from the index? Do you know why? (that sounds as a bug in fastparquet, as it can change the dataframe you are writing?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can simply set the flag there, they seem to use the index to do a conversion. It's already on my todo list, but don't want to block this pr because of that

@jorisvandenbossche jorisvandenbossche added Index Related to the Index class or subclasses Copy / view semantics labels Jun 19, 2023
# Conflicts:
#	pandas/tests/copy_view/index/test_index.py
@phofl
Copy link
Member Author

phofl commented Jun 20, 2023

thx for noticing, rebased

@jorisvandenbossche jorisvandenbossche merged commit ed4ea1a into pandas-dev:main Jun 20, 2023
27 of 30 checks passed
@phofl phofl deleted the index_values branch June 20, 2023 15:11
canthonyscott pushed a commit to canthonyscott/pandas-anthony that referenced this pull request Jun 23, 2023
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Daquisu pushed a commit to Daquisu/pandas that referenced this pull request Jul 8, 2023
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Copy / view semantics Index Related to the Index class or subclasses
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants