BUG: dataframe setitem error on size incompatible assignment #41193

Mxchaeltrxn · 2021-04-28T09:55:56Z

closes BUG: Unexpected behavior when assigning multi-dimensional array to DataFrame column #40827
tests added / passed
Ensure all linting tests pass, see here for how to run them

I've labeled the error something silly but that can be fixed later.

This pull request replaces the older PR #41008.

I've put all my tests inside one file for easier review.

Issues

I'm unsure how to stop the test test_pivot_table_doctest_case from failing. Could you help with this?

This test (test_invalid_colormap) also fails but when I rerun it individually, it passes ...

I also had an issue with flake8 and black not working together ... in my most recent commit so I just committed it with --no-verify.

Mxchaeltrxn · 2021-04-28T23:01:33Z

I may have worked something out so don't review this yet.

Mxchaeltrxn · 2021-04-29T03:33:29Z

I've gotten test_pivot_table_doctest_case to pass.

jreback

Don't add a new test file; put the tests somewhere in tests/frame/indexing instead.

jreback · 2021-04-30T21:03:59Z

pandas/core/internals/managers.py

@@ -1233,6 +1233,12 @@ def value_getitem(placement):
            blk = self.blocks[blkno]
            blk_locs = blklocs[val_locs.indexer]
            if blk.should_store(value):
+                if (
+                    value.shape[0] != 1


Probably better to have .should_store return False in this case and have it hit the other path? Or is that incorrect?

If I understand you correctly, I think it should stay as it is because the below test case fails:

def test_setitem_size_incompatible_ndarray2(arr):
    data = DataFrame(
        [[1, "A", 1.0], [2, "B", 2.0], [3, "C", 3.0], [4, "D", 4.0]],
        columns=["A", "A", "B"],
    )
    msg = "Errored123"
    with pytest.raises(ValueError, match=msg):
        # data is of shape (4,2) but we are assigning it to shape (4,4)
        data["A"] = np.random.randn(4, 4)

It fails because it goes down the other path (where blk.should_store is false) and no error is thrown despite assigning the column(s) to data of a different shape. I haven't handled this yet because I wanted to make sure the logic was ok so far (sorry I should have mentioned this earlier).

I think .should_store is probably not the best name for what the method does (based on my understanding of it).
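To make the shape mismatch in the test case above concrete, here is a minimal sketch of the kind of width check being discussed. It is not the code from this PR: `check_setitem_width` is a hypothetical helper, and the rule it applies (a 2-D value must have either one column or as many columns as the key selects) is an assumption about the intended behaviour, not something confirmed in this thread.

```python
import numpy as np
import pandas as pd


def check_setitem_width(df, key, value):
    """Hypothetical helper (not part of pandas): raise ValueError if a 2-D
    value cannot line up with the columns that ``key`` selects."""
    value = np.asarray(value)
    # With duplicate labels, one key can select several columns.
    # A key that is not present yet counts as one (new) column.
    n_selected = max(int((df.columns == key).sum()), 1)
    # Anything that is not 2-D is counted as a single column in this sketch.
    n_given = value.shape[1] if value.ndim == 2 else 1
    # Assumption: a single column may be broadcast to every selected column.
    if n_given not in (1, n_selected):
        raise ValueError(
            "cannot set {} column(s) from a value with {} column(s)".format(
                n_selected, n_given
            )
        )


data = pd.DataFrame(
    [[1, "A", 1.0], [2, "B", 2.0], [3, "C", 3.0], [4, "D", 4.0]],
    columns=["A", "A", "B"],
)

# "A" selects two columns, so a (4, 2) value lines up and nothing is raised.
check_setitem_width(data, "A", np.random.randn(4, 2))

# A (4, 4) value does not line up with the two "A" columns.
try:
    check_setitem_width(data, "A", np.random.randn(4, 4))
except ValueError as err:
    print(err)
```

The check only looks at the number of columns; a real fix would also have to decide where in the setitem path it belongs, which is the open question in this review.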

Mxchaeltrxn · 2021-05-01T10:33:53Z

@jreback Thanks for the review. I will move the tests and rename them later. How do I make sure that I'm not duplicating tests ...?

jreback · 2021-05-12T01:34:12Z

IIUC, you are essentially narrowing a case, so either try to find the tests that exercise that part of the code (which might be hard), or simply add them in the logical place; by definition you won't be duplicating.

Mxchaeltrxn · 2021-05-26T22:01:54Z

@jreback Sorry I've been a bit busy. I'll return to this on the weekend when I have some time.

Mxchaeltrxn · 2021-05-30T11:50:30Z

@jreback Hey so I've been trying to work on this but I've been getting an error. Do you have any ideas?
Steps to reproduce:

git fetch upstream
git merge upstream/master
conda env update -n pandas-dev --file environment.yml --prune
python setup.py build_ext -j 4

I get this strange error:

  File "setup.py", line 249
    f"{extension}-source file '{sourcefile}' not found.\n"
                                                         ^
SyntaxError: invalid syntax

I'm not really sure why this is happening when it wasn't before. I'm using the new M1 chip on the mac if that matters.
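A `SyntaxError` pointing at an f-string usually means the interpreter running `setup.py` predates Python 3.6, where f-strings were introduced. Whether that is the cause here is not confirmed in this thread. The sketch below is one way to check which interpreter `python` resolves to; `supports_fstrings` is a hypothetical helper name, and the script avoids f-strings so that an older interpreter can still run it.

```python
import sys

# f-strings were added in Python 3.6.
MIN_FSTRING_VERSION = (3, 6)


def supports_fstrings(version_info=None):
    """Return True if the given (major, minor, ...) version parses f-strings."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_FSTRING_VERSION


# str.format keeps this script runnable on interpreters older than 3.6.
print("executable: {}".format(sys.executable))
print("version: {}".format(sys.version.split()[0]))
print("f-strings supported: {}".format(supports_fstrings()))
```

Running it with the same `python` command used for the build shows whether the build is picking up an interpreter other than the one in the conda environment.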

Also, if I get this issue in the future, where should I be putting it? I don't think I should be raising a GitHub issue for development build errors.

github-actions · 2021-06-30T00:02:24Z

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

Mxchaeltrxn · 2021-07-02T13:03:43Z

I will have a look and see if I can get this repository running in the next few days.

Mxchaeltrxn · 2021-08-08T01:02:35Z

Closing as I do not want to work on this anymore.

Mxchaeltrxn mentioned this pull request Apr 28, 2021

Fix setting dataframe column to 2d array with more than one col #41008

Closed

3 tasks

Mxchaeltrxn changed the title ~~FIX: dataframe setitem error on size incompatible assignment~~ BUG: dataframe setitem error on size incompatible assignment Apr 28, 2021

BUG: dataframe setitem error on size incompatible assignment

afb2200

Mxchaeltrxn force-pushed the fix/DfSingleColIndex branch from 195549e to afb2200 Compare April 28, 2021 10:02

Mxchaeltrxn added 2 commits April 29, 2021 11:15

BUG: Change conditional in frame setitem

4884272

WIP: Add failing tests (behaviour not yet implemented)

8aa72aa

jreback requested changes Apr 30, 2021

jreback added the Indexing label (Related to indexing on series/frames, not to indexes themselves) Apr 30, 2021

simonjayhawkins added this to the 1.3 milestone May 25, 2021

jreback removed this from the 1.3 milestone May 26, 2021

github-actions bot added the Stale label Jun 30, 2021

Mxchaeltrxn closed this Aug 8, 2021
