Assignment to multiple columns only works if they existed before #13658

Closed
jzwinck opened this issue Jul 14, 2016 · 8 comments · Fixed by #29334
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves

jzwinck commented Jul 14, 2016

Assignment to multiple columns of a :class:`DataFrame` when some of the columns do not exist would previously assign the values to the last column. Now, new columns will be constructed with the right values.

.. ipython:: python

   df = pd.DataFrame({'a': [0, 1, 2], 'b': [3, 4, 5]})
   df

*Previous behavior*:

.. code-block:: ipython

   In [3]: df[['a', 'c']] = 1
   In [4]: df
   Out[4]:
      a  b
   0  1  1
   1  1  1
   2  1  1

*New behavior*:

.. ipython:: python

   df[['a', 'c']] = 1
   df

import pandas as pd
df = pd.DataFrame({'a': [1, 2]})
df['b'] = 3          # creates column 'b'
df[['a', 'b']] = 4   # assigns to columns 'a' and 'b'
df[['c', 'd']] = 5   # KeyError: "['c' 'd'] not in index" !
df[['b', 'c']] = 6   # KeyError: "['c'] not in index" !

I would have expected all the above cases to work, but for some reason the last two fail. New column creation only seems to work for a single column, whilst multiple-column assignment works only if all columns exist.
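One possible workaround on versions that show this behavior, sketched here as an illustration rather than as part of the original report: since single-column assignment does create a missing column, the columns can be assigned one at a time, or added with `DataFrame.assign`. The names `df2` and the loop variable `col` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2]})

# Single-column assignment creates the column when it is missing,
# so assigning the scalar column by column avoids the list-like lookup.
for col in ['b', 'c']:
    df[col] = 6

# Alternative: DataFrame.assign returns a new frame with the columns added.
df2 = pd.DataFrame({'a': [1, 2]}).assign(c=5, d=5)
```

Both forms add the columns in the order given; the loop modifies `df` in place, while `assign` leaves the original frame untouched.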

INSTALLED VERSIONS
python: 3.5.1.final.0
python-bits: 64
pandas: 0.18.1
numpy: 1.11.1

sinhrks added the Indexing label on Jul 15, 2016

howsiwei commented May 16, 2019

In current master (0.25.0.dev0+564.g9c5165e26) multiple-column assignment gives the following error when none of the columns exists:

import pandas as pd
df = pd.DataFrame({'a': [1, 2]})
df[['b', 'c']] = 1
KeyError: "None of [Index(['b', 'c'], dtype='object')] are in the [columns]"

However, a warning is given instead when only some of the columns do not exist:

df[['a', 'c']] = 1
__main__:1: FutureWarning: 
Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.

See the documentation here:
https://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate-loc-reindex-listlike

This seems quite inconsistent. Should it be fixed? @jreback
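The warning text points to `.reindex()` as an alternative. A minimal sketch of that route, assuming the goal is to end up with every requested column present before assigning (the variable `wanted` is illustrative, not from the thread):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2]})
wanted = ['a', 'c']

# Add any missing columns first (they start out as NaN); existing
# columns keep their position because sort=False preserves order.
df = df.reindex(columns=df.columns.union(wanted, sort=False))

# Every label in `wanted` now exists, so the list-like assignment
# no longer depends on how missing labels are handled.
df[wanted] = 1
```

This behaves the same whether some or none of the requested columns existed beforehand, which sidesteps the inconsistency described above.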

@howsiwei

@jreback any updates?


jreback commented May 23, 2019

this is an open issue @howsiwei

@howsiwei

@jreback I'm curious what the intended behavior is and whether a PR would be welcome.


jreback commented May 23, 2019

@howsiwei likely this is a bug

@howsiwei

@jreback do you mean that multiple-column assignment should only give a warning when none of the columns exist?


jreback commented May 25, 2019

Multiple-column assignment with a scalar should work without a warning or error.
All of the OP's cases should work.
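The expectation stated here can be written as a small regression-style check. This is a sketch that assumes a pandas release that includes the fix from #29334; on older releases the third and fourth assignments raise `KeyError`, as reported at the top of the thread.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2]})
df['b'] = 3          # single new column
df[['a', 'b']] = 4   # both columns already exist
df[['c', 'd']] = 5   # neither column exists yet
df[['b', 'c']] = 6   # overwrites 'b' and the just-created 'c'

# Expected final state under the intended behavior.
expected = pd.DataFrame({'a': [4, 4], 'b': [6, 6], 'c': [6, 6], 'd': [5, 5]})
matches = df.equals(expected)
```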

@yaelcurl

Hi, I have the same issue with multiple-row assignment:

Pandas version 2.4.1

df=pd.DataFrame({'a':[1,2,3,4],'b':[5,6,7,8]})
df.loc[[4,5]]=1

Traceback (most recent call last):
  Cell In[97], line 1
    df.loc[[4,5]]=1
  File ~..\lib\site-packages\pandas\core\indexing.py:881 in __setitem__
    indexer = self._get_setitem_indexer(key)
  File ~..\lib\site-packages\pandas\core\indexing.py:764 in _get_setitem_indexer
    return self._convert_to_indexer(key, axis=0)
  File ~..\lib\site-packages\pandas\core\indexing.py:1484 in _convert_to_indexer
    return self._get_listlike_indexer(key, axis)[1]
  File ~..\lib\site-packages\pandas\core\indexing.py:1520 in _get_listlike_indexer
    keyarr, indexer = ax._get_indexer_strict(key, axis_name)
  File ~..\lib\site-packages\pandas\core\indexes\base.py:6115 in _get_indexer_strict
    self._raise_if_missing(keyarr, indexer, axis_name)
  File ~..\lib\site-packages\pandas\core\indexes\base.py:6176 in _raise_if_missing
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index([4, 5], dtype='int32')] are in the [index]"

works for columns:
df[['c','d']]=0

Out[101]:
   a  b  c  d
0  0  0  0  0
1  0  0  0  0
2  3  7  0  0
3  4  8  0  0

Is there an easy way to update values for certain indices when some of them exist and some are new?
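One way this could be approached, offered as a sketch and not as a confirmed answer from the maintainers: extend the index with the missing labels via `reindex`, after which every label exists and the list-like `.loc` assignment goes through. The `labels` list is illustrative and mixes existing and new row labels.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8]})
labels = [2, 3, 4, 5]  # 2 and 3 exist, 4 and 5 are new

# Extend the index with the missing labels; the new rows start as NaN.
df = df.reindex(index=df.index.union(labels))

# All labels exist now, so list-like .loc assignment is allowed.
df.loc[labels] = 1
```

Note that the intermediate NaN rows turn integer columns into floats; if integer dtypes matter, the columns can be cast back afterwards with `astype`.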
