Setting empty column on empty DataFrame #5632

Closed
FragLegs opened this issue Dec 2, 2013 · 6 comments · Fixed by #5633

Comments

@FragLegs
Contributor

commented Dec 2, 2013

This issue appears in 0.13.0-rc1; it did not appear in 0.12.0-854-gde63e00.

Let us say we have an empty DataFrame. Then the following calls (which worked in pandas 0.12) no longer work:

df = pd.DataFrame()
df['foo'] = []

df = pd.DataFrame()
df['foo'] = df.index

df = pd.DataFrame()
df['foo'] = range(len(df))

They all throw an exception:

  File "/usr/local/lib/python3.2/dist-packages/pandas-0.13.0rc1-py3.2-linux-x86_64.egg/pandas/core/frame.py", line 1855, in __setitem__
    self._set_item(key, value)
  File "/usr/local/lib/python3.2/dist-packages/pandas-0.13.0rc1-py3.2-linux-x86_64.egg/pandas/core/frame.py", line 1915, in _set_item
    self._ensure_valid_index(value)
  File "/usr/local/lib/python3.2/dist-packages/pandas-0.13.0rc1-py3.2-linux-x86_64.egg/pandas/core/frame.py", line 1899, in _ensure_valid_index
    raise ValueError('Cannot set a frame with no defined index '
ValueError: Cannot set a frame with no defined index and a non-series

The following will work:

df = pd.DataFrame()
df['foo'] = pd.Series([])

df = pd.DataFrame()
df['foo'] = pd.Series(df.index)

df = pd.DataFrame()
df['foo'] = pd.Series(range(len(df)))

The issue appears to be on lines 1897-1899 of pandas.core.frame:

if not len(self.index):
    if not isinstance(value, Series):
        raise ValueError('Cannot set a frame with no defined index '

Perhaps it could be changed to:

if not len(self.index) and len(value) > 0:
    if not isinstance(value, Series):
        raise ValueError('Cannot set a frame with no defined index '
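
For illustration, the proposed condition can be exercised outside pandas. The sketch below is a hypothetical stand-alone function (`check_assignable` is a made-up name, not a pandas API) that mirrors the suggested guard. It only shows which inputs the changed check would accept, and it is not necessarily how the fix was eventually implemented.

```python
import pandas as pd


def check_assignable(frame, value):
    """Hypothetical helper mirroring the proposed guard; not part of pandas."""
    # Only complain when the frame has no index AND the value is non-empty.
    if not len(frame.index) and len(value) > 0:
        if not isinstance(value, pd.Series):
            raise ValueError(
                "Cannot set a frame with no defined index and a non-series"
            )


empty = pd.DataFrame()
check_assignable(empty, [])                  # accepted: empty list
check_assignable(empty, empty.index)         # accepted: empty Index
check_assignable(empty, range(len(empty)))   # accepted: empty range
check_assignable(empty, pd.Series([1, 2]))   # accepted: a Series carries its own index

try:
    check_assignable(empty, [1, 2])          # rejected: non-empty and not a Series
    rejected = False
except ValueError:
    rejected = True
```

With this condition, the three failing calls from the report pass the check, while a non-empty plain list on an index-less frame is still rejected.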

@jreback

Contributor

commented Dec 2, 2013

What is an actual use case for doing this?

@FragLegs

Contributor Author

commented Dec 2, 2013

I stumbled upon it because I do a merge and then create a new column which is a copy of the index (for further manipulation). Sometimes that merge leaves me with an empty DataFrame.

More generally, I might want to create an auto-numbered column with df['foo'] = range(len(df)). With the recent code change, I either have to change that to df['foo'] = pd.Series(range(len(df))) or do a check to ensure that df is not empty.
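
A minimal sketch of that scenario, using made-up frames (`left`, `right`) and made-up column names (`idx_copy`, `row_number`). It applies the `pd.Series(...)` workaround so the assignments go through even when the merge result is empty:

```python
import pandas as pd

# Made-up frames with no keys in common, so the inner merge is empty.
left = pd.DataFrame({"key": [1, 2], "a": ["x", "y"]})
right = pd.DataFrame({"key": [3, 4], "b": ["p", "q"]})
merged = left.merge(right, on="key")

# Wrapping the values in a Series sidesteps the non-Series check on an empty frame.
merged["idx_copy"] = pd.Series(merged.index, index=merged.index)
merged["row_number"] = pd.Series(range(len(merged)), index=merged.index)
```

Passing `index=merged.index` is an addition here, not part of the original workaround. It keeps the values lined up positionally if the frame is non-empty and its index is not 0..n-1.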

@jreback

Contributor

commented Dec 2, 2013

You don't need to auto-number.
The index, if you don't set it, IS an auto-numbered index (and if you have a different index, you can just call df.reset_index()).
There's no need to actually store a numbering.
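
As a rough illustration of the `reset_index()` suggestion, with a made-up frame and column name: the old index is moved into a column and a fresh 0..n-1 index is installed, and the same call can be made on an empty frame.

```python
import pandas as pd

# Made-up frame with a non-default index.
df = pd.DataFrame({"value": [10, 20, 30]}, index=["a", "b", "c"])

# reset_index() moves the old (unnamed) index into a column called "index"
# and replaces it with a default 0..n-1 index.
numbered = df.reset_index()

# The same call on an empty slice of the frame.
empty_numbered = df.iloc[0:0].reset_index()
```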

@jreback

Contributor

commented Dec 2, 2013

Can you try with #5633? It should be OK for all of the examples above.

@jreback closed this in #5633 on Dec 3, 2013

@FragLegs

Contributor Author

commented Dec 3, 2013

Thank you! All test cases listed above work. Your solution is much better than the one I proposed.

@jreback

Contributor

commented Dec 3, 2013

Great, thanks for the report.
