If you have a DataFrame with a repeated or non-unique column, then some assignments fail.
df = pd.DataFrame(np.random.randn(10,2), columns=['that', 'that'])
df2
Out[10]:
   that  that
0     1     1
1     1     1
2     1     1
3     1     1
4     1     1
5     1     1
6     1     1
7     1     1
8     1     1
9     1     1

[10 rows x 2 columns]
This is float data and the following works:
df['that'] = 1.0
However, this fails with an error and breaks the dataframe (e.g. a subsequent repr will also fail.)
df2['that'] = 1
Traceback (most recent call last):
  File "/users/is/dbew/pyenvs/timeseries/lib/python2.7/site-packages/ipython-1.1.0_1_ahl1-py2.7.egg/IPython/core/interactiveshell.py", line 2830, in run_code
    exec code_obj in self.user_global_ns, self.user_ns
  File "<ipython-input-11-8701f5b0efe4>", line 1, in <module>
    df2['that'] = 1
  File "/users/is/dbew/pyenvs/timeseries/lib/python2.7/site-packages/pandas-0.13.0_292_g4dcecb0-py2.7-linux-x86_64.egg/pandas/core/frame.py", line 1879, in __setitem__
    self._set_item(key, value)
  File "/users/is/dbew/pyenvs/timeseries/lib/python2.7/site-packages/pandas-0.13.0_292_g4dcecb0-py2.7-linux-x86_64.egg/pandas/core/frame.py", line 1960, in _set_item
    NDFrame._set_item(self, key, value)
  File "/users/is/dbew/pyenvs/timeseries/lib/python2.7/site-packages/pandas-0.13.0_292_g4dcecb0-py2.7-linux-x86_64.egg/pandas/core/generic.py", line 1057, in _set_item
    self._data.set(key, value)
  File "/users/is/dbew/pyenvs/timeseries/lib/python2.7/site-packages/pandas-0.13.0_292_g4dcecb0-py2.7-linux-x86_64.egg/pandas/core/internals.py", line 2968, in set
    _set_item(item, arr[None, :])
  File "/users/is/dbew/pyenvs/timeseries/lib/python2.7/site-packages/pandas-0.13.0_292_g4dcecb0-py2.7-linux-x86_64.egg/pandas/core/internals.py", line 2927, in _set_item
    self._add_new_block(item, arr, loc=None)
  File "/users/is/dbew/pyenvs/timeseries/lib/python2.7/site-packages/pandas-0.13.0_292_g4dcecb0-py2.7-linux-x86_64.egg/pandas/core/internals.py", line 3108, in _add_new_block
    new_block = make_block(value, self.items[loc:loc + 1].copy(),
TypeError: unsupported operand type(s) for +: 'slice' and 'int'
I stepped through the code and it looked like most places handle repeated columns ok except the code that reallocates arrays when the dtype changes.
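The `TypeError` in the traceback is consistent with a label lookup that returns a `slice` for a repeated label where the calling code expects an `int`. Here is a minimal pure-Python sketch of that failure mode. It is not the pandas internals: `get_loc` and `one_item_window` are hypothetical stand-ins written for illustration, and the sketch assumes the duplicated labels are contiguous.

```python
def get_loc(labels, key):
    """Hypothetical lookup: an int for a unique label, a slice for a repeated one."""
    positions = [i for i, label in enumerate(labels) if label == key]
    if not positions:
        raise KeyError(key)
    if len(positions) == 1:
        return positions[0]
    # Assumption: the duplicates sit next to each other, so one slice covers them.
    return slice(positions[0], positions[-1] + 1)


def one_item_window(labels, key):
    """Mimics the `items[loc:loc + 1]` pattern, which only works when loc is an int."""
    loc = get_loc(labels, key)
    return labels[loc:loc + 1]


# Unique label: loc is an int, so loc + 1 is fine.
print(one_item_window(["this", "that"], "that"))

# Repeated label: loc is a slice, and slice + int raises TypeError.
try:
    one_item_window(["that", "that"], "that")
except TypeError as exc:
    print("TypeError:", exc)
```

Any code path that does arithmetic on the lookup result needs a separate branch for the non-unique case.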
I've tested this against pandas 0.13.0 and the latest master. Here's the output of installed versions when running on the master:
commit: None
python: 2.7.3.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.18-308.el5
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB
pandas: 0.13.0-292-g4dcecb0
Cython: 0.16
numpy: 1.7.1
scipy: 0.9.0
statsmodels: None
patsy: None
scikits.timeseries: None
dateutil: 1.5
pytz: None
bottleneck: 0.6.0
tables: 2.3.1-1
numexpr: 2.0.1
matplotlib: 1.1.1
openpyxl: None
xlrd: 0.8.0
xlwt: None
xlsxwriter: None
sqlalchemy: None
lxml: 2.3.6
bs4: None
html5lib: None
bq: None
apiclient: None
looks like an untested case; I'll take a look
was an untested / missing case...thanks
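A check along the following lines would exercise both the same-dtype and the dtype-changing assignment. It is only a sketch, not the test that was actually added: the helper name `assign_scalar_to_duplicated_label` is made up for illustration, and it assumes a pandas version in which the fix is present.

```python
import numpy as np
import pandas as pd


def assign_scalar_to_duplicated_label(value, n_rows=5):
    """Build a float frame with two columns named 'that', then assign `value` to that label."""
    df = pd.DataFrame(np.random.randn(n_rows, 2), columns=["that", "that"])
    df["that"] = value
    # The report notes that repr also failed after the bad assignment, so call it here.
    repr(df)
    return df


# float -> float: the case reported as working
same_dtype = assign_scalar_to_duplicated_label(1.0)

# float -> int: the case reported as failing
changed_dtype = assign_scalar_to_duplicated_label(1)

print(same_dtype.dtypes)
print(changed_dtype.dtypes)
```

On an affected version the second call should raise the `TypeError` from the report instead of returning.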
Wow, that was fast! Thanks very much.