BUG: DataFrame.insert with allow_duplicates=True fails when already duplicates present #14291
Comments
@mbochk That looks like a bug indeed. Thanks for reporting.
jorisvandenbossche changed the title from "allow_duplicates doesn't work while several duplicates present" to "BUG: DataFrame.insert with allow_duplicates=True fails when already duplicates present" on Sep 23, 2016
jorisvandenbossche added the Bug label on Sep 23, 2016
I looked into this and discovered that the problem is in frame.py, in the _sanitize_column method. Here's the relevant code:

    # broadcast across multiple columns if necessary
    if key in self.columns and value.ndim == 1:
        if (not self.columns.is_unique or
                isinstance(self.columns, MultiIndex)):
            existing_piece = self[key]
            if isinstance(existing_piece, DataFrame):
                value = np.tile(value, (len(existing_piece.columns), 1))

On the third time
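To illustrate the broadcasting step quoted above, here is a minimal NumPy-only sketch. It assumes two columns already share the label, so `existing_piece` would have two columns. The names `n_existing` and `tiled` are illustrative and do not come from pandas.

```python
import numpy as np

# The 1-D column that is about to be inserted.
value = np.array([1, 2, 3, 4])

# Assumption: two columns with the same label already exist,
# i.e. len(existing_piece.columns) == 2 in the quoted snippet.
n_existing = 2

# Same call as in the quoted snippet: repeat the 1-D value once per
# existing duplicate column, producing a 2-D array.
tiled = np.tile(value, (n_existing, 1))

print(value.shape)  # (4,)
print(tiled.shape)  # (2, 4)
```

The tiled array has one row per existing duplicate column. That shape suits an assignment that overwrites all of those columns at once, but it is presumably not what is wanted when `insert` adds a single new column.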
What happens here is needed when you are setting a certain column (eg
jreback added the Reshaping, Difficulty Intermediate, and Effort Low labels on Oct 5, 2016
jreback added this to the Next Major Release milestone on Oct 5, 2016
paul-mannino referenced this issue on Oct 10, 2016: (Closed) BUG: Fix issue with inserting duplicate columns in a dataframe (GH14291) #14384
paul-mannino added a commit to paul-mannino/pandas that referenced this issue on Oct 11, 2016: a00f0fe (paul-mannino)
paul-mannino referenced this issue on Oct 15, 2016: (Closed) BUG: Fix issue with inserting duplicate columns in a dataframe (#14291) #14431
jreback modified the milestone: 0.19.1, Next Major Release on Oct 19, 2016
paul-mannino added a commit to paul-mannino/pandas that referenced this issue on Oct 19, 2016: ad06cb4 (paul-mannino)
paul-mannino added a commit to paul-mannino/pandas that referenced this issue on Oct 22, 2016: 2698005 (paul-mannino)
jreback closed this in 2e77536 on Oct 24, 2016
jorisvandenbossche added a commit to jorisvandenbossche/pandas that referenced this issue on Nov 2, 2016: f77c108 (paul-mannino + jorisvandenbossche)
amolkahat added a commit to amolkahat/pandas that referenced this issue on Nov 26, 2016: a49baeb (paul-mannino + amolkahat)
mbochk commented Sep 23, 2016 (edited by jorisvandenbossche)
With DataFrame.insert, the allow_duplicates option works, but only once.
When I have 2 columns with the same name, adding a third throws an error.
Code Sample, a copy-pastable example if possible
Expected Output
zxc qwe qwe
0 1 1 1
1 2 2 2
2 3 3 3
3 4 4 4
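A minimal sketch of the scenario described above. This is not the reporter's original snippet: the column labels and values are borrowed from the expected output, and the assumption (based on the issue title) is that the failing step is inserting a label that is already duplicated. The `outcome` variable is illustrative only.

```python
import pandas as pd

values = [1, 2, 3, 4]
df = pd.DataFrame({"zxc": values})

# First 'qwe': the label is not yet present, so no duplicate is created.
df.insert(1, "qwe", values)

# Second 'qwe': creates the first duplicate, allowed by allow_duplicates=True.
df.insert(2, "qwe", values, allow_duplicates=True)

# Third 'qwe': the label is already duplicated at this point.
# Assumption: this is the step that fails on affected versions.
try:
    df.insert(3, "qwe", values, allow_duplicates=True)
    outcome = "third insert succeeded"
except Exception as exc:
    outcome = "third insert failed: {!r}".format(exc)

print(outcome)
print(list(df.columns))
```

On affected versions such as the 0.18.1 reported here, the last call is where the failure would be expected. The issue was closed under the 0.19.1 milestone, so later releases should take the success branch.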
Output of pd.show_versions()
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: ru_RU
pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 25.1.6
Cython: 0.24.1
numpy: 1.11.1
scipy: 0.18.0
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.2
lxml: 3.6.4
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: 0.7.6.None
psycopg2: None
jinja2: 2.8
boto: 2.40.0
pandas_datareader: None