Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
# Define a DataFrame
df = pd.DataFrame({
'A': [1, 2, 3],
'B': [4, 5, 6],
'C': [7, 8, 9]
})
# Create a Series for renaming columns, with a non-unique index
rename_series = pd.Series(['X', 'Y', 'Z', 'W'], index=['A', 'B', 'B', 'C'])
# Rename columns using the Series as the mapper
df.rename(columns=rename_series, inplace=True)
print(df)  # TypeError: unhashable type: 'Series'
print(df['X'])  # TypeError: cannot convert the series to <class 'int'>
The following extended example shows that the DataFrame can appear uncorrupted as long as the display does not reach the problematic column names:
import pandas as pd
import numpy as np
# Define a DataFrame of size 20x60
data = np.random.randint(1, 100, size=(20, 60))
columns = [f'Col_{i}' for i in range(60)]
df = pd.DataFrame(data, columns=columns)
# Create a Series for renaming columns; all index labels are unique except 'Col_29', which appears twice in the middle
new_names = [f'New_{i}' for i in range(61)]
old_names = [f'Col_{i}' for i in range(30)] + ['Col_29'] + [f'Col_{i}' for i in range(30, 60)]
rename_series = pd.Series(new_names, index=old_names)
# Apply renaming to the DataFrame
df.rename(columns=rename_series, inplace=True)
df  # works
df['New_0']  # TypeError: cannot convert the series to <class 'int'>
Issue Description
When renaming DataFrame columns with a Series whose index contains duplicate labels, no error is raised, but the resulting DataFrame is corrupted.
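A likely explanation, offered as an assumption rather than a confirmed diagnosis of the pandas internals: looking up a duplicated label in a Series returns a Series, not a scalar, so that object may end up being used as a column label. The sketch below only demonstrates the lookup behavior itself, using the `rename_series` from the first example.

```python
import pandas as pd

# Same mapper as in the reproducible example: label 'B' appears twice
rename_series = pd.Series(['X', 'Y', 'Z', 'W'], index=['A', 'B', 'B', 'C'])

# A unique label resolves to a single scalar value
unique_lookup = rename_series['A']

# A duplicated label resolves to a Series holding every matching value
duplicate_lookup = rename_series['B']

print(type(unique_lookup).__name__)
print(type(duplicate_lookup).__name__, list(duplicate_lookup))
```

If the rename logic performs a per-label lookup like this (an assumption), the new "name" for column `B` would be a two-element Series, which is unhashable and would be consistent with the `TypeError` messages reported above.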
Expected Behavior
It should either produce a valid DataFrame, as using Series.to_dict() instead would, or raise an error during the rename.
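Until this is resolved, both expected behaviors can be approximated on the caller's side. The following is a minimal sketch, not a proposed patch: `checked_rename_columns` is a hypothetical helper name introduced here for illustration, and the error message wording is an assumption.

```python
import pandas as pd


def checked_rename_columns(df, mapper):
    """Hypothetical helper: refuse a Series mapper with duplicate index labels."""
    if isinstance(mapper, pd.Series) and not mapper.index.is_unique:
        dupes = mapper.index[mapper.index.duplicated()].unique().tolist()
        raise ValueError(f"rename mapper has duplicate index labels: {dupes}")
    return df.rename(columns=mapper)


df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
rename_series = pd.Series(['X', 'Y', 'Z', 'W'], index=['A', 'B', 'B', 'C'])

# Option 1: convert to a dict first; for a duplicated label the later entry wins
renamed = df.rename(columns=rename_series.to_dict())
print(list(renamed.columns))

# Option 2: fail early instead of producing a corrupted DataFrame
try:
    checked_rename_columns(df, rename_series)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

The dict route silently drops one of the duplicate mappings, so the explicit check is the safer choice when duplicates indicate a mistake in the mapper.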
Installed Versions
INSTALLED VERSIONS
commit : d9cdd2e
python : 3.12.3.final.0
python-bits : 64
OS : Darwin
OS-release : 23.4.0
Version : Darwin Kernel Version 23.4.0: Fri Mar 15 00:12:41 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T8103
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8
pandas : 2.2.2
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0.post0
setuptools : None
pip : 24.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : 1.4.6
psycopg2 : 2.9.9
jinja2 : None
IPython : 8.23.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : 2.0.29
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None