BUG: Pandas changes dtypes of columns when no float (or other) assignments are done to this column #34573

Closed
volsend opened this issue Jun 4, 2020 · 1 comment · Fixed by #34599
Labels
Bug; Dtype Conversions (Unexpected or buggy dtype conversions); Indexing (Related to indexing on series/frames, not to indexes themselves)

volsend commented Jun 4, 2020

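import pandas as pd

# Start from an empty frame with one integer column and two float columns.
df = pd.DataFrame()
df["a"] = int(0)

for col_name in ["b", "c"]:
    df[col_name] = 0.

print(df.dtypes)

# Append rows by setting with enlargement; column 'a' only ever receives ints.
for idx, b in enumerate([1, 2, 3]):
    df.loc[df.shape[0]] = {
        'a': int(idx),
        'b': float(b),
        'c': float(b),
    }

print(df.dtypes)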

Problem description

In the above example, pandas changes the dtype of column a of the DataFrame df from int64 to float64, even though no operation assigns a non-integer value to this column.

Expected Output

Currently, the output of this code is:

a      int64
b    float64
c    float64
dtype: object
a    float64
b    float64
c    float64
dtype: object

while the expected one is:

a      int64
b    float64
c    float64
dtype: object
a      int64
b    float64
c    float64
dtype: object
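
One way to end up with the expected dtypes without relying on row-by-row enlargement is to collect the rows first and build the frame in a single step. This is only a sketch of a possible workaround, not something proposed in this issue; the `rows` list is a name introduced here for illustration.

```python
import pandas as pd

# Collect each row as a dict instead of assigning through df.loc one at a time.
rows = [
    {"a": int(idx), "b": float(b), "c": float(b)}
    for idx, b in enumerate([1, 2, 3])
]

# Building the frame from all rows at once lets pandas infer each column's
# dtype from that column's values only, so 'a' stays an integer column.
df = pd.DataFrame(rows, columns=["a", "b", "c"])

print(df.dtypes)
```

Because each column is inferred separately, the integer values in a are never mixed with the floats in b and c.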
pd.show_versions() output
INSTALLED VERSIONS
------------------
commit           : None
python           : 3.6.9.final.0
python-bits      : 64
OS               : Linux
OS-release       : 4.15.0-101-generic
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : C.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.0.4
numpy            : 1.18.1
pytz             : 2019.3
dateutil         : 2.8.1
pip              : 20.1.1
setuptools       : 44.0.0
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.5.0
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 2.10.3
IPython          : 7.11.1
pandas_datareader: None
bs4              : None
bottleneck       : None
fastparquet      : None
gcsfs            : None
lxml.etree       : 4.5.0
matplotlib       : 3.1.2
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
pytest           : None
pyxlsb           : None
s3fs             : None
scipy            : 1.4.1
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : 1.2.0
xlwt             : None
xlsxwriter       : None
numba            : None
volsend added the Bug and Needs Triage labels Jun 4, 2020

jorisvandenbossche (Member) commented

My guess is that we might convert the dictionary into a Series first (which would turn it into a float series). Just a guess though, didn't actually check.
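
The upcasting described in this guess can be seen in isolation. The sketch below only shows what happens when a dict mixing int and float values becomes a Series; it does not confirm that `df.loc` setting takes this path internally. The `row` and `as_series` names are illustrative.

```python
import pandas as pd

# A row like the ones assigned in the report: one int value, two float values.
row = {"a": 0, "b": 1.0, "c": 1.0}

# A Series holds a single dtype, so the mixed int/float values are
# upcast to a common float64 dtype.
as_series = pd.Series(row)

print(as_series.dtype)
print(type(row["a"]).__name__, "->", type(as_series["a"]).__name__)
```

If the row passes through such a Series before being written into the frame, the integer for column a would already be a float by the time it is assigned.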

simonjayhawkins added a commit to simonjayhawkins/pandas that referenced this issue Jun 5, 2020
simonjayhawkins added the Dtype Conversions and Indexing labels and removed the Needs Triage label Jun 5, 2020
jreback added this to the 1.1 milestone Jun 9, 2020
jreback pushed a commit that referenced this issue Jun 9, 2020
4 participants