COMPAT: df.to_dict('index') is not preserving dtypes #18580

Closed
Anton-vBR opened this Issue Nov 30, 2017 · 2 comments

Anton-vBR commented Nov 30, 2017

Code Sample

import pandas as pd
from pandas.core.common import standardize_mapping

df = pd.DataFrame(dict(ints=range(5),floats=[float(i) for i in range(5)]))

into_c = standardize_mapping(dict)

print('Using iterrows:')
print(into_c((k, v.to_dict(dict)) for k, v in df.iterrows()))

Problem description

The problem appears when exporting a DataFrame to a dict with 'index' as the orient parameter when that DataFrame contains both float and int columns: the int values come back as floats. The expected behaviour would be to keep ints and floats separate (as shown below). As recommended by the docs, we could use itertuples instead. I have included a code snippet that could work. (Disclaimer: Sorry, this should perhaps be a pull request, but I thought it could fit here since I'm not familiar with the whole process.)

Expected Output

{0: {'ints': 0, 'floats': 0.0}, 1: {'ints': 1, 'floats': 1.0}, 2: {'ints': 2, 'floats': 2.0}, 3: {'ints': 3, 'floats': 3.0}, 4: {'ints': 4, 'floats': 4.0}}

print('\nUsing itertuples:')
print(into_c((t[0],dict(zip(df.columns,t[1:]))) for t in df.itertuples()))
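For readers who want to check the difference without the private standardize_mapping helper, here is a minimal sketch of the same comparison using plain dict comprehensions. The variable names are illustrative and not part of pandas; the only assumption is that iterrows() builds one Series per row, which upcasts a mixed int/float row to float.

```python
import numbers

import pandas as pd

df = pd.DataFrame({"ints": range(3), "floats": [0.0, 1.0, 2.0]})

# iterrows() yields (index, Series) pairs. Each row becomes a single Series,
# so a row mixing int and float values is upcast to a float dtype.
rows_via_iterrows = {k: v.to_dict() for k, v in df.iterrows()}

# itertuples() yields one tuple per row, built column by column,
# so each value keeps the type of its own column.
rows_via_itertuples = {
    t[0]: dict(zip(df.columns, t[1:])) for t in df.itertuples()
}

# Inspect the type of the same cell under both approaches.
print(type(rows_via_iterrows[0]["ints"]))
print(type(rows_via_itertuples[0]["ints"]))

# True when the cell survived as an integer rather than a float.
ints_preserved = isinstance(rows_via_itertuples[0]["ints"], numbers.Integral)
print(ints_preserved)
```

The values compare equal either way (0 == 0.0), so the difference only shows up when the type matters, for example when the dict is serialized to JSON.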

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.0
pytest: 2.9.2
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.13.1
scipy: 0.18.1
pyarrow: None
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.3.1
numexpr: 2.6.1
feather: None
matplotlib: 1.5.3
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Contributor

jreback commented Dec 1, 2017

@Anton-vBR not terribly averse to changing this if it will work in existing cases and also preserve dtypes. want to submit a PR?

jreback added this to the Next Major Release milestone Dec 1, 2017

jreback changed the title from pd.to_dict('index') is not preserving dtypes to COMPT: df.to_dict('index') is not preserving dtypes Dec 1, 2017

jreback changed the title from COMPT: df.to_dict('index') is not preserving dtypes to COMPAT: df.to_dict('index') is not preserving dtypes Dec 1, 2017

Anton-vBR commented Dec 1, 2017

@jreback Ok sure, I'll have a look at doing the necessary steps to submit a PR.

reidy-p referenced this issue Mar 21, 2018

Merged

API: Preserve int columns in to_dict('index') #20444

4 of 4 tasks complete

jreback modified the milestones: Next Major Release, 0.23.0 Mar 22, 2018
