BUG: pd.DataFrame.from_dict when first value is a list #29213

Open
MaddeeRubenson opened this issue Oct 24, 2019 · 4 comments
Labels
Bug · Constructors: Series/DataFrame/Index/pd.array Constructors · Nested Data: Data where the values are collections (lists, sets, dicts, objects, etc.)

Comments

@MaddeeRubenson

Reproducible example:

import pandas as pd

tmp_scen = {'name': 'Current conditions',
 'actionSet': ['not_irrigated', 'cover_crop_none', 'mar_none'],
 'cost': 0.0,
 'totalNitrogen': 9.649798364504209,
 'totalPhosphorus': 2.4570326616674745,
 'sediment': 1.2039077716577735,
 'appliedWater': 11.607623110151138,
 'infiltrationVolume': 0.7424774065057028,
 'acres': 3.382359912104316,
 'irrType': 'not_irrigated'}

# Works: the first value ('name') is a scalar string.
good_result = pd.DataFrame.from_dict(tmp_scen, orient='index')

tmp_scen = {'actionSet': ['not_irrigated', 'cover_crop_none', 'mar_none'],
 'name': 'Current conditions',
 'cost': 0.0,
 'totalNitrogen': 9.649798364504209,
 'totalPhosphorus': 2.4570326616674745,
 'sediment': 1.2039077716577735,
 'appliedWater': 11.607623110151138,
 'infiltrationVolume': 0.7424774065057028,
 'acres': 3.382359912104316,
 'irrType': 'not_irrigated'}

# Raises TypeError on pandas 0.24.2: the first value ('actionSet') is a list.
error_result = pd.DataFrame.from_dict(tmp_scen, orient='index')

Problem description

pd.DataFrame.from_dict(..., orient='index') raises an error when the first value in the dictionary is a list. The same dictionary with a scalar as its first value works. See the reproducible example above.

TypeError: object of type 'float' has no len()

Expected Output

Out[369]: 
                                                             0
actionSet           [not_irrigated, cover_crop_none, mar_none]
name                                        Current conditions
cost                                                         0
totalNitrogen                                           9.6498
totalPhosphorus                                        2.45703
sediment                                               1.20391
appliedWater                                           11.6076
infiltrationVolume                                    0.742477
acres                                                  3.38236
irrType                                          not_irrigated

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.3.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None

pandas: 0.24.2
pytest: 4.4.1
pip: 19.0.3
setuptools: 40.8.0
Cython: 0.29.6
numpy: 1.16.2
scipy: 1.2.1
pyarrow: None
xarray: None
IPython: 7.4.0
sphinx: 1.8.5
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: 1.2.1
tables: 3.5.1
numexpr: 2.6.9
feather: None
matplotlib: 3.0.3
openpyxl: 2.6.1
xlrd: 1.2.0
xlwt: 1.3.0
xlsxwriter: 1.1.5
lxml.etree: 4.3.2
bs4: 4.7.1
html5lib: 1.0.1
sqlalchemy: 1.3.1
pymysql: 0.9.3
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

@WillAyd
Member

WillAyd commented Oct 24, 2019

This seems to work fine on master, though not with the result you would expect. It is ambiguous whether you mean to get a 1-column DataFrame, or a 3-column DataFrame and just made a programming mistake with this kind of construction.

Not sure if there is anything to do here. Note that you can construct a Series instead if you want 1D.
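
A minimal sketch of the Series route, using a shortened stand-in for the reporter's dictionary with the list value first. The variable names and the optional `.to_frame()` step are illustrative assumptions, not something proposed in the comment above.

```python
import pandas as pd

# Shortened stand-in for the reporter's dictionary, list value first.
scen = {
    "actionSet": ["not_irrigated", "cover_crop_none", "mar_none"],
    "name": "Current conditions",
    "cost": 0.0,
}

# A Series keeps each dictionary value as a single element, list or scalar.
as_series = pd.Series(scen)

# Optional: promote the Series to a one-column DataFrame.
as_frame = as_series.to_frame()
print(as_frame)
```

The Series keeps the dictionary keys as its index, so the layout matches the one-column output the reporter expected.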

@asishm
Contributor

asishm commented Oct 24, 2019

@WillAyd both examples use the same dictionary albeit with a different order of keys.

Edit: a simpler example:

In [25]: pd.DataFrame.from_dict({'a': [1,2,3], 'b': 1}, orient='index')
# TypeError: object of type 'int' has no len()
In [26]: pd.DataFrame.from_dict({'b': 1, 'a': [1,2,3]}, orient='index')
Out[26]:
           0
b          1
a  [1, 2, 3]

Note that the initial object itself is the same in both examples. Python dicts are only guaranteed to preserve insertion order from 3.7 onward (iirc).
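
One possible workaround, sketched on the assumption that a one-column result is what is wanted: wrap every value in a one-element list, so each row has length 1 whatever the key order. The helper name `one_column_frame` is hypothetical. The idea that the constructor decides how to treat the values by looking at the first one is inferred from the behaviour shown in this thread, not confirmed here.

```python
import pandas as pd


def one_column_frame(d):
    """Hypothetical helper: build a one-column frame from a flat dict."""
    # Wrapping each value in a list makes every row a length-1 sequence,
    # so the result should not depend on which key comes first.
    wrapped = {key: [value] for key, value in d.items()}
    return pd.DataFrame.from_dict(wrapped, orient="index")


# Same two key orders as the simpler example above.
list_first = one_column_frame({"a": [1, 2, 3], "b": 1})
scalar_first = one_column_frame({"b": 1, "a": [1, 2, 3]})
print(list_first)
print(scalar_first)
```

Both calls should yield a frame with one column in which the list is stored as a single cell, so the result no longer depends on which key comes first.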

@jbrockmendel jbrockmendel added the Constructors Series/DataFrame/Index/pd.array Constructors label Oct 30, 2019
@matttan90
Contributor

Hey! I'm having a go at this.

@jreback jreback added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Nov 3, 2019
@jreback jreback added this to the 1.0 milestone Nov 3, 2019
@TomAugspurger TomAugspurger removed this from the 1.0 milestone Jan 8, 2020
@mroeschke mroeschke added Bug and removed Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Jun 28, 2020
@jbrockmendel jbrockmendel added List-Like Scalars Nested Data Data where the values are collections (lists, sets, dicts, objects, etc.). and removed List-Like Scalars labels Sep 21, 2020
@Dr4cky

Dr4cky commented Jan 20, 2023

Wow, I did not think changing the order would fix my issue, but it did. Having a list as the very first value in a dictionary gave me the "object of type 'float' has no len()" error.

9 participants