Better error messages #497

Closed
jankatins opened this issue Dec 16, 2011 · 3 comments

Comments

@jankatins
Contributor

commented Dec 16, 2011

I'm trying to build a DataFrame from a list of lists, and I get an "AssertionError" with no indication of what I did wrong or what I should do differently. A shortened version of the data and code:

pprint(_fields)
[ 'name',
'timescited',
'weight',
'closeness',
'betweenness',
'degree',
'numberofworks',
'pagerank',
'pages',
'constraint',
'citationaverage',
'eigenvector']

pprint(_data2)
[ ['Huselid, Ma', 'Gulati, R', 'Damanpour, F', 'Mcallister, Dj', 'Tsai, Wp'],
[2721, 5251, 1269, 1287, 2834],
[6, 17, 6, 3, 6],
[ 0.0002510562854116362,
0.00025108339541739277,
0.00025105435150127433,
0.00025106898828104144,
0.0002510723039607744],
[ 23311.0,
173596.65383408728,
79279.82282582425,
18701.108026093425,
57261.74716265692],
[4, 11, 9, 3, 7],
[6, 17, 6, 3, 6],
[ 0.0001438440250079284,
0.0003063098736173118,
0.00027071793856870986,
7.839463995668608e-05,
0.00020047003509480145],
[130, 387, 148, 68, 88],
[ 0.28573421556122447,
0.12411566040831735,
0.19959178902739988,
0.3703749360800643,
0.200788126303937],
[453, 308, 211, 429, 472],
[ 5.889013732612275e-08,
0.0005066654664776551,
4.567816673280219e-07,
2.4226314239685797e-06,
4.069482619394397e-07]]

df = DataFrame(_data2, columns=_fields )
Traceback (most recent call last):
File "", line 1, in
File "C:\portabel\Python27\lib\site-packages\pandas\core\frame.py", line 208, in __init__
copy=copy)
File "C:\portabel\Python27\lib\site-packages\pandas\core\frame.py", line 273, in _init_ndarray
block = make_block(values.T, columns, columns)
File "C:\portabel\Python27\lib\site-packages\pandas\core\internals.py", line 211, in make_block
do_integrity_check=do_integrity_check)
File "C:\portabel\Python27\lib\site-packages\pandas\core\internals.py", line 26, in __init__
assert(len(items) == len(values))
AssertionError

It would be nice if pandas could output a more descriptive error message in case some inputs are not what pandas expects.

Thanks a lot for the lib!

Jan

@jankatins

Contributor Author

commented Dec 16, 2011

I found the problem after some trial and error: _data2 needs to be a dict... The problem of the unhelpful error message remains :-)
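
For anyone landing here later, a minimal sketch of the dict approach, assuming each inner list holds the values of one column. The names `fields`, `columns_data`, and `data_by_column` are made up for illustration, the data is a shortened stand-in for `_data2`, and the pandas call is guarded in case pandas is not installed:

```python
# Shortened stand-in for _fields and _data2: one inner list per column.
fields = ["name", "timescited", "weight"]
columns_data = [
    ["Huselid, Ma", "Gulati, R"],
    [2721, 5251],
    [6, 17],
]

# Pair each column name with its list of values.
data_by_column = dict(zip(fields, columns_data))

try:
    import pandas as pd
except ImportError:
    pd = None  # pandas is optional for this sketch

if pd is not None:
    # Passing columns=fields keeps the columns in the original order.
    df = pd.DataFrame(data_by_column, columns=fields)
    print(df)
```

A dict maps column names to column values directly, so no transposition is needed.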

@wesm

Member

commented Dec 18, 2011

ok, here's the new exception:

In [3]: DataFrame(data2, columns=fields)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
/home/wesm/code/pandas/<ipython-input-3-7de2f1fbdd32> in <module>()
----> 1 DataFrame(data2, columns=fields)

/home/wesm/code/pandas/pandas/core/frame.pyc in __init__(self, data, index, columns, dtype, copy)
    209         elif isinstance(data, list):
    210             if len(data) > 0 and isinstance(data[0], (list, tuple)):
--> 211                 data, columns = _list_to_sdict(data, columns)
    212                 mgr = self._init_dict(data, index, columns, dtype=dtype)
    213             else:

/home/wesm/code/pandas/pandas/core/frame.pyc in _list_to_sdict(data, columns)
   3557         if len(columns) != len(content):
   3558             raise AssertionError('%d columns passed, passed data had %s '
-> 3559                                  'columns' % (len(columns), len(content)))
   3560 
   3561     sdict = dict((c, lib.maybe_convert_objects(vals))

AssertionError: 12 columns passed, passed data had 5 columns

for this data you'd want to do:


In [7]: DataFrame(zip(*data2), columns=fields)
Out[7]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5 entries, 0 to 4
Data columns:
name               5  non-null values
timescited         5  non-null values
weight             5  non-null values
closeness          5  non-null values
betweenness        5  non-null values
degree             5  non-null values
numberofworks      5  non-null values
pagerank           5  non-null values
pages              5  non-null values
constraint         5  non-null values
citationaverage    5  non-null values
eigenvector        5  non-null values
dtypes: int64(6), float64(5), object(1)

This works because your data has the columns in the rows: DataFrame expects a list of rows, and zip(*data) transposes a list of lists.
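
A small sketch of what the transposition does, using plain Python and a shortened stand-in for the data (the names `column_major` and `row_major` are illustrative only). On Python 3, zip returns an iterator, so the sketch wraps it in list():

```python
# Column-major input: each inner list is one column, as in the report above.
column_major = [
    ["Huselid, Ma", "Gulati, R"],  # name
    [2721, 5251],                  # timescited
    [6, 17],                       # weight
]

# zip(*column_major) takes the i-th element of every inner list and
# groups them together, which yields one tuple per row.
row_major = list(zip(*column_major))

for row in row_major:
    print(row)
```

The result has one tuple per record, which is the list-of-rows layout that DataFrame(data, columns=...) expects.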

@wesm wesm closed this Dec 18, 2011

@jankatins

Contributor Author

commented Dec 19, 2011

Thanks a lot for this change!

I'm still not sure I would have understood the error. How about something like "12 column name(s) passed, but passed data had 5 columns." or even an added "... If you passed in a list of lists, maybe transposing the data with zip(*data) helps."?
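
A rough sketch of how such a check could look. The helper `check_column_count` is hypothetical and not part of pandas; raising ValueError and only adding the hint when the number of inner lists equals the number of column names are assumptions of this sketch:

```python
def check_column_count(data, columns):
    """Hypothetical helper (not pandas API): validate a list of rows."""
    n_names = len(columns)
    n_data_cols = len(data[0]) if data else 0
    if n_names == n_data_cols:
        return

    message = "%d column name(s) passed, but passed data had %d columns." % (
        n_names, n_data_cols)
    # If there is exactly one inner list per column name, the caller
    # probably passed columns where rows were expected.
    if len(data) == n_names:
        message += (" If you passed in a list of lists, maybe transposing"
                    " the data with zip(*data) helps.")
    raise ValueError(message)


fields = ["name", "timescited", "weight"]
data = [["Huselid, Ma", "Gulati, R"], [2721, 5251], [6, 17]]

try:
    check_column_count(data, fields)
except ValueError as exc:
    error_text = str(exc)
    print(error_text)

# After transposing, the same check passes silently.
check_column_count(list(zip(*data)), fields)
```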
