-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added support for constructing a table from a list of dicts #901
Conversation
@ryanfox - thanks for getting started on this. [EDIT: removed comment about _init_from_dict] |
try: | ||
cols[name].append(row[name]) | ||
except KeyError: | ||
cols[name].append(None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can do this kind of logic more cleanly with
cols[name].append(row.get(name, None))
However in this case there is not yet support for doing anything with the None
value that gets put in the list. This will break something downstream in a way that will be harder for users to figure out. So I would suggest that after the except KeyError:
you should raise an exception like:
raise ValueError('Row {0} is missing a value for column {1!r}'.format(i_row, name))
You would also need to do for i_row, row in enumerate(data):
in the loop above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. One question: in the second format spot, why do you have !r, not just {1}. I'm not sure why a dict key would need any special formatting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The {1!r}
is just a quick way to make the column name enclosed in quotes so it is more distinctive in the error message. But I just realized this may look worse for unicode, so just skip it and use {1}
.
Thanks for the input! I went ahead and implemented your suggestions. I've got a commit ready, but wondering if I should rebase to squash the two commits. Also, I'm not sure if I understand that Travis build page. It looks like it's running different combinations of Python and numpy? Per the testing guidelines I have all the tests passing (excepting 98 skipped and 3 xfailed) after a |
I went ahead and updated my changes with your suggestions. I squashed the commits and re-pushed as well. |
@@ -1051,6 +1054,23 @@ def _init_from_list(self, data, names, dtypes, n_cols, copy): | |||
|
|||
cols = [] | |||
def_names = _auto_names(n_cols) | |||
|
|||
if data and all(isinstance(row, dict) for row in data): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible to specify the column order explicitly via _init_from_dict
. Something like the following should work:
Instead of names = set()
use names_from_data = set()
. Then:
if names is None:
names = sorted(names_from_data)
That does two things. First it defers to the user-supplied names
, which then must provide an ordered list of all the desired columns. Second, if the user didn't supply names
then it uses the sorted names from the data. I think sorting is good because then it provides the output in a predictable order.
Alright, I gave it one more shot. I implemented all the changes discussed above, except waiting on a reply from iguananaut on the loop counter. I'm also holding off on rebasing until the code is in an acceptable state, so I'm not rebasing every commit. Thanks for the inputs again, guys. |
@ryanfox - looks like you can leave the |
I pushed my latest changes, leaving |
Thanks @ryanfox, looks good! I'm merging now. The travis failure is in sphinx-build and appears unrelated to anything in this PR. |
Added support for constructing a table from a list of dicts
At some point this should be updated to work on any dict-like object ( |
Implemented ticket #647. I did not implement anything with regard to recognizing None as missing data, as that got split off into another ticket.
As every dict may not have the same keys, it compiles a set of all the keys across all dicts first, then compiles columns from those key names. Missing values are taken to be None.
After the list of row-dicts is converted to a dict of col-lists, the _init_from_dict() method is called and the function returns.