Want json-sort for sorting json objects #17

Open
nfitch opened this issue Sep 9, 2013 · 4 comments

Comments

@nfitch
Contributor

nfitch commented Sep 9, 2013

As far as I can tell from searching, there is no existing app that does external merge sorts of JSON objects. To sort JSON objects in Manta, it seems someone would need to extract a field in a first pass, sort using the standard Unix sort, and then remove the field in a final pass. Of course, all of this could be done as part of the user's "standard" data processing, but that requires more complexity than most people want or need.

manta-compute-bin may not be the right place for a tool like this, but it's a good place to record the idea. Also, it'd be nice to have these options:

  1. Control over what happens to objects that don't have the field
  2. Filter rows that don't need to be in the final sort
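
The three passes described above (extract a field, sort, remove the field) could be sketched roughly as follows in Python. This is only an illustration, not an existing tool: the names `decorate` and `undecorate`, the `missing` and `keep` parameters, and the tab-separated layout are all assumptions made for the example. It also assumes one JSON object per line, keys that contain no tab or newline, and plain string comparison of keys. The call to `sorted()` stands in for the Unix sort, which is what would run between the two passes on real data.

```python
import json

def decorate(lines, field, missing="drop", keep=None):
    """First pass: yield 'key<TAB>original line' for each JSON object line.

    missing="drop" discards objects without the field; missing="first"
    gives them an empty key so they sort ahead of everything else.
    keep is an optional predicate used to filter rows before the sort.
    (All of these names are hypothetical.)
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        if keep is not None and not keep(obj):
            continue  # option 2: row is not wanted in the final sort
        if field not in obj:
            if missing == "drop":
                continue  # option 1: discard objects lacking the field
            key = ""
        else:
            key = str(obj[field])
        yield key + "\t" + line

def undecorate(lines):
    """Final pass: strip the key that the first pass prepended."""
    for line in lines:
        yield line.split("\t", 1)[1]

rows = ['{"name": "pear", "n": 2}', '{"n": 9}', '{"name": "apple", "n": 5}']
for out in undecorate(sorted(decorate(rows, "name", missing="first"))):
    print(out)
```

Because the key is compared as text, numeric fields would need a numeric sort in the middle step to come out in numeric order.
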
@davepacheco
Contributor

I've needed something like this in order to use "diff" with large json objects. It's poorly documented, but it's here if you want to use it: https://github.com/davepacheco/kartlytics/blob/items/tools/json_normalize

It seems like you should be able to do this with "json -e" as well, but I haven't been able to make that work.

@trentm
Contributor

trentm commented Sep 9, 2013

Would something first-class in json be useful/desirable here? I've occasionally wanted to be able to sort arrays of objects with "json -s key" or something.

@davepacheco
Contributor

Potentially, yes. The "json_normalize" tool I linked to sorts all object keys lexicographically and lets the user specify a field on which to sort arrays of objects. It lets you specify N of these keys that it uses for the first N arrays it finds, depth-wise. This approach assumes the objects basically look alike, though. I wonder if there's a better way to express it.
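
As a rough illustration of the behaviour described above, here is a minimal Python sketch. It is not the json_normalize code, which is not reproduced here. The function name `normalize`, the `sort_fields` parameter, and the reading of "depth-wise" as "the k-th field applies to arrays at the k-th level of array nesting" are assumptions. Arrays are left in their original order unless every element is an object that has the chosen field, and the field values are assumed to be mutually comparable.

```python
import json

def normalize(value, sort_fields, depth=0):
    """Return a copy with object keys sorted and arrays of objects sorted.

    sort_fields[k] is the (assumed) field used for arrays found at array
    nesting level k; deeper arrays keep their original order.
    """
    if isinstance(value, dict):
        # Python dicts keep insertion order, so inserting keys in sorted
        # order produces lexicographically ordered output.
        return {k: normalize(value[k], sort_fields, depth) for k in sorted(value)}
    if isinstance(value, list):
        items = [normalize(v, sort_fields, depth + 1) for v in value]
        field = sort_fields[depth] if depth < len(sort_fields) else None
        if field is not None and all(isinstance(i, dict) and field in i for i in items):
            items.sort(key=lambda i: i[field])
        return items
    return value

doc = {"b": [{"id": 2, "z": 1, "a": 0}, {"id": 1}], "a": 1}
print(json.dumps(normalize(doc, ["id"])))
```

Two documents normalized this way can then be compared line by line with diff after pretty-printing.
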

@nfitch
Contributor Author

nfitch commented Sep 10, 2013

I wouldn't mind if it ends up in the json tool, as long as trentm/json#49 is done beforehand. That said, the json tool can't assume that all the JSON objects will fit in memory; it's going to have to implement some sort of external merge sort (like the Unix sort command does today). I wasn't sure that sorting complexity should be put in the json tool, but it's really up to you, trentm.
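
For reference, an external merge sort over newline-delimited JSON might look something like the Python sketch below. It is an illustration under stated assumptions, not a proposal for how the json tool should do it: `external_sort`, `chunk_size`, and the helper names are made up for the example, every object is assumed to have the sort field, and keys are compared as strings. Each chunk that fits in memory is sorted and spilled to a temporary file as a sorted run, and the runs are then merged lazily with `heapq.merge`.

```python
import heapq
import json
import tempfile

def _write_run(pairs):
    """Sort one in-memory chunk and spill it to a temporary file."""
    pairs.sort()
    run = tempfile.TemporaryFile(mode="w+", encoding="utf-8")
    for key, line in pairs:
        run.write(json.dumps([key, line]) + "\n")
    run.seek(0)
    return run

def _read_run(run):
    """Stream (key, line) pairs back from a sorted run."""
    for raw in run:
        key, line = json.loads(raw)
        yield key, line

def external_sort(lines, field, chunk_size=100_000):
    """Yield JSON lines ordered by str(obj[field]), holding at most
    chunk_size lines in memory while building the sorted runs."""
    runs, chunk = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        chunk.append((str(json.loads(line)[field]), line))
        if len(chunk) >= chunk_size:
            runs.append(_write_run(chunk))
            chunk = []
    if chunk:
        runs.append(_write_run(chunk))
    try:
        # heapq.merge reads one pair at a time from each sorted run.
        for _key, line in heapq.merge(*(_read_run(r) for r in runs)):
            yield line
    finally:
        for run in runs:
            run.close()
```

With a very large number of runs, a real implementation would also need to merge in several passes to stay under the open-file limit, which this sketch does not attempt.
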

3 participants