SUGGESTION + BUG: address multiple JSON input files/streams separately #64
Comments
+1 on the idea; I think it would be great to have multiple named input streams, so you can do joins and such. Currently I think you'd have to munge each stream into fields of an umbrella object. A clean way you could do it would be to have a built-in function
Where, borrowing from an example in the manual,
and
giving the result
That way you could do command substitution to get data from curl or what have you:
And not have to include an HTTP client in jq. :) You could keep the meaning of |
Or, use bash-style argument variables $1, $2,
BTW: can you have several arguments of the form
I agree, it's not the unix way to include an HTTP client in jq. (One issue is that JSON is a specific use of HTTP, and you often must include headers to that effect. It would be nice to hide that.)
EDIT ah! The nature of JSON documents is that we can already distinguish them, regardless of whether they have been concatenated or are in separate files/streams, because a sequence of JSON instances does not have enclosing {} or [], and the documents are not comma separated. Therefore,
It's a question of accessing this list (which might not fit with the
Therefore, we have a syntax like
Note that the leading
Can also do it with variables:
And with
EDIT2 sorry, I'm wrong about the "rooted path" - the root changes when you filter. So you need to capture it in a variable at the start. For this reason, a function won't work on its own (but we can use function syntax by having it access that variable).
BTW: need the extra
The above is to demonstrate the generality. I think using one specific variable, and streaming the other, is clearer for this specific code:
It turns out to be similar to the example code in the manual for this, just using
The crash is because jq uses 16-bit bytecode indexes internally, and putting 5k lines of JSON into the program overflows these. That limit will probably remain in place for the next while, but it should definitely give an error message rather than just crash.
As @13ren points out, you can more or less solve your problem by
An HTTP client in jq would certainly be useful; a lot of my (and probably everyone else's) use cases have been
For the moment, I think I'm more likely to hack up a
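A minimal sketch of keeping large JSON out of the program text: both documents are fed to jq as input and combined there, so the filter stays a few bytes long regardless of document size. The sample documents and the `air_temp` field layout are assumptions standing in for real curl responses, and the `-s` (slurp) and `-c` (compact output) flags are assumed to be available in the installed jq.

```shell
#!/usr/bin/env bash
# Hypothetical sample documents standing in for two curl responses.
weather='{"air_temp": 21.5, "station": "demo"}'
record='{"id": 7, "note": "existing record"}'

# Both documents travel on stdin, so the jq program itself stays tiny.
# -s collects all inputs into one array; the first element is captured
# in $w and the second becomes the value being modified.
combined=$(printf '%s\n%s\n' "$weather" "$record" |
  jq -c -s '.[0] as $w | .[1] | .air_temp = $w.air_temp')

echo "$combined"
```

Capturing the first document in a variable and letting the second flow through the filter follows the "one specific variable, stream the other" idea discussed above.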
@stedolan thanks, a
Also, I think accessing unix tools as if they emitted JSON could be profound. I've seen some research on this concept, of structured-data versions of unix tools (using XML though), and I believe Microsoft's PowerShell does it too.
@13ren, as you discovered, you can do as many process substitutions as you like and they all appear as separate files. More here: https://en.wikipedia.org/wiki/Process_substitution
I had originally thought of the
Your
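As a sketch of the process-substitution approach, assuming bash (or another shell that supports `<(...)`) and a jq that has `-s`: each substitution shows up as its own file argument, and slurping them gives an array indexed by position. The `echo` commands are placeholders for curl or any other JSON-producing command, and the field names are made up for illustration.

```shell
#!/usr/bin/env bash
# Each <(...) appears to jq as a separate file argument.
# -s slurps all of them into one array; + merges the two objects.
merged=$(jq -c -s '.[0] + .[1]' \
  <(echo '{"name": "buoy-1"}') \
  <(echo '{"air_temp": 21.5}'))

echo "$merged"
```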
@jkleint thanks, saying "process substitution" enabled me to find it in the man page. (I'd searched for "<(", but when concentrating on getting the escaping right, I was below the entry, and man's search doesn't wrap, so didn't find it). A cool feature, esp combined with
I agree
re:
A simple solution is to check the number of arguments - but this can be subverted by omitting JSON values from other files. I think a totally secure
Or, just run each file through
Of course, simplest is to have a command line switch in
Maybe addressing multiple JSON documents separately is outside the scope of the "stream processing" philosophy, however: sed and awk both have script commands to read and write specific files... so, it is at least aligned with the concept of "sed for JSON", if not "stream processing".
Maybe there's a way/idiom to do this already?
example of my present usage:
I'm using jq twice, once to extract air_temp from one JSON stream, then again to append it to another - it would be nice to combine them. I'm not sure of syntax, but maybe sed-like r myfilename .... [even nicer, get myurl]. This would have the same syntactic role and semantic effect as object/array construction. Could assign them to variables, etc.
example of proposed usage:
BUG:
Actually, I see now that I could just use $(curl -s $h_url) to combine them. However, there seems to be a bug (maybe overflow? because too big for a constructed object? It's fine when streamed.). I'm using jq version 1.2:
I still like the idea of facilitating/encouraging this kind of use of jq.
BTW: in my actual code, I'm getting and putting to http://jsonblob.com/api (not file t as above).