pkoppstein edited this page Nov 28, 2018 · 291 revisions

Frequently Asked Questions

FAQ and Wiki Authors

𝑸: Who can edit the wiki, including this FAQ?

A: Anyone with a GitHub account.

𝑸: Who wrote this FAQ?

A: Various contributors; see the page's history. Anyone is welcome to add a question to the FAQ, even without an answer.

𝑸: Who wrote the rest of this wiki?

A: Mostly the authors/maintainers.

𝑸: Can I add to the jq Cookbook?

A: Absolutely, please do!

Installation

𝑸: On Windows, how can I install a more recent version of jq than is available using choco, without having to build from source?

A: A wide range of versions of jq are available as jq.exe files from Appveyor. See the Installation page for further details.

𝑸: I have just upgraded jq while retaining a previous version, which was working properly before the upgrade. Now, however, when running the previous version, I get the error message:

dyld: Library not loaded: /usr/local/opt/oniguruma/lib/libonig.4.dylib

What can I do?

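A: Create a symbolic link so that the old library name resolves to the newer library. Run the following from within /usr/local/opt/oniguruma/lib (the directory named in the error message), adjusting the version numbers to match your installation:

ln -s /usr/local/opt/oniguruma/lib/libonig.5.dylib libonig.4.dylib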

𝑸: What are the pre-requisites for compiling and installing jq from GitHub?

A: Building a jq executable from source requires a C development environment, preferably with a recent version of bison (3.0 or newer). If you cannot get a sufficiently recent bison, you can pass the --disable-maintainer-mode option to ./configure. See http://stedolan.github.io/jq/download for details. To get regexp support you will also need to have Oniguruma installed.

𝑸: How do I install oniguruma?

A: On a Mac, we recommend brew install oniguruma. For Linux, use your package manager to find oniguruma-dev or oniguruma-devel. (See also the next FAQ regarding the location of the libonig library.) All else failing, download oniguruma from https://github.com/kkos/oniguruma/archive/master.zip or https://web.archive.org/web/http://www.geocities.jp/kosako3/oniguruma/ and consult the INSTALL file. If you have a recipe to share, please add it to the Installation page on this wiki.

𝑸: When I run make, I get error messages such as "undefined reference to OnigSyntaxPerl_NG".

A: The Oniguruma library may have been installed in a directory that the standard jq installation process does not know about. For example, jq may be expecting the libonig library to be /usr/local/lib64/libonig.so.2 but it might instead be located in /usr/local/lib/. If this is the case, a simple workaround is to create symbolic links for all the libonig* files and then run ./configure again.

𝑸: Are there any complete recipes for installing jq from source?

A: See the main README for jq, and the Installation wiki page.

Caveats

𝑸: Is . really a JSON pretty-printer?

A: Yes and no. jq will by default pretty-print the representation of JSON that it holds in memory, but this may differ significantly from the corresponding input. Specifically, JSON numbers in the input are converted to IEEE 754 64-bit values, so loss of precision and other changes can result. In addition, the regular jq parser will, in effect, ignore all but the last occurrence of duplicate keys within an object.

𝑸: Can jq be used as a JSON validator?

A: Strictly speaking, no. Although jq is fairly strict about what it accepts as JSON, there is currently no "strict" mode, and jq will quietly accept some not-strictly-valid JSON texts, e.g. 00 is mapped to 0. See also the subsection on numbers below. However, jq can be very helpful in pinpointing discrepancies from JSON.

𝑸: Are there restrictions on variable names? Why does the use of $end result in a “syntax error”?

A: Reserved words such as end cannot be used as $-variable names. Sorry😱

For a full listing of keywords and other details, see Keywords.

General Questions

𝑸: Where can I get additional help?

A:

𝑸: How can I access the value of a key with hyphens or $ or other special characters in it? Why does .a.["$"] produce a syntax error?

A: The basic form for accessing the value of a key is .["KEYNAME"] where "KEYNAME" is any valid JSON string, but recent versions of jq also allow ."KEYNAME".

Using the basic form might require explicit use of the pipe symbol, as in .["a-b"]|.["x-y"], but this can be abbreviated to .["a-b"]["x-y"].

In fact, if the expression E | .[F] is valid, then it can be abbreviated to (E)[F], or even E[F] if E is sufficiently simple, e.g. if E is a jq identifier or a dot-separated string of such identifiers. A jq identifier is an alphanumeric string beginning with an alphabetic character, where "alphabetic character" here includes the underscore (_).

Applying these rules, it is apparent that .a | .["$"] can be abbreviated to .a["$"]. However, there is no rule allowing .a.["$"], which is thus syntactically invalid.
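The abbreviation rules above can be checked with a small sketch. The sample objects are invented for illustration, and only forms described in this answer are used:

```shell
# Nested keys containing hyphens: abbreviated form of .["a-b"] | .["x-y"]
nested=$(jq -cn '{"a-b": {"x-y": 1}} | .["a-b"]["x-y"]')
echo "$nested"   # 1

# A key named "$" under .a: abbreviated form of .a | .["$"]
dollar=$(jq -cn '{"a": {"$": 2}} | .a["$"]')
echo "$dollar"   # 2
```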

𝑸: How can "in-place" editing of a JSON file be accomplished? What is jq's equivalent of sed -i?

A: Currently, jq does not have an option to edit a file "in-place" in the manner of the -i option of sed or ruby. There are several alternatives, but using tee or output redirection (>) to overwrite the input file is not recommended, even if it seems to work. (See e.g. http://askubuntu.com/questions/752174).

Here are two reasonable approaches:

(1) Use an explicit temporary file.

For example:

jq ... input.json > tmp.json && mv tmp.json input.json

A more elaborate variation might use mktemp and might check whether the temporary file is empty or identical to the source file.
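A minimal sketch of such a variation follows. The working directory, the file name input.json, and the filter '.count += 1' are all hypothetical, chosen only to make the example self-contained:

```shell
# Hypothetical working directory and input file, for illustration only.
workdir=$(mktemp -d) || exit 1
input="$workdir/input.json"
printf '{"count": 1}\n' > "$input"

# Create the temporary file next to the input so that mv stays on one filesystem.
tmp=$(mktemp "$workdir/tmp.XXXXXX") || exit 1

# Replace the original only if jq succeeded and produced non-empty output.
if jq -c '.count += 1' "$input" > "$tmp" && [ -s "$tmp" ]; then
  mv "$tmp" "$input"
else
  rm -f "$tmp"
fi

cat "$input"   # {"count":2}
```

If jq fails or writes nothing, the original file is left untouched and the temporary file is discarded.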

(2) Use a command-line utility.

For example:

If concurrency is an issue, you will probably want to use flock or chflags uchg.

𝑸: Given an array, A, containing an item, X, how can I find the least index of X in A? Why does [[1]] | index([1]) return null rather than 0? Why does [1,2] | index([1,2]) return 0 rather than null?

A: The simplest uniform method for finding the least index of X in an array is to query for [X] rather than X itself, that is: index([X]).

By contrast, the filter index([1,2]) attempts to find [1,2] as a subsequence of contiguous items in the input array. This is for uniformity with the behavior of t | index(s) where s and t are strings.

If X is not an array, then index([X]) may be abbreviated to index(X).
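The three cases can be compared side by side. This is a sketch with invented sample arrays:

```shell
# Searching for [X] treats X as a single item, even when X is itself an array.
item=$(jq -cn '[[1]] | index([[1]])')
# Searching for [1,2] looks for the contiguous run 1,2 in the input array.
run=$(jq -cn '[0,1,2] | index([1,2])')
# For a non-array X, index(X) is shorthand for index([X]).
scalar=$(jq -cn '[0,1,2] | index(2)')

echo "$item $run $scalar"   # 0 1 2
```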

𝑸: Which date-time functions are sensitive to environment variables?

A: strflocaltime and localtime in jq 1.6rc1 depend on the TZ (time-zone) environment variable, e.g.:

$ TZ=FR jq1.6 -cn 'now|localtime[:5]'
[2018,1,27,6,29]

$ TZ=EST jq1.6 -cn 'now|localtime[:5]'
[2018,1,27,1,29]

$ TZ=EST jq1.6 -cn 'now|strflocaltime("%Y-%m-%dT%H:%M:%S EST")'
"2018-02-27T01:34:59 EST"

$ TZ=Asia/Shanghai jq1.6 -nr '1543200371|strflocaltime("%Y-%m-%dT%H:%M:%S %Z")'
2018-11-26T10:46:11 CST

𝑸: My file contains valid JSON, so why does jq give the error message: "parse error: Invalid numeric literal at EOF at line 1 ..."

A: The encoding of the file might not be valid for JSON text, which is defined as a sequence of Unicode code points. jq currently requires the text be encoded as UTF-8 (and therefore allows ASCII). If you need to convert from one encoding to another, consider using iconv, or if you are using Windows, try pasting your JSON into Notepad and saving the file as a UTF-8 file.

𝑸: How can I "zip" two arrays together? Why doesn't jq have a "zip" function for zipping together two arrays?

A: Use transpose/0, which has more functionality than the typical "zip" function.
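As a short illustration with invented sample arrays, transpose pairs up corresponding items and pads shorter arrays with null:

```shell
# Arrays of equal length: a conventional "zip".
zipped=$(jq -cn '[[1,2,3], ["a","b","c"]] | transpose')
echo "$zipped"   # [[1,"a"],[2,"b"],[3,"c"]]

# Arrays of unequal length: missing items become null.
ragged=$(jq -cn '[[1,2], ["a"]] | transpose')
echo "$ragged"   # [[1,"a"],[2,null]]
```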

𝑸: How can a variable number of arguments be passed to jq? How can a bash array of values be passed in to jq as a single argument?

A: Here is an example showing how embedded spaces in the values can be handled in the context of a bash shell:

$ x=(1 "a b" 2)
$ jq -n --argjson args "$(printf '%s\n' "${x[@]}" | jq -nR '[inputs]')" '$args'
[
  "1",
  "a b",
  "2"
]

This approach is applicable so long as none of the values contains a newline character. To use NUL as the separator, consider:

$ jq -n --argjson args "$(printf '%s\0' "${x[@]}" | jq -Rsc 'split("\u0000")')" '$args'
[
  "1",
  "a b",
  "2"
]

As of February 25, 2017, a variable number of JSON arguments can be passed to jq on the command line using the "--args" and/or "--jsonargs" command-line options. See the manual for details.
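A sketch of these two options, assuming a jq recent enough to provide them (jq 1.6 or later); the argument values are invented for illustration:

```shell
# --args: the remaining arguments are passed as strings.
positional=$(jq -nc '$ARGS.positional' --args 1 "a b" 2)
echo "$positional"   # ["1","a b","2"]

# --jsonargs: the remaining arguments are parsed as JSON texts.
json=$(jq -nc '$ARGS.positional' --jsonargs 1 '"a b"' '{"k": 2}')
echo "$json"   # [1,"a b",{"k":2}]
```

In bash, an array can be passed the same way by writing "${x[@]}" after --args.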

𝑸: Is jq's sort stable?

A: As of January 18, 2016 (7835a72), the builtin sort filter is stable; prior to that, stability was platform-dependent. This means that stability is NOT guaranteed in jq 1.5 or earlier.

𝑸: How can a stream of JSON entities be collected together in an array?

A: For streams generated within a jq program, one approach is simply to wrap the generator within square brackets, e.g. [range(0;10)]. Another option is to use reduce, e.g. reduce range(0;10) as $i ([]; . + [$i]).

For an external stream of JSON entities (e.g. in a file or from an invocation of curl), use the -s (--slurp) command-line option if you are using jq 1.4. For example, the following jq command will emit an array consisting of the input entities:

jq -s .

jq 1.5 includes the streaming filter inputs, which would normally be used in conjunction with the -n option, as in these examples, which produce the same result, namely [1,2]:

$ (echo 1; echo 2) | jq -nc '[inputs]'

$ (echo 1; echo 2) | jq -nc 'reduce inputs as $row ([]; . + [$row])'

𝑸: What is the equivalent of XPath's // expression? How can I find the value of a given key, no matter how deeply nested the object is? How can I find the path to a slot?

A: XPath's // expression allows one to select nodes in an XML document, no matter where they are. For example, //book selects all "book" elements.

The corresponding expression in jq is .., which yields a stream; it is typically used with the ? operator and the empty filter as in this example:

$ jq -nc '[{},{"book":10}] | .. | .book? // empty'
10

Similarly, the jq expression for finding all the paths to "book" nodes is path(.. | .book? // empty); for example:

$ jq -nc '[{},{"book":1}] | path(.. | .book? // empty)'
[1,"book"]

To find all the paths to objects which have a key named "book":

$ jq -nc '[{},{"book":1}] | path(.. | select(type == "object" and has("book")))'
[1]

𝑸: How to extract parts of JSON into shell variables?

A: jq has a way to format text in a shell-safe way. For example, this:

$ eval "$(jq -r '@sh "a=\(.a) b=\(.b)"' sample.json)"

sets shell variables $a and $b to the .a and .b values in the input, assuming these values are atomic (i.e., neither arrays nor objects). To avoid using eval, consider using a bash array, e.g.:


$ data=( $(jq -n '"a\tb","c"| @sh' )  )
$ echo "${data[0]}"
"'a\tb'"

See also the next Q.

𝑸: How can a stream of JSON texts produced by jq be converted into a bash array of corresponding values?

A: One option would be to use mapfile (aka readarray), for example:

mapfile -t array < <(jq -c '.[]' input.json)

An alternative that might be indicative of what to do in other shells is to use read -r within a while loop. The following bash script populates an array, x, with JSON texts. The key points are the use of the -c option, and the use of the bash idiom while read -r value; do ... done < <(jq .......):

#!/bin/bash
x=()
while read -r value
do
  x+=("$value")
done < <(jq -c '.[]' input.json)

𝑸: How can environment variables be passed to a jq program? How can a jq program be parameterized?

A: (1) In jq version 1.4, the primary mechanisms for passing in parameters and/or environment variables are the --arg and --argfile command-line options, e.g. at a Mac or Linux prompt:

$ jq -n -r --arg x abc '$x, ("def" as $x | $x), $x' 

will emit:

abc
def
abc

Note that values passed in this manner are always strings. Recent versions of jq also have the --argjson option. See the jq manual for further options and details.

(2) Careful use of quotation marks can also be helpful, e.g.

$ hello="Goodbye"; jq -n '"He said '"$hello"'!"'

(See the Windows section below regarding quotation marks at a Windows command-line prompt.)

(3) In a shell script, cat << EOF and/or cat << 'EOF' can be helpful.

(4) In sufficiently recent versions of jq (jq>1.4), exported environment variables can be read as illustrated by this snippet:

$ export hello="Goodbye"; jq -n 'env.hello'
"Goodbye"

𝑸: How can I sort an array of strings by length? How can I sort an array using multiple criteria?

A: The key to both questions is sort_by/1. For example, to sort an array of strings by their lengths, one could simply use sort_by(length).

To sort by multiple criteria, we use the fact that jq's sort sorts arrays lexicographically. This means we can simply provide the set of sorting criteria as an array. For example, suppose a triangle is represented by a triple [a, b, c] where each component is the length of one side, and that we wish to sort the triangles first by perimeter, and then by the length of the maximum side. The filter to use is: sort_by( [add, max] ).

For example:

$ jq -c -n '[ [3,4,5], [3,4,6], [3.5, 3.5, 5]] | sort_by( [add, max] )'
[[3,4,5],[3.5,3.5,5],[3,4,6]]

Note: The jq 1.4 reference manual deprecates sort_by/1 in favor of sort/1, but the deprecation has been retracted.
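For the first question, here is a brief sketch with invented sample strings. The second command combines two criteria, length first and then the string itself:

```shell
# Sort by length only.
by_length=$(jq -cn '["ccc", "a", "bb"] | sort_by(length)')
echo "$by_length"   # ["a","bb","ccc"]

# Sort by length, breaking ties alphabetically.
by_length_then_alpha=$(jq -cn '["bb", "ccc", "a", "ab"] | sort_by([length, .])')
echo "$by_length_then_alpha"   # ["a","ab","bb","ccc"]
```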

𝑸: How can I convert JSON-P (JSONP) to JSON using jq?

A: Assuming that the padding takes the form of a function call:

$ jq -R  'capture("\\((?<x>.*)\\)[^)]*$").x | fromjson'

or if your jq does not support regular expressions:

$ jq -R 'explode | .[1+index("("|explode): rindex(")"|explode)] | implode | fromjson'

At a Windows command-line prompt, one could put either of the above jq filters into a file and invoke jq with the -f option, or escape the quotation marks, e.g.:

C:\ jq -R  "match(\"\\((?<x>.*)\\)[^)]*$\").captures[0].string | fromjson"

This command could be used in a pipeline, along the following lines:

curl ..... | jq -R ..... | jq .....

𝑸: Why is there no filter like to_values for accessing all the values of an object?

A: The expression .[] emits a stream of the input object's values; if necessary they can be wrapped into an array by writing [ .[] ]. See also map_values/1 in the manual.

𝑸: How can I recursively eliminate null-valued keys?

A: walk(if type == "object" then with_entries(select(.value != null)) else . end)

For example, using the above filter, {"a": {"b": 1, "c": null}} would be transformed to {"a": {"b": 1}}

Note that walk was only introduced as a builtin after jq 1.5 was released. If your jq does not include walk, simply include its definition before invoking it, or add it to your ~/.jq initialization file:

# Apply f to composite entities recursively, and to atoms
def walk(f):
  . as $in
  | if type == "object" then
      reduce keys_unsorted[] as $key
        ( {}; . + { ($key):  ($in[$key] | walk(f)) } ) | f
  elif type == "array" then map( walk(f) ) | f
  else f
  end;

𝑸: How can I select a specific set of key-value pairs from a JSON object? How can I use one object as a template for querying another? How can I delete keys from an embedded object?

A: (1) If the goal is to create an object with a set of specific keys known ahead of time, consider this example:

$ jq -c -n '{"a": 1, "b": null, "c":3} | {a,b,d}'
{"a":1,"b":null,"d":null}

If the goal is to create an object as above but omitting fields which are undefined in the target object, then the following filter will do the job:

def query(queryobject):
  with_entries( select( .key as $key | queryobject | has( $key ) ));

Example:

$ jq -c -n '{"a": 1, "b": null, "c":3} | query( {a,b,d} )'
{"a":1,"b":null}

To delete keys based on their values, consider this example:

echo '{"outer": { "a": "delete me", "b": "delete me too", "keep": 1} }' |\
  jq '.outer |= with_entries(select(.value|tostring|test("delete")|not))'
{
  "outer": {
    "keep": 1
  }
}

(2) If the specific set of keys is not known ahead of time, then query as defined immediately above can still be used. If the keys are known as a list of strings, then reduce could be used, e.g. if the target object is $o and the list of keys is $l:

reduce $l[] as $key ({}; . + { ($key): $o[$key] })

See also the preceding question regarding the recursive removal of key-value pairs.

𝑸: How can I rename the keys of an object programmatically?

A: One way to rename the keys of an object is to use with_entries, e.g.

with_entries( if .key | contains("-") then .key |= sub("-";".") else . end)

To rename keys recursively, see the Q defining translate_keys(f) below.

𝑸: How can I delete an element from an array by index? How can I delete all elements by value?

A: To delete an element from an array at index (offset) 1, consider these examples:

$ jq -cn '[0,10,20] | del(.[1])'
[0,20]

$ jq -cn '[0,10,20] | .[0:1] + .[2:]'
[0,20]

$ jq -cn '[0,10,20] | delpaths([[1]])'
[0,20]

$ jq -cn '[0,10,20] | .[1] = null | map(select(.!=null))'
[0,20]

$ jq -cn '[0,10,20] | [.[0,2]]'
[0,20]

To delete all occurrences of a particular value, use array subtraction as it retains the ordering:

$ jq -cn '[0,10,20,10,30] - [10]'
[0,20,30]

𝑸: How can I merge two JSON objects?

A: The + operator can always be used to merge two objects, but + resolves conflicts simply by ignoring the conflicting values in the left-hand-side operand. The * operator is also available (see the jq Manual for details). To resolve conflicting values, say v1 and v2, by combining the two values into an array, see this gist. The "combine" filter defined there achieves commutativity and associativity by using "unique". See also the next Q&A.
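The difference between the two operators can be seen in a small sketch (the sample objects are invented): + replaces the conflicting value wholesale, whereas * merges nested objects recursively.

```shell
# + keeps the right-hand value of "b" and discards the left-hand one.
plus=$(jq -cn '{"a": 1, "b": {"x": 1}} + {"b": {"y": 2}}')
echo "$plus"   # {"a":1,"b":{"y":2}}

# * merges the two "b" objects key by key.
star=$(jq -cn '{"a": 1, "b": {"x": 1}} * {"b": {"y": 2}}')
echo "$star"   # {"a":1,"b":{"x":1,"y":2}}
```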

𝑸: How can I convert an array of objects into an object of corresponding arrays? How can I meld an array of objects, $a, into a single object with keys, $k, such that .[$k][$i] is $a[$i][$k]?

A:

def meld: . as $in | reduce (add|keys[]) as $k ({}; .[$k] = [$in[] | .[$k]]);

Example:

[{a:1,b:10}, {a:2,c:3}] | meld

produces:

{"a":[1,2],"b":[10,null],"c":[null,3]}

𝑸: How can I create and initialize an array of a specific size? An m by n matrix?

A: To create an array of n+1 nulls, one can write:

[] | .[n] = null

In practice, one is more likely to use range and/or reduce, e.g. to create an array of n 0s:

[range(0;n) | 0]

or:

reduce range(0;n) as $i ([]; . + [0])

Here is a function that produces a representation of an m by n matrix with initial value specified by its input:

def matrix(m;n): . as $init
  | [ range(0; n) | $init ] as $row
  | [ range(0; m) | $row ];
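A usage sketch, with sizes chosen arbitrarily for illustration:

```shell
# Assigning at index 2 of an empty array yields an array of 3 nulls.
nulls=$(jq -cn '[] | .[2] = null')
echo "$nulls"   # [null,null,null]

# A 2 by 3 matrix whose cells are all initialized to the input value, 0.
grid=$(jq -cn '
  def matrix(m; n): . as $init
    | [ range(0; n) | $init ] as $row
    | [ range(0; m) | $row ];
  0 | matrix(2; 3)')
echo "$grid"   # [[0,0,0],[0,0,0]]
```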

𝑸: If the condition in an "if-then-else-end" statement is not satisfied, is it possible to emit nothing? Can I omit the "else" clause?

A: if TEST then VALUE else empty end

Or you could just write select(TEST) | VALUE.

The "else" clause cannot be omitted, but you can write your own "if-then-else" filter. A particularly useful complement to select is when/2 defined as follows:

def when(COND; ACTION): if COND? // null then ACTION else . end;

Thus, for example, 0 | when(empty; 1) emits 0.

𝑸: How does one append an element to an array?

A: If a is an array, then a + [e] will result in a copy of a with e appended to it. This is often seen in expressions such as . + [e].

There are alternatives that may be (very marginally) more efficient. Assuming that a is an array, these expressions will also produce a + [e]:

 a | setpath( [length] ; e )

 a | .[length] = e
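The three forms can be compared directly. This is a sketch in which the array [1, 2] and the element 3 are arbitrary:

```shell
appended=$(jq -cn '[1, 2] | . + [3]')
via_setpath=$(jq -cn '[1, 2] | setpath([length]; 3)')
via_assign=$(jq -cn '[1, 2] | .[length] = 3')

echo "$appended $via_setpath $via_assign"   # [1,2,3] [1,2,3] [1,2,3]
```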

𝑸: How can I strip off those pesky double-quotation marks?

A: To output top-level strings without quotation marks, consider using the "-r" (--raw-output) option of the jq command. Often this option together with string interpolation, join/1, or @tsv can be used to achieve the desired effect.

Within a jq program, if s is a string that has outer quotation marks (e.g. "\"abc\""), then with a sufficiently recent version of jq, s[1:-1] will do the job; otherwise try s | .[1:length-1].

To check whether s[1:-1] is supported in your version of jq, try this at the command-line prompt ($ for Mac/Linux, > for Windows):

$ jq -n '"\"abc\""[1:-1]'
> jq -n "\"\\\"abc\\\"\"[1:-1]"

If neither of the above options is applicable, consider using sed.

𝑸: How can I convert a string to uppercase? To lowercase?

A: If your version of jq does not have ascii_downcase and ascii_upcase, then you might want to use their definitions:

# like ruby's downcase - only characters A to Z are affected
def ascii_downcase:
  explode | map( if 65 <= . and . <= 90 then . + 32  else . end) | implode;

# like ruby's upcase - only characters a to z are affected
def ascii_upcase:
  explode | map( if 97 <= . and . <= 122 then . - 32  else . end) | implode;

𝑸: Can I use jq in the shebang line? Can a jq script be turned into an executable command?

A: Yes, if your operating system supports shebang lines.

The trick to using jq in the shebang line is NOT to specify an argument for the -f option.

For jq>1.4, a suitable shebang line for a script that reads from stdin would be:

#!/usr/bin/env jq -Mf

For earlier versions of jq, you may be able to use a shebang line such as:

#!/usr/local/bin/jq -n -M -f

As of December 2015, jq>1.5 also supports creating jq executables using exec as illustrated by the following:

#!/bin/sh
# this next line is ignored by jq, which otherwise does not continue comments \
exec jq -nef "$0" "$@"
# jq code follows
true

(Notice the trailing \ at the end of the second line.)

In both cases (that is, with jq on the shebang line, or using exec jq), arguments can be passed in to the script using the --arg NAME VALUE option. For example, if the script was in a file named "shebang" in the current directory, you could type:

./shebang --arg x 123

Then the value of $x in the jq program would be the string "123".

𝑸: How can I modify some or all of the keys in a JSON entity, no matter how deeply nested? How can I rename keys programmatically?

A: If your jq has walk/1 as a builtin, then it can be used as described below; otherwise, you can simply include its definition (available on this page or, for example, from https://github.com/stedolan/jq/blob/master/src/builtin.jq) in your jq script.

The following filter recursively walks the input JSON entity, changing each encountered key, k, to (k|filter), where "filter" is the specified filter, which may be any jq expression. The usual clobbering rules regarding duplicate object keys apply.

def translate_keys(f):
  walk( if type == "object" then with_entries( .key |= f ) else . end);

Example 1:

{"a": 1, "b": [{"c":2}] } | translate_keys( "@" + . )

yields:

{"@a": 1,"@b": [ {"@c":2} ]}

Example 2:

{"a": 1, "b": [{"c":2}] } | translate_keys( if . == "c" then "C" else . end )

yields:

{"a":1,"b":[{"C":2}]}

𝑸: How can I sort an inner array of an object? How can I sort all the arrays in a JSON entity? How can I modify a deeply nested array?

A: If the path to the inner entity is known, one can use |= as illustrated here:

{"array": [3,1,2] }
| .array |= sort

To sort all arrays in a JSON entity, no matter where they occur, one option is to use walk/1:

walk( if type == "array" then sort else . end )

(If your jq does not have walk/1 as a builtin, see elsewhere on this page for a link to its definition in jq.)

Alternatively, you could use a recursive procedure such as the following:

# Apply f to arrays no matter where they occur
def recursively(f):
  . as $in
  | if type == "object" then reduce keys_unsorted[] as $key
      ( {}; . + { ($key):  ($in[$key] | recursively(f)) } )
  elif type == "array" then map( recursively(f) ) | f
  else .
  end;

For example:

{"a": [3,[30,10,20],2], "b": [3,1,2] } | recursively(sort)

yields:

{"a":[2,3,[10,20,30]],"b":[1,2,3]}

𝑸: Can jq process CSV or TSV files? What about XML? YAML? TOML? HTML? CBOR? msgpack? ...

A: TSV files (that is, files in which each line has tab-separated values) can be read naively using the following incantation:

$ jq -R -s 'split("\n") | map( split("\t") )'

This simply produces an array of arrays, each representing one row. If the first row contains headers, and if it is desired to produce an array of objects with the headers as keys, then see the Cookbook.

CSV files are trickier, firstly because there are several "standards", and secondly because of certain intrinsic potential complexities.

Consider using cq for querying CSV; this package also supports YAML files. For more about YAML, see the Question about YAML elsewhere on this page, and similarly for HTML and TOML.

Note that the kislyuk version of yq (https://github.com/kislyuk/yq) also includes an executable, xq, for handling XML.

See also the previously mentioned recipe.

To process other formats, please use the appropriate transmogrification tools. Often a Google search using terms such as cbor2json or msgpack2json will yield some nuggets.

𝑸: How can I view all the paths leading to a particular key, together with its value, no matter how deeply nested the corresponding object is?

A: Suppose we are interested in every occurrence of the "country" key in a JSON entity. The following filter yields a stream of [PATH, VALUE] pairs, where PATH shows the path to an object having "country" as a key, and VALUE is the corresponding value:

[paths( .. | select(type=="object" and has("country")) )][-1] as $path
| [$path, getpath($path + ["country"])]

Example output:

[["addresses","address_name","address_spec"],"USA"]
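When the key may occur more than once, a variant based on path/1 emits one [PATH, VALUE] pair per occurrence. The following is a sketch rather than the filter shown above, and the sample object is invented for illustration:

```shell
pairs=$(jq -cn '
  {"a": {"country": "US"}, "b": [{"country": "FR"}]}
  | path(.. | select(type == "object" and has("country"))) as $path
  | [$path, getpath($path + ["country"])]')

echo "$pairs"
# [["a"],"US"]
# [["b",0],"FR"]
```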

𝑸: Is there a way to have jq keep going after it hits an error in the input file? Can jq handle broken JSON?

A: Yes, though in general, preprocessing (e.g. using hjson or any-json) would be preferable. Also, there are more options if you have jq 1.5 or later.

If you do not have jq 1.5 or later, then consider using the -R option to read each line as text. If each JSON entity is on a separate line (as is often the case with log files, for example), then you can use a filter such as fromjson? // "sorry" to ensure that each line will always yield a JSON entity.

If you have jq 1.5 or later, there are two additional techniques.

The first uses the --seq option, documented in the manual and discussed in the blog entry at https://blog.jpalardy.com/posts/handling-broken-json-with-jq/. Here is an illustration:

echo $'1\ntwo\n3' | sed "s/^/$(printf '\36')/" | jq --seq . 2> /dev/null | tr -d '\36'
1
3

The second additional technique that is available with jq 1.5 or later uses the try/catch feature. For example, if each line of a file is either a self-contained JSON text or a non-JSON string, we could pretty-print the JSON and convert the non-JSON strings to JSON strings as follows:

jq -R '. as $line | try fromjson catch $line'

Here is an example using the inputs filter, which allows the JSON text to be spread over multiple lines:

def handle: inputs | [., "length is \(length)"] ;
def process: try handle catch ("Failed", process) ;
process

Note that when inputs is used to read a file, jq should normally be invoked with the -n command-line option.
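The line-by-line try/catch technique described earlier in this answer can be exercised on a small invented input, in which the second line is not valid JSON:

```shell
recovered=$(printf '%s\n' '1' 'two' '{"a": 3}' \
  | jq -cR '. as $line | try fromjson catch $line')

echo "$recovered"
# 1
# "two"
# {"a":3}
```

The valid lines are parsed as JSON, and the invalid line is emitted as a JSON string instead of stopping the run.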

𝑸: How can one efficiently determine whether a stream is empty or not?

A: def isempty(s): 0 == ((label $go | s | 1, break $go) // 0);

This was added as a built-in after the release of jq 1.5.

𝑸: How can one set the exit code of jq to signal an error in the event that an attempt is made to access an undefined key? Can the return code be set to 1 on an "out-of-bounds" error? Is there a flag to alter the semantics of array access?

A: jq does not have a global flag to alter the behavior of the fundamental operations for accessing the contents of JSON arrays or objects. If you want an "out-of-bounds" or "key-does-not-exist" error condition to be raised when an attempt to access the value at a non-existent index or key is made, then simply define a function with the desired semantics. The following function can be used for both arrays and objects:

# f should be a JSON string (a key name) or integer (an index)
def get(f): if has(f) then .[f] else error("\(type) is not defined at \(f)") end;

A similar function can be defined for "put" operations.

If you do not want to be bothered with the error message that would be sent to STDERR if the error condition is raised, you can redirect it to /dev/null (or NUL or perhaps null in a Windows environment), e.g. along the lines of: jq .... 2> /dev/null

Conversely, if you just want an error message to be written on "stderr" without an error condition being raised, you could use the stderr filter.
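The effect of get/1 on the exit status can be sketched as follows. The shell variable names and the sample object are invented for illustration:

```shell
# The definition of get/1 from above, kept in a shell variable for reuse.
get_def='def get(f): if has(f) then .[f] else error("\(type) is not defined at \(f)") end;'

# An existing key: the value is printed and jq exits normally.
found=$(jq -n "$get_def"' {"a": 1} | get("a")')
echo "$found"   # 1

# A missing key: jq raises the error and exits with a non-zero status.
missing_status=0
jq -n "$get_def"' {"a": 1} | get("b")' > /dev/null 2>&1 || missing_status=$?
echo "exit status for missing key: $missing_status"
```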

𝑸: How can I print the items in an array together with their indices? Given an array, a, of elements a[i], how can I generate the array with elements [i, a[i]]? How can I add a counter to a stream of items?

A: One approach is to use range, e.g.

["a","b"] | range(0; length) as $i | [$i, .[$i]]

Another is to use to_entries, e.g.

["a","b"] | to_entries[] | [.key, .value]

Yet another is to use transpose, e.g.

["a","b"] | [[range(0; length)], .] | transpose[]

If your jq has foreach, then for streams, you can adopt the technique illustrated by this generic filter:

# Given a stream, s, of values, emit a stream of [id, value] pairs,
# where id is a counter starting with the given number
def counts(s; start): foreach s as $value (start-1; .+1; [., $value]);
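A usage sketch, with an invented two-item stream and a counter starting at 1:

```shell
numbered=$(jq -nc '
  def counts(s; start): foreach s as $value (start-1; .+1; [., $value]);
  counts("a", "b"; 1)')

echo "$numbered"
# [1,"a"]
# [2,"b"]
```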

𝑸: Does jq support JSONL (JSON Lines)? How can I sort a stream of JSON objects by some key?

A: Yes, because jq is stream-oriented.

One way to convert JSONL input to a JSON array is to use the -s ("slurp") option. If your jq has inputs, then that may also be helpful in processing JSONL input.

The key to producing JSON Lines is the -c ("compact") option. To convert a JSON array into a stream of its elements, simply pipe it into .[]. In many cases, one can simplify " _ | .[]" to just "_[]" as in the following example, which shows how to sort a stream of JSON objects by some key:

jq -s -c 'sort_by(.id)[]'
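For instance, with two invented JSON Lines records supplied on standard input:

```shell
sorted=$(printf '%s\n' '{"id": 2, "n": "b"}' '{"id": 1, "n": "a"}' \
  | jq -s -c 'sort_by(.id)[]')

echo "$sorted"
# {"id":1,"n":"a"}
# {"id":2,"n":"b"}
```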

Numbers

𝑸: Why does jq convert floating-point integers (such as 2.0) to plain integers (2)? Why isn't the precision of numbers preserved?

A: The jq parser converts JSON numbers to IEEE 754 64-bit values. The original JSON representation is lost. For numbers that are very large or very small in magnitude, the loss of precision will be significant.

𝑸: Why do 1E1000 and infinite both print as 1.7976931348623157e+308 ? Why does nan print as null?

A: The JSON number 1E1000 is represented internally as the IEEE 754 value for infinity, which prints as shown. However, the jq expression infinite == 1.7976931348623157e+308 returns false. To test whether a jq numeric value is equal to infinite, you can use the filter isinfinite. The jq filter nan evaluates to the IEEE value for NaN, which prints as null.

𝑸: What are the largest (smallest) numbers that jq can handle? Does "overflow" cause an error?

A: Currently, jq does not include "bigint" support. Very large integers will be converted to a floating point approximation. The largest number that can be reliably used as an integer is 2^53 (9,007,199,254,740,992). The largest floating point value is about 1.79e+308 and the smallest is about 1e-323.

In general, arithmetic operations do not raise errors, except that in jq 1.5, division by 0 does result in an error. In jq 1.4, 1/0 prints out as 1.7976931348623157e+308, and 0 * (1/0) evaluates to null.

A basic "bigint" library is available at Bigint.jq.

𝑸: What mathematical functions are supported?

A: The answer varies from version to version, and by platform. As of Feb 13, 2017 (revision 1c806b), jq has a builtin function, builtins/0, that produces an array of strings of the form "FUNCTION/ARITY", one for each builtin function.

The following functions should be generally available:

acos, acosh, asin, asinh, atan, atanh, cbrt, cos, cosh, exp2, exp, floor, j0, j1, log10, log2, log, sin, sinh, sqrt, tan, tanh, tgamma, y0, y1 – these are the standard mathematical functions available in C. They are all 0-arity filters.

In addition, the 0-arity filter length is defined so that if its input is numeric, its output will be the absolute value of that input; for example: -1.1 | length yields 1.1

As of June 28, 2015, the following are also provided on most platforms and have their standard "libm" definitions: atan2/2, hypot/2, pow/2, remainder/2.

Examples: atan2(1;1), hypot(3;4), pow(2;3), remainder(5; -2) yield:

0.7853981633974483
5
8
1

nan/0 and infinite/0 are also defined so that nan | isnan and infinite | isinfinite evaluate to true.

Definition of builtins

Note: As of Feb 13, 2017 (revision 1c806b), jq has a builtin function, builtins/0, that produces an array of strings of the form "NAME/ARITY", one for each builtin function.

𝑸: How can I view the definition of a jq builtin function?

A: https://github.com/stedolan/jq/blob/master/src/builtin.jq

𝑸: How can I circumvent the limitation that paths/1 does not generate paths to null?

A: As of July 14 2016, the builtin version of paths/1 does not generate paths to null. The following definition may be used instead:

def allpaths:
  def conditional_recurse(f):  def r: ., (select(.!=null) | f | r); r;
  path(conditional_recurse(.[]?)) | select(length > 0);

This definition is written so that anyone who wants to define conditional_recurse/1 as a top-level filter can easily do so. For reference:

  • the builtin recurse(f) is defined in terms of: def r: (f | select | r);
  • conditional_recurse(f) is defined in terms of: def r: (select | f | r);

See also #1163
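
To make the intended result concrete, here is a Python sketch of the same traversal: it yields every non-empty path in a nested JSON value, including paths that lead to null (None in Python). The function name all_paths is hypothetical, and the code is a model of the output, not of how jq computes it.

```python
import json

def all_paths(value, prefix=()):
    """Yield the path to every nested element, including None (JSON null) leaves."""
    if isinstance(value, dict):
        children = value.items()
    elif isinstance(value, list):
        children = enumerate(value)
    else:
        return  # scalars and null have no children
    for key, child in children:
        path = prefix + (key,)
        yield list(path)
        yield from all_paths(child, path)

doc = json.loads('{"a": null, "b": [1, null]}')
for p in all_paths(doc):
    print(p)
```

The paths ["a"] and ["b", 1] both lead to null and are included in the output.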

𝑸: Is it possible to redefine jq builtins?

A: Yes, but the redefined builtin will only be effective with respect to invocations that occur after it has been redefined.

For example, if you wanted to redefine paths in accordance with the definition of allpaths in the previous Q, you could simply add the appropriate definition (def paths: ...) before any invocation of paths.

If you want to compare the performance of a builtin with an alternative definition, you can simply redefine it.

𝑸: Why does map_values(select(...)) produce the wrong result in my program? How can I use map_values/1 to delete keys?

A: In versions of jq available before Jan 30, 2017 (revision bd7b48c), map_values(select(g)) will yield the empty stream if select(g) is empty. This is generally not what is intended.

If you want to use map_values/1 to delete keys without being dependent on having a sufficiently recent version of jq, then the following alternative definition has much to recommend it:

def map_values(f):
  with_entries( [.value|f] as $v | select( $v|length == 1) | .value = $v[0] ) ;

You may wish to include this in your ~/.jq file or standard jq library.
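
The effect of this alternative definition can be modeled in Python by treating a filter as a function that returns a list of zero or more outputs. The names map_values and select_positive here are illustrative stand-ins, not part of any jq binding.

```python
def map_values(obj, f):
    """Keep a key only when f yields exactly one output for its value."""
    result = {}
    for key, value in obj.items():
        outputs = list(f(value))
        if len(outputs) == 1:  # mirrors: select($v | length == 1)
            result[key] = outputs[0]
    return result

def select_positive(x):
    """Stand-in for select(. > 0): one output if the test passes, else none."""
    return [x] if x > 0 else []

print(map_values({"a": 1, "b": -2, "c": 3}, select_positive))
```

The key "b" is dropped because the filter produces no output for its value.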

"or" versus "//"

𝑸: What is the difference between the binary operators "or" and "//" ?

A: Both have short-circuit semantics, but the two operators are otherwise very different. In a nutshell, given two expressions, A and B:

"A // B" either produces the truthy elements of A if there are any,
or else the entire stream B.

"A or B" produces a (possibly empty) stream of boolean values that 
is computed in an entirely different way, namely as the concatenation
of the streams a1 or B, a2 or B ...

(A JSON value is said to be "falsey" if it is null or false, and "truthy" otherwise.)

Here are the details regarding "A or B":

(i) If a and b are expressions each producing a single JSON value, then

"a or b" evaluates to true if a is truthy or if b is truthy, and false otherwise. 

(ii) If A is a (possibly empty) stream then:

A or empty evaluates to true for each truthy item of A, and to empty for each falsey item
empty or A evaluates to empty

(iii) If A and B are expressions producing non-empty streams of values, (a1, ...) and (b1 ...) respectively, then:

'A or B' produces the concatenation of the streams: a1 or B, a2 or B, ...,

where ai or B evaluates to true if ai is truthy, and otherwise to the boolean stream:

false or b1, false or b2, ...

Example 1:

(null,1) or (2,3)

produces (true, true, true) - the first two values come from evaluating null or (2,3), and the third comes from evaluating 1 or (2,3).

Example 2:

(null, 1, null,2) // (10, 20)

produces (1, 2)
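
The two sets of rules can be summarized in a small Python model in which a stream is any iterable, and null and false are represented by None and False. This is a sketch of the semantics described in this answer, not of jq's implementation; the function names are made up for the illustration.

```python
def truthy(value):
    """jq treats only null (None) and false as falsey; 0 and "" are truthy."""
    return value is not None and value is not False

def jq_or(a_stream, b_stream):
    """Model of 'A or B': for each a, emit true, or else 'false or b' for each b."""
    b_values = list(b_stream)
    for a in a_stream:
        if truthy(a):
            yield True  # short-circuit: B is not consulted
        else:
            for b in b_values:
                yield truthy(b)

def jq_alternative(a_stream, b_stream):
    """Model of 'A // B': the truthy outputs of A if any, else all of B."""
    kept = [a for a in a_stream if truthy(a)]
    yield from (kept if kept else b_stream)

print(list(jq_or([None, 1], [2, 3])))                       # Example 1
print(list(jq_alternative([None, 1, None, 2], [10, 20])))   # Example 2
```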

Related Resources

Tutorials

𝑸: What tutorials are available for jq?

See also the jq Cookbook and How-to:-Avoid-Pitfalls.

Editor Bindings

𝑸: What bindings are available for vim?

A: https://github.com/vito-c/jq.vim

Language Bindings

𝑸: What language bindings are available for Java?

A:

𝑸: What language bindings are available for Python?

A:

pip install jq # For details, see https://pypi.python.org/pypi/jq
pip install pyjq # For details, see https://pypi.python.org/pypi/pyjq

𝑸: What language bindings are available for R?

A: See https://cran.r-project.org/web/packages/jqr/index.html

𝑸: What language bindings are available for Ruby?

A:

gem install ruby-jq # For details, see https://github.com/winebarrel/ruby-jq

𝑸: What language bindings are available for PHP?

A: A jq extension for PHP is available from https://github.com/kjdev/php-ext-jq

𝑸: What language bindings are available for node.js?

A: https://github.com/sanack/node-jq

𝑸: What language bindings are available for browsers?

A: https://github.com/fiatjaf/jq-web - actually, a wrapper around an Emscripten-compiled jq.

Projects

𝑸: How can I use jq interactively?

A:

𝑸: How can I use jq for YAML instead of JSON?

A:

  • yq transcodes YAML on standard input to JSON and pipes it to jq; yq can also translate JSON back to YAML.

  • y2j provides a wrapper called yq that uses Python to convert YAML to JSON, runs a jq filter on the JSON, and converts the result JSON back to YAML.

  • any-json simply converts YAML to JSON.

  • remarshal provides yaml2json and json2yaml scripts, amongst others.

𝑸: How can I use jq for TOML instead of JSON?

A:

  • remarshal provides toml2json and json2toml scripts, amongst others.

    • brew install remarshal
  • npm install --global toml2json installs a toml2json script

𝑸: How can I use jq to process HTML?

A: First convert the HTML to JSON, e.g. using pup (https://github.com/ericchiang/pup) or hq (https://github.com/rbwinslow/hq).

𝑸: How can I use jq to process JavaScript objects that are not JSON but are specified in accordance with the ECMAScript standard?

A: It may be possible to convert the objects to JSON using json5 (https://github.com/json5/json5), about which some further details are given below. Consider also using the JSON.stringify() function of your favorite JavaScript interpreter.

𝑸: What other jq-related projects are there?

A:

Windows

𝑸: Why doesn't jq '.' work? Why aren't my jq commands being parsed properly?

A:

  • Writing jq . should be sufficient to invoke jq's . filter on every platform.

  • Consider placing your jq commands in a file and then invoking jq with the -f FILENAME option.

  • To quote a jq command string in a Windows environment, use double-quotation marks, e.g. jq ".". To quote JSON strings within the command string, use \", for example:

      jq -n "\"Hello world!\""
    

Note also that on Windows, echo behaves differently than on other platforms. In short:

  • correct on Windows: echo "Hello" | jq ". + \" world\""
  • correct on other planets: echo '"Hello"' | jq '. + " world"'

Notable Differences between Versions

In the following:

  • "1.4+" refers to a sufficiently recent version of jq since the release of Version 1.4.
  • "all versions" refers to versions 1.3, 1.4, and 1.4+

𝑸: Is there a NEWS or Changelog file that documents when a particular feature was released?

A: https://github.com/stedolan/jq/blob/master/NEWS

See also https://github.com/stedolan/jq/releases

𝑸: In which versions of jq is the ordering of the keys of an object preserved?

A: In jq 1.3, the keys are sorted, e.g.

jq -n '{b:1, a:2} | to_entries[].key'

produces the stream: "a" "b". In jq 1.4 and later, the ordering is preserved. Note that keys/0 sorts the keys; to avoid this, keys_unsorted was introduced in jq 1.5.

𝑸: What alternatives are there to .["key"] for accessing the value of a key?

A: If "KEY" is an object key beginning with an alphabetic character and composed entirely of alphanumeric characters (it being understood that _ counts here as an alphabetic character), then all versions of jq allow .KEY as an alternative to .["KEY"]. In addition, in jq 1.4+, the form ."KEYNAME" is supported for any valid key name.

𝑸: In which versions of jq are regular expressions supported?

A: 1.4+. See the next section for further details.

𝑸: How can I match a string while ignoring case?

A: For simple matches, consider using ascii_downcase (see above). For regex matches, use the "i" flag with one of the regex filters available in jq 1.4+.

𝑸: How can I access the last element of an array?

A: If a is an array, then the most robust way of accessing the last element is: a | .[length-1]. To set the last element, either of the following should work in all versions:

  • a | .[length - 1] = value
  • a | setpath([length - 1]; value)

If you're feeling at all adventurous, try this at the command-line prompt ($):

$ jq -n '[1][-1]'

𝑸: How can I "slurp" from a secondary file? Is there a way to "slurp" a file using the --argfile option? What is the --slurpfile option introduced in jq 1.5?

A: jq normally reads data from "stdin" or the file specified on the command line, e.g. jq . PRIMARY.json. In jq 1.4, the --argfile option allows one to read data from one or more secondary files: the contents of the file will be slurped if and only if it contains more than one JSON entity.

In jq 1.5, the --slurpfile option has been added to allow one to read the contents of an entire file of JSON entities as an array, e.g. jq -n --slurpfile a SECONDARY.json '$a | length' will report how many JSON items were read from the file named SECONDARY.json. The --slurpfile option always slurps the contents of the specified file, even if it is empty.
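
As a rough model of what "slurping" means, the following Python sketch parses a text containing zero or more concatenated JSON entities into a single list, which is the shape of the value that --slurpfile binds to its variable. The helper name slurp is hypothetical, and the code uses Python's json module rather than jq's parser.

```python
import json

def slurp(text):
    """Parse a string holding zero or more concatenated JSON entities into a list."""
    decoder = json.JSONDecoder()
    items, pos = [], 0
    while True:
        # Skip whitespace between entities; raw_decode does not do this itself.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            return items
        value, pos = decoder.raw_decode(text, pos)
        items.append(value)

print(len(slurp('{"a": 1}\n[2, 3]\n"four"\n')))
print(slurp(""))
```

Note that an empty input yields an empty list, matching the statement that the file is slurped even if it is empty.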

𝑸: What backwards-incompatible changes have been made since the release of jq 1.5?

A: The following listing may be incomplete.

  • The idiom .foo?//empty must now be written with a space immediately following the question mark, e.g. .foo? //empty

  • empty on RHS of |=

In jq 1.5 and earlier, expressions such as:

{a:1} | .a |= empty

produced null. This was surprising and not very useful.

As the result of a change introduced on January 30, 2017:

 $ jq -n '{a:1} | .a |= empty'
 {}

Note that the consequences of including empty in the body of a reduce statement might be surprising, e.g.:

 $ jq -n 'reduce 2 as $x (3; empty)'
 null

This behavior might also change, so caution should be exercised when including empty in this manner in the body of a reduce statement.

Support for Regular Expressions

Regex support was added soon after the release of jq 1.4.

𝑸: test("\d") does not work! Why can't I use character classes?

A: The regular expression must be given as a JSON string, which means that backslashes must be escaped, as in this example:

$ jq -n '"Is 1 a digit?" | test("\\d")'
true
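
The escaping rule can be checked independently of jq, since it is a property of JSON string literals. In this sketch Python's json module is used only as a stand-in for JSON string decoding; jq's own error message for the single-backslash case will differ.

```python
import json

# The program text "\\d" is a JSON string literal; decoding it yields
# the two-character regex \d that the regex engine actually sees.
regex = json.loads(r'"\\d"')
print(regex, len(regex))

# A single backslash before d is not a valid JSON escape, so decoding fails.
try:
    json.loads(r'"\d"')
    single_backslash_ok = True
except json.JSONDecodeError:
    single_backslash_ok = False
print(single_backslash_ok)
```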

𝑸: How can I eliminate all control characters in all strings, wherever they occur?

A:

walk(if type == "string" then gsub("\\p{Cc}"; "") else . end)

This will excise ASCII and Latin-1 control characters from all strings (other than key names), and illustrates that Unicode character categories can be specified using the abbreviated forms (here "Cc" rather than "Control").

The following filter will excise all control characters from all strings, including key names:

walk(if type == "string" then gsub("\\p{Cc}"; "")
     elif type == "object" then with_entries( .key |= gsub("\\p{Cc}"; "") )
     else . end)

If your jq does not have walk/1, simply include its definition (search for 'def walk' on this page) before its invocation.

𝑸: Where is the regex (regular expression) documentation?

A: jq uses the PCRE mode of the Oniguruma regex engine. A copy of the relevant authoritative documentation is at Docs-for-Oniguruma-Regular-Expressions-(RE.txt); the "master" version is at RE.

𝑸: How are named capture variables used?

A: Here are three examples. It is assumed in all cases that the shell allows single quotation marks to be used for quoting a string.

In the first example, we want to extract the numeric prefix of a "semantic version" specification:

$ echo '{"VERSION": "0.2.1-alpha+abxc23"}' |\
    jq '.VERSION | sub("(?<vers>[0-9]+\\.[0-9]+\\.[0-9]+).*"; .vers)'

The result:

"0.2.1"

To capture the variables as a JSON object, use capture/1:

echo '{"VERSION": "0.2.1-alpha+abxc23"}' |
  jq '.VERSION | capture("(?<vers>[0-9]+\\.[0-9]+\\.[0-9]+).*")'
{
  "vers": "0.2.1" 
}

In the last example, notice the use of the form \(.NAME) in the "to-string":

$ jq -n '"abc" | sub( "(?<head>^.)(?<tail>.*)"; "\(.head)-\(.tail)")'
"a-bc"

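For readers more familiar with other regex libraries, here is a sketch of the same three examples using Python's re module. Note the assumption built into the comparison: Python spells named groups (?P<name>...) whereas jq (via Oniguruma) uses (?<name>...), so the patterns are not interchangeable as written.

```python
import re

version = "0.2.1-alpha+abxc23"

# Python spells named groups (?P<name>...) where jq uses (?<name>...).
pattern = re.compile(r"(?P<vers>[0-9]+\.[0-9]+\.[0-9]+).*")

# Counterpart of capture/1: a dict of the named groups.
captured = pattern.match(version).groupdict()

# Counterpart of sub/2 with a replacement built from the named groups.
prefix = pattern.sub(lambda m: m.group("vers"), version)
dashed = re.sub(r"(?P<head>^.)(?P<tail>.*)", r"\g<head>-\g<tail>", "abc")

print(captured, prefix, dashed)
```
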
𝑸: Oniguruma is no longer available at http://www.geocities.jp. Where is the Oniguruma repository?

A: https://github.com/kkos/oniguruma

Streaming JSON parser

𝑸: How to handle huge JSON texts?

A: jq 1.5 includes a streaming parser that can be used to avoid having to read JSON texts completely before processing them. A search for a needle in a haystack, for example, does not first have to create an in-memory representation of the entire JSON text, and can therefore go faster.

Here is an example of how to convert a top-level array of JSON objects into a stream of its elements:

$ jq -n '[{foo:"bar"},{foo:"baz"}]' | jq -cn --stream 'fromstream(1|truncate_stream(inputs))'
{"foo":"bar"}
{"foo":"baz"}
$ 

Notice the use of the "-n" option.

More generally:

$ echo '[{"foo":"bar"},99,null,{"foo":"baz"}]' |
  jq -cn --stream 'fromstream( inputs|(.[0] |= .[1:]) | select(. != [[]]) )'
{"foo":"bar"}
99
null
{"foo":"baz"}
$ 

Here is another example. Suppose we want to extract some information from certain JSON objects in a very large JSON document. For the sake of specificity, let's consider the case where the following would be appropriate except for the size of the JSON document:

.. | objects | select(.class=="FINDME"?) | .id

An alternative solution using jq's streaming parser would be as follows:

foreach inputs as $in (null;
  if has("id") and has("class") then null
  else . as $x
  | $in
  | if length != 2 then null
    elif .[0][-1] == "id" then ($x + {id: .[-1]})
    elif .[0][-1] == "class"
         and .[-1] == "FINDME" then  ($x + {class: .[-1]})
    else $x
    end
  end;
  select(has("id") and has("class")) | .id )

Invocation:

jq -n --stream -f program.jq input.json

See the jq Cookbook for further examples, and the jq manual for further details.
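
To see what kind of input the streaming programs above operate on, here is a simplified Python model of the events produced by --stream: a [path, leaf] event for each scalar or empty container, and a closing [path] event when a container ends. It is a sketch based on the event format described in the jq manual, it builds the whole value in memory first (which the real streaming parser avoids), and the function name stream_events is hypothetical.

```python
import json

def stream_events(value, path=()):
    """Yield [path, leaf] events, plus a closing [path] event per container."""
    is_container = isinstance(value, (dict, list))
    if not is_container or len(value) == 0:
        # Scalars and empty containers are reported as leaves.
        yield [list(path), value]
        return
    items = value.items() if isinstance(value, dict) else enumerate(value)
    last = None
    for key, child in items:
        last = path + (key,)
        yield from stream_events(child, last)
    # The closing event carries the path of the container's last element.
    yield [list(last)]

for event in stream_events(json.loads('[{"foo":"bar"},{"foo":"baz"}]')):
    print(json.dumps(event))
```

Events of length 2 carry a value, which is why the foreach program above tests length != 2 before inspecting the last path component.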

Processing not-quite-valid JSON

𝑸: Can jq process objects with duplicate keys? Can jq help convert objects with duplicate keys to an alternative format so that no information is lost?

A: The JSON syntax formally allows objects with duplicate keys, and jq can accordingly read them, but the regular jq parser effectively ignores all but the last occurrence of each key within any given object.
jq's streaming parser, however, can be used to convert a JSON object with duplicate keys to an alternative format so that none of the values are lost. This is illustrated at https://stackoverflow.com/questions/36956590/json-fields-have-the-same-name.
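
The difference between the two approaches can be illustrated outside jq. In this Python sketch, a conventional parse keeps only the last value for a repeated key, whereas a parse that sees every key/value pair can collect all of them. The helper keep_duplicates and the list-per-key output format are illustrative choices, not the format used in the linked answer.

```python
import json

text = '{"a": 1, "b": 2, "a": 3}'

# A conventional parse keeps only the last value for a repeated key.
print(json.loads(text))

def keep_duplicates(pairs):
    """Collect every value seen for each key into a list, in input order."""
    merged = {}
    for key, value in pairs:
        merged.setdefault(key, []).append(value)
    return merged

# object_pairs_hook receives all key/value pairs, so nothing is lost.
print(json.loads(text, object_pairs_hook=keep_duplicates))
```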

𝑸: Does jq support the processing of invalid JSON? Can jq be instructed to ignore comments?

A: If you want jq to ignore an error in the input file, see the Q above (search for the italicized text).

jq cannot be instructed to ignore JavaScript-style comments, but see the next Q about using other tools to filter out such comments.

Apart from the possibility of skipping over invalid input, jq generally expects JSON input to be strictly valid, but JSON literals can be specified in a jq program more flexibly. For example:

$ jq -n '{a: 1}'
{
  "a": 1
}

Thus you may be able to use jq -n -f FILENAME to convert nearly-valid JSON to JSON.

𝑸: How can I convert JavaScript objects to JSON? How can I rectify a not-quite-valid-JSON text? How can I read a file which consists of JSON and comments?

A: As noted in the previous Q, jq itself can be used to transform nearly-valid JSON to JSON in many instances. For example, "#" comments can be removed using jq.

Here are brief descriptions of some other command-line tools that can be used to convert "not-quite JSON" to JSON. Some of these can also be used to remove comments.

relaxed-json

For "Plain Old JavaScript objects", consider https://github.com/phadej/relaxed-json

The relaxed-json command, rjson, can be installed by running:

yarn global add relaxed-json

or:

sudo npm install -g relaxed-json

strip-json-comments

npm install --global strip-json-comments-cli

jsonlint

The jsonlint script provided by the python demjson package (pip install demjson) can be used as a JSON rectifier by invoking it with the -S and -f options. For example:

$ jsonlint -Sf 
/* This is a comment */
// Another comment
{'a': 1}

produces:

{ "a" : 1 }

For further information, see http://deron.meranda.us/python/demjson/

json5

json5 is a command-line tool for converting JSON5 (a superset of JSON that is also a subset of JavaScript) to JSON. In brief:

npm install json5
ln -s ~/node_modules/.bin/json5 ~/bin
json5 -c FILENAME.json5  # generates FILENAME.json

Note that json5 -c FILENAME.SUFFIX will generate FILENAME.SUFFIX.json if "SUFFIX" is not "json5".

Documentation on JSON5 and json5 is at http://json5.org/

any-json

any-json purports to support the transmogrification of the following formats to JSON: cson, csv, hjson, ini, json5, xls, xlsx, xml, yaml.

Example:

$ any-json -format=json
// This line is recognized as a comment even though the input format has been specified as JSON!
/* Line 1
   Line 2
*/
[1,2]

Output:

[
  1,
  2
]

hjson

The hjson website describes a tool, also called hjson, for converting from hjson to JSON. In brief:

npm install hjson -g
hjson -j file.hjson # to convert to JSON

and/or:

pip install hjson
python -m hjson.tool -j file.hjson # convert to JSON