Zachary M. Train edited this page Oct 20, 2018 · 60 revisions

For delicacies too choice for the manual.


Find the maximal elements of an array or stream

# Given an array of values as input, generate a stream of values of the 
# maximal elements as determined by f.
# Notes:
# 1. If the input is [] then the output stream is empty.
# 2. If f evaluates to null for all the input elements,
#    then the output stream will be the stream of all the input items.

def maximal_by(f):
  (map(f) | max) as $mx
  | .[] | select(f == $mx);


[ {"a":1, "id":1},  {"a":2, "id":2}, {"a":2, "id":3}, {"a":1, "id":4} ] | maximal_by(.a)

emits the objects with "id" equal to 2 and 3.

The above can also be used to find the maximal elements of a stream, but if the stream has a very large number of items, then an approach that requires less space might be warranted. Here are two alternative stream-oriented functions. The first simply iterates through the given stream, s, twice, and therefore assumes that [s]==[s], which is not the case, for example, for inputs :

# Emit a stream of the f-maximal elements of the given stream on the assumption
# that `[stream]==[stream]`
def maximals_by_(stream; f):
   (reduce stream as $x (null;  ($x|f) as $y | if . == null or . < $y then $y else . end)) as $mx
   | stream
   | select(f == $mx);

Here is a one-pass implementation that maintains a candidate list of maximal elements:

# Emit a stream of the f-maximal elements of the stream, s:
def maximals_by(s; f):
  reduce s as $x ({v:null, a:[]};
    ($x|f) as $y
    | if $y == .v then .a += [$x] elif $y > .v then {v:$y, a: [$x]} else . end)
  | .a[] ;

Using jq as a template engine

Here we describe three approaches:

  • the first uses jq "$-variables" as template variables; it might be suitable if there are only a small number of template variables, and if it is a requirement that all template variables be given values.

  • the second approach is similar to the first approach but scales well and does not require that all template variables be explicitly given values. It uses jq accessors (such as .foo or .["foo-bar"]) as template variables instead of "$-variables".

  • the third approach uses a JSON dictionary to define the template variables; it scales well but is slightly more complex and presupposes that the JSON dictionary is accurate.

Using jq variables as template variables

One straightforward approach is to use a jq object as a template, with jq variables as the template variables. The template can then be instantiated at the command line.

For example, suppose we start with the following template in a file named ab.jq:

{a: $a, b: $a}

One way to instantiate it would be by invoking jq as follows:

jq -n --argjson a 0 -f ab.jq

Notice that the contents of the file ab.jq need not be valid JSON; in fact, any valid jq program will do, so long as JSON values are provided for all the global "$-variables".

Notice also that if a key name is itself to be a template variable, it would have to be specified in parentheses, as for example:

{($a) : 0}

The disadvantage of this approach is that it does not scale so well for a large number of template variables, though jq's support for object destructuring might help. For example, one might want to set the "$-variables" in the template file using object destructuring, like so:

. as {a: $a}       # use the incoming data to set the $-variables 
| {a: $a, b: $a}   # the template

Using jq accessors as template variables

Using this approach, jq accessors are used as template variables. With the above example in mind, the template file (ab.jq) would be:

{a: .a, b: .a}

To instantiate the variables, we now only need a JSON object specifying the values, e.g.

echo '{"a":0}' | jq -f ab.jq

This approach scales well, but in the template, key names cannot be specified as template variables.

Arbitrary strings as template variables

Another scalable approach would be to use special JSON string values as template variables, and a JSON object for mapping these strings to JSON values.

For example, suppose that the file template.json contains the template:

{"a": "<A>", "b": ["<A>"]}

Here, the intent is that "<A>" is a template variable.

Now suppose that dictionary.json contains the dictionary as a JSON object:

 { "<A>": 0 }

and that fillin.jq contains the following jq program for instantiating templates:

 # $dict should be the dictionary for mapping template variables to JSON entities.
 # WARNING: this definition does not support template-variables being 
 # recognized as such in key names.
  reduce paths as $p (.;
    getpath($p) as $v
    | if $v|type == "string" and $dict[$v] then setpath($p; $dict[$v]) else . end)

Then the invocation:

jq --argfile dict dictionary.json -f fillin.jq template.json


  "a": 0,
  "b": [
  • dictionary.json is a JSON object defining the mapping
  • template.json is a JSON document defining the template
  • fillin.jq is the jq program for instantiating the template

The main disadvantage of this approach is that care must be taken to ensure that template variable names do not "collide" with string values that are intended to be fixed.

Emit the ids of JSON objects in a Riak database

The following script illustrates how curl and jq can work nicely together, especially if the entities stored at each Riak key are JSON entities.

The specific task we consider is as follows:

Task: Given a Riak database at $RIAK with a bucket $BUCKET, and assuming that each value at each riak key is a JSON entity, then for each top-level object or array of objects, emit the value if any of its "id" key; the values should be emitted as a stream, it being understood that if any of the objects does not have an "id" key, then it should be skipped.

The following script has been tested as a bash script with these values for RIAK and BUCKET:

curl -Ss "$RIAK/buckets/$BUCKET/keys?keys=stream" |\
  jq -r '.keys[] | @uri' |\
while read key
  curl -Ss "$RIAK/buckets/$BUCKET/keys/$key?keys"
done | jq 'if type == "array" then .[] | .id elif type == "object" then .id else empty end'

Filter objects based on the contents of a key

E.g., I only want objects whose genre key contains "house".

$ json='[{"genre":"deep house"}, {"genre": "progressive house"}, {"genre": "dubstep"}]'
$ echo "$json" | jq -c '.[] | select(.genre | contains("house"))'
{"genre":"deep house"}
{"genre":"progressive house"}

If it is possible that some objects might not contain the key you want to check, and you just want to ignore the objects that don't have it, then the above will need to be modified. For example:

$ json='[{"genre":"deep house"}, {"genre": "progressive house"}, {"volume": "wubwubwub"}]'
$ echo "$json" | jq -c '.[] | select(.genre | . and contains("house"))'

If your version of jq supports ? then it could also be used:

$ echo "$json" | jq -c '.[] | select(.genre | contains("house"))?'

In jq version 1.4+ (that is, in sufficiently recent versions of jq after 1.4), you can also use regular expressions, e.g. using the "$json" variable defined above:

$ echo "$json" | jq -c 'map( select(.genre | test("HOUSE"; "i")))'
[{"genre":"progressive house"},{"genre":"progressive house"}]

Note: use a semi-colon (";") to separate the arguments of test.

Filter objects based on tags in an array

In this section, we discuss how to select items from an array of objects each of which has an array of tags, where the selection is based on the presence or absence of a given tag in the array of tags.

For the sake of illustration, suppose the following sample JSON is in a file named input.json:

[ { "name": "Item 1",
    "tags": [{ "name": "TAG" },  { "name": "TAG" }, { "name": "Not-TAG" } ] },
  { "name": "Item 2",
    "tags": [ { "name": "Not-TAG" } ] } ]

Notice that the first item is tagged twice with the tag "TAG".

Here is a jq filter that will select the objects with the tag "TAG":

map(select( any(.tags[]; .name == "TAG" )))

In words: select an item if any of its tags matches "TAG".

Using the -c command-line option would result in the following output:

[{"name":"Item 1","tags":[{"name":"TAG"},{"name":"TAG"},{"name":"Not-TAG"}]}]

Using any/2 here is recommended because it allows the search for the matching tag to stop once a match is found.

A less efficient approach would be to use ’any/0’:

map(select([ .tags[] | .name == "TAG" ] | any))

The subexpression [ .tags[] | .name == "TAG" ] creates an array of boolean values, where true means the corresponding tag matched; this array is then passed as input to the any filter to determine whether there is a match.

If the tags are distinct, the subexpression could be written as select(.tags[] | .name == "TAG") with the same results; however if this subexpression is used, then the same item will appear as many times as there is a matching tag, as illustrated here:

$ jq 'map(select(.tags[] | .name == "TAG"))[] | .name'  input.json
"Item 1"
"Item 1"

Selecting all items that do NOT have a specific tag

To select items that do NOT have the "TAG" tag, we could use all/2 or all/0 with the same results:

 $ jq -c 'map(select( all( .tags[]; .name != "TAG") ))'  input.json
[{"name":"Item 2","tags":[{"name":"Not-TAG"}]}]

 $ jq -c 'map(select([ .tags[] | .name != "TAG" ] | all))'  input.json
[{"name":"Item 2","tags":[{"name":"Not-TAG"}]}]

Using all/2 would be more efficient if only because it avoids the intermediate array.

Find the most recent object in an S3 bucket

$ json=`aws s3api list-objects --bucket my-bucket-name`
$ echo "$json" | jq '.Contents | max_by(.LastModified) | {Key}' 

Sort by numeric values extracted from text

Say you have an array of objects with an "id" key and a text value that embeds a numeric ID among other text, and you want to sort by that numeric ID:


Add an element to an object array

Given an array of objects, I want to add another key to all elements in each of those objects based on existing keys:

  $ json='[{"a":1,"b":2},{"a":1,"b":1}]'
  $ echo "$json" | jq 'map(. + {color:(if (.a/.b) == 1 then "red" else "green" end)})'

>> [{"color":"green","b":2,"a":1},{"color":"red","b":1,"a":1}]

Explanation This example uses the map() operator. The filter for map copies all the keys of the input object using . and then merges this new object with the color object using the + operator. The color object itself is formed using the if conditional operator.

Note that this could also be done in the following manner:

jq 'map(.color = if (.a/.b) == 1 then "red" else "green" end)'

Zip column headers with their rows

Given the following JSON:

    "columnHeaders": [
            "name": "ga:pagePath",
            "columnType": "DIMENSION",
            "dataType": "STRING"
            "name": "ga:pageviews",
            "columnType": "METRIC",
            "dataType": "INTEGER"
    "rows": [
        [ "/" , 8 ],
        [ "/a", 4 ],
        [ "/b", 3 ],
        [ "/c", 2 ],
        [ "/d", 1 ]

How can I convert this into a form like:

    { "ga:pagePath": "/", "ga:pageviews": 8 },
    { "ga:pagePath": "/a", "ga:pageviews": 4 },
    { "ga:pagePath": "/b", "ga:pageviews": 3 },
    { "ga:pagePath": "/c", "ga:pageviews": 2 },
    { "ga:pagePath": "/d", "ga:pageviews": 1 }

Explanation Okay, so first we want to get the columnHeaders as an array of names:

(.columnHeaders | map(.name)) as $headers

Then, for each row, we take the $headers as entries (if this doesn't mean anything to you, refer to the with_entries section of the manual) and we use those to create a new object, in which the keys are the values from the entries and the values are the corresponding values on the row for each of said entries. Tricky, I know.

  | map(. as $row
        | $headers
        | with_entries({ "key": .value,
                         "value": $row[.key]}) )

Then we put it all together: wrapping it on a filter is left as an exercise for the reader.

(.columnHeaders | map(.name)) as $headers
| .rows
| map(. as $row
      | $headers
      | with_entries({"key": .value,
                      "value": $row[.key]}) )

(This recipe is from #623.)

Delete elements from objects recursively

A straightforward and general way to delete key/value pairs from all objects, no matter where they occur, is to use walk/1. (If your jq does not have walk/1, then you can copy its definition from

For example, to delete all "foo" keys, you could use the filter:

walk(if type == "object" then del(.foo) else . end)

It may also be possible to use the recurse builtin, as shown in the following example.

Let's take the recurse example from the manual, and add a bunch of useless {"foo": "bar"} to it:

{"name": "/", "foo": "bar", "children": [
  {"name": "/bin", "foo": "bar", "children": [
    {"name": "/bin/ls", "foo": "bar", "children": []},
    {"name": "/bin/sh", "foo": "bar", "children": []}]},
  {"name": "/home", "foo": "bar", "children": [
    {"name": "/home/stephen", "foo": "bar", "children": [
      {"name": "/home/stephen/jq", "foo": "bar", "children": []}]}]}]}

recurse(.children[]) | .name will give me all the names, but destroy the structure of the JSON in the process.

Is there a way to get that information, but preserve the structure?

That is, with the JSON above as input, the desired output would be:

{"name": "/", "children": [
  {"name": "/bin", "children": [
    {"name": "/bin/ls", "children": []},
    {"name": "/bin/sh", "children": []}]},
  {"name": "/home", "children": [
    {"name": "/home/stephen", "children": [
      {"name": "/home/stephen/jq", "children": []}]}]}]}


In order to remove the "foo" attribute from each element of the structure, you want to recurse through the structure and set each element to the result of deleting the foo attribute from itself. This translates to jq as:

recurse(.children[]) |= del(.foo)

If, instead of blacklisting foo, you'd rather whitelist name and children, you could do something like:

recurse(.children[]) |= {name, children}

(This recipe is from #263.)

Extract Specific Data for While Loop in Shell Script

Thanks to @pkoppstein and @wtlangford in Issue #663, I (@RickCogley) was able to finalize a shell script to pull descriptive metadata from a database of ours, which has a REST interface.

This cookbook entry makes use of curl, while read loops, and of course jq in a bash shell script. Once the JSON metadata files are output, they can be git pushed to a git repo, and diffed to see how the database settings change over time.

We assume a JSON stream like the following, with unique values for table id's, aliases and names:

  "id": "99999",
  "name": "My Database",
  "description": "Lorem ipsum, the description.",
  "culture": "en-US",
  "timeZone": "CST",
  "tables": [
      "id": 12341,
      "recordName": "Company",
      "recordsName": "Companies",
      "alias": "t_12341",
      "showTab": true,
      "color": "#660000"
      "id": 12342,
      "recordName": "Order",
      "recordsName": "Orders",
      "alias": "t_12342",
      "showTab": true,
      "color": "#006600"
      "id": 12343,
      "recordName": "Order Item",
      "recordsName": "Order Items",
      "alias": "t_12343",
      "showTab": true,
      "color": "#000099"

... the goal is to extract to a file only the table aliases using curl against a db's REST interface, then use the file's aliases as input to a while loop, in which curl again can be used to grab the details about tables.

First we set variables, then run curl against the REST API. The resulting JSON stream has no newlines, so piping it through jq '.' fixes this (bonus, if you also have XML, you can pipe it through xmllint to get a similar effect: xmllint --format -). The result is output to a file which contains JSON like the above.


curl -k "https://mydb.tld/api/$db_id/$db_rest_token/getinfo.json" |\
  jq '.' > $compcode-$db_id-Database-describe.json 
jq -r '.tables[] | "\(.alias) \(.recordName)"' \
  $compcode-$db_id-Database-describe.json > $compcode-tables.txt

The filter '.tables[] | "\(.alias) \(.recordName)"' selects the "tables" array, then from that, uses the filter "\(.foo) \(.bar)" to create a string with just those elements. Note, the -r here gives you just raw output in the file, which is what you need for the while read loop.

The output file looks like:

t_12341 Company
t_12342 Order
t_12343 Order Item

Next, the shell script uses a while read loop to parse that output file $compcode-tables.txt, then curl again to get table-specific info using the table alias talias as input. It passes the raw JSON output from the REST i/f through jq '.' to add newlines, then outputs that to a file using the two loop variables in the filename (as well as variables from the top of the script).

while read talias tname
  curl -k "https://mydb.tld/api/$db_id/$db_rest_token/$talias/getinfo.json" |\
    jq '.' >"$compcode-$db_id-Table-$talias-$tname-getinfo.json"
done < $compcode-tables.txt

The result is a collection of files like these:

ACME-98765-Table-t_12343-Order Item-getinfo.json

... that can be committed to a git repo, for diffing.

Extract data and set shell variables

A variation on the preceding entry:

$ eval "$(jq -r '@sh "a=\(.a) b=\(.b)"')"

This works because the @sh format type quotes strings to be shell-eval safe.

Another variant:

$ jq -r '@sh "a=\(.a) b=\(.b)"' | while read -r line; do eval "$line"; ...; done

To share multiple values without using eval, consider setting a bash array variable, e.g.

    vars=( $(jq -n -r '[1,2.3, null, "abc"] | .[] | @sh' ) )
    for f in "${vars[@]}"; do echo "$f" ; done

This approach will only work if the values are all single tokens, as in the example. In general, it is better to use jq -c to emit each value on a line separately; they can then be read using mapfile or one at a time.

For Windows, here is a .bat file that illustrates two approaches using jq. In the first example, the name of the variable is determined in the .bat file; in the second example, the name is determined by the jq program:

@echo off

for /f "delims=" %%I in ('jq -n -r "\"123\""') do set A=%%I
echo A is %A%

jq -n -r  "@sh \"set B=123\"" > setvars.bat
call .\setvars.bat
echo B is %B%

Convert a CSV file with Headers to JSON

There are several freely available tools for converting CSV files to JSON. For example, the npm package d3-dsv (npm install -g d3-dsv) includes a command-line program named csv2json, which expects the first line of the input file to be a header row, and uses these as keys. Such tools may be more convenient than jq for converting CSV files to JSON, not least because there are several "standard" CSV file formats.

For trivially simple CSV files, however, the jq invocation jq -R 'split(",")' can be used to convert each line to a JSON array. If the trivially simple CSV file has a row of headers, then as shown below, jq can also be used to produce a stream or array of objects using the header values as keys.

In this recipe, therefore, we will assume that either the CSV is trivially simple or that a suitable tool for performing the basic row-by-row conversion to JSON arrays is available. One such tool is any-json.

The following jq program expects as input an array, the first element of which is to be interpreted as a row of headers, and the other elements of which are to be interpreted as rows.

    # Requires: jq 1.5
    # objectify/1 expects an array of atomic values as inputs, and packages
    # these into an object with keys specified by the "headers" array and
    # values obtained by trimming string values, replacing empty strings
    # by null, and converting strings to numbers if possible.
    def objectify(headers):
      def tonumberq: tonumber? // .;
      def trimq: if type == "string" then sub("^ +";"") | sub(" +$";"") else . end;
      def tonullq: if . == "" then null else . end;
      . as $in
      | reduce range(0; headers|length) as $i
          ({}; .[headers[$i]] = ($in[$i] | trimq | tonumberq | tonullq) );
    def csv2jsonHelper:
      .[0] as $headers
      | reduce (.[1:][] | select(length > 0) ) as $row
          ([]; . + [ $row|objectify($headers) ]);

Usage example:

$ any-json input.csv | jq -f csv2json-helper.jq

Processing a large number of lines or JSON entities

Using jq 1.4 to process a file consisting of a large number of JSON entities or lines of raw text can be very challenging if any kind of reduction step is necessary, as the --slurp option requires the input to be stored in memory. One way to circumvent the limitations of jq 1.4 in this respect would be to break up the input file into smaller pieces, process them separately (perhaps in parallel), and then combine the results. Examples and utilities for parallel processing using jq can be found in jq-hopkok's parallelism folder.

The introduction of the inputs builtin in jq 1.5 allows files to be read in efficiently on an entity-by-entity or line-by-line basis. That is, the entire file no longer need be read in using the "slurp" option.

Here is an example drawn from

The input file consists of JSON entities, like so:

{"machine": "possible_victim01", "domain": "", "timestamp":1435071870}
{"machine": "possible_victim01", "domain": "", "timestamp":1435071875}
{"machine": "possible_victim01", "domain": "", "timestamp":1435071877}
{"machine": "possible_victim02", "domain": "", "timestamp":1435071877}
{"machine": "possible_victim03", "domain": "", "timestamp":1435071879}

The task is to produce a report consisting of a single object, like so:

      "possible_victim01": {
        "total": 3,
        "evildoers": {
          "": 2,
          "": 1
      "possible_victim02": {
        "total": 1,
        "evildoers": {
          "": 1
      "possible_victim03": {
        "total": 1,
        "evildoers": {
          "": 1

Here is a straightforward jq program that will do the job:

reduce inputs as $line
   $line.machine as $machine
   | $line.domain as $domain
   | .[$machine].total as $total
   | .[$machine].evildoers as $evildoers
   | . + { ($machine): {"total": (1 + $total),
                        "evildoers": ($evildoers | (.[$domain] += 1)) }} )

The program would be invoked with the -n option, e.g. like so:

jq -n -f program.jq data.json

The -n option is required as the invocation of inputs does the reading of the file.

If the task requires both per-line (or per-entity) processing as well as some kind of reduction, then the foreach builtin, also introduced in jq 1.5, is very useful, as it obviates the need to accumulate anything that is not required for the reduction.

The trick is to use foreach (inputs, null) rather than just foreach inputs. As a simple example, suppose we have a file consisting of a large number of JSON objects, some of which have a key, say "n", and we are required to extract the corresponding values as well as determine the number of objects for which the "n" value is present and not null.

foreach (inputs, null) as $line 
   if $line.n then .+1 else . end;
   if $line == null then . else $line.n // empty end)

Processing huge JSON texts

In jq 1.5 there is a --stream option that enables a streaming JSON parser, which allows huge JSON texts to be processed without requiring the entire JSON text to be read into memory.

The jq FAQ shows how to use the jq streaming parser to convert a top-level JSON array into a stream of its elements.

In the following, we consider three other examples:

(a) a single JSON object with many scalar-valued keys


{"a":1, "b": {"c": 3}}


jq -c --stream '. as $in | select(length == 2) | {}|setpath($in[0]; $in[1])' # stream of leaflets



Notice that the output consists of a stream of "leaflets", that is, a stream of JSON entities, one for each "leaf", where each "leaflet" reflects the original structure:

(b) A JSON object with a single scalar-valued array


{"a": [1, 2.2, true, "abc", null]}'


jq -nc --stream '
    fromstream( 1|truncate_stream(inputs)
      |  select(length>1)
      | .[0] |= .[1:] )'



(c) An arbitrary JSON object


{"a": [1, 2], "b": [3, 4]}


jq -nc --stream '
  def atomize(s):
    fromstream(foreach s as $in ( {previous:null, emit: null};
      if ($in | length == 2) and ($in|.[0][0]) != .previous and .previous != null
      then {emit: [[.previous]], previous: $in|.[0][0]}
      else { previous: ($in|.[0][0]), emit: null}
      (.emit // empty), $in) ) ;



For further information about the streaming parser, see the jq Manual and the FAQ.

List keys used in any object in a list

If you have an array of JSON objects and want to obtain a listing of the top-level keys in these objects, consider:

add | keys

If you want to obtain a listing of all the keys in all the objects, no matter how deeply nested, you can use this filter:

[.. | objects | keys[]] | unique

For example, given the array: [{"a": {"b":1}}, {"a": {"c":2}}]

the previous filter will produce: ["a", "b", "c"]

Include or import a module and call its functions

Key points:

  • If the module, say M.jq, is located in ~/.jq/ or ~/.jq/M/ then there should be no need to invoke jq with the -L option;
  • The search path can be specified using the command-line option: -L <path> # the path may be absolute
  • include "filename"; # a reference to filename.jq
  • import "filename" as symbol; # a reference to filename.jq
  • :: is the scope resolution operator, e.g. builtin::walk

Example 1: ~/.jq/library/library.jq

  1. Copy the definition of walk/1 to $HOME/.jq/library/library.jq (see e.g.

  2. jq 'include "library"; walk(if type == "object" then del(.foo) else . end)' <<< '{"a":1, "foo": 2}'

Example 2: ~/jq/library.jq

  1. Copy the definition of walk/1 to $HOME/jq/library.jq (see e.g.

  2. Invoke jq with the -L option:

jq -L $HOME/jq 'import "library" as lib;
   lib::walk(if type == "object" then del(.foo) else . end)' <<< '{"a":1, "foo": 2}'

Remove adjacent matching elements from a list

The unique built-in will give you each unique element in a list, but sometimes it's useful to mimic the behavior of the unix uniq command. One way to do that is to use range to select only elements which differ from their neighbor:

def uniq: 
  [range(0;length) as $i
   | .[$i] as $x
   | if $i == 0 or $x != .[$i-1] then $x else empty end];

Example input:


Applying the uniq filter produces:


And here is a stream-oriented version:

    def uniq(s):
      foreach s as $x ({n:-1, prev: null};
        .n += 1 | .prev = .this | .this = $x;
        if .n==0 or .prev != $x then $x else empty end);
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.
Press h to open a hovercard with more details.