# JQ by Example

* http://stedolan.github.io/jq/

* The equivalent of `sed`, `awk`, or `csvkit` for JSON data.
* Analogous to the XPath query language for XML.
* For scraping HTML/XML, have a look at `xidel`: http://videlibri.sourceforge.net/xidel.html

In [1]:
# Some sample data stored as JSON
!cat DataSets/HistogramAPI.json | head

[
[1358024400,1800,{
	"0.5": 1,
	"0.59": 2,
	"1.7": 1,
	"2.5": 1,
	"3.4": 1,
	"3.5": 1,
	"3.6": 4,
	"3.7": 5,


In [20]:
# Pretty print
!cat DataSets/HistogramAPI.json | jq '.' | head

[
  [
    1358024400,
    1800,
    {
      "0.5": 1,
      "0.59": 2,
      "1.7": 1,
      "2.5": 1,
      "3.4": 1,


In [3]:
# Look at first elements: timestamps
!cat DataSets/HistogramAPI.json | jq '.[][0]'

1358024400
1358026200
1358028000
1358029800
1358031600
1358033400
1358035200
1358037000
1358038800
1358040600
1358042400
1358044200
1358046000
1358047800
1358049600
1358051400


In [4]:
# Count number of elements
!cat DataSets/HistogramAPI.json | jq '.[][0]' | wc -l

16
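
Counting lines with `wc -l` works because `jq` prints one timestamp per line. As a minimal sketch, jq's built-in `length` filter returns the size of the top-level array directly. The `sample` variable below is made-up data that only mimics the `[timestamp, interval, histogram]` layout of the file.

```shell
# Made-up miniature of the HistogramAPI.json layout (hypothetical values).
sample='[[1358024400,1800,{"0.5":1,"0.59":2}],[1358026200,1800,{"0.5":3}]]'

# `length` applied to the top-level array gives the number of entries.
count=$(printf '%s' "$sample" | jq 'length')
echo "$count"
```

In the notebook the same filter would be run as `!cat DataSets/HistogramAPI.json | jq 'length'`, which should agree with the 16 lines counted above.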


In [25]:
# Focus on the first entry: extract the histogram
!cat DataSets/HistogramAPI.json | jq '.[0][2]' | head

{
  "0.5": 1,
  "0.59": 2,
  "1.7": 1,
  "2.5": 1,
  "3.4": 1,
  "3.5": 1,
  "3.6": 4,
  "3.7": 5,
  "3.8": 6,


In [26]:
# Convert to an array of 'entries' = key/value pairs
!cat DataSets/HistogramAPI.json | jq '.[0][2] | to_entries' | head

[
  {
    "key": "0.5",
    "value": 1
  },
  {
    "key": "0.59",
    "value": 2
  },
  {


In [27]:
# Convert to sequence of arrays
!cat DataSets/HistogramAPI.json | jq '.[0][2] | to_entries | .[] | [.key, .value]' | head

[
  "0.5",
  1
]
[
  "0.59",
  2
]
[
  "1.7",


In [28]:
# Output as csv
!cat DataSets/HistogramAPI.json | jq '.[0][2] | to_entries | .[] | [.key, .value] | @csv' --raw-output | head

"0.5",1
"0.59",2
"1.7",1
"10",193
"100",1
"11",209
"12",223
"120",1
"13",176
"14",163


In [30]:
# Clean out quotes (... the dirty way)
!cat DataSets/HistogramAPI.json |\
 jq '.[0][2] | to_entries | .[] | [.key, .value] | @csv' --raw-output |\
 perl -pe 's/"//g' |\
 head

0.5,1
0.59,2
1.7,1
10,193
100,1
11,209
12,223
120,1
13,176
14,163
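
The quotes appear because `@csv` quotes every string, and the histogram keys are strings. A possible way to avoid the `perl` step is to build each line with jq string interpolation instead of `@csv`. This is only a sketch: it assumes the keys never contain commas or quotes (true for the numeric bin labels shown here), and the `sample` data is made up.

```shell
# Made-up miniature of the HistogramAPI.json layout (hypothetical values).
sample='[[1358024400,1800,{"0.5":1,"0.59":2,"10":193}]]'

# "\(.key),\(.value)" builds a plain string per entry, so nothing gets quoted;
# --raw-output prints that string without surrounding JSON quotes.
csv=$(printf '%s' "$sample" |
  jq --raw-output '.[0][2] | to_entries | .[] | "\(.key),\(.value)"')
echo "$csv"
```

Unlike `@csv`, interpolation does no escaping at all, so the `@csv` version remains the safer choice for arbitrary string keys.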


In [31]:
# Store the result in a CSV file
!cat DataSets/HistogramAPI.json |\
 jq '.[0][2] | to_entries | .[] | [.key, .value] | @csv' --raw-output |\
 perl -pe 's/"//g' \
 > DataSets/HistogramAPI.csv

# Store Logic in Bash Script

* Name: `hist2csv`
* Use `cat` to pass `stdin` through to the pipeline.
* Use `$1`, `$2`, ... for command-line arguments. The shell does not expand them inside single quotes, so the jq filter is double-quoted here.

<pre>
#!/bin/bash
cat |\
jq ".[$1][2] | to_entries | .[] | [.key, .value] | @csv" --raw-output |\
perl -pe 's/"//g'
</pre>
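
Splicing `$1` into the filter text means the argument becomes part of the jq program itself. A hedged alternative is to hand the index to jq as a variable with `--argjson`, which lets the filter stay in single quotes. The sketch below assumes jq 1.5 or newer (where `--argjson` is available); the function name `hist2csv_arg` and the `sample` data are hypothetical.

```shell
# Hypothetical variant of hist2csv: the index arrives as the jq variable $i
# instead of being pasted into the filter string.
hist2csv_arg() {
  jq --raw-output --argjson i "$1" \
    '.[$i][2] | to_entries | .[] | [.key, .value] | @csv' |
    perl -pe 's/"//g'
}

# Made-up miniature of the HistogramAPI.json layout (hypothetical values).
sample='[[1358024400,1800,{"0.5":1}],[1358026200,1800,{"0.4":7,"0.5":3}]]'

# Extract the histogram of entry 1 (the second entry) as CSV.
out=$(printf '%s' "$sample" | hist2csv_arg 1)
echo "$out"
```

Because `--argjson` parses its value as JSON, a non-numeric argument makes jq fail with an error rather than silently changing the filter.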

In [33]:
!cat DataSets/HistogramAPI.json | ./hist2csv 10 | head

0.4,1
0.5,1
0.59,1
0.69,1
10,162
100,63
1000,4
11,52
110,56
1100,7


# All done

In [13]:
!head DataSets/HistogramAPI.csv

0.5,1
0.59,2
1.7,1
10,193
100,1
11,209
12,223
120,1
13,176
14,163
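
The rows above come out in string order of the keys (`10`, `100`, `11`, ...), not in numeric order. If numeric order of the bins matters, one option is to sort the entries by the key converted to a number before printing. This is a sketch on made-up `sample` data, using string interpolation for the output and assuming every key parses as a number.

```shell
# Made-up miniature of the HistogramAPI.json layout (hypothetical values).
sample='[[1358024400,1800,{"10":193,"0.5":1,"100":1,"11":209}]]'

# sort_by(.key | tonumber) orders the entries by the numeric value of the key.
sorted=$(printf '%s' "$sample" |
  jq --raw-output \
    '.[0][2] | to_entries | sort_by(.key | tonumber) | .[] | "\(.key),\(.value)"')
echo "$sorted"
```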
