tool that shows all possible jq paths in a given json file #243

Closed
ilyash opened this issue Dec 14, 2013 · 23 comments

@ilyash

ilyash commented Dec 14, 2013

(Copy-pasting an unanswered email:)

Hi.

I have just prepared a tiny tool that shows all possible jq paths in a given json file. I guess it can help people that use jq. I'm not sure where in the docs it goes. Can you please add a link wherever you think is appropriate?

Please note (also, should be clear to users before following the link) that the script was not heavily tested and I guess is alpha quality in general, quick and dirty, naive implementation.... but still works for me.

https://github.com/ilyash/show-struct

Thanks!

@nicowilliams
Contributor

Can you write that program in jq? I bet it can be done!

@ilyash
Author

ilyash commented Jan 7, 2014

@nicowilliams ... but why? I'm sure I can be more productive doing pretty much anything else :)

@bryder

bryder commented Mar 21, 2014

Excellent @ilyash ! I'm going to use that. I was thinking of doing the same thing. And yes @nicowilliams I did think about how I could have done it in jq.

@ghost

ghost commented Mar 21, 2014

@nicowilliams now I have a new crazy jq side project to match my sudoku validator

@ghost

ghost commented Mar 21, 2014

Do you think this application could be used to provide autocompletion on the command line? That would be amazing.

@nicowilliams
Contributor

FYI, I believe that this:

jq 'path(recurse(if type|. == "array" or . == "object" then .[] else empty end))'

or, with master, this:

jq 'path(..)'

will list all the possible paths.
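
As a minimal sketch of what `path(..)` emits, assuming jq 1.5 or later is installed; the sample input and the `out` variable are made up for illustration and are not from the thread. Each output line is one path, written as a JSON array of keys and indices:

```shell
# Hypothetical sample document; -c prints each path compactly on its own line.
out=$(echo '{"a":[1,{"b":2}]}' | jq -c 'path(..)')
echo "$out"
# []
# ["a"]
# ["a",0]
# ["a",1]
# ["a",1,"b"]
```

The first line is the empty path, which refers to the whole document.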

@ilyash
Author

ilyash commented Apr 10, 2014

@nicowilliams this is completely different output, it's not copy+paste ready to help with building the jq command line.

@ilyash
Author

ilyash commented Apr 10, 2014

@slapresta this is a very good idea, though I'm not sure yet whether and how it could be implemented. The first concern is that the command has the form jq PATH FILE, so while the PATH part is being completed, the FILE is still "unknown" to the completion.

@nicowilliams
Contributor

+1 to @slapresta's idea.

@ilyash This produces a list of paths in the input:

jq -c 'path(..)|[.[]|tostring]|join("/")'

It's a very simple, trivial program :)

@ilyash
Author

ilyash commented Jun 21, 2014

@nicowilliams You are totally ignoring the differences between outputs and the fact that show-struct is helpful for constructing the path part of jq command. Have you seen the sample output of the show-struct? Let's compare full (but censored) outputs for the same input.

show_struct:

.Records -- (Array of 3 elements)
.Records[]
.Records[].awsRegion -- us-east-1
.Records[].eventName -- DescribeInstances
.Records[].eventSource -- ec2.amazonaws.com
.Records[].eventTime -- 2013-12-06T10:34:34Z .. 2013-12-06T10:36:36Z (3 unique values)
.Records[].eventVersion -- 1.0
.Records[].requestParameters
.Records[].requestParameters.filterSet
.Records[].requestParameters.filterSet.items -- (Array of 3 elements)
.Records[].requestParameters.filterSet.items[]
.Records[].requestParameters.filterSet.items[].name -- group-name .. tag:role (3 unique values)
.Records[].requestParameters.filterSet.items[].valueSet
.Records[].requestParameters.filterSet.items[].valueSet.items -- (Array of 1 elements)
.Records[].requestParameters.filterSet.items[].valueSet.items[]
.Records[].requestParameters.filterSet.items[].valueSet.items[].value -- dsp_worker .. staging-dsp-processor (5 unique values)
.Records[].requestParameters.instancesSet -- (Empty hash)
.Records[].responseElements -- <responseOmitted>
.Records[].sourceIPAddress -- X.Y.Z.W .. X.Y.Z.WW (3 unique values)
.Records[].userAgent -- aws-sdk-java/1.5.0 Linux/3.2.0-4-amd64 OpenJDK_64-Bit_Server_VM/23.7-b01 .. aws-sdk-java/1.6.7 Linux/3.2.0-4-amd64 OpenJDK_64-Bit_Server_VM/23.7-b01 (2 unique values)
.Records[].userIdentity
.Records[].userIdentity.accessKeyId -- XXXXXXXXXXXXXXXXXXXX
.Records[].userIdentity.accountId -- NNNNNNNNNNNN
.Records[].userIdentity.arn -- arn:aws:iam::NNNNNNNNNNNN:user/stage-dsp-manager
.Records[].userIdentity.principalId -- XXXXXXXXXXXXXXXXXXXXX
.Records[].userIdentity.type -- IAMUser
.Records[].userIdentity.userName -- stage-dsp-manager

In contrast, the jq program above shows:

""
"Records"
"Records/0"
"Records/0/eventVersion"
"Records/0/userIdentity"
"Records/0/userIdentity/type"
"Records/0/userIdentity/principalId"
"Records/0/userIdentity/arn"
"Records/0/userIdentity/accountId"
"Records/0/userIdentity/accessKeyId"
"Records/0/userIdentity/userName"
"Records/0/eventTime"
"Records/0/eventSource"
"Records/0/eventName"
"Records/0/awsRegion"
"Records/0/sourceIPAddress"
"Records/0/userAgent"
"Records/0/requestParameters"
"Records/0/requestParameters/instancesSet"
"Records/0/requestParameters/filterSet"
"Records/0/requestParameters/filterSet/items"
"Records/0/requestParameters/filterSet/items/0"
"Records/0/requestParameters/filterSet/items/0/name"
"Records/0/requestParameters/filterSet/items/0/valueSet"
"Records/0/requestParameters/filterSet/items/0/valueSet/items"
"Records/0/requestParameters/filterSet/items/0/valueSet/items/0"
"Records/0/requestParameters/filterSet/items/0/valueSet/items/0/value"
"Records/0/requestParameters/filterSet/items/1"
"Records/0/requestParameters/filterSet/items/1/name"
"Records/0/requestParameters/filterSet/items/1/valueSet"
"Records/0/requestParameters/filterSet/items/1/valueSet/items"
"Records/0/requestParameters/filterSet/items/1/valueSet/items/0"
"Records/0/requestParameters/filterSet/items/1/valueSet/items/0/value"
"Records/0/requestParameters/filterSet/items/2"
"Records/0/requestParameters/filterSet/items/2/name"
"Records/0/requestParameters/filterSet/items/2/valueSet"
"Records/0/requestParameters/filterSet/items/2/valueSet/items"
"Records/0/requestParameters/filterSet/items/2/valueSet/items/0"
"Records/0/requestParameters/filterSet/items/2/valueSet/items/0/value"
"Records/0/responseElements"
"Records/1"
"Records/1/eventVersion"
"Records/1/userIdentity"
"Records/1/userIdentity/type"
"Records/1/userIdentity/principalId"
"Records/1/userIdentity/arn"
"Records/1/userIdentity/accountId"
"Records/1/userIdentity/accessKeyId"
"Records/1/userIdentity/userName"
"Records/1/eventTime"
"Records/1/eventSource"
"Records/1/eventName"
"Records/1/awsRegion"
"Records/1/sourceIPAddress"
"Records/1/userAgent"
"Records/1/requestParameters"
"Records/1/requestParameters/instancesSet"
"Records/1/requestParameters/filterSet"
"Records/1/requestParameters/filterSet/items"
"Records/1/requestParameters/filterSet/items/0"
"Records/1/requestParameters/filterSet/items/0/name"
"Records/1/requestParameters/filterSet/items/0/valueSet"
"Records/1/requestParameters/filterSet/items/0/valueSet/items"
"Records/1/requestParameters/filterSet/items/0/valueSet/items/0"
"Records/1/requestParameters/filterSet/items/0/valueSet/items/0/value"
"Records/1/requestParameters/filterSet/items/1"
"Records/1/requestParameters/filterSet/items/1/name"
"Records/1/requestParameters/filterSet/items/1/valueSet"
"Records/1/requestParameters/filterSet/items/1/valueSet/items"
"Records/1/requestParameters/filterSet/items/1/valueSet/items/0"
"Records/1/requestParameters/filterSet/items/1/valueSet/items/0/value"
"Records/1/requestParameters/filterSet/items/2"
"Records/1/requestParameters/filterSet/items/2/name"
"Records/1/requestParameters/filterSet/items/2/valueSet"
"Records/1/requestParameters/filterSet/items/2/valueSet/items"
"Records/1/requestParameters/filterSet/items/2/valueSet/items/0"
"Records/1/requestParameters/filterSet/items/2/valueSet/items/0/value"
"Records/1/responseElements"
"Records/2"
"Records/2/eventVersion"
"Records/2/userIdentity"
"Records/2/userIdentity/type"
"Records/2/userIdentity/principalId"
"Records/2/userIdentity/arn"
"Records/2/userIdentity/accountId"
"Records/2/userIdentity/accessKeyId"
"Records/2/userIdentity/userName"
"Records/2/eventTime"
"Records/2/eventSource"
"Records/2/eventName"
"Records/2/awsRegion"
"Records/2/sourceIPAddress"
"Records/2/userAgent"
"Records/2/requestParameters"
"Records/2/requestParameters/instancesSet"
"Records/2/requestParameters/filterSet"
"Records/2/requestParameters/filterSet/items"
"Records/2/requestParameters/filterSet/items/0"
"Records/2/requestParameters/filterSet/items/0/name"
"Records/2/requestParameters/filterSet/items/0/valueSet"
"Records/2/requestParameters/filterSet/items/0/valueSet/items"
"Records/2/requestParameters/filterSet/items/0/valueSet/items/0"
"Records/2/requestParameters/filterSet/items/0/valueSet/items/0/value"
"Records/2/requestParameters/filterSet/items/1"
"Records/2/requestParameters/filterSet/items/1/name"
"Records/2/requestParameters/filterSet/items/1/valueSet"
"Records/2/requestParameters/filterSet/items/1/valueSet/items"
"Records/2/requestParameters/filterSet/items/1/valueSet/items/0"
"Records/2/requestParameters/filterSet/items/1/valueSet/items/0/value"
"Records/2/requestParameters/filterSet/items/2"
"Records/2/requestParameters/filterSet/items/2/name"
"Records/2/requestParameters/filterSet/items/2/valueSet"
"Records/2/requestParameters/filterSet/items/2/valueSet/items"
"Records/2/requestParameters/filterSet/items/2/valueSet/items/0"
"Records/2/requestParameters/filterSet/items/2/valueSet/items/0/value"
"Records/2/responseElements"

@joelpurra
Contributor

joelpurra commented Jul 9, 2014

I like the idea of a compact structural overview, but I am not as interested in the values or array lengths.

jq '[path(..)|map(if type=="number" then "[]" else tostring end)|join(".")|split(".[]")|join("[]")]|unique|map("."+.)|.[]'
[
    path(..)
    | map(
        if type == "number" then
            "[]"
        else
            tostring
        end
    )
    | join(".")
    | split(".[]")
    | join("[]")
]
| unique
| map("." + .)
| .[]

See also structure.sh in jq-hopkok for a more fleshed out script, in particular for property names containing special characters.
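
For illustration, here is a sketch of the one-liner above run against a small made-up document, assuming jq 1.5 or later; the input and the `out` variable are hypothetical. Array indices collapse to `[]`, and `unique` removes the repeats that come from sibling array elements:

```shell
# Hypothetical input: two records with slightly different shapes.
out=$(echo '{"Records":[{"a":1},{"a":2,"b":[true]}]}' \
  | jq -r '[path(..)|map(if type=="number" then "[]" else tostring end)|join(".")|split(".[]")|join("[]")]|unique|map("."+.)|.[]')
echo "$out"
# .
# .Records
# .Records[]
# .Records[].a
# .Records[].b
# .Records[].b[]
```

Note that `.Records[].b` appears even though only the second record has a `b` key, so the overview is a union of the shapes seen, not a guarantee for every element.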

@nicowilliams
Contributor

@joelpurra Very clever! This will come in handy when we have eval.

@nicowilliams
Contributor

@slapresta Your jq sudoku code is very cool. I wonder if you can modify it to take advantage of tail call optimization in master...

joelpurra added a commit to joelpurra/har-dulcify that referenced this issue Jul 14, 2014
@ilyash
Author

ilyash commented Aug 16, 2014

@joelpurra, I have a sample file (not sure where I got it from, it's just there):

[
  {"$jt:env": "HOME"},
  ["$jt:env", "HOME"]
]

Both outputs from your snippet and my show_struct are pretty confusing:

"."
".[]"
".[].$jt:env"

and

[] -- (Array of 2 elements)
[].$jt:env -- HOME
[][] -- $jt:env .. HOME (2 unique values)

I guess we could both improve...

@joelpurra
Contributor

@ilyash: Yes, it's a confusing example =)

I have an updated version that works slightly differently. I plan to move it from har-dulcify to jq-hopkok at some point.

<temp.json har-dulcify/src/util/structure.sh
.
.[]
.[]["$jt:env"]
.[][]

Because of the mixed data types, jq doesn't fully agree when these paths are used as filters on the same file.

@nicowilliams
Contributor

A path-based representation of the above text which papers over the
difference in (non-scalar) types of the elements of the top-level array...
must be lossy.

@jacobsalmela

@nicowilliams or @ilyash what about showing the values of those paths as well?

@ayosec

ayosec commented Jun 21, 2019

If anyone is interested, the following code shows all paths with their values:

paths(scalars) as $p
  | [ ( [ $p[] | tostring ] | join(".") )
    , ( getpath($p) | tojson )
    ]
  | join(" = ")

Example:

$ jq -r '
paths(scalars) as $p
  | [ ( [ $p[] | tostring ] | join(".") )
    , ( getpath($p) | tojson )
    ]
  | join(" = ")
' <<'INPUT'
{
  "a": 1,
  "b": [ "red", "green", "blue" ],
  "c": {
    "d": [
      {
        "a": 100,
        "b": 200,
        "c": "x\ny\nz"
      },
      {
        "a": 101,
        "b": 201
      }
    ]
  }
}
INPUT

Output:

a = 1
b.0 = "red"
b.1 = "green"
b.2 = "blue"
c.d.0.a = 100
c.d.0.b = 200
c.d.0.c = "x\ny\nz"
c.d.1.a = 101
c.d.1.b = 201

@gertcuykens

@ayosec is it possible to have this? thx

a = 1
b[0] = "red"
b[1] = "green"
b[2] = "blue"
c.d[0].a = 100
c.d[0].b = 200
c.d[0].c = "x\ny\nz"
c.d[1].a = 101
c.d[1].b = 201

I tried it myself, but it seems a lot harder than I expected.

@pkoppstein
Contributor

pkoppstein commented Jul 26, 2019

@gertcuykens - The following produces the output you've indicated you want:

def path2text($value):
  def tos: if type == "number" then . else "\"\(tojson)\"" end;
  reduce .[] as $segment ("";  .
    + ($segment
       | if type == "string" then "." + . else "[\(.)]" end))
  + " = \($value | tos)";
  

paths(scalars) as $p
  | getpath($p) as $v
  | $p | path2text($v)

@gertcuykens

gertcuykens commented Jul 26, 2019

yep works awesome thx, wouldn't be able to find it myself in a million years :)

PS: I just made a slight modification, otherwise string values come out with doubled quotes, e.g. = ""test""

def path2text($value):
  def tos: if type == "number" then . else tojson end;
  reduce .[] as $segment ("";  .
    + ($segment
       | if type == "string" then "." + . else "[\(.)]" end))
  + " = \($value | tos)";

paths(scalars) as $p
  | getpath($p) as $v
  | $p | path2text($v)
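
A short usage sketch of the modified `path2text` above, assuming jq 1.5 or later (needed for the `$value` parameter syntax); the sample input and the `out` variable are made up for illustration:

```shell
# Hypothetical input; -r prints the resulting strings without JSON quoting.
out=$(echo '{"a":1,"b":["red","green"],"c":{"d":[{"a":100}]}}' | jq -r '
def path2text($value):
  def tos: if type == "number" then . else tojson end;
  reduce .[] as $segment ("";  .
    + ($segment
       | if type == "string" then "." + . else "[\(.)]" end))
  + " = \($value | tos)";

paths(scalars) as $p
  | getpath($p) as $v
  | $p | path2text($v)
')
echo "$out"
# .a = 1
# .b[0] = "red"
# .b[1] = "green"
# .c.d[0].a = 100
```

Each line keeps a leading dot, so the left-hand side can be pasted into a jq filter as long as the keys are plain identifiers.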

@YordanGeorgiev

YordanGeorgiev commented Oct 8, 2020

@ilyash - open a buy me a beer / coffee account ... I will be one of the first to send you money ... Your tools have saved me and my customers a tremendous amount of time ...

@orvn

orvn commented Jul 14, 2021

I've been using this syntax:

jq 'select(objects)|=[.] | map( paths(scalars) ) | map( map(select(numbers)="[]") | join(".")) | unique'

That last pipe to unique is also helpful for really large objects.
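
As a rough sketch of what that filter returns, assuming jq 1.5 or later; the sample input and the `out` variable are hypothetical. A top-level object is first wrapped in an array, then every scalar path has its numeric indices replaced by `[]` before being joined with dots:

```shell
# Hypothetical input; -c prints the resulting array on one line.
out=$(echo '{"Records":[{"a":1},{"a":2,"b":[true]}]}' \
  | jq -c 'select(objects)|=[.] | map( paths(scalars) ) | map( map(select(numbers)="[]") | join(".")) | unique')
echo "$out"
# ["Records.[].a","Records.[].b.[]"]
```

Only paths that end in a scalar are listed, and the two `a` entries from the separate records collapse into one after `unique`.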
