Are wildcard values possible? #82

Closed
davehatton opened this issue Feb 5, 2013 · 3 comments

@davehatton

(Sorry for asking a question here, I couldn't find a forum for jq)

I have a pair of artificial and convoluted files

company1.json

{
    "A0001": {
        "album": "Album1", 
        "ref": "0001",
        "artist": "Artist1",
        "tracks": {
            "T1": {
                 "track": "1",
                 "title": "Album 1 Song 1",
                 "composer": "Composer Song 1"
            },
            "T2": {
                 "track": "2",
                 "title": "Album 1 Song 2",
                 "composer": "Composer Song 2"
            },
            "T3": {
                 "track": "3",
                 "title": "Album 1 Song 3",
                 "composer": "Composer Song 3"
            }
        }
    }, 
    "A0002": {
        "album": "Album2", 
        "ref": "0002",
        "artist": "Artist2",
        "tracks": {
            "T1": {
                 "track": "1",
                 "title": "Album 2 Song 1",
                 "composer": "Composer Song 1"
            },
            "T2": {
                 "track": "2",
                 "title": "Album 2 Song 2",
                 "composer": "Composer Song 2"
            },
            "T3": {
                 "track": "3",
                 "title": "Album 2 Song 3",
                 "composer": "Composer Song 3"
            }
        }
    }
}

company2.json

{
    "A0003": {
        "album": "Album3", 
        "ref": "0003",
        "artist": "Artist3",
        "tracks": {
            "T1": {
                 "track": "1",
                 "title": "Album 3 Song 1",
                 "composer": "Composer Song 1"
            },
            "T2": {
                 "track": "2",
                 "title": "Album 3 Song 2",
                 "composer": "Composer Song 2"
            },
            "T3": {
                 "track": "3",
                 "title": "Album 3 Song 3",
                 "composer": "Composer Song 3"
            }
        }
    }, 
    "A0004": {
        "album": "Album4", 
        "ref": "0004",
        "artist": "Artist4",
        "tracks": {
            "T1": {
                 "track": "1",
                 "title": "Album 4 Song 1",
                 "composer": "Composer Song 1"
            },
            "T2": {
                 "track": "2",
                 "title": "Album 4 Song 2",
                 "composer": "Composer Song 2"
            },
            "T3": {
                 "track": "3",
                 "title": "Album 4 Song 3",
                 "composer": "Composer Song 3"
            }
        }
    }
}

I can run the following

cat company*.json | jq 'if .A0002.tracks.T2.title == "Album 2 Song 2" then (.A0002) else select(false) end | .album, .artist'

which returns

"Album2"
"Artist2"

Is there a way to wildcard the keys A0002 and T2? For example:

cat company*.json | jq 'if .**WILDCARD**.tracks.**WILDCARD**.title == "Album 2 Song 2" then (.**COMPLETE RECORD**) else select(false) end | .album, .artist'

Also, can I do some form of string match? For example:

cat company*.json | jq 'if .**WILDCARD**.tracks.**WILDCARD**.title ISLIKE "Song 2" then (.**COMPLETE RECORD**) else select(false) end | .album, .artist'
@ghost

ghost commented Feb 5, 2013

.[] | select( .tracks[].title | contains("Song 2") ) | .album, .artist

My understanding (as a user) is that .[] acts as a wildcard: it produces the values of an object as a stream. It also works for arrays (actually, only its application to arrays is documented). NB: if it occurs after another selector, you omit the leading ., so it's just [].
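
A minimal sketch of that wildcard behaviour, assuming jq is installed and on the PATH. The tiny input object and the shell variable names (values, items) are made up for illustration and are not from the question's files.

```shell
# .[] on an object streams its values, one result per line
values=$(echo '{"a": 1, "b": [2, 3]}' | jq -c '.[]')
echo "$values"

# after another path component the iterator is written directly: .b[]
items=$(echo '{"a": 1, "b": [2, 3]}' | jq -c '.b[]')
echo "$items"
```

The first command should print 1 and [2,3] on separate lines; the second should print 2 and 3, because the array stored under b is itself streamed.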

Thus you would need .[].tracks[].title for the test. However, for the result, you don't actually want the whole input object, only the value of the matching field (e.g. not the whole object in company1.json, but just the value of field A0002). One way to do this is to iterate over the fields you want, applying the test to each, and returning the matching ones:

.[] | if .tracks[].title == "Album 2 Song 2" then (.) else select(false) end | .album, .artist

This can be simplified a bit. First, select(false) is the same as empty:

.[] | if .tracks[].title == "Album 2 Song 2" then (.) else empty end | .album, .artist

Next, the whole if expression can be written as a select:

.[] | select( .tracks[].title == "Album 2 Song 2" ) | .album, .artist

Finally, to answer your second question, jq doesn't have an ISLIKE, but it does have contains which, when used with strings, is a substring test. It takes the sought substring as an argument, and returns true or false. So, the script becomes:

.[] | select( .tracks[].title | contains("Song 2") ) | .album, .artist
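
One thing worth checking when the test inside select is a stream: select emits its input once for every true result, so an album with several matching tracks is repeated. The sketch below uses a trimmed-down copy of company1.json (two tracks per album, fewer fields) and assumes a jq version that provides any(generator; condition), which I believe means 1.5 or later. The variable names are illustrative only.

```shell
data='{
  "A0001": {"album": "Album1", "artist": "Artist1", "tracks": {
    "T1": {"title": "Album 1 Song 1"}, "T2": {"title": "Album 1 Song 2"}}},
  "A0002": {"album": "Album2", "artist": "Artist2", "tracks": {
    "T1": {"title": "Album 2 Song 1"}, "T2": {"title": "Album 2 Song 2"}}}
}'

# Exactly one track per album contains "Song 2", so each album appears once.
once=$(printf '%s\n' "$data" | jq -r '.[] | select(.tracks[].title | contains("Song 2")) | .album')

# Every track contains "Song", so each album is emitted once per matching track.
repeated=$(printf '%s\n' "$data" | jq -r '.[] | select(.tracks[].title | contains("Song")) | .album')

# any(generator; condition) reduces the per-track results to a single boolean.
deduped=$(printf '%s\n' "$data" | jq -r '.[] | select(any(.tracks[]; .title | contains("Song"))) | .album')

printf '%s\n---\n%s\n---\n%s\n' "$once" "$repeated" "$deduped"
```

With this data, once and deduped should each list Album1 and Album2 a single time, while repeated lists each album twice.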

@davehatton
Author

13ren, thank you very much for your clear and detailed response. I'd been tearing out my hair trying to understand this.

I have two further questions related to my example files.

Firstly, given that

.[] | select( .tracks[].title == "Album 2 Song 2" ) 

finds the target record for me, is there a way to return just the record's key (in this case A0002)?

(I can see the keys operator, but can't see how to use it to achieve this.)

Secondly, having passed back just the key, e.g. A0002, it would be great to also pass back the name of the file the record was found in, e.g. company1.json. Is there an elegant way to do this?

(My thoughts were to add a top-level key "FILENAME": "company1.json" to the input file and reference it with .FILENAME, but I get jq: error: Cannot index string with string if I do that.)

@ghost

ghost commented Feb 5, 2013

. as $in| keys[]| select( $in[.].tracks[].title == "Album 2 Song 2" )

To return the key (fieldname), you have to use that in the lookup. keys gives you the fieldnames as an array; keys[] streams the contents of this array (as with [] before).

Because a jq filter has only one input stream, and we need both the object we are addressing and the key/fieldname, we need to store one of them in a variable. In the above, the object is stored in the variable $in. Setting a variable doesn't affect the stream: the binding . as $in passes its input through unchanged, exactly like . does.
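
As a runnable check of the variable-based lookup, plus one possible alternative: to_entries turns an object into an array of {key, value} pairs, so key and value travel together and no variable is needed. This is a sketch on trimmed-down data, assuming a jq version that provides to_entries (current releases do). The shell variable names are illustrative.

```shell
data='{
  "A0001": {"album": "Album1", "tracks": {
    "T1": {"title": "Album 1 Song 1"}, "T2": {"title": "Album 1 Song 2"}}},
  "A0002": {"album": "Album2", "tracks": {
    "T1": {"title": "Album 2 Song 1"}, "T2": {"title": "Album 2 Song 2"}}}
}'

# Variable form: remember the object in $in, stream the keys, look each one up.
by_variable=$(printf '%s\n' "$data" |
  jq -r '. as $in | keys[] | select($in[.].tracks[].title == "Album 2 Song 2")')

# to_entries form: each stream item carries both .key and .value.
by_entries=$(printf '%s\n' "$data" |
  jq -r 'to_entries[] | select(.value.tracks[].title == "Album 2 Song 2") | .key')

echo "$by_variable"
echo "$by_entries"
```

Both commands should print A0002.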

I think your idea for a top-level key should work. Note that the wildcard code will attempt to look up (or "index") the field tracks of every top-level value. If one of those values is a string instead of an object, you get that error: Cannot index string with string. So one solution is to add a check for the type of the value, e.g.

. as $in| keys[]| select( ($in[.]|type=="object") and $in[.].tracks[].title == "Album 2 Song 2" )

You can factor out the $in[.], just within the boolean expression, which doesn't affect the overall result:

. as $in| keys[]| select($in[.]| (type=="object" and .tracks[].title == "Album 2 Song 2") )

Though maybe it's clearer, and more jq-like, to just filter it out in a separate step:

. as $in| keys[]| select($in[.]|type=="object") | select( $in[.].tracks[].title == "Album 2 Song 2")
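
To see the effect of the type check, here is a small sketch on a trimmed-down input that includes the extra FILENAME string. It assumes jq is on the PATH; the shell variable names (guarded, unguarded) are illustrative. The unguarded filter is expected to exit with a non-zero status once it reaches the string value, while the guarded one skips it.

```shell
data='{
  "FILENAME": "company1.json",
  "A0001": {"tracks": {"T1": {"title": "Album 1 Song 1"}}},
  "A0002": {"tracks": {"T2": {"title": "Album 2 Song 2"}}}
}'

# Without the check, indexing the FILENAME string with .tracks is an error.
if printf '%s\n' "$data" |
  jq -r '. as $in | keys[] | select($in[.].tracks[].title == "Album 2 Song 2")' >/dev/null 2>&1
then unguarded=ok
else unguarded=failed
fi

# With the check, non-object values are dropped before .tracks is looked up.
guarded=$(printf '%s\n' "$data" |
  jq -r '. as $in | keys[]
         | select($in[.] | type == "object")
         | select($in[.].tracks[].title == "Album 2 Song 2")')

echo "unguarded: $unguarded"
echo "guarded:   $guarded"
```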

BTW: type isn't documented, but noted in an issue.

As an aside, you could even put a separate string outside the object, even though the file as a whole is then not a single valid JSON document (jq accepts a stream of JSON values), like:

"myfilename"
{...}

. as $in| select(type=="object")| keys[]| select( $in[.].tracks[].title == "Album 2 Song 2" )

Finally, there is a way to do part of what you want. Although jq doesn't distinguish between different input files, it does distinguish between JSON values in a stream. The --slurp/-s flag assembles this stream into an array. If you have one JSON value per file, the array index tells you which file it came from (though not the filename). Here's some discussion.
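
A sketch of the --slurp idea, assuming one JSON object per file. The two files are trimmed-down stand-ins for company1.json and company2.json, written to a temporary directory, and the variable names are illustrative. The filter pairs each matching key with the position of its file in the slurped array.

```shell
dir=$(mktemp -d)
printf '%s\n' '{"A0001": {"tracks": {"T1": {"title": "Album 1 Song 1"}}},
                "A0002": {"tracks": {"T2": {"title": "Album 2 Song 2"}}}}' > "$dir/company1.json"
printf '%s\n' '{"A0003": {"tracks": {"T1": {"title": "Album 3 Song 1"}}},
                "A0004": {"tracks": {"T2": {"title": "Album 4 Song 2"}}}}' > "$dir/company2.json"

# -s wraps both objects in one array; $i is the index of the file's object.
hits=$(cat "$dir/company1.json" "$dir/company2.json" |
  jq -s -c 'range(length) as $i
            | .[$i]
            | keys[] as $k
            | select(.[$k].tracks[].title == "Album 2 Song 2")
            | [$i, $k]')

echo "$hits"
rm -r "$dir"
```

This should print [0,"A0002"], meaning the match came from the first file given to cat.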

BTW: I think you might be interested in this little tutorial towards the end of the docs, about more complex queries: http://stedolan.github.com/jq/manual/#VariablesandFunctions

@davehatton
Author

I'm beginning to feel like I'm taking advantage of you. If I can crack this, then I'm all set to solve my task.

If I alter the input file to something like

{
    "FILENAME": "company1.json",
    "A0001": {
        "album": "Album1", 
        "ref": "0001",
        "artist": "Artist1",
        "tracks": {
            "T1": {
                 "track": "1",
                 "title": "Album 1 Song 1",
                 "composer": "Composer Song 1"
            },
            "T2": {
                 "track": "2",
                 "title": "Album 1 Song 2",
                 "composer": "Composer Song 2"
            },
            "T3": {
                 "track": "3",
                 "title": "Album 1 Song 3",
                 "composer": "Composer Song 3"
            }
        }
    }, 
    "A0002": {
        "album": "Album2", 
        "ref": "0002",
        "artist": "Artist2",
        "tracks": {
            "T1": {
                 "track": "1",
                 "title": "Album 2 Song 1",
                 "composer": "Composer Song 1"
            },
            "T2": {
                 "track": "2",
                 "title": "Album 2 Song 2",
                 "composer": "Composer Song 2"
            },
            "T3": {
                 "track": "3",
                 "title": "Album 2 Song 3",
                 "composer": "Composer Song 3"
            }
        }
    }
}

and then use

cat company*.json | jq '. as $in | .FILENAME as $fn | keys[] | select($in[.]|type=="object") | select($in[.].tracks[].title == "Album 2 Song 2") as $res| $res, $fn'

I get this

"A0002"
"company1.json"

whoo-hoo!!!

Is this the best way to tackle this, or is there a more jq way?
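
For readers on a newer jq: releases from 1.5 onward have, as far as I know, an input_filename builtin that reports the file currently being read, which would avoid editing the data files. It only works when the files are passed to jq as arguments rather than piped in through cat. The sketch below uses trimmed-down stand-in files in a temporary directory, and the variable names are illustrative.

```shell
dir=$(mktemp -d)
printf '%s\n' '{"A0001": {"tracks": {"T1": {"title": "Album 1 Song 1"}}},
                "A0002": {"tracks": {"T2": {"title": "Album 2 Song 2"}}}}' > "$dir/company1.json"
printf '%s\n' '{"A0003": {"tracks": {"T1": {"title": "Album 3 Song 1"}}},
                "A0004": {"tracks": {"T2": {"title": "Album 4 Song 2"}}}}' > "$dir/company2.json"

# Run from inside the directory so input_filename yields the bare file names.
found=$(cd "$dir" && jq -r '. as $in
  | keys[]
  | select($in[.].tracks[].title == "Album 2 Song 2")
  | "\(.) \(input_filename)"' company1.json company2.json)

echo "$found"
rm -r "$dir"
```

This should print A0002 company1.json.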

@ghost

ghost commented Feb 6, 2013

I'm beginning to feel like I'm taking advantage of you

You can fix that by passing on your understanding to the next asker... also a tremendous way to consolidate your understanding. ;-)

@CourseraStudent58

I especially appreciated the example of how to use contains().
