Add a flatten operator that supports wildcard projections #20

Merged
merged 8 commits into from Nov 27, 2013

Conversation

Projects
None yet
2 participants
@jamesls
Owner

jamesls commented Nov 21, 2013

There have been requests to add an operator that will merge/flatten sublists. For example, given something like this:

[ [1, 2, 3], [4, 5, 6], [7, 8, 9]]

It should be possible to get:

[1, 2, 3, 4, 5, 6, 7 ,8, 9]

The proposed syntax has an updated grammar of bracket-specifier = "[" (number / "*") "]" / "[]"
When this is encountered any sublists in a list are merged into the parent list.

Updated spec/compliance tests are included.

I've also moved out the wildcard syntax [*] out of indices.json and into wildcard.json

On a practical level, this enable things like this in the AWS CLI:

$ aws ec2 describe-instances --query 'Reservations[].Instances[].[InstanceId, State.Name]'
[
    [
        "i-123",
        "stopped"
    ],
    [
        "i-123",
        "running"
    ],
    [
        "i-123",
        "stopped"
    ],
    [
        "i-123",
        "running"
    ],
    [
        "i-cc07a7fb",
        "running"
    ],
    [
        "i-123",
        "stopped"
    ]
]

Note how you just have a list of instances instead of the extra nesting you get from the reservations list using the Reservations[*].Instances[*].[InstanceId,State.Name] version.

cc @mtdowling @trevorrowe

@mtdowling

This comment has been minimized.

Show comment Hide comment
@mtdowling

mtdowling Nov 21, 2013

Contributor

Looks good.

One other key difference that might be worth mentioning is that the flatten operator projects differently than the wildcard operator:

A wildcard operator projection wraps subsequent wildcard projections and then rolls them up as each sub-projection completes thus causing each projection to be wrapped in a corresponding array. However, a flatten operator projection stops projecting results when another flatten operator is encountered. This has the same basic effect of the wildcard operator but does not add an outer wrapping array and allows the initial merging of results as each subsequent flatten operator is encountered.

For example:

Given foo[].bar[].baz and the following data:

{
  "foo": [
    {
      "bar": [{"baz": 1}, {"baz": 2}]
    },
    {
      "bar": [{"baz": 3}, {"baz": 4}]
    },
    {
      "bar": [{"baz": 5}, {"baz": 6}]
    }
  ]
}

The expected result is [1, 2, 3, 4, 5, 6].

With wildcards, foo[*].bar[*].baz, the result is

[
    [
        1,
        2
    ],
    [
        3,
        4
    ],
    [
        5,
        6
    ]
]

Without this key bit of information, it took me quite some time to figure out how to implement this.

You can see how this actually works by taking a look at the byte code generated for my interpreter for the first example and then followed by the second example:

  foo[].bar[].baz

  0: field              "foo"
  1: merge              
  2: each             5
    3: field            "bar"
  4: goto             2
  5: merge              
  6: each             9
    7: field            "baz"
  8: goto              6
  9: stop               

Notice that the loop iterating over each flatten element stops iteration and then merges before the next flatten operator. This is quite different than how normal wildcard projections occur:

  foo[*].bar[*].baz

  0: field              "foo"
  1: each             7
    2: field            "bar"
    3: each           6
      4: field          "baz"
    5: goto           3
  6: goto             1
  7: stop              
Contributor

mtdowling commented Nov 21, 2013

Looks good.

One other key difference that might be worth mentioning is that the flatten operator projects differently than the wildcard operator:

A wildcard operator projection wraps subsequent wildcard projections and then rolls them up as each sub-projection completes thus causing each projection to be wrapped in a corresponding array. However, a flatten operator projection stops projecting results when another flatten operator is encountered. This has the same basic effect of the wildcard operator but does not add an outer wrapping array and allows the initial merging of results as each subsequent flatten operator is encountered.

For example:

Given foo[].bar[].baz and the following data:

{
  "foo": [
    {
      "bar": [{"baz": 1}, {"baz": 2}]
    },
    {
      "bar": [{"baz": 3}, {"baz": 4}]
    },
    {
      "bar": [{"baz": 5}, {"baz": 6}]
    }
  ]
}

The expected result is [1, 2, 3, 4, 5, 6].

With wildcards, foo[*].bar[*].baz, the result is

[
    [
        1,
        2
    ],
    [
        3,
        4
    ],
    [
        5,
        6
    ]
]

Without this key bit of information, it took me quite some time to figure out how to implement this.

You can see how this actually works by taking a look at the byte code generated for my interpreter for the first example and then followed by the second example:

  foo[].bar[].baz

  0: field              "foo"
  1: merge              
  2: each             5
    3: field            "bar"
  4: goto             2
  5: merge              
  6: each             9
    7: field            "baz"
  8: goto              6
  9: stop               

Notice that the loop iterating over each flatten element stops iteration and then merges before the next flatten operator. This is quite different than how normal wildcard projections occur:

  foo[*].bar[*].baz

  0: field              "foo"
  1: each             7
    2: field            "bar"
    3: each           6
      4: field          "baz"
    5: goto           3
  6: goto             1
  7: stop              

@jamesls jamesls merged commit 5f9daf8 into develop Nov 27, 2013

1 check passed

default The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment