# `OPENJSON` and `CROSS APPLY`

I introduced `OPENJSON` to myself in [another notebook](https://github.com/BryanWilhite/jupyter-central/blob/master/funkykb/t-sql/openjson-and-json_query.ipynb) and this one should properly introduce `CROSS APPLY` [📖 [docs](https://learn.microsoft.com/en-us/sql/t-sql/queries/from-transact-sql?view=sql-server-ver16#left_table_source--cross--outer--apply-right_table_source)] or, more precisely, `CROSS` and `APPLY`. When `CROSS` is specified:

>…no rows are produced when the `right_table_source` is evaluated against a specified row of the `left_table_source` and returns an empty result set.

And `APPLY` reads as a direct reference to _apply_ [📖 [Wikipedia](https://en.wikipedia.org/wiki/Apply)] as it is defined in computer programming:

>In computer programming, **apply** applies a function to a list of arguments.
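
To make both halves concrete, here is a minimal sketch, assuming SQL Server 2016 or later (where `OPENJSON` is available). The table variable `@orders` and its two rows are hypothetical and exist only for illustration. `APPLY` runs `OPENJSON` once per row of the left table, and `CROSS` drops the rows for which it returns nothing:

```sql
-- Hypothetical sample rows: order B has an empty JSON array.
DECLARE @orders TABLE (ORDER_NUMBER NVARCHAR(32), PRODUCTS NVARCHAR(MAX));

INSERT INTO @orders (ORDER_NUMBER, PRODUCTS)
VALUES
    (N'A', N'["SKU1", "SKU2"]')
,   (N'B', N'[]');

-- CROSS APPLY: OPENJSON returns no rows for order B, so B is dropped.
-- Rows returned: (A, SKU1), (A, SKU2)
SELECT
    o.ORDER_NUMBER
,   p.[value] AS PRODUCT_ID
FROM
    @orders o
    CROSS APPLY OPENJSON(o.PRODUCTS) p;

-- OUTER APPLY: order B is kept, with NULL for PRODUCT_ID.
-- Rows returned: (A, SKU1), (A, SKU2), (B, NULL)
SELECT
    o.ORDER_NUMBER
,   p.[value] AS PRODUCT_ID
FROM
    @orders o
    OUTER APPLY OPENJSON(o.PRODUCTS) p;
```

`OUTER APPLY` is the other option named in the docs linked above. It is to `CROSS APPLY` roughly what `LEFT JOIN` is to `INNER JOIN`.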

A crystal-clear explanation of `CROSS APPLY` is in this video: 

<figure>
    <a href="https://www.youtube.com/watch?v=PajIaYWTzwk">
        <img alt="CROSS APPLY vs CROSS JOIN - when should I use JOINs, and when should I use APPLY?" src="https://img.youtube.com/vi/PajIaYWTzwk/maxresdefault.jpg" width="480" />
    </a>
    <p><small>CROSS APPLY vs CROSS JOIN - when should I use JOINs, and when should I use APPLY?</small></p>
</figure>

>It seems to me that CROSS APPLY can fill a certain gap when working with calculated fields in complex/nested queries, and make them simpler and more readable.
>
>—<https://stackoverflow.com/a/10976217/22944>
>


## the `Orders` JSON with a `Products` array

Recall the `Orders` JSON from my [other notebook](https://github.com/BryanWilhite/jupyter-central/blob/master/funkykb/t-sql/openjson-and-json_query.ipynb) but, in these notes, we will add an array of `Products` for every `Order`:

In [1]:
EXEC sys.sp_set_session_context
    @key = N'myJson',
    @value = '
{
    "Orders": [
        {
            "OrderNumber":"SO43659",
            "OrderDate":"2011-05-31T00:00:00",
            "AccountNumber":"AW29825",
            "Products": [
                { "ProductId": "SKU348875", "Price": 2024.9940, "Quantity": 1 },
                { "ProductId": "SKU758875", "Price": 70.3670, "Quantity": 5 }
            ]
        },
        {
            "OrderNumber":"SO43661",
            "OrderDate":"2011-06-01T00:00:00",
            "AccountNumber":"AW73565",
            "Products": [
                { "ProductId": "SKU348875", "Price": 2024.9940, "Quantity": 1 },
                { "ProductId": "SKU758875", "Price": 70.3670, "Quantity": 2 },
                { "ProductId": "SKU921234", "Price": 88.79, "Quantity": 1 },
                { "ProductId": "SKU112678", "Price": 7.99, "Quantity": 12 }
            ]
        }
    ]
}'

My [other notebook](https://github.com/BryanWilhite/jupyter-central/blob/master/funkykb/t-sql/openjson-and-json_query.ipynb) introduces the `WITH` clause. We can use what we know about the `WITH` clause to project columns from the JSON:

In [2]:
DECLARE @myJson NVARCHAR(MAX) = CONVERT(NVARCHAR(MAX), SESSION_CONTEXT(N'myJson'))

SELECT
    *
FROM
    OPENJSON(@myJson, '$.Orders')
    WITH (
        ORDER_NUMBER   NVARCHAR(32)  '$.OrderNumber'
    ,   ORDER_DATE     DATETIME      '$.OrderDate'
    ,   ACCOUNT_NUMBER NVARCHAR(32)  '$.AccountNumber'
    ,   PRODUCTS       NVARCHAR(MAX) '$.Products'
    ) orders

ORDER_NUMBER,ORDER_DATE,ACCOUNT_NUMBER,PRODUCTS
SO43659,2011-05-31 00:00:00.000,AW29825,
SO43661,2011-06-01 00:00:00.000,AW73565,


When we faithfully attempt to find our `Products` array with `$.Products` we get `NULL`, even though the property is there. The path `$.Products` points to an array rather than a scalar value, and a plain column definition in `WITH` reads only scalars. The [documentation](https://learn.microsoft.com/en-us/sql/t-sql/functions/openjson-transact-sql?view=sql-server-ver16#with_clause-1) for the `WITH` clause has commentary about this situation:

>If the path represents an object or an array, and the property can’t be found at the specified path, the function returns null in lax mode or returns an error in strict mode. This behavior is similar to the behavior of the `JSON_VALUE` function.
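
The comparison with `JSON_VALUE` can be checked in isolation. This is a small sketch with a hypothetical one-product JSON string (not the full `Orders` document), assuming the default lax path mode:

```sql
-- Hypothetical, trimmed-down JSON: one property holding an array.
DECLARE @json NVARCHAR(MAX) = N'{ "Products": [ { "ProductId": "SKU348875" } ] }';

SELECT
    -- JSON_VALUE reads scalars only; an array yields NULL in lax mode.
    JSON_VALUE(@json, '$.Products') AS VIA_JSON_VALUE
    -- JSON_QUERY reads objects and arrays; it returns the array as text.
,   JSON_QUERY(@json, '$.Products') AS VIA_JSON_QUERY;
```

With the path written as `'strict $.Products'`, `JSON_VALUE` would raise an error instead of returning `NULL`.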

We might struggle with this situation and resort to something like this:

In [3]:
DECLARE @myJson NVARCHAR(MAX) = CONVERT(NVARCHAR(MAX), SESSION_CONTEXT(N'myJson'))

SELECT
    *
FROM
    OPENJSON(@myJson, '$.Orders') orders

    CROSS APPLY

    OPENJSON(orders.[value], '$.Products') WITH (
        PRODUCT_ID NVARCHAR(32) '$.ProductId'
    ,   PRICE      DECIMAL      '$.Price'
    ,   QUANTITY   INT          '$.Quantity'
    ) products

key,value,type,PRODUCT_ID,PRICE,QUANTITY
0,"{  ""OrderNumber"":""SO43659"",  ""OrderDate"":""2011-05-31T00:00:00"",  ""AccountNumber"":""AW29825"",  ""Products"": [  { ""ProductId"": ""SKU348875"", ""Price"": 2024.9940, ""Quantity"": 1 },  { ""ProductId"": ""SKU758875"", ""Price"": 70.3670, ""Quantity"": 5 }  ]  }",5,SKU348875,2025,1
0,"{  ""OrderNumber"":""SO43659"",  ""OrderDate"":""2011-05-31T00:00:00"",  ""AccountNumber"":""AW29825"",  ""Products"": [  { ""ProductId"": ""SKU348875"", ""Price"": 2024.9940, ""Quantity"": 1 },  { ""ProductId"": ""SKU758875"", ""Price"": 70.3670, ""Quantity"": 5 }  ]  }",5,SKU758875,70,5
1,"{  ""OrderNumber"":""SO43661"",  ""OrderDate"":""2011-06-01T00:00:00"",  ""AccountNumber"":""AW73565"",  ""Products"": [  { ""ProductId"": ""SKU348875"", ""Price"": 2024.9940, ""Quantity"": 1 },  { ""ProductId"": ""SKU758875"", ""Price"": 70.3670, ""Quantity"": 2 },  { ""ProductId"": ""SKU921234"", ""Price"": 88.79, ""Quantity"": 1 },  { ""ProductId"": ""SKU112678"", ""Price"": 7.99, ""Quantity"": 12 }  ]  }",5,SKU348875,2025,1
1,"{  ""OrderNumber"":""SO43661"",  ""OrderDate"":""2011-06-01T00:00:00"",  ""AccountNumber"":""AW73565"",  ""Products"": [  { ""ProductId"": ""SKU348875"", ""Price"": 2024.9940, ""Quantity"": 1 },  { ""ProductId"": ""SKU758875"", ""Price"": 70.3670, ""Quantity"": 2 },  { ""ProductId"": ""SKU921234"", ""Price"": 88.79, ""Quantity"": 1 },  { ""ProductId"": ""SKU112678"", ""Price"": 7.99, ""Quantity"": 12 }  ]  }",5,SKU758875,70,2
1,"{  ""OrderNumber"":""SO43661"",  ""OrderDate"":""2011-06-01T00:00:00"",  ""AccountNumber"":""AW73565"",  ""Products"": [  { ""ProductId"": ""SKU348875"", ""Price"": 2024.9940, ""Quantity"": 1 },  { ""ProductId"": ""SKU758875"", ""Price"": 70.3670, ""Quantity"": 2 },  { ""ProductId"": ""SKU921234"", ""Price"": 88.79, ""Quantity"": 1 },  { ""ProductId"": ""SKU112678"", ""Price"": 7.99, ""Quantity"": 12 }  ]  }",5,SKU921234,89,1
1,"{  ""OrderNumber"":""SO43661"",  ""OrderDate"":""2011-06-01T00:00:00"",  ""AccountNumber"":""AW73565"",  ""Products"": [  { ""ProductId"": ""SKU348875"", ""Price"": 2024.9940, ""Quantity"": 1 },  { ""ProductId"": ""SKU758875"", ""Price"": 70.3670, ""Quantity"": 2 },  { ""ProductId"": ""SKU921234"", ""Price"": 88.79, ""Quantity"": 1 },  { ""ProductId"": ""SKU112678"", ""Price"": 7.99, ""Quantity"": 12 }  ]  }",5,SKU112678,8,12


The SQL above is cross-applying directly on `orders.[value]`. My [other notebook](https://github.com/BryanWilhite/jupyter-central/blob/master/funkykb/t-sql/openjson-and-json_query.ipynb) explains that `[value]` is one of the three default columns projected from `OPENJSON`. This approach, therefore, gives us only the default columns of `orders` (`key`, `value` and `type`) next to the “strongly typed” columns of `products`: the order fields stay buried in the raw JSON of `[value]`. We should not have to accept this tradeoff.
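
To see how awkward the tradeoff gets, here is a sketch of one way to recover typed order columns while still cross-applying on `orders.[value]`: pull each field out with `JSON_VALUE`. The `@json` variable is a trimmed-down, hypothetical copy of the `Orders` JSON so that the cell runs on its own, and the column choices are assumptions made for illustration:

```sql
-- Trimmed-down, hypothetical copy of the Orders JSON (one order, two products).
DECLARE @json NVARCHAR(MAX) = N'
{
    "Orders": [
        {
            "OrderNumber": "SO43659",
            "OrderDate": "2011-05-31T00:00:00",
            "Products": [
                { "ProductId": "SKU348875", "Quantity": 1 },
                { "ProductId": "SKU758875", "Quantity": 5 }
            ]
        }
    ]
}';

SELECT
    -- Every order field needs its own JSON_VALUE call (and its own conversion).
    JSON_VALUE(orders.[value], '$.OrderNumber')                       AS ORDER_NUMBER
,   CONVERT(DATETIME, JSON_VALUE(orders.[value], '$.OrderDate'), 126) AS ORDER_DATE
,   products.[PRODUCT_ID]
,   products.[QUANTITY]
FROM
    OPENJSON(@json, '$.Orders') orders

    CROSS APPLY

    OPENJSON(orders.[value], '$.Products')
    WITH (
        PRODUCT_ID NVARCHAR(32) '$.ProductId'
    ,   QUANTITY   INT          '$.Quantity'
    ) products;
```

This works, but the order columns are declared one way and the product columns another, which is the asymmetry we want to avoid.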

## the importance of the `AS JSON` option in `WITH`

We do not have to make the tradeoff mentioned above because Microsoft provides the `AS JSON` option in `WITH`:

>Use the `AS JSON` option in a column definition to specify that the referenced property contains an inner JSON object or array. 

We can go back to our previous column definition, `PRODUCTS NVARCHAR(MAX) '$.Products'`, which already has the `NVARCHAR(MAX)` type that `AS JSON` requires, and this time add the `AS JSON` option:

In [4]:
DECLARE @myJson NVARCHAR(MAX) = CONVERT(NVARCHAR(MAX), SESSION_CONTEXT(N'myJson'))

SELECT
    *
FROM
    OPENJSON(@myJson, '$.Orders')
    WITH (
        ORDER_NUMBER   NVARCHAR(32)  '$.OrderNumber'
    ,   ORDER_DATE     DATETIME      '$.OrderDate'
    ,   ACCOUNT_NUMBER NVARCHAR(32)  '$.AccountNumber'
    ,   PRODUCTS       NVARCHAR(MAX) '$.Products'      AS JSON
    ) orders

ORDER_NUMBER,ORDER_DATE,ACCOUNT_NUMBER,PRODUCTS
SO43659,2011-05-31 00:00:00.000,AW29825,"[  { ""ProductId"": ""SKU348875"", ""Price"": 2024.9940, ""Quantity"": 1 },  { ""ProductId"": ""SKU758875"", ""Price"": 70.3670, ""Quantity"": 5 }  ]"
SO43661,2011-06-01 00:00:00.000,AW73565,"[  { ""ProductId"": ""SKU348875"", ""Price"": 2024.9940, ""Quantity"": 1 },  { ""ProductId"": ""SKU758875"", ""Price"": 70.3670, ""Quantity"": 2 },  { ""ProductId"": ""SKU921234"", ""Price"": 88.79, ""Quantity"": 1 },  { ""ProductId"": ""SKU112678"", ""Price"": 7.99, ""Quantity"": 12 }  ]"


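Instead of cross-applying on `orders.[value]` we can use the “strongly typed” column `orders.[PRODUCTS]`. Because that column already holds the `Products` array, the second `OPENJSON` call needs no path argument: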

In [5]:
DECLARE @myJson NVARCHAR(MAX) = CONVERT(NVARCHAR(MAX), SESSION_CONTEXT(N'myJson'))

SELECT
    orders.[ORDER_NUMBER]
,   orders.[ORDER_DATE]
,   orders.[ACCOUNT_NUMBER]
,   products.[PRODUCT_ID]
,   products.[PRICE]
,   products.[QUANTITY]
FROM
    OPENJSON(@myJson, '$.Orders')
    WITH (
        ORDER_NUMBER   NVARCHAR(32)  '$.OrderNumber'
    ,   ORDER_DATE     DATETIME      '$.OrderDate'
    ,   ACCOUNT_NUMBER NVARCHAR(32)  '$.AccountNumber'
    ,   PRODUCTS       NVARCHAR(MAX) '$.Products'      AS JSON
    ) orders

    CROSS APPLY

    OPENJSON(orders.[PRODUCTS])
    WITH (
        PRODUCT_ID NVARCHAR(32) '$.ProductId'
    ,   PRICE      DECIMAL      '$.Price'
    ,   QUANTITY   INT          '$.Quantity'
    ) products


ORDER_NUMBER,ORDER_DATE,ACCOUNT_NUMBER,PRODUCT_ID,PRICE,QUANTITY
SO43659,2011-05-31 00:00:00.000,AW29825,SKU348875,2025,1
SO43659,2011-05-31 00:00:00.000,AW29825,SKU758875,70,5
SO43661,2011-06-01 00:00:00.000,AW73565,SKU348875,2025,1
SO43661,2011-06-01 00:00:00.000,AW73565,SKU758875,70,2
SO43661,2011-06-01 00:00:00.000,AW73565,SKU921234,89,1
SO43661,2011-06-01 00:00:00.000,AW73565,SKU112678,8,12
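
The `PRICE` values above come back rounded (`2025`, `70`, `89`, `8`) because `DECIMAL` without precision and scale defaults to `DECIMAL(18, 0)`. Here is a small sketch of the difference, using a hypothetical one-product array and the assumed type `DECIMAL(19, 4)` (any scale of four or more would keep these prices intact):

```sql
-- Hypothetical one-product array, taken from the first order above.
DECLARE @json NVARCHAR(MAX) = N'[ { "ProductId": "SKU348875", "Price": 2024.9940, "Quantity": 1 } ]';

SELECT
    *
FROM
    OPENJSON(@json)
    WITH (
        PRODUCT_ID    NVARCHAR(32)   '$.ProductId'
    ,   PRICE_DEFAULT DECIMAL        '$.Price'     -- DECIMAL(18, 0): rounds to 2025
    ,   PRICE_SCALED  DECIMAL(19, 4) '$.Price'     -- keeps 2024.9940
    ,   QUANTITY      INT            '$.Quantity'
    ) products;
```

Both price columns read the same `$.Price` path, so the only difference between them is the declared type.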


## <!-- -->

🐙🐱[BryanWilhite](https://github.com/BryanWilhite)