diff --git a/CHANGELOG.md b/CHANGELOG.md index ca68a76e..034b0386 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,3 +1,12 @@ +#### 2.1.0 Milestone Release + +- New syntax (`?:` default operator) supports fallback to RHS if the LHS is Boolean equivalent to false (PR #784) +- New syntax (`??` coalescing operator) supports fallback to RHS if the LHS is non-existent (PR #784) +- Improve regex generation for DateTime parser (PR #728) +- Truncate fractional part of numeric argument of `$pad` function (PR #729) +- Await array elements (PR #747) +- Various documentation fixes and improvements + #### 2.0.6 Maintenance Release - Protect __evaluate_entry and __evaluate_exit callbacks (PR #700) diff --git a/package.json b/package.json index e3abe6e6..120d8e56 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "jsonata", - "version": "2.0.6", + "version": "2.1.0", "description": "JSON query and transformation language", "module": "jsonata.js", "main": "jsonata.js", diff --git a/website/versioned_docs/version-2.1.0/construction.md b/website/versioned_docs/version-2.1.0/construction.md new file mode 100644 index 00000000..91a45238 --- /dev/null +++ b/website/versioned_docs/version-2.1.0/construction.md @@ -0,0 +1,106 @@ +--- +id: version-2.1.0-construction +title: Building result structures +sidebar_label: Result Structures +original_id: construction +--- + +So far, we have discovered how to extract values from a JSON document, and how to manipulate the data using numeric, string and other operators. It is useful to be able to specify how this processed data is presented in the output. + +## Array constructors + +As previously observed, when a location path matches multiple values in the input document, these values are returned as an array. The values might be objects or arrays, and as such will have their own structure, but the _matched values_ themselves are at the top level in the resultant array. 
+ +It is possible to build extra structure into the resultant array by specifying the construction of arrays (or [objects](#object-constructors)) within the location path expression. At any point in a location path where a field reference is expected, a pair of square brackets `[]` can be inserted to specify that the results of the expression within those brackets should be contained within a new array in the output. Commas are used to separate multiple expressions within the array constructor. + +Array constructors can also be used within location paths for making multiple selections without the broad brush use of wildcards. + +__Examples__ + +- The four email addresses are returned in a flat array. +
+
Email.address
+
[ + "fred.smith@my-work.com", + "fsmith@my-work.com", + "freddy@my-social.com", + "frederic.smith@very-serious.com" +]
+
+ +- Each email object generates an array of addresses. +
+
Email.[address]
+
[ + [ "fred.smith@my-work.com", "fsmith@my-work.com" ], + [ "freddy@my-social.com", "frederic.smith@very-serious.com" ] +]
+
+ +- Selects the `City` value of both `Address` and `Alternative.Address` objects. +
+
[Address, Other.`Alternative.Address`].City
+
[ "Winchester", "London" ]
+
+ +## Object constructors + +In a similar manner to the way arrays can be constructed, JSON objects can also be constructed in the output. At any point in a location path where a field reference is expected, a pair of braces `{}` can be inserted, containing key/value pairs separated by commas, with each key and value separated by a colon: `{key1: value1, key2: value2}`. The keys and values can either be literals or can be expressions. The key must either be a string or an expression that evaluates to a string. + +When an object constructor follows an expression that selects multiple values, the object constructor will create a single object that contains a key/value pair for each of those context values. If an array of objects is required (one for each context value), then the object constructor should immediately follow the dot '.' operator. + +__Examples__ + +- Produces an array of objects (one for each phone). +
+
Phone.{type: number}
+
[ + { "home": "0203 544 1234" }, + { "office": "01962 001234" }, + { "office": "01962 001235" }, + { "mobile": "077 7700 1234" } +]
+
+ +- Combines the key/value pairs into a single object. See [Grouping using object key expression](sorting-grouping.md) for more details. +
+
Phone{type: number}
+
{ + "home": "0203 544 1234", + "office": [ + "01962 001234", + "01962 001235" + ], + "mobile": "077 7700 1234" +}
+
+ +- Combines the key/value pairs into a single object. In this case, for consistency, all numbers are grouped into arrays. See [Singleton array and value equivalence](predicate.md#singleton-array-and-value-equivalence) for more details. +
+
Phone{type: number[]}
+
{ + "home": [ + "0203 544 1234" + ], + "office": [ + "01962 001234", + "01962 001235" + ], + "mobile": [ + "077 7700 1234" + ] +}
+
+ + +## JSON literals + +The array and object constructors use the standard JSON syntax for JSON arrays and JSON objects. In addition to this, values of the other JSON data types can be entered into an expression using their native JSON syntax: +- strings - `"hello world"` +- numbers - `34.5` +- Booleans - `true` or `false` +- nulls - `null` +- objects - `{"key1": "value1", "key2": "value2"}` +- arrays - `["value1", "value2"]` + +__JSONata is a superset of JSON.__ This means that any valid JSON document is also a valid JSONata expression. This property allows you to use a JSON document as a template for the desired output, and then replace parts of it with expressions to insert data into the output from the input document. diff --git a/website/versioned_docs/version-2.1.0/embedding-extending.md b/website/versioned_docs/version-2.1.0/embedding-extending.md new file mode 100644 index 00000000..4969f5e6 --- /dev/null +++ b/website/versioned_docs/version-2.1.0/embedding-extending.md @@ -0,0 +1,175 @@ +--- +id: version-2.1.0-embedding-extending +title: Embedding and Extending JSONata +sidebar_label: Embedding and Extending JSONata +original_id: embedding-extending +--- + +## API + +### jsonata(str) + +Parse a string `str` as a JSONata expression and return a compiled JSONata expression object. + +```javascript +var expression = jsonata("$sum(example.value)"); +``` + +If the expression is not valid JSONata, an `Error` is thrown containing information about the nature of the syntax error, for example: + +``` +{ + code: "S0202", + stack: "...", + position: 16, + token: "}", + value: "]", + message: "Syntax error: expected ']' got '}'" +} +``` + +`expression` has three methods: + +### expression.evaluate(input[, bindings[, callback]]) + +Run the compiled JSONata expression against object `input` and return the result as a new object. 
+ +```javascript +var result = await expression.evaluate({example: [{value: 4}, {value: 7}, {value: 13}]}); +``` + +`input` should be a JavaScript value such as would be returned from `JSON.parse()`. If `input` could not have been parsed from a JSON string (is circular, contains functions, ...), `evaluate`'s behaviour is not defined. `result` is a new JavaScript value suitable for `JSON.stringify()`ing. + +`bindings`, if present, contain variable names and values (including functions) to be bound: + +```javascript +await jsonata("$a + $b()").evaluate({}, {a: 4, b: () => 78}); +// returns 82 +``` + +`expression.evaluate()` may throw a run-time `Error`: + +```javascript +var expression = jsonata("$notafunction()"); // OK, valid JSONata +await expression.evaluate({}); // Throws +``` + +The `Error` contains information about the nature of the run-time error, for example: + +``` +{ + code: "T1006", + stack: "...", + position: 14, + token: "notafunction", + message: "Attempted to invoke a non-function" +} +``` + +If `callback(err, value)` is supplied, `expression.evaluate()` returns `undefined`, the expression is run asynchronously and the `Error` or result is passed to `callback`. + +```javascript +await jsonata("7 + 12").evaluate({}, {}, (error, result) => { + if(error) { + console.error(error); + return; + } + console.log("Finished with", result); +}); +console.log("Started"); + +// Prints "Started", then "Finished with 19" +``` + +### expression.assign(name, value) + +Permanently binds a value to a name in the expression, similar to how `bindings` worked above. Modifies `expression` in place and returns `undefined`. Useful in a JSONata expression factory. 
+ +```javascript +var expression = jsonata("$a + $b()"); +expression.assign("a", 4); +expression.assign("b", () => 1); + +await expression.evaluate({}); // 5 +``` + +Note that the `bindings` argument in the `expression.evaluate()` call clobbers these values: + +```javascript +await expression.evaluate({}, {a: 109}); // 110 +``` + +### expression.registerFunction(name, implementation[, signature]) + +Permanently binds a function to a name in the expression. + +```javascript +var expression = jsonata("$greet()"); +expression.registerFunction("greet", () => "Hello world"); + +await expression.evaluate({}); // "Hello world" +``` + +You can do this using `expression.assign` or `bindings` in `expression.evaluate`, but `expression.registerFunction` allows you to specify a function `signature`. This is a terse string which tells JSONata the expected input argument types and return value type of the function. JSONata raises a run-time error if the actual input argument types do not match (the return value type is not checked yet). + +```javascript +var expression = jsonata("$add(61, 10005)"); +expression.registerFunction("add", (a, b) => a + b, "<nn:n>"); + +await expression.evaluate({}); // 10066 +``` + +### Function signature syntax + +A function signature is a string of the form `<params:return>`. `params` is a sequence of type symbols, each one representing an input argument's type. `return` is a single type symbol representing the return value type. + +Type symbols work as follows: + +Simple types: + +- `b` - Boolean +- `n` - number +- `s` - string +- `l` - `null` + +Complex types: + +- `a` - array +- `o` - object +- `f` - function + +Union types: + +- `(sao)` - string, array or object +- `(o)` - same as `o` +- `u` - equivalent to `(bnsl)` i.e. Boolean, number, string or `null` +- `j` - any JSON type. Equivalent to `(bnsloa)` i.e. Boolean, number, string, `null`, object or array, but not function +- `x` - any type. 
Equivalent to `(bnsloaf)` + +Parametrised types: + +- `a<s>` - array of strings +- `a<x>` - array of values of any type + +Some examples of signatures of built-in JSONata functions: + +- `$count` has signature `<a:n>`; it accepts an array and returns a number. +- `$append` has signature `<aa:a>`; it accepts two arrays and returns an array. +- `$sum` has signature `<a<n>:n>`; it accepts an array of numbers and returns a number. +- `$reduce` has signature `<fa<j>:j>`; it accepts a reducer function `f` and an `a<j>` (array of JSON objects) and returns a JSON object. + +Each type symbol may also have *options* applied. + +- `+` - one or more arguments of this type + - E.g. `$zip` has signature `<a+>`; it accepts one array, or two arrays, or three arrays, or... +- `?` - optional argument + - E.g. `$join` has signature `<a<s>s?:s>`; it accepts an array of strings and an optional joiner string which defaults to the empty string. It returns a string. +- `-` - if this argument is missing, use the context value ("focus"). + - E.g. `$length` has signature `<s-:n>`; it can be called as `$length(OrderID)` (one argument) but equivalently as `OrderID.$length()`. + +### Writing higher-order function extensions + +It is possible to write an extension function that takes one or more functions in its list of arguments and/or returns + a function as its return value. + + diff --git a/website/versioned_docs/version-2.1.0/expressions.md b/website/versioned_docs/version-2.1.0/expressions.md new file mode 100644 index 00000000..810292fb --- /dev/null +++ b/website/versioned_docs/version-2.1.0/expressions.md @@ -0,0 +1,115 @@ +--- +id: version-2.1.0-expressions +title: Manipulating data with functions and expressions +sidebar_label: Functions and Expressions +original_id: expressions +--- + +## String expressions + +Path expressions that point to a string value will return that value. + +String literals can also be created by enclosing the +sequence of characters in quotes. 
Either double quotes `"` or single quotes `'` can be used, provided the same quote type is +used for the start and end of the string literal. Single quote characters may be included within a double quoted string and +_vice versa_ without escaping. Characters within the string literal may be escaped using the same format +as [JSON strings](https://tools.ietf.org/html/rfc7159#section-7). + +Strings can be combined using the concatenation operator `&`. This is an infix operator and will join the two strings +returned by the expressions either side of it. This is the only operator that will attempt to typecast its operands to +the expected (string) type. + +__Examples__ + +- Concatenate `FirstName` followed by space followed by `Surname` +
+
FirstName & ' ' & Surname
+
"Fred Smith"
+
+ +- Concatenates the `Street` and `City` from the `Address` object with a comma separator. Note the use of [parentheses](composition.md#parenthesized-expressions-and-blocks) +
+
Address.(Street & ', ' & City)
+
"Hursley Park, Winchester"
+
+ +- Casts the operands to strings, if necessary +
+
5&0&true
+
"50true"
+
+ + + +## Numeric expressions + +Path expressions that point to a number value will return that value. + +Numeric literals can also be created using the same syntax as [JSON numbers](https://tools.ietf.org/html/rfc7159#section-6). + +Numbers can be combined using the usual mathematical operators to produce a resulting number. Supported operators: +- `+` addition +- `-` subtraction +- `*` multiplication +- `/` division +- `%` remainder (modulo) + +__Examples__ + +Consider the following JSON document: +``` +{ + "Numbers": [1, 2.4, 3.5, 10, 20.9, 30] +} +``` + +| Expression | Output | Comments +| ---------- | ------ |----| +| `Numbers[0] + Numbers[1]` | 3.4 |Adding 2 prices| +| `Numbers[0] - Numbers[4]` | -19.9 | Subtraction | +| `Numbers[0] * Numbers[5]` | 30 |Multiplying price by quantity| +| `Numbers[0] / Numbers[4]` | 0.04784688995215 |Division| +| `Numbers[2] % Numbers[5]` | 3.5 |Modulo operator| + + +## Comparison expressions + +Often used in predicates, for comparison of two values. Returns Boolean `true` or `false`. Supported operators: + +- `=` equals +- `!=` not equals +- `<` less than +- `<=` less than or equal +- `>` greater than +- `>=` greater than or equal +- `in` value is contained in an array + + +__Examples__ + +| Expression | Output | Comments +| ---------- | ------ |----| +| `Numbers[0] = Numbers[5]` | false |Equality | +| `Numbers[0] != Numbers[4]` | true | Inequality | +| `Numbers[1] < Numbers[5]` | true |Less than| +| `Numbers[1] <= Numbers[5]` | true |Less than or equal| +| `Numbers[2] > Numbers[4]` | false |Greater than| +| `Numbers[2] >= Numbers[4]` | false |Greater than or equal| +| `"01962 001234" in Phone.number` | true | Value is contained in| + +## Boolean expressions + +Used to combine Boolean results, often to support more sophisticated predicate expressions. Supported operators: + +- `and` +- `or` + +Note that `not` is supported as a function, not an operator. 
+ +__Examples__ + +| Expression | Output | Comments +| ---------- | ------ |----| +| `(Numbers[2] != 0) and (Numbers[5] != Numbers[1])` | true |`and` operator | +| `(Numbers[2] != 0) or (Numbers[5] = Numbers[1])` | true | `or` operator | + diff --git a/website/versioned_docs/version-2.1.0/higher-order-functions.md b/website/versioned_docs/version-2.1.0/higher-order-functions.md new file mode 100644 index 00000000..34e1c1da --- /dev/null +++ b/website/versioned_docs/version-2.1.0/higher-order-functions.md @@ -0,0 +1,164 @@ +--- +id: version-2.1.0-higher-order-functions +title: Higher order functions +sidebar_label: Higher Order Functions +original_id: higher-order-functions +--- + +## `$map()` +__Signature:__ `$map(array, function)` + +If the input argument is an array of 2 or more elements, returns an array containing the results of applying the `function` parameter to each value in the `array` parameter. + +``` +$map([1,2,3], function($v) { $v * 2 }) +``` + +evaluates to + +``` +[ 2, 4, 6 ] +``` + +If the input argument is an array with 1 element, returns the single result of applying the `function` parameter to each value in the `array` parameter. + +``` +$map([2], function($v) { $v * 2 }) +``` + +evaluates to + +``` +4 +``` + +If the input argument is an empty array, returns nothing (represented in Javascript as `undefined`) + +The function that is supplied as the second parameter must have the following signature: + +`function(value [, index [, array]])` + +Each value in the input array is passed in as the first parameter in the supplied function. The index (position) of that value in the input array is passed in as the second parameter, if specified. The whole input array is passed in as the third parameter, if specified. 
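
The result-shape rules described for `$map()` above (an array for two or more elements, the bare result for one element, nothing for an empty array) can be sketched in plain JavaScript. This is only an illustrative emulation of the documented behaviour: `mapLike` is a hypothetical helper, not part of the `jsonata` package, and it assumes the callback always returns a value.

```javascript
// Hypothetical helper (not a jsonata API): mimics the documented result shape of $map.
function mapLike(array, fn) {
  // The callback receives (value, index, array), matching the documented signature.
  const results = array.map((value, index, whole) => fn(value, index, whole));
  if (results.length === 0) {
    return undefined; // empty input: "nothing", represented as undefined
  }
  if (results.length === 1) {
    return results[0]; // single element: the bare result, not a one-element array
  }
  return results; // two or more elements: an array of results
}

console.log(mapLike([1, 2, 3], (v) => v * 2));
console.log(mapLike([2], (v) => v * 2));
console.log(mapLike([], (v) => v * 2));
```

Callers that always need an array can wrap the result themselves, which is the reason the singleton case is worth knowing about.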
+ +__Examples__ +- `$map([1..5], $string)` => `["1", "2", "3", "4", "5"]` + +With user-defined (lambda) function: +``` +$map(Email.address, function($v, $i, $a) { + 'Item ' & ($i+1) & ' of ' & $count($a) & ': ' & $v +}) +``` + +evaluates to: + +``` +[ + "Item 1 of 4: fred.smith@my-work.com", + "Item 2 of 4: fsmith@my-work.com", + "Item 3 of 4: freddy@my-social.com", + "Item 4 of 4: frederic.smith@very-serious.com" +] +``` + +## `$filter()` +__Signature:__ `$filter(array, function)` + +Returns an array containing only the values in the `array` parameter that satisfy the `function` predicate (i.e. `function` returns Boolean `true` when passed the value). + +The function that is supplied as the second parameter must have the following signature: + +`function(value [, index [, array]])` + +Each value in the input array is passed in as the first parameter in the supplied function. The index (position) of that value in the input array is passed in as the second parameter, if specified. The whole input array is passed in as the third parameter, if specified. + +__Example__ +The following expression returns all the products whose price is higher than average: +``` +$filter(Account.Order.Product, function($v, $i, $a) { + $v.Price > $average($a.Price) +}) +``` + +## `$single()` +__Signature:__ `$single(array, function)` + +Returns the one and only value in the `array` parameter that satisfies the `function` predicate (i.e. `function` returns Boolean `true` when passed the value). Throws an exception if the number of matching values is not exactly one. + +The function that is supplied as the second parameter must have the following signature: + +`function(value [, index [, array]])` + +Each value in the input array is passed in as the first parameter in the supplied function. The index (position) of that value in the input array is passed in as the second parameter, if specified. The whole input array is passed in as the third parameter, if specified. 
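
The "exactly one match" contract of `$single()` described above can be sketched in plain JavaScript. `singleLike` is a hypothetical name used only for illustration, not a function of the `jsonata` package; the sketch assumes the predicate returns a strict Boolean and does not model any coercion of other values.

```javascript
// Hypothetical helper (not a jsonata API): mimics the documented contract of $single.
function singleLike(array, predicate) {
  // Keep only the values for which the predicate returns Boolean true.
  const matches = array.filter(
    (value, index, whole) => predicate(value, index, whole) === true
  );
  if (matches.length !== 1) {
    // Zero matches and multiple matches are both errors.
    throw new Error("expected exactly one match, found " + matches.length);
  }
  return matches[0];
}

const products = [
  { SKU: "0406654608", name: "Bowler Hat" },
  { SKU: "0406634348", name: "Trilby hat" }
];
console.log(singleLike(products, (p) => p.SKU === "0406654608").name);
```

Failing loudly on zero or several matches is what distinguishes this from a filter followed by taking the first element. The product data here is made up for the sketch.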
+ +__Example__ +The following expression returns the product in the order whose SKU is `"0406654608"`: +``` +$single(Account.Order.Product, function($v, $i, $a) { + $v.SKU = "0406654608" +}) +``` + +## `$reduce()` +__Signature:__ `$reduce(array, function [, init])` + +Returns an aggregated value derived from applying the `function` parameter successively to each value in `array` in combination with the result of the previous application of the function. + +The `function` must accept at least two arguments, and behaves like an infix operator between each value within the `array`. The signature of this supplied function must be of the form: + +`myfunc($accumulator, $value[, $index[, $array]])` + +__Example__ + +``` +( + $product := function($i, $j){$i * $j}; + $reduce([1..5], $product) +) +``` + +This multiplies all the values together in the array `[1..5]` to return `120`. + +If the optional `init` parameter is supplied, then that value is used as the initial value in the aggregation (fold) process. If not supplied, the initial value is the first value in the `array` parameter. + +## `$sift()` +__Signature:__ `$sift(object, function)` + +Returns an object that contains only the key/value pairs from the `object` parameter that satisfy the predicate `function` passed in as the second parameter. + +If `object` is not specified, then the context value is used as the value of `object`. It is an error if `object` is not an object. + +The function that is supplied as the second parameter must have the following signature: + +`function(value [, key [, object]])` + +Each value in the input object is passed in as the first parameter in the supplied function. The key (property name) of that value in the input object is passed in as the second parameter, if specified. The whole input object is passed in as the third parameter, if specified. 
+ +__Example__ + +``` +Account.Order.Product.$sift(function($v, $k) {$k ~> /^Product/}) +``` + +This sifts each of the `Product` objects such that they only contain the fields whose keys start with the string "Product" (using a regex). This example returns: + +``` +[ + { + "Product Name": "Bowler Hat", + "ProductID": 858383 + }, + { + "Product Name": "Trilby hat", + "ProductID": 858236 + }, + { + "Product Name": "Bowler Hat", + "ProductID": 858383 + }, + { + "ProductID": 345664, + "Product Name": "Cloak" + } +] +``` diff --git a/website/versioned_docs/version-2.1.0/numeric-functions.md b/website/versioned_docs/version-2.1.0/numeric-functions.md new file mode 100644 index 00000000..78e0ebc6 --- /dev/null +++ b/website/versioned_docs/version-2.1.0/numeric-functions.md @@ -0,0 +1,174 @@ +--- +id: version-2.1.0-numeric-functions +title: Numeric functions +sidebar_label: Numeric Functions +original_id: numeric-functions +--- + +## `$number()` +__Signature:__ `$number(arg)` + +Casts the `arg` parameter to a number using the following casting rules + - Numbers are unchanged + - Strings that contain a sequence of characters that represent a legal JSON number are converted to that number + - Hexadecimal numbers start with `0x`, Octal numbers with `0o`, binary numbers with `0b` + - Boolean `true` casts to `1`, Boolean `false` casts to `0` + - All other values cause an error to be thrown. + +If `arg` is not specified (i.e. this function is invoked with no arguments), then the context value is used as the value of `arg`. + +__Examples__ +- `$number("5")` => `5` +- `$number("0x12")` => `18` +- `["1", "2", "3", "4", "5"].$number()` => `[1, 2, 3, 4, 5]` + + +## `$abs()` +__Signature:__ `$abs(number)` + +Returns the absolute value of the `number` parameter, i.e. if the number is negative, it returns the positive value. + +If `number` is not specified (i.e. this function is invoked with no arguments), then the context value is used as the value of `number`. 
+ +__Examples__ +- `$abs(5)` => `5` +- `$abs(-5)` => `5` + +## `$floor()` +__Signature:__ `$floor(number)` + +Returns the value of `number` rounded down to the nearest integer that is less than or equal to `number`. + +If `number` is not specified (i.e. this function is invoked with no arguments), then the context value is used as the value of `number`. + +__Examples__ +- `$floor(5)` => `5` +- `$floor(5.3)` => `5` +- `$floor(5.8)` => `5` +- `$floor(-5.3)` => `-6` + + +## `$ceil()` +__Signature:__ `$ceil(number)` + +Returns the value of `number` rounded up to the nearest integer that is greater than or equal to `number`. + +If `number` is not specified (i.e. this function is invoked with no arguments), then the context value is used as the value of `number`. + +__Examples__ +- `$ceil(5)` => `5` +- `$ceil(5.3)` => `6` +- `$ceil(5.8)` => `6` +- `$ceil(-5.3)` => `-5` + + +## `$round()` +__Signature:__ `$round(number [, precision])` + +Returns the value of the `number` parameter rounded to the number of decimal places specified by the optional `precision` parameter. + +The `precision` parameter (which must be an integer) specifies the number of decimal places to be present in the rounded number. If `precision` is not specified then it defaults to the value `0` and the number is rounded to the nearest integer. If `precision` is negative, then its value specifies which column to round to on the left side of the decimal place. + +This function uses the [Round half to even](https://en.wikipedia.org/wiki/Rounding#Round_half_to_even) strategy to decide which way to round numbers that fall exactly between two candidates at the specified precision. This strategy is commonly used in financial calculations and is the default rounding mode in IEEE 754. 
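
The round-half-to-even rule described above can be illustrated with a small plain JavaScript sketch. This is an emulation written for this explanation, not the code used inside JSONata: `roundHalfEven` and `shiftDecimal` are hypothetical names, and moving the decimal point through exponent notation is an assumption made here to avoid binary floating-point noise.

```javascript
// Hypothetical helper: move the decimal point by `places` using exponent notation.
function shiftDecimal(x, places) {
  const [mantissa, exponent = "0"] = String(x).split("e");
  return Number(mantissa + "e" + (Number(exponent) + places));
}

// Hypothetical sketch of round-half-to-even at a given decimal precision.
function roundHalfEven(number, precision = 0) {
  const shifted = shiftDecimal(number, precision);
  let rounded = Math.round(shifted); // Math.round sends exact ties toward +Infinity
  const isTie = Math.abs(shifted - Math.floor(shifted)) === 0.5;
  if (isTie && rounded % 2 !== 0) {
    rounded -= 1; // on an exact tie, step down to the even neighbour
  }
  return shiftDecimal(rounded, -precision);
}

console.log(roundHalfEven(11.5));        // tie, rounds up to the even 12
console.log(roundHalfEven(12.5));        // tie, rounds down to the even 12
console.log(roundHalfEven(125, -1));     // tie at the tens column, gives 120
console.log(roundHalfEven(123.456, 2));  // not a tie, ordinary rounding
```

Only exact ties are treated specially; every other value rounds to its nearest neighbour as usual.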
+ +__Examples__ +- `$round(123.456)` => `123` +- `$round(123.456, 2)` => `123.46` +- `$round(123.456, -1)` => `120` +- `$round(123.456, -2)` => `100` +- `$round(11.5)` => `12` +- `$round(12.5)` => `12` +- `$round(125, -1)` => `120` + +## `$power()` +__Signature:__ `$power(base, exponent)` + +Returns the value of `base` raised to the power of `exponent` (base<sup>exponent</sup>). + +If `base` is not specified (i.e. this function is invoked with one argument), then the context value is used as the value of `base`. + +An error is thrown if the values of `base` and `exponent` lead to a value that cannot be represented as a JSON number (e.g. Infinity, complex numbers). + +__Examples__ +- `$power(2, 8)` => `256` +- `$power(2, 0.5)` => `1.414213562373` +- `$power(2, -2)` => `0.25` + +## `$sqrt()` +__Signature:__ `$sqrt(number)` + +Returns the square root of the value of the `number` parameter. + +If `number` is not specified (i.e. this function is invoked with no arguments), then the context value is used as the value of `number`. + +An error is thrown if the value of `number` is negative. + +__Examples__ +- `$sqrt(4)` => `2` +- `$sqrt(2)` => `1.414213562373` + +## `$random()` +__Signature:__ `$random()` + +Returns a pseudo random number greater than or equal to zero and less than one (0 ≤ n < 1). + +__Examples__ +- `$random()` => `0.7973541067127` +- `$random()` => `0.4029142127028` +- `$random()` => `0.6558078550072` + + +## `$formatNumber()` +__Signature:__ `$formatNumber(number, picture [, options])` + +Casts the `number` to a string and formats it to a decimal representation as specified by the `picture` string. + +The behaviour of this function is consistent with the XPath/XQuery function [fn:format-number](https://www.w3.org/TR/xpath-functions-31/#func-format-number) as defined in the XPath F&O 3.1 specification. 
The picture string parameter defines how the number is formatted and has the [same syntax](https://www.w3.org/TR/xpath-functions-31/#syntax-of-picture-string) as fn:format-number. + +The optional third argument `options` is used to override the default locale specific formatting characters such as the decimal separator. If supplied, this argument must be an object containing name/value pairs specified in the [decimal format](https://www.w3.org/TR/xpath-functions-31/#defining-decimal-format) section of the XPath F&O 3.1 specification. + +__Examples__ + +- `$formatNumber(12345.6, '#,###.00')` => `"12,345.60"` +- `$formatNumber(1234.5678, "00.000e0")` => `"12.346e2"` +- `$formatNumber(34.555, "#0.00;(#0.00)")` => `"34.56"` +- `$formatNumber(-34.555, "#0.00;(#0.00)")` => `"(34.56)"` +- `$formatNumber(0.14, "01%")` => `"14%"` +- `$formatNumber(0.14, "###pm", {"per-mille": "pm"})` => `"140pm"` +- `$formatNumber(1234.5678, "①①.①①①e①", {"zero-digit": "\u245f"})` => `"①②.③④⑥e②"` + + +## `$formatBase()` +__Signature:__ `$formatBase(number [, radix])` + +Casts the `number` to a string and formats it to an integer represented in the number base specified by the `radix` argument. If `radix` is not specified, then it defaults to base 10. `radix` can be between 2 and 36, otherwise an error is thrown. + +__Examples__ + +- `$formatBase(100, 2)` => `"1100100"` +- `$formatBase(2555, 16)` => `"9fb"` + + +## `$formatInteger()` +__Signature:__ `$formatInteger(number, picture)` + +Casts the `number` to a string and formats it to an integer representation as specified by the `picture` string. + +The behaviour of this function is consistent with the two-argument version of the XPath/XQuery function [fn:format-integer](https://www.w3.org/TR/xpath-functions-31/#func-format-integer) as defined in the XPath F&O 3.1 specification. The picture string parameter defines how the number is formatted and has the same syntax as fn:format-integer. 
+ +__Examples__ + +- `$formatInteger(2789, 'w')` => `"two thousand, seven hundred and eighty-nine"` +- `$formatInteger(1999, 'I')` => `"MCMXCIX"` + +## `$parseInteger()` +__Signature:__ `$parseInteger(string, picture)` + +Parses the contents of the `string` parameter to an integer (as a JSON number) using the format specified by the `picture` string. +The picture string parameter has the same format as `$formatInteger`. Although the XPath specification does not have an equivalent +function for parsing integers, this capability has been added to JSONata. + +__Examples__ + +- `$parseInteger("twelve thousand, four hundred and seventy-six", 'w')` => `12476` +- `$parseInteger('12,345,678', '#,##0')` => `12345678` diff --git a/website/versioned_docs/version-2.1.0/numeric-operators.md b/website/versioned_docs/version-2.1.0/numeric-operators.md new file mode 100644 index 00000000..04121657 --- /dev/null +++ b/website/versioned_docs/version-2.1.0/numeric-operators.md @@ -0,0 +1,62 @@ +--- +id: version-2.1.0-numeric-operators +title: Numeric Operators +sidebar_label: Numeric Operators +original_id: numeric-operators +--- + +## `+` (Addition) + +The addition operator adds the operands to produce the numerical sum. It is an error if either operand is not a number. + +__Example__ + +`5 + 2` => `7` + + +## `-` (Subtraction/Negation) + +The subtraction operator subtracts the RHS value from the LHS value to produce the numerical difference. It is an error if either operand is not a number. + +It can also be used in its unary form to negate a number. + +__Examples__ + +- `5 - 2` => `3` +- `- 42` => `-42` + +## `*` (Multiplication) + +The multiplication operator multiplies the operands to produce the numerical product. It is an error if either operand is not a number. + +__Example__ + +`5 * 2` => `10` + +## `/` (Division) + +The division operator divides the RHS into the LHS to produce the numerical quotient. It is an error if either operand is not a number. 
+ +__Example__ + +`5 / 2` => `2.5` + + +## `%` (Modulo) + +The modulo operator divides the RHS into the LHS using whole number division to produce a whole number quotient and a remainder. This operator returns the remainder. It is an error if either operand is not a number. + +__Example__ + +`5 % 2` => `1` + +## `..` (Range) + +The sequence generation operator is used to create an array of monotonically increasing integers, starting with the number on the LHS and ending with the number on the RHS. It is an error if either operand does not evaluate to an integer. The sequence generator can only be used within an array constructor `[]`. + +__Examples__ + +- `[1..5]` => `[1, 2, 3, 4, 5]` +- `[1..3, 7..9]` => `[1, 2, 3, 7, 8, 9]` +- `[1..$count(Items)].("Item " & $)` => `["Item 1","Item 2","Item 3"]` +- `[1..5].($*$)` => `[1, 4, 9, 16, 25]` diff --git a/website/versioned_docs/version-2.1.0/other-operators.md b/website/versioned_docs/version-2.1.0/other-operators.md new file mode 100644 index 00000000..70fd120e --- /dev/null +++ b/website/versioned_docs/version-2.1.0/other-operators.md @@ -0,0 +1,151 @@ +--- +id: version-2.1.0-other-operators +title: Other Operators +sidebar_label: Other Operators +original_id: other-operators +--- + + +## `&` (Concatenation) + +The string concatenation operator is used to join the string values of the operands into a single resultant string. If either or both of the operands are not strings, then they are first cast to string using the rules of the `$string` function. + +__Example__ + +`"Hello" & "World"` => `"HelloWorld"` + +## `? :` (Conditional) + +The conditional ternary operator is used to evaluate one of two alternative expressions based on the result of a predicate (test) condition. The operator takes the form: + +`<test_expr> ? <expr_T> : <expr_F>` + +The `<test_expr>` expression is first evaluated. If it evaluates to Boolean `true`, then the operator returns the result of evaluating the `<expr_T>` expression. Otherwise it returns the result of evaluating the `<expr_F>` expression. 
If `<test_expr>` evaluates to a non-Boolean value, then the value is first cast to Boolean using the rules of the `$boolean` function. + +__Example__ + +`Price < 50 ? "Cheap" : "Expensive"` + +## `?:` (Default/Elvis) + +The default (or "elvis") operator returns the left-hand side if it has an effective Boolean value of `true`, otherwise it returns the right-hand side. This is useful for providing fallback values when an expression may evaluate to a value with an effective Boolean value of `false` (such as `null`, `false`, `0`, `''`, or `undefined`). + +__Syntax__ + +`<expr1> ?: <expr2>` + +__Example__ + +`foo.bar ?: 'default'` => `'default'` (if `foo.bar` evaluates to Boolean `false`) + +## `??` (Coalescing) + +The coalescing operator returns the left-hand side if it is defined (not `undefined`), otherwise it returns the right-hand side. This is useful for providing fallback values only when the left-hand side is missing or not present (empty sequence), but not for other values with an effective Boolean value of `false` like `0`, `false`, or `''`. + +__Syntax__ + +`<expr1> ?? <expr2>` + +__Examples__ + +`foo.bar ?? 42` => `42` (if `foo.bar` is undefined) + +`foo.bar ?? 'default'` => `'default'` (if `foo.bar` is undefined) + +`0 ?? 1` => `0` + +`'' ?? 'fallback'` => `''` + +## `:=` (Variable binding) + +The variable binding operator is used to bind the value of the RHS to the variable name defined on the LHS. The variable binding is scoped to the current block and any nested blocks. It is an error if the LHS is not a `$` followed by a valid variable name. + +__Examples__ + +- `$five := 5` +- `$square := function($n) { $n * $n }` + +## `~>` (Chain) + +The function chaining operator is used in situations where multiple nested functions need to be applied to a value, while keeping the expression easy to read. The value on the LHS is evaluated, then passed into the function on the RHS as its first argument. If the function has any other arguments, then these are passed to the function in parentheses as usual. 
It is an error if the RHS is not a function, or an expression that evaluates to a function. + +__Examples__ + +`$uppercase($substringBefore($substringAfter(Customer.Email, "@"), "."))` + +and + +`$sum(Account.Order.Product.(Price * Quantity))` + +can be more clearly written: + +`Customer.Email ~> $substringAfter("@") ~> $substringBefore(".") ~> $uppercase()` + +and + +`Account.Order.Product.(Price * Quantity) ~> $sum()` + +This operator can also be used in a more abstract form to define new functions based on a combination of existing functions. In this form, there is no value passed in on the LHS of the first function in the chain. + +For example, the expression + +``` +( + $uppertrim := $trim ~> $uppercase; + $uppertrim(" Hello World ") +) +``` + +=> `"HELLO WORLD"` + +creates a new function `$uppertrim` that performs `$trim` followed by `$uppercase`. + + +## `... ~> | ... | ... |` (Transform) + +The object transform operator is used to modify a copy of an object structure using a pattern/action syntax to target specific modifications while keeping the rest of the structure unchanged. + +The syntax has the following structure: + +`head ~> | location | update [, delete] |` + +where + +- `head` evaluates to the object that is to be copied and transformed +- `location` evaluates to the part(s) within the copied object that are to be updated. The `location` expression is evaluated relative to the result of `head`. The result of evaluating `location` must be an object or array of objects. +- `update` evaluates to an object that is merged into the object matched by each `location`. `update` is evaluated relative to the result of `location` and if `location` matched multiple objects, then the update gets evaluated for each one of these. The result of (each) update is merged into the result of `location`. +- `delete` (optional) evaluates to a string or an array of strings. 
Each string is the name of the name/value pair in each object matched by `location` that is to be removed from the resultant object. + +The `~>` operator is the operator for function chaining and passes the value on the left hand side to the function on the right hand side as its first argument. The expression on the right hand side must evaluate to a function, hence the `|...|...|` syntax generates a function with one argument. + +Example: + +`| Account.Order.Product | {'Price': Price * 1.2} |` + +defines a transform that will return a deep copy of the object passed to it, but with the `Product` object modified such that its `Price` property has had its value increased by 20%. The first part of the expression is the path location that specifies all of the objects within the overall object to change, and the second part defines an object that will get merged into the object(s) matched by the first part. The merging semantics is the same as that of the `$merge()` function. + +This transform definition syntax creates a JSONata function which you can either assign to a variable and use multiple times, or invoke inline. +Example: + +`payload ~> |Account.Order.Product|{'Price': Price * 1.2}|` + +or: + +`$increasePrice := |Account.Order.Product|{'Price': Price * 1.2}|` + +This also has the benefit that multiple transforms can be chained together for more complex transformations. + +In common with `$merge()`, multiple changes (inserts or updates) can be made to an object. +Example: + +`|Account.Order.Product|{'Price': Price * 1.2, 'Total': Price * Quantity}|` + +Note that the Total will be calculated using the original price, not the modified one (JSONata is declarative not imperative). + +Properties can also be removed from objects. This is done using the optional `delete` clause which specifies the name(s) of the properties to delete. 
+Example: + +`$ ~> |Account.Order.Product|{'Total': Price * Quantity}, ['Price', 'Quantity']|` + +This copies the input, but for each `Product` it inserts a Total and removes the `Price` and `Quantity` properties. + diff --git a/website/versioned_docs/version-2.1.0/overview.md b/website/versioned_docs/version-2.1.0/overview.md new file mode 100644 index 00000000..a63fd007 --- /dev/null +++ b/website/versioned_docs/version-2.1.0/overview.md @@ -0,0 +1,39 @@ +--- +id: version-2.1.0-overview +title: JSONata Documentation +sidebar_label: Overview +original_id: overview +--- + +## Introduction + +JSONata is a lightweight query and transformation language for JSON data. Inspired by the 'location path' semantics of XPath 3.1, it allows sophisticated queries to be expressed in a compact and intuitive notation. A rich complement of built in operators and functions is provided for manipulating and combining extracted data, and the results of queries can be formatted into any JSON output structure using familiar JSON object and array syntax. Coupled with the facility to create user defined functions, advanced expressions can be built to tackle any JSON query and transformation task. + +

+ +## Get JSONata + +* Try it out at [try.jsonata.org](http://try.jsonata.org/) +* Install the module from [NPM](https://www.npmjs.com/package/jsonata) +* Fork the repo on [GitHub](https://github.com/jsonata-js/jsonata) + +## Implementations of JSONata + +The following are known implementations of JSONata in addition to the primary implementation in JavaScript in the above repo. + +|Language|Link|Notes|JSONata version| +|---|---|---|---| +|C|https://github.com/qlyoung/jsonata-c|Runs JSONata in embedded JS engine|1.8.3| +|Go|https://github.com/blues/jsonata-go|Native implementation|1.5.4| +|Go|https://github.com/yxuco/gojsonata|Native implementation| | +|Java|https://github.com/IBM/JSONata4Java|Native implementation| | +|Java|https://github.com/dashjoin/jsonata-java|Native port of reference|2.0.5| +|.NET|https://github.com/mikhail-barg/jsonata.net.native|Native implementation|1.8.5| +|Python|https://github.com/qlyoung/pyjsonata|API bindings based on C bindings|1.8.3| +|Python|https://github.com/rayokota/jsonata-python|Native port of reference|2.0.5| +|Rust|https://github.com/johanventer/jsonata-rust|Implementation work in progress| | +|Rust|https://github.com/Stedi/jsonata-rs|Actively-developed fork of jsonata-rust| | + +## Find out more + +* Introduction at [London Node User Group meetup](https://www.youtube.com/watch?v=TDWf6R8aqDo) diff --git a/website/versioned_docs/version-2.1.0/predicate.md b/website/versioned_docs/version-2.1.0/predicate.md new file mode 100644 index 00000000..3426866c --- /dev/null +++ b/website/versioned_docs/version-2.1.0/predicate.md @@ -0,0 +1,85 @@ +--- +id: version-2.1.0-predicate +title: Query refinement using predicate expressions +sidebar_label: Predicate Queries +original_id: predicate +--- + +## Predicates + +At any step in a location path, the selected items can be filtered using a predicate - `[expr]` where `expr` evaluates to a Boolean value. 
Each item in the selection is tested against the expression, if it evaluates to `true`, then the item is kept; if `false`, it is removed from the selection. The expression is evaluated relative to the current (context) item being tested, so if the predicate expression performs navigation, then it is relative to this context item. + +#### Examples: + +- Select the `Phone` items that have a `type` field that equals `"mobile"`. +
+
Phone[type='mobile']
+
{ "type": "mobile", "number": "077 7700 1234" }
+
+ +- Select the mobile phone number +
+
Phone[type='mobile'].number
+
"077 7700 1234"
+
+ +- Select the office phone numbers - there are two of them! +
+
Phone[type='office'].number
+
[ "01962 001234", "01962 001235" ]
+
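The predicate examples above can be modelled outside JSONata. The following Python sketch is only an illustration of the semantics, not how the JSONata evaluator works; the `PHONES` data is a hypothetical reconstruction from the phone numbers quoted in this page, and `select_numbers` is a made-up helper name.

```python
# Hypothetical sample data, reconstructed from the numbers quoted in the examples.
PHONES = [
    {"type": "home", "number": "0203 544 1234"},
    {"type": "office", "number": "01962 001234"},
    {"type": "office", "number": "01962 001235"},
    {"type": "mobile", "number": "077 7700 1234"},
]


def select_numbers(phones, phone_type):
    """Model of Phone[type='...'].number: filter each item, then map to its number."""
    matches = [p["number"] for p in phones if p.get("type") == phone_type]
    if not matches:
        return None  # stands in for "no match"
    # A single match comes back as the value itself, not as a one-element list.
    return matches[0] if len(matches) == 1 else matches


print(select_numbers(PHONES, "mobile"))
print(select_numbers(PHONES, "office"))
```

The predicate is tested against each `Phone` item in turn, which is why the comparison uses the item's own `type` field rather than anything from the enclosing document.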
+ + +## Singleton array and value equivalence + +Within a JSONata expression or subexpression, any value (which is not itself an array) and an array containing just that value are deemed to be equivalent. This allows the language to be composable such that location paths that extract a single value from an object and location paths that extract multiple values from arrays can both be used as inputs to other expressions without needing to use different syntax for the two forms. + +Consider the following examples: + +* `Address.City` returns the single value `"Winchester"` +* `Phone[0].number` matches a single value, and returns that value `"0203 544 1234"` +* `Phone[type='home'].number` likewise matches the single value `"0203 544 1234"` +* `Phone[type='office'].number` matches two values, so returns an array `[ "01962 001234", "01962 001235" ]` + +When processing the return value of a JSONata expression, it might be desirable to have the results in a consistent format regardless of how many values were matched. In the first two expressions above, it is clear that each expression is addressing a single value in the structure and it makes sense to return just that value. In the last two expressions, however, it is not immediately obvious how many values will be matched, and it is not helpful if the host language has to process the results in different ways depending on what gets returned. + +If this is a concern, then the expression can be modified to make it return an array even if only a single value is matched. This is done by adding empty square brackets `[]` to a step within the location path. 
The examples above can be re-written to always return an array as follows: + +* `Address[].City` returns `[ "Winchester" ]` +* `Phone[0][].number` returns `[ "0203 544 1234" ]` +* `Phone[][type='home'].number` returns `[ "0203 544 1234" ]` +* `Phone[type='office'].number[]` returns `[ "01962 001234", "01962 001235" ]` + +Note that the `[]` can be placed on either side of the predicates and on any step in the path expression. + +## Wildcards + +Use `*` instead of a field name to select all fields in an object. + +#### Examples + +- Select the values of all the fields of `Address` + 
+
Address.*
+
[ "Hursley Park", "Winchester", "SO21 2JN" ]
+
+ +- Select the `Postcode` value of any child object +
+
*.Postcode
+
"SO21 2JN"
+
+ + + +## Navigate arbitrary depths + +Descendant wildcard `**` instead of `*` will traverse all descendants (multi-level wildcard). + +#### Examples + +- Select all `Postcode` values, regardless of how deeply nested they are in the structure +
+
**.Postcode
+
[ "SO21 2JN", "E1 6RF" ]
+
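As a rough model of what the descendant wildcard does, the Python sketch below walks a nested structure and collects every value stored under a given field name. It is an illustration under assumptions: `descendants` is a hypothetical helper, and apart from the `Address` values and the two postcodes shown above, the field names and values in `DOC` are made up.

```python
def descendants(value, field):
    """Collect every value stored under `field`, at any depth, in document order."""
    found = []
    if isinstance(value, dict):
        for key, child in value.items():
            if key == field:
                found.append(child)
            found.extend(descendants(child, field))
    elif isinstance(value, list):
        for item in value:
            found.extend(descendants(item, field))
    return found


# Hypothetical document: only the Address values and the postcodes come from the examples.
DOC = {
    "Address": {"Street": "Hursley Park", "City": "Winchester", "Postcode": "SO21 2JN"},
    "Other": {
        "Alternative.Address": {"Street": "Brick Lane", "City": "London", "Postcode": "E1 6RF"}
    },
}

print(descendants(DOC, "Postcode"))
```

By contrast, a single-level wildcard would only look at the immediate children of the context value, so it would miss the postcode nested two levels down.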
diff --git a/website/versioned_docs/version-2.1.0/processing.md b/website/versioned_docs/version-2.1.0/processing.md new file mode 100644 index 00000000..9cdb1a2f --- /dev/null +++ b/website/versioned_docs/version-2.1.0/processing.md @@ -0,0 +1,66 @@ +--- +id: version-2.1.0-processing +title: The JSONata processing model +sidebar_label: Processing Model +original_id: processing +--- + +## The JSONata type system + +JSONata is a superset of JSON and the JSONata type system is a superset of the JSON data types. In common with all functional programming languages, the function is also a first-class data type. The following data types are supported by JSONata: + +- string +- number +- Boolean +- null +- object +- array +- function + +All but the last one are in common with JSON. + +## Sequences + +JSONata has been designed foremost as a query language, whereby a path expression can select zero, one or more than one values from the JSON document. These values, each of which can be of any of the types listed above, are returned as a _result sequence_. During the evaluation of expressions, which involves the results of subexpressions being combined or becoming the context inputs to other subexpressions, the sequences are subject to the process of _sequence flattening_. + +The sequence flattening rules are as follows: + +1. An __empty sequence__ is a sequence with no values and is considered to be 'nothing' or 'no match'. It won't appear in the output of any expression. If it is associated with an object property (key/value) pair in a result object, then that object will not have that property. + +2. A __singleton sequence__ is a sequence containing a single value. It is considered equivalent to that value itself, and the output from any expression, or sub-expression will be that value without any surrounding structure. + +3. A sequence containing more than one value is represented in the output as a JSON array. 
This is still internally flagged as a sequence and subject to the next rule. Note that if an expression matches an array from the input JSON, or a JSON array is explicitly constructed in the query using the [array constructor](construction#array-constructors), then this remains an array of values rather than a sequence of values and will not be subject to the sequence flattening rules. However, if this array becomes the context of a subsequent expression, then the result of that _will_ be a sequence. + +4. If a sequence contains one or more (sub-)sequences, then the values from the sub-sequence are pulled up to the level of the outer sequence. A result sequence will never contain child sequences (they are flattened). + + + +## JSONata path processing + +The JSONata path expression is a _declarative functional_ language. + +__Functional__ because it is based on the map/filter/reduce programming paradigm as supported by popular functional programming languages through the use of higher-order functions. + +__Declarative__ because these higher-order functions are exposed through a lightweight syntax which lets the user focus on the intention of the query (declaration) rather than the programming constructs that control their evaluation. + +A path expression is a sequence of one or more of the following functional stages: + +Stage | Syntax | Action +---|---|--- + __Map__ | seq`.`expr | Evaluates the RHS expression in the context of each item in the input sequence. Flattens results into result sequence. + __Filter__ | seq`[`expr`]` | Filter results from previous stage by applying predicate expression between brackets to each item. + __Sort__ | seq`^(`expr`)` | Sorts (re-orders) the input sequence according to the criteria in parentheses. + __Index__ | seq`#`$var | Binds a named variable to the current context position (zero offset) in the sequence. + __Join__ | seq`@`$var | Binds a named variable to the current context item in the sequence. 
Can only be used directly following a map stage. +__Reduce__ | seq`{` expr`:`expr`,` expr`:`expr ...`}` | Group and aggregate the input sequence to a single result object as defined by the name/value expressions. Can only appear as the final stage in a path expression. + +In the above table: + +- In the 'Syntax' column, 'seq' refers to the input sequence for the current stage, which is the result sequence from the previous stage. +- The 'Action' column gives a brief outline of the stage's behavior; fuller details are in the [Path Operators](path-operators) reference page. +- The relative precedence of each operator affects the scope of its influence on the input sequence. Specifically, + - The Filter operator binds tighter than the Map operator. This means, for example, that `books.authors[0]` will select the first author from _each_ book rather than the first author from all of the books. + - The Sort (order-by) operator has the lowest precedence, meaning that the full path to the left of it will be evaluated, and its result sequence will be sorted. + - This operator precedence can be overridden by using parentheses. For example, `(books.authors)[0]` will select the first author from all of the books (single value). Note, however, that parentheses also define a scope frame for variables, so any variables that have been bound within the parentheses block, including those bound by the `@` and `#` operators, will go out of scope at the end of the parenthesized block. +- The variables bound by the `@` and `#` operators go out of scope at the end of the path expression. + - The Reduce stage, if used, will terminate the current path expression. Although a Map operator can immediately follow this, it will be interpreted as the start of a new path expression, meaning that any previously bound context or index variables will be out of scope. 
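The difference between `books.authors[0]` and `(books.authors)[0]` described in the precedence notes can be sketched in Python. This is a model of the behavior only, under the assumption that each book has an `authors` array; the function names and the sample values are hypothetical.

```python
# Hypothetical data: the `books` and `authors` names come from the text, the values are made up.
books = [
    {"authors": ["Ann", "Bob"]},
    {"authors": ["Cat"]},
]


def first_author_of_each(books):
    """Model of books.authors[0]: the filter [0] applies to `authors` within each book."""
    return [book["authors"][0] for book in books if book["authors"]]


def first_author_overall(books):
    """Model of (books.authors)[0]: flatten all authors first, then take index 0."""
    flattened = [author for book in books for author in book["authors"]]
    return flattened[0] if flattened else None


print(first_author_of_each(books))
print(first_author_overall(books))
```

The first function yields one author per book, while the second yields a single value, which mirrors how the parentheses change what the filter is applied to.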
diff --git a/website/versioned_docs/version-2.1.0/programming.md b/website/versioned_docs/version-2.1.0/programming.md new file mode 100644 index 00000000..e9d1c476 --- /dev/null +++ b/website/versioned_docs/version-2.1.0/programming.md @@ -0,0 +1,469 @@ +--- +id: version-2.1.0-programming +title: Programming constructs +sidebar_label: Functional Programming +original_id: programming +--- + +So far, we have introduced all the parts of the language that allow us to extract data from an input JSON document, combine the data using string and numeric operators, and format the structure of the output JSON document. What follows are the parts that turn this into a Turing complete, functional programming language. + +## Comments + +JSONata expressions can be interleaved with comments using 'C' style comment delimiters. For example, + +``` +/* Long-winded expressions might need some explanation */ +( + $pi := 3.1415926535897932384626; + /* JSONata is not known for its graphics support! */ + $plot := function($x) {( + $floor := $string ~> $substringBefore(?, '.') ~> $number; + $index := $floor(($x + 1) * 20 + 0.5); + $join([0..$index].('.')) & 'O' & $join([$index..40].('.')) + )}; + + /* Factorial is the product of the integers 1..n */ + $product := function($a, $b) { $a * $b }; + $factorial := function($n) { $n = 0 ? 1 : $reduce([1..$n], $product) }; + + $sin := function($x){ /* define sine in terms of cosine */ + $cos($x - $pi/2) + }; + $cos := function($x){ /* Derive cosine by expanding Maclaurin series */ + $x > $pi ? $cos($x - 2 * $pi) : $x < -$pi ? $cos($x + 2 * $pi) : + $sum([0..12].($power(-1, $) * $power($x, 2*$) / $factorial(2*$))) + }; + + [0..24].$sin($*$pi/12).$plot($) +) +``` +Produces [this](http://try.jsonata.org/ryYn78Q0m), if you're interested! + +## Conditional logic + +### Ternary operator (`? :`) + +If/then/else constructs can be written using the ternary operator "? :". + +`predicate ? expr1 : expr2` + +The expression `predicate` is evaluated. 
If its effective boolean value (see definition) is `true` then `expr1` is evaluated and returned, otherwise `expr2` is evaluated and returned. + +__Examples__ + +
+
Account.Order.Product.{ + `Product Name`: $.Price > 100 ? "Premium" : "Basic" +}
+
[ + { + "Bowler Hat": "Basic" + }, + { + "Trilby hat": "Basic" + }, + { + "Bowler Hat": "Basic" + }, + { + "Cloak": "Premium" + } +]
+
+ +### Elvis/Default operator (`?:`) + +The default (or "elvis") operator is syntactic sugar for a common pattern using the ternary operator. It returns the left-hand side if it has an effective Boolean value of `true`, otherwise it returns the right-hand side. + +`expr1 ?: expr2` + +This is equivalent to: + +`expr1 ? expr1 : expr2` + +The elvis operator is useful for providing fallback values when an expression may evaluate to a value with an effective Boolean value of `false`, without having to repeat the expression twice as you would with the ternary operator. + +__Examples__ + +
+
Account.Order.Product.{ + `Product Name`: $.'Product Name', + `Category`: $.Category ?: "Uncategorized" +}
+
[ + { + "Product Name": "Bowler Hat", + "Category": "Uncategorized" + }, + { + "Product Name": "Trilby hat", + "Category": "Uncategorized" + }, + { + "Product Name": "Bowler Hat", + "Category": "Uncategorized" + }, + { + "Product Name": "Cloak", + "Category": "Uncategorized" + } +]
+
+ +### Coalescing operator (`??`) + +The coalescing operator is syntactic sugar for a common pattern using the ternary operator with the `$exists` function. It returns the left-hand side if it is defined (not `undefined`), otherwise it returns the right-hand side. + +`expr1 ?? expr2` + +This is equivalent to: + +`$exists(expr1) ? expr1 : expr2` + +The coalescing operator is useful for providing fallback values only when the left-hand side is missing or not present (empty sequence), but not for other values with an effective Boolean value of `false` like `0`, `false`, or `''`. It avoids having to evaluate the expression twice and explicitly use the `$exists` function as you would with the ternary operator. + +__Examples__ + +
+
Account.Order.{ + "OrderID": OrderID, + "Rating": ($sum(Product.Rating) / $count(Product.Rating)) ?? 0 +}
+
[ + { + "OrderID": "order101", + "Rating": 5 + }, + { + "OrderID": "order102", + "Rating": 3 + }, + { + "OrderID": "order103", + "Rating": 4 + }, + { + "OrderID": "order104", + "Rating": 2 + } +]
+
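To contrast the default (`?:`) and coalescing (`??`) operators, here is a small Python model of the behavior described above. It is a sketch under assumptions, not the JSONata implementation: `MISSING` is a hypothetical sentinel standing in for `undefined`, `effective_boolean` approximates the `$boolean` casting rules, and unlike the real operators both arguments are evaluated eagerly.

```python
MISSING = object()  # hypothetical stand-in for JSONata's "undefined" (no value)


def effective_boolean(value):
    """Approximation of the $boolean casting rules (an assumption of this sketch)."""
    if value is MISSING or value is None:
        return False
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0
    if isinstance(value, str):
        return len(value) > 0
    if isinstance(value, list):
        return any(effective_boolean(item) for item in value)
    if isinstance(value, dict):
        return len(value) > 0
    return False  # anything else, such as a function, is treated as false


def elvis(lhs, rhs):
    """Model of `lhs ?: rhs`: fall back when lhs is effectively false."""
    return lhs if effective_boolean(lhs) else rhs


def coalesce(lhs, rhs):
    """Model of `lhs ?? rhs`: fall back only when lhs is missing."""
    return rhs if lhs is MISSING else lhs


print(elvis(0, 1), coalesce(0, 1))
print(repr(elvis("", "fallback")), repr(coalesce("", "fallback")))
print(elvis(MISSING, "default"), coalesce(MISSING, "default"))
```

The two models differ only for values that exist but are effectively false, such as `0` or the empty string, which is the case the coalescing operator is meant to preserve.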
+ +## Variables + +Any name that starts with a dollar '$' is a variable. A variable is a named reference to a value. The value can be one of any type in the language's [type system](processing#the-jsonata-type-system). + +### Built-in variables + +- `$` The variable with no name refers to the context value at any point in the input JSON hierarchy. Examples +- `$$` The root of the input JSON. Only needed if you need to break out of the current context to temporarily navigate down a different path. E.g. for cross-referencing or joining data. Examples +- Native (built-in) functions. See function library. + +### Variable binding + +Values (of any type in the type system) can be bound to variables + +`$var_name := "value"` + +The stored value can be later referenced using the expression `$var_name`. + +The scope of a variable is limited to the 'block' in which it was bound. E.g. + +``` +Invoice.( + $p := Product.Price; + $q := Product.Quantity; + $p * $q +) +``` + +Returns Price multiplied by Quantity for the Product in the Invoice. + +## Functions + +The function is a first-class type, and can be stored in a variable just like any other data type. A library of built-in functions is provided (link) and assigned to variables in the global scope. For example, `$uppercase` contains a function which, when invoked with a string argument, `str`, will return a string with all the characters in `str` changed to uppercase. + +### Invoking a function + +A function is invoked by following its reference (or definition) by parentheses containing a comma delimited sequence of arguments. + +__Examples__ + +- `$uppercase("Hello")` returns the string "HELLO". 
+- `$substring("hello world", 0, 5)` returns the string "hello" +- `$sum([1,2,3])` returns the number 6 + +### Defining a function + +Anonymous (lambda) functions can be defined using the following syntax: + +`function($l, $w, $h){ $l * $w * $h }` + +and can be invoked using + +`function($l, $w, $h){ $l * $w * $h }(10, 10, 5)` which returns 500 + +The function can also be assigned to a variable for future use (within the block) + +``` +( + $volume := function($l, $w, $h){ $l * $w * $h }; + $volume(10, 10, 5); +) +``` + +### Function signatures + +Functions can be defined with an optional signature which specifies the parameter types of the function. If supplied, +the evaluation engine will validate the arguments passed to the function before it is invoked. A dynamic error is +thrown if the argument list does not match the signature. + +A function signature is a string of the form `<params:return>`. `params` is a sequence of type symbols, each one representing an input argument's type. `return` is a single type symbol representing the return value type. + +Type symbols work as follows: + +Simple types: + +- `b` - Boolean +- `n` - number +- `s` - string +- `l` - `null` + +Complex types: + +- `a` - array +- `o` - object +- `f` - function + +Union types: + +- `(sao)` - string, array or object +- `(o)` - same as `o` +- `u` - equivalent to `(bnsl)` i.e. Boolean, number, string or `null` +- `j` - any JSON type. Equivalent to `(bnsloa)` i.e. Boolean, number, string, `null`, object or array, but not function +- `x` - any type. Equivalent to `(bnsloaf)` + +Parametrised types: + +- `a<s>` - array of strings +- `a<x>` - array of values of any type + +Some examples of signatures of built-in JSONata functions: + +- `$count` has signature `<a:n>`; it accepts an array and returns a number. +- `$append` has signature `<aa:a>`; it accepts two arrays and returns an array. +- `$sum` has signature `<a<n>:n>`; it accepts an array of numbers and returns a number. 
+- `$reduce` has signature `<fa<j>:j>`; it accepts a reducer function `f` and an `a<j>` (array of JSON objects) and returns a JSON object. + +Each type symbol may also have *options* applied. + +- `+` - one or more arguments of this type + - E.g. `$zip` has signature `<a+>`; it accepts one array, or two arrays, or three arrays, or... +- `?` - optional argument + - E.g. `$join` has signature `<a<s>s?:s>`; it accepts an array of strings and an optional joiner string which defaults to the empty string. It returns a string. +- `-` - if this argument is missing, use the context value ("focus"). + - E.g. `$length` has signature `<s-:n>`; it can be called as `$length(OrderID)` (one argument) but equivalently as `OrderID.$length()`. + + +### Recursive functions + +Functions that have been assigned to variables can invoke themselves using that variable reference. This allows recursive functions to be defined. E.g. + 
+
( + $factorial:= function($x){ $x <= 1 ? 1 : $x * $factorial($x-1) }; + $factorial(4) +)
+
24
+
+ +Note that it is actually possible to write a recursive function using purely anonymous functions (i.e. nothing gets assigned to variables). This is done using the [Y-combinator](https://en.wikipedia.org/wiki/Fixed-point_combinator#Fixed_point_combinators_in_lambda_calculus) which might be an interesting [diversion](#advanced-example-the-y-combinator) for those interested in functional programming. + +### Tail call optimization (Tail recursion) + +A recursive function adds a new frame to the call stack each time it invokes itself. This can eventually lead to stack exhaustion if the function recurses beyond a certain limit. Consider the classic recursive implementation of the factorial function: + +``` +( + $factorial := function($x) { + $x <= 1 ? 1 : $x * $factorial($x-1) + }; + $factorial(170) +) +``` + +This function works by pushing the number onto the stack, then when the stack unwinds, multiplying it by the result of the factorial of the number minus one. Written in this way, the JSONata evaluator has no choice but to use the call stack to store the intermediate results. Given a large enough number, the call stack will overflow. + +This is a recognised problem with functional programming and the solution is to rewrite the function slightly to avoid the _need_ for the stack to store the intermediate result. The following implementation of factorial achieves this: + +``` +( + $factorial := function($x){( + $iter := function($x, $acc) { + $x <= 1 ? $acc : $iter($x - 1, $x * $acc) + }; + $iter($x, 1) + )}; + $factorial(170) +) +``` + +Here, the multiplication is done _before_ the function invokes itself and the intermediate result is carried in the second parameter `$acc` (accumulator). The invocation of itself is the _last_ thing that the function does. This is known as a 'tail call', and when the JSONata parser spots this, it internally rewrites the recursion as a simple loop. Thus it can run indefinitely without growing the call stack. 
Functions written in this way are said to be [tail recursive](https://en.wikipedia.org/wiki/Tail_call). + +### Higher order functions + +A function, being a first-class data type, can be passed as a parameter to another function, or returned from a function. Functions that process other functions are known as higher order functions. Consider the following example: + +``` +( + $twice := function($f) { function($x){ $f($f($x)) } }; + $add3 := function($y){ $y + 3 }; + $add6 := $twice($add3); + $add6(7) +) +``` +- The function stored in variable `$twice` is a higher order function. It takes a parameter `$f` which is a function, and returns a function which takes a parameter `$x` which, when invoked, applies the function `$f` twice to `$x`. +- `$add3` stores a function that adds 3 to its argument. Neither `$twice` nor `$add3` has been invoked yet. +- `$twice` is invoked by passing the function `$add3` as its argument. This returns a function that applies `$add3` twice to _its_ argument. This returned function is not invoked yet, but rather assigned to the variable `$add6`. +- Finally the function in `$add6` is invoked with the argument 7, resulting in 3 being added to it twice. It returns 13. + +### Functions are closures + +When a lambda function is defined, the evaluation engine takes a snapshot of the environment and stores it with the function body definition. The environment comprises the context item (i.e. the current value in the location path) together with the current in-scope variable bindings. When the lambda function is later invoked, it is done so in that stored environment rather than the current environment at invocation time. This property is known as _lexical scoping_ and is a fundamental property of _closures_. 
+ +Consider the following example: + +``` +Account.( + $AccName := function() { $.'Account Name' }; + + Order[OrderID = 'order104'].Product.{ + 'Account': $AccName(), + 'SKU-' & $string(ProductID): $.'Product Name' + } +) +``` + +When the function is created, the context item (referred to by '$') is the value of `Account`. Later, when the function is invoked, the context item has moved down the structure to the value of each `Product` item. However, the function body is invoked in the environment that was stored when it was defined, so its context item is the value of `Account`. This is a somewhat contrived example; you wouldn't really need a function to do this. The expression produces the following result: + +``` +{ + "Account": "Firefly", + "SKU-858383": "Bowler Hat", + "SKU-345664": "Cloak" +} +``` + +### Partial function application + +Functions can be [partially applied](https://en.wikipedia.org/wiki/Partial_application) by invoking the function with one or more (but not all) +arguments replaced by a question mark `?` placeholder. The result of this is another function whose arity (number of parameters) is reduced +by the number of arguments supplied to the original function. This returned function can be treated like any other newly defined function, +e.g. bound to a variable, passed to a higher-order function, etc. + +__Examples__ + +- Create a function to return the first five characters of a string by partially applying the `$substring` function + 
+
( + $first5 := $substring(?, 0, 5); + $first5("Hello, World") +)
+
"Hello"
+
+ +- A partially applied function can itself be partially applied +
+
( + $firstN := $substring(?, 0, ?); + $first5 := $firstN(?, 5); + $first5("Hello, World") +)
+
"Hello"
+
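The placeholder mechanism described above can be modelled outside JSONata. The following is a minimal JavaScript sketch, not the engine's actual implementation: `HOLE`, `partial` and the `substring` stand-in are hypothetical names introduced purely for illustration.

```javascript
// Hypothetical marker standing in for JSONata's `?` placeholder.
const HOLE = Symbol("?");

// Returns a new function whose parameters are the HOLE positions in `args`.
function partial(fn, ...args) {
  return (...rest) => {
    let next = 0;
    // Fill each hole, left to right, with the next argument supplied later.
    const filled = args.map((a) => (a === HOLE ? rest[next++] : a));
    return fn(...filled);
  };
}

// Simplified stand-in for $substring(str, start, length).
const substring = (str, start, length) => str.slice(start, start + length);

// Mirrors: $firstN := $substring(?, 0, ?); $first5 := $firstN(?, 5)
const firstN = partial(substring, HOLE, 0, HOLE);
const first5 = partial(firstN, HOLE, 5);

console.log(first5("Hello, World")); // prints "Hello"
```

Each call to `partial` leaves one parameter per remaining hole, which matches the reduced arity described in the text.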
+ + +### Function chaining + +Function chaining can be used in two ways: + +1. To avoid lots of nesting when multiple functions are applied to a value + +2. As a higher-order construct for defining new functions by combining existing functions + +#### Invocation chaining + +`value ~> $funcA ~> $funcB`\ +is equivalent to\ +`$funcB($funcA(value))` + +__Examples__ + +- `Customer.Email ~> $substringAfter("@") ~> $substringBefore(".") ~> $uppercase()` + +#### Function composition + +[Function composition](https://en.wikipedia.org/wiki/Function_composition) is the application of one function to another function +to produce a third function. + +`$funcC := $funcA ~> $funcB`\ +is equivalent to\ +`$funcC := function($arg) { $funcB($funcA($arg)) }` + +__Examples__ + +- Create a new function by chaining two existing functions +
+
( + $normalize := $uppercase ~> $trim; + $normalize(" Some Words ") +)
+
"SOME WORDS"
+
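As a rough model of the equivalence stated above (`$funcA ~> $funcB` behaving like `function($arg) { $funcB($funcA($arg)) }`), here is a small JavaScript sketch. The helper `chain` and the simplified `uppercase` and `trim` stand-ins are assumptions made for illustration; in particular, `trim` here assumes that `$trim` also collapses internal runs of whitespace.

```javascript
// chain(f, g, ...) models `$f ~> $g ~> ...`: apply the functions left to right.
const chain = (...fns) => (arg) => fns.reduce((acc, fn) => fn(acc), arg);

// Simplified stand-ins for $uppercase and $trim.
const uppercase = (s) => s.toUpperCase();
const trim = (s) => s.replace(/\s+/g, " ").trim();

// Mirrors: $normalize := $uppercase ~> $trim
const normalize = chain(uppercase, trim);

console.log(normalize("  Some   Words  ")); // prints "SOME WORDS"
```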
+ +### Functions as first class values + +Function composition can be combined with partial function application to produce a very compact syntax for defining new +functions. + +__Examples__ + +- Create a new function by chaining two partially evaluated functions +
+
( + $first5Capitalized := $substring(?, 0, 5) ~> $uppercase(?); + $first5Capitalized(Address.City) +)
+
"WINCH"
+
+ + +### Advanced example - The Y-combinator + +There is no need to read this section - it will do nothing for your sanity or ability to manipulate JSON data. + +Earlier we learned how to write a recursive function to calculate the factorial of a number and hinted that this could be done without naming any functions. We can take higher-order functions to the extreme and write the following: + +`λ($f) { λ($x) { $x($x) }( λ($g) { $f( (λ($a) {$g($g)($a)}))})}(λ($f) { λ($n) { $n < 2 ? 1 : $n * $f($n - 1) } })(6)` + +which produces the result `720`. The Greek lambda (λ) symbol can be used in place of the word `function` which, if you can find it on your keyboard, will save screen space and please the fans of [lambda calculus](https://en.wikipedia.org/wiki/Lambda_calculus). + +The first part of the above expression is an implementation of the [Y-combinator](https://en.wikipedia.org/wiki/Fixed-point_combinator#Fixed_point_combinators_in_lambda_calculus) in this language. We could assign it to a variable and apply it to other recursive anonymous functions: + +``` +( + $Y := λ($f) { λ($x) { $x($x) }( λ($g) { $f( (λ($a) {$g($g)($a)}))})}; + [1,2,3,4,5,6,7,8,9] . $Y(λ($f) { λ($n) { $n <= 1 ? $n : $f($n-1) + $f($n-2) } }) ($) +) +``` + +to produce the Fibonacci series `[ 1, 1, 2, 3, 5, 8, 13, 21, 34 ]`. + +But we don't need to do any of this. Far more sensible to use named functions: + +``` +( + $fib := λ($n) { $n <= 1 ? $n : $fib($n-1) + $fib($n-2) }; + [1,2,3,4,5,6,7,8,9] . $fib($) +) +``` diff --git a/website/versioned_docs/version-2.1.0/regex.md b/website/versioned_docs/version-2.1.0/regex.md new file mode 100644 index 00000000..f5ad6ce4 --- /dev/null +++ b/website/versioned_docs/version-2.1.0/regex.md @@ -0,0 +1,92 @@ +--- +id: version-2.1.0-regex +title: Using Regular Expressions +sidebar_label: Regular Expressions +original_id: regex +--- + +Regular expressions provide a syntax for matching and extracting parts of a string. 
JSONata provides first class support for regular expressions surrounded by the familiar slash delimiters found in many scripting languages. + +`/regex/flags` + +where: +- `regex` - the regular expression +- `flags` - optionally either or both of: + - `i` - ignore case + - `m` - multiline match + + ## Functions which use regular expressions + +A number of functions are available that take a regular expression as a parameter: + +- [$match()](string-functions#match) +- [$contains()](string-functions#contains) +- [$split()](string-functions#split) +- [$replace()](string-functions#replace) + +__Examples__ + + ## Regular expressions in query predicates + +Regexes are often used in query predicates (filter expressions) when selecting objects that contain a matching string property. For this, a shortcut notation can be used as follows: + +`path.to.object[stringProperty ~> /regex/]` + +The `~>` is the [chain operator](other-operators#-chain), and its use here implies that the result of `/regex/` is a function. We'll see below that this is in fact the case. + +__Examples__ + +``Account.Order.Product[`Product Name` ~> /hat/i ]`` + +will match all products that have 'hat' in their name. + + ## Generic matchers + +The JSONata type system is based on the JSON type system, with the addition of the function type. In order to accommodate the regex as a standalone expression, the syntax `/regex/` evaluates to a function. Think of the `/regex/` syntax as a higher-order function that results in a 'matching function' when evaluated. The `regex` between the slashes is the parameter to this HOF and the function that gets returned, when applied to _its_ string parameter, will return a structure that contains details of parts of the string that have been matched. If nothing is matched, then it returns the empty sequence (i.e. JavaScript `undefined`). 
+ +__Example__ + +`$matcher := /[a-z]*an[a-z]*/i` + +Evaluation of the regex returns a function, and this has been bound to a variable `$matcher`. Later, the `$matcher` function is invoked on a string: + +`$matcher('A man, a plan, a canal, Panama!')` + +This returns the following JSONata object (JSON, but also with a function property): + +``` +{ + "match": "man", + "start": 2, + "end": 5, + "groups": [], + "next": "#0" +} +``` + +This contains information about the first matching substring within this famous palindrome, specifically: +- `match` - the substring within the original string that matches the regex +- `start` - the starting position (zero offset) of the matching substring within the original string +- `end` - the ending position of the matching substring within the original string (the offset just past its last character) +- `groups` - if capturing groups are used in the regex, then this array contains a string for the text captured by each group +- `next()` - when invoked, will return details of the next occurrence of any matching substring (and so on). + +In this example, invoking `next()` will return: + +``` +{ + "match": "plan", + "start": 9, + "end": 13, + "groups": [], + "next": "#0" +} +``` + +and so on, until it eventually returns the empty sequence. + + ## Writing a custom matcher + +We've learned that the regex syntax is just a function generator, and the signature and return structure of the generated 'matcher' function is well defined. The four regex-aware functions (`$match`, `$contains`, `$split`, `$replace`) simply invoke this function as part of their implementation. Apart from that, they have no awareness that these matcher functions were generated by the regex syntax. + +So it's possible to write any user-defined matcher function, provided it conforms to this contract. This can be done as a JSONata lambda function or (more likely) as an extension function. 
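
To make the contract concrete, here is a minimal JavaScript sketch of a matcher that finds literal occurrences of a substring. It is an illustration only: `literalMatcher` is a hypothetical helper, the choice of `end` as `start` plus the match length is an assumption, and how such a function is registered with the engine is not shown.

```javascript
// Hypothetical helper: builds a matcher function for a literal `needle`,
// returning objects with the match/start/end/groups/next fields listed above.
function literalMatcher(needle) {
  if (needle === "") throw new Error("needle must not be empty");
  const find = (str, from) => {
    const start = str.indexOf(needle, from);
    if (start === -1) return undefined; // models the empty sequence
    const end = start + needle.length; // assumption: end is exclusive
    return {
      match: needle,
      start,
      end,
      groups: [], // a literal search has no capturing groups
      next: () => find(str, end), // resume after the current match
    };
  };
  return (str) => find(str, 0);
}

const matcher = literalMatcher("an");
const first = matcher("A man, a plan");
const second = first.next();
console.log(first.start, first.end); // prints 3 5
console.log(second.start, second.end); // prints 11 13
```

Because `next()` is a closure over the original string and the end of the current match, a caller can walk all matches lazily until `undefined` comes back.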
It can then be passed to these four 'regex-aware' functions and they will search using the custom matcher rather than a regex. diff --git a/website/versioned_docs/version-2.1.0/sorting-grouping.md b/website/versioned_docs/version-2.1.0/sorting-grouping.md new file mode 100644 index 00000000..2ebbca3f --- /dev/null +++ b/website/versioned_docs/version-2.1.0/sorting-grouping.md @@ -0,0 +1,109 @@ +--- +id: version-2.1.0-sorting-grouping +title: Sorting, Grouping and Aggregation +sidebar_label: Sorting, Grouping and Aggregation +original_id: sorting-grouping +--- + +## Sorting + +Arrays contain an ordered collection of values. If you need to re-order the values, then the array must be sorted. In JSONata, there are two ways of sorting an array: + +1. Using the [`$sort()`](array-functions#sort) function. + +2. Using the [order-by](path-operators#order-by-) operator. + +The [order-by](path-operators#order-by-) operator is a convenient syntax that can be used directly in a path expression to sort the result sequences in ascending or descending order. The [`$sort()`](array-functions#sort) function requires more syntax to be written, but is more flexible and supports custom comparator functions. + +## Grouping + +The JSONata [object constructor](construction#object-constructors) syntax allows you to specify an expression for the key in any key/value pair (the value can obviously be an expression too). The key expression must evaluate to a string since this is a restriction on JSON objects. The key and value expressions are evaluated for each item in the input context (see [processing model](processing#the-jsonata-processing-model)). The result of each key/value expression pair is inserted into the resulting JSON object. + +If the evaluation of any key expression results in a key that is already in the result object, then the result of its associated value expression will be grouped with the value(s) already associated with that key. 
Note that the value expressions are not evaluated until all of the grouping has been performed. This allows for aggregation expressions to be evaluated over the collection of items for each group. + +__Examples__ + +- Group all of the product sales by name, with the price of each item in each group +
+
Account.Order.Product{`Product Name`: Price}
+
{ + "Bowler Hat": [ 34.45, 34.45 ], + "Trilby hat": 21.67, + "Cloak": 107.99 +}
+
+ +- Group all of the product sales by name, with the price and the quantity of each item in each group +
+
Account.Order.Product { + `Product Name`: {"Price": Price, "Qty": Quantity} +}
+
{ + "Bowler Hat": { + "Price": [ 34.45, 34.45 ], + "Qty": [ 2, 4 ] + }, + "Trilby hat": { "Price": 21.67, "Qty": 1 }, + "Cloak": { "Price": 107.99, "Qty": 1 } +}
+
+ +Note that in the above example, the value expression grouped all of the prices together and all of the quantities together into separate arrays. This is because the context value is the sequence of all grouped Products and the `Price` expression will select all prices from all products. If you want to collect the price and quantity into individual objects, then you need to evaluate the object constructor _for each_ product in the context sequence. The following example shows this. + +- Explicit use of `$.{ ... }` to create an object for each item in the group. +
+
Account.Order.Product { + `Product Name`: $.{"Price": Price, "Qty": Quantity} +}
+
{ + "Bowler Hat": [ + { "Price": 34.45, "Qty": 2 }, + { "Price": 34.45, "Qty": 4 } + ], + "Trilby hat": { "Price": 21.67, "Qty": 1 }, + "Cloak": { "Price": 107.99, "Qty": 1 } +}
+
+ +- Multiply the Price by the Quantity for each product in each group +
+
Account.Order.Product{`Product Name`: $.(Price*Quantity)}
+
{ + "Bowler Hat": [ 68.9, 137.8 ], + "Trilby hat": 21.67, + "Cloak": 107.99 +}
+
+ +- The total aggregated value in each group +
+
Account.Order.Product{`Product Name`: $sum($.(Price*Quantity))}
+
{ + "Bowler Hat": 206.7, + "Trilby hat": 21.67, + "Cloak": 107.99 +}
+
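The two-phase behaviour described in this section (group first, then evaluate each value expression over the whole group) can be sketched in JavaScript. This is a simplified model rather than the engine's implementation: `groupBy` is a hypothetical helper, and the sample rows are an assumed reconstruction of the `Product` values from the prices and quantities shown in the examples above.

```javascript
// Assumed sample data, reconstructed from the example results above.
const products = [
  { "Product Name": "Bowler Hat", Price: 34.45, Quantity: 2 },
  { "Product Name": "Trilby hat", Price: 21.67, Quantity: 1 },
  { "Product Name": "Bowler Hat", Price: 34.45, Quantity: 4 },
  { "Product Name": "Cloak", Price: 107.99, Quantity: 1 },
];

// Phase 1 collects the items that share a key; phase 2 evaluates the value
// expression once per group, over all of that group's items.
function groupBy(items, keyOf, valueOf) {
  const groups = new Map();
  for (const item of items) {
    const key = keyOf(item);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(item);
  }
  const result = {};
  for (const [key, members] of groups) {
    result[key] = valueOf(members);
  }
  return result;
}

// Mirrors: Account.Order.Product{`Product Name`: $sum($.(Price*Quantity))}
const totals = groupBy(
  products,
  (p) => p["Product Name"],
  (group) => group.reduce((sum, p) => sum + p.Price * p.Quantity, 0)
);

console.log(totals);
```

Deferring the value function until every group is complete is what lets an aggregate such as a sum see all of a group's items at once. The printed totals are subject to ordinary floating-point rounding.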
+ + + +## Aggregation + +Often queries are just required to return aggregated results from a set of matching values. A number of aggregation functions are available which return a single aggregated value when applied to an array of values. + +__Examples__ + +- Sum of the unit prices of all products across all orders +
+
$sum(Account.Order.Product.Price)
+
198.56
+
+ +- More likely, you want to add up the price times quantity of every product across all orders +
+
$sum(Account.Order.Product.(Price*Quantity))
+
336.36
+
+ +Other [numeric aggregation functions](aggregation-functions) are available (i.e. average, min, max) and an [aggregator for strings](string-functions#join). It is also possible to write complex custom aggregators using the [`$reduce()`](higher-order-functions#reduce) higher-order function. + diff --git a/website/versions.json b/website/versions.json index dbe05603..b50b4055 100644 --- a/website/versions.json +++ b/website/versions.json @@ -1,4 +1,5 @@ [ + "2.1.0", "2.0.0", "1.8.0", "1.7.0"