Skip to content

Latest commit

 

History

History
1044 lines (739 loc) · 36.5 KB

LANGUAGE-REFERENCE.adoc

File metadata and controls

1044 lines (739 loc) · 36.5 KB

Unquery Reference Manual

Table of Contents

A full reference of the Unquery language.

1. Overview

The query language Unquery is a query language for structured documents, currently supporting JSON and XML. It is the query language of XCiteDB, a structured document database with version control, supporting XML as well as JSON metadata. It is also used in unq, a tool for querying and transforming JSON documents. While supporting multiple documents formats as input, the result of Unquery is always represented as JSON. In case of XML, the query results are represented as strings inside a JSON document.

There are two main design principles for Unquery:

  • Principle 1: Each Unquery query is a well-formed JSON document.

  • Principle 2: The structure of the query result corresponds to the structure of the query itself.

There are few exceptions to those principles. First, we allow c-style comments (both // and /* …​ /*) in Unquery. And second, few language features may violate Principle 2 and create results that may not completely correspond to the query structure.

2. Query structure and evaluation

Each query is evaluated by traversing the query from the top node to the leaves in the order they appear in the query. When querying multiple documents, the query is repeatedly evaluated for each document. Objects in Unquery generate corresponding objects in the result. Arrays in Unquery generate arrays, and strings contain expressions that are evaluated into JSON values in the result (either a string, or any other value).

2.1. Strings

A string in Unquery represents an expression. The simplest expressions are paths in the queried JSON document. For example:

  • "Field1" is the value for the key Field1.

  • "Field1.Field2" is the value of Field2 within Field1

  • "[1]" is the 2nd element in an array (starting from 0)

  • "Field1[0]" is the first element in the array in Field1.

  • "Field1[0].Field2" is the value of Field2 in the object at the first element in the array in Field1.

When a field name includes characters other than alphanumeric characters on underscore, we need to enclose it in backticks. E.g:

"`The Second Field`"

Note that a path does not have to result in a string value. It can be evaluated to any JSON value, including arrays, objects, and even complex structures such as objects containing arrays of objects etc.

Other basic expressions include constant numbers, bool values, or strings. String constants need to be enclosed in single quotes: "'A String'" or escaped quotes: "\"A String\"".

We can write more complex expressions with operators. For example "Field1+Field2", use parentheses for presedence. For example: "(Field1+2)*5", and use functions, such as: "$length(Field1)" for the length of the string in Field1. Functions always start with a $ sign. We will list all supported operators and functions later in this document.

At the end of the string, after an expression, we may add a predicate, which starts with the symbol ? followed by a condition. For example: "Field1?Field2=5". Given a predicate, the expression would be evaluated only when the condition is true.

We may also provide sorting specifier for ordering values inside an array. E.g. "Field1@ascending".

2.2. Arrays

An array in Unquery may contain one or more values. An array in the query evaluates to an array in the result. Each time we evaluate the array in the query (for each document, or multiple times when iterating over elements within the document), we add a new element for the result array.

For example, suppose the query is: ["FirstName"]

If we evaluate multiple documents, when one document is for John Doe, the other is for Jane Doe, and the third is for David Gill, the result would be: ["John","Jane","David"]

Similarly, if the query is: ["FirstName","LastName"]

The result would be: ["John","Doe","Jane","Doe","David","Gill"]

2.3. Objects

Objects in Unquery evaluate to objects in the results. Each object in the query contains pairs of string keys, and values. In the simplest case, constant keys in the querry would correspond to keys with the same name at the result. For example, for the query:

{
    "key1":"Field1",
    "key2": "2+5"
}

The result would be:

{
    "key1":"The value of Field1",
    "key2":"7"
}

The keys are evaluated in order. And for each key, the query evalautes the value. Note that when the value is a simple string expression as in the above example, it is evaluated once, and then can no longer change, even if we evaluate the same object again for a different documents. If, for one document, the value of Field1 is "One value", and for another, the value is "Another value", the result would be just "One value".

Also note that the order of the fields in the result is not guaranteed. While Unquery tries to preserve the order of the fields in an object, the order might not be preserved at some cases, especially when using dynamic key names (See Section ???).

Key names in Unquery are not just constant strings. The language support other types of key names, such as directives (special instructions, starting with #, for example, "#if"), and dynamic key names which evaluate to one or more key names (for example, "$(Field1)").

Following the key name, Unquery allows a language construct called context modifier (See Section ???).

2.4. Mixing objects and arrays

Unquery allows mixing object and arrays in any possible way. For example, when the query is [{"key1":"Field1"}], the result would be:

[{"key1":"value1"}, {"key1","value2"},{"key1":"vale3"}]

For the query: {"key1":["Field1"]}

The result is:

{"key1":["value1","value2","value3"]}

And for: [["Field1"]], the result is:

[["value1"],["value2"],["value3"]]

For the last example, note that there are three entries for the outer array, and a single entry for each inner array. The reason is that each time we evaluate a new document, we add a new entry to the outer array. When we evaluate the second document, we add a new entry to the outer array, so we no longer modify the inner array for the first entry.

Mixing arrays and objects become even more interesting once we use context traversal. For example, the query:

{"result:Array1[]": ["Field1"]}

Would result in one long array with all the values for Array1[].Field1 for all documents. e.g.:

{"result": ["doc1_value1","doc1_value2","doc2_value1","doc2_value2"]}

But if we enclose the entire query in square brackets:

[{"result:Array1[]": ["Field1"]}]

We would get:

[{"result":["doc1_value1","doc1_value2"]},
 {"result":["doc2_value1","doc2_value2"]}]

Note that once again, when we use arrays within arrays, the result is that Unquery creates a new object for each document in the outer array, and within each document, it adds new values to the inner array.

3. Path expressions

A path expression is used to select a value in a document by specifying the path (i.e. keys and array indexes) from the top. A path in Unquery is relative to the current context path, which is intially the document’s root (or top), unless changed with a context modifier.

3.1. Dot

A . (dot) is used to specify the current value. Without any context modifiers, it is the document’s root, and would therefore return the entire document. For example:

["."]

Would return all queried documents as an array. And:

{"result:Field1": "."}

Is equivalent to:

{"result":"Field1"}

3.2. Field name

A field name returns a value corresponding to a key at the current context path. For example: "Field1"` would return the value of Field1.

When the field name includes spaces or any symbols other than underscore or alphanumeric characters, you need to use backtick. For example:

"`Field name with spaces`"

3.3. Subfield

A dot followed by field name (without any spaces between the dot and the key name) extracts a subfield (key for a value). For example: "Field1.Subfield2".

A subfield extraction operator can be used after an expression, and not just a path. For example: "$var(x).Subfield1" extracts the field Subfield1 from the JSON stored in variable x.

The subfield name doesn’t have to be a string. We can use an expression an evaluate it to get the key name. For example: Field1.$(Field2) would read the value of Field2 and use the value as the key name to extract.

3.4. Array element

A number in square brackets, e.g. [1], extracts the corresponding element number from the array (in this case, element 1, which is the 2nd element). An array element can appear anywhere in a path. For example: [0].Field1, or Field1[0][2].Field2.

We can use an expession instead of a constant number. For example Field1[1+1] or Field1[$index].

3.5. Whole array

Two square brackets without any number or expression between them returns all of the elements of an array. For example: Array1[]. In this case, this would be equivalent to Array1 without any square brackets, since the value of Array1 is the entire array.

Hovever, we do need to use the [] operator when doing array projection. For example: Array1[].Field1 would return an array of Field1 value for each object in Array1. We could achieve the same effect using context traversal, as in:

{"result:Array1[]":["Field1"]}

Which is usually preferable since it’s more idiomatic for Unquery. But in some cases, array projections are more convenient. For example, when using the in operator.

3.6. Root

In case we changed the current path with a context modifier, we can still access the root element, or any path that starts with the root element (i.e. absolute path) by using a / (slash) operator. For example:

{"result:Field1": "/Field2"}

This would return Field2 of the top object, and not Field1.Field2.

3.7. Up

The ../ (two dots and slash) operator changes the context to one level up in the path. For example, if the current path is Field1.Field2, the path expression ../Field3 returns the value of Field1.Field3. Note that the up opator also skips array indexes, so when the path is Field1.Field2[3], the expression: ../Field3 would still return: Field1.Field3.

3.8. Previous document

The << operator would temporary switch back to the previous context, before switching to a different document. For example:

{
    "result->$file('another.unq')" : {
        "key1":"Field1",
        "key2":"<<Field1"
    }
}

In this case, key1 would contain the value of Field1 in another.unq, and key2 would contain the value of Field1 in the original document.

4. Conditions

Conditions are used to filter out results. Conditions can appear in four different places in Unquery:

4.1. Predicates

Predicates appear in a value, after an expression, or in a key, after a context modifier. The syntax is ? condition. When there is a predicate on a value, the value is evaluated only when the predicate is true.

When the predicate is false, the expression is skipped. When the predicate appears in a array value, the value won’t be added to the array. When a predicate appears in an object, the specific key and value won’t be added to the object, but other fields may still be evaluated and added. When a false predicate appears after an aggregate function, the aggregate value won’t be updated in this case.

For example:

{
   "key1": "value1?x!=1",
   "key2": "value2",
}

When x is 1, "key1" won’t appear, but "key2" would still be there.

4.2. Constraints

Constrains have the syntax: <expression1> <comparator-op> <expression2>. When used in an array or standalone value, the effect of "value1 = value2" (for example) would be the same as "value1? value1 = value2". However, when used in an object field, the entire object won’t be evaluated when the constraints are not met. For example:

{
    "key1": "value1!=1",
    "key2": "value2>5",
    "key3": "value3"
}

This object is evaluated only when both value1!=1 and value2>5. This is equivalent to using the`#if` directive:

{
    "#if": "value1!=1 & value2>5"
    "key1": "value1",
    "key2": "value2",
    "key3": "value3"
}

4.3. Equality and inequality operators

We can compare the value of two expressions using equality and inequality operators: =, <, >, , >=, '!='.

For example x != y.

4.4. Negation

We can negate the condition using a ! before the condition. For example ! x=y is equivalent to x!=y.

4.5. Exists

An exclamation mark after a field name or path is true only if the field exist in the document. For example: "Field1.Field2!" is true when Field2 exists in Field1. This operator can also be used as value constraint. For example:

{
    "key1": "Field1!",
    "key2": "value_expr"
}

The object is evaluated only when Field1 exists.

4.6. Type test

The type operators: is_array, is_object, is_literal, is_number, is_float, is_int, and is_bool test if a field is of a specific type. For example, to get all integer values in an array with mixed numbers and strings:

{
    "numbers:[]": [". is_int"]
}

4.7. Conjunctions and parentheses

The operators & (and) and | (or) can be used to combine conditions. & have higher precedence than `|'. Paretheses are allowed to change the order of precedence.

For example: x=5 | (y>7 & x=z)

4.8. String conditions

Unquery supports the operators contains, starts_with, ends_with and matches. Where the first three test if a string contains, starts with or ends with another string, and the last one tests if a string matches a regular expression.

For example: Field1 contains 'Developer' & Field2 matches 'A.*b'.

4.9. Value in array

The operator in is true when a value is found inside an array. The operator not_in is its negation: value not_in array is equivalent to !(value in array).

5. Dynamic field names

Keys (field names) can be either a constant string, or an expression that get evaluated to a value (or multiple values, in some cases). A non-constant key is a dynamic value.

5.1. Constant keys

When a key is an alphanumeric string (including underscores), or enclosed in backticks, it is treated as a constant. The key in the result is the same as the key in the query.

5.2. Eval expression

When using $(expression), the expression is evaluated, and the resulting value is the key name. For example:

{
   "Field1": "value1",
   "$(Field2)": "value2"
}

In this case, the name of the first field would be "Field1", and the name of the second field would be "Field2".

Note that if we evaluate this object multiple times (either for multiple documents, or when using context traversal), we would add more fields to the result object. This can be used to construct a dictionary-style object.

Another case where this is useful is when some key values occure multiple times. We can use the pattern:

{
    "$(bin)": ["value"]
}

The effect would be similar to group_by in other query languages. For example, if we have the JSONs:

[
    {
        "bin": "One",
        "value": "val1"
    },
    {
        "bin": "Two",
        "value": "val2"
    },
    {
        "bin": "One",
        "value": "val3"
    },
    {
        "bin": "Two",
        "value": "val4"
    },

]

And the query:

{
    "results:[]": {
        "$(bin)" : ["value"]
    }
}

The result would be:

{
    "results": {
        "One": ["val1", "val3"],
        "Two": ["val2", "val4"]
    }
}

Note that when using function (starting with $ sign), the $(…​) is redundant. For example, in the key, we can use "$index" instead of "$($index)".

5.3. Copy keys from queried object

When using "{}" as the key, this would be evaluated to all keys in the queried object. For example, when the query is:

{
    "{}": "value"
}

And the queried object contains key1, key2 and key3, the result would be:

{
    "key1": "value",
    "key2": "value",
    "key3": "value"
}

The value in all three keys is identical here, which is probably rarely useful. A more useful pattern is:

{
    "{}:" : "."
}

The colon after the {} means that the context for each key would be modified accoding to the name of the key, so that "." is the value of the corresponding key in each case. The result is copying all keys and their correspond values from the queried objects. This would allow us to copy all fields, and possibly add extra fields to the result.

If we wanted to filter out some fields instead of copying everything, there are several ways we can do that. One is by adding a predicate to the value expression. For example, for copying all fields except key2, we can write:

{
    "{}:" : ".?$key!='key2'"
}

Another syntax we support is:

{
    "{'regex'}:" : "."
}

This is equivalent to:

{
    "{}:" : ".?$key matches 'regex'"
}

6. Array sorting

By default, array elements appear in the order they are added to the array. In case we want the array to be ordered, we can use sorting specifiers.

6.1. Simple sorting

There are four sorting specifiers in Unquery: * @ascending * @descending * @unique_ascending * @unique_descending

Sorting specifiers are added at the end of a string value, after the expression (and after the predicate, if present). When used, strings are sorted by lexical order, and numbers are sorted numerically. The @unique…​ specifiers also remove duplicates.

For example: ["FirstName@unique_ascending"]

6.2. Sorting arrays of objects

For arrays of objects, we can sort the objects according to a value of a field in the object. For example:

[
    {
        "firstname":"FirstName@ascending",
        "lastname":"LastName"
    }
]

This would sort the object based on first name. We can also define both primary and secondary keys for sorting (or any number of keys), by adding a number in parentheses after the sorting specifier. For example:

[
    {
        "firstname":"FirstName@descending(2)",
        "lastname":"LastName@ascending(1)"
    }
]

In this case, the object are ordered in ascending order according to last name, and object with the same last name are ordered in descending order based on first name.

6.3. Arrays with muliple items in the query

It is unadvisable to use sorting specifiers with more than one item in the array in the query. For example, the sorting order for the following query is undefined: ["FirstName@ascending","Lastname"]

In case you do want to sort such an array, you should use the same sorting specifier for all entries. For example: ["FirstName@ascending","Lastname@ascending"]

7. Context modifiers

A context in Unquery consists of the identity of the document that is being processed, and a path within this document. Initially, the query is traversing a set of documents (for example, the JSON files that are specified at the command line), and the path is the top element for each document.

A context modifier can change the context of the query. Context modifiers are specified in the key, after the key names, and their scope is the value for that key (i.e. everything in the object/array/string that is provided as a value for this key).

There are two kinds of context modifiers: context modifiers that change the path within a document, which start with : (colon), and context modifiers that switch to a different document, which start with (arrow).

Some context modifiers iterate through multiple contexts (either paths, or documents). Such context modifiers are called context traversals.

7.1. Path modifiers

The simplest types of context modifiers are simply path expressions. For example:

{
    "key:Accounts[0]": "id"
}

In this case, it would return the value of Accounts[0].id.

7.2. Modify context based on key name

A useful abbreviation is using a colon after a key name, with nothing after that. For example:

{
    "Field1:": "value"
}

The key Field1: is equivalent to Field1:Field1. This allows us to avoid writing the same key over and over. If, for example, we want to copy the key and value for FirstName, we can write: "FirstName:" : "."

7.3. Array traversal

Context modifiers allow us to iterate through all elements in an array, by using []. For example:

{
    "key:Accounts[]": ["id"]
}

Would return an array of all ids for all accounts. One subtle detail here is that in case the current value is not an array but an object or literal, it would not be an error, and the query would process this object as if it was an array with a single element containing this value.

7.4. Object keys traversal

Context modifiers also allow us to iterate trough all fields in an object, using {}. For example:

{
    "result:{}": ["."]
}

The above query takes an object and convert it to an array containing all the values of fields in that object.

If we do not want to iterate over all fields but just some of them, one way to do this is using: {'regex'}. For example:

{
    "result:{'A.*B'}": ["."]
}

Another way to filter the fields is to use predicates. For example:

{
    "result:{}?$key!='LastName'": ["."]
}

7.5. Recursive descent

In case we want to traverse all paths in our context, including all array elements and all object keys, we can use the recursive descent operator : **. For example, the following query returns a sorted array with all keys in the document:

{
    "#return:**": ["$key@unique_ascending"]
}

7.6. Switching documents

Context modifiers that switch documents start with a (arrow) operator, followed by an expression specifying the new document.

We currently suppot the following (there are additional arrow operations in XCiteDB):

  • →$file('file-name'): Read another json file, and swtich context to this file. This allows us to link to data in another document.

  • →$var(variable-name): Switch to the JSON stored in this variable.

Notes:

  1. It is usually better to read a file into a variable an switch to that variable, so that we won’t have to read the same file mutiple times.

  2. when switching documents, it is useful to use the << operator to connect between the values in the old document and values in the new document.

7.7. Chaining context modifiers

We can specify a sequence of context modifiers, by writing them one after the other. For example:

{
    "result:[]:{}" : ["."]
}

This query would return an array with all the field values of all the object in an array.

{
    "result->$file('employees.json'):Employees[]" : ["FirstName"]
}

This would show all the first names in the file employees.json.

7.8. Context-or operator

When we want to explore multiple alternative contexts, we can use the || (context-or) operator.

For example:

{
    "names:.||Family[]" : ["FirstName"]
}

The above query would list all first names, both at the top of each JSON document, and in the Family[] array.

7.9. Context predicates

Optionally, a predicate can be added after each context quelifier. For example:

{
    "result:Customers[]?Balance>100000:Accounts[]" : ["accountNumber"]
}

8. Directives

Directives are specified using key names that start with #. They can be used to change the behaviour of the query or perform special actions such as variable and function declarations. The are evaluated in the order they appear.

8.1. #assign

The #assign directive assigns a new value into a variable. For example:

{
    "#var x" : "1",
    "obj": {
        "#assign x": "2"
    },
    "x_value": "$var(x)"
}

The result of the above query is {"x_value": 2}.

Note: when the variable was not previously declared with #var, the effect of #assign would be the same a #var.

8.2. #exists

The #exists directive is a condition which is true only if its value is non-empty. This allows us to test if some condition holds within a complex value (arrays, objects etc.). For example:

[
    {
        "#exists:Employees[]" : ["Salary>100000"],
        "company":"CompanyName"
    }
]

The above query returns the list of companies that have employees with a salary of over 100000.

8.3. #func

The #func directive allows adding user-defined functions. For example:

{
    "#func fullname" : "FirstName+' '+LastName",
    "names:Employees[]": ["$fullname"]
}

This query is equivalent to:

{
    "names:Employees[]": ["FirstName+' '+LastName"]
}

Functions can also have parameters. For example:

{
    "#func fullname(x,y)" : "$var(x)+' '+$var(y)",
    "names:Employees[]": ["$fullname(FirstName, LastName)"]
}

Function definitions can also be recursive, for more complicated queries.

8.4. #if

An #if directive specifies a condition. If the condition is false, the entire object is skipped.

For example:

[
    {
        "#if" : "Title!=CEO",
        "FirstName:": ".",
        "LastName:": "."
    }
]

The above query returns the list of all employees, except the CEO.

8.5. #notexists

A #notexist directive is a condition which is true only when the value is empty. For example:

[
    {
        "#notexists:Employees[]": ["Salary<30000"],
        "company": "CompanyName"
    }
]

This query would return the list of companies that do not have employees than make under 30000.

8.6. #return

The #return directive breaks the second principle of Unquery. It can make the structure of the result different than the structure of the query. But it is necessary to avoid a potential limitation when using context modifiers or directives. To use context modifiers or directives, we often need to add a wrapping object in cases where we do not need an object in the result.

For example:

{
    "result:Employees[]": ["FirstName"]
}

The above query returns an object containing a key: result, with an array of the first names of all employees. But in case we want to get simply an array, without the wrapping object, we can write:

{
    "#return:Employees[]": ["FirstName"]
}

Note that in case of a return, any other fields in that object would be dropped from the result. And in case of multiple #return directives in the same object, only the first one would be effective.

8.7. #returnif

The #returnif directive is effective only if its value is non-null. This allows us to provide multiple #returnif directives, where each is true in different cases.

One place where this is very useful is for defining a recursive function. For example, the following function works for any JSON with any structure, and replaces the value of field with the key key1 to 'something else':

{
	"#func rec" : {
	       "#returnif:[]": ["$rec"],
	       "#returnif:{}": {
	       	   "$key":"$rec"
	       },
	       "#returnif":"'something else'?$key='key1'",
	       "#return":"."
	},
	"#return" : "$rec"
}

8.8. #var

A #var directive declares a variable, that can later be accessed either with the $var function or with %name. A variable can store any JSON value, including simple strings and numbers, or more complex values.

For example:

{
    "#var x": "'Some string'",
    "Field1": "%x"
}

And a more complex example:

{
    "#var dic:Employees[]": {
        "$(EmployeeId)": "."
    },
    "Employee1": "$var(dic).1001"
}

This example transforms the array of all employees into a dictionary containing all employees and indexed by EmployeeId as the key. Later, we can access a specific employee with id 1001 by $var(dic).1001.

Variables are scoped. When a variable is declared in an outer object, and a variable with the same name is declared in an inter object, the inner variable hides the outer variable.

For example:

{
    "#var x":"1",
    "obj":{
        "#var x":"2",
        "value":"$var(x)"
    },
    "value":"$var(x)"
}

Inside obj, the value of x is 2. But after getting back to the outer object, the value of x is 1 again.

9. Operators

We support arithmetic and string operators in Unquery expressions.

9.1. Arithmetic operators

All the standard operators: +, -, *, /, and mod (modulu) are supported.

The operators are supported both for integers and floating point. If both operands are integers, the result is an integer. Otherwise, the result is a double-precision floating point number.

9.2. String operators

The only supported operator between strings in +, which performs string concatenation.

9.3. Order of predendence and parentheses

The order of precedence is + and - are lower precedence, *, '/', and mod are higher precedence. We can use parentheses to change the order of precedence.

10. Functions

10.1. $() (evaluate function)

The $() (or evaluate) treats the expression inside the parantheses as a path expression, and returns the value of that path. This is useful, for example, for retrieving a value in a dictionary object. For example, if employees is a dictionary from id to employee names, and group is an array of id`s, the following query would list the employee names in `group:

{
    "#return:group[]": ["/employees.$(.)"]
}

10.2. $csv

$csv(filename [,delimiter] [,have_headers]) is a function that reads a csv file (comma seperated values), and returns a json document. The delimiter parameter is optional, with comma as default, and have_deaders is an optional boolean value, with default true.

The function expects a file where the first line the the column names (unless have_headers is false), rows are seperated by newline, and columns are seperated by the specified delimiter (which is not used inside the values).

The result is an array of objects. The field names in each object are the column names. The values are strings, numbers, or booleans. If the value in the csv looks like a number (i.e. containing only numeric digits and a dot), it would be converted into numbers. If it is "true" or "false", it would be converted into boolean. Otherwise, the value is a string.

For example, for the csv file:

username,user_id,active
sela,100,true
ari,101,true
jack,102,false

The result is:

[
    {
        "username": "sela",
        "user_id": 100,
        "active": true
    },
    {
        "username": "ari",
        "user_id": 101,
        "active": true
    },
    {
        "username": "jack",
        "user_id": 102,
        "active": false
    }
]

10.3. $env

$env(expression), where expression evaluates to a string value, returns the value of the environment variable in expression. For example, "$env('BROWSER')" returns the value of the environmet variable BROWSER. === $file

The function $file reads a json file, and returns its value. For example:

{
    "ThisJSONFile": ".",
    "AnotherJSONFile": "$file('another-file.json')"
}

10.4. $filename

$filename (without any parameters), returns the string for the current filename. For example, getting array with the names of all processed files:

[
    "$filename"
]

10.5. $find

The function $find(path,string) finds all occurrences of string in path, and returns an array with all indexes where string is found.

10.6. $if

The function $if(cond,then,else) is used for a ternary if-then-else expression. For example: "$if(x>10,'yes', 'no')"

10.7. $ifind

$ifind(path,string) is like $find, but case insensitive. Returns an array of indexes.

10.8. $index

The $index function returns the last index in the context path. If the current path is Field1[5], it would return 5, and also in case the path is Field1[5].Field2, or Field1[3][5].

10.9. $join

$join(array,delimiter) joins all strings in a json strcture (array, object, etc), adding delimiter between them.

10.10. $key

The $key$ function returns the last key in the context path. For example, if the current path is Field1.Field2, it would return Field2. If the path ends with an array index, returns the key before any array index. For example, for the path Field1.Field2[7], the returned key would be Field2 as well.

10.11. $length

The function $length(expr) returns the length of the string returned by expr.

10.12. $lower

The function $lower(expr) converts the string returned by expr to lowercase.

10.13. $now

Current time in unix epoch time.

10.14. $path

The function $path (no arguments) returns the current context path.

10.15. $replace

The function $replace(str, from, to [,replace_all?]) replaces all occurances of from in str to to (or first occurance if replace_all is false).

10.16. $reskey

The function $reskey returns the result key name in the result. For example: {"field1:Field1":"$reskey"} would return {"field1":"field1"}.

10.17. $size

The function $size(path) returns the size of the array in $path.

10.18. $split

The function $split(string, delimiter) splits the delimiter-separated values in a string into an array.

10.19. $substr

$substr(stringexpr, start, length) returns a substring of stringexpr, startin at offset start, and of length length.

10.20. $time_to_str

The function $time_to_str(expr,format_string) converts unix epoch time to date and time string based on a format string. The format string is the same as in strftime in c.

10.21. $to_time

The function $to_time(string,format_string) converts a string into unix epoch time. The format string is the same as in strptime in c. === $upper

The function $upper(expr) converts the string returned by expr to uppercase.

10.22. $var

The function $var(variable-name) returns the value of a variable. For example:

{
    "#var x": "1000",
    "value": "$var(x)
}

Would return {"value":1000}

Note: instead of $var(x), you can also write: %x.

10.23. Type cast functions: $string, $number, $int, $float, $bool

Type cast functions convert literals from one type to another. It can also convert strings to numbers. For example, "$int('5')" would return the integer value "5".

Another use case is for forcing floating point division. "14/10" would return "1", and "14/$float(10)" returns "1,4".

$number casts strings to either int or float type, depending on the string’s content.

11. Aggregate functions

Aggregate functions calculate a value based on multiple fields. Each time a field is visited, the value of the aggregate function is updated. For example: "$max(Salary)" would return the maximum value for Salary.

11.1. Aggregates and non-array values

Recall that when we have values (including objects) outside of an array, it is usually updated once, even if there are multiple matching values. For example, the query {"salary":"salary"} returns the first salary the query finds, even if there are multiple employees with different salaries. However, when we use aggregate functions, the value is updated each time we visit the field containing that value. For example: {"AllSalaries":"$sum(salary)"} would return the sum of all salaries.

11.2. Aggregates, arrays, and grouping

An aggregate value is updated each time we visit that field. This mean that in case we use an array, such as ["$sum(Salary)"], a new element would be added each time we update the array, and this value is updated based on one value alone. The result would be just an array of salaries in this case.

However, if we use context traversals inside the array, aggregate functions can still be meaningful. For example:

[
  {
    "name":"FullName",
    "avgFamilyAge:Family[]": "$avg(Age)"
  }
]

Would return an array of people, and the average age of the family members for each entry in this array. This allows us to do aggregation at each level in the document.

The same principle works when we group values using dynamic key names. For example, the following query would group employees by title, and display the average salary for each title:

{
    "$(Title)":"$avg(Salary)"
}

11.3. Aggregates and conditions

An aggregate value is updated each time it is evaluated. If we skip evaluating it due to using predicates or #if conditions, we won’t include those values. For example, "$avg(Age)?Age>=18" would calculate the average age of those that are 18 year old or more.

We can also use aggregate functions inside conditions, but with one caveat: we can only compare the aggregate functions to constant values. For example:

[
  {
    "name":"FullName",
    "avgFamilyAge:Family[]": "$avg(Age)>40"
  }
]

This would only list families with average age greater than 40.

11.4. $avg

The average value

11.5. $count

Takes no parameters, and returns the number of times we evaluated this function. Especially useful when combined with predicates. For example: `"$count?Age>=18"' would return the number of people that are 18 or older.

11.6. $min

The minimum value of a field

11.7. $max

The maximum value of a field

11.8. $prev

This function allows us to do aggregations that are not directly supported by a built-in function. $prev(defalut-value) returns default-value the first time we use it, and returns the value of the expression it appears in on subsequent calls. For example: "$prev('')+Text" would concatenate the text field Text in all documents.

11.9. $sum

The sum of the values of a field