- 1. Overview
- 2. Query structure and evaluation
- 3. Path expressions
- 4. Conditions
- 5. Dynamic field names
- 6. Array sorting
- 7. Context modifiers
- 8. Directives
- 9. Operators
- 10. Functions
- 10.1. $() (evaluate function)
- 10.2. $csv
- 10.3. $env
- 10.4. $filename
- 10.5. $find
- 10.6. $if
- 10.7. $ifind
- 10.8. $index
- 10.9. $join
- 10.10. $key
- 10.11. $length
- 10.12. $lower
- 10.13. $now
- 10.14. $path
- 10.15. $replace
- 10.16. $reskey
- 10.17. $size
- 10.18. $split
- 10.19. $substr
- 10.20. $time_to_str
- 10.21. $to_time
- 10.22. $var
- 10.23. Type cast functions: $string, $number, $int, $float, $bool
- 11. Aggregate functions
A full reference of the Unquery language.
The query language Unquery is a query language for structured documents, currently supporting JSON and XML. It is the query language of XCiteDB
, a structured document database with version control, supporting XML as well as JSON metadata. It is also used in unq, a tool for querying and transforming JSON documents. While supporting multiple documents formats as input, the result of Unquery is always represented as JSON. In case of XML, the query results are represented as strings inside a JSON document.
There are two main design principles for Unquery:
-
Principle 1: Each Unquery query is a well-formed JSON document.
-
Principle 2: The structure of the query result corresponds to the structure of the query itself.
There are few exceptions to those principles. First, we allow c-style comments (both //
and /* … /*
) in Unquery. And second, few language features may violate Principle 2 and create results that may not completely correspond to the query structure.
Each query is evaluated by traversing the query from the top node to the leaves in the order they appear in the query. When querying multiple documents, the query is repeatedly evaluated for each document. Objects in Unquery generate corresponding objects in the result. Arrays in Unquery generate arrays, and strings contain expressions that are evaluated into JSON values in the result (either a string, or any other value).
A string in Unquery represents an expression. The simplest expressions are paths in the queried JSON document. For example:
-
"Field1"
is the value for the keyField1
. -
"Field1.Field2"
is the value ofField2
withinField1
-
"[1]"
is the 2nd element in an array (starting from0
) -
"Field1[0]"
is the first element in the array inField1
. -
"Field1[0].Field2"
is the value ofField2
in the object at the first element in the array inField1
.
When a field name includes characters other than alphanumeric characters on underscore, we need to enclose it in backticks. E.g:
"`The Second Field`"
Note that a path does not have to result in a string value. It can be evaluated to any JSON value, including arrays, objects, and even complex structures such as objects containing arrays of objects etc.
Other basic expressions include constant numbers, bool values, or strings. String constants need to be enclosed in single quotes: "'A String'"
or escaped quotes: "\"A String\""
.
We can write more complex expressions with operators. For example "Field1+Field2"
, use parentheses for presedence. For example: "(Field1+2)*5"
, and use functions, such as: "$length(Field1)"
for the length of the string in Field1
. Functions always start with a $
sign. We will list all supported operators and functions later in this document.
At the end of the string, after an expression, we may add a predicate, which starts with the symbol ?
followed by a condition. For example: "Field1?Field2=5"
. Given a predicate, the expression would be evaluated only when the condition is true.
We may also provide sorting specifier for ordering values inside an array. E.g. "Field1@ascending"
.
An array in Unquery may contain one or more values. An array in the query evaluates to an array in the result. Each time we evaluate the array in the query (for each document, or multiple times when iterating over elements within the document), we add a new element for the result array.
For example, suppose the query is:
["FirstName"]
If we evaluate multiple documents, when one document is for John Doe, the other is for Jane Doe, and the third is for David Gill, the result would be:
["John","Jane","David"]
Similarly, if the query is:
["FirstName","LastName"]
The result would be:
["John","Doe","Jane","Doe","David","Gill"]
Objects in Unquery evaluate to objects in the results. Each object in the query contains pairs of string keys, and values. In the simplest case, constant keys in the querry would correspond to keys with the same name at the result. For example, for the query:
{
"key1":"Field1",
"key2": "2+5"
}
The result would be:
{
"key1":"The value of Field1",
"key2":"7"
}
The keys are evaluated in order. And for each key, the query evalautes the value. Note that when the value is a simple string expression as in the above example, it is evaluated once, and then can no longer change, even if we evaluate the same object again for a different documents. If, for one document, the value of Field1
is "One value", and for another, the value is "Another value", the result would be just "One value".
Also note that the order of the fields in the result is not guaranteed. While Unquery tries to preserve the order of the fields in an object, the order might not be preserved at some cases, especially when using dynamic key names (See Section ???).
Key names in Unquery are not just constant strings. The language support other types of key names, such as directives (special instructions, starting with #
, for example, "#if"
), and dynamic key names which evaluate to one or more key names (for example, "$(Field1)"
).
Following the key name, Unquery allows a language construct called context modifier (See Section ???).
Unquery allows mixing object and arrays in any possible way. For example, when the query is [{"key1":"Field1"}]
, the result would be:
[{"key1":"value1"}, {"key1","value2"},{"key1":"vale3"}]
For the query: {"key1":["Field1"]}
The result is:
{"key1":["value1","value2","value3"]}
And for: [["Field1"]]
, the result is:
[["value1"],["value2"],["value3"]]
For the last example, note that there are three entries for the outer array, and a single entry for each inner array. The reason is that each time we evaluate a new document, we add a new entry to the outer array. When we evaluate the second document, we add a new entry to the outer array, so we no longer modify the inner array for the first entry.
Mixing arrays and objects become even more interesting once we use context traversal. For example, the query:
{"result:Array1[]": ["Field1"]}
Would result in one long array with all the values for Array1[].Field1
for all documents. e.g.:
{"result": ["doc1_value1","doc1_value2","doc2_value1","doc2_value2"]}
But if we enclose the entire query in square brackets:
[{"result:Array1[]": ["Field1"]}]
We would get:
[{"result":["doc1_value1","doc1_value2"]},
{"result":["doc2_value1","doc2_value2"]}]
Note that once again, when we use arrays within arrays, the result is that Unquery creates a new object for each document in the outer array, and within each document, it adds new values to the inner array.
A path expression is used to select a value in a document by specifying the path (i.e. keys and array indexes) from the top. A path in Unquery is relative to the current context path, which is intially the document’s root (or top), unless changed with a context modifier.
A .
(dot) is used to specify the current value. Without any context modifiers, it is the document’s root, and would therefore return the entire document. For example:
["."]
Would return all queried documents as an array. And:
{"result:Field1": "."}
Is equivalent to:
{"result":"Field1"}
A field name returns a value corresponding to a key at the current context path. For example: "Field1"`
would return the value of Field1
.
When the field name includes spaces or any symbols other than underscore or alphanumeric characters, you need to use backtick. For example:
"`Field name with spaces`"
A dot followed by field name (without any spaces between the dot and the key name) extracts a subfield (key for a value). For example: "Field1.Subfield2"
.
A subfield extraction operator can be used after an expression, and not just a path. For example: "$var(x).Subfield1"
extracts the field Subfield1
from the JSON stored in variable x
.
The subfield name doesn’t have to be a string. We can use an expression an evaluate it to get the key name. For example: Field1.$(Field2)
would read the value of Field2
and use the value as the key name to extract.
A number in square brackets, e.g. [1]
, extracts the corresponding element number from the array (in this case, element 1
, which is the 2nd element). An array element can appear anywhere in a path. For example: [0].Field1
, or Field1[0][2].Field2
.
We can use an expession instead of a constant number. For example Field1[1+1]
or Field1[$index]
.
Two square brackets without any number or expression between them returns all of the elements of an array. For example: Array1[]
. In this case, this would be equivalent to Array1
without any square brackets, since the value of Array1
is the entire array.
Hovever, we do need to use the []
operator when doing array projection. For example: Array1[].Field1
would return an array of Field1 value for each object in Array1
. We could achieve the same effect using context traversal, as in:
{"result:Array1[]":["Field1"]}
Which is usually preferable since it’s more idiomatic for Unquery. But in some cases, array projections are more convenient. For example, when using the in
operator.
In case we changed the current path with a context modifier, we can still access the root element, or any path that starts with the root element (i.e. absolute path) by using a /
(slash) operator. For example:
{"result:Field1": "/Field2"}
This would return Field2
of the top object, and not Field1.Field2
.
The ../
(two dots and slash) operator changes the context to one level up in the path. For example, if the current path is Field1.Field2
, the path expression ../Field3
returns the value of Field1.Field3
. Note that the up opator also skips array indexes, so when the path is Field1.Field2[3]
, the expression: ../Field3
would still return: Field1.Field3
.
The <<
operator would temporary switch back to the previous context, before switching to a different document. For example:
{
"result->$file('another.unq')" : {
"key1":"Field1",
"key2":"<<Field1"
}
}
In this case, key1
would contain the value of Field1
in another.unq
, and key2
would contain the value of Field1
in the original document.
Conditions are used to filter out results. Conditions can appear in four different places in Unquery:
Predicates appear in a value, after an expression, or in a key, after a context modifier. The syntax is ? condition
. When there is a predicate on a value, the value is evaluated only when the predicate is true.
When the predicate is false, the expression is skipped. When the predicate appears in a array value, the value won’t be added to the array. When a predicate appears in an object, the specific key and value won’t be added to the object, but other fields may still be evaluated and added. When a false predicate appears after an aggregate function, the aggregate value won’t be updated in this case.
For example:
{
"key1": "value1?x!=1",
"key2": "value2",
}
When x
is 1, "key1"
won’t appear, but "key2"
would still be there.
Constrains have the syntax: <expression1> <comparator-op> <expression2>
. When used in an array or standalone value, the effect of "value1 = value2"
(for example) would be the same as "value1? value1 = value2"
. However, when used in an object field, the entire object won’t be evaluated when the constraints are not met. For example:
{
"key1": "value1!=1",
"key2": "value2>5",
"key3": "value3"
}
This object is evaluated only when both value1!=1
and value2>5
. This is equivalent to using the`#if` directive:
{
"#if": "value1!=1 & value2>5"
"key1": "value1",
"key2": "value2",
"key3": "value3"
}
We can compare the value of two expressions using equality and inequality operators: =
, <
, >
, ⇐
, >=
, '!='.
For example x != y
.
We can negate the condition using a !
before the condition. For example ! x=y
is equivalent to x!=y
.
An exclamation mark after a field name or path is true only if the field exist in the document. For example: "Field1.Field2!"
is true when Field2 exists in Field1. This operator can also be used as value constraint. For example:
{
"key1": "Field1!",
"key2": "value_expr"
}
The object is evaluated only when Field1
exists.
The type operators: is_array
, is_object
, is_literal
, is_number
, is_float
, is_int
, and is_bool
test if a field is of a specific type. For example, to get all integer values in an array with mixed numbers and strings:
{
"numbers:[]": [". is_int"]
}
The operators &
(and) and |
(or) can be used to combine conditions. &
have higher precedence than `|'. Paretheses are allowed to change the order of precedence.
For example: x=5 | (y>7 & x=z)
Unquery supports the operators contains
, starts_with
, ends_with
and matches
. Where the first three test if a string contains, starts with or ends with another string, and the last one tests if a string matches a regular expression.
For example: Field1 contains 'Developer' & Field2 matches 'A.*b'
.
Keys (field names) can be either a constant string, or an expression that get evaluated to a value (or multiple values, in some cases). A non-constant key is a dynamic value.
When a key is an alphanumeric string (including underscores), or enclosed in backticks, it is treated as a constant. The key in the result is the same as the key in the query.
When using $(expression)
, the expression is evaluated, and the resulting value is the key name. For example:
{
"Field1": "value1",
"$(Field2)": "value2"
}
In this case, the name of the first field would be "Field1"
, and the name of the second field would be "Field2"
.
Note that if we evaluate this object multiple times (either for multiple documents, or when using context traversal), we would add more fields to the result object. This can be used to construct a dictionary-style object.
Another case where this is useful is when some key values occure multiple times. We can use the pattern:
{
"$(bin)": ["value"]
}
The effect would be similar to group_by
in other query languages. For example, if we have the JSONs:
[
{
"bin": "One",
"value": "val1"
},
{
"bin": "Two",
"value": "val2"
},
{
"bin": "One",
"value": "val3"
},
{
"bin": "Two",
"value": "val4"
},
]
And the query:
{
"results:[]": {
"$(bin)" : ["value"]
}
}
The result would be:
{
"results": {
"One": ["val1", "val3"],
"Two": ["val2", "val4"]
}
}
Note that when using function (starting with $
sign), the $(…)
is redundant. For example, in the key, we can use "$index"
instead of "$($index)"
.
When using "{}"
as the key, this would be evaluated to all keys in the queried object. For example, when the query is:
{
"{}": "value"
}
And the queried object contains key1
, key2
and key3
, the result would be:
{
"key1": "value",
"key2": "value",
"key3": "value"
}
The value in all three keys is identical here, which is probably rarely useful. A more useful pattern is:
{
"{}:" : "."
}
The colon after the {}
means that the context for each key would be modified accoding to the name of the key, so that "."
is the value of the corresponding key in each case. The result is copying all keys and their correspond values from the queried objects. This would allow us to copy all fields, and possibly add extra fields to the result.
If we wanted to filter out some fields instead of copying everything, there are several ways we can do that. One is by adding a predicate to the value expression. For example, for copying all fields except key2
, we can write:
{
"{}:" : ".?$key!='key2'"
}
Another syntax we support is:
{
"{'regex'}:" : "."
}
This is equivalent to:
{
"{}:" : ".?$key matches 'regex'"
}
By default, array elements appear in the order they are added to the array. In case we want the array to be ordered, we can use sorting specifiers.
There are four sorting specifiers in Unquery:
* @ascending
* @descending
* @unique_ascending
* @unique_descending
Sorting specifiers are added at the end of a string value, after the expression (and after the predicate, if present). When used, strings are sorted by lexical order, and numbers are sorted numerically. The @unique…
specifiers also remove duplicates.
For example: ["FirstName@unique_ascending"]
For arrays of objects, we can sort the objects according to a value of a field in the object. For example:
[
{
"firstname":"FirstName@ascending",
"lastname":"LastName"
}
]
This would sort the object based on first name. We can also define both primary and secondary keys for sorting (or any number of keys), by adding a number in parentheses after the sorting specifier. For example:
[
{
"firstname":"FirstName@descending(2)",
"lastname":"LastName@ascending(1)"
}
]
In this case, the object are ordered in ascending order according to last name, and object with the same last name are ordered in descending order based on first name.
It is unadvisable to use sorting specifiers with more than one item in the array in the query. For example, the sorting order for the following query is undefined:
["FirstName@ascending","Lastname"]
In case you do want to sort such an array, you should use the same sorting specifier for all entries. For example:
["FirstName@ascending","Lastname@ascending"]
A context in Unquery consists of the identity of the document that is being processed, and a path within this document. Initially, the query is traversing a set of documents (for example, the JSON files that are specified at the command line), and the path is the top element for each document.
A context modifier can change the context of the query. Context modifiers are specified in the key, after the key names, and their scope is the value for that key (i.e. everything in the object/array/string that is provided as a value for this key).
There are two kinds of context modifiers: context modifiers that change the path within a document, which start with :
(colon), and context modifiers that switch to a different document, which start with →
(arrow).
Some context modifiers iterate through multiple contexts (either paths, or documents). Such context modifiers are called context traversals.
The simplest types of context modifiers are simply path expressions. For example:
{
"key:Accounts[0]": "id"
}
In this case, it would return the value of Accounts[0].id
.
A useful abbreviation is using a colon after a key name, with nothing after that. For example:
{
"Field1:": "value"
}
The key Field1:
is equivalent to Field1:Field1
. This allows us to avoid writing the same key over and over. If, for example, we want to copy the key and value for FirstName
, we can write: "FirstName:" : "."
Context modifiers allow us to iterate through all elements in an array, by using []
. For example:
{
"key:Accounts[]": ["id"]
}
Would return an array of all ids for all accounts. One subtle detail here is that in case the current value is not an array but an object or literal, it would not be an error, and the query would process this object as if it was an array with a single element containing this value.
Context modifiers also allow us to iterate trough all fields in an object, using {}
. For example:
{
"result:{}": ["."]
}
The above query takes an object and convert it to an array containing all the values of fields in that object.
If we do not want to iterate over all fields but just some of them, one way to do this is using: {'regex'}
. For example:
{
"result:{'A.*B'}": ["."]
}
Another way to filter the fields is to use predicates. For example:
{
"result:{}?$key!='LastName'": ["."]
}
In case we want to traverse all paths in our context, including all array elements and all object keys, we can use the recursive descent operator : **
. For example, the following query returns a sorted array with all keys in the document:
{
"#return:**": ["$key@unique_ascending"]
}
Context modifiers that switch documents start with a →
(arrow) operator, followed by an expression specifying the new document.
We currently suppot the following (there are additional arrow operations in XCiteDB):
-
→$file('file-name')
: Read another json file, and swtich context to this file. This allows us to link to data in another document. -
→$var(variable-name)
: Switch to the JSON stored in this variable.
Notes:
-
It is usually better to read a file into a variable an switch to that variable, so that we won’t have to read the same file mutiple times.
-
when switching documents, it is useful to use the
<<
operator to connect between the values in the old document and values in the new document.
We can specify a sequence of context modifiers, by writing them one after the other. For example:
{
"result:[]:{}" : ["."]
}
This query would return an array with all the field values of all the object in an array.
{
"result->$file('employees.json'):Employees[]" : ["FirstName"]
}
This would show all the first names in the file employees.json
.
When we want to explore multiple alternative contexts, we can use the ||
(context-or) operator.
For example:
{
"names:.||Family[]" : ["FirstName"]
}
The above query would list all first names, both at the top of each JSON document, and in the Family[]
array.
Directives are specified using key names that start with #
. They can be used to change the behaviour of the query or perform special actions such as variable and function declarations. The are evaluated in the order they appear.
The #assign
directive assigns a new value into a variable. For example:
{
"#var x" : "1",
"obj": {
"#assign x": "2"
},
"x_value": "$var(x)"
}
The result of the above query is {"x_value": 2}
.
Note: when the variable was not previously declared with #var
, the effect of #assign
would be the same a #var
.
The #exists
directive is a condition which is true only if its value is non-empty. This allows us to test if some condition holds within a complex value (arrays, objects etc.). For example:
[
{
"#exists:Employees[]" : ["Salary>100000"],
"company":"CompanyName"
}
]
The above query returns the list of companies that have employees with a salary of over 100000.
The #func
directive allows adding user-defined functions. For example:
{
"#func fullname" : "FirstName+' '+LastName",
"names:Employees[]": ["$fullname"]
}
This query is equivalent to:
{
"names:Employees[]": ["FirstName+' '+LastName"]
}
Functions can also have parameters. For example:
{
"#func fullname(x,y)" : "$var(x)+' '+$var(y)",
"names:Employees[]": ["$fullname(FirstName, LastName)"]
}
Function definitions can also be recursive, for more complicated queries.
An #if
directive specifies a condition. If the condition is false, the entire object is skipped.
For example:
[
{
"#if" : "Title!=CEO",
"FirstName:": ".",
"LastName:": "."
}
]
The above query returns the list of all employees, except the CEO.
A #notexist
directive is a condition which is true only when the value is empty. For example:
[
{
"#notexists:Employees[]": ["Salary<30000"],
"company": "CompanyName"
}
]
This query would return the list of companies that do not have employees than make under 30000.
The #return
directive breaks the second principle of Unquery. It can make the structure of the result different than the structure of the query. But it is necessary to avoid a potential limitation when using context modifiers or directives. To use context modifiers or directives, we often need to add a wrapping object in cases where we do not need an object in the result.
For example:
{
"result:Employees[]": ["FirstName"]
}
The above query returns an object containing a key: result
, with an array of the first names of all employees. But in case we want to get simply an array, without the wrapping object, we can write:
{
"#return:Employees[]": ["FirstName"]
}
Note that in case of a return, any other fields in that object would be dropped from the result. And in case of multiple #return
directives in the same object, only the first one would be effective.
The #returnif
directive is effective only if its value is non-null. This allows us to provide multiple #returnif
directives, where each is true in different cases.
One place where this is very useful is for defining a recursive function. For example, the following function works for any JSON with any structure, and replaces the value of field with the key key1
to 'something else'
:
{
"#func rec" : {
"#returnif:[]": ["$rec"],
"#returnif:{}": {
"$key":"$rec"
},
"#returnif":"'something else'?$key='key1'",
"#return":"."
},
"#return" : "$rec"
}
A #var
directive declares a variable, that can later be accessed either with the $var
function or with %name
. A variable can store any JSON value, including simple strings and numbers, or more complex values.
For example:
{
"#var x": "'Some string'",
"Field1": "%x"
}
And a more complex example:
{
"#var dic:Employees[]": {
"$(EmployeeId)": "."
},
"Employee1": "$var(dic).1001"
}
This example transforms the array of all employees into a dictionary containing all employees and indexed by EmployeeId
as the key. Later, we can access a specific employee with id 1001
by $var(dic).1001
.
Variables are scoped. When a variable is declared in an outer object, and a variable with the same name is declared in an inter object, the inner variable hides the outer variable.
For example:
{
"#var x":"1",
"obj":{
"#var x":"2",
"value":"$var(x)"
},
"value":"$var(x)"
}
Inside obj
, the value of x
is 2. But after getting back to the outer object, the value of x
is 1 again.
We support arithmetic and string operators in Unquery expressions.
All the standard operators: +
, -
, *
, /
, and mod
(modulu) are supported.
The operators are supported both for integers and floating point. If both operands are integers, the result is an integer. Otherwise, the result is a double-precision floating point number.
The only supported operator between strings in +
, which performs string concatenation.
The $()
(or evaluate) treats the expression inside the parantheses as a path expression, and returns the value of that path. This is useful, for example, for retrieving a value in a dictionary object. For example, if employees
is a dictionary from id
to employee names, and group
is an array of id`s, the following query would list the employee names in `group
:
{
"#return:group[]": ["/employees.$(.)"]
}
$csv(filename [,delimiter] [,have_headers])
is a function that reads a csv file (comma seperated values), and returns a json document. The delimiter
parameter is optional, with comma as default, and have_deaders
is an optional boolean value, with default true
.
The function expects a file where the first line the the column names (unless have_headers
is false), rows are seperated by newline, and columns are seperated by the specified delimiter (which is not used inside the values).
The result is an array of objects. The field names in each object are the column names. The values are strings, numbers, or booleans. If the value in the csv looks like a number (i.e. containing only numeric digits and a dot), it would be converted into numbers. If it is "true"
or "false"
, it would be converted into boolean. Otherwise, the value is a string.
For example, for the csv file:
username,user_id,active
sela,100,true
ari,101,true
jack,102,false
The result is:
[
{
"username": "sela",
"user_id": 100,
"active": true
},
{
"username": "ari",
"user_id": 101,
"active": true
},
{
"username": "jack",
"user_id": 102,
"active": false
}
]
$env(expression)
, where expression
evaluates to a string value, returns the value of the environment variable in expression
. For example, "$env('BROWSER')"
returns the value of the environmet variable BROWSER
.
=== $file
The function $file
reads a json file, and returns its value. For example:
{
"ThisJSONFile": ".",
"AnotherJSONFile": "$file('another-file.json')"
}
$filename
(without any parameters), returns the string for the current filename. For example, getting array with the names of all processed files:
[
"$filename"
]
The function $find(path,string)
finds all occurrences of string
in path
, and returns an array with all indexes where string
is found.
The function $if(cond,then,else)
is used for a ternary if-then-else expression. For example: "$if(x>10,'yes', 'no')"
The $index
function returns the last index in the context path. If the current path is Field1[5]
, it would return 5, and also in case the path is Field1[5].Field2
, or Field1[3][5]
.
$join(array,delimiter)
joins all strings in a json strcture (array, object, etc), adding delimiter
between them.
The $key$
function returns the last key in the context path. For example, if the current path is Field1.Field2
, it would return Field2
. If the path ends with an array index, returns the key before any array index. For example, for the path Field1.Field2[7]
, the returned key would be Field2
as well.
The function $replace(str, from, to [,replace_all?])
replaces all occurances of from
in str
to to
(or first occurance if replace_all
is false).
The function $reskey
returns the result key name in the result. For example: {"field1:Field1":"$reskey"}
would return {"field1":"field1"}
.
The function $split(string, delimiter)
splits the delimiter-separated values in a string into an array.
$substr(stringexpr, start, length)
returns a substring of stringexpr
, startin at offset start
, and of length length
.
The function $time_to_str(expr,format_string)
converts unix epoch time to date and time string based on a format string. The format string is the same as in strftime
in c
.
The function $to_time(string,format_string)
converts a string into unix epoch time. The format string is the same as in strptime
in c
.
=== $upper
The function $upper(expr)
converts the string returned by expr
to uppercase.
The function $var(variable-name)
returns the value of a variable. For example:
{
"#var x": "1000",
"value": "$var(x)
}
Would return {"value":1000}
Note: instead of $var(x)
, you can also write: %x
.
Type cast functions convert literals from one type to another. It can also convert strings to numbers. For example, "$int('5')"
would return the integer value "5"
.
Another use case is for forcing floating point division. "14/10"
would return "1"
, and "14/$float(10)"
returns "1,4"
.
$number
casts strings to either int or float type, depending on the string’s content.
Aggregate functions calculate a value based on multiple fields. Each time a field is visited, the value of the aggregate function is updated. For example: "$max(Salary)"
would return the maximum value for Salary
.
Recall that when we have values (including objects) outside of an array, it is usually updated once, even if there are multiple matching values. For example, the query {"salary":"salary"}
returns the first salary the query finds, even if there are multiple employees with different salaries. However, when we use aggregate functions, the value is updated each time we visit the
field containing that value. For example: {"AllSalaries":"$sum(salary)"}
would return the sum of all salaries.
An aggregate value is updated each time we visit that field. This mean that in case we use an array, such as ["$sum(Salary)"]
, a new element would be added each time we update the array, and this value
is updated based on one value alone. The result would be just an array of salaries in this case.
However, if we use context traversals inside the array, aggregate functions can still be meaningful. For example:
[
{
"name":"FullName",
"avgFamilyAge:Family[]": "$avg(Age)"
}
]
Would return an array of people, and the average age of the family members for each entry in this array. This allows us to do aggregation at each level in the document.
The same principle works when we group values using dynamic key names. For example, the following query would group employees by title, and display the average salary for each title:
{
"$(Title)":"$avg(Salary)"
}
An aggregate value is updated each time it is evaluated. If we skip evaluating it due to using predicates or #if
conditions, we won’t include those values.
For example, "$avg(Age)?Age>=18"
would calculate the average age of those that are 18 year old or more.
We can also use aggregate functions inside conditions, but with one caveat: we can only compare the aggregate functions to constant values. For example:
[
{
"name":"FullName",
"avgFamilyAge:Family[]": "$avg(Age)>40"
}
]
This would only list families with average age greater than 40.
Takes no parameters, and returns the number of times we evaluated this function. Especially useful when combined with predicates. For example: `"$count?Age>=18"' would return the number of people that are 18 or older.
This function allows us to do aggregations that are not directly supported by a built-in function. $prev(defalut-value)
returns default-value
the first time we use it, and returns
the value of the expression it appears in on subsequent calls. For example: "$prev('')+Text"
would concatenate the text field Text
in all documents.