Optional match #1043

Merged
merged 33 commits into from
Apr 6, 2020
a5b99a8
Enable TCK tests
jeffreylovitz Mar 16, 2020
5325ad1
Introduce Optional and Apply ops
jeffreylovitz Mar 16, 2020
ec9371a
Modify mock AST logic
jeffreylovitz Mar 16, 2020
f95e5ac
Emit error on queries beginning with OPTIONAL MATCH
jeffreylovitz Mar 18, 2020
6330061
Return null on property accesses of null graph entity
jeffreylovitz Mar 18, 2020
85e7368
Disallow OPTIONAL MATCH...MATCH queries
jeffreylovitz Mar 18, 2020
d2d7a65
Fix OPTIONAL filter placement
jeffreylovitz Mar 18, 2020
67ae023
Enable TCK tests
jeffreylovitz Mar 18, 2020
96af9a1
NULL handling for path functions
jeffreylovitz Mar 25, 2020
cf789b9
NULL handling for GraphEntity and list functions
jeffreylovitz Mar 25, 2020
d8e58a2
WIP improve mock AST logic
jeffreylovitz Mar 25, 2020
7906473
Add flow tests
jeffreylovitz Mar 25, 2020
a6332fe
Improve AST mock logic
jeffreylovitz Mar 25, 2020
b3843c1
Error handling for SET and CREATE on null entities
jeffreylovitz Mar 25, 2020
8117f07
Record_Get refactor
jeffreylovitz Mar 27, 2020
59083af
Test null handling
jeffreylovitz Mar 27, 2020
24cc90a
Minor cleanup
jeffreylovitz Mar 27, 2020
98f0574
Add documentation
jeffreylovitz Mar 30, 2020
0cbd1a6
Simplify toPath null handling
jeffreylovitz Mar 30, 2020
c312e9a
Improve comments
jeffreylovitz Mar 30, 2020
de7a126
Allow OPTIONAL MATCH as first clause
jeffreylovitz Mar 30, 2020
7defbfd
Simplify null-checking logic in create ops
jeffreylovitz Mar 30, 2020
61f4965
Use branch of Python client for testing
jeffreylovitz Mar 30, 2020
a55e632
PR fixes
jeffreylovitz Mar 31, 2020
e9d78ad
PR fixes
jeffreylovitz Apr 2, 2020
b7a9884
Remove Record_GetScalar interface
jeffreylovitz Apr 2, 2020
c2c3145
PR fixes
jeffreylovitz Apr 2, 2020
fc0ed1a
PR fixes
jeffreylovitz Apr 3, 2020
2f98977
Add demo query for OPTIONAL MATCH
jeffreylovitz Apr 3, 2020
e240158
Merge branch 'master' into optional-match
swilly22 Apr 3, 2020
a993784
Use standard Python client for automation
jeffreylovitz Apr 3, 2020
38ac976
Emit all columns as SIValues in compact formatter
jeffreylovitz Apr 6, 2020
732be69
Improve flow test for null entities in first result
jeffreylovitz Apr 6, 2020
31 changes: 30 additions & 1 deletion demo/imdb/imdb_queries.py
@@ -367,6 +367,34 @@ def __init__(self, actors=None, movies=None):
['Tim Blake Nelson']]
)

##################################################################
### grand_budapest_hotel_cast_and_their_other_roles
##################################################################

self.grand_budapest_hotel_cast_and_their_other_roles = QueryInfo(

query="""MATCH (a:actor)-[:act]->(h:movie {title: 'The Grand Budapest Hotel'})
OPTIONAL MATCH (a)-[:act]->(m:movie) WHERE m <> h
RETURN a.name, m.title
ORDER BY a.name, m.title""",

description='All actors in The Grand Budapest Hotel and their other movies',
reversible=False,
max_run_time_ms=4,
expected_result=[['Adrien Brody', None],
['Bill Murray', 'The Jungle Book'],
['F. Murray Abraham', None],
['Harvey Keitel', 'The Ridiculous 6'],
['Harvey Keitel', 'Youth'],
['Jeff Goldblum', 'Independence Day: Resurgence'],
['Jude Law', 'Spy'],
['Mathieu Amalric', None],
['Ralph Fiennes', 'A Bigger Splash'],
['Ralph Fiennes', 'Spectre'],
['Willem Dafoe', 'John Wick'],
['Willem Dafoe', 'The Fault in Our Stars']]
)

self.queries_info = [
self.number_of_actors_query,
self.actors_played_with_nicolas_cage_query,
@@ -384,7 +412,8 @@ def __init__(self, actors=None, movies=None):
self.eighties_movies_index_scan,
self.find_titles_starting_with_american_query,
self.same_year_higher_rating_than_huntforthewilderpeople_query,
self.all_actors_named_tim
self.all_actors_named_tim,
self.grand_budapest_hotel_cast_and_their_other_roles
]

def queries(self):
87 changes: 45 additions & 42 deletions docs/client_spec.md
@@ -25,9 +25,9 @@ Instructions on how to efficiently convert these IDs in the [Procedure Calls](#p

Additionally, two enums are exposed:

[ColumnType](https://github.com/RedisGraph/RedisGraph/blob/ff108d7e21061025166a35d29be1a1cb5bac6d55/src/resultset/formatters/resultset_formatter.h#L14-L19) indicates what type of value is held in each column (more formally, that offset into each row of the result set). Each entry in the header row will be a 2-array, with this enum in the first position and the column name string in the second.
[ColumnType](https://github.com/RedisGraph/RedisGraph/blob/ff108d7e21061025166a35d29be1a1cb5bac6d55/src/resultset/formatters/resultset_formatter.h#L14-L19), which as of RedisGraph v2.1.0 will always be `COLUMN_SCALAR`. This enum is retained for backwards compatibility, and may be ignored by the client unless RedisGraph versions older than v2.1.0 must be supported.

[PropertyType](https://github.com/RedisGraph/RedisGraph/blob/ff108d7e21061025166a35d29be1a1cb5bac6d55/src/resultset/formatters/resultset_formatter.h#L21-L28) indicates the data type (such as integer or string) of each returned scalar value. Each scalar values is emitted as a 2-array, with this enum in the first position and the actual value in the second. A column can consist exclusively of scalar values, such as both of the columns created by `RETURN a.value, 'this literal string'`. Each property on a graph entity also has a scalar as its value, so this construction is nested in each value of the properties array when a column contains a node or relationship.
[ValueType](https://github.com/RedisGraph/RedisGraph/blob/ff108d7e21061025166a35d29be1a1cb5bac6d55/src/resultset/formatters/resultset_formatter.h#L21-L28) indicates the data type (such as Node, integer, or string) of each returned value. Each value is emitted as a 2-array, with this enum in the first position and the actual value in the second. Each property on a graph entity also has a scalar as its value, so this construction is nested in each value of the properties array when a column contains a node or relationship.

## Decoding the result set

@@ -69,24 +69,26 @@ Verbose (default):
Compact:
```sh
127.0.0.1:6379> GRAPH.QUERY demo "MATCH (a)-[e]->(b) RETURN a, e, b.name" --compact
1) 1) 1) (integer) 2
1) 1) 1) (integer) 1
2) "a"
2) 1) (integer) 3
2) 1) (integer) 1
2) "e"
3) 1) (integer) 1
2) "b.name"
2) 1) 1) 1) (integer) 0
2) 1) 1) 1) (integer) 8
2) 1) (integer) 0
3) 1) 1) (integer) 0
2) (integer) 2
3) "Tree"
2) 1) (integer) 0
2) (integer) 0
3) (integer) 0
4) (integer) 1
5) 1) 1) (integer) 1
2) (integer) 2
3) "Autumn"
2) 1) (integer) 0
3) 1) 1) (integer) 0
2) (integer) 2
3) "Tree"
2) 1) (integer) 7
2) 1) (integer) 0
2) (integer) 0
3) (integer) 0
4) (integer) 1
5) 1) 1) (integer) 1
2) (integer) 2
3) "Autumn"
3) 1) (integer) 2
2) "Apple"
3) 1) "Query internal execution time: 1.085412 milliseconds"
@@ -119,63 +121,66 @@ Rather than introspecting on the query being emitted, the client implementation

Our sample query `MATCH (a)-[e]->(b) RETURN a, e, b.name` generated the header:
```sh
1) 1) (integer) 2
1) 1) (integer) 1
2) "a"
2) 1) (integer) 3
2) "e"
3) 1) (integer) 1
2) "b.name"
3) "e"
4) 1) (integer) 1
3) "b.name"
```

The 3 array members correspond, in order, to the 3 entities described in the RETURN clause.
The 4 array members correspond, in order, to the 3 entities described in the RETURN clause.

Each is emitted as a 2-array:
```sh
1) ColumnType (enum)
2) column name (string)
```

It is the client's responsibility to store [ColumnType enum](https://github.com/RedisGraph/RedisGraph/blob/ff108d7e21061025166a35d29be1a1cb5bac6d55/src/resultset/formatters/resultset_formatter.h#L14-L19). RedisGraph guarantees that this enum may be extended in the future, but the existing values will not be altered.

In this case, `a` corresponds to a Node column, `e` corresponds to a Relation column, and `b.name` corresponds to a Scalar column. No other column types are currently supported.
The first element is the [ColumnType enum](https://github.com/RedisGraph/RedisGraph/blob/master/src/resultset/formatters/resultset_formatter.h#L14-L19), which as of RedisGraph v2.1.0 will always be `COLUMN_SCALAR`. This element is retained for backwards compatibility, and may be ignored by the client unless RedisGraph versions older than v2.1.0 must be supported.
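
As a minimal sketch of reading this header on the client side: the helper name `column_names` is hypothetical, and the reply is assumed to have already been parsed into nested Python lists. Column names may arrive as `bytes` depending on the Redis client's decoding settings, so both cases are handled.

```python
def column_names(header):
    """Return the column names from a compact-format header row.

    Each header entry is a 2-array: [ColumnType, column name].
    """
    names = []
    for _column_type, name in header:
        # From v2.1.0 on, ColumnType is always COLUMN_SCALAR, so it is ignored here.
        names.append(name.decode() if isinstance(name, bytes) else name)
    return names


# Header shaped like the one for `MATCH (a)-[e]->(b) RETURN a, e, b.name`.
header = [[1, "a"], [1, "e"], [1, "b.name"]]
print(column_names(header))
```

A client that must also support versions older than v2.1.0 would need to keep the first element rather than discard it.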

### Reading result rows

The entity representations in this section will closely resemble those found in [Result Set Graph Entities](result_structure.md#graph-entities).

Our query produced one row of results with 3 columns (as described by the header):
```sh
1) 1) 1) (integer) 0
1) 1) 1) (integer) 8
2) 1) (integer) 0
3) 1) 1) (integer) 0
2) (integer) 2
3) "Tree"
2) 1) (integer) 0
2) (integer) 0
3) (integer) 0
4) (integer) 1
5) 1) 1) (integer) 1
2) (integer) 2
3) "Autumn"
2) 1) (integer) 0
3) 1) 1) (integer) 0
2) (integer) 2
3) "Tree"
2) 1) (integer) 7
2) 1) (integer) 0
2) (integer) 0
3) (integer) 0
4) (integer) 1
5) 1) 1) (integer) 1
2) (integer) 2
3) "Autumn"
3) 1) (integer) 2
2) "Apple"
```
Each element is emitted as a 2-array - [`ValueType`, value].

It is the client's responsibility to store the [ValueType enum](https://github.com/RedisGraph/RedisGraph/blob/master/src/resultset/formatters/resultset_formatter.h#L21-L28). This enum may be extended in the future, but RedisGraph guarantees that the existing values will not be altered.

We know the first column to contain nodes. The node representation contains 3 top-level elements:
The `ValueType` for the first entry is `VALUE_NODE`. The node representation contains 3 top-level elements:

1. The node's internal ID.
2. An array of all label IDs associated with the node (currently, each node can have either 0 or 1 labels, though this restriction may be lifted in the future).
3. An array of all properties the node contains. Properties are represented as 3-arrays - [property key ID, `PropertyType` enum, value].
3. An array of all properties the node contains. Properties are represented as 3-arrays - [property key ID, `ValueType`, value].

```sh
[
Node ID (integer),
[label ID (integer) X label count]
[[property key ID (integer), PropertyType (enum), value (scalar)] X property count]
[[property key ID (integer), ValueType (enum), value (scalar)] X property count]
]
```

The second column contains relations. The relation representation differs from the node representation in two respects:
The `ValueType` for the second entry is `VALUE_EDGE`. The edge representation differs from the node representation in two respects:

- Each relation has exactly one type, rather than the 0+ labels a node may have.
- A relation is emitted with the IDs of its source and destination nodes.
@@ -194,13 +199,11 @@ As such, the complete representation is as follows:
type ID (integer),
source node ID (integer),
destination node ID (integer),
[[property key ID (integer), PropertyType (enum), value (scalar)] X property count]
[[property key ID (integer), ValueType (enum), value (scalar)] X property count]
]
```

The third column contains a scalar. Each scalar is emitted as a 2-array - [`PropertyType` enum, value].

As with ColumnType, it is the client's responsibility to store the [PropertyType enum](https://github.com/RedisGraph/RedisGraph/blob/ff108d7e21061025166a35d29be1a1cb5bac6d55/src/resultset/formatters/resultset_formatter.h#L21-L28). RedisGraph guarantees that this enum may be extended in the future, but the existing values will not be altered.
The `ValueType` for the third entry is `VALUE_STRING`, and the other element in the array is the actual value, "Apple".
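
The row layout above could be decoded along these lines. This is a sketch, not the reference client: the function names and the dictionary shapes are made up for illustration, and only the three `ValueType` codes that appear in the sample output (2, 7, 8) are handled. The remaining codes should be taken from the linked header file.

```python
# ValueType codes as they appear in the sample output above.
VALUE_STRING = 2
VALUE_EDGE = 7
VALUE_NODE = 8


def decode_properties(raw_props):
    """Decode [[property key ID, ValueType, value], ...] into a dict keyed by key ID."""
    return {key_id: decode_value([vtype, value]) for key_id, vtype, value in raw_props}


def decode_value(pair):
    """Decode one [ValueType, value] 2-array from a compact result row."""
    value_type, value = pair
    if value_type == VALUE_NODE:
        node_id, label_ids, props = value
        return {"id": node_id, "labels": list(label_ids),
                "properties": decode_properties(props)}
    if value_type == VALUE_EDGE:
        edge_id, type_id, src_id, dest_id, props = value
        return {"id": edge_id, "type": type_id, "src": src_id, "dest": dest_id,
                "properties": decode_properties(props)}
    if value_type == VALUE_STRING:
        return value
    # Other ValueType codes are deliberately left out of this sketch.
    raise ValueError(f"unhandled ValueType {value_type}")


# The single result row from the sample query.
row = [
    [8, [0, [0], [[0, 2, "Tree"]]]],
    [7, [0, 0, 0, 1, [[1, 2, "Autumn"]]]],
    [2, "Apple"],
]
node, edge, name = (decode_value(entry) for entry in row)
print(node, edge, name)
```

Label, type, and property key IDs are left as integers here; resolving them to strings is a separate step covered by the procedure-call section of the spec.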

### Reading statistics

40 changes: 40 additions & 0 deletions docs/commands.md
@@ -23,6 +23,7 @@ supported.
### Query structure

- MATCH
- OPTIONAL MATCH
- WHERE
- RETURN
- ORDER BY
@@ -128,6 +129,45 @@ The syntactic sugar `(person_a)<-[:KNOWS]->(person_b)` will return the same resu

The bracketed edge description can be omitted if all relations should be considered: `(person_a)--(person_b)`.

#### OPTIONAL MATCH

The OPTIONAL MATCH clause is a MATCH variant that produces null values for elements that do not match successfully, rather than applying the all-or-nothing logic used for patterns in MATCH clauses.

It fills a role similar to LEFT/RIGHT JOIN in SQL: MATCH entities must be resolved, but nodes and edges introduced in OPTIONAL MATCH will be returned as nulls if they cannot be found.

OPTIONAL MATCH clauses accept the same patterns as standard MATCH clauses, and may similarly be modified by WHERE clauses.

Multiple MATCH and OPTIONAL MATCH clauses can be chained together, though a mandatory MATCH cannot follow an optional one.

```sh
GRAPH.QUERY DEMO_GRAPH
"MATCH (p:Person) OPTIONAL MATCH (p)-[w:WORKS_AT]->(c:Company)
WHERE w.start_date > 2016
RETURN p, w, c"
```

All `Person` nodes are returned, as well as any `WORKS_AT` relations and `Company` nodes that can be resolved and satisfy the `start_date` constraint. For each `Person` that does not resolve the optional pattern, the person will be returned as normal and the non-matching elements will be returned as null.
Collaborator


So if there are multiple connections between a person p and a number of companies, say 10, and only 4 of those 10 satisfy the criteria, how many nulls will be returned?

Contributor Author


0 nulls - if p has at least one valid connection, no nulls will be produced. If there is another person p with no valid connections, it will return that p once with null w and c.

Collaborator


👍
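
The row-count behaviour discussed in this thread can be modelled as a left outer join. The sketch below is plain Python over in-memory tuples, not RedisGraph code; the function name `optional_match` and the sample data are invented for illustration.

```python
def optional_match(people, works_at, predicate):
    """Emulate MATCH (p) OPTIONAL MATCH (p)-[w]->(c) WHERE predicate(w).

    `works_at` is a list of (person, edge, company) tuples.
    """
    rows = []
    for person in people:
        matches = [(edge, company) for (src, edge, company) in works_at
                   if src == person and predicate(edge)]
        if matches:
            # One row per satisfying connection and no null rows.
            rows.extend((person, edge, company) for edge, company in matches)
        else:
            # No satisfying connection: the person is emitted once with nulls.
            rows.append((person, None, None))
    return rows


people = ["Alice", "Bob"]
works_at = [
    ("Alice", {"start_date": 2015}, "Acme"),
    ("Alice", {"start_date": 2017}, "Globex"),
    ("Alice", {"start_date": 2018}, "Initech"),
]
for row in optional_match(people, works_at, lambda w: w["start_date"] > 2016):
    print(row)
```

Alice has three connections, two of which pass the filter, so she contributes two rows and no nulls. Bob has none, so he appears once with null `w` and `c`.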


Cypher is lenient in its handling of null values, so actions like property accesses and function calls on null values will return null values rather than emit errors.

```sh
GRAPH.QUERY DEMO_GRAPH
"MATCH (p:Person) OPTIONAL MATCH (p)-[w:WORKS_AT]->(c:Company)
RETURN p, w.department, ID(c) as ID"
```

In this case, `w.department` and `ID` will be returned if the OPTIONAL MATCH was successful, and will be null otherwise.
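
A rough model of this null propagation, written as plain Python with `None` standing in for a Cypher null. The helper names and the dictionary layout of an entity are assumptions made for the example; the type-mismatch message mirrors the one in `_AR_EXP_EvaluateProperty`.

```python
def get_property(value, key):
    """Property access that propagates nulls instead of raising."""
    if value is None:
        # Null entity, e.g. from a failed OPTIONAL MATCH: the result is null.
        return None
    if not isinstance(value, dict):
        # A scalar cannot be accessed as a map.
        raise TypeError(f"Type mismatch: expected a map but was {type(value).__name__}")
    # A missing property on a real entity also evaluates to null.
    return value["properties"].get(key)


def entity_id(value):
    """ID() that returns null for a null input."""
    return None if value is None else value["id"]


w = {"id": 7, "properties": {"department": "R&D"}}
print(get_property(w, "department"), entity_id(w))
print(get_property(None, "department"), entity_id(None))
```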

Clauses like SET, CREATE, MERGE, and DELETE will ignore null inputs and perform the expected updates on real inputs. One exception to this is that attempting to create a relation with a null endpoint will cause an error:

```sh
GRAPH.QUERY DEMO_GRAPH
"MATCH (p:Person) OPTIONAL MATCH (p)-[w:WORKS_AT]->(c:Company)
CREATE (c)-[:NEW_RELATION]->(:NEW_NODE)"
```

If `c` is null for any record, this query will emit an error. In this case, no changes to the graph are committed, even if some values for `c` were resolved.
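
One way to picture the all-or-nothing behaviour is to stage every pending relation and commit only after all records have been validated. This is an illustrative sketch under that assumption, not a description of how the create operation is implemented; `create_relations` and `NullEndpointError` are hypothetical names.

```python
class NullEndpointError(Exception):
    """Raised when a relation would be created with a null endpoint."""


def create_relations(edges, records, alias="c"):
    """Stage one new relation per record and commit only if every endpoint is non-null."""
    staged = []
    for record in records:
        src = record.get(alias)
        if src is None:
            # Abort before anything is committed to `edges`.
            raise NullEndpointError(f"'{alias}' is null; cannot create relation")
        staged.append((src, "NEW_RELATION", "NEW_NODE"))
    edges.extend(staged)
    return len(staged)


edges = []
records = [{"c": "Acme"}, {"c": None}]
try:
    create_relations(edges, records)
except NullEndpointError as err:
    print("query failed:", err)
print(edges)
```

Even though the first record resolved `c`, the list of edges is left untouched once the second record fails.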

#### WHERE

This clause is not mandatory, but if you want to filter results, you can specify your predicates here.
6 changes: 1 addition & 5 deletions docs/cypher_support.md
@@ -42,11 +42,7 @@ We do not support any of these properties at the type level, meaning nodes and r
## Clauses
### Reading Clauses
+ MATCH

**Unsupported:**

- OPTIONAL MATCH
- MANDATORY MATCH
+ OPTIONAL MATCH

### Projecting Clauses
+ RETURN
12 changes: 9 additions & 3 deletions src/arithmetic/arithmetic_expression.c
@@ -344,12 +344,18 @@ static bool _AR_EXP_UpdateEntityIdx(AR_OperandNode *node, const Record r) {

static AR_EXP_Result _AR_EXP_EvaluateProperty(AR_ExpNode *node, const Record r, SIValue *result) {
RecordEntryType t = Record_GetType(r, node->operand.variadic.entity_alias_idx);
// Property requested on a scalar value.
if(!(t & (REC_TYPE_NODE | REC_TYPE_EDGE))) {
if(t != REC_TYPE_NODE && t != REC_TYPE_EDGE) {
if(t == REC_TYPE_UNKNOWN) {
/* If we attempt to access an unset Record entry as a graph entity
* (due to a scenario like a failed OPTIONAL MATCH), return a null value. */
*result = SI_NullVal();
return EVAL_OK;
}

/* Attempted to access a scalar value as a map.
* Set an error and invoke the exception handler. */
char *error;
SIValue v = Record_GetScalar(r, node->operand.variadic.entity_alias_idx);
SIValue v = Record_Get(r, node->operand.variadic.entity_alias_idx);
asprintf(&error, "Type mismatch: expected a map but was %s", SIType_ToString(SI_TYPE(v)));
QueryCtx_SetError(error); // Set the query-level error.
return EVAL_ERR;
21 changes: 14 additions & 7 deletions src/arithmetic/entity_funcs/entity_funcs.c
@@ -15,12 +15,14 @@

/* returns the id of a relationship or node. */
SIValue AR_ID(SIValue *argv, int argc) {
if(SI_TYPE(argv[0]) == T_NULL) return SI_NullVal();
GraphEntity *graph_entity = (GraphEntity *)argv[0].ptrval;
return SI_LongVal(ENTITY_GET_ID(graph_entity));
}

/* returns a string representation of the label of a node. */
SIValue AR_LABELS(SIValue *argv, int argc) {
if(SI_TYPE(argv[0]) == T_NULL) return SI_NullVal();
char *label = "";
Node *node = argv[0].ptrval;
GraphContext *gc = QueryCtx_GetGraphCtx();
@@ -32,6 +34,7 @@ SIValue AR_LABELS(SIValue *argv, int argc) {

/* returns a string representation of the type of a relation. */
SIValue AR_TYPE(SIValue *argv, int argc) {
if(SI_TYPE(argv[0]) == T_NULL) return SI_NullVal();
char *type = "";
Edge *e = argv[0].ptrval;
GraphContext *gc = QueryCtx_GetGraphCtx();
@@ -51,6 +54,7 @@ SIValue AR_EXISTS(SIValue *argv, int argc) {
}

SIValue _AR_NodeDegree(SIValue *argv, int argc, GRAPH_EDGE_DIR dir) {
if(SI_TYPE(argv[0]) == T_NULL) return SI_NullVal();
Node *n = (Node *)argv[0].ptrval;
Edge *edges = array_new(Edge, 0);
GraphContext *gc = QueryCtx_GetGraphCtx();
@@ -79,11 +83,13 @@ SIValue _AR_NodeDegree(SIValue *argv, int argc, GRAPH_EDGE_DIR dir) {

/* Returns the number of incoming edges for given node. */
SIValue AR_INCOMEDEGREE(SIValue *argv, int argc) {
if(SI_TYPE(argv[0]) == T_NULL) return SI_NullVal();
return _AR_NodeDegree(argv, argc, GRAPH_EDGE_DIR_INCOMING);
}

/* Returns the number of outgoing edges for given node. */
SIValue AR_OUTGOINGDEGREE(SIValue *argv, int argc) {
if(SI_TYPE(argv[0]) == T_NULL) return SI_NullVal();
return _AR_NodeDegree(argv, argc, GRAPH_EDGE_DIR_OUTGOING);
}

@@ -92,34 +98,35 @@ void Register_EntityFuncs() {
AR_FuncDesc *func_desc;

types = array_new(SIType, 1);
types = array_append(types, T_NODE | T_EDGE);
types = array_append(types, T_NULL | T_NODE | T_EDGE);
func_desc = AR_FuncDescNew("id", AR_ID, 1, 1, types, false);
AR_RegFunc(func_desc);

types = array_new(SIType, 1);
types = array_append(types, T_NODE);
types = array_append(types, T_NULL | T_NODE);
func_desc = AR_FuncDescNew("labels", AR_LABELS, 1, 1, types, false);
AR_RegFunc(func_desc);

types = array_new(SIType, 1);
types = array_append(types, T_EDGE);
types = array_append(types, T_NULL | T_EDGE);
func_desc = AR_FuncDescNew("type", AR_TYPE, 1, 1, types, false);
AR_RegFunc(func_desc);

types = array_new(SIType, 1);
types = array_append(types, SI_ALL);
types = array_append(types, T_NULL | SI_ALL);
func_desc = AR_FuncDescNew("exists", AR_EXISTS, 1, 1, types, false);
AR_RegFunc(func_desc);

types = array_new(SIType, 2);
types = array_append(types, T_NODE);
types = array_append(types, T_NULL | T_NODE);
types = array_append(types, T_STRING);
func_desc = AR_FuncDescNew("indegree", AR_INCOMEDEGREE, 1, VAR_ARG_LEN, types, false);
AR_RegFunc(func_desc);

types = array_new(SIType, 1);
types = array_append(types, T_NODE);
types = array_new(SIType, 2);
types = array_append(types, T_NULL | T_NODE);
types = array_append(types, T_STRING);
func_desc = AR_FuncDescNew("outdegree", AR_OUTGOINGDEGREE, 1, VAR_ARG_LEN, types, false);
AR_RegFunc(func_desc);
}
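
The registration changes above widen each function's accepted-type bitmask to include `T_NULL`. The following Python sketch shows how such a mask check behaves. The flag values, the `ACCEPTED` table, and `validate_args` are hypothetical stand-ins and do not reproduce the real `SIType` definitions.

```python
from enum import IntFlag


class SIType(IntFlag):
    # Illustrative bit values only; the real SIType constants differ.
    T_NULL = 1
    T_NODE = 2
    T_EDGE = 4
    T_STRING = 8


# Accepted-type masks per argument, mirroring the updated registrations.
ACCEPTED = {
    "id": [SIType.T_NULL | SIType.T_NODE | SIType.T_EDGE],
    "labels": [SIType.T_NULL | SIType.T_NODE],
    "type": [SIType.T_NULL | SIType.T_EDGE],
}


def validate_args(func_name, arg_types):
    """Return True if every argument type is allowed by the function's mask."""
    masks = ACCEPTED[func_name]
    return all(bool(mask & arg_type) for mask, arg_type in zip(masks, arg_types))


print(validate_args("id", [SIType.T_NULL]))      # null now passes the type check
print(validate_args("labels", [SIType.T_EDGE]))  # an edge is still rejected
```

Once a null passes the type check, each function body returns a null early, as the `SI_TYPE(argv[0]) == T_NULL` guards in the diff do.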