Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optional match #1043

Merged
merged 33 commits into from
Apr 6, 2020
Merged
Show file tree
Hide file tree
Changes from 23 commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
a5b99a8
Enable TCK tests
jeffreylovitz Mar 16, 2020
5325ad1
Introduce Optional and Apply ops
jeffreylovitz Mar 16, 2020
ec9371a
Modify mock AST logic
jeffreylovitz Mar 16, 2020
f95e5ac
Emit error on queries beginning with OPTIONAL MATCH
jeffreylovitz Mar 18, 2020
6330061
Return null on property accesses of null graph entity
jeffreylovitz Mar 18, 2020
85e7368
Disallow OPTIONAL MATCH...MATCH queries
jeffreylovitz Mar 18, 2020
d2d7a65
Fix OPTIONAL filter placement
jeffreylovitz Mar 18, 2020
67ae023
Enable TCK tests
jeffreylovitz Mar 18, 2020
96af9a1
NULL handling for path functions
jeffreylovitz Mar 25, 2020
cf789b9
NULL handling for GraphEntity and list functions
jeffreylovitz Mar 25, 2020
d8e58a2
WIP improve mock AST logic
jeffreylovitz Mar 25, 2020
7906473
Add flow tests
jeffreylovitz Mar 25, 2020
a6332fe
Improve AST mock logic
jeffreylovitz Mar 25, 2020
b3843c1
Error handling for SET and CREATE on null entities
jeffreylovitz Mar 25, 2020
8117f07
Record_Get refactor
jeffreylovitz Mar 27, 2020
59083af
Test null handling
jeffreylovitz Mar 27, 2020
24cc90a
Minor cleanup
jeffreylovitz Mar 27, 2020
98f0574
Add documentation
jeffreylovitz Mar 30, 2020
0cbd1a6
Simplify toPath null handling
jeffreylovitz Mar 30, 2020
c312e9a
Improve comments
jeffreylovitz Mar 30, 2020
de7a126
Allow OPTIONAL MATCH as first clause
jeffreylovitz Mar 30, 2020
7defbfd
Simplify null-checking logic in create ops
jeffreylovitz Mar 30, 2020
61f4965
Use branch of Python client for testing
jeffreylovitz Mar 30, 2020
a55e632
PR fixes
jeffreylovitz Mar 31, 2020
e9d78ad
PR fixes
jeffreylovitz Apr 2, 2020
b7a9884
Remove Record_GetScalar interface
jeffreylovitz Apr 2, 2020
c2c3145
PR fixes
jeffreylovitz Apr 2, 2020
fc0ed1a
PR fixes
jeffreylovitz Apr 3, 2020
2f98977
Add demo query for OPTIONAL MATCH
jeffreylovitz Apr 3, 2020
e240158
Merge branch 'master' into optional-match
swilly22 Apr 3, 2020
a993784
Use standard Python client for automation
jeffreylovitz Apr 3, 2020
38ac976
Emit all columns as SIValues in compact formatter
jeffreylovitz Apr 6, 2020
732be69
Improve flow test for null entities in first result
jeffreylovitz Apr 6, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
2 changes: 1 addition & 1 deletion .circleci/config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ jobs:
- run:
name: Install prerequisite
command: >-
pip install git+https://github.com/RedisGraph/redisgraph-py.git@master &&
pip install git+https://github.com/RedisGraph/redisgraph-py.git@null-graph-entities &&
pip install psutil behave &&
pip install redis-py-cluster &&
pip install git+https://github.com/RedisLabsModules/RLTest.git@master &&
Expand Down
40 changes: 40 additions & 0 deletions docs/commands.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ supported.
### Query structure

- MATCH
- OPTIONAL MATCH
- WHERE
- RETURN
- ORDER BY
Expand Down Expand Up @@ -128,6 +129,45 @@ The syntactic sugar `(person_a)<-[:KNOWS]->(person_b)` will return the same resu

The bracketed edge description can be omitted if all relations should be considered: `(person_a)--(person_b)`.

#### OPTIONAL MATCH

The OPTIONAL MATCH clause is a MATCH variant that produces null values for elements that do not match successfully, rather than the all-or-nothing logic for patterns in MATCH clauses.

It can be considered to fill the same role as LEFT/RIGHT JOIN does in SQL, as MATCH entities must be resolved but nodes and edges introduced in OPTIONAL MATCH will be returned as nulls if they cannot be found.

OPTIONAL MATCH clauses accept the same patterns as standard MATCH clauses, and may similarly be modified by WHERE clauses.

Multiple MATCH and OPTIONAL MATCH clauses can be chained together, though a mandatory MATCH cannot follow an optional one.

```sh
GRAPH.QUERY DEMO_GRAPH
"MATCH (p:Person) OPTIONAL MATCH (p)-[w:WORKS_AT]->(c:Company)
WHERE w.start_date > 2016
RETURN p, w, c"
```

All `Person` nodes are returned, as well as any `WORKS_AT` relations and `Company` nodes that can be resolved and satisfy the `start_date` constraint. For each `Person` that does not resolve the optional pattern, the person will be returned as normal and the non-matching elements will be returned as null.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if there are multiple connection between a person p and a number of companies say 10
and out of those 10 only 4 satisfy the criteria how many null will be returned?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 nulls - if p has at least one valid connection, no nulls will be produced. If there is another person p with no valid connections, it will return that p once with null w and c.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


Cypher is lenient in its handling of null values, so actions like property accesses and function calls on null values will return null values rather than emit errors.

```sh
GRAPH.QUERY DEMO_GRAPH
"MATCH (p:Person) OPTIONAL MATCH (p)-[w:WORKS_AT]->(c:Company)
RETURN p, w.department, ID(c) as ID"
```

In this case, `w.department` and `ID` will be returned if the OPTIONAL MATCH was successful, and will be null otherwise.

Clauses like SET, CREATE, and DELETE will ignore null inputs and perform the expected updates on real inputs. One exception to this is that attempting to create a relation with a null endpoint will cause an error:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about MERGE ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MERGE belongs here as well, I'll add it. Internally, this is all handled by the MergeCreate op, the logic is the same as Create.


```sh
GRAPH.QUERY DEMO_GRAPH
"MATCH (p:Person) OPTIONAL MATCH (p)-[w:WORKS_AT]->(c:Company)
CREATE (c)-[:NEW_RELATION]->(:NEW_NODE)"
```

If `c` is null for any record, this query will emit an error. In this case, no changes to the graph are committed, even if some values for `c` were resolved.

#### WHERE

This clause is not mandatory, but if you want to filter results, you can specify your predicates here.
Expand Down
6 changes: 1 addition & 5 deletions docs/cypher_support.md
Original file line number Diff line number Diff line change
Expand Up @@ -42,11 +42,7 @@ We do not support any of these properties at the type level, meaning nodes and r
## Clauses
### Reading Clauses
+ MATCH

**Unsupported:**

- OPTIONAL MATCH
- MANDATORY MATCH
+ OPTIONAL MATCH

### Projecting Clauses
+ RETURN
Expand Down
8 changes: 6 additions & 2 deletions src/arithmetic/arithmetic_expression.c
Original file line number Diff line number Diff line change
Expand Up @@ -344,8 +344,12 @@ static bool _AR_EXP_UpdateEntityIdx(AR_OperandNode *node, const Record r) {

static AR_EXP_Result _AR_EXP_EvaluateProperty(AR_ExpNode *node, const Record r, SIValue *result) {
RecordEntryType t = Record_GetType(r, node->operand.variadic.entity_alias_idx);
// Property requested on a scalar value.
if(!(t & (REC_TYPE_NODE | REC_TYPE_EDGE))) {
if(t == REC_TYPE_UNKNOWN) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if(!(t & (REC_TYPE_NODE | REC_TYPE_EDGE))) {
    if(t == REC_TYPE_UNKNOWN) {
        /* If we attempt to access a null value as a graph entity (due to a
		 * scenario like a failed OPTIONAL MATCH), return a null value. */
		*result = SI_NullVal();
		return EVAL_OK;
    } else {
		/* Attempted to access a scalar value as a map.
		 * Set an error and invoke the exception handler. */
		char *error;
		SIValue v = Record_GetScalar(r, node->operand.variadic.entity_alias_idx);
		asprintf(&error, "Type mismatch: expected a map but was %s", SIType_ToString(SI_TYPE(v)));
		QueryCtx_SetError(error); // Set the query-level error.
		return EVAL_ERR;
	}
}

/* If we attempt to access a null value as a graph entity (due to a
* scenario like a failed OPTIONAL MATCH), return a null value. */
*result = SI_NullVal();
return EVAL_OK;
} else if(!(t & (REC_TYPE_NODE | REC_TYPE_EDGE))) {
/* Attempted to access a scalar value as a map.
* Set an error and invoke the exception handler. */
char *error;
Expand Down
21 changes: 14 additions & 7 deletions src/arithmetic/entity_funcs/entity_funcs.c
Original file line number Diff line number Diff line change
Expand Up @@ -15,12 +15,14 @@

/* returns the id of a relationship or node. */
SIValue AR_ID(SIValue *argv, int argc) {
if(argv[0].type == T_NULL) return SI_NullVal();
jeffreylovitz marked this conversation as resolved.
Show resolved Hide resolved
GraphEntity *graph_entity = (GraphEntity *)argv[0].ptrval;
return SI_LongVal(ENTITY_GET_ID(graph_entity));
}

/* returns a string representations the label of a node. */
SIValue AR_LABELS(SIValue *argv, int argc) {
if(argv[0].type == T_NULL) return SI_NullVal();
jeffreylovitz marked this conversation as resolved.
Show resolved Hide resolved
char *label = "";
Node *node = argv[0].ptrval;
GraphContext *gc = QueryCtx_GetGraphCtx();
Expand All @@ -32,6 +34,7 @@ SIValue AR_LABELS(SIValue *argv, int argc) {

/* returns a string representation of the type of a relation. */
SIValue AR_TYPE(SIValue *argv, int argc) {
if(argv[0].type == T_NULL) return SI_NullVal();
jeffreylovitz marked this conversation as resolved.
Show resolved Hide resolved
char *type = "";
Edge *e = argv[0].ptrval;
GraphContext *gc = QueryCtx_GetGraphCtx();
Expand All @@ -51,6 +54,7 @@ SIValue AR_EXISTS(SIValue *argv, int argc) {
}

SIValue _AR_NodeDegree(SIValue *argv, int argc, GRAPH_EDGE_DIR dir) {
if(argv[0].type == T_NULL) return SI_NullVal();
jeffreylovitz marked this conversation as resolved.
Show resolved Hide resolved
Node *n = (Node *)argv[0].ptrval;
Edge *edges = array_new(Edge, 0);
GraphContext *gc = QueryCtx_GetGraphCtx();
Expand Down Expand Up @@ -79,11 +83,13 @@ SIValue _AR_NodeDegree(SIValue *argv, int argc, GRAPH_EDGE_DIR dir) {

/* Returns the number of incoming edges for given node. */
SIValue AR_INCOMEDEGREE(SIValue *argv, int argc) {
if(argv[0].type == T_NULL) return SI_NullVal();
jeffreylovitz marked this conversation as resolved.
Show resolved Hide resolved
return _AR_NodeDegree(argv, argc, GRAPH_EDGE_DIR_INCOMING);
}

/* Returns the number of outgoing edges for given node. */
SIValue AR_OUTGOINGDEGREE(SIValue *argv, int argc) {
if(argv[0].type == T_NULL) return SI_NullVal();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use SI_TYPE macro

return _AR_NodeDegree(argv, argc, GRAPH_EDGE_DIR_OUTGOING);
}

Expand All @@ -92,34 +98,35 @@ void Register_EntityFuncs() {
AR_FuncDesc *func_desc;

types = array_new(SIType, 1);
types = array_append(types, T_NODE | T_EDGE);
types = array_append(types, T_NULL | T_NODE | T_EDGE);
func_desc = AR_FuncDescNew("id", AR_ID, 1, 1, types, false);
AR_RegFunc(func_desc);

types = array_new(SIType, 1);
types = array_append(types, T_NODE);
types = array_append(types, T_NULL | T_NODE);
func_desc = AR_FuncDescNew("labels", AR_LABELS, 1, 1, types, false);
AR_RegFunc(func_desc);

types = array_new(SIType, 1);
types = array_append(types, T_EDGE);
types = array_append(types, T_NULL | T_EDGE);
func_desc = AR_FuncDescNew("type", AR_TYPE, 1, 1, types, false);
AR_RegFunc(func_desc);

types = array_new(SIType, 1);
types = array_append(types, SI_ALL);
types = array_append(types, T_NULL | SI_ALL);
func_desc = AR_FuncDescNew("exists", AR_EXISTS, 1, 1, types, false);
AR_RegFunc(func_desc);

types = array_new(SIType, 2);
types = array_append(types, T_NODE);
types = array_append(types, T_NULL | T_NODE);
types = array_append(types, T_STRING);
func_desc = AR_FuncDescNew("indegree", AR_INCOMEDEGREE, 1, VAR_ARG_LEN, types, false);
AR_RegFunc(func_desc);

types = array_new(SIType, 1);
types = array_append(types, T_NODE);
types = array_new(SIType, 2);
types = array_append(types, T_NULL | T_NODE);
types = array_append(types, T_STRING);
func_desc = AR_FuncDescNew("outdegree", AR_OUTGOINGDEGREE, 1, VAR_ARG_LEN, types, false);
AR_RegFunc(func_desc);
}

3 changes: 3 additions & 0 deletions src/arithmetic/list_funcs/list_funcs.c
Original file line number Diff line number Diff line change
Expand Up @@ -144,6 +144,8 @@ SIValue AR_SIZE(SIValue *argv, int argc) {
return SI_LongVal(SIArray_Length(value));
case T_STRING:
return SI_LongVal(strlen(value.stringval));
case T_NULL:
return SI_NullVal();
default:
assert(false);
}
Expand Down Expand Up @@ -227,3 +229,4 @@ void Register_ListFuncs() {
func_desc = AR_FuncDescNew("tail", AR_TAIL, 1, 1, types, true);
AR_RegFunc(func_desc);
}

17 changes: 13 additions & 4 deletions src/arithmetic/path_funcs/path_funcs.c
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,11 @@ SIValue AR_TOPATH(SIValue *argv, int argc) {
const cypher_astnode_t *ast_path = argv[0].ptrval;
uint nelements = cypher_ast_pattern_path_nelements(ast_path);
assert(argc == (nelements + 1));
for(uint i = 1; i < argc; i++) {
// If any element of the path does not exist, the entire path is invalid and NULL should be returned.
if(argv[i].type == T_NULL) return SI_NullVal();
jeffreylovitz marked this conversation as resolved.
Show resolved Hide resolved
}

SIValue path = SIPathBuilder_New(nelements);
for(uint i = 0; i < nelements; i++) {
SIValue element = argv[i + 1];
Expand Down Expand Up @@ -55,14 +60,17 @@ SIValue AR_TOPATH(SIValue *argv, int argc) {
}

SIValue AR_PATH_NODES(SIValue *argv, int argc) {
if(argv[0].type == T_NULL) return SI_NullVal();
jeffreylovitz marked this conversation as resolved.
Show resolved Hide resolved
return SIPath_Nodes(argv[0]);
}

SIValue AR_PATH_RELATIONSHIPS(SIValue *argv, int argc) {
if(argv[0].type == T_NULL) return SI_NullVal();
jeffreylovitz marked this conversation as resolved.
Show resolved Hide resolved
return SIPath_Relationships(argv[0]);
}

SIValue AR_PATH_LENGTH(SIValue *argv, int argc) {
if(argv[0].type == T_NULL) return SI_NullVal();
jeffreylovitz marked this conversation as resolved.
Show resolved Hide resolved
return SI_LongVal(SIPath_Length(argv[0]));
}

Expand All @@ -72,22 +80,23 @@ void Register_PathFuncs() {

types = array_new(SIType, 2);
types = array_append(types, T_PTR);
types = array_append(types, T_NODE | T_EDGE | T_PATH);
types = array_append(types, T_NULL | T_NODE | T_EDGE | T_PATH);
func_desc = AR_FuncDescNew("topath", AR_TOPATH, 1, VAR_ARG_LEN, types, false);
AR_RegFunc(func_desc);

types = array_new(SIType, 1);
types = array_append(types, T_PATH);
types = array_append(types, T_NULL | T_PATH);
func_desc = AR_FuncDescNew("nodes", AR_PATH_NODES, 1, 1, types, false);
AR_RegFunc(func_desc);

types = array_new(SIType, 1);
types = array_append(types, T_PATH);
types = array_append(types, T_NULL | T_PATH);
func_desc = AR_FuncDescNew("relationships", AR_PATH_RELATIONSHIPS, 1, 1, types, false);
AR_RegFunc(func_desc);

types = array_new(SIType, 1);
types = array_append(types, T_PATH);
types = array_append(types, T_NULL | T_PATH);
func_desc = AR_FuncDescNew("length", AR_PATH_LENGTH, 1, 1, types, false);
AR_RegFunc(func_desc);
}

3 changes: 3 additions & 0 deletions src/ast/ast_build_filter_tree.c
Original file line number Diff line number Diff line change
Expand Up @@ -319,6 +319,8 @@ FT_FilterNode *AST_BuildFilterTree(AST *ast) {
if(match_clauses) {
uint match_count = array_len(match_clauses);
for(uint i = 0; i < match_count; i ++) {
// Optional match clauses are handled separately.
if(cypher_ast_match_is_optional(match_clauses[i])) continue;
const cypher_astnode_t *pattern = cypher_ast_match_get_pattern(match_clauses[i]);
_AST_ConvertGraphPatternToFilter(ast, &filter_tree, pattern);
const cypher_astnode_t *predicate = cypher_ast_match_get_predicate(match_clauses[i]);
Expand Down Expand Up @@ -362,3 +364,4 @@ FT_FilterNode *AST_BuildFilterTree(AST *ast) {

return filter_tree;
}

47 changes: 34 additions & 13 deletions src/ast/ast_mock.c
Original file line number Diff line number Diff line change
Expand Up @@ -8,24 +8,43 @@
#include "../query_ctx.h"
#include "../util/rmalloc.h"

AST *AST_MockMatchPattern(AST *master_ast, const cypher_astnode_t *original_path) {
AST *AST_MockMatchClause(AST *master_ast, cypher_astnode_t *node, bool node_is_path) {
/* The AST node and its children must be reused directly,
* as cloning causes annotations (entity names) to be lost.
* TODO consider updating parser to improve this. */
AST *ast = rm_malloc(sizeof(AST));
ast->referenced_entities = master_ast->referenced_entities;
ast->anot_ctx_collection = master_ast->anot_ctx_collection;
ast->free_root = true;
ast->skip = NULL;
ast->limit = NULL;
cypher_astnode_t *pattern;
struct cypher_input_range range = {};
const cypher_astnode_t *predicate = NULL;
uint child_count = (node_is_path) ? 1 : cypher_astnode_nchildren(node);
cypher_astnode_t *children[child_count];

// Reuse the input path directly. We cannot clone, as this causes annotations (entity names) to be lost.
// TODO consider updating parser to improve this.
cypher_astnode_t *path = (cypher_astnode_t *)original_path;
if(node_is_path) {
/* MERGE clauses and path filters are comprised of a single path.
* Construct a pattern node containing just this path. */
pattern = cypher_ast_pattern(&node, 1, &node, 1, range);
children[0] = pattern;
} else {
/* OPTIONAL MATCH clauses contain a pattern node
* and possibly a predicate node (containing WHERE conditions). */
assert(cypher_astnode_type(node) == CYPHER_AST_MATCH);
pattern = (cypher_astnode_t *)cypher_ast_match_get_pattern(node);
predicate = cypher_ast_match_get_predicate(node);

// Build a pattern comprised of the input path.
cypher_astnode_t *pattern = cypher_ast_pattern(&path, 1, &path, 1, range);
// Explicitly collect all child nodes from the clause.
for(uint i = 0; i < child_count; i ++) {
children[i] = (cypher_astnode_t *)cypher_astnode_get_child(node, i);
}
}

// Build a new match clause that holds this pattern.
cypher_astnode_t *match_clause = cypher_ast_match(false, pattern, NULL, 0, NULL, &pattern, 1,
range);
cypher_astnode_t *match_clause = cypher_ast_match(false, pattern, NULL, 0, predicate,
children, child_count, range);

// Build a query node holding this clause.
ast->root = cypher_ast_query(NULL, 0, &match_clause, 1, &match_clause, 1, range);
Expand All @@ -35,14 +54,16 @@ AST *AST_MockMatchPattern(AST *master_ast, const cypher_astnode_t *original_path
return ast;
}

void AST_MockFree(AST *ast) {
/* When freeing the mock AST, we have to be careful to not free the shared path
void AST_MockFree(AST *ast, bool free_pattern) {
/* When freeing the mock AST, we have to be careful to not free any shared node
* or its annotations. We'll free every surrounding layer explicitly - the MATCH
* pattern, the MATCH clause, and finally the AST root. */
* pattern (if we constructed it), the MATCH clause, and finally the AST root. */
const cypher_astnode_t *clause = cypher_ast_query_get_clause(ast->root, 0);
assert(cypher_astnode_type(clause) == CYPHER_AST_MATCH);
const cypher_astnode_t *pattern = cypher_ast_match_get_pattern(clause);
cypher_astnode_free((cypher_astnode_t *)pattern);
if(free_pattern) {
const cypher_astnode_t *pattern = cypher_ast_match_get_pattern(clause);
cypher_astnode_free((cypher_astnode_t *)pattern);
}
cypher_astnode_free((cypher_astnode_t *)clause);
cypher_astnode_free((cypher_astnode_t *)ast->root);
rm_free(ast);
Expand Down
6 changes: 3 additions & 3 deletions src/ast/ast_mock.h
Original file line number Diff line number Diff line change
Expand Up @@ -8,9 +8,9 @@

#include "ast.h"

// Build a temporary AST with one MATCH clause that holds the given path.
AST *AST_MockMatchPattern(AST *master_ast, const cypher_astnode_t *original_path);
// Build a temporary AST with one MATCH clause that holds the given path or pattern.
AST *AST_MockMatchClause(AST *master_ast, cypher_astnode_t *node, bool node_is_pattern);

// Free a temporary AST.
void AST_MockFree(AST *ast);
void AST_MockFree(AST *ast, bool free_pattern);