Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Orientdb @optional bugfix #95

Merged

Conversation

amartyashankha
Copy link
Collaborator

@amartyashankha amartyashankha commented Jun 1, 2018

Expected behavior:
Returned results for a query that has a @filter inside an @optional scope must either:

  • not have the optional edge, or
  • have the optional edge while satisfying the filtering condition.

Actual behavior:
If the optional edge exists, but the filtering condition inside the @optional is not satisfied, the result was still returned as if the optional edge did not exist.

This PR fixes the above.

@amartyashankha amartyashankha requested a review from obi1kenobi June 1, 2018 16:42
@coveralls
Copy link

coveralls commented Jun 18, 2018

Coverage Status

Coverage increased (+0.1%) to 94.706% when pulling 9446860 on amartyashankha:orientdb_optional_bugfix into 1caafae on kensho-technologies:master.

inner_location_name_to_where_filter[inner_location_name] = optional_edge_where_filter

# Sort expressions by inner_location_name to obtain deterministic order
where_filter_expressions = [inner_location_name_to_where_filter[key]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please reformat:

where_filter_expressions = [
    inner_location_name_to_where_filter[key]
    for key in sorted(inner_location_name_to_where_filter.keys())
]

Copy link
Contributor

@obi1kenobi obi1kenobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks like a fine approach, some generally minor tweaks left.

@@ -156,6 +157,14 @@ def _construct_output_to_match(output_block):
return u'SELECT %s FROM' % (u', '.join(selections),)


def _construct_where_to_match(where_block):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like the wrong place to handle this, why can't we lower it away before it gets to this point?

Copy link
Collaborator Author

@amartyashankha amartyashankha Jun 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is being handled elsewhere now. The existing code might still be useful if the a TrueLiteral is explicitly used as a predicate for the Filter?

@@ -197,6 +206,9 @@ def emit_code_from_single_match_query(match_query):
# Represent and add the SELECT clauses with the proper output data.
query_data.appendleft(_construct_output_to_match(match_query.output_block))

# Represent and add the WHERE clause with the proper filters.
query_data.append(_construct_where_to_match(match_query.where_block))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to lower it away and set match_query.where_block to None if no filtering is required, so we'll have to check if it's None here.

.format(self.location))

def to_match(self):
"""Return a unicode object with the MATCH representation of this ContextField."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace ContextField with SelectEdgeContextField or rephrase -- this object is not a subclass of ContextField

Returns:
dict mapping from simple_optional_root_location -> dict containing keys
- 'inner_location_name': Location object correspoding to the unique MarkLocation present
within a simple optional (one that does not expands vertex fields)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: does not expands

an @optional traverse) that expand vertex fields
location_to_optional_root: dict mapping from location -> optional_location
where location is within @optional (not necessarily one that
expands vertex fields) and optional_location is the location
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional_location -> optional_root?

inner_local_field = LocalField(inner_location_name)
inner_location_existence = BinaryComposition(u'!=', inner_local_field, NullLiteral)

edge_field_location = Location(root_location_path, field=edge_field)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're passing both of these values separately as function args, then constructing a Location from them. Why not pass a Location then read the two elements from it? Seems more robust and cleaner.

]

predicate = expression_list_to_conjunction(where_filter_expressions)
super(SelectWhereFilter, self).__init__(predicate)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this take a predicate expression instead, and move the logic elsewhere. I'm not a fan of having tons of complex logic in a constructor like this.

u'{} {}'.format(where_block, ir_blocks))
ir_except_where = ir_blocks[:-1]

output_block = ir_except_where[-1]
if not isinstance(output_block, ConstructResult):
raise AssertionError(u'Expected last IR block to be ConstructResult, found: '
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't the last IR block any more -- the message should be updated if we stick with this representation.

@@ -443,6 +463,16 @@ def test_filter_on_optional_variable_name_or_alias(self):
}}
RETURN $matches
)
WHERE
(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elsewhere, the first ( is on the same line as the keyword that starts it. Let's keep the formatting like that: WHERE ( then de-indent everything below by one.

@@ -29,7 +29,8 @@ def test_simple_immediate_output(self):
MarkLocation(base_location),
ConstructResult({
'foo_name': OutputContextField(base_name_location, GraphQLString),
})
}),
SelectWhereFilter({}),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want this block added if it isn't doing anything. Also, regardless of what the MATCH query looks like, it's strange from an IR perspective to construct results before filtering them again.

When this block appears, let's move it to right before ConstructResult instead of right after. That way, ConstructResult is always the last IR block, and that simplifies the error-checking code I commented on above.

@amartyashankha amartyashankha force-pushed the orientdb_optional_bugfix branch from e493ee1 to 8126f7e Compare June 25, 2018 19:48
Copy link
Contributor

@obi1kenobi obi1kenobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor nitpicks remaining, otherwise looks good.


def to_gremlin(self):
"""Not implemented, should not be used."""
raise AssertionError(u'SelectEdgeContextField is only used for the WHERE statement in MATCH'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you need a . at the end of MATCH, otherwise the string gets joined into: ... in MATCHThis function should not ...

@@ -17,12 +19,13 @@
lower_context_field_expressions, prune_non_existent_outputs)
from ..match_query import convert_to_match_query
from ..workarounds import orientdb_class_with_while, orientdb_eval_scheduling

from .utils import _construct_where_filter_predicate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Underscored functions should be private and not imported (except maybe in tests). Remove the underscore from the function name.

@@ -135,6 +136,34 @@ def _split_match_steps_into_match_traversals(match_steps):
return output


def _extract_global_operations(ir_blocks):
"""Extract all global operation blocks (all blocks following GlobalOperationsStart).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is ConstructResult considered part of the global operations and returned in global_operation_blocks, or is it returned in remaining_ir_blocks?

From the use of this function below, it seems like ConstructResult shouldn't be part of ir_blocks at all -- in which case the docstring must reflect that, and there should be an assertion that ensures that.

@obi1kenobi obi1kenobi merged commit 7b9e4b9 into kensho-technologies:master Jun 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants