Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested AND/OR filters produce incorrect expression #2112

Closed
jakipatryk opened this issue Aug 11, 2022 · 3 comments · Fixed by #2115
Closed

Nested AND/OR filters produce incorrect expression #2112

jakipatryk opened this issue Aug 11, 2022 · 3 comments · Fixed by #2115
Assignees
Labels
bug Something isn't working Conformance Conformance Job affected priority: high Critical to the health of the project

Comments

@jakipatryk
Copy link
Collaborator

jakipatryk commented Aug 11, 2022

Describe the bug

Considering a nested OR filter (in code OrJoinedFilter), for example OR(OR(A, B), OR(B, C)), where A, B, C are arbitrary filters, result of expression:

(OR(A, B) or OR(B, C))

currently yields OR(B) instead of the expected and correct OR(A, B, C).

The same for a nested AND filter (AndJoinedFilter), for example AND(AND(A, B), AND(B, C)), result of expression

(AND(A, B) and AND(B, C))

currently yields AND(B) instead of the expected and correct AND(A, B, C).

To Reproduce (example test)

OR:

  test("Filter with nested ORs (one parent two child ORs)") {
    val leftF1 = DiffersFilter("column1", "v1")
    val leftF2 = EqualsFilter("column1", "v1")
    val childOrLeft = OrJoinedFilters(Set(leftF1, leftF2))

    val rightF1 = DiffersFilter("column2", "v2")
    val rightF2 = EqualsFilter("column2", "v2")
    val rightF3 = EqualsFilter("column1", "v1")
    val childOrRight = OrJoinedFilters(Set(rightF1, rightF2, rightF3))

    val parentOr = childOrLeft or childOrRight

    val filterExpr = parentOr.filter.expr

    val expected = ((col("column1") =!= lit("v1")) or
      (col("column1") === lit("v1")) or
      (col("column2") =!= lit("v2")) or
      (col("column2") === lit("v2")) or
      (col("column1") === lit("v1"))).expr
    val expectedSimplified = ((col("column1") =!= lit("v1")) or
      (col("column1") === lit("v1")) or
      (col("column2") =!= lit("v2")) or
      (col("column2") === lit("v2"))).expr

    assert((filterExpr semanticEquals expected) || (filterExpr semanticEquals expectedSimplified))
  }

AND:

  test("Filter with nested ANDs (one parent two child ANDs)") {
    val leftF1 = DiffersFilter("column1", "v1")
    val leftF2 = EqualsFilter("column1", "v1")
    val childAndLeft = AndJoinedFilters(Set(leftF1, leftF2))

    val rightF1 = DiffersFilter("column2", "v2")
    val rightF2 = EqualsFilter("column2", "v2")
    val rightF3 = EqualsFilter("column1", "v1")
    val childAndRight = AndJoinedFilters(Set(rightF1, rightF2, rightF3))

    val parentAnd = childAndLeft and childAndRight

    val filterExpr = parentAnd.filter.expr

    val expected = ((col("column1") =!= lit("v1")) and
      (col("column1") === lit("v1")) and
      (col("column2") =!= lit("v2")) and
      (col("column2") === lit("v2")) and
      (col("column1") === lit("v1"))).expr
    val expectedSimplified = ((col("column1") =!= lit("v1")) and
      (col("column1") === lit("v1")) and
      (col("column2") =!= lit("v2")) and
      (col("column2") === lit("v2"))).expr

    assert((filterExpr semanticEquals expected) || (filterExpr semanticEquals expectedSimplified))
  }

Expected behavior

Both tests should be passing.

Possible fix

Change:

    @JsonIgnore def or(otherFilter: DataFrameFilter): DataFrameFilter = {
      (this, otherFilter) match {
        case (a: OrJoinedFilters, b: OrJoinedFilters) => OrJoinedFilters(a.filterItems & b.filterItems)
        case (a: OrJoinedFilters, b) => a.copy(filterItems = a.filterItems + b)
        case (a, b: OrJoinedFilters) => b.copy(filterItems = b.filterItems + a)
        case (a, b) => OrJoinedFilters(Set(a, b))
      }
    }

to:

    @JsonIgnore def or(otherFilter: DataFrameFilter): DataFrameFilter = {
      (this, otherFilter) match {
        case (a: OrJoinedFilters, b: OrJoinedFilters) => OrJoinedFilters(a.filterItems | b.filterItems) // <- the change
        case (a: OrJoinedFilters, b) => a.copy(filterItems = a.filterItems + b)
        case (a, b: OrJoinedFilters) => b.copy(filterItems = b.filterItems + a)
        case (a, b) => OrJoinedFilters(Set(a, b))
      }
    }

Similarly for and - that is

   case (a: AndJoinedFilters, b: AndJoinedFilters) => AndJoinedFilters(a.filterItems | b.filterItems) // <- also using | aka "union"

Additional context

This piece of code doesn't seem to be executed at any point by any of currently used systems in Enceladus, at least I couldn't find its usage. When nested filter is constructed via Menas, direct constructors of OrJoinedFilter and AndJoinedFilters seem to be used instead of or/and functions (and thus correct expressions).

@jakipatryk jakipatryk added bug Something isn't working priority: undecided Undecided priority to be assigned after discussion labels Aug 11, 2022
@dk1844
Copy link
Contributor

dk1844 commented Aug 12, 2022

@benedeki check it out, to confirm, please, but my perspective is that @jakipatryk is correct and the current code behaves incorrectly.

I have used his tests to confirm and some extra debugging.

Only a slight readability comment - I would use union instead of | - they are the same but it would be more clear that Sets are united and supplied to the OrJoinedFilter as one instead just | that reminds of logical or (||). Same for AndJoinedFilters

@benedeki benedeki added Conformance Conformance Job affected priority: high Critical to the health of the project and removed priority: undecided Undecided priority to be assigned after discussion labels Aug 18, 2022
@benedeki
Copy link
Collaborator

The bug is definitely there. And I agree with @dk1844 that union would be better/more readable to use.

@jakipatryk
Copy link
Collaborator Author

Release notes
data-model: Behaviour of and/or filter combiner in case of two OrJoinedFilters/AndJoinedFilters have been fixed.

@jakipatryk jakipatryk self-assigned this Sep 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Conformance Conformance Job affected priority: high Critical to the health of the project
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants