Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: IS NULL for a nested field doesn't work #35171

Closed
astefan opened this issue Nov 1, 2018 · 1 comment
Closed

SQL: IS NULL for a nested field doesn't work #35171

astefan opened this issue Nov 1, 2018 · 1 comment
Labels

Comments

@astefan
Copy link
Contributor

astefan commented Nov 1, 2018

sql> select *, user.* from test;
   groupName   |  user.first   |   user.last
---------------+---------------+---------------
fans           |John           |null
fans           |Alice          |White
fans           |Roger          |Mack
pets           |Beni           |The cat
pets           |Fido           |the Dog
pets           |Maia           |the Cat
players        |John           |Smith
players        |Alice          |null
players        |Tom            |null

In the sample above user is a nested field in the test index. There are null values for user.last in the documents.

For the following query, there are no results:

sql> select *, user.* from test where "user.last" is null;
   groupName   |  user.first   |   user.last
---------------+---------------+---------------

sql>

The problem lies in the way the ES query is constructed. When something is null (or doesn't exist) in ES terminology, the "negated" exists query is used. At the moment, ES SQL translates the query above in the following ES query (some bits were removed as not relevant and to increase readability):

    "query": {
        "bool": {
            "must_not": [
                {
                    "nested": {
                        "query": {
                            "exists": {
                                "field": "user.last",
                                "boost": 1
                            }
                        },
                        "path": "user"
                    }
                }
            ]
        }
    }

While the query above is correct syntactically, it is incorrect for the IS NULL scenario. The correct query is:

  "query": {
    "nested": {
      "query": {
        "bool": {
          "must_not": [
            {
              "exists": {
                "field": "user.last",
                "boost": 1
              }
            }
          ]
        }},
        "path": "user",
        "inner_hits":{}
    }
  }

The relevant bit is that the nested query should include the bool must_not, while the current version of ES SQL puts the nested query inside the bool must_not.

@astefan astefan added >bug :Analytics/SQL SQL querying labels Nov 1, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

@matriv matriv closed this as completed in 36da6e1 Nov 9, 2018
matriv pushed a commit that referenced this issue Nov 9, 2018
Add `IsNull` node in parser to simplify expressions so that `<value> IS NULL` is
no longer translated internally to `NOT(<value> IS NOT NULL)`

Replace `IsNotNullProcessor` with `CheckNullProcessor` to encapsulate both
isNull and isNotNull functionality.

Closes: #34876
Fixes: #35171
matriv pushed a commit that referenced this issue Nov 9, 2018
Add `IsNull` node in parser to simplify expressions so that `<value> IS NULL` is
no longer translated internally to `NOT(<value> IS NOT NULL)`

Replace `IsNotNullProcessor` with `CheckNullProcessor` to encapsulate both
isNull and isNotNull functionality.

Closes: #34876
Fixes: #35171
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants