Null search filter #879

hkethi002 · 2017-07-28T19:41:11Z

When search is given a list of values for a field in filters, if "null" is included, it will return all containers with those values and those for which the field does not have a value.

Review Checklist

Tests were added to cover all code changes
Documentation was added / updated
Code and tests follow standards in CONTRIBUTING.md

nagem · 2017-07-28T19:46:30Z

api/handlers/dataexplorerhandler.py

+                                                        ]
+                                                    }
+                                                },
+                                                {"exists":{"field":k.split('.')[0]}}


It should be rare that they would reference a top-level key like container_type, but we might want to be a bit more fault tolerant if they do reference a top level key. In that case, this line is unnecessary.

Ah, I see below where you are appending to the should - my mistake. Also this comment is supposed to go with the comment above, but I'm sure you picked that up.

Ah, I think I might have been confusing with my original comment about removing this line - I was assuming it would only be dynamically removed when there is no dot notation in the key (a top level key would have no dot notation). Wouldn't we still need it to prevent the return of documents that would never have the key anyway?

nagem · 2017-07-28T19:48:40Z

api/handlers/dataexplorerhandler.py

+                    if "null" in v:
+                        try:
+                            v.remove("null")
+                        except AttributeError:


In which cases would this fail? When v is a string, not a list? Mind making that more clear? Maybe checking v's type and acting on that knowledge instead?

nagem · 2017-07-28T19:51:31Z

Nice work, glad it was possible without too much complication. Mind adding the "missing": "null" key back in to the default term aggregations at the top? And in aggregate_field_values()?

nagem · 2017-07-28T19:56:44Z

api/handlers/dataexplorerhandler.py

+                            v = None
+                        null_filter = {
+                            'bool': {
+                                'should': [


Since this is the only bool keyword at this level, it works out having this be a should, but shoulds are used mostly to affect the scoring and it is not required that they match (when with more strict bool options like must). I would switch it to a must so if new logic is added here, it has the intention.

Reading the docs again it looks like since this is also in a filter context and not a query context, should means "must match at least one".

If the bool query is a filter context or has neither must or filter then at least one of the should queries must match a document for it to match the bool query.

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html#query-dsl-bool-query

Which doesn't actually make a difference in this context but ...

I thought it was needed for when multiple boxes were checked because if a must was used would that not return only docs that both have the field set to a value and not, which would be none of them?

nagem · 2017-07-31T16:05:18Z

Changes look good, nice work. I'm going to test locally before giving it the final LGTM.

nagem · 2017-07-31T21:50:22Z

Seems to be something off in the logic:

Request:

{"filters":[{"terms":{"subject.sex":["null"]}}],"return_type":"session","all_data":false}

Result:

uwsgi_1           | 2017-07-31 21:05:31      scitran.api                  base.py   339:ERRO Traceback (most recent call last):
uwsgi_1           |   File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 570, in dispatch
uwsgi_1           |     return method(*args, **kwargs)
uwsgi_1           |   File "./api/auth/__init__.py", line 63, in check_login
uwsgi_1           |     return handler_method(self, *args, **kwargs)
uwsgi_1           |   File "./api/handlers/dataexplorerhandler.py", line 484, in search
uwsgi_1           |     return_type, filters, search_string = self._parse_request()
uwsgi_1           |   File "./api/handlers/dataexplorerhandler.py", line 340, in _parse_request
uwsgi_1           |     null_filter['bool']['should'][0]['must'].append({'exists': {'field': k.split('.')[0]}})
uwsgi_1           | KeyError: 'must'
uwsgi_1           |  request_id=9c5d87fe-1501535131
uwsgi_1           | [pid: 21|app: 0|req: 13/49] 172.19.0.1 () {50 vars in 1020 bytes} [Mon Jul 31 21:05:31 2017] POST /api/dataexplorer/search?facets=true => generated 41 bytes in 84 msecs (HTTP/1.1 500) 3 headers in 130 bytes (2 switches on core 1)

Using the UI I started a new search and checked the "null" option for subject sex.

hkethi002 · 2017-07-31T21:59:46Z

Small typo in parse_request function fixed that was causing that error when I added the {"exists":{"field":k.split('.')[0]}} line back in.

nagem · 2017-08-01T20:13:52Z

Fix works for me, feel free to merge!

hkethi002 requested a review from nagem July 28, 2017 19:41

nagem reviewed Jul 28, 2017

View reviewed changes

hkethi002 force-pushed the null-search branch from f75815d to 16f031e Compare July 31, 2017 14:41

hkethi002 force-pushed the null-search branch from 16f031e to dc79765 Compare July 31, 2017 18:18

hkethi002 force-pushed the null-search branch from 80dae6a to 7b6b1da Compare July 31, 2017 21:58

hkethi002 force-pushed the null-search branch from 7b6b1da to 9f6ae93 Compare August 1, 2017 20:16

hkethi002 added 4 commits August 1, 2017 15:18

null search filter

a3771f7

null search filter unit test

2a1b090

add missing filter, removed extra bools

9029f40

conditional addition of top level container filter

9f6ae93

hkethi002 merged commit 615b948 into master Aug 1, 2017

hkethi002 deleted the null-search branch August 1, 2017 20:22

Null search filter #879

Null search filter #879

Uh oh!

Conversation

hkethi002 commented Jul 28, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review Checklist

Uh oh!

nagem Jul 28, 2017

Choose a reason for hiding this comment

Uh oh!

nagem Jul 28, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nagem Jul 31, 2017

Choose a reason for hiding this comment

Uh oh!

nagem Jul 28, 2017

Choose a reason for hiding this comment

Uh oh!

nagem commented Jul 28, 2017

Uh oh!

nagem Jul 28, 2017

Choose a reason for hiding this comment

Uh oh!

nagem Jul 28, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hkethi002 Jul 28, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nagem commented Jul 31, 2017

Uh oh!

nagem commented Jul 31, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hkethi002 commented Jul 31, 2017

Uh oh!

nagem commented Aug 1, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

hkethi002 commented Jul 28, 2017 •

edited

Loading

nagem Jul 28, 2017 •

edited

Loading

nagem Jul 28, 2017 •

edited

Loading

hkethi002 Jul 28, 2017 •

edited

Loading

nagem commented Jul 31, 2017 •

edited

Loading