Query tags field with a match query (use mapping analyzer)

The "tags" field is indexed in Elasticsearch as an analysed field (using
our custom "uni_normalizer" analyzer). This means, broadly, that in the
inverted index, individual terms in a tag will be stripped of punctuation
and case-normalised. As such, we need to apply the same normalisation when
querying, and the easiest way to do this is with a match query.

This commit changes "tag" queries so that:

- all tag parameters must match
- multiple tokens within one tag query are all required to match
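
As a rough sketch of the two rules above, the snippet below contrasts the
previous query shape with the new one. The function names are hypothetical
and chosen for illustration only; the dict shapes follow the diff in this
commit. A "terms" query compares its input against indexed terms without
analysing it, whereas "match" runs the input through the field's analyzer.

```python
def old_tags_query(tags):
    """Previous shape: one unanalysed `terms` query over all tags."""
    # `terms` matches if ANY listed value equals an indexed term exactly.
    return {"terms": {"tags": list(tags)}} if tags else None


def new_tags_query(tags):
    """New shape: one analysed `match` clause per tag, all required."""
    matchers = [
        # operator "and": every token produced from this tag must match.
        {"match": {"tags": {"query": tag, "operator": "and"}}}
        for tag in tags
    ]
    # bool/must: every tag parameter must match.
    return {"bool": {"must": matchers}} if matchers else None


query = new_tags_query(["hello", "big world"])
```

Here "bool"/"must" carries the first rule (all tag parameters must match)
and "operator": "and" carries the second (all tokens within one tag must
match).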

By way of example, for an annotation with tags:

    ["hello", "#THERE", "big world"]

These queries will match:

    tag=hello
    tag=HELLO
    tag=#hello
    tag=hello&tag=there
    tag=hello&tag=there&tag=big+world
    tag=hello&tag=there&tag=world+big

Whereas these will not:

    tag=he+llo
    tag=hello&tag=monkeys
    tag=hello&tag=world
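
To make the mapping from these query strings to match clauses concrete,
here is a minimal sketch using only the Python 3 standard library. The
helper name tag_clauses is an assumption for illustration and is not part
of the codebase; it shows only how repeated "tag" parameters and "+"
separators turn into clauses, not how Elasticsearch evaluates them.

```python
from urllib.parse import parse_qs


def tag_clauses(query_string):
    """Build one `match` clause per `tag` parameter in a query string."""
    # parse_qs collects repeated parameters into a list and decodes "+"
    # as a space, so "world+big" becomes the single tag "world big".
    tags = parse_qs(query_string).get("tag", [])
    return [
        {"match": {"tags": {"query": tag, "operator": "and"}}}
        for tag in tags
    ]


clauses = tag_clauses("tag=hello&tag=world+big")
```

Each element of clauses would then go into the "must" list of the bool
query, so a tag such as "world big" is analysed into separate tokens that
are all required.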

Fixes #2655.
nickstenning committed Oct 29, 2015
1 parent 6172ee4 commit 9a655a0ce22bb450d454049b2f786a3ca83ae456
Showing with 11 additions and 3 deletions.
  1. +3 −1 h/api/search/query.py
  2. +8 −2 h/api/search/test/query_test.py
--- a/h/api/search/query.py
+++ b/h/api/search/query.py
@@ -197,4 +197,6 @@ def __call__(self, params):
             del params['tags']
         except KeyError:
             pass
-        return {'terms': {'tags': [tag for tag in tags]}} if tags else None
+        matchers = [{'match': {'tags': {'query': t, 'operator': 'and'}}}
+                    for t in tags]
+        return {'bool': {'must': matchers}} if matchers else None
--- a/h/api/search/test/query_test.py
+++ b/h/api/search/test/query_test.py
@@ -402,7 +402,10 @@ def test_tagsmatcher_aliases_tag_to_tags():
     result = query.TagsMatcher()(params)
-    assert result == {'terms': {'tags': ['foo', 'bar']}}
+    assert result == {'bool': {'must': [
+        {'match': {'tags': {'query': 'foo', 'operator': 'and'}}},
+        {'match': {'tags': {'query': 'bar', 'operator': 'and'}}},
+    ]}}
 def test_tagsmatcher_with_both_tag_and_tags():
@@ -411,7 +414,10 @@ def test_tagsmatcher_with_both_tag_and_tags():
     result = query.TagsMatcher()(params)
-    assert result == {'terms': {'tags': ['foo', 'bar']}}
+    assert result == {'bool': {'must': [
+        {'match': {'tags': {'query': 'foo', 'operator': 'and'}}},
+        {'match': {'tags': {'query': 'bar', 'operator': 'and'}}},
+    ]}}
 @pytest.fixture