Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partial search #20

Merged
merged 8 commits into from
Jun 5, 2019
Merged

Partial search #20

merged 8 commits into from
Jun 5, 2019

Conversation

ageorgou
Copy link
Contributor

@ageorgou ageorgou commented May 24, 2019

Fixes #17, partially addresses #6.

This makes it so partial matches will be returned - for example, searching for cat will match both cat and catch (in any of the searched fields). As before, if a user searches for multiple words (separated by spaces), this will return results matching any of those words.

Main parts:

  • Change from exact to partial matching
  • Search for words individually rather than treating the whole phrase as one prefix
  • Add tests for some simple cases

Up to now we only searched for exact matches, which was never the
intention! With this change, searching for e.g. "water" will also
return entries containing "waters" and "waterclock".

Apparently this only works with text, not keyword fields
(see https://stackoverflow.com/questions/46696190)
but this covers all the fields we are searching for
(we do have some keyword fields, but those are only used for
sorting, not matching).
This restores the OR behaviour of searching for multiple words
(see #17). It will be easy to change if we want to make AND the
default behaviour.
@ageorgou ageorgou mentioned this pull request May 24, 2019
4 tasks
The Elasticsearch client will be needed for tests of the search
results, so it's easier if it's accessible to all test files. Same
for the test glossary entries.
@ageorgou
Copy link
Contributor Author

ageorgou commented Jun 5, 2019

The setup of the test data could be improved, but that can be done as part of using an Elasticsearch plugin for pytest (see this comment in #6).

I think this is ready for review now.

@ageorgou ageorgou marked this pull request as ready for review June 5, 2019 10:20
@ageorgou ageorgou added infrastructure Tests, documentation, deployment etc search Search process and code and removed in progress labels Jun 5, 2019
Copy link
Contributor

@raquelalegre raquelalegre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved with a minor suggestion.

# Also note that this fixture cannot be module-scoped because monkeypatch is
# function-scoped.
monkeypatch.setattr(ingest.bulk_upload, "INDEX_NAME", test_index_name)
assert ingest.bulk_upload.INDEX_NAME == test_index_name # just making sure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has taken me a little while and some StackOverflow research to feel OK about an assert being inside of a fixture.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just tested it myself: a bad assertion in a fixture makes all tests that use it fail, before any code in those tests is run. I think it's safe enough to leave in!

tests/test_search.py Outdated Show resolved Hide resolved
@ageorgou ageorgou merged commit 75d7988 into development Jun 5, 2019
@ageorgou ageorgou deleted the partial-search branch June 7, 2019 16:47
This was referenced Jul 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
infrastructure Tests, documentation, deployment etc search Search process and code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants