This repository has been archived by the owner on Jan 8, 2020. It is now read-only.

Points refactor #145

Merged
merged 10 commits into from
Sep 8, 2015

Conversation

jmarin
Contributor

@jmarin jmarin commented Sep 2, 2015

Improves the address points search by using a match query instead of match_phrase in Elasticsearch. The fuzzier search returns more matches, but their relevance (and thus spatial accuracy) could be lower. To give a sense of quality, each result includes a match score that measures how similar the input and found strings are (maximum value 1, i.e. 100% equal).
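To illustrate the difference between the two query types, here is a minimal sketch of the request bodies in Python. The field name `address` and the sample input are assumptions for illustration only; the actual mapping and query construction in this PR are not shown in this thread.

```python
import json

# Hypothetical field name and input; not taken from this PR.
FIELD = "address"
TEXT = "1311 30th St NW Washington DC"


def phrase_query(field, text):
    """match_phrase: the analyzed terms must all appear, in the same order."""
    return {"query": {"match_phrase": {field: text}}}


def match_query(field, text):
    """match: the analyzed terms are combined with OR by default,
    so documents containing only some of the terms can still match."""
    return {"query": {"match": {field: text}}}


if __name__ == "__main__":
    print(json.dumps(phrase_query(FIELD, TEXT), indent=2))
    print(json.dumps(match_query(FIELD, TEXT), indent=2))
```

Because `match` accepts partial term overlap, it can return a hit where `match_phrase` returns nothing, which is why a separate quality score on the result becomes useful.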

@hkeeler
Member

hkeeler commented Sep 2, 2015

@jmarin, overall this looks good. I think it would be worthwhile to add some comments to SearchUtils.levenshtein explaining why it is necessary, even though Elasticsearch's fuzziness feature appears to do the same thing.

One main difference I see is that Elasticsearch's fuzziness feature only supports a max of 2 edits per string, and we'd definitely need more for how you're proposing we use this.

For reference: Elasticsearch Fuzziness
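For comparison, a query-side fuzziness setting might look like the sketch below. The field name `address` is an assumption for illustration; `fuzziness` is the standard option on a `match` query, and it applies to each analyzed term.

```python
import json


def fuzzy_match_query(field, text, fuzziness=2):
    """Build a match query body that tolerates small edits in each term.

    `field` is a hypothetical field name, not taken from this PR.
    """
    return {
        "query": {
            "match": {
                field: {
                    "query": text,
                    "fuzziness": fuzziness,
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(fuzzy_match_query("address", "1311 30th St NW"), indent=2))
```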

@jmarin
Contributor Author

jmarin commented Sep 3, 2015

@hkeeler The fuzziness feature in Elasticsearch is different from the Levenshtein implementation used in this PR. While the algorithm is the same, the ES one is a parameter you pass to the query to control the fuzziness of a term search (or a match search, as per the documentation, which is the one we are using). Here, it is used instead to report the difference between what the engine found and what was issued as input (as a percentage derived from the Levenshtein distance).
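As a rough sketch of that reporting step, the Python below computes a Levenshtein distance and turns it into a score between 0 and 1. The normalization (dividing by the longer string's length) and the name `match_score` are assumptions for illustration; the exact formula used by SearchUtils.levenshtein is not shown in this thread.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions, or
    substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def match_score(input_text: str, found_text: str) -> float:
    """Similarity in [0, 1]; 1.0 means the strings are identical.

    Assumed normalization: 1 - distance / length of the longer string.
    """
    longest = max(len(input_text), len(found_text))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(input_text, found_text) / longest


if __name__ == "__main__":
    print(match_score("1311 30th St NW", "1311 30th Street NW"))
```

Unlike query-side fuzziness, this score is computed after the search, on whole strings, so it is not bounded by a per-term edit limit.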

hkeeler added a commit that referenced this pull request Sep 8, 2015
@hkeeler hkeeler merged commit d640b10 into cfpb:master Sep 8, 2015