feat(addressUsingIdsQuery): Add slop parameter#101
feat(addressUsingIdsQuery): Add slop parameter#101orangejulius wants to merge 1 commit intomasterfrom
Conversation
This adds support for the `match_phrase` [slop parameter](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query-phrase.html). The default is set to 0, so as it stands, this addition will not change the outcome of any queries.
Combined with pelias/query#101, this change allows for improved matching of streets when using the addressSearchUsingIDs query (the first query executed by the search endpoint). Essentially, by allowing a `slop` of 1 in the `match_phrase` clauses of the Elasticsearch query, its now possible to omit a single word _in the middle_ of a street name and still match it. Omitting words at the start or end of the street name has been supported since pelias/schema#310. Some examples of where this helps: | Actual street name | Newly matching input text | | --- | --- | | SE Martin Luther King Jr Blvd | Martin Luther King Blvd | | Carrer de Balmes | Carrer Balmes |
|
|
||
| // default address:street:slop to 0 if unset | ||
| if (!vs.isset('address:street:slop')) { | ||
| vs.var('address:street:slop', 0); |
There was a problem hiding this comment.
@missinglink, I put this here as a shortcut. I notice that most of our code in API makes copies of the variable store rather than changing variables in views. Do you think that would be a better approach here?
There was a problem hiding this comment.
I would prefer that if the variable was unset then we totally omit it from the query, rather than trying to set it to the default in ES.
The default slop set by elasticsearch is 0 but this could change at any time.
That's how most of the views I've written work, they have the 'mandatory' properties in the main blob and then a bunch of if conditions below which set optional params.
There was a problem hiding this comment.
If you want to set defaults then the correct place to do this would be in the search_defaults file in API.
There was a problem hiding this comment.
Funny, I prefer the opposite, I explicitly wrote it intending to ensure the slop parameter is always in the query so that it's clear what its set to. As far as pelias/query is concerned, I believe the default should be set to zero. If Elasticsearch changes the default slop value that would be a huge pain and I'd prefer we avoid that (extremely slim) possibility.
The default in search_defaults in the API is going to be 1 to get the behavior we want, and that is handled in pelias/api#1322
There was a problem hiding this comment.
As a general principal I think that we should reserve the option of omission of optional properties.
There is a valid case for a developer wishing to rely on the default behaviour rather than specifying it explicitly.
I can see the benefit of setting it to 0, mainly for clarity (at the cost of verbosity).
Although I feel it is still better as an optional parameter where the caller (pelias/api) is able to make the decision about whether it will be included or excluded from the query.
Just my 2c :)
There was a problem hiding this comment.
Funny thing is that our existing phrase view does it as you suggest:
There was a problem hiding this comment.
It seems odd to me in the existing phrase.js view that we have two optional properties (slop and cutoff_frequency) which are handled differently.
I think we agree that we wouldn't want to set a default value for cutoff_frequency fuzziness et al?
There is something about slop which makes it feel like it's mandatory even though it isn't.
|
I intend to add slop to more |
Combined with pelias/query#101, this change allows for improved matching of streets when using the addressSearchUsingIDs query (the first query executed by the search endpoint). Essentially, by allowing a `slop` of 1 in the `match_phrase` clauses of the Elasticsearch query, its now possible to omit a single word _in the middle_ of a street name and still match it. Omitting words at the start or end of the street name has been supported since pelias/schema#310. Some examples of where this helps: | Actual street name | Newly matching input text | | --- | --- | | SE Martin Luther King Jr Blvd | Martin Luther King Blvd | | Carrer de Balmes | Carrer Balmes |
This change builds on the previous series of PRs (#112, and #111) to support the `slop` parameter on the `match_phrase` queries responsible for handling street addresses. Replaces #101 Connects pelias/api#1322
Combined with pelias/query#101, this change allows for improved matching of streets when using the addressSearchUsingIDs query (the first query executed by the search endpoint). Essentially, by allowing a `slop` of 1 in the `match_phrase` clauses of the Elasticsearch query, its now possible to omit a single word _in the middle_ of a street name and still match it. Omitting words at the start or end of the street name has been supported since pelias/schema#310. Some examples of where this helps: | Actual street name | Newly matching input text | | --- | --- | | SE Martin Luther King Jr Blvd | Martin Luther King Blvd | | Carrer de Balmes | Carrer Balmes |
Combined with pelias/query#101, this change allows for improved matching of streets when using the addressSearchUsingIDs query (the first query executed by the search endpoint). Essentially, by allowing a `slop` of 1 in the `match_phrase` clauses of the Elasticsearch query, its now possible to omit a single word _in the middle_ of a street name and still match it. Omitting words at the start or end of the street name has been supported since pelias/schema#310. Some examples of where this helps: | Actual street name | Newly matching input text | | --- | --- | | SE Martin Luther King Jr Blvd | Martin Luther King Blvd | | Carrer de Balmes | Carrer Balmes | Since pelias/query#113 has been merged, this commit will result in changes to how queries behave.
Combined with pelias/query#101, this change allows for improved matching of streets when using the addressSearchUsingIDs query (the first query executed by the search endpoint). Essentially, by allowing a `slop` of 1 in the `match_phrase` clauses of the Elasticsearch query, its now possible to omit a single word _in the middle_ of a street name and still match it. Omitting words at the start or end of the street name has been supported since pelias/schema#310. Some examples of where this helps: | Actual street name | Newly matching input text | | --- | --- | | SE Martin Luther King Jr Blvd | Martin Luther King Blvd | | Carrer de Balmes | Carrer Balmes | Since pelias/query#113 has been merged, this commit will result in changes to how queries behave.
Combined with pelias/query#101, this change allows for improved matching of streets when using the addressSearchUsingIDs query (the first query executed by the search endpoint). Essentially, by allowing a `slop` of 1 in the `match_phrase` clauses of the Elasticsearch query, its now possible to omit a single word _in the middle_ of a street name and still match it. Omitting words at the start or end of the street name has been supported since pelias/schema#310. Some examples of where this helps: | Actual street name | Newly matching input text | | --- | --- | | SE Martin Luther King Jr Blvd | Martin Luther King Blvd | | Carrer de Balmes | Carrer Balmes | Since pelias/query#113 has been merged, this commit will result in changes to how queries behave.
Combined with pelias/query#101, this change allows for improved matching of streets when using the addressSearchUsingIDs query (the first query executed by the search endpoint). Essentially, by allowing a `slop` of 1 in the `match_phrase` clauses of the Elasticsearch query, its now possible to omit a single word _in the middle_ of a street name and still match it. Omitting words at the start or end of the street name has been supported since pelias/schema#310. Some examples of where this helps: | Actual street name | Newly matching input text | | --- | --- | | SE Martin Luther King Jr Blvd | Martin Luther King Blvd | | Carrer de Balmes | Carrer Balmes | Since pelias/query#113 has been merged, this commit will result in changes to how queries behave.
This adds support for the
match_phraseslop parameter.The default is set to 0, so as it stands, this addition will not change the outcome of any queries.