feat(peliasAdmin): Remove word delimiter filter
The first error seen when trying to use our current schema with
Elasticsearch 7 is:

```
[illegal_argument_exception] Token filter [word_delimiter] cannot be
used to parse synonyms
```

The [word delimiter](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-word-delimiter-tokenfilter.html)
token filter is only used in one place: the `peliasAdmin` analyzer.
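
One way to double-check where a token filter is referenced is to scan the `analysis.analyzer` section of the generated settings. The sketch below is only illustrative: `analyzersUsingFilter` is a hypothetical helper, and `exampleSettings` is a trimmed stand-in (the `peliasAdmin` filter list is the pre-change list from this commit's diff, while `someOtherAnalyzer` is invented), not the real output of `settings.js`.

```javascript
// Hypothetical helper: list the analyzers whose filter chain includes `filterName`.
function analyzersUsingFilter(settings, filterName) {
  const analyzers = (settings.analysis && settings.analysis.analyzer) || {};
  return Object.keys(analyzers).filter((name) =>
    (analyzers[name].filter || []).includes(filterName)
  );
}

// Trimmed, hand-written stand-in for the generated settings (assumption).
const exampleSettings = {
  analysis: {
    analyzer: {
      peliasAdmin: {
        filter: [
          'lowercase',
          'icu_folding',
          'trim',
          'word_delimiter',
          'custom_admin',
          'unique_only_same_position',
          'notnull'
        ]
      },
      // Invented analyzer, present only to show a non-matching case.
      someOtherAnalyzer: { filter: ['lowercase', 'trim'] }
    }
  }
};

console.log(analyzersUsingFilter(exampleSettings, 'word_delimiter'));
```

Run against the real settings object, a check along these lines would confirm whether removing the filter from `peliasAdmin` leaves any other references behind.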

Looking at the documentation for `word_delimiter`, it does _a lot_:
splitting words, handling punctuation, and even some basic stemming.

It is an extremely broad tool, and at this point it feels like something
that Elasticsearch might deprecate in the future.

Furthermore, looking at our integration tests, it seems one of the key
reasons we used it was to tokenize on hyphens, which the
`peliasNameTokenizer` has handled since #375.
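
As a rough illustration of tokenizing on hyphens at the tokenizer level, the sketch below splits text with a regular expression in plain JavaScript. The delimiter pattern and the `splitOnDelimiters` name are assumptions made for this example; the actual `peliasNameTokenizer` definition lives in `settings.js` and may differ.

```javascript
// Assumed delimiter set for illustration only:
// whitespace, commas, forward slashes, backslashes and hyphens.
const DELIMITERS = /[\s,\/\\-]+/;

// Split on runs of delimiters and drop any empty leading/trailing pieces.
function splitOnDelimiters(text) {
  return text.split(DELIMITERS).filter((token) => token.length > 0);
}

console.log(splitOnDelimiters('Trinidad-and-Tobago'));
console.log(splitOnDelimiters('Stratford-upon-Avon, England'));
```

If hyphens are already split at this stage, a later `word_delimiter` pass has little left to do for hyphenated names, which is the point made above.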

Considering how complicated this token filter is, and how it's now being
used with relatively little effect, it seems like something we can
remove.

Connects pelias/pelias#831
orangejulius committed Nov 7, 2019
1 parent 83e0cf2 commit 4b5dcb4
Showing 3 changed files with 1 addition and 3 deletions.
2 changes: 1 addition & 1 deletion integration/analyzer_peliasAdmin.js
@@ -25,7 +25,7 @@ module.exports.tests.analyze = function(test, common){
 assertAnalysis( 'notnull', ' ^ ', [] );

 // remove punctuation (handled by the char_filter)
-assertAnalysis( 'punctuation', punctuation.all.join(''), [] );
+assertAnalysis( 'punctuation', punctuation.all.join(''), ['0:&'] );

 suite.run( t.end );
 });
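
The new expected value `['0:&']` appears to read as a `position:token` pair. As a hedged sketch of how such strings could be derived from an Elasticsearch `_analyze` response (whose `tokens` entries carry `token` and `position` fields, among others), consider the hypothetical `formatTokens` helper below. The real `assertAnalysis` helper in this repository may format tokens differently, and `sampleResponse` is hand-written rather than captured output.

```javascript
// Hypothetical helper: turn an _analyze-style response into "position:token" strings.
function formatTokens(analyzeResponse) {
  return (analyzeResponse.tokens || []).map((t) => `${t.position}:${t.token}`);
}

// Hand-written sample (assumption); a real response also includes offsets and a type.
const sampleResponse = {
  tokens: [
    { token: '&', position: 0 }
  ]
};

console.log(formatTokens(sampleResponse));
```

Read this way, the test change records that an ampersand now survives analysis of the punctuation string once `word_delimiter` is gone.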
1 change: 0 additions & 1 deletion settings.js
@@ -43,7 +43,6 @@ function generate(){
 "lowercase",
 "icu_folding",
 "trim",
-"word_delimiter",
 "custom_admin",
 "unique_only_same_position",
 "notnull"
1 change: 0 additions & 1 deletion test/fixtures/expected.json
@@ -23,7 +23,6 @@
 "lowercase",
 "icu_folding",
 "trim",
-"word_delimiter",
 "custom_admin",
 "unique_only_same_position",
 "notnull"
