Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove nGram and edgeNGram token filter names #38911

Merged
merged 5 commits into from Feb 15, 2019

Conversation

cbuescher
Copy link
Member

In #30209 we deprecated the camel case nGram filter name in favour of ngram and
did the same for edgeNGram and edge_ngram. Using these names has been deprecated
since 6.4 and is issuing deprecation warnings since then.
I think we can remove these filters in 8.0. In a backport of this PR I would change what was a
dreprecation warning from 6.4. to an error starting with new indices created in 7.0.

@cbuescher cbuescher added :Search/Analysis How text is split into tokens >refactoring v8.0.0 labels Feb 14, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

@@ -1,9 +1,9 @@
[[analysis-edgengram-tokenfilter]]
=== Edge NGram Token Filter

A token filter of type `edgeNGram`.
A token filter of type `edge_ngram`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These doc-changes should go back to 7.0 as well.

@@ -81,7 +81,7 @@ public void testNgramHighlightingWithBrokenPositions() throws IOException {
.put("analysis.tokenizer.autocomplete.max_gram", 20)
.put("analysis.tokenizer.autocomplete.min_gram", 1)
.put("analysis.tokenizer.autocomplete.token_chars", "letter,digit")
.put("analysis.tokenizer.autocomplete.type", "nGram")
.put("analysis.tokenizer.autocomplete.type", "ngram")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is about the tokenizer, not the filter, but I think we also shouldn't use the camel case version of that one anymore.

@@ -133,7 +101,7 @@
text: "foobar"
explain: true
tokenizer:
type: nGram
type: ngram
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, this is about the tokenizer, but we shouldn't use the camel case name here regardless

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, can you add an entry in the breaking changes (you'll need to create one for 8.0) ?

@cbuescher
Copy link
Member Author

LGTM, can you add an entry in the breaking changes (you'll need to create one for 8.0) ?

Will do. I was wondering if we also need to start throwing errors starting with 7.0 when token filters are used via "nGram" or "edgeNGram" in new indices, so essentially throwing an error where since 6.4 we have issued the deprecation warning, or if we can remove the filter in 8.0 only on the basis of having deprecated it in 6.4? The problem with token filters is AFAIK we currently cannot easily throw errors on e.g. index creation really but only when the filter is used (e.g. an index or analyze operation). Wdyt?

@jimczi
Copy link
Contributor

jimczi commented Feb 15, 2019

I was wondering if we also need to start throwing errors starting with 7.0 when token filters are used via "nGram" or "edgeNGram" in new indices, so essentially throwing an error

+1, we should throw an error if the deprecated name is used on an index created in 7.0+.

@cbuescher cbuescher merged commit 7bb2da1 into elastic:master Feb 15, 2019
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Feb 15, 2019
* master:
  Address some CCR REST test case flakiness (elastic#38975)
  Edits to text in Completion Suggester doc (elastic#38980)
  SQL: doc polishing
  [DOCS] Fixes broken formatting
  SQL: Polish the rest chapter (elastic#38971)
  Remove `nGram` and  `edgeNGram` token filter names (elastic#38911)
  Add an exception throw if waiting on transport port file fails (elastic#37574)
  Improve testcluster distribution artifact handling (elastic#38933)
  Advance max_seq_no before add operation to Lucene (elastic#38879)
  Reduce global checkpoint sync interval in disruption tests (elastic#38931)
  [test] disable packaging tests for suse boxes
  Relax testStressMaybeFlushOrRollTranslogGeneration (elastic#38918)
  [DOCS] Edits warning in put watch API (elastic#38582)
  Fix serialization bug in ShardFollowTask after cutting this class over to extend from ImmutableFollowParameters.
  [DOCS] Updates methods for upgrading machine learning (elastic#38876)
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Feb 18, 2019
In elastic#30209 we deprecated the camel case `nGram` filter name in favour of `ngram` and
did the same for `edgeNGram` and `edge_ngram`. Using these names has been deprecated
since 6.4 and is issuing deprecation warnings since then.
I think we can remove these filters in 8.0. In a backport of this PR I would change what was a
dreprecation warning from 6.4. to an error starting with new indices created in 7.0.
cbuescher pushed a commit that referenced this pull request Feb 21, 2019
In #30209 we deprecated the camel case `nGram` filter name in favour of `ngram` and
did the same for `edgeNGram` and `edge_ngram` and we are removing those names in
8.0. This change disallows using the deprecated names for new indices created in 7.0 by
throwing an error if these filters are used.

Relates to #38911
cbuescher pushed a commit that referenced this pull request Feb 21, 2019
In #30209 we deprecated the camel case `nGram` filter name in favour of `ngram` and
did the same for `edgeNGram` and `edge_ngram` and we are removing those names in
8.0. This change disallows using the deprecated names for new indices created in 7.0 by
throwing an error if these filters are used.

Relates to #38911
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants