Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove extraneous references to 'tokenized' in the mapper code. #31010

Merged
merged 1 commit into from
Jun 8, 2018

Conversation

jtibshirani
Copy link
Contributor

@jtibshirani jtibshirani commented May 31, 2018

These are likely left over from when there were three options for the index mapping (no, analyzed, and not_analyzed).

These are likely left over from when there were three options for
the index mapping ('no', 'analyzed', 'not_analyzed').
@jtibshirani jtibshirani added WIP :Search Foundations/Mapping Index mappings, including merging and defining field types v7.0.0 v6.4.0 labels May 31, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deletion in the BooleanFieldMapper and the FieldMapper make sense to me, the code seems unused. I left two questions though where I might not understand the difference of FieldMapper#tokenized (removed) vs. FieldType#tokenized.

@@ -376,9 +371,8 @@ protected void doXContentBody(XContentBuilder builder, boolean includeDefaults,

boolean indexed = fieldType().indexOptions() != IndexOptions.NONE;
boolean defaultIndexed = defaultFieldType.indexOptions() != IndexOptions.NONE;
if (includeDefaults || indexed != defaultIndexed ||
fieldType().tokenized() != defaultFieldType.tokenized()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another tokenized() method than the one deleted above, no? I don't know enough about the details of how mappings work, but how are they different? How is this removal in the if-clause related to the rest of this change?

@@ -163,7 +163,7 @@ public void checkCompatibility(MappedFieldType other, List<String> conflicts) {
boolean indexed = indexOptions() != IndexOptions.NONE;
boolean mergeWithIndexed = other.indexOptions() != IndexOptions.NONE;
// TODO: should be validating if index options go "up" (but "down" is ok)
if (indexed != mergeWithIndexed || tokenized() != other.tokenized()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, this it the field type tokenized flag, no? Not sure if this matters.

@jtibshirani
Copy link
Contributor Author

@cbuescher thanks for taking a look! They are indeed separate methods, but they concern the same flag. In short, each FieldMapper, which represents a field mapping, produces a MappedFieldType, which corresponds to an actual lucene field type and is used when constructing queries, highlighting, etc. Many methods on FieldMapper.Builder simply write through to the corresponding methods on the underlying MappedFieldType, like the tokenized method that I removed.

Previously when we had the string type (along with the index options analyzed, not_analyzed, and no), a single type of field could be set to tokenized, or not tokenized. My understanding is that now that string is split into keyword and text, this is no longer the case — a field’s type now fully determines whether its tokenized flag will be true or false (it’s always false for BooleanFieldType, always true for TextFieldType, etc.) This means that (1) it does not make sense to have a FieldMapper.Builder#tokenized method, as a field type’s tokenized flag shouldn’t change, and (2) if two fields are of the same type, we don’t need to check whether their ‘tokenized()’ values are the same.

I hope this helps explain the reasoning behind the change. I really appreciate additional sanity checks on these mapping refactors, as it seems like the code is a bit older and non-obvious.

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is no longer the case — a field’s type now fully determines whether its tokenized flag will be true or false

Thanks for the great explanation, I think I understand now. LGTM.

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jtibshirani jtibshirani merged commit 2378fa1 into elastic:master Jun 8, 2018
@jtibshirani jtibshirani deleted the refactor/field-mapper branch June 8, 2018 16:23
jtibshirani added a commit that referenced this pull request Jun 8, 2018
These are likely left over from when there were three options for
the index mapping ('no', 'analyzed', 'not_analyzed').

(cherry picked from commit 2378fa1)
dnhatn added a commit that referenced this pull request Jun 10, 2018
* 6.x:
  Move default location of dependencies report (#31228)
  Remove dependencies report task dependencies (#31227)
  Add recognition of MPL 2.0 (#31226)
  Fix unknown licenses (#31223)
  Fully encapsulate LocalCheckpointTracker inside of the engine (#31213)
  Remove version from license file name for GCS SDK (#31221)
  Remove DocumentFieldMappers#simpleMatchToFullName. (#31041)
  [DOCS] Removes 6.3.1 release notes
  [DOCS] Splits release notes by major version
  Remove DocumentFieldMappers#smartNameFieldMapper, as it is no longer needed. (#31018)
  Remove extraneous references to 'tokenized' in the mapper code. (#31010)
  SQL: Make a single JDBC driver jar (#31012)
  QA: Fix rolling restart tests some more
  Allow to trim all ops above a certain seq# with a term lower than X
  high level REST api: cancel task (#30745)
  Mute TokenBackwardsCompatibilityIT.testMixedCluster
  Mute WatchBackwardsCompatibilityIT.testWatcherRestart
  Enhance license detection for various licenses (#31198)
  [DOCS] Add note about long-lived idle connections (#30990)
  Add high-level client methods that accept RequestOptions (#31069)
  Remove RestGetAllMappingsAction (#31129)
  Move RestGetSettingsAction to RestToXContentListener (#31101)
  Move number of language analyzers to analysis-common module (#31143)
  flush job to ensure all results have been written (#31187)
dnhatn added a commit that referenced this pull request Jun 10, 2018
* master:
  Move default location of dependencies report (#31228)
  Remove dependencies report task dependencies (#31227)
  Add recognition of MPL 2.0 (#31226)
  Fix unknown licenses (#31223)
  Remove version from license file name for GCS SDK (#31221)
  Fully encapsulate LocalCheckpointTracker inside of the engine (#31213)
  [DOCS] Added 'fail_on_unsupported_field' param to MLT. Closes #28008 (#31160)
  Add licenses for transport-nio (#31218)
  Remove DocumentFieldMappers#simpleMatchToFullName. (#31041)
  Allow to trim all ops above a certain seq# with a term lower than X, post backport fix (#31211)
  Compliant SAML Response destination check (#31175)
  Remove DocumentFieldMappers#smartNameFieldMapper, as it is no longer needed. (#31018)
  Remove extraneous references to 'tokenized' in the mapper code. (#31010)
  Allow to trim all ops above a certain seq# with a term lower than X (#30176)
  SQL: Make a single JDBC driver jar (#31012)
  Enhance license detection for various licenses (#31198)
  [DOCS] Add note about long-lived idle connections (#30990)
  Move number of language analyzers to analysis-common module (#31143)
  Default max concurrent search req. numNodes * 5 (#31171)
  flush job to ensure all results have been written (#31187)
jasontedor added a commit to rjernst/elasticsearch that referenced this pull request Jun 10, 2018
…ecker

* elastic/master: (309 commits)
  [test] add fix for rare virtualbox error (elastic#31212)
  Move default location of dependencies report (elastic#31228)
  Remove dependencies report task dependencies (elastic#31227)
  Add recognition of MPL 2.0 (elastic#31226)
  Fix unknown licenses (elastic#31223)
  Remove version from license file name for GCS SDK (elastic#31221)
  Fully encapsulate LocalCheckpointTracker inside of the engine (elastic#31213)
  [DOCS] Added 'fail_on_unsupported_field' param to MLT. Closes elastic#28008 (elastic#31160)
  Add licenses for transport-nio (elastic#31218)
  Remove DocumentFieldMappers#simpleMatchToFullName. (elastic#31041)
  Allow to trim all ops above a certain seq# with a term lower than X, post backport fix (elastic#31211)
  Compliant SAML Response destination check (elastic#31175)
  Remove DocumentFieldMappers#smartNameFieldMapper, as it is no longer needed. (elastic#31018)
  Remove extraneous references to 'tokenized' in the mapper code. (elastic#31010)
  Allow to trim all ops above a certain seq# with a term lower than X (elastic#30176)
  SQL: Make a single JDBC driver jar (elastic#31012)
  Enhance license detection for various licenses (elastic#31198)
  [DOCS] Add note about long-lived idle connections (elastic#30990)
  Move number of language analyzers to analysis-common module (elastic#31143)
  Default max concurrent search req. numNodes * 5 (elastic#31171)
  ...
@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>refactoring :Search Foundations/Mapping Index mappings, including merging and defining field types v6.4.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants