Add index mapping parameter for counted_keyword #103646

Merged

danielmitterdorfer
Member

With this commit we add a new mapping parameter, index, to the counted_keyword mapping type. Setting it to false reduces disk usage for use cases where indexed fields are not required.

Relates #101826
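
To make the description concrete, here is a minimal sketch of the request bodies involved. It only builds plain dictionaries and sends nothing to a cluster. The field name `events` and the aggregation name `event_terms` are taken from the test diff in this PR; the helper names are hypothetical, and the sketch assumes that omitting `index` keeps the previous (indexed) behavior.

```python
import json


def counted_keyword_mapping(field_name, index=True):
    """Build a create-index body with a single counted_keyword field.

    Assumption: leaving `index` out keeps the default, indexed behavior.
    """
    field = {"type": "counted_keyword"}
    if not index:
        # The new mapping parameter from this PR: skip the inverted index.
        field["index"] = False
    return {"mappings": {"properties": {field_name: field}}}


def counted_terms_request(field_name, agg_name="event_terms"):
    """Build a search body that runs a counted_terms aggregation on the field."""
    return {
        "size": 0,  # only the aggregation result is of interest
        "aggs": {agg_name: {"counted_terms": {"field": field_name}}},
    }


if __name__ == "__main__":
    print(json.dumps(counted_keyword_mapping("events", index=False), indent=2))
    print(json.dumps(counted_terms_request("events"), indent=2))
```

The first body would be sent when creating the index and the second as a search request against it; per the test comment in the diff, the aggregation is expected to keep working when the field is not indexed.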

@elasticsearchmachine
Collaborator

Hi @danielmitterdorfer, I've created a changelog YAML for you.

@elasticsearchmachine added the Team:Analytics label on Dec 21, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

Member

@martijnvg martijnvg left a comment

Left two testing comments. Looks good otherwise.

@@ -49,3 +66,20 @@ setup:
- match: { aggregations.event_terms.buckets.2.key: "c" }
- match: { aggregations.event_terms.buckets.2.doc_count: 2 }
- length: { aggregations.event_terms.buckets: 3 }

# although the field is not indexed, the counted_terms agg should still work
- do:
Member

Maybe also verify that no inverted index is created for the events field in the test-event-no-index index? For example, by executing a field caps API call and/or a disk usage API call and verifying the response?

Member Author

Good point! I've also split the test case files into one for the default options and one for index: false. The latter will be introduced in 8.13, and the split keeps the two cases neatly separated.
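
As a rough illustration of the disk usage check suggested in this thread, the sketch below reads the inverted index size for one field out of a response shaped like the output of the analyze index disk usage API (for example `POST /test-event-no-index/_disk_usage?run_expensive_tasks=true`). The helper name is hypothetical and the sample response is hand-written for illustration, not captured from a real cluster; it is not the assertion used in the PR's own test files.

```python
def inverted_index_bytes(disk_usage_response, index_name, field_name):
    """Return the inverted index size in bytes reported for one field."""
    field_stats = disk_usage_response[index_name]["fields"][field_name]
    return field_stats["inverted_index"]["total_in_bytes"]


# Hand-written sample in the assumed shape of a disk usage response.
# The numbers are invented for illustration, not measured.
sample_response = {
    "test-event-no-index": {
        "fields": {
            "events": {
                "total_in_bytes": 120,
                "inverted_index": {"total_in_bytes": 0},
            }
        }
    }
}

if __name__ == "__main__":
    size = inverted_index_bytes(sample_response, "test-event-no-index", "events")
    # With index: false, the expectation is that no inverted index exists.
    print("inverted index bytes for events:", size)
```

A check along these lines passes only when the reported inverted index size for the field is zero, which is the property the review comment asks to verify.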

@@ -78,16 +78,24 @@ public class CountedKeywordFieldMapper extends FieldMapper {
public static final String CONTENT_TYPE = "counted_keyword";
public static final String COUNT_FIELD_NAME_SUFFIX = "_count";

public static final FieldType FIELD_TYPE;
private static final FieldType FIELD_TYPE_INDEXED;
Member

Maybe also update CountedKeywordFieldMapperTests with a test similar to KeywordFieldMapperTests#testDisableIndex()?

Member Author

Added in 47732ce.

Member

@martijnvg martijnvg left a comment

LGTM

@danielmitterdorfer merged commit 53296e2 into elastic:main on Dec 22, 2023
15 checks passed
@danielmitterdorfer deleted the counted-keyword-no-index branch on December 22, 2023 10:25
jbaiera pushed a commit to jbaiera/elasticsearch that referenced this pull request Jan 10, 2024
Labels
:Analytics/Aggregations, >enhancement, Team:Analytics, v8.13.0

3 participants