Remove support for fielddata loading on _id. #64511

Open
jtibshirani opened this issue Nov 2, 2020 · 4 comments
Labels
:Analytics/Aggregations, >breaking, :Search/Search, Team:Analytics, Team:Search, >tech debt

Comments

@jtibshirani
Contributor

jtibshirani commented Nov 2, 2020

In 7.6 we introduced a cluster setting to prevent fielddata loading on _id (#49166). In a future version, we'd like to go further and remove support for loading fielddata on _id entirely.
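
For reference, here is a minimal sketch of the request body that would switch off `_id` fielddata loading through that cluster setting. The setting name `indices.id_field_data.enabled` is not stated in this issue; it is recalled from the Elasticsearch `_id` field documentation, so treat it as an assumption and verify it against your version. The helper name is hypothetical.

```python
import json

# Assumed setting name; it does not appear in this issue. Verify against the docs.
ID_FIELDDATA_SETTING = "indices.id_field_data.enabled"


def id_fielddata_settings_body(enabled: bool, persistent: bool = True) -> dict:
    """Build a body for `PUT _cluster/settings` (hypothetical helper)."""
    scope = "persistent" if persistent else "transient"
    return {scope: {ID_FIELDDATA_SETTING: enabled}}


if __name__ == "__main__":
    # Print the body that would disable fielddata loading on _id.
    print(json.dumps(id_fielddata_settings_body(False), indent=2))
```

The body would be sent with whatever HTTP client or Elasticsearch client is already in use; no client call is shown here on purpose.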

The reasoning: _id fielddata can use substantial heap memory, making it more complex to manage Elasticsearch. At the same time, it's uncommon to require fielddata on _id:

  • The main use case for sorting on _id is to achieve a consistent order in search_after requests. We plan to introduce a dedicated way to tiebreak in point-in-time searches that does not require _id (Virtual Sort field for automatic tie-breaking #56828).
  • It's hard to think of a strong use case for aggregating on _id.
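
To illustrate why a tiebreaker matters for search_after, here is a small in-memory sketch. It does not call Elasticsearch; it only mimics the "return hits strictly after these sort values" behaviour. The `timestamp` field, the sample documents, and all function names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Doc = Dict[str, object]
SortKey = Callable[[Doc], Tuple]


def key_with_tiebreak(doc: Doc) -> Tuple:
    # Primary sort field plus _id, so every document has a unique sort key.
    return (doc["timestamp"], doc["_id"])


def key_without_tiebreak(doc: Doc) -> Tuple:
    # Primary sort field only; documents can share the same sort key.
    return (doc["timestamp"],)


def search_page(docs: List[Doc], size: int, key: SortKey,
                search_after: Optional[Tuple] = None) -> List[Doc]:
    """Return the next page: hits whose sort key is strictly after the cursor."""
    ordered = sorted(docs, key=key)
    if search_after is not None:
        ordered = [d for d in ordered if key(d) > search_after]
    return ordered[:size]


def paginate(docs: List[Doc], size: int, key: SortKey) -> List[str]:
    """Walk all pages and collect the _id values that were returned."""
    seen: List[str] = []
    cursor: Optional[Tuple] = None
    while True:
        page = search_page(docs, size, key, cursor)
        if not page:
            return seen
        seen.extend(str(d["_id"]) for d in page)
        cursor = key(page[-1])  # sort values of the last hit on the page


DOCS: List[Doc] = [
    {"_id": "a", "timestamp": 1},
    {"_id": "b", "timestamp": 1},
    {"_id": "c", "timestamp": 1},
    {"_id": "d", "timestamp": 2},
]

if __name__ == "__main__":
    print(paginate(DOCS, 2, key_with_tiebreak))
    print(paginate(DOCS, 2, key_without_tiebreak))
```

With the `_id` tiebreak every document is visited once. Without it, a document that shares its `timestamp` with the last hit of a page is skipped, which is the inconsistency the tiebreaker avoids.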
@jtibshirani jtibshirani added the >breaking and :Search/Search labels Nov 2, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Search)

@elasticmachine elasticmachine added the Team:Search label Nov 2, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@elasticmachine elasticmachine added the Team:Analytics label Nov 2, 2020
@ppf2
Member

ppf2 commented Nov 2, 2020

Before we remove support for this from Elasticsearch, let's also make sure that we address the outstanding Kibana issue (elastic/kibana#57181) where it is using aggregations against the _id field for autocomplete features (and anywhere in Kibana we may be using _id for sorting/aggregations/scripting). @alexfrancoeur @AlonaNadler

@mitar
Contributor

mitar commented Oct 4, 2022

The use case I have for fielddata on _id (though doc values on _id would work just as well, see #60778) is that I want to sort results that have the same score randomly but predictably. I am using the random_score scoring function on the _id field with a random seed tied to the user, so that a user gets the same ordering even if they rerun the query. If fielddata were removed, this kind of random tiebreaking would not be possible.
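
A rough sketch of what such a request body might look like, built in Python. The `function_score` query with a `random_score` function taking `seed` and `field` is part of the Elasticsearch query DSL. The way the seed is derived from the user, the sample base query, and the helper names are assumptions made for illustration, not the commenter's actual code. How the random value is combined with the relevance score (for example via `boost_mode`) is not described in the comment and is left at the default here.

```python
import hashlib
import json


def seed_for_user(user_id: str) -> int:
    """Derive a stable integer seed from a user identifier (hypothetical scheme)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Keep the seed in the non-negative 32-bit range to stay conservative.
    return int.from_bytes(digest[:4], "big") & 0x7FFFFFFF


def random_score_query(base_query: dict, user_id: str, field: str = "_id") -> dict:
    """Wrap a query in function_score with a per-user seeded random_score."""
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "random_score": {"seed": seed_for_user(user_id), "field": field},
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical base query and user identifier.
    body = random_score_query({"match": {"title": "elasticsearch"}}, "user-42")
    print(json.dumps(body, indent=2))
```

Because the seed depends only on the user identifier, the same user produces the same request body, and so the same ordering over unchanged data. The `field` parameter is exposed so that a field not backed by `_id` fielddata could be swapped in if one is available.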
