[ML] Support rollover of state index #29938

Closed
9 tasks done
elasticmachine opened this issue May 12, 2017 · 4 comments

Comments

@elasticmachine
Collaborator

elasticmachine commented May 12, 2017

Original comment by @dimitris-athanasiou:

This issue tracks the work necessary to support rolling over the state index. We can add tasks as we identify them.

  • Add read and write aliases
  • For quantiles, check whether a document already exists and, if so, write to the index the old quantiles are in
  • For data frame analytics progress, check whether a document already exists and, if so, write to the index the old progress documents are in
  • For persistent state documents, check whether a document already exists and, if so, write to the index the old documents are in
  • Make sure that all get requests for state documents are replaced with _search requests that use an IDs query and ignore duplicates
  • Make sure all deletes can handle multiple indices
  • Make sure the index template has a wildcard pattern so new state indices get the template
  • Use ILM or another mechanism to create a new state index when the current one reaches a certain size
  • Remove empty state indices during nightly maintenance
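
The read and write tasks in the list above could be sketched roughly as follows. This is a minimal illustration in Python, not the actual Java implementation: the alias name `.ml-state-write`, the pattern constant, and both helper functions are assumptions made for the example. Only the `ids` query and the `_index` field on a search hit are standard Elasticsearch.

```python
from typing import Any, Dict, List

# Assumed names for illustration; the issue does not specify them.
STATE_INDEX_PATTERN = ".ml-state*"    # pattern searched on reads
STATE_WRITE_ALIAS = ".ml-state-write"  # alias used for brand-new documents


def build_ids_search(doc_id: str) -> Dict[str, Any]:
    """Build a _search body that stands in for a GET by ID.

    A GET needs one concrete index; an ids query can run against
    STATE_INDEX_PATTERN. size=1 means that if the same ID exists in
    several state indices, only one hit comes back (which one is
    not defined by this sketch).
    """
    return {"size": 1, "query": {"ids": {"values": [doc_id]}}}


def choose_write_target(hits: List[Dict[str, Any]]) -> str:
    """Pick where to write a state document.

    `hits` is the hits.hits array of the search response. If the
    document already exists, reuse its index so no duplicate is
    created in a newer index; otherwise fall back to the write alias.
    """
    if hits:
        return hits[0]["_index"]
    return STATE_WRITE_ALIAS
```

For example, with a hypothetical document ID `job1_quantiles`, `choose_write_target([])` returns the write alias, while a hit found in `.ml-state-000001` causes the update to go back to `.ml-state-000001`.
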
@droberts195
Contributor

@benwtrent this issue lists the work required to make it possible for ML to work when there is more than one state index, so it is somewhat related to state index upgrade. If you find that any other changes are required when you search for uses of the state index in our code, please add them to this issue.

@dimitris-athanasiou
Contributor

dimitris-athanasiou commented Sep 11, 2019

Note that the persister in ml-cpp expects an index, but the CSingleStreamDataAdder does not use the argument. In general, we should maintain control of the index on the Java side; C++ should not know which index we're storing the state into.

@droberts195
Contributor

droberts195 commented Jan 8, 2020

c++ should not know which index we're storing the state into

We checked this, and Java currently controls the index that the state is written into. However, the C++ code has constants containing .ml-state, which is confusing; these should be removed. The C++ "index" variables are currently used only by unit tests that persist to a directory instead of to Elasticsearch, in which case the "index" is the directory name. It should be renamed to something like "dummy_index_for_testing" to avoid confusion.

@przemekwitek
Contributor

The ILM policy for .ml-state* is implemented in master and 7.x.
I'm starting work on the last missing piece, i.e. removing empty .ml-state* indices.
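
As a rough illustration of what removing empty .ml-state* indices might involve, the selection step could look like the sketch below. It is written in Python for brevity and is not the actual implementation. The function name and the index names in the usage note are hypothetical, and it assumes that document counts and the current write index have already been fetched, and that the current write index should be kept even when it is empty.

```python
from typing import Dict, Iterable, List


def empty_state_indices(doc_counts: Dict[str, int],
                        write_indices: Iterable[str]) -> List[str]:
    """Return the state indices that are candidates for deletion.

    doc_counts maps each .ml-state* index name to its document count.
    write_indices lists the indices the write alias currently points
    to; these are skipped (an assumption of this sketch) so that new
    state always has somewhere to go.
    """
    keep = set(write_indices)
    return sorted(
        name
        for name, count in doc_counts.items()
        if count == 0 and name not in keep
    )
```

With counts of 0, 12 and 0 for `.ml-state-000001`, `.ml-state-000002` and `.ml-state-000003`, and `.ml-state-000003` as the write index, only `.ml-state-000001` is selected.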
