[Rollup] Managing index lifecycle #33065

Closed
polyfractal opened this issue Aug 22, 2018 · 5 comments
Labels
>enhancement :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data

Comments

@polyfractal
Contributor

Today, a Rollup job will store its results in a single rollup index. There is currently no provision for handling jobs that generate such a large volume that they need multiple indices to scale even the rollup data.

There are a couple of routes we can take... it's not clear to me which is best. Current Rollup limitations make it tricky too.

Wait for ILM

Easiest option... wait for ILM (#29823) to be merged and then revisit this conversation. Integrating with ILM somehow will likely provide a better experience instead of baking smaller parts into Rollup.

Support external Rollover

Rollup doesn't play nicely with Rollover today because we try to create the destination rollup index (and if it exists, update the metadata). So if the user points their config at a Rollover alias, we throw an exception.

We could allow Rollup to point at aliases, which I think would let the user manually Rollover indices. There are some tricky bits to this though. Because Rollup uses deterministic IDs for disaster recovery after a checkpoint, the user would have to make sure a checkpoint has been fully committed before rolling over:

  1. Stop the job, wait for it to checkpoint and finish
  2. Rollover the index
  3. Re-enable the job

It's not terrible, but not super user-friendly either.
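The three manual steps above can be sketched as an ordered list of REST calls. This is only an illustration under several assumptions: it presumes Rollup has been changed to accept an alias as its destination (which it does not do today), the `/_rollup/job/...` paths follow later Elasticsearch releases (at the time of this issue they were prefixed with `_xpack`), and `send` is a hypothetical transport callable supplied by the caller.

```python
from typing import Callable, List, Tuple

# (HTTP method, path) pairs; the transport itself is supplied by the caller.
Request = Tuple[str, str]


def rollover_requests(job_id: str, alias: str) -> List[Request]:
    """Return the stop / rollover / start sequence for one rollup job."""
    return [
        # Block until the job has fully stopped, so its checkpoint is committed.
        ("POST", f"/_rollup/job/{job_id}/_stop?wait_for_completion=true"),
        # Roll the alias over to a fresh concrete index.
        ("POST", f"/{alias}/_rollover"),
        # Resume the job; it now writes through the alias to the new index.
        ("POST", f"/_rollup/job/{job_id}/_start"),
    ]


def run_rollover(job_id: str, alias: str, send: Callable[[str, str], None]) -> None:
    """Issue the requests strictly in order; an exception from `send` aborts the rest."""
    for method, path in rollover_requests(job_id, alias):
        send(method, path)
```

Keeping the sequence as data makes the ordering constraint explicit: the rollover request is never sent unless the stop request returned without raising.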

Internally support Rollover

We could instead implement the Rollover functionality in Rollup. It'd be essentially the same thing, same procedure, just handled by Rollup. Probably as another config option, and we just check the Rollover criteria when checkpointing or something.

Destination date math/patterns

We could implement something like:

"index_pattern": "logstash-*",
"rollup_index": "logstash-%{+YYYY.MM.dd}",

Which would dynamically create destination indices according to the timestamp of the rollup document. Unlike Rollover, we don't have to worry about backtracking and replaying documents because docs will deterministically land in their destination index too.

This does complicate job creation a little bit, since indices are generated on-demand instead of up-front. That means we'd need to find a way to enrich those indices with metadata after they are generated dynamically.
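To make the "deterministically land in their destination index" point concrete, here is a minimal sketch of how such a pattern could be resolved from a rollup document's timestamp. The `%{+...}` syntax is the proposal from this issue, not an existing Rollup option; the function name, the three supported tokens, and the choice of UTC are all assumptions for illustration.

```python
import re
from datetime import datetime, timezone

# Assumed token subset: only the three tokens used in the example pattern.
_TOKENS = {"YYYY": "%Y", "MM": "%m", "dd": "%d"}
_DATE_EXPR = re.compile(r"%\{\+([^}]+)\}")


def resolve_rollup_index(pattern: str, timestamp: datetime) -> str:
    """Expand each %{+...} expression using the document timestamp, in UTC.

    `timestamp` should be timezone-aware; a naive value would be
    interpreted as local time by astimezone().
    """
    ts = timestamp.astimezone(timezone.utc)

    def expand(match):
        fmt = match.group(1)
        for token, directive in _TOKENS.items():
            fmt = fmt.replace(token, directive)
        return ts.strftime(fmt)

    return _DATE_EXPR.sub(expand, pattern)
```

Because the index name depends only on the pattern and the document's own timestamp, replaying the same document after a failure resolves to the same index, which is why no backtracking is needed.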

Big issue related to all approaches

The major problem with all of these approaches is that Rollup doesn't allow more than one rollup index in a single RollupSearch request. This was mainly to limit complexity internally rather than because of a hard limit. And I think the restriction is less important now that missing_bucket is implemented.

I think we could loosen this restriction as long as all indices involved in the search share the exact same set of rollup jobs, that way we don't have to worry about weird mixed-job incompatibilities.
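The proposed check could look roughly like the sketch below. It assumes the caller has already collected, for each rollup index, a mapping of job ID to job configuration (for example from each index's metadata); how that is gathered, and the function name, are hypothetical.

```python
from typing import Mapping


def share_identical_jobs(jobs_by_index: Mapping[str, Mapping[str, dict]]) -> bool:
    """True when every index carries the same job IDs with equal configs."""
    configs = list(jobs_by_index.values())
    if not configs:
        return False  # nothing to search
    first = dict(configs[0])
    # Dict equality compares both the set of job IDs and each job's config.
    return all(dict(other) == first for other in configs[1:])
```

A search spanning several rollup indices would be accepted only when this returns true, which sidesteps mixed-job incompatibilities without otherwise changing the search logic.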

@polyfractal polyfractal added >enhancement :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data labels Aug 22, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

@polyfractal
Contributor Author

@colings86 and I chatted about this some. Enabling this via ILM is a good long-term goal. With that in mind, ILM would likely invoke this sort of behavior with the Rollover API. From Rollup's point of view, that would look the same as an external Rollover event, so if we enable that behavior we can A) let users do it manually today and B) be ready to work with ILM when it's ready.

We'll need to loosen the restriction on multi-rollup-index search, but I think that should be easy enough as long as we can guarantee all indices in the query have identical jobs.

At rollover time, we have two concerns. The first is the orchestration of stop/rollover/start as described in the OP. The other issue is dealing with _meta on the newly created index. Since Rollover will be creating the index and not Rollup, we'll need some way to get the job metadata onto the new index.

  • One approach is a periodic thread that checks to see if we're writing to a different concrete index, and upgrades its metadata when we notice. This seems a bit scary since disaster recovery at that point could be unpleasant.
  • We could enhance the Rollover API to also copy over settings/mappings from the old index to the new index. This would be the best solution from Rollup's perspective, since it would transparently handle the new index.
    • There's an open question of how the first index would get metadata. Potentially, Rollup could create the alias and initial index automatically, defaulting to a rollover setup.
  • Something else
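Whichever component ends up doing it, the metadata step amounts to carrying the job entries from the old index's mapping over to the new one. The sketch below shows that merge on plain dictionaries; it assumes the job configs live under `_meta._rollup` keyed by job ID, and the function name is hypothetical.

```python
import copy


def carry_over_rollup_meta(old_mapping: dict, new_mapping: dict) -> dict:
    """Return a copy of new_mapping with the old index's _meta._rollup merged in."""
    merged = copy.deepcopy(new_mapping)
    old_jobs = old_mapping.get("_meta", {}).get("_rollup", {})
    target = merged.setdefault("_meta", {}).setdefault("_rollup", {})
    for job_id, config in old_jobs.items():
        # Keep anything already on the new index; only fill in missing jobs.
        target.setdefault(job_id, copy.deepcopy(config))
    return merged
```

The function returns a new mapping instead of mutating its inputs, so a caller can compare the result against the current mapping and skip the update when nothing changed.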

@jasondu168

I am using the rollup_index and now the rollup_index becomes bigger and bigger.
I have to find a way to delete old rollup data. I like the idea:

"index_pattern": "logstash-*",
"rollup_index": "logstash-%{+YYYY.MM.dd}",

The _rollup_search API should be able to use an alias that contains all the rollup indices ending with {+YYYY.MM.dd}.

@bhatch4

bhatch4 commented Mar 1, 2019

I would love to see the rollup_index pattern like mentioned above.
We get a lot of metrics and a lot of log data. Some indices get 100-300 million documents every day. Because of the amount of data, we can only keep data around for so long. The rollup feature could allow us to keep summarized data for much longer. We could also move some reports onto the rollup index to save on resources. For example, we could keep the original data for 1 month, and the rollup data for 1 year. Curator jobs would take care of removing indices when they are too old.
However, without a way of managing the rollup index this isn't possible. At least not without some manual control.
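As an illustration of the retention scheme described here, the sketch below picks out date-suffixed rollup indices that have aged past a retention window. It assumes indices are named `<prefix>YYYY.MM.dd`, as in the proposed pattern; the function name and parameters are hypothetical, and the actual deletion would be left to Curator or a similar tool.

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List


def expired_indices(names: Iterable[str], prefix: str, keep_days: int, today: date) -> List[str]:
    """Return indices named <prefix>YYYY.MM.dd whose date is older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        if not name.startswith(prefix):
            continue
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date; leave the index alone
        if day < cutoff:
            expired.append(name)
    return expired
```

With one rollup index per day, a one-year retention for rollup data and a one-month retention for raw data become two calls with different prefixes and `keep_days` values.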

@polyfractal
Contributor Author

Superseded by #48003

We're going to try to integrate Rollup directly into ILM (as an action) rather than trying to get ILM and the current Rollup indexer to "coordinate". That is easier for the user and simpler to manage: one configuration, and no need to keep the two tasks synced. It also fits nicely into how people naturally want to use Rollups (as part of their index lifecycle and overall retention scheme).
