Disallow new rollup jobs in clusters with no rollup usage. #108624

Merged on May 21, 2024 (22 commits)

martijnvg
Member

This change adds logic to the put rollup API that fails the request if no rollup job is active and no rollup index exists in the cluster.

The logic first checks whether there is an active rollup persistent task. If there are none, it then checks whether any rollup index exists. The latter check is expensive, but since it only runs as part of the put rollup job API, and only when there are no rollup jobs, this should be OK.
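The ordering of the two checks described above can be sketched roughly as follows. This is a minimal illustration in plain Java, not the code from this PR: `RollupUsageGate`, `mayCreateRollupJob`, and both parameters are hypothetical names, and the expensive index scan is modelled as a `BooleanSupplier` so that it only runs when no rollup task is active.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the gate described above; these names are
// illustrative and are not the actual Elasticsearch classes.
final class RollupUsageGate {

    /**
     * Returns true if a new rollup job may be created.
     *
     * @param activeRollupTasks    ids of currently active rollup persistent tasks
     * @param anyRollupIndexExists expensive check, evaluated only when no task is active
     */
    static boolean mayCreateRollupJob(List<String> activeRollupTasks, BooleanSupplier anyRollupIndexExists) {
        // Cheap check first: an active rollup task already proves rollup usage.
        if (!activeRollupTasks.isEmpty()) {
            return true;
        }
        // Expensive fallback: look for any index that carries rollup metadata.
        return anyRollupIndexExists.getAsBoolean();
    }
}
```

Passing the second check as a supplier keeps the cost of the index scan off the common path where a rollup job already exists.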

All tests that invoke the put rollup job API will need to be adjusted to create a dummy index that has rollup mapping metadata. Otherwise, the tests can't create a rollup job.

Closes #108381

@martijnvg martijnvg added >breaking :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data labels May 14, 2024
@elasticsearchmachine
Collaborator

Hi @martijnvg, I've created a changelog YAML for you. Note that since this PR is labelled >breaking, you need to update the changelog YAML to fill out the extended information sections.

@martijnvg martijnvg marked this pull request as ready for review May 17, 2024 08:17
@martijnvg martijnvg requested a review from a team as a code owner May 17, 2024 08:17
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

listener.onFailure(
    new IllegalArgumentException(
        "rollup has been deprecated and will be removed, therefor starting "
            + "from 8.15.0, creating new rollup jobs is no longer allowed in clusters that don't have any rollup usage."
Contributor

therefore*

I'd just say: "new rollup jobs are not allowed in clusters that don't have any rollup usage, since rollup has been deprecated". The details should be included in rollup documentation.

if (parser.nextToken() == XContentParser.Token.START_OBJECT) {
    if ("_meta".equals(parser.nextFieldName())) {
        if (parser.nextToken() == XContentParser.Token.START_OBJECT) {
            if ("_rollup".equals(parser.nextFieldName())) {
Contributor

This looks pretty fragile, what if there's another field before _rollup?

Since this is fairly infrequent, I'd go with using sourceAsMap.
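For illustration, a map-based lookup along the lines suggested above might look like the sketch below. It uses plain `java.util` maps as a stand-in for the parsed mapping source, so the nested `_doc` / `_meta` / `_rollup` shape is an assumption taken from this thread, and `RollupMetaLookup` and `hasRollupMeta` are hypothetical names rather than anything in the Elasticsearch code base.

```java
import java.util.Map;

// Hypothetical helper; operates on an already-parsed mapping represented as nested maps.
final class RollupMetaLookup {

    /**
     * Returns true if the mapping contains _doc._meta._rollup,
     * regardless of sibling fields or the order in which they appear.
     */
    static boolean hasRollupMeta(Map<String, Object> mapping) {
        Object doc = mapping.get("_doc");
        if (!(doc instanceof Map)) {
            return false;
        }
        Object meta = ((Map<?, ?>) doc).get("_meta");
        if (!(meta instanceof Map)) {
            return false;
        }
        // Key lookup does not depend on field order, unlike a positional token walk.
        return ((Map<?, ?>) meta).containsKey("_rollup");
    }
}
```

The trade-off is that the whole mapping has to be materialized as nested maps before the lookup can run.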

Member Author

> This looks pretty fragile, what if there's another field before _rollup?

I don't think there will be another field. The logic in TransportPutRollupJobAction#createMappings(...) always adds a _doc._meta._rollup field, and the PARSER_CONFIGURATION constant in TransportPutRollupJobAction only allows one field to be present. Let me add some unit tests for this.

> I'd go with using sourceAsMap.

Mappings can be pretty large, and I'd like to avoid converting large mappings into a map of maps on the JVM heap.

Contributor

In this case, it should throw if there's another field?

Sorry, did you add a unit test for this that I missed?

Member Author

Contributor

Right, I don't see a case with multiple fields under _doc._meta in this file?

Member Author

Ah, I see. I added that now too: 3b2b145

@martijnvg martijnvg requested a review from kkrik-es May 17, 2024 13:12
@martijnvg martijnvg merged commit 9585504 into elastic:main May 21, 2024
15 checks passed
This was referenced Jun 6, 2024
Labels
>breaking :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data Team:StorageEngine v8.15.0
Development

Successfully merging this pull request may close these issues.

Prohibit creation of rollup jobs in clusters that have never used rollup.
4 participants