Conversation

@gmarouli
Contributor

Downsample tasks run on nodes that hold a searchable copy of the shard to be downsampled. So far, we have chosen the primary shard. This is sufficient in general, but not in a stateless deployment, where only non-primary shards are searchable.

In this PR we add functionality to detect a stateless deployment and choose only search shards.
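
A minimal sketch of the idea (hypothetical class name, assuming the isStateless flag and the IndexShardRoutingTable/ShardRouting accessors that appear in the review comments below; the actual PR code may differ):

import java.util.HashSet;
import java.util.Set;

import org.elasticsearch.cluster.routing.IndexShardRoutingTable;
import org.elasticsearch.cluster.routing.ShardRouting;

/** Sketch only: picks the nodes holding a shard copy that downsampling may run on. */
class DownsampleNodeSelector {

    private final boolean isStateless;

    DownsampleNodeSelector(boolean isStateless) {
        this.isStateless = isStateless;
    }

    Set<String> getEligibleNodes(IndexShardRoutingTable indexShardRouting) {
        Set<String> nodeIds = new HashSet<>();
        if (isStateless) {
            // Stateless: only search shards (copies that cannot be promoted to primary) are searchable.
            for (ShardRouting replica : indexShardRouting.replicaShards()) {
                if (replica.started() && replica.isPromotableToPrimary() == false) {
                    nodeIds.add(replica.currentNodeId());
                }
            }
        } else if (indexShardRouting.primaryShard().started()) {
            // Non-stateless: keep the existing behaviour and use the primary shard.
            nodeIds.add(indexShardRouting.primaryShard().currentNodeId());
        }
        return nodeIds;
    }
}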

@gmarouli gmarouli requested a review from martijnvg June 27, 2025 06:15
@gmarouli gmarouli added the >non-issue and :StorageEngine/Downsampling labels Jun 27, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine elasticsearchmachine added the serverless-linked label Jun 27, 2025
Member

@martijnvg martijnvg left a comment

LGTM, looks great Mary! I left one minor comment.

* @param indexShardRouting the routing of the shard to be downsampled
* @return the set of candidate nodes downsampling can run on.
*/
Set<String> getEligibleNodes(IndexShardRoutingTable indexShardRouting) {
Member

@martijnvg martijnvg Jun 27, 2025

Maybe rewrite this method like this:

Predicate<DiscoveryNode> getEligibleNodes(IndexShardRoutingTable indexShardRouting) {
    if (isStateless) {
        return candidateNode -> {
            for (var shardRouting : indexShardRouting.replicaShards()) {
                if (shardRouting.started()) {
                    return shardRouting.currentNodeId().equals(candidateNode.getId());
                }
            }
            return false;
        };
    } else if (indexShardRouting.primaryShard().started()) {
        return candidateNode -> indexShardRouting.primaryShard().currentNodeId().equals(candidateNode.getId());
    } else {
        return null;
    }
}

That way the logic is better contained and no intermediate set is created?

Contributor Author

I have been thinking about this, and I think both my current implementation and this suggestion can be a bit inefficient in big clusters, because we iterate over all of the candidate nodes, and there could be a lot of them.

Thinking about it some more, I am considering iterating over the eligible nodes instead, which are probably far fewer than the total candidates, and checking whether each one is a candidate node.

What do you think?

About the code you shared: it does read nicely, but I am concerned that for every candidate node we would build a new iterator over the replica shards, ending up with more object churn than with the previous solution. Right?
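
A rough sketch of the inverted iteration described above (hypothetical method and parameter names, same imports as the earlier sketch plus java.util.ArrayList and java.util.List; not the final PR code):

// Walk the few shard copies rather than every candidate node in the cluster,
// and keep only the copies whose current node is in the candidate set.
static Set<String> selectEligibleNodes(boolean isStateless, IndexShardRoutingTable indexShardRouting, Set<String> candidateNodeIds) {
    Set<String> eligibleNodeIds = new HashSet<>();
    List<ShardRouting> copies = new ArrayList<>(indexShardRouting.replicaShards());
    copies.add(indexShardRouting.primaryShard());
    for (ShardRouting copy : copies) {
        boolean eligible = copy.started()
            && (isStateless ? copy.isPromotableToPrimary() == false : copy.primary());
        if (eligible && candidateNodeIds.contains(copy.currentNodeId())) {
            eligibleNodeIds.add(copy.currentNodeId());
        }
    }
    return eligibleNodeIds;
}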

Contributor Author

@martijnvg I refactored it; I think it looks a bit cleaner now. Let me know what you think.

Contributor

@fcofdez fcofdez left a comment

Looks good, I left a minor comment.

Contributor

@fcofdez fcofdez left a comment

LGTM

@gmarouli gmarouli requested a review from martijnvg June 27, 2025 12:23
Member

@martijnvg martijnvg left a comment

LGTM

@gmarouli gmarouli added the auto-merge-without-approval label Jun 27, 2025
@elasticsearchmachine elasticsearchmachine merged commit 33e7db0 into elastic:main Jun 27, 2025
32 checks passed
@gmarouli gmarouli deleted the downsampling-runs-only-search-nodes branch June 27, 2025 21:29
Contributor

@kingherc kingherc left a comment

Nice one! I was a bit late reviewing; just one comment.

* For simplicity, in non-stateless deployments we use the primary shard.
*/
private boolean isEligible(ShardRouting shardRouting) {
return shardRouting.started() && (isStateless ? shardRouting.isPromotableToPrimary() == false : shardRouting.primary());
Contributor

nit: I think the last parenthesis could be simplified with shardRouting.isSearchable()

Contributor Author

Even better, I will change it.
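
A minimal sketch of one reading of that suggestion (sketch only, assuming the same isStateless flag and that ShardRouting.isSearchable() reports whether the copy's role can serve searches; the committed code may differ):

private boolean isEligible(ShardRouting shardRouting) {
    // Stateless: rely on the shard role being searchable instead of negating isPromotableToPrimary().
    // Non-stateless: keep using the primary shard for simplicity, as in the comment above.
    return shardRouting.started() && (isStateless ? shardRouting.isSearchable() : shardRouting.primary());
}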

mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jul 3, 2025 (elastic#130160)