
max_shards_per_node not behaving as documented #15803

Closed
hlcianfagna opened this issue Apr 4, 2024 · 5 comments · Fixed by #15860
Labels: docs (A documentation issue)

@hlcianfagna (Contributor)

CrateDB version

5.6.3

CrateDB setup information

3 nodes cluster deployed with https://github.com/crate/crate-terraform

Problem description

According to https://cratedb.com/docs/crate/reference/en/5.6/config/cluster.html#shard-limits

Any operations that would result in the creation of additional shard copies that would exceed this limit are rejected.

However, in a multi-node cluster (I could not reproduce this in a single-node environment) it seems that, under certain conditions, the limit is not strictly enforced.

Steps to Reproduce

cr> select node['name'] ,closed,count(*) from sys.shards group by 1,2 limit 100;
+------------------+--------+----------+
| node['name']     | closed | count(*) |
+------------------+--------+----------+
| Hochschwung      | FALSE  |       26 |
| Schafberg        | FALSE  |       27 |
| Monte La Palazza | FALSE  |       27 |
+------------------+--------+----------+
SELECT 3 rows in set (0.053 sec)
cr> set GLOBAL PERSISTENT "cluster.max_shards_per_node"=30;
SET OK, 1 row affected (0.056 sec)
cr> create table hernan.shardstest (a int) clustered into 8 shards;
CREATE OK, 1 row affected (0.149 sec)
cr> create table hernan.shardstest2 (a int) clustered into 5 shards;
SQLParseException[Validation Failed: 1: this action would add [5] total shards, but this cluster currently has [96]/[90] maximum shards open;]
cr> select node['name'] ,closed,count(*) from sys.shards group by 1,2 limit 100;
+------------------+--------+----------+
| node['name']     | closed | count(*) |
+------------------+--------+----------+
| Hochschwung      | FALSE  |       32 |
| Schafberg        | FALSE  |       32 |
| Monte La Palazza | FALSE  |       32 |
+------------------+--------+----------+

Actual Result

The nodes end up with 32 shards

Expected Result

The limit of 30 shards per node is enforced, or the documentation is updated to explain how the limit works and when it may not be enforced.

hlcianfagna added the triage label (An issue that needs to be triaged by a maintainer) on Apr 4, 2024
jeeminso self-assigned this on Apr 4, 2024
jeeminso added the bug label (Clear identification of incorrect behaviour) and removed the triage label on Apr 4, 2024
@jeeminso (Contributor) commented Apr 5, 2024

I think it is caused by number_of_replicas, which is set to 0-1 by default. A workaround could be to not use a range value, as sketched below.
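
A minimal sketch of that workaround, assuming the table definition from the reproduction above; pinning the replica count to 1 is only an illustrative choice:

-- Hypothetical variant of the repro table: a fixed replica count instead of
-- the default auto-expanding range '0-1', so all shard copies are known up front.
create table hernan.shardstest (a int)
  clustered into 8 shards
  with (number_of_replicas = 1);

With a fixed value, the 8 primaries plus 8 replicas should presumably be counted when the statement is validated, rather than being added later by replica auto-expansion.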

@BaurzhanSakhariev (Contributor) commented Apr 16, 2024

  1. Regarding the original issue: I think it is more of a documentation issue (see the arithmetic sketch after this comment).
    See "Clarify the difference between cluster.routing.allocation.total_shards_per_node and cluster.max_shards_per_node", elastic/elasticsearch#51839:

cluster.max_shards_per_node controls how many shards are allowed to exist in the cluster as a whole, and is checked at shard creation time, but does not pay attention to how many shards any individual node has

and 7.10 backport elastic/elasticsearch@e4054e4

NOTE: This setting does not limit shards for individual nodes.

I will port those docs and also mention in the docs that max_shards_per_node does not take closed shards into account.

On auto-expanding replicas (which we have enabled by default as 0-1):
I found elastic/elasticsearch#2869 and elastic/elasticsearch@eb3d184.
The links above talk about total_shards_per_node; I will check whether the same holds for max_shards_per_node and update the docs if needed.

UPD: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/index-modules.html

Note that the auto-expanded number of replicas only takes allocation filtering rules into account, but ignores other allocation rules such as total shards per node,

Probably by design, but I will check why we cannot do it the way @jeeminso proposed in #15805.

  2. Regarding partitioned tables: it is a legitimate bug, but it is only visible for INSERT INTO ... VALUES with many values.
    Insert-from-subquery is not as badly exposed; see the details in the fix.
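
A brief sketch of the arithmetic behind the numbers in the report, assuming the quoted behaviour (the check is cluster-wide, happens at shard creation time, and auto-expanded replicas bypass it) and assuming all three nodes hold data; the literal 30 mirrors the setting used in the reproduction:

-- Cluster-wide budget: max_shards_per_node (30) × 3 data nodes = 90.
-- Before CREATE TABLE shardstest: 26 + 27 + 27 = 80 open shards.
-- 8 primaries + 8 auto-expanded replicas = 16 more, i.e. 96 open shards,
-- which is why the next statement is rejected with "[96]/[90] maximum shards open".
select (select count(*) from sys.nodes) * 30              as cluster_wide_limit,
       (select count(*) from sys.shards where not closed) as open_shards;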

@BaurzhanSakhariev (Contributor) commented Apr 16, 2024

Hi @hlcianfagna, could you please post your initial CREATE TABLE statement(s)? How did you get those 26/27 shards?

Also, I couldn't exactly reproduce this locally: do you get 32 shards per node after running
create table hernan.shardstest2 (a int) clustered into 5 shards;

What I'm saying is that slightly overshooting the limit might be expected behavior, but reporting an error and still incrementing the number of shards would be a bug.

I suspect that the 32 shards were actually already there after
create table hernan.shardstest (a int) clustered into 8 shards;

@hlcianfagna (Contributor, Author)

How did you get those 26/27 shards?

They were already there on a cluster that I no longer have at hand; however, I just reproduced this again successfully on an empty cluster using:

create table hernan.legacytables (a int) clustered into 40 shards;

I confirmed that the 32 shards per node are there right after the command with clustered into 8 shards.
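
For reference, a consolidated repro sketch on an empty 3-node cluster, combining the statements from this thread (schema name and shard counts are taken from the reports above; the comments reflect the auto-expansion explanation discussed earlier and are not independently verified):

-- 40 primaries + 40 auto-expanded replicas = 80 shards, i.e. the 26/27 per-node baseline.
create table hernan.legacytables (a int) clustered into 40 shards;

-- Cap the cluster at 30 shards per node (effectively 90 cluster-wide on 3 nodes).
set global persistent "cluster.max_shards_per_node" = 30;

-- Succeeds; with replicas auto-expanded to 1, every node ends up with 32 shards.
create table hernan.shardstest (a int) clustered into 8 shards;

-- Only now is the limit enforced: rejected with "[96]/[90] maximum shards open".
create table hernan.shardstest2 (a int) clustered into 5 shards;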

@BaurzhanSakhariev (Contributor) commented Apr 17, 2024

I confirmed that the 32 shards per node are there right after the command with clustered into 8 shards.

OK, then I will follow my original plan: improve the docs (basically document the expected behaviour for auto-expanding replicas).

As said, throwing the error "this action would add ... total shards" while actually adding the shards would be a bug, but that is not the case here.
