JS stream count not balanced among the cluster nodes #5071

Closed
kohlisid opened this issue Feb 13, 2024 · 3 comments · Fixed by #5079
Labels
defect Suspected defect such as a bug or regression

Comments

@kohlisid

Observed behavior

When creating multiple streams (with replicas = 3) on a JetStream cluster (with more nodes than the stream replica count), I have been observing that the streams are not evenly distributed among the servers.

A few of the server instances end up holding most of the stream replicas, while others hold almost none.

skohli@macos-JQWR9T560R ~ % nats --context east-sys-ac server report jetstream
╭──────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                         JetStream Summary                                        │
├────────┬─────────┬─────────┬───────────┬──────────┬───────┬────────┬──────┬─────────┬────────────┤
│ Server │ Cluster │ Streams │ Consumers │ Messages │ Bytes │ Memory │ File │ API Req │ API Err    │
├────────┼─────────┼─────────┼───────────┼──────────┼───────┼────────┼──────┼─────────┼────────────┤
│ n1-c1  │ C1      │ 28      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 55      │ 0          │
│ n2-c1* │ C1      │ 1       │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 33      │ 2 / 6.060% │
│ n3-c1  │ C1      │ 28      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 30      │ 0          │
│ n4-c1  │ C1      │ 0       │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 0       │ 0          │
│ n5-c1  │ C1      │ 27      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 63      │ 0          │
├────────┼─────────┼─────────┼───────────┼──────────┼───────┼────────┼──────┼─────────┼────────────┤
│        │         │ 84      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 181     │ 2          │
╰────────┴─────────┴─────────┴───────────┴──────────┴───────┴────────┴──────┴─────────┴────────────╯

In testing, if we wait for some time (sleep = 3s) between consecutive stream creations, we end up seeing a far more balanced distribution.

skohli@macos-JQWR9T560R ~ % nats --context east-sys-ac server report jetstream
╭─────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                          JetStream Summary                                          │
├────────┬─────────┬─────────┬───────────┬──────────┬───────┬────────┬──────┬─────────┬───────────────┤
│ Server │ Cluster │ Streams │ Consumers │ Messages │ Bytes │ Memory │ File │ API Req │ API Err       │
├────────┼─────────┼─────────┼───────────┼──────────┼───────┼────────┼──────┼─────────┼───────────────┤
│ n1-c1  │ C1      │ 18      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 696     │ 381 / 54.741% │
│ n2-c1* │ C1      │ 14      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 496     │ 253 / 51.008% │
│ n3-c1  │ C1      │ 17      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 145     │ 0             │
│ n4-c1  │ C1      │ 18      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 64      │ 0             │
│ n5-c1  │ C1      │ 17      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 227     │ 42 / 18.502%  │
├────────┼─────────┼─────────┼───────────┼──────────┼───────┼────────┼──────┼─────────┼───────────────┤
│        │         │ 84      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 1,628   │ 676           │
╰────────┴─────────┴─────────┴───────────┴──────────┴───────┴────────┴──────┴─────────┴───────────────╯
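
To put a number on the difference between the two reports above, here is a minimal sketch that reads the Streams column out of the report text and computes the gap between the busiest and the emptiest server. The helper names (`stream_counts`, `spread`) are made up for illustration, and the sample tables are trimmed to the first three columns of the reports above; the parsing assumes the box-drawing layout shown in this issue.

```python
def stream_counts(report: str) -> dict[str, int]:
    """Map server name -> stream count from `nats server report jetstream` text."""
    counts = {}
    for line in report.splitlines():
        line = line.strip()
        if not line.startswith("│"):
            continue  # borders and blank lines
        cells = [c.strip() for c in line.strip("│").split("│")]
        # Data rows have a server name in column 0 and a number in column 2;
        # the title, header and totals rows do not.
        if len(cells) < 3 or not cells[0] or not cells[2].isdigit():
            continue
        # A trailing "*" marks one server in the report; drop it from the name.
        counts[cells[0].rstrip("*")] = int(cells[2])
    return counts


def spread(counts: dict[str, int]) -> int:
    """Difference between the most and least loaded server."""
    return max(counts.values()) - min(counts.values())


# First three columns of the two reports above (no wait vs. sleep = 3s).
BURST = """
│ Server │ Cluster │ Streams │
│ n1-c1  │ C1      │ 28      │
│ n2-c1* │ C1      │ 1       │
│ n3-c1  │ C1      │ 28      │
│ n4-c1  │ C1      │ 0       │
│ n5-c1  │ C1      │ 27      │
│        │         │ 84      │
"""

PACED = """
│ Server │ Cluster │ Streams │
│ n1-c1  │ C1      │ 18      │
│ n2-c1* │ C1      │ 14      │
│ n3-c1  │ C1      │ 17      │
│ n4-c1  │ C1      │ 18      │
│ n5-c1  │ C1      │ 17      │
│        │         │ 84      │
"""

print(spread(stream_counts(BURST)))  # 28
print(spread(stream_counts(PACED)))  # 4
```

With these numbers the spread drops from 28 to 4 once the 3s wait is added.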

Expected behavior

I expected to see a balanced distribution even without a wait between the create calls.
In my use case I need to create many streams on a JetStream cluster, and this behaviour might cause performance issues.

I could add a wait to work around the issue, but that adds a long delay to the init process when creating a large number of streams.

Could you confirm whether this is the expected behaviour, or whether there is some other way to remediate the issue?
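
For reference, the arithmetic behind "balanced" and "long delay" can be sketched as below. The function names are hypothetical; the inputs (30 streams, 3 replicas, 5 nodes, 3s sleep) are the values used in this issue.

```python
def ideal_replicas_per_node(streams: int, replicas: int, nodes: int) -> tuple[int, int]:
    """Return (replicas per node, leftover) for a perfectly even placement."""
    return divmod(streams * replicas, nodes)


def added_delay_seconds(streams: int, sleep_s: float) -> float:
    """Extra wall-clock time spent sleeping once before every create call."""
    return streams * sleep_s


print(ideal_replicas_per_node(30, 3, 5))  # (18, 0)
print(added_delay_seconds(30, 3))         # 90
```

So an even placement would be 18 replicas per node, and the 3s workaround costs about 90 seconds for 30 streams, growing linearly with the stream count. Note that the reports in this issue total 84 and 87 rather than 90, so the ideal is a reference point and not an exact prediction.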

Server and client version

Nats Server Version: nats-server: v2.10.9
Client version: nats --version 0.1.1

Host environment

uname -a

Darwin  22.3.0 Darwin Kernel Version 22.3.0: Mon Jan 30 20:38:37 PST 2023; root:xnu-8792.81.3~2/RELEASE_ARM64_T6000 arm64

CPU: Apple M1 Pro arm64

Steps to reproduce

  1. Create a JetStream cluster with 5 server nodes. I'm using the following config for the nodes and starting each server individually with nats-server -js -c node.conf.
    Each server has a unique name, and its cluster port is listed in the routes.
server_name=n1-c1
listen=4222

include sys.conf
 
jetstream {
   store_dir=nats/storage
}
 
cluster {
  name: C1
  listen: 0.0.0.0:6222
  routes: [
    nats://0.0.0.0:6222
    nats://0.0.0.0:6223
    nats://0.0.0.0:6224
    nats://0.0.0.0:6225
    nats://0.0.0.0:6226
  ]
}
  1. Once all the servers are up and running, create multiple JS streams. All streams here have an identical configuration apart from a unique name and subject.

I'm using the nats-cli to create 30 streams

for i in {1..30}; do nats --context east-sys stream create bar$i --subjects="test$i.*" --ack --max-msgs=-1 --max-bytes=-1 --max-age=1y --storage file --retention limits --max-msg-size=-1 --discard old --dupe-window="0s" --no-allow-rollup --max-msgs-per-subject=-1 --no-deny-delete  --no-deny-purge --replicas 3; done

The configuration of the streams is as follows:

Information for Stream bar1 created 2024-02-12 20:15:57

              Subjects: test1.*
              Replicas: 3
               Storage: File

Options:

             Retention: Limits
       Acknowledgments: true
        Discard Policy: Old
      Duplicate Window: 2m0s
            Direct Get: true
     Allows Msg Delete: true
          Allows Purge: true
        Allows Rollups: false

Limits:

      Maximum Messages: unlimited
   Maximum Per Subject: unlimited
         Maximum Bytes: unlimited
           Maximum Age: 1y0d0h0m0s
  Maximum Message Size: unlimited
     Maximum Consumers: unlimited

Cluster Information:

                  Name: C1
                Leader: n3-c1
               Replica: n1-c1, current, seen 344ms ago
               Replica: n5-c1, current, seen 344ms ago

State:

              Messages: 0
                 Bytes: 0 B
        First Sequence: 0
         Last Sequence: 0
      Active Consumers: 0
  1. Once all streams are created, check the JetStream server report to find the stream count on each server node:
nats --context east-sys-ac server report jetstream
╭─────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                          JetStream Summary                                          │
├────────┬─────────┬─────────┬───────────┬──────────┬───────┬────────┬──────┬─────────┬───────────────┤
│ Server │ Cluster │ Streams │ Consumers │ Messages │ Bytes │ Memory │ File │ API Req │ API Err       │
├────────┼─────────┼─────────┼───────────┼──────────┼───────┼────────┼──────┼─────────┼───────────────┤
│ n1-c1* │ C1      │ 28      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 443     │ 226 / 51.015% │
│ n2-c1  │ C1      │ 0       │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 442     │ 251 / 56.787% │
│ n3-c1  │ C1      │ 28      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 87      │ 0             │
│ n4-c1  │ C1      │ 0       │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 52      │ 0             │
│ n5-c1  │ C1      │ 28      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 146     │ 0             │
├────────┼─────────┼─────────┼───────────┼──────────┼───────┼────────┼──────┼─────────┼───────────────┤
│        │         │ 84      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 1,170   │ 477           │
╰────────┴─────────┴─────────┴───────────┴──────────┴───────┴────────┴──────┴─────────┴───────────────╯
  1. If the same steps are followed with one modification, a sleep interval between stream creations, we see a well-balanced system:
for i in {1..30}; do sleep 3; nats --context east-sys stream create bar$i --subjects="test$i.*" --ack --max-msgs=-1 --max-bytes=-1 --max-age=1y --storage file --retention limits --max-msg-size=-1 --discard old --dupe-window="0s" --no-allow-rollup --max-msgs-per-subject=-1 --no-deny-delete  --no-deny-purge --replicas 3; done
nats --context east-sys-ac server report jetstream
╭────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                          JetStream Summary                                         │
├────────┬─────────┬─────────┬───────────┬──────────┬───────┬────────┬──────┬─────────┬──────────────┤
│ Server │ Cluster │ Streams │ Consumers │ Messages │ Bytes │ Memory │ File │ API Req │ API Err      │
├────────┼─────────┼─────────┼───────────┼──────────┼───────┼────────┼──────┼─────────┼──────────────┤
│ n1-c1  │ C1      │ 18      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 82      │ 0            │
│ n2-c1* │ C1      │ 15      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 191     │ 87 / 45.549% │
│ n3-c1  │ C1      │ 17      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 49      │ 0            │
│ n4-c1  │ C1      │ 19      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 27      │ 0            │
│ n5-c1  │ C1      │ 18      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 92      │ 0            │
├────────┼─────────┼─────────┼───────────┼──────────┼───────┼────────┼──────┼─────────┼──────────────┤
│        │         │ 87      │ 0         │ 0        │ 0 B   │ 0 B    │ 0 B  │ 441     │ 87           │
╰────────┴─────────┴─────────┴───────────┴──────────┴───────┴────────┴──────┴─────────┴──────────────╯
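
The two creation loops in the steps above differ only in the pause before each call. A minimal sketch of the same loop as a script follows, with the pause as a parameter. The flags are copied from the commands above; the function names and the injectable `run`/`sleep` parameters are assumptions added here so the loop can be exercised without a cluster.

```python
import subprocess
import time


def stream_create_argv(i: int, context: str = "east-sys") -> list[str]:
    """argv for one `nats stream create` call, flags copied from this issue."""
    return [
        "nats", "--context", context, "stream", "create", f"bar{i}",
        f"--subjects=test{i}.*", "--ack",
        "--max-msgs=-1", "--max-bytes=-1", "--max-age=1y",
        "--storage", "file", "--retention", "limits",
        "--max-msg-size=-1", "--discard", "old", "--dupe-window=0s",
        "--no-allow-rollup", "--max-msgs-per-subject=-1",
        "--no-deny-delete", "--no-deny-purge", "--replicas", "3",
    ]


def create_streams(count, pause_s=0.0, run=subprocess.run, sleep=time.sleep):
    """Create bar1..bar<count>, optionally sleeping before each create call."""
    for i in range(1, count + 1):
        if pause_s:
            sleep(pause_s)
        run(stream_create_argv(i), check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    create_streams(2, run=lambda argv, check: print(" ".join(argv)))
```

Calling `create_streams(30)` corresponds to the loop without a wait, and `create_streams(30, pause_s=3)` to the loop with `sleep 3`; both need the nats CLI and the `east-sys` context from this issue to be available.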

node_configs.zip

@kohlisid kohlisid added the defect Suspected defect such as a bug or regression label Feb 13, 2024
@derekcollison
Member

Longer story, but the reason is that the selection mechanism is sync, while the sorting mechanism works on data that is delivered async from the other servers: mostly HAAssets, but also usage etc.

I will look into whether we can improve this.
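
As a rough illustration of the explanation above, here is a toy model (not the nats-server code) of a selector that picks the least-loaded nodes synchronously while its view of the load is refreshed only occasionally. Everything in it is an assumption made for illustration: the `place` function, the single per-node counter standing in for HAAssets and usage, and the `refresh_every` knob standing in for async delivery.

```python
def place(streams, replicas, nodes, refresh_every):
    """Toy placement: choose the `replicas` least-loaded nodes per stream.

    `view` is the selector's picture of per-node load. It is copied from the
    true counts only every `refresh_every` placements, so in between it is stale.
    """
    actual = {n: 0 for n in nodes}
    view = dict(actual)
    for i in range(streams):
        if i % refresh_every == 0:
            view = dict(actual)  # the "async" update arrives
        # sorted() is stable, so ties keep the original node order.
        chosen = sorted(nodes, key=lambda n: view[n])[:replicas]
        for n in chosen:
            actual[n] += 1
    return actual


NODES = ["n1", "n2", "n3", "n4", "n5"]

# View refreshed before every placement: load stays even.
print(place(30, 3, NODES, refresh_every=1))
# {'n1': 18, 'n2': 18, 'n3': 18, 'n4': 18, 'n5': 18}

# View never refreshed during the burst: the same three nodes win every time.
print(place(30, 3, NODES, refresh_every=30))
# {'n1': 30, 'n2': 30, 'n3': 30, 'n4': 0, 'n5': 0}
```

The model only reproduces the shape of the reports in this issue (three loaded servers, two nearly empty ones), not which servers are chosen or the exact counts.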

derekcollison added a commit that referenced this issue Feb 14, 2024
Calculate peer group based on streams as well as usage and realtime data
on HAAssets vs async.

Resolves: #5071 

Signed-off-by: Derek Collison <derek@nats.io>
@derekcollison
Member

Should be fixed now. It will be in 2.10.11, the next release, which may go out this week.

@kohlisid
Author

@derekcollison Thanks for the prompt fix on this :D
