KV Get message rate increases with message size #4221

Closed
jjthiessen opened this issue Jun 7, 2023 · 6 comments · Fixed by #4232

@jjthiessen

Please see my NATS KV benchmark script (just a wrapper around nats bench), along with transcripts/console output for two runs, one with file storage on SSD and the other with file storage on a ramfs mount (all part of the same gist). I've only included file storage results here, as memory-backed KV appears to be fairly uniformly slow (much slower than file storage) at the moment.

The Put/publish rates and trends more or less make sense to me and aren't surprising.
However, the Get/subscribe message throughput rates seem to increase with message size.
This is rather surprising (for me, at least) — I would expect the data throughput rate to increase with message size (as it does), but for the message throughput rate to decrease.
The effect of this is that in order to get better KV read performance, one only needs to pad values to produce larger messages (even if the extra content is otherwise useless).

Does anybody know i) why this might be the case; and ii) whether I'm holding it wrong?

For additional context see this thread on Slack.
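
For anyone who wants to reproduce this without the full script, the measurement boils down to something like the sketch below. This is a minimal sketch using the nats.go client rather than nats bench; it assumes a local server with JetStream enabled, and the bucket name, key count, and value sizes are illustrative.

package main

import (
    "fmt"
    "time"

    "github.com/nats-io/nats.go"
)

// timeGets puts n keys with padded values of the given size, then times
// reading them all back with kv.Get and reports the message rate.
func timeGets(kv nats.KeyValue, n, size int) {
    val := make([]byte, size) // padded value of the given size
    for i := 0; i < n; i++ {
        if _, err := kv.Put(fmt.Sprintf("key.%d", i), val); err != nil {
            panic(err)
        }
    }
    start := time.Now()
    for i := 0; i < n; i++ {
        if _, err := kv.Get(fmt.Sprintf("key.%d", i)); err != nil {
            panic(err)
        }
    }
    fmt.Printf("size=%-6d %8.0f msgs/sec\n", size, float64(n)/time.Since(start).Seconds())
}

func main() {
    nc, _ := nats.Connect(nats.DefaultURL)
    defer nc.Drain()
    js, _ := nc.JetStream()
    kv, _ := js.CreateKeyValue(&nats.KeyValueConfig{Bucket: "bar", Storage: nats.FileStorage})
    for _, size := range []int{64, 1024, 4096, 16384} {
        timeGets(kv, 65536, size)
    }
}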

jjthiessen changed the title from "KV get message rate increases with message size" to "KV Get message rate increases with message size" on Jun 7, 2023
mprimi added a commit to mprimi/nats-server that referenced this issue Jun 8, 2023
@mprimi
Contributor

mprimi commented Jun 8, 2023

Hello @jjthiessen, I ran a quick benchmark to check if I could reproduce the "kv.Get is faster for larger values than for smaller values" behavior.

The numbers I'm getting are exactly what one would expect:

  • Larger values are slower in terms of op/s (sub-linear slowdown w.r.t. size, good!)
  • Larger values are faster in terms of bps (almost linear, good!)

[Screenshot: benchmark results (op/s and bps) by value size]

These numbers are for a single NATS server (which is what you are doing, according to your gist).
They were produced using the KV benchmark included in the nats-server source, so you can try them yourself.
(I used a slightly modified version to test Get only, on a single server, and with the value sizes you are using.)
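
Something along the lines of the Go benchmark below is enough to see the same shape of results. This is an illustrative sketch, not the actual in-tree benchmark; it assumes a local server with JetStream enabled, and the bucket name and key count are arbitrary.

package kvbench

import (
    "fmt"
    "testing"

    "github.com/nats-io/nats.go"
)

// BenchmarkKVGet measures kv.Get throughput for a range of value sizes.
func BenchmarkKVGet(b *testing.B) {
    const numKeys = 100 // bump this to match the reporter's 65k key space
    for _, size := range []int{64, 256, 1024, 4096, 16384} {
        b.Run(fmt.Sprintf("size=%d", size), func(b *testing.B) {
            nc, err := nats.Connect(nats.DefaultURL)
            if err != nil {
                b.Fatal(err)
            }
            defer nc.Close()
            js, _ := nc.JetStream()
            js.DeleteKeyValue("BENCH") // start from a clean bucket; ignore "not found"
            kv, err := js.CreateKeyValue(&nats.KeyValueConfig{Bucket: "BENCH", Storage: nats.FileStorage})
            if err != nil {
                b.Fatal(err)
            }
            val := make([]byte, size)
            for i := 0; i < numKeys; i++ {
                kv.Put(fmt.Sprintf("key.%d", i), val)
            }
            b.ResetTimer()
            for i := 0; i < b.N; i++ {
                if _, err := kv.Get(fmt.Sprintf("key.%d", i%numKeys)); err != nil {
                    b.Fatal(err)
                }
            }
        })
    }
}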

One possibility for the numbers you are seeing is related to how nats bench reports its numbers. I'll see if I can confirm that theory.

@jjthiessen
Author

It's not just the numbers that are being reported, though; it's actual wall clock time. It isn't apparent from my original logs, but if I add time in there, you can see that the nats bench invocations are taking their... time (and the math works out).

...
+ for s in 64 256 1k 4k 16k 64k
+ rm -rf ./js/jetstream
+ sleep 3
+ nats-server-2.9.17 -c server.conf
+ nats kv add bar --max-bucket-size 4500MB --storage file
+ nats bench foo --kv --no-progress --multisubject --bucket bar --storage file --msgs 65536 --maxbytes 4500MB --size 1k --pub 4
14:29:30 KV mode, using the subject name as the KV bucket name. Publishers do puts, subscribers do gets
14:29:30 Starting KV benchmark [bucket=bar, kv=true, msgs=65,536, msgsize=1.0 KiB, maxbytes=4.4 GiB, pubs=4, sub=0, storage=file, replicas=1, pubsleep=0s, subsleep=0s]
14:29:30 Starting KV putter, putting 16,384 messages
14:29:30 Starting KV putter, putting 16,384 messages
14:29:30 Starting KV putter, putting 16,384 messages
14:29:30 Starting KV putter, putting 16,384 messages

Pub stats: 17,159 msgs/sec ~ 16.76 MB/sec
 [1] 4,314 msgs/sec ~ 4.21 MB/sec (16384 msgs)
 [2] 4,304 msgs/sec ~ 4.20 MB/sec (16384 msgs)
 [3] 4,302 msgs/sec ~ 4.20 MB/sec (16384 msgs)
 [4] 4,290 msgs/sec ~ 4.19 MB/sec (16384 msgs)
 min 4,290 | avg 4,302 | max 4,314 | stddev 8 msgs


real	0m3.860s
user	0m2.580s
sys	0m1.925s
+ nats bench foo --kv --no-progress --multisubject --bucket bar --storage file --msgs 65536 --maxbytes 4500MB --size 1k --sub 1
14:29:34 KV mode, using the subject name as the KV bucket name. Publishers do puts, subscribers do gets
14:29:34 Starting KV benchmark [bucket=bar, kv=true, msgs=65,536, msgsize=1.0 KiB, maxbytes=4.4 GiB, pubs=0, sub=1, storage=file, replicas=1, pubsleep=0s, subsleep=0s]
14:29:34 Starting KV getter, trying to get 65,536 messages

Sub stats: 2,074 msgs/sec ~ 2.03 MB/sec


real	0m31.636s
user	0m5.040s
sys	0m3.043s
+ nats-server-2.9.17 --signal ldm
+ for s in 64 256 1k 4k 16k 64k
+ rm -rf ./js/jetstream
+ sleep 3
+ nats-server-2.9.17 -c server.conf
+ nats kv add bar --max-bucket-size 4500MB --storage file
+ nats bench foo --kv --no-progress --multisubject --bucket bar --storage file --msgs 65536 --maxbytes 4500MB --size 4k --pub 4
14:30:09 KV mode, using the subject name as the KV bucket name. Publishers do puts, subscribers do gets
14:30:09 Starting KV benchmark [bucket=bar, kv=true, msgs=65,536, msgsize=4.0 KiB, maxbytes=4.4 GiB, pubs=4, sub=0, storage=file, replicas=1, pubsleep=0s, subsleep=0s]
14:30:09 Starting KV putter, putting 16,384 messages
14:30:09 Starting KV putter, putting 16,384 messages
14:30:09 Starting KV putter, putting 16,384 messages
14:30:09 Starting KV putter, putting 16,384 messages

Pub stats: 16,047 msgs/sec ~ 62.68 MB/sec
 [1] 4,028 msgs/sec ~ 15.74 MB/sec (16384 msgs)
 [2] 4,018 msgs/sec ~ 15.70 MB/sec (16384 msgs)
 [3] 4,017 msgs/sec ~ 15.69 MB/sec (16384 msgs)
 [4] 4,011 msgs/sec ~ 15.67 MB/sec (16384 msgs)
 min 4,011 | avg 4,018 | max 4,028 | stddev 6 msgs


real	0m4.129s
user	0m2.677s
sys	0m1.956s
+ nats bench foo --kv --no-progress --multisubject --bucket bar --storage file --msgs 65536 --maxbytes 4500MB --size 4k --sub 1
14:30:13 KV mode, using the subject name as the KV bucket name. Publishers do puts, subscribers do gets
14:30:13 Starting KV benchmark [bucket=bar, kv=true, msgs=65,536, msgsize=4.0 KiB, maxbytes=4.4 GiB, pubs=0, sub=1, storage=file, replicas=1, pubsleep=0s, subsleep=0s]
14:30:13 Starting KV getter, trying to get 65,536 messages

Sub stats: 4,041 msgs/sec ~ 15.79 MB/sec


real	0m16.253s
user	0m5.087s
sys	0m2.784s
+ nats-server-2.9.17 --signal ldm
...

I was originally using a dev version of nats, but switched back to v0.0.35 for this latest run (which is why --no-context was removed from my nats invocations; I checked that I was using nats://localhost:4222 beforehand).

@mprimi
Contributor

mprimi commented Jun 8, 2023

Alright, I can reproduce the odd results you are seeing if I use the same large number of keys (65k); my earlier test was using just 100 (thanks @jnmoyne!).

So the issue seems to be confirmed.

[Screenshot: benchmark results with a 65k key space, reproducing the reported behavior]

@jjthiessen
Author

jjthiessen commented Jun 8, 2023

Fun, right?

In any case, the performance is still better than memory storage.

I've been told that at least some of the indexing that happens for file-based streams doesn't happen for memory-based ones, making memory-backed KV particularly slow. It's probably less of an issue with log/stream-ingestion-type loads, but I do wonder where the tipping point is (indexing overhead vs. search gains). Even for log/stream ingestion/slurping, there would be some stream size (message count and/or data volume) threshold where indexing would be helpful (e.g., for subject-filtered subscriptions), so it's not just a KV or object storage concern.

Back to the message size vs. message throughput issue, though: I imagine it'd apply equally well to at least some other file-backed JS streams (as KV is just JS with certain defaults/conventions, as I understand it), but it'd be interesting to know whether it's specific to some of those options/conventions, or whether it applies more generally to all file-backed streams (I haven't tried testing that).
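
If someone wanted to poke at that, a rough sketch (untested, assuming a local server with JetStream enabled; stream/subject names and counts are arbitrary) might skip KV entirely and drive a plain file-backed stream with one subject per key, reading values back as last-message-by-subject lookups, which is roughly what a KV get translates to:

package main

import (
    "fmt"
    "time"

    "github.com/nats-io/nats.go"
)

func main() {
    nc, _ := nats.Connect(nats.DefaultURL)
    defer nc.Drain()
    js, _ := nc.JetStream()

    // Plain file-backed stream, one subject per "key", small values.
    js.AddStream(&nats.StreamConfig{
        Name:     "PLAIN",
        Subjects: []string{"plain.>"},
        Storage:  nats.FileStorage,
    })

    const numKeys = 65536
    val := make([]byte, 64)
    for i := 0; i < numKeys; i++ {
        js.Publish(fmt.Sprintf("plain.%d", i), val)
    }

    // Read everything back as last-message-by-subject lookups.
    start := time.Now()
    for i := 0; i < numKeys; i++ {
        if _, err := js.GetLastMsg("PLAIN", fmt.Sprintf("plain.%d", i)); err != nil {
            panic(err)
        }
    }
    fmt.Printf("%.0f last-msg gets/sec\n", float64(numKeys)/time.Since(start).Seconds())
}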

@derekcollison derekcollison self-assigned this Jun 10, 2023
@derekcollison
Member

I will fix this; I can see why it degrades, and even more so for memory stores. It will be part of 2.9.18 next Tuesday.

derekcollison added a commit that referenced this issue Jun 10, 2023
…#4232)

When messages were very small and the key space was very large, the performance of
last-message gets in the store layer (both file and memory) would degrade.

If the subject is literal we can optimize and avoid sequence scans that
are needed when multiple subject states need to be considered.

Signed-off-by: Derek Collison <derek@nats.io>

Resolves #4221
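
Conceptually (an illustrative sketch of the idea in that commit message, not the actual nats-server store code), the difference is between a direct per-subject lookup for a literal key subject and a scan over every matching subject for a wildcard filter:

package main

import (
    "fmt"
    "strings"
)

// lastSeqFor returns the sequence of the newest message matching filter,
// given per-subject state tracking the last sequence seen for each subject.
func lastSeqFor(filter string, perSubject map[string]uint64) uint64 {
    if !strings.ContainsAny(filter, "*>") {
        // Literal subject (the KV get case): direct lookup, no scanning.
        return perSubject[filter]
    }
    // Wildcard filter: consider every matching subject and keep the max.
    var max uint64
    for subj, seq := range perSubject {
        if subjectMatches(filter, subj) && seq > max {
            max = seq
        }
    }
    return max
}

// subjectMatches is a simplified NATS-style matcher: "*" matches one token,
// ">" matches one or more trailing tokens.
func subjectMatches(filter, subject string) bool {
    ft := strings.Split(filter, ".")
    st := strings.Split(subject, ".")
    for i, t := range ft {
        switch {
        case i >= len(st):
            return false
        case t == ">":
            return true
        case t == "*" || t == st[i]:
            // token matches, keep going
        default:
            return false
        }
    }
    return len(ft) == len(st)
}

func main() {
    state := map[string]uint64{"$KV.bar.key.1": 7, "$KV.bar.key.2": 42}
    fmt.Println(lastSeqFor("$KV.bar.key.2", state)) // 42: direct lookup
    fmt.Println(lastSeqFor("$KV.bar.>", state))     // 42: requires a scan
}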
mprimi added a commit to mprimi/nats-server that referenced this issue Jun 13, 2023
@mprimi
Contributor

mprimi commented Jun 13, 2023

Here's a re-do of the previous benchmark on the latest main. Looking good!

[Screenshot: benchmark results on the latest main, after the fix]
