Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve length field prepending on bungee -> server connection #3451

Closed
wants to merge 1 commit into from

Conversation

Janmm14
Copy link
Contributor

@Janmm14 Janmm14 commented Mar 20, 2023

Use alternative implementation of Varint21LengthFieldPrepender on bungee -> server connection for improved speed - it uses separate buffer to prepend the length to avoid copying large data buffer.

Not applied bungee -> client because encrypting 1-5 bytes of length separately through expensive jni call could make it not worth (not measured).

Netty's LengthFieldPrepender also uses an extra buffer for the length.

Use alternative implementation of Varint21LengthFieldPrepender on bungee -> server connection for improved speed - it uses separate buffer to prepend the length to avoid copying large data buffer.
Not applied bungee -> client because encrypting 1-5 bytes of length separately through expensive jni call could make it not worth (not measured).
@md-5 md-5 self-assigned this Mar 20, 2023
@Janmm14
Copy link
Contributor Author

Janmm14 commented Mar 23, 2023

This will theoretically synergize well with #3393

md-5 pushed a commit that referenced this pull request Mar 25, 2023
Use alternative implementation of Varint21LengthFieldPrepender on bungee -> server connection for improved speed - it uses separate buffer to prepend the length to avoid copying large data buffer.
Not applied bungee -> client because encrypting 1-5 bytes of length separately through expensive jni call could make it not worth (not measured).
@md-5
Copy link
Member

md-5 commented Mar 25, 2023

Thanks

@md-5 md-5 closed this Mar 25, 2023
@bob7l
Copy link
Contributor

bob7l commented Mar 26, 2023

It'd probably be much faster in terms of performance to use a heapBuffer considering you're only working with a couple bytes of memory.

@Janmm14
Copy link
Contributor Author

Janmm14 commented Mar 26, 2023

It'd probably be much faster in terms of performance to use a heapBuffer considering you're only working with a couple bytes of memory.

Afaik epoll requires direct/unsafe byte buffers anyway, buffers without a mem address require extra handling.

The buffer() method used by netty's lengthfieldprepender (i used ioBuffer() ) by default prefers direct buffers too.

Two types of buffer impls in use might also remove some hotspot optimizations which would be appliable if only one buffer impl is used etc.

I wouldn't go against what netty itself uses, if you believe that heap buffers should be used / are faster for a few bytes, you could open an issue at netty about their lengthfieldprepender (after you benchmarked that it's slower).
Maybe you could also just create a discussion or ask in their discord about it and someone might be able to answer you or benchmarks it for you.

@bob7l
Copy link
Contributor

bob7l commented Mar 27, 2023

It'd probably be much faster in terms of performance to use a heapBuffer considering you're only working with a couple bytes of memory.

Afaik epoll requires direct/unsafe byte buffers anyway, buffers without a mem address require extra handling.

The buffer() method used by netty's lengthfieldprepender (i used ioBuffer() ) by default prefers direct buffers too.

Two types of buffer impls in use might also remove some hotspot optimizations which would be appliable if only one buffer impl is used etc.

I wouldn't go against what netty itself uses, if you believe that heap buffers should be used / are faster for a few bytes, you could open an issue at netty about their lengthfieldprepender (after you benchmarked that it's slower). Maybe you could also just create a discussion or ask in their discord about it and someone might be able to answer you or benchmarks it for you.

I was more concerned with writing the varInt to a direct buffer using the current loop implementation (Netty uses bulk writes) which would likely be much slower. Allocation I doubt really matters since we're using pooling. But since most packets are small, I doubt there would be any benefit from using a heap buffer. Not worth going through the trouble of all the benchmarking required for a micro-optimization, and it'll obviously lead to slowdowns further down the chain

Onto these bulk operations though. We could easily utilize bulk writes like writeShort/WriteMedium/etc in this implementation (Or really any pertaining to VarInts). We already need to find a VarInt length for allocation, might as well use it for single-writing our VarInt too as opposed to writing the VarInt in a loop. Would most likely get smacked by the JIT and or further runtime optimizations too.

@Janmm14
Copy link
Contributor Author

Janmm14 commented Mar 27, 2023

Onto these bulk operations though. We could easily utilize bulk writes like writeShort/WriteMedium/etc in this implementation (Or really any pertaining to VarInts). We already need to find a VarInt length for allocation, might as well use it for single-writing our VarInt too as opposed to writing the VarInt in a loop. Would most likely get smacked by the JIT and or further runtime optimizations too.

I did a benchmark with combined writes (short / medium / int / int+byte) and I couldn't find a significant, reproducable difference. Run-by-run differences too high.

Upgrading jvm to 19 from 16 has easily more impact in my benchmark. Nope, just blackhole impl difference or idk what.

Edit: Here is my benchmark code: https://gist.github.com/Janmm14/1afff414893bac7e0daf85492912ed22

Edit:
I slightly altered the benchmark.

New code: https://gist.github.com/Janmm14/67cd16e0966b8f0c2a4bfc3de33c3bd2
Results: https://gist.github.com/Janmm14/9d8396aff212f91135ce8200d8380c42

java ran with CPU frequency locked, priority on high and pc not being actively used for anything else.
Results vary too much to be meaningful - there is at max a slight hint that the custom impl with a switch is a little bit faster, but its definitely not enough. In other runs I have seen orig impl also for max of 60000 producing less than 10ns/op.

Yet another edit:
I upped DATA_COUNT to 1_000_000, this got rid of the hikkups (maybe they came from inaccurate per-invocation measurements). The data suggests that there is a 10% improvement (more for bigger numbers) when we use a switch and consolidated buffer write calls - I also would spend the results of benchmarks without a 2 (the benchmakrs with blackhole) more importance than those without blackhole.
Results: https://gist.github.com/Janmm14/aee19d08ffcb34df251f0ff0ad11113c

@Leymooo
Copy link

Leymooo commented Mar 27, 2023

New code: https://gist.github.com/Janmm14/67cd16e0966b8f0c2a4bfc3de33c3bd2

why use Level.Invocation?
Your becnhmark is broken, you have different test data for each benchmark, one benchmark can have more 4,5 bytes varint while another can have more 1,2 bytes varints.

Also this was measured before https://steinborn.me/posts/performance/how-fast-can-you-write-a-varint/ ,and it shows that unrolling varint writting loop gives a lot of performance(https://github.com/astei/varint-writing-showdown)

@Janmm14
Copy link
Contributor Author

Janmm14 commented Mar 27, 2023

why use Level.Invocation? Your becnhmark is broken, you have different test data for each benchmark, one benchmark can have more 4,5 bytes varint while another can have more 1,2 bytes varints.

My thought was that 100k or 1 million would be large enough to roughly even the randomness, especially for the random runs with ints <30k/60k. I will run a short test to see if not using invocation level gives changes.

Also this was measured before https://steinborn.me/posts/performance/how-fast-can-you-write-a-varint/ ,and it shows that unrolling varint writting loop gives a lot of performance(https://github.com/astei/varint-writing-showdown)

Testing stuff, the main difference between our benchmarks is not randomness of data, but the data_count difference, making the bottleneck in my benchmark to probably be memory bandwhich.
Lowering DATA_COUNT to 2048 causes my benchmark to reach similar speed to his zen2 result (me being on zen3 without boost clocks for testing) (3.1ns vs 5.1ns without blackhole, 5.7ns vs 7.7ns with b) - 40% reduction.

Using a DATA_COUNT of 32768 (prob causing branch prediction cache to be less relevant) gives me 6.2ns vs 8.1ns (7.7ns vs 9.4ns with blackhole) even when limiting the max number to just 10k. (9.6ns vs 16ns when using full integer range).

Given this, the real-world-speedups inside bungee are nowhere near 80% (most packets and other sent varint numbers are smaller), but they are still emasureable. PR is coming.

WejsoneKK added a commit to WejsoneKK/BungeeCord-BotFilter that referenced this pull request Jun 24, 2023
* SpigotMC#3451: Improve length field prepending on bungee -> server connection

Use alternative implementation of Varint21LengthFieldPrepender on bungee -> server connection for improved speed - it uses separate buffer to prepend the length to avoid copying large data buffer.
Not applied bungee -> client because encrypting 1-5 bytes of length separately through expensive jni call could make it not worth (not measured).

* Remove use of internal gson API

* Deprecate string join

* Update native libraries

* SpigotMC#3452: Bump animal-sniffer-maven-plugin from 1.22 to 1.23

Bumps [animal-sniffer-maven-plugin](https://github.com/mojohaus/animal-sniffer) from 1.22 to 1.23.
- [Release notes](https://github.com/mojohaus/animal-sniffer/releases)
- [Commits](mojohaus/animal-sniffer@animal-sniffer-parent-1.22...1.23)

---
updated-dependencies:
- dependency-name: org.codehaus.mojo:animal-sniffer-maven-plugin
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update dependencies

* Update Github actions to ubuntu-22.04

* SpigotMC#3456: Bump netty-bom from 4.1.90.Final to 4.1.91.Final

Bumps [netty-bom](https://github.com/netty/netty) from 4.1.90.Final to 4.1.91.Final.
- [Release notes](https://github.com/netty/netty/releases)
- [Commits](netty/netty@netty-4.1.90.Final...netty-4.1.91.Final)

---
updated-dependencies:
- dependency-name: io.netty:netty-bom
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add profile for Java 20 compilation of bootstrap

* SpigotMC#3457: Bump maven-resolver-connector-basic from 1.9.7 to 1.9.8

Bumps [maven-resolver-connector-basic](https://github.com/apache/maven-resolver) from 1.9.7 to 1.9.8.
- [Release notes](https://github.com/apache/maven-resolver/releases)
- [Commits](apache/maven-resolver@maven-resolver-1.9.7...maven-resolver-1.9.8)

---
updated-dependencies:
- dependency-name: org.apache.maven.resolver:maven-resolver-connector-basic
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* SpigotMC#3458: Bump maven-resolver-transport-http from 1.9.7 to 1.9.8

Bumps [maven-resolver-transport-http](https://github.com/apache/maven-resolver) from 1.9.7 to 1.9.8.
- [Release notes](https://github.com/apache/maven-resolver/releases)
- [Commits](apache/maven-resolver@maven-resolver-1.9.7...maven-resolver-1.9.8)

---
updated-dependencies:
- dependency-name: org.apache.maven.resolver:maven-resolver-transport-http
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* SpigotMC#3459: Bump mysql-connector-j from 8.0.32 to 8.0.33

Bumps [mysql-connector-j](https://github.com/mysql/mysql-connector-j) from 8.0.32 to 8.0.33.
- [Release notes](https://github.com/mysql/mysql-connector-j/releases)
- [Changelog](https://github.com/mysql/mysql-connector-j/blob/release/8.0/CHANGES)
- [Commits](mysql/mysql-connector-j@8.0.32...8.0.33)

---
updated-dependencies:
- dependency-name: com.mysql:mysql-connector-j
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update dependabot.yml

* Add maven-enforcer-plugin for dependency convergence

* SpigotMC#3460: Bump maven-checkstyle-plugin from 3.2.1 to 3.2.2

Bumps [maven-checkstyle-plugin](https://github.com/apache/maven-checkstyle-plugin) from 3.2.1 to 3.2.2.
- [Release notes](https://github.com/apache/maven-checkstyle-plugin/releases)
- [Commits](apache/maven-checkstyle-plugin@maven-checkstyle-plugin-3.2.1...maven-checkstyle-plugin-3.2.2)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-checkstyle-plugin
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* SpigotMC#3461: Bump netty-bom from 4.1.91.Final to 4.1.92.Final

Bumps [netty-bom](https://github.com/netty/netty) from 4.1.91.Final to 4.1.92.Final.
- [Release notes](https://github.com/netty/netty/releases)
- [Commits](netty/netty@netty-4.1.91.Final...netty-4.1.92.Final)

---
updated-dependencies:
- dependency-name: io.netty:netty-bom
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* SpigotMC#3464: Bump maven-gpg-plugin from 3.0.1 to 3.1.0

Bumps [maven-gpg-plugin](https://github.com/apache/maven-gpg-plugin) from 3.0.1 to 3.1.0.
- [Commits](apache/maven-gpg-plugin@maven-gpg-plugin-3.0.1...maven-gpg-plugin-3.1.0)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-gpg-plugin
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump maven-resolver dependencies

* SpigotMC#3455: Don't lock connections for offline uuid lookup when given uuid is not offline mode

* SpigotMC#3466: Bump maven-checkstyle-plugin from 3.2.2 to 3.3.0

Bumps [maven-checkstyle-plugin](https://github.com/apache/maven-checkstyle-plugin) from 3.2.2 to 3.3.0.
- [Commits](apache/maven-checkstyle-plugin@maven-checkstyle-plugin-3.2.2...maven-checkstyle-plugin-3.3.0)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-checkstyle-plugin
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* SpigotMC#3467: Bump maven-source-plugin from 3.2.1 to 3.3.0

Bumps [maven-source-plugin](https://github.com/apache/maven-source-plugin) from 3.2.1 to 3.3.0.
- [Commits](apache/maven-source-plugin@maven-source-plugin-3.2.1...maven-source-plugin-3.3.0)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-source-plugin
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* SpigotMC#3468: Bump lombok from 1.18.26 to 1.18.28

Bumps [lombok](https://github.com/projectlombok/lombok) from 1.18.26 to 1.18.28.
- [Release notes](https://github.com/projectlombok/lombok/releases)
- [Changelog](https://github.com/projectlombok/lombok/blob/master/doc/changelog.markdown)
- [Commits](projectlombok/lombok@v1.18.26...v1.18.28)

---
updated-dependencies:
- dependency-name: org.projectlombok:lombok
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* SpigotMC#3469: Bump netty-bom from 4.1.92.Final to 4.1.93.Final

Bumps [netty-bom](https://github.com/netty/netty) from 4.1.92.Final to 4.1.93.Final.
- [Commits](netty/netty@netty-4.1.92.Final...netty-4.1.93.Final)

---
updated-dependencies:
- dependency-name: io.netty:netty-bom
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* sodium fix

* prepare for 1.20

* compile fix?

* 1.20 support

* Fix

* [Release] 3.8.13

* Add Support 1.20.1, Change Brand-Name

* Add Polish Lang to Log and botfilter messages

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Janmm14 <gitconfig1@janmm14.de>
Co-authored-by: md_5 <git@md-5.net>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Janmm14 <Janmm14@users.noreply.github.com>
Co-authored-by: Leymooo <Vjatseslav.Maspanov@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants