Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance could be a lot better #47

Open
chabad360 opened this issue Mar 18, 2021 · 6 comments
Open

Performance could be a lot better #47

chabad360 opened this issue Mar 18, 2021 · 6 comments

Comments

@chabad360
Copy link
Contributor

chabad360 commented Mar 18, 2021

Hey all, I used this library in a project of mine, and in the process discovered that during the processing of each message, a rather large amount of memory is used, far more than necessary. I created a fork with the goal of bringing down the amount of heap allocations as much as possible (primarily for the server). I think I succeeded:

BenchmarkOriginalParsePacket-4        107958         10519 ns/op        4584 B/op         16 allocs/op
BenchmarkMyParsePacket-4             1000000          2298 ns/op         320 B/op         10 allocs/op

This is when running on repl.it, the performance on an 8th gen I7 is much better (around 310 ns/op).

I would like to contribute back some of these changes (some of them are not suited for concurrent workloads without introducing locking). As well as start a discussion about what else could be improved (there are some stuff that I didn't get to).
Please feel free to explore my fork. I should note, I originally (in my master branch) divided up the single file into multiple to make it easier to modify without getting lost, perhaps this is a change that should be introduced as well.

master...chabad360:patch-1

@hypebeast
Copy link
Owner

Hey,

Thanks a lot for your input. I'm always happy to receive a PR for your performance improvements. Furthermore, I'm also open to discuss the restructuring. Maybe you can create two PRs: one for the perf. improvements and one for the restructuring. This makes it a little bit easier for me. And first I want to merge the PRs #39, #40 and #41 before I start with a restructuring.

@chabad360
Copy link
Contributor Author

chabad360 commented Mar 19, 2021

The compare link that I put in the OP is compatible with the current structure.

But I think I would rather do a restructuring first, as that would help make it more clear where the various optimizations are happening. Also it extremely simple to do, as the code itself is already quite organized into what would become various files.

@chabad360
Copy link
Contributor Author

Let's move the conversation about this over to #48, this way we can easily discuss actual code.

@hypebeast
Copy link
Owner

Thanks for the PR. I'll have a look in the next days.

@depili
Copy link

depili commented Jan 24, 2022

One major performance bottleneck that I ended up fixing when running on a rpi with high osc load was the dispatching logic. For some reason the current baseline implementation compiles the regexp patters for each message processing hook for each message. I ended up putting the patterns as compiled regexps into an array and bailing after the first match and optimizing the most used message handlers to the start of the array.

This really bites when you have large amount of messages and handlers for a complex system

@chabad360
Copy link
Contributor Author

I haven't even touched the dispatch portion yet (and I think it could be much more efficient) mainly because I just use a contains in my logic.

But I do intend to make a significant rewrite or this library, so this will definitely be included.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants