Increase ZT_MAX_PEER_NETWORK_PATHS to 128 #1891
Conversation
No to idea 1 please :) If you're on Wi-Fi and Ethernet, and have IPv6, you have 2 IPv4 addresses. Is 128 enough?
I wanted to be conservative in the increase and let people re-complain if needed. I'm not opposed to a higher number if we can prove it doesn't degrade performance or consume too much memory.
If doubling the number of tracked paths increases the (almost) idle network CPU base load, you're closing in on getting CPU-bound on an Apple M1 Max running macOS Ventura. Please spend the resources to understand the problem. The 1.10.3 release is unusable in production networks for me because of the drastic performance regressions, going as far as effectively cutting the connection between certain nodes on my network, all while burning through the battery runtime faster than Chrome with dozens of tabs of social media sites (auto-)playing noise. I have several questions and don't know the best way to interact with a source-available project as both an open source developer and a paying customer. These are the questions that I asked myself looking at this problem:
Please don't just bump the limit hoping the problem will go away and not come back.
Would it be hard to make it runtime configurable in local.conf? So embedded people can set it to (low number). |
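For context, local.conf is JSON, so a runtime cap might be expressed as something like the sketch below. Note the `maxPeerNetworkPaths` key is purely hypothetical — no such setting exists today; this is just what the suggestion could look like:

```json
{
  "settings": {
    "maxPeerNetworkPaths": 4
  }
}
```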
+1 to local.conf
I'm biased towards it just working for everyday users and letting engineers fiddle with fiddly bits.
Of course we are. This is why I've asked you for more details in the linked discussion. Thanks again btw.
Yes. This was explained in the linked discussion.
Depends entirely on your unique network conditions. Impossible for us to know.
I'm not sure what you mean. Can you rephrase this if it wasn't answered elsewhere?
Probably a more restrictive aliveness timer based on the last packet received. Mentioned in the PR but not yet included.
This is described in the PR.
That is unique to each case. We cannot know.
Unknown right now, which is why this is a PR on the dev branch to receive testing.
The overhead should be insignificant and nearly undetectable. The spamming of
This performance degradation is unrelated to path learning; it is likely related to the fixing of another bug.
This isn't what is suggested. This PR is a work in progress, and more changes will eventually be included before it is merged.
Very possible. But the advantage of a statically allocated array would be less performance overhead per packet. However, this could be minimal if we find the right tricks. Worth testing.
Just pure speculation, but it seems the same problem is hitting Android as well. A root-cause analysis with a potential fix other than max paths would be welcome. https://discuss.zerotier.com/t/android-zerotier-keep-stop-working-after-1-10-3-update
Closing this and continuing in a different PR: #1914
Too many paths
Historically, if ZeroTier received a packet on a path that had the same IP but a different port, it would simply overwrite that entry and learn the new path (to a max of `16`). This worked well enough for simple cases; in fact, the vast majority of cases. However, if we look at the logs of those early versions, you could see ZeroTier constantly learning, forgetting, and re-learning the same paths. Even in this state it wasn't too big of a problem, since it didn't prevent communication (the path would be re-learned for each incoming packet just in time) and didn't eat up much CPU, but it was still an undesirable state.

In response to this I changed how we learn paths and bumped the max allowed paths to `64`. Now ZeroTier will accumulate more paths (this isn't the dupe path issue), and it will only overwrite an entry if there are no slots left; even when it overwrites a path, it will prioritize based on similarity to the new path. This seems to work a lot better, but the side effect is that we end up with huge numbers of paths when people use multiple interfaces with multiple addresses. In most cases this is fine. However, we're seeing people reporting high CPU usage when instances of ZeroTier are running into the `64` max cap, since people are using multiple interfaces with multiple addresses on both ends. (See: https://discuss.zerotier.com/t/constant-high-cpu-usage-on-raspberry-pi-4/11533/32)

Here are the options as I see it:
1. Ignore reports and let the high CPU bugs persist.
2. Once again bump the max path count to `128` or whatever. This doesn't solve the issue but does kick the can down the road. Even though a linear search happens for each incoming/outgoing packet, I don't think it's a significant performance bottleneck, and even if it is, it would only affect a small number of people filling up the array. One additional side effect would be an increased memory footprint; it wouldn't be severe, but it could possibly be an issue in memory-constrained environments like mobile, where memory per process is absurdly low.
3. Introduce a learning backoff timer, but if we aren't learning all the paths possible then there might be missed traffic.
4. Forget paths more quickly.
My inclination would be to bump it to `128`, confirm that it doesn't cause too much additional memory usage, and possibly tune ZeroTier to forget paths more quickly.

Thoughts?