BEAM VM becomes stuck when the number of connections is high #119

Open
pallix opened this issue Jan 14, 2020 · 5 comments

pallix commented Jan 14, 2020

I have an application that connects to multiple MQTT servers, each running locally in its own network namespace. When the number of connections is too high, the BEAM VM becomes stuck. I have written an example project by extracting code from our codebase, with a README that explains how to reproduce the problem. You can find it here. I would be happy if you could find the time to try it and tell me whether you can reproduce the problem.

I am using Debian 9 (stretch), Elixir 1.8.2 and Erlang/OTP 21.3.8.8.

And thanks for writing this software :-).

P.S: I was once lucky enough to have the observer print a few more graphs before being completely overloaded:

[Screenshot: Screenshot_2020-01-06_16-18-00]

This is surprising because the network namespaces are created, and the mosquitto servers started, at a rate of only about one per second.

Owner

gausby commented Jan 15, 2020

I am currently in the middle of a major rewrite (admittedly it has been going on for a long while) that will bring MQTT 5 support to Tortoise. I have recently picked up development again and am wrapping my head around what is needed to make it a release candidate. The architecture differs, so I hope you will let me release that first and then look at this issue?

And thanks for using Tortoise; spawning 300 tortoises on a single node is not a use-case I anticipated :)

Author

pallix commented Jan 15, 2020

It's great to hear that you are planning to further develop Tortoise! Maybe the problem will go away after the rewrite?

Thanks, I will keep an eye on the project development.

> And thanks for using Tortoise; spawning 300 tortoises on a single node is not a use-case I anticipated :)

It sounds a bit unusual, but that's what is needed to simulate IoT devices for my team.

Author

pallix commented Jan 16, 2020

I forgot to mention this: when I generated a dump, most of the scheduler processes were in the same calls (this is from the original problem; I did not generate a dump of the example):

Current Process CP: 0x00007f26a080db08 ('Elixir.Registry':unregister_match/4 + 952)
Current Process Limited Stack Trace:
0x00007f260d5c9348:SReturn addr 0x15A873D8 ('Elixir.Tortoise.Events':unregister/2 + 152)
0x00007f260d5c93a0:SReturn addr 0x15A79DF0 ('Elixir.Tortoise.Connection':connection/2 + 952)
0x00007f260d5c93a8:SReturn addr 0x15A8FCD8 ('Elixir.Tortoise':publish/4 + 384)
[...]

Owner

gausby commented Jan 16, 2020

Oh; I have a registry that I use as a pubsub, so that processes can subscribe to a TCP socket: I move the TCP socket to the process that will send a QoS=0 message. The problem could be that the registry gets overwhelmed when too many tortoises are running.

…an interesting case. I will look further into this at a later time.
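
For readers following along, the "registry as a pubsub" pattern mentioned above can be sketched roughly as follows. This is a minimal illustration and not Tortoise's actual code: the module name `PubSubSketch`, the registry name `PubSubSketch.Events`, and the `:socket_events` value are hypothetical. It relies only on the standard `Registry` functions, including `Registry.unregister_match/4`, the call that appears at the top of the dump earlier in this thread.

```elixir
# Hypothetical sketch: names are illustrative and not taken from Tortoise.
defmodule PubSubSketch do
  @registry PubSubSketch.Events

  # Start a registry with duplicate keys, so that many processes
  # can register under the same key (here: a client id).
  def start do
    Registry.start_link(keys: :duplicate, name: @registry)
  end

  # Subscribe the calling process to events for `client_id`.
  def subscribe(client_id) do
    {:ok, _owner} = Registry.register(@registry, client_id, :socket_events)
    :ok
  end

  # Remove the calling process's entries under `client_id`
  # whose value matches :socket_events.
  def unsubscribe(client_id) do
    Registry.unregister_match(@registry, client_id, :socket_events)
  end

  # Send `message` to every process subscribed to `client_id`.
  def broadcast(client_id, message) do
    Registry.dispatch(@registry, client_id, fn entries ->
      for {pid, :socket_events} <- entries do
        send(pid, {:event, client_id, message})
      end
    end)
  end
end
```

A process would call `PubSubSketch.subscribe("client-1")`, wait for an `{:event, "client-1", message}` tuple, and then call `PubSubSketch.unsubscribe("client-1")`. In this sketch every connection registers and unregisters against the same registry, which is the kind of shared resource that the comment above suspects might get overwhelmed when hundreds of connections are running.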

Author

pallix commented Jan 22, 2020
