runq_overload everyday for few minutes #12919

Closed
chaymankala opened this issue Apr 23, 2024 · 4 comments


chaymankala commented Apr 23, 2024

What happened?

I'm getting this alarm runq_overload: VM is overloaded on node: '<node_name>': 104
for around 1-2 minutes every day.
I have two nodes in my cluster, each with 4GB of memory.
My traffic is almost the same all day:
~420 incoming msgs/sec
~220 outgoing msgs/sec
~32,000 clients
My RAM usage is usually 1.2GB/4GB, and there is no spike in RAM when I receive the alerts.
But I observed a spike in disk I/O when there is runq_overload.

Please help me in resolving this issue.

What did you expect to happen?

No alarm every day, as my system resources are more than enough.

How can we reproduce it (as minimally and precisely as possible)?

No response

Anything else we need to know?

No response

EMQX version

$ ./bin/emqx_ctl broker
sysdescr  : EMQX
version   : 5.5.0
datetime  : 2024-04-23T15:23:27.941069362+00:00
uptime    : 38 days, 9 hours, 56 minutes, 6 seconds

OS version

# On Linux:
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.4 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
$ uname -a
Linux emqlatest-2 5.4.0-173-generic #191-Ubuntu SMP Fri Feb 2 13:55:07 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Log files

Member

ieQu1 commented Apr 24, 2024

Hello,

This alarm is raised when too many Erlang processes are runnable, that is, they have pending work and are waiting for CPU time. It can happen for many reasons:

  • Once a day the broker broadcasts a message to all of its clients at the same time, so they all become runnable processes.
  • Once a day there is an unrelated cron job that causes CPU starvation for EMQX.
  • ...

So it can't be pinpointed to a single cause. Therefore, we need to know more about your setup. What authentication and authorization plugins are enabled? Is there any particular pattern to the client behavior?
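
One way to look for a daily pattern is to bucket the alarm times by hour of day. The following is a minimal sketch, not an EMQX feature: it assumes you have collected the alarm activation times yourself into a hypothetical file, one ISO 8601 timestamp per line, in the same shape as the `datetime` field printed by `emqx_ctl broker`. The function name `alarm_hours` is made up for illustration.

```shell
# alarm_hours FILE
# FILE is a hypothetical, hand-collected list of alarm timestamps, one per
# line, e.g. 2024-04-23T15:23:27+00:00. Characters 12-13 of such a line are
# the hour of day. Prints "count hour" pairs, most frequent hour first.
alarm_hours() {
    cut -c12-13 "$1" | sort | uniq -c | sort -rn
}
```

If nearly all alarms land in the same hour, that hour is a good place to look for cron jobs, scheduled backups, or a client fleet that reconnects or publishes on a timer.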

Contributor

qzhuyan commented Apr 25, 2024

I'm getting this alarm runq_overload: VM is overloaded on node: '<node_name>': 104

runq_overload means you are short of computing (CPU) resources during that period; there are 104 processes that cannot be scheduled.

It is not a critical issue, but if it happens very often, it is a sign that you should scale up your node.

Do you see CPU spikes? How many cores do you have, and are they shared?

But I observed a spike in disk I/O when there is runq_overload.

Do you have other services running on the same node?
Is it both read and write? Does it happen at the same time every day, and for how long?
It could be that your CPU is busy handling I/O interrupts.
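
To answer these questions it helps to record what the host itself sees around the alarm window. Below is a minimal sketch, assuming a Linux host (as in the report); the function name `snapshot` is hypothetical, and the optional file argument exists only so the function can be pointed at a saved copy of `/proc/loadavg`.

```shell
# snapshot [LOADAVG_FILE]
# Prints one line: UTC timestamp, number of CPU cores available to this
# process, 1-minute load average, and the kernel's count of currently
# runnable tasks (the part before "/" in the 4th field of /proc/loadavg).
# Note: "runnable" here counts OS-level tasks on the whole host, not Erlang
# processes, so it reflects contention from every service on the machine.
snapshot() (
    loadavg_file="${1:-/proc/loadavg}"
    read -r load1 _load5 _load15 running_total _rest < "$loadavg_file"
    printf '%s cores=%s load1=%s runnable=%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
        "$(nproc)" \
        "$load1" \
        "${running_total%%/*}"
)
```

Calling `snapshot` in a loop (for example once per second) across the time the alarm usually fires gives a record to compare against the alarm timestamps. Running `vmstat 1` alongside it shows CPU wait and block I/O in the same window, which helps tell read-heavy from write-heavy spikes.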

qzhuyan removed the BUG label Apr 25, 2024
Member

ieQu1 commented Apr 28, 2024

To slightly elaborate on @qzhuyan's reply:

there are 104 processes that cannot be scheduled.

There are 104 processes that are waiting to be scheduled, to be more precise. Essentially, this alarm indicates that the system has stopped being soft real-time and that there is CPU time starvation.
Under ideal conditions, the number of runnable processes per scheduler should be in single digits.
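
As a rough illustration of the "single digits per scheduler" rule of thumb, the sketch below divides a run queue length by a scheduler count. Both the helper name `runq_per_scheduler` and the scheduler counts used in the note are assumptions: the report does not say how many cores the nodes have, and this thread does not confirm that the alarm value is a node-wide total.

```shell
# runq_per_scheduler TOTAL SCHEDULERS
# Prints the average number of runnable processes per scheduler,
# rounded up to a whole number. Both arguments are positive integers.
runq_per_scheduler() {
    echo $(( ($1 + $2 - 1) / $2 ))
}
```

For example, if the reported 104 were a node-wide total and the node had 4 schedulers (a hypothetical figure), `runq_per_scheduler 104 4` gives 26, well above single digits. On a running node the two inputs can be read in an Erlang shell with `erlang:statistics(run_queue)` and `erlang:system_info(schedulers_online)`.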

Member

ieQu1 commented May 3, 2024

We've provided a high-level explanation of the alarm. I'm closing the issue, since no further details were given.

ieQu1 closed this as not planned (won't fix, can't repro, duplicate, stale) May 3, 2024