Metrics of last restarted instance looks a bit overloaded #22254

Open
ankar84 opened this issue Jun 6, 2021 · 4 comments
Comments

@ankar84

ankar84 commented Jun 6, 2021

Description:

We have a Docker deployment with 25 instances serving user connections and 2 instances without users for the presence monitor (only one of them actually has the presence monitor enabled).
But a couple of instances show a strange load according to the Rocket.Chat metrics.
Those instances have exactly the same configuration in the docker-compose.yml file and a normal number of users (no more than the other instances).

Steps to reproduce:

  1. Set up a Docker deployment and metrics monitoring
  2. Check some of the metrics graphs
  3. A few instances behave strangely

Expected behavior:

All instances should have similar load and metrics.

Actual behavior:

Here is the metrics size graph
image
As you can see, instance 5 on server 2 has a much larger metrics size
image
And another screenshot taken right now
image
And instance 4 on the same server 2 as well
image
Event loop lag is higher than on the other instances
image
image
Pod heap is the same as on the others
image
But the number of WS sessions is even lower than on the other instances
image
image

Server Setup Information:

  • Version of Rocket.Chat Server: 3.15.0
  • Operating System: CentOS7
  • Deployment Method: docker
  • Number of Running Instances: 25
  • DB Replicaset Oplog: Enabled
  • NodeJS Version: 12.22.1
  • MongoDB Version: 4.2.14 WiredTiger

Client Setup Information

  • Desktop App or Browser Version: 3.2.2
  • Operating System: Windows 10

Additional context

These graphs were collected on a weekend with really low load, but the metrics for those few instances behave the same at all times.
In general, the 5th (last) instance on every server has a larger metrics size and higher event loop lag.

We restart all instances one by one, and on each server I restart them from instance 5 down to instance 1 (5-4-3-2-1).
So instance 5 of server 5 is restarted first and instance 1 of server 2 is restarted last (actually instance 1 of server 1 is restarted last, but server 1 holds the 2 instances without user sessions).
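
To make "much larger than the others" concrete, here is a minimal sketch that flags instances whose metrics size is well above the median of all instances. The instance names, the sample values, and the 1.5x threshold are all hypothetical and only illustrate the comparison; they are not taken from our real graphs.

```python
from statistics import median


def flag_outliers(values, factor=1.5):
    """Return the names of instances whose value exceeds factor * median."""
    mid = median(values.values())
    return sorted(name for name, v in values.items() if v > factor * mid)


# Hypothetical metrics-size samples (KB), keyed by "server-instance".
samples = {
    "srv2-1": 110, "srv2-2": 105, "srv2-3": 112, "srv2-4": 108, "srv2-5": 240,
    "srv3-1": 109, "srv3-2": 111,
}

print(flag_outliers(samples))
```

The median is used instead of the mean so that one overloaded instance does not shift the baseline it is compared against.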

Relevant logs:

No

@johncrisp
Copy link

Thanks for reporting, Anton.

I have this on the 'performance Issues' list

@ankar84

ankar84 commented Aug 11, 2021

We restart the instances every night to mitigate the RC memory leak and other performance issues, and I found that the much more heavily loaded instance is always the one that was restarted last.

We changed the restart order, and now you can see a different instance with a much larger metrics size and other elevated metrics
image
image

So I will change the title a bit.
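
For anyone who wants to check the same correlation on their own deployment, here is a minimal sketch, assuming you already have the restart time and the current metrics size of each instance. The instance names, timestamps, and sizes below are hypothetical examples, not our real data.

```python
from datetime import datetime


def last_restarted(restart_times):
    """Return the instance with the most recent restart timestamp."""
    return max(restart_times, key=restart_times.get)


def largest_metrics_size(sizes):
    """Return the instance reporting the largest metrics size."""
    return max(sizes, key=sizes.get)


# Hypothetical nightly restart times; "srv2-1" is restarted last here.
restart_times = {
    "srv2-3": datetime(2021, 8, 11, 3, 20),
    "srv2-2": datetime(2021, 8, 11, 3, 30),
    "srv2-1": datetime(2021, 8, 11, 3, 40),
}
# Hypothetical metrics-size samples (KB) taken after the restarts.
sizes_kb = {"srv2-3": 111, "srv2-2": 108, "srv2-1": 240}

suspect = largest_metrics_size(sizes_kb)
print(suspect, suspect == last_restarted(restart_times))
```

If the comparison stays true after the restart order is changed, that matches what we see: the load follows the last restarted instance, not a particular container.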

@ankar84 changed the title from "Metrics of few instances looks overloaded without reasons" to "Metrics of last restarted instance looks a bit overloaded" on Aug 11, 2021
@ankar84

ankar84 commented Sep 20, 2021

Still seeing it in 3.18.1.

@ankar84

ankar84 commented Jan 11, 2022

Still present in 4.2.2
image
image
image
