Metrics of last restarted instance looks a bit overloaded #22254

ankar84 · 2021-06-06T06:11:17Z

Description:

We have a Docker deployment with 25 instances for user connections and 2 instances without users for the presence monitor (only one of them actually has the presence monitor enabled).
However, a couple of instances show a strange load according to the Rocket.Chat metrics.
Those instances have exactly the same configuration in the docker-compose.yml file and a normal number of users (no more than the other instances).

Steps to reproduce:

Set up a Docker deployment with metrics monitoring
Check the metrics graphs
A few instances behave strangely

Expected behavior:

All instances should have similar load and metrics.

Actual behavior:

Here is the metrics size graph

As you can see, instance 5 on server 2 has a much larger metrics size

And another screenshot taken right now

Instance 4 on that same server 2 as well

Event loop lag is higher than on the other instances

Pod heap is the same as on the others

But the number of WS sessions is even lower than on the other instances
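
For reference, here is a minimal sketch of how such an outlier could be flagged automatically instead of by eye. It assumes the per-instance values have already been exported from the metrics dashboard into a dictionary. The function name `find_outliers`, the instance names, the lag values and the factor of 2 are all made up for illustration and are not taken from our deployment.

```python
from statistics import median


def find_outliers(values, factor=2.0):
    """Return the names whose value exceeds factor times the median of all values."""
    mid = median(values.values())
    return sorted(name for name, value in values.items() if value > factor * mid)


# Hypothetical event loop lag samples in seconds (illustrative only).
lag = {
    "server2-instance4": 0.012,
    "server2-instance5": 0.045,
    "server3-instance1": 0.011,
    "server3-instance2": 0.010,
}

print(find_outliers(lag))
```

Comparing against the median rather than the mean keeps a single overloaded instance from shifting the baseline it is measured against.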

Server Setup Information:

Version of Rocket.Chat Server: 3.15.0
Operating System: CentOS7
Deployment Method: docker
Number of Running Instances: 25
DB Replicaset Oplog: Enabled
NodeJS Version: 12.22.1
MongoDB Version: 4.2.14 WiredTiger

Client Setup Information

Desktop App or Browser Version: 3.2.2
Operating System: Windows 10

Additional context

These graphs were collected on a weekend with really low load, but the metrics for those few instances behave the same at all times.
In general, all of the number 5 (last) instances have a larger metrics size and higher event loop lag.

We restart all instances one by one, and on each server I restart them from instance 5 down to instance 1 (5-4-3-2-1).
So instance 5 of server 5 is restarted first and instance 1 of server 2 is restarted last (actually instance 1 of server 1 is restarted last, but server 1 holds the 2 instances without user sessions).

Relevant logs:

No

johncrisp · 2021-06-06T21:25:59Z

Thanks for reporting, Anton.

I have this on the 'performance Issues' list.

ankar84 · 2021-08-11T02:47:38Z

We restart the instances every night to mitigate the RC memory leak and other performance issues, and I found that the much more heavily loaded instance is always the one that was restarted last.

We changed the restart order, and now a different instance shows a much larger metrics size and higher values for the other metrics.

So I will change the title a bit.

ankar84 · 2021-09-20T09:29:33Z

Still seeing it in 3.18.1

ankar84 · 2022-01-11T05:27:37Z

Still present in 4.2.2

ankar84 changed the title ~~Metrics of few instances looks overloaded without reasons~~ Metrics of last restarted instance looks a bit overloaded Aug 11, 2021

ankar84 mentioned this issue Jan 24, 2022

[Bug] All apps are disabled after restart of instance with highest value of Active Handlers in muti-instance deployment #24260

Closed
