Memory leaking #388

Closed
aphex008 opened this issue Nov 4, 2020 · 17 comments · Fixed by #427 or #429

Comments

@aphex008

aphex008 commented Nov 4, 2020

Hello!

I am observing strange Mercure behaviour. It has around 1,000 or fewer subscribers, but memory consumption seems to increase gradually until it eats up all available memory. Then processor usage skyrockets, and the situation does not improve until the hub is restarted.

(Zabbix screenshot)

I started with 1 core and 2 GB of RAM, then upped it to 2 cores and 4 GB. Now it runs on 2 cores and 8 GB of RAM. The behaviour seems to repeat no matter how much RAM we give it.

After a restart, the situation gets back to normal.

(Zabbix screenshot)

I'm running the official Docker installation, dunglas/mercure:v0.10.4.

What should the expected memory usage be? What can be done to debug the situation?

@dunglas
Owner

dunglas commented Nov 5, 2020

Hi, and thanks for reporting. To debug this, you'll have to use a debug build with the Go profiler enabled. I can provide one if you don't know how to do it. Feel free to contact me on Symfony's Slack.
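
For reference, a minimal sketch of what a build with the Go profiler enabled typically looks like; this is not the actual Mercure code, just the standard net/http/pprof pattern:

```go
// Minimal sketch, assuming a plain Go binary: importing net/http/pprof
// registers the profiling handlers under /debug/pprof/ on the default mux.
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // side-effect import: registers /debug/pprof/* handlers
)

func main() {
	// Serve the profiler on a private port, separate from application traffic.
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```

Profiles can then be pulled from http://localhost:6060/debug/pprof/.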

@aphex008
Author

aphex008 commented Nov 9, 2020

Hello Kevin, thank you for reaching out to me.
Mercure is not currently critical to my setup, so I can deploy a debug build to production and reproduce the memory leak.
I would be very thankful if you provided a debug build with the Go profiler and a short instruction on where to find the debug information relevant to the situation. I'm a total newbie in Go, so your guidance is much appreciated.
It's a Docker setup, so if you have a Docker image with the profiler enabled, that would be very easy to deploy.

@aphex008
Author

aphex008 commented Nov 9, 2020

I assumed that disabling logging would keep memory from being eaten up so fast, but the same thing happens even with logging.driver: none. We have around 500 topics and around 600-1000 subscribers. There are around 1-2 updates per second.

@dunglas
Owner

dunglas commented Nov 9, 2020

I'm in the process of changing the logger to Uber Zap (we have known memory issues with Logrus, which is deprecated anyway). I'll open a PR soon.
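
As an illustration only, not the actual PR, here is a hypothetical sketch of the kind of log call that Uber's Zap enables; the message and field names are made up:

```go
// Hypothetical sketch: a structured, low-allocation log call with Uber's zap.
package main

import "go.uber.org/zap"

func main() {
	logger, err := zap.NewProduction() // JSON output, sampling, few allocations
	if err != nil {
		panic(err)
	}
	defer logger.Sync() // flush any buffered entries on exit

	logger.Info("update dispatched",
		zap.String("topic", "https://example.com/books/1"),
		zap.Int("subscribers", 42),
	)
}
```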

@aurelijusrozenas

Hi @dunglas,

do you have any timeframe for when the next release will be done? We are really waiting for this... :)

@dunglas
Owner

dunglas commented Nov 17, 2020

Could you try the master version to check if it fixes the issue?

@ip512

ip512 commented Nov 26, 2020

Hi,
we have the same issue with our running instance of Mercure on GCP Kubernetes. We made a load test with a small Node script to try to reproduce this memory over-consumption. Even with more subscribers and more notifications published (compared to prod), we can't reproduce it.
If the issue came from the logs, it should be reproducible with a test script, right?
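
For anyone who wants to approximate the reported workload (around 500 topics, 600-1000 subscribers, 1-2 updates per second), here is a hedged sketch of a load generator written in Go rather than Node; the hub URL, the publisher JWT and the topic names are assumptions to adapt, and it assumes the hub allows anonymous subscriptions:

```go
// Hedged sketch of a load generator: many long-lived SSE subscriptions plus
// a slow stream of published updates. MERCURE_URL and MERCURE_JWT are
// assumptions and must point at a test hub.
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strings"
	"time"
)

const subscribers = 800 // roughly the reported range of 600-1000

func main() {
	hub := os.Getenv("MERCURE_URL") // e.g. https://example.com/.well-known/mercure
	jwt := os.Getenv("MERCURE_JWT") // a JWT allowed to publish (assumption)

	// Open long-lived SSE subscriptions that simply drain the stream.
	for i := 0; i < subscribers; i++ {
		topic := fmt.Sprintf("load-test/topic-%d", i%500) // ~500 distinct topics
		go func(topic string) {
			resp, err := http.Get(hub + "?topic=" + url.QueryEscape(topic))
			if err != nil {
				return
			}
			defer resp.Body.Close()
			io.Copy(io.Discard, resp.Body) // keep the connection open
		}(topic)
	}

	// Publish roughly two updates per second, like the reported workload.
	for range time.Tick(500 * time.Millisecond) {
		form := url.Values{
			"topic": {fmt.Sprintf("load-test/topic-%d", time.Now().Unix()%500)},
			"data":  {`{"ping":true}`},
		}
		req, _ := http.NewRequest(http.MethodPost, hub, strings.NewReader(form.Encode()))
		req.Header.Set("Authorization", "Bearer "+jwt)
		req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
		if resp, err := http.DefaultClient.Do(req); err == nil {
			resp.Body.Close()
		}
	}
}
```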

@JSmythSSG

Just noticed the same thing on our server using Mercure in Docker: it eventually uses up all the RAM, then the server becomes unstable and we have to restart the container to get memory usage back down.

@dunglas
Owner

dunglas commented Dec 1, 2020

Hi! Could you try the latest alpha version, which uses Caddy? It should fix the issue, but confirmation would be very welcome.

Also, this new version has a built-in profiler. If the problem persists, this will help a lot.

@aurelijusrozenas

I will report later, but so far version 0.11 seems to be doing just great! You can see from the CPU graph when the version was deployed :)
(screenshot: 2020-12-10_17-19)

@aphex008
Author

aphex008 commented Dec 11, 2020

Looks promising so far! Nice job @dunglas 👍
Memory has been slowly increasing since yesterday, when @aurelijusrozenas posted, but at a much slower rate. We had to reboot the previous version every hour, so this looks much better. 0.11 has held through the night. We'll post a follow-up.

@aurelijusrozenas

So it is much better than it was, but it is not enough. Previously it would crash every hour or so; now it's roughly every 24 hours. Maybe we just need to give it more RAM, but how do we figure out what is reasonable?

(screenshot: 2020-12-16_16-14)

@dunglas
Owner

dunglas commented Dec 16, 2020

I added a documentation entry explaining how to profile the hub: #422
Having heap and allocs profiles would help a lot. If you aren't able to do it yourself, could you enable debug mode and send me the URL of your hub privately?
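
For reference, a hedged sketch of grabbing those two profiles, assuming the hub exposes the standard net/http/pprof endpoints when debug mode is on; the HUB_DEBUG_URL variable and the endpoint base are assumptions, see #422 for the actual instructions:

```go
// Hedged sketch: download the heap and allocs profiles from a hub that
// exposes the standard /debug/pprof/ endpoints. HUB_DEBUG_URL is an assumption.
package main

import (
	"io"
	"net/http"
	"os"
)

func save(url, path string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	base := os.Getenv("HUB_DEBUG_URL") // base URL of the debug endpoint (assumption)
	for _, profile := range []string{"heap", "allocs"} {
		if err := save(base+"/debug/pprof/"+profile, profile+".pprof"); err != nil {
			panic(err)
		}
	}
}
```

The resulting heap.pprof and allocs.pprof files can then be inspected with go tool pprof.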

@dunglas
Owner

dunglas commented Dec 17, 2020

Thanks to the data provided by @aurelijusrozenas, we identified the issue. It looks like the topic selector cache isn't cleared even when no subscriber uses a given topic selector anymore. The problem is probably somewhere near https://github.com/dunglas/mercure/blob/main/topic_selector.go#L94

(screenshot)
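
To illustrate the shape of the problem, here is a simplified sketch, not the actual topic_selector.go code (which translates URI templates into regular expressions): a selector cache that is only ever written to keeps an entry alive for every selector ever seen, even after the last subscriber using it disconnects.

```go
// Simplified sketch of the leak pattern: a cache of compiled selectors that
// is only ever added to, so entries survive long after the last subscriber
// using them has gone away.
package main

import (
	"fmt"
	"regexp"
	"sync"
)

type topicSelectorStore struct {
	mu    sync.RWMutex
	cache map[string]*regexp.Regexp // grows forever: nothing evicts old selectors
}

func newTopicSelectorStore() *topicSelectorStore {
	return &topicSelectorStore{cache: make(map[string]*regexp.Regexp)}
}

func (s *topicSelectorStore) match(selector, topic string) bool {
	s.mu.RLock()
	re, ok := s.cache[selector]
	s.mu.RUnlock()

	if !ok {
		re = regexp.MustCompile(selector) // compile once...
		s.mu.Lock()
		s.cache[selector] = re // ...and keep it for the lifetime of the process
		s.mu.Unlock()
	}

	return re.MatchString(topic)
}

func main() {
	s := newTopicSelectorStore()
	// Every distinct selector seen here stays cached even if no subscriber
	// ever uses it again.
	fmt.Println(s.match("^/books/.*$", "/books/42"))
}
```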

@dunglas
Owner

dunglas commented Dec 18, 2020

I ran some tests and unfortunately I'm not able to reproduce the problem locally yet. A script (JS or anything else) that triggers the memory leak would help a lot.

@dunglas
Owner

dunglas commented Dec 19, 2020

For the record, I plan to switch from our built-in cache implementation to Ristretto. This should fix the issue and improve memory usage for everyone.
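
For context, a hedged sketch of what a bounded Ristretto cache looks like; the sizing parameters and the key/value used here are illustrative, not Mercure's actual configuration. Unlike an unbounded map, Ristretto evicts entries once MaxCost is reached, which caps memory use.

```go
// Hedged sketch of a bounded Ristretto cache with illustrative parameters.
package main

import (
	"fmt"

	"github.com/dgraph-io/ristretto"
)

func main() {
	cache, err := ristretto.NewCache(&ristretto.Config{
		NumCounters: 1e7,     // number of keys to track access frequency for
		MaxCost:     1 << 26, // cap on the total cost of cached entries
		BufferItems: 64,      // recommended default
	})
	if err != nil {
		panic(err)
	}

	// Entries are evicted once MaxCost is reached, so the cache cannot grow
	// without bound the way a plain map does.
	cache.Set("selector:/books/{id}", "compiled matcher", 1)
	cache.Wait() // wait for the buffered write to be applied

	if v, found := cache.Get("selector:/books/{id}"); found {
		fmt.Println(v)
	}
}
```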

@aurelijusrozenas

Fixed with v0.11.0-rc.2 🥳
(screenshot)
