Sentry Self Hosted keeps crashing randomly and maxes out the RAM #2700
Comments
Your Kafka just blew up. This is a regular problem with Kafka under high traffic; it totally sucks, I know. The proper solution is to scale Kafka out horizontally: spawn multiple VMs (usually 3 or 5), deploy Kafka on them, and then maintain the Kafka cluster regularly. But please don't do this for self-hosted Sentry. It's a total nightmare and a maintenance burden. The likely root cause of your problem is simply that the dependencies (mostly Kafka) can't handle high traffic. The solutions available to you are:
What you can do to mitigate your current condition (basically what I'd do if I were you):
Let me know if that helps.
So I followed your instructions to drop the Kafka volume and this happened:
I also followed the "nuclear option" on this page: https://develop.sentry.dev/self-hosted/troubleshooting/
/var/log/error.log shows:
/var/log/access.log shows:
And docker-compose logs show:
Currently trying to get it running again.
For this error:
Try applying the solution defined here: https://github.com/getsentry/self-hosted/pull/2722/files
For this error:
It's usually a problem between Kafka and Zookeeper. My guess is that Zookeeper didn't report the correct state (or maybe it hadn't been running long enough to be stable enough for Kafka to use). Given this, what I'd do:
Would you please try the stable release instead of the nightly images? Pin your version to
I will try those steps shortly. I just figured out what's causing the 502 error, looks like
Here are some of the things I'd do (I've got that issue as well).
Holy crap, thank you so much for all your help. Do you have a tip jar? lol
Good news. I was able to free enough space by using pruning. I followed your suggestions for
After doing this I pulled 24.1.0 and reinstalled. I applied the fix for
Here's some output from the Docker logs; it still shows some errors. These are errors that I've seen in the logs since I initially set up this server months ago. Do you think these can be fixed?
I see so many "connection refused" errors to Kafka. Is the Kafka container healthy, though? Try
Sure, here's a chunk of
It seems like Kafka failed to connect to Zookeeper. You should check whether your Zookeeper is healthy or has corrupted data.
How can I do that?
On Jan 27, 2024, at 2:21 AM, Reinaldy Rafli ***@***.***> wrote:
It seems like the Kafka failed to connect to Zookeeper. You should check if your Zookeeper is healthy, or does it have corrupted data.
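One possible way to answer the "is Zookeeper healthy?" question is to send it a four-letter-word command over its client port. The following is a minimal sketch, not an official Sentry procedure. The host name `zookeeper` and port `2181` are assumptions about the setup, and the helper names are made up for illustration. Also note that newer ZooKeeper versions only answer commands listed in `4lw.commands.whitelist`, so `ruok` may need to be enabled first.

```python
import socket


def four_letter_word(host: str, port: int, command: str, timeout: float = 5.0) -> str:
    """Send a ZooKeeper four-letter-word command and return the raw reply.

    ZooKeeper answers the command and then closes the connection,
    so we simply read until the peer hangs up.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def zookeeper_is_ok(host: str = "zookeeper", port: int = 2181) -> bool:
    """Return True if the server replies 'imok' to 'ruok'.

    The default host and port are assumptions; adjust them to your setup.
    Any connection error or timeout is treated as "not healthy".
    """
    try:
        return four_letter_word(host, port, "ruok").strip() == "imok"
    except OSError:
        return False
```

Calling `zookeeper_is_ok()` from a machine or container that can reach the Zookeeper port returns `False` both when the port is unreachable and when the reply is anything other than `imok`, so a `False` result is a prompt to look at the Zookeeper logs rather than a diagnosis on its own.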
If you can't find the root cause of the issue, you can use Redpanda instead of Kafka + Zookeeper; it saves a lot of RAM. I made a simple guide on Sentry's Discord here: https://discord.com/channels/621778831602221064/796028405833007104/1201076383426809948. Please be aware that this is not officially supported by Sentry's employees. If there are any future updates, you'll have to do
This issue has gone three weeks without activity. In another week, I will close it.
But! If you comment or otherwise update it, I will reset the clock, and if you remove the label
"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀
Question for you. You mentioned that it may be beneficial to set a rate limiter middleware on the reverse proxy. I noticed that the Sentry > Admin > Settings page has a
Could this be used in place of a rate limiter middleware?
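For context on what a rate limiter middleware generally does, here is a minimal token-bucket sketch. It is only an illustration of the idea. It is not how Sentry's admin setting or any particular reverse proxy implements rate limiting, and the class name and parameters are hypothetical.

```python
import time


class TokenBucket:
    """Illustrative token-bucket limiter (hypothetical, not a Sentry or proxy API).

    Tokens refill at `rate` per second up to `capacity`. Each request
    spends one token; when the bucket is empty the request is rejected.
    """

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the time that has passed, never above capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A proxy-side limiter of this kind rejects excess requests before they reach Sentry at all, which is the point of putting it in front: a burst like the one described in the issue is shed at the edge instead of piling up in Kafka.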
Self-Hosted Version
24.1.0.dev0
CPU Architecture
x86_64
Docker Version
24.0.7
Docker Compose Version
2.21.0
Steps to Reproduce
No specific steps. A few days ago, during a planned maintenance window when our servers were down, a lot of clients started reporting events. Exactly at this time something stopped working, and no events were captured after that.
The Sentry dashboard was accessible, but the last reported event was from the time the maintenance window started. No new issues were received after that. I had to restart the containers, and then it worked properly.
This is when the problems started, but now I'm seeing them throughout the day. There's not much activity going on right now, and the dashboard keeps showing a "Service Unavailable - The service is temporarily unavailable. Please try again later." page every now and then. After a few minutes I'm able to load the dashboard. Looking at the Sentry VM, it's using almost 100% of the RAM, and the output from the logs is attached below.
Looking at the logs, I see lots of errors regarding Kafka. I'm not very familiar with this, so I would appreciate any help.
Expected Result
Expected Sentry not to crash.
Actual Result
Event ID
No response