[MM-52867] feat: add advisor warn log for Elasticsearch #23729
base: master
Overall this is a great idea, but I'd rather use a simple job mechanism like the one we already use for various jobs. E.g.:
mattermost/server/channels/app/server.go
Line 1220 in f7f2ebf
func runSessionCleanupJob(s *Server) {
As a future improvement, we can use the admin notices instead. WDYT?
Thanks for the input @isacikgoz, I will make the changes to fit the job system we have in place. I'll also take a look at notices while I'm at it.
Changes look good!
	return
}

if postCount > 2000000 && userCount > 500 {
Why also the user count?
This is how the webapp displays the warning as well: https://github.com/mattermost/mattermost/blob/master/webapp/channels/src/components/admin_console/workspace-optimization/dashboard_checks/performance.ts#L37
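The condition being discussed can be isolated into a small predicate, which makes the two thresholds and the "Elasticsearch not enabled" precondition from the PR summary explicit. This is only a sketch: `shouldAdviseElasticsearch` and the constant names are hypothetical and do not come from the Mattermost codebase; only the numeric thresholds are taken from the diff above.

```go
package main

import "fmt"

// Thresholds copied from the condition quoted in the review.
const (
	postCountThreshold = 2000000
	userCountThreshold = 500
)

// shouldAdviseElasticsearch is a hypothetical helper. It reports whether
// the advisory should be logged: both counts must be strictly above their
// thresholds, and Elasticsearch must not already be enabled.
func shouldAdviseElasticsearch(postCount, userCount int64, elasticsearchEnabled bool) bool {
	if elasticsearchEnabled {
		return false
	}
	return postCount > postCountThreshold && userCount > userCountThreshold
}

func main() {
	fmt.Println(shouldAdviseElasticsearch(2500000, 800, false)) // large workspace, no Elasticsearch
	fmt.Println(shouldAdviseElasticsearch(2500000, 100, false)) // many posts, few users
	fmt.Println(shouldAdviseElasticsearch(2500000, 800, true))  // Elasticsearch already enabled
}
```

Keeping the check in one pure function would also make it easy to keep the server and webapp thresholds in step.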
I think a few changes are needed :)
Co-authored-by: Ibrahim Serdar Acikgoz <serdaracikgoz86@gmail.com>
Thanks for the reviews; I've addressed the above comments.
I also wanted to understand what the end goal here is. Nobody reads warning logs. It's not actionable at all.
A better way might be to have a big red banner in the system console asking admins to use elasticsearch.
server/channels/app/server.go
Outdated
model.CreateRecurringTaskFromNextIntervalTime("Elasticsearch workspace optimization check (startup)", func() {
	doElasticsearchWorkspaceOptimizationCheck(s)
}, time.Minute*10)

// Schedule recurring job
model.CreateRecurringTask("Elasticsearch workspace optimization check", func() {
	doElasticsearchWorkspaceOptimizationCheck(s)
}, time.Hour*24)
Are you sure this is the right thing? This sets up the same job in two different frequencies, once every 10 mins, once every 24 hours. Any reason we are doing it this way?
Probably not, I thought it worked in a different way. Let me take a look.
It does have an impact on performance. A full post count can take around 10s on large installations. That consumes a lot of DB CPU. It can be noticeable for large customers.
Yeah, what I meant is that this only happens if the customer meets specific conditions (2M+ posts and 500+ users) and does not have Elasticsearch enabled, so it's not a performance hit for everybody, and besides that, it only runs once a day.
The main idea here is that since this is logged at WARN level, the customers' alert systems will trigger an alert upon seeing this every day.
Right. The thing is that if it's a small customer, then it's useless to them but we are running a query anyways. And if it's a large customer, it's a performance hit.
I want to confirm that we have data points on this and have confirmed with multiple large customers that all of them have alert systems configured on WARN levels. As far as I am aware, large customers have alerts on system-level health checks like CPU, memory, etc., and they process their logs for compliance. I am not aware of any alerts set up from logs, but I might be wrong on that. Just wanted to confirm we aren't going ahead on assumptions. I have two recommendations:
This way we avoid the perf hit and make this more visible to admins.
@agnivade Valid points. Let me discuss your feedback with the team.
Summary
This commit adds a new warning log line for server admins that advises using Elasticsearch when the post count is too large and Elasticsearch is not enabled.
Notes
This is merely a draft; I'm requesting some feedback since this is my first contribution to the suite:
- [...] app.GetAnalytics(), but that performs more queries than I needed, and I didn't want to modify that method (at least not without confirming that I should). Maybe we could cache the analytics?
- [...] COUNT queries daily. It shouldn't cost much in terms of performance, and since the requirements are pretty specific it shouldn't pose an issue. If you have any concerns, please let me know.
Ticket Link
https://mattermost.atlassian.net/browse/MM-52867
Release Note