[FIX] app status inconsistencies when running multiple instances in a cluster #29219
Proposed changes (including videos or screenshots)
App status inconsistencies between multiple instances in a cluster boil down to the fact that the Apps-Engine is currently responsible for orchestrating when app status notifications are triggered, and it is overly verbose in doing so.
Upon analysis, the framework itself should not have the concept of "other instances": this is a deployment detail of the host system, and as such should be controlled by the host. The correct solution is to review this notification system, potentially removing it from the framework and leaving the responsibility solely to Rocket.Chat.
However, the problem is hindering the current app management experience for workspaces, so this PR takes control of the most problematic notifications away from the framework and moves it over to Rocket.Chat in a short and practical way.
This is done by turning the methods of the most problematic events in the AppActivationBridge into no-ops, and instead triggering the AppServerNotifier directly in the API endpoints that are applicable.
It is not the most correct solution to the problem, but due to time constraints and urgency it will be applied first so we can move to the correct solution at a future point.
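The shape of this change can be illustrated with a minimal sketch. Every name below (SketchNotifier, SketchActivationBridge, SketchAppManager, installAppEndpoint, the event strings) is a hypothetical stand-in and not the actual Rocket.Chat or Apps-Engine code; the functions are synchronous only to keep the sketch short.

```typescript
// Hypothetical stand-ins; these are NOT the real Rocket.Chat / Apps-Engine types.
type SketchEvent = { event: string; appId: string };

// Stand-in for the notifier that broadcasts app changes to the other instances.
class SketchNotifier {
  public readonly sent: SketchEvent[] = [];

  appAdded(appId: string): void {
    // A real notifier would publish to the other instances; here we only record.
    this.sent.push({ event: "app/added", appId });
  }
}

// Stand-in for the bridge the framework calls into.
// The problematic methods are turned into no-ops, so framework-driven
// calls no longer reach the other instances.
class SketchActivationBridge {
  appAdded(_appId: string): void {
    // intentionally empty
  }

  appStatusChanged(_appId: string, _status: string): void {
    // intentionally empty
  }
}

// Stand-in for the framework, which may call the bridge several times
// during a single install (the "overly verbose" behavior).
class SketchAppManager {
  private readonly bridge: SketchActivationBridge;

  constructor(bridge: SketchActivationBridge) {
    this.bridge = bridge;
  }

  add(appId: string): void {
    this.bridge.appAdded(appId);
    this.bridge.appStatusChanged(appId, "initialized");
    this.bridge.appStatusChanged(appId, "enabled");
  }
}

// Stand-in for an API endpoint: the host decides when to notify,
// and does so exactly once per install.
function installAppEndpoint(
  manager: SketchAppManager,
  notifier: SketchNotifier,
  appId: string,
): void {
  manager.add(appId);
  notifier.appAdded(appId);
}
```

In this sketch the framework can call the bridge as often as it likes without side effects on other instances, while the endpoint emits a single, host-controlled notification per operation.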
Issue(s)
Steps to test or reproduce
1 - Set up a clustered deployment (6.x), either micro-services or high availability;
2 - Install an app on instance A;
3 - The app will not be available on the other instances.
Further comments