Active failover fails to install foxx app #13915
Comments
Hi @rivet351, Are you able to reproduce the problem in a local setup or only on Azure? I wonder if it's Azure specific or not. Does the initial leader come back up as follower and remain follower? Can you compare the contents of the |
Hi @Simran-B, we upgraded to 3.7.10 and have repeated the test multiple times with failover completing successfully, so it seems the self-heal was the issue. Thanks for your help. I'm closing the ticket. |
Hi, I've re-opened the ticket because this has now happened to our leader without a failover event (with errors similar to those in the stack trace above). The only fix we have is to re-deploy the same Foxx app build. This immediately fixes the issue and service resumes as normal. The two differences in the stack trace this time:
|
@rivet351 Can you confirm that this happened after an upgrade/replace of the Foxx app on the leader which then/later produced this log output? |
Hi, it appears to have happened randomly, several days after our last deployment. The timeline for this looks like (v.3.7.10): |
We too are seeing the issue, on 3.7.6. It started after a failover. |
|
Hi, Any update on this? |
Unfortunately not. I created an internal ticket https://arangodb.atlassian.net/browse/BTS-484 for tracking the issue. |
Hi, |
My Environment
Component, Query & Data
Affected feature:
Installation of the Foxx app following a failover event
Replication Factor & Number of Shards (Cluster only):
Leader with 2 x Followers
Steps to reproduce
2021-04-06T15:18:32Z [1] ERROR [24213] Failed to load Foxx service mounted at "/mailapi"
2021-04-06T15:18:32Z [1] ERROR [24213] via ArangoError: service files missing
2021-04-06T15:18:32Z [1] ERROR [24213] Mount: /mailapi
2021-04-06T15:18:32Z [1] ERROR [24213] at loadInstalledService (/usr/share/arangodb3/js/server/modules/@arangodb/foxx/manager.js:616:13)
2021-04-06T15:18:32Z [1] ERROR [24213] at initLocalServiceMap (/usr/share/arangodb3/js/server/modules/@arangodb/foxx/manager.js:519:23)
2021-04-06T15:18:32Z [1] ERROR [24213] at selfHeal (/usr/share/arangodb3/js/server/modules/@arangodb/foxx/manager.js:245:5)
2021-04-06T15:18:32Z [1] ERROR [24213] at Object.selfHealAll [as healAll] (/usr/share/arangodb3/js/server/modules/@arangodb/foxx/manager.js:196:20)
2021-04-06T15:18:32Z [1] ERROR [24213] at Object.exports.manage (/usr/share/arangodb3/js/server/modules/@arangodb/foxx/queues/manager.js:234:19)
2021-04-06T15:18:32Z [1] ERROR [24213] at eval (eval at (unknown source), :2:50)
2021-04-06T15:18:32Z [1] ERROR [24213] at eval (eval at (unknown source), :3:9)
2021-04-06T15:18:32Z [1] ERROR [24213] at eval (eval at (unknown source), :3:21)
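For anyone trying to confirm whether they are hitting the same failure, a small script can pull the affected mount points out of a log like the one above. This is only a rough sketch: it assumes the log lines have the same layout as this excerpt, and the function name `failed_foxx_mounts` is made up for illustration.

```python
import re

# Matches the first line of each failure report, as it appears in the excerpt above.
# Assumption: other ArangoDB versions or log settings may format this line differently.
FAILED_LOAD = re.compile(
    r'^(?P<ts>\S+) \[\d+\] ERROR \[(?P<log_id>\w+)\] '
    r'Failed to load Foxx service mounted at "(?P<mount>[^"]+)"'
)


def failed_foxx_mounts(lines):
    """Return (timestamp, mount) pairs for every 'Failed to load Foxx service' line."""
    hits = []
    for line in lines:
        match = FAILED_LOAD.match(line.strip())
        if match:
            hits.append((match.group("ts"), match.group("mount")))
    return hits
```

Feeding it the server log line by line (for example, `failed_foxx_mounts(open(path))`) gives a quick list of which mounts failed and when.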
Problem:
The failover does appear to take place: a new leader is elected and the service attempts to install the Foxx app that was previously working on the original leader. However, we see 503 errors, and the Foxx app is sometimes viewable and sometimes not. The app then needs to be hard deleted before a new working installation can be made. Until this is done, the Foxx app remains in a partially broken state (sometimes performing its jobs as expected, otherwise returning 503 errors).
Expected result:
Foxx app reinstalls without errors on new leader following failover
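The workaround described in this report (hard delete the app, then install the same build again) could be scripted against the Foxx HTTP API roughly as follows. This is a sketch under assumptions, not something taken from the thread: the helper name `build_reinstall_requests`, the base URL, and the bundle contents are illustrative, authentication is omitted, and all query options are left at the server defaults. It only builds the two requests and does not send them.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_reinstall_requests(base_url, mount, bundle):
    """Build the uninstall and install requests for one Foxx mount.

    base_url: server endpoint, e.g. "http://localhost:8529" (assumed, not from the report)
    mount:    mount path of the service, e.g. "/mailapi"
    bundle:   bytes of the service zip bundle (the same build as before)
    """
    query = urlencode({"mount": mount})

    # Step 1: remove the broken service at this mount.
    uninstall = Request(f"{base_url}/_api/foxx/service?{query}", method="DELETE")

    # Step 2: install the same build again at the same mount.
    install = Request(
        f"{base_url}/_api/foxx?{query}",
        data=bundle,
        headers={"Content-Type": "application/zip"},
        method="POST",
    )
    return [uninstall, install]
```

Each request could then be sent in order with `urllib.request.urlopen`, after adding whatever credentials the deployment requires and checking the response of the first before issuing the second.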