Hubs Cloud Rollback for Stack Update Offline to Online + Template update fails and rollback fails #6004

eyesnareinc · 2023-03-21T20:59:30Z

The Hubs Cloud stack update from offline to online fails and rolls back. The Hubs Cloud template update fails, and the update rollback also fails.

To Reproduce
Steps to reproduce the behavior:

1. Go to CloudFormation
2. Click on the stack name
3. Scroll down to the "online/offline option" and update
4. AppCloudfrontDistribution shows Status: UPDATE_COMPLETE; AppASG shows Status: UPDATE_IN_PROGRESS, Status reason: New instance(s) added to autoscaling group - Waiting on 1 resource signal(s) with a timeout of PT30M
5. RESULT: Update failed at AppDb. Status reason: Received 0 SUCCESS signal(s) out of 1. Unable to satisfy 100% MinSuccessfulInstancesPercent requirement. Stack rolled back to previous status.

THEN I TRIED
6. Updating the AWS Template to Hubs Cloud Personal 1.1.5 following the steps here (https://hubs.mozilla.com/docs/hubs-cloud-aws-updating-the-stack.html)
7. RESULT: Update failed at AppDb. Resource handler returned message: "Cannot find upgrade target from 10.serverless_21 with requested version 11.13 for serverless engine mode." The stack rollback also failed.
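For readers unfamiliar with the status reason in step 5, here is a minimal sketch of the arithmetic behind it: `PT30M` is an ISO 8601 duration (30 minutes), and 0 SUCCESS signals out of 1 expected cannot meet a 100% minimum. The function names are hypothetical and this is not CloudFormation's own code, only an illustration of the check the message describes.

```python
import re


def iso8601_minutes(duration: str) -> int:
    """Convert a simple ISO 8601 duration such as 'PT30M' or 'PT1H30M' to minutes."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?", duration)
    if match is None or duration == "PT":
        raise ValueError(f"unsupported duration: {duration!r}")
    hours = int(match.group(1) or 0)
    minutes = int(match.group(2) or 0)
    return hours * 60 + minutes


def signals_satisfied(successes: int, expected: int, min_percent: int) -> bool:
    """Return True if the share of SUCCESS signals meets the minimum percentage."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    # Integer comparison avoids floating-point rounding.
    return successes * 100 >= min_percent * expected


# Values taken from the status reasons quoted in steps 4 and 5.
timeout_minutes = iso8601_minutes("PT30M")
update_ok = signals_satisfied(successes=0, expected=1, min_percent=100)
print(timeout_minutes, update_ok)
```

In other words, the new instance had 30 minutes to report success, never did, and the update was rolled back.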

Expected behavior
Stack should update to online mode and work normally.

Hardware
AWS

Additional context
I've been hosting monthly events since Dec 2020 and this has never occurred.

eyesnareinc · 2023-03-21T21:02:23Z

eyesnareinc · 2023-03-21T21:09:33Z

standtech · 2023-03-22T15:43:20Z

If you are getting this DB error, then you are definitely using an IAM admin account rather than root. Don't forget to clean up Route 53; the entries just keep piling up. Also make sure your stack name is short: there seem to be character limits on the resource names, and the naming conventions include the stack name. I am also having a similar issue.
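A rough sketch of the stack-name length concern, assuming the limit in question is the 64-character maximum for IAM role names. The `stack-region-suffix` naming pattern below is hypothetical and is not taken from the Hubs Cloud template; substitute whatever pattern your template actually uses.

```python
IAM_ROLE_NAME_MAX = 64  # maximum length of an IAM role name


def role_name_fits(stack_name: str, region: str, suffix: str) -> bool:
    """Check whether a role name built from the stack name stays within the limit.

    The '<stack>-<region>-<suffix>' pattern is an assumption for illustration.
    """
    candidate = f"{stack_name}-{region}-{suffix}"
    return len(candidate) <= IAM_ROLE_NAME_MAX


print(role_name_fits("hubs", "us-east-1", "app-role"))
print(role_name_fits("my-very-long-hubs-cloud-stack-name-for-monthly-events", "us-east-1", "app-role"))
```

The shorter the stack name, the more room is left for whatever the template appends to it.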

ikenichiro · 2023-03-23T02:13:24Z

I was able to fix this issue with the following process. However, I'm not 100% confident that it is the right process, since I'm facing a different issue and am not able to run the stack ( #6005 ).
Logically it should not be a big issue, but use it at your own risk.

1. Open CloudFormation, find "AppDb" under Resources, go to RDS, and click Modify.
2. Check the engine version drop-down list and see whether it offers "compatible with PostgreSQL 11.16". If the version number is slightly different, note that number.
3. Go to (https://hubs.mozilla.com/docs/hubs-cloud-aws-updating-the-stack.html) and download the 1.1.5 template to a local drive.
4. Replace the version number 11.13 with the version number shown in the drop-down list, which was 11.16 in my case. To be more specific, the string you need to modify is the one changed in this commit (Hubs-Foundation/hubs-ops@28b7276). That commit changed it from 12 to 11.13; you should change it from 11.13 to 11.16, or to whichever version number you noted earlier.
5. Upload the modified template to an S3 bucket that is publicly accessible.
6. Copy the file URL (starting with https) and check that it is accessible from a private browsing window or another browser.
7. Go back to CloudFormation, press Update for the stack, and update the template using the template URL from your own S3 bucket.
8. The AppDb issue should be resolved.
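The version edit in the steps above can also be scripted. This is a minimal sketch that assumes the template stores the version under a key named `EngineVersion`; the sample fragment is made up for illustration and is not copied from the real Hubs Cloud template, so check the downloaded file before relying on it. The function name `patch_engine_version` is hypothetical.

```python
import re


def patch_engine_version(template_text: str, old: str, new: str) -> str:
    """Replace the value of every EngineVersion key equal to `old` with `new`."""
    pattern = re.compile(
        r"""(EngineVersion["']?[ \t]*:[ \t]*["']?)"""
        + re.escape(old)
        + r"""(["']?,?[ \t]*$)""",
        re.MULTILINE,
    )
    # A function replacement avoids ambiguity between group references and digits.
    patched, count = pattern.subn(
        lambda m: m.group(1) + new + m.group(2), template_text
    )
    if count == 0:
        raise ValueError(f"no EngineVersion entry with value {old!r} found")
    return patched


# Hypothetical template fragment, for illustration only.
sample = """AppDb:
  Type: AWS::RDS::DBCluster
  Properties:
    Engine: aurora-postgresql
    EngineVersion: "11.13"
    EngineMode: serverless
"""

patched = patch_engine_version(sample, old="11.13", new="11.16")
print(patched)
```

Raising an error when nothing matches is deliberate: a silent no-op would leave you uploading an unmodified template and hitting the same AppDb failure.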

standtech · 2023-03-23T07:17:18Z

The DB version warning is just drift, not an issue that triggers the rollback. I guess the issue is only with AppASG.

eyesnareinc · 2023-03-23T13:28:48Z

After the rollback failed, I created a new stack and tried to use the backup and restore steps. The create failed on AppDb with Status reason: Resource handler returned message: "The engine version you requested for your restored DB cluster (11.16) is not compatible with the engine version of the DB cluster snapshot (10.21)." I wound up noting my vault ID and deleting the stack. I'm going to rebuild and make sure I keep a backup. (I did this for the first year, but everything was working so well that I stopped.) :)
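As a small illustration of what that message is reporting, the sketch below pulls the two version numbers out of the quoted text and compares their major components. It only parses the message as quoted above and makes no claim about which version pairs RDS actually accepts; the function name is hypothetical.

```python
import re


def engine_versions(message: str) -> list[tuple[int, ...]]:
    """Return every parenthesised dotted version in the message as a tuple of ints."""
    found = re.findall(r"\((\d+(?:\.\d+)+)\)", message)
    return [tuple(int(part) for part in version.split(".")) for version in found]


message = (
    "The engine version you requested for your restored DB cluster (11.16) "
    "is not compatible with the engine version of the DB cluster snapshot (10.21)."
)

requested, snapshot = engine_versions(message)
same_major = requested[0] == snapshot[0]
print(requested, snapshot, same_major)
```

Here the requested version is in the 11.x line while the snapshot is in the 10.x line, which matches the mismatch seen in the earlier template update error.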

eyesnareinc · 2023-03-23T13:35:19Z

> If you are getting this DB error, then you are definitely using an IAM admin account rather than root. Don't forget to clean up Route 53; the entries just keep piling up. Also make sure your stack name is short: there seem to be character limits on the resource names, and the naming conventions include the stack name. I am also having a similar issue.

The Route 53 pile-up of domain names hasn't happened for me in a long time. Regarding IAM and root: I'm not sure why that would be an issue as long as you've given your IAM account enough privileges. I have always been told not to work in root mode. I'll note the character limit for resource names. Thank you for your advice! :)

eyesnareinc · 2023-03-23T14:36:22Z

OK, it happened again with a NEW stack. I just tried to update a new stack (moving it from offline to online). The stack update seemed to stall in UPDATE_IN_PROGRESS (New instance(s) added to autoscaling group - Waiting on 1 resource signal(s) with a timeout of PT30M) and eventually the update failed at AppASG.

ikenichiro · 2023-03-24T11:17:50Z

> The DB version warning is just drift, not an issue that triggers the rollback. I guess the issue is only with AppASG.

Sorry, I was only providing the solution for the rollback failure.
In my opinion, however, you would probably need to fix the AppDb problem for the stack to roll back successfully, or else fix it after the AppASG/StreamASG issues are resolved so that the stack can launch.
(It's probably a different issue, which is why I opened a new issue, #6005.)
This comes from my experience of having the same issue and solving it so that I could safely make backups in offline mode or continue to try updates to the stack.

standtech · 2023-03-24T11:40:18Z

I faced the DB issue, the role name character limit, the region abbreviation issue, and so on. The one I couldn't get around was the AppASG issue, because you need at least some access to the EC2 instances to check what's going on there, and you can't get that until the admin console is working, which never happens if the fresh install itself fails. The other symptoms have workarounds. So I guess all of us are facing the same issues, because we are all using the same templates/AMIs. Backup and restore would also fail in the current scenario, since it seems the AMIs are affected (they must be pulling some script on load), not the DB or the data on EFS.

eyesnareinc added the "bug" and "needs triage" labels Mar 21, 2023

ikenichiro mentioned this issue Mar 23, 2023

Enterprise Multiserver fails at AppASG and StreamASG upon CloudFormation Update #6005

Closed

Hubs-Foundation locked and limited conversation to collaborators May 4, 2023

matthewbcool converted this issue into discussion #6063 May 4, 2023

This issue was moved to a discussion.
