Upgrade Ubuntu operating system in production #3114

Closed
33 of 35 tasks
rbreslow opened this issue Jul 1, 2019 · 1 comment
rbreslow commented Jul 1, 2019

Steps to take prior to release

  • Reconcile marriagez-rings/Stroud_ModelMyWatershed/AWS/default.yml
    • Papertrail
    • Bastion instance type
    • DNS records
  • Update RELEASES.md and MIGRATIONS.md with any changes (e.g. new Bastion hostname)
  • Create a release/1.25.0 branch
  • Build release AMIs using release docs
  • Test that it works on staging
  • Follow AMI Promotion instructions

Questions to answer prior to release

  • Were past migrations modified to appease the Django upgrade?
  • Do the on_delete alterations to existing tables lead to new database constraints or database constraint modifications?

Steps to take during the release

  • Bring up the dark stack using release docs
  • Connect to each node via SSH so that we can review logs in the event that things don't work
  • Test the dark stack
  • Apply migrations
  • Clear tile cache and cut over to new stack
$ aws --profile mmw-stg s3 rm --recursive s3://tile-cache.modelmywatershed.org
  • Ensure new tiles are getting cached to S3
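
One way to confirm that the cache is refilling after the cutover is to compare object counts in the bucket over time. The sketch below is not part of the release docs. It assumes the same profile and bucket as the `s3 rm` command above, and `total_objects` is a hypothetical helper that reads the summary line printed by `aws s3 ls --summarize`.

```shell
# Hypothetical helper: read `aws s3 ls --recursive --summarize` output on
# stdin and print the number from the "Total Objects:" summary line.
total_objects() {
    awk -F': *' '/Total Objects:/ { print $2 }'
}

# Usage (assumes the profile and bucket from the command above):
#   aws --profile mmw-stg s3 ls --recursive --summarize \
#       s3://tile-cache.modelmywatershed.org | total_objects
```

Running the usage line twice, a few minutes apart while browsing the map, should show the count growing if tiles are being written to S3.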

Steps to take during the PostgreSQL upgrade

  • In a tmux pane, run this script to check for connectivity:
while true;
do
    date
    PGPASSWORD= \
        psql -h database.service.mmw.internal -U modelmywatershed -d modelmywatershed -c "SELECT version();"
    sleep 5
done
  • Vacuum the database:
modelmywatershed=> SET maintenance_work_mem='1GB';
modelmywatershed=> \timing
modelmywatershed=> VACUUM (FREEZE, ANALYZE, VERBOSE);
  • Verify there is no unsupported usage, using the PostgreSQL upgrade docs
  • Upgrade to 9.5.18 and apply the mmw-postgres95 parameter group
  • Upgrade the Bastion to postgresql-client-9.5:
sudo apt-get update
sudo apt-get install postgresql-client-9.5
  • Verify we aren't using any extensions and upgrade PostGIS:
modelmywatershed=> select * from pg_available_extensions where name like 'postgis%';
modelmywatershed=> ALTER EXTENSION postgis UPDATE;
modelmywatershed=> select postgis_full_version();
  • Vacuum the database:
modelmywatershed=> SET maintenance_work_mem='1GB';
modelmywatershed=> \timing
modelmywatershed=> VACUUM (FREEZE, ANALYZE, VERBOSE);
  • Upgrade to 9.6.14 and apply the mmw-postgres96 parameter group, using the RDS Console
  • Upgrade the Bastion to postgresql-client-9.6:
sudo apt-get install postgresql-client-9.6
  • Verify we aren't using any extensions and upgrade PostGIS:
modelmywatershed=> select * from pg_available_extensions where name like 'postgis%';
modelmywatershed=> ALTER EXTENSION postgis UPDATE;
modelmywatershed=> select postgis_full_version();
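
After each engine upgrade, the `SELECT version();` output from the connectivity loop can be reduced to the release series to confirm the step landed. This is a minimal sketch under that assumption. `pg_release_series` is a hypothetical helper, and it only makes sense for the 9.x numbering used in this checklist.

```shell
# Hypothetical helper: read `SELECT version();` output on stdin and print
# the release series (e.g. 9.5 or 9.6). Lines without a version are ignored.
pg_release_series() {
    sed -n 's/.*PostgreSQL \([0-9]*\.[0-9]*\).*/\1/p'
}

# Usage (same connection settings as the connectivity loop):
#   psql -h database.service.mmw.internal -U modelmywatershed \
#       -d modelmywatershed -At -c "SELECT version();" | pg_release_series
```

The expected series is 9.5 after the first upgrade and 9.6 after the second.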

Steps to take to reconcile CloudFormation stack drift

  • Apply the following diff:
diff --git a/deployment/cfn/vpc.py b/deployment/cfn/vpc.py
index d25db866..a9c693bb 100644
--- a/deployment/cfn/vpc.py
+++ b/deployment/cfn/vpc.py
@@ -43,9 +43,9 @@ class VPC(StackNode):
         'Region': 'us-east-1',
         'StackType': 'Staging',
         'KeyName': 'mmw-stg',
-        'AvailabilityZones': 'us-east-1b,us-east-1d',
-        'PublicSubnetCIDRRanges': '10.0.2.0/24,10.0.4.0/24',
-        'PrivateSubnetCIDRRanges': '10.0.3.0/24,10.0.5.0/24',
+        'AvailabilityZones': 'us-east-1a,us-east-1b',
+        'PublicSubnetCIDRRanges': '10.0.0.0/24,10.0.2.0/24',
+        'PrivateSubnetCIDRRanges': '10.0.1.0/24,10.0.3.0/24',
         'NATInstanceType': 't2.micro',
     }
  • Generate new VPC stack template:
./mmw_stack.py launch-stacks \
    --aws-profile mmw-prd \
    --mmw-config-path default.yml \
    --mmw-profile production \
    --vpc \
    --print-json > ~/vpc-$(date +"%Y_%m_%d_%H_%M").json
  • Apply update stack operation
  • See that stacks are in-sync
  • Generate new DataPlane stack template:
./mmw_stack.py launch-stacks \
    --aws-profile mmw-prd \
    --mmw-config-path default.yml \
    --mmw-profile production \
    --data-plane \
    --print-json > ~/data-plane-$(date +"%Y_%m_%d_%H_%M").json
  • Apply update stack operation
  • See that stacks are in-sync
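
The "in-sync" check can also be run from the command line with CloudFormation drift detection instead of the console. The sketch below is an illustration, not the procedure from the release docs. `stack_drift_status` is a hypothetical helper, the `mmw-prd` profile is assumed to match the `--aws-profile` used above, and the stack name has to be supplied by the operator.

```shell
# Hypothetical helper: start drift detection for one stack, wait for it to
# finish, and print the resulting drift status (e.g. IN_SYNC or DRIFTED).
stack_drift_status() {
    stack_name=$1

    # Kick off detection and capture its ID.
    detection_id=$(aws --profile mmw-prd cloudformation detect-stack-drift \
        --stack-name "$stack_name" \
        --query StackDriftDetectionId --output text) || return 1

    # Poll until detection is no longer in progress.
    while :; do
        detection_status=$(aws --profile mmw-prd cloudformation \
            describe-stack-drift-detection-status \
            --stack-drift-detection-id "$detection_id" \
            --query DetectionStatus --output text) || return 1
        [ "$detection_status" != "DETECTION_IN_PROGRESS" ] && break
        sleep 5
    done

    # Report the overall drift status for the stack.
    aws --profile mmw-prd cloudformation describe-stack-drift-detection-status \
        --stack-drift-detection-id "$detection_id" \
        --query StackDriftStatus --output text
}

# Usage (the stack name is a placeholder):
#   stack_drift_status "<vpc-stack-name>"
```

A result other than `IN_SYNC` means the deployed resources still differ from the template.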

Post release checks

  • Vacuum the database:
modelmywatershed=> SET maintenance_work_mem='1GB';
modelmywatershed=> \timing
modelmywatershed=> VACUUM (FREEZE, ANALYZE, VERBOSE);
  • Reboot the RDS instance to make sure the parameter group is synced
  • Ensure new bastion DNS works
  • Confirm logging to Papertrail works
  • Ensure RWD works
  • Ensure geoprocessing web service works

rbreslow added this to the Operations Sprint: 7/12-7/25 milestone Jul 1, 2019
rbreslow self-assigned this Jul 12, 2019
hectcastro modified the milestones: Operations Sprint: 7/12-7/25, Operations Sprint: 7/26-8/8 Jul 25, 2019
hectcastro modified the milestones: Operations Sprint: 7/26-8/8, Operations Sprint: 8/9-8/22 Aug 8, 2019
hectcastro modified the milestones: Operations Sprint: 8/9-8/22, Operations Sprint: 8/23-9/6 Aug 18, 2019

rbreslow commented

The "See that stacks are in-sync" tasks were not checked off due to #3162.