Merge pull request #190 from asfadmin/cjl/feature/v18.2.0.0
Upgrade to Cumulus v18.2.0
lindsleycj committed Mar 5, 2024
2 parents ed6dbf3 + 6392894 commit 9c3eb78
Showing 13 changed files with 69 additions and 33 deletions.
12 changes: 10 additions & 2 deletions CHANGELOG.md
@@ -1,7 +1,15 @@
# CHANGELOG

## Unreleased

## v18.2.0.0

* Upgrade to [Cumulus v18.2.0](https://github.com/nasa/cumulus/releases/tag/v18.2.0)
* **NOTE** this version of Cumulus requires changes to the RDS database per
[these instructions](https://nasa.github.io/cumulus/docs/upgrade-notes/upgrade-rds-cluster-tf-postgres-13/)
* upgrade TEA to [v1.3.5](https://github.com/asfadmin/thin-egress-app/releases/tag/tea-release.1.3.5)
* update required terraform version to `>= 1.5` in all CIRRUS modules matching the requirements
from the Cumulus application.
* Add `DAR=YES` tag to terraform state bucket created by `make tf`
* replace deprecated use of terraform `s3_bucket_object` with `s3_object`
* expose the TEA lambda timeout value to allow for DAAC customization
* add `--platform linux/amd64` to all Docker commands in `Makefile` so `make image` and
`make container-shell` work on Apple Silicon machines
2 changes: 1 addition & 1 deletion Makefile
@@ -19,7 +19,7 @@
# PYTHON_VER: python3 or python38 which sets the build target in make file

# ---------------------------
-DOCKER_TAG := v18.0.0.0
+DOCKER_TAG := v18.2.0.0
export TF_IN_AUTOMATION="true"
export TF_VAR_MATURITY=${MATURITY}
export TF_VAR_DEPLOY_NAME=${DEPLOY_NAME}
4 changes: 2 additions & 2 deletions cumulus/main.tf
@@ -1,5 +1,5 @@
module "cumulus" {
-source = "https://github.com/nasa/cumulus/releases/download/v18.0.0/terraform-aws-cumulus.zip//tf-modules/cumulus"
+source = "https://github.com/nasa/cumulus/releases/download/v18.2.0/terraform-aws-cumulus.zip//tf-modules/cumulus"

cumulus_message_adapter_lambda_layer_version_arn = data.terraform_remote_state.daac.outputs.cma_layer_arn

@@ -83,7 +83,7 @@ module "cumulus" {

orca_lambda_copy_to_archive_arn = local.orca_lambda_copy_to_archive_arn
orca_sfn_recovery_workflow_arn = local.orca_sfn_recovery_workflow_arn
-orca_api_uri = local.orca_api_uri
+orca_api_uri = local.orca_api_uri

# must match stage_name variable for thin-egress-app module
tea_api_gateway_stage = local.tea_stage_name
6 changes: 3 additions & 3 deletions cumulus/thin-egress.tf
@@ -1,8 +1,8 @@
module "thin_egress_app" {
-source = "s3::https://s3.amazonaws.com/asf.public.code/thin-egress-app/tea-terraform-build.1.3.3.zip"
+source = "s3::https://s3.amazonaws.com/asf.public.code/thin-egress-app/tea-terraform-build.1.3.5.zip"

auth_base_url = var.urs_url
-bucket_map_file = local.bucket_map_key == null ? aws_s3_bucket_object.bucket_map_yaml.id : local.bucket_map_key
+bucket_map_file = local.bucket_map_key == null ? aws_s3_object.bucket_map_yaml.id : local.bucket_map_key
bucketname_prefix = ""
config_bucket = local.system_bucket
cookie_domain = var.thin_egress_cookie_domain
@@ -40,7 +40,7 @@ resource "aws_secretsmanager_secret_version" "thin_egress_urs_creds" {
})
}

-resource "aws_s3_bucket_object" "bucket_map_yaml" {
+resource "aws_s3_object" "bucket_map_yaml" {
bucket = local.system_bucket
key = "${local.prefix}/thin-egress-app/${local.prefix}-bucket_map.yaml"
content = templatefile("./thin-egress-app/bucket_map.yaml.tmpl", {
2 changes: 1 addition & 1 deletion cumulus/versions.tf
@@ -1,3 +1,3 @@
terraform {
-required_version = ">= 0.13"
+required_version = ">= 1.5"
}
2 changes: 1 addition & 1 deletion data-persistence/main.tf
@@ -1,5 +1,5 @@
module "data_persistence" {
-source = "https://github.com/nasa/cumulus/releases/download/v18.0.0/terraform-aws-cumulus.zip//tf-modules/data-persistence"
+source = "https://github.com/nasa/cumulus/releases/download/v18.2.0/terraform-aws-cumulus.zip//tf-modules/data-persistence"

prefix = local.prefix
subnet_ids = data.aws_subnets.subnet_ids.ids
2 changes: 1 addition & 1 deletion data-persistence/versions.tf
@@ -1,3 +1,3 @@
terraform {
-required_version = ">= 0.13"
+required_version = ">= 1.5"
}
@@ -1,13 +1,8 @@
-# Resolve TEA CloudFormation Error
+# Resolve TEA CloudFormation Errors

-In CIRRUS v17.0.0.3 and all later versions the `cumulus/thin-egress.tf` file was updated
-to pass Tags to the thin-egress terraform module. The bulk of the thin-egress application
-is deployed via CloudFormation.

-For ORNL, the first time the CloudFormation stack tried to apply these tags, it deleted
-the TEA Api Gateway stage and generated an error which could not be automatically
-recovered. Maybe other DAAC's will not see this issue but here is what ORNL saw and how
-we resolved it.
+ORNL has run into CloudFormation issues when deploying TEA via CIRRUS. If your DAAC has
+these same issues, here are the steps that allowed ORNL to recover from this
+situation. This document will be updated as we identify additional scenarios.

**It is important to take these steps after the first run of `make cumulus`. If you
wait until another `make cumulus` run you may put the CloudFormation stack into a state
@@ -17,7 +12,7 @@ to re-associate CloudFront to the new Api Gateway.**

## Deployment error

-When deploying `make cumulus` trying to add tags to TEA you may see an error like this:
+When deploying `make cumulus` you may see a TEA CloudFormation error like this:

```
Error: updating CloudFormation Stack (arn:aws:cloudformation:us-west-2:343218528358:stack/cxl1-cumulus-cxl-thin-egress-app/67007340-789f-11ee-be75-0a3a3e51928f): ValidationError: Stack:arn:aws:cloudformation:us-west-2:343218528358:stack/cxl1-cumulus-cxl-thin-egress-app/67007340-789f-11ee-be75-0a3a3e51928f is in UPDATE_ROLLBACK_FAILED state and can not be updated.
@@ -34,32 +29,50 @@ If you look at the CloudFormation stack you will see something like this:

![CloudFormation Update Rollback Failed](images/cloudformation_update_rollback_failed.png)
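
The same status check can also be scripted. The sketch below is an illustration only and is not part of CIRRUS: it pulls the stack name and state out of an error message shaped like the one above and builds the matching `aws cloudformation describe-stacks` call. The helper names `parse_stack_error` and `describe_stack_command` are hypothetical, and the sample error text uses made-up identifiers.

```python
import re
import shlex
from typing import List, Optional, Tuple

# Matches "...stack/<name>/<id> is in <STATE> state" in the Terraform error text.
_STACK_STATE = re.compile(r"stack/([^/\s]+)/\S+ is in (\w+) state")


def parse_stack_error(message: str) -> Optional[Tuple[str, str]]:
    """Return (stack_name, state) from the error text, or None if not found."""
    match = _STACK_STATE.search(message)
    return (match.group(1), match.group(2)) if match else None


def describe_stack_command(stack_name: str) -> List[str]:
    """Build the AWS CLI call that prints only the stack status."""
    return [
        "aws", "cloudformation", "describe-stacks",
        "--stack-name", stack_name,
        "--query", "Stacks[0].StackStatus",
        "--output", "text",
    ]


if __name__ == "__main__":
    # Made-up error text in the same shape as the real message.
    error = (
        "ValidationError: Stack:arn:aws:cloudformation:us-west-2:000000000000:"
        "stack/example-thin-egress-app/00000000-0000 is in "
        "UPDATE_ROLLBACK_FAILED state and can not be updated."
    )
    parsed = parse_stack_error(error)
    if parsed:
        name, state = parsed
        print(state)
        print(shlex.join(describe_stack_command(name)))
```

The script only prints the command, so it can be reviewed before it is run against a real account.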

-And if you click on the stack name and look at the Events tab you will see something
-like this:
+And if you click on the stack name and look at the Events tab you will see one of a
+couple of possible scenarios. Both have the same root cause, `invalid stage identifier`, like this:

+![invalid stage identifier](images/invalid_stage_identifier.png)

+Above this error you might see one of a couple of scenarios. `Scenario 1` has update
+failures concerning IAM roles like this:

![CloudFormation update events](images/cloudformation_update_events.png)

-If you look at the Thin Egress Api Gateway you will see that it no longer has a Stage:
+`Scenario 2` mentions failures with lambda functions like this:

-![Thin Egress Api Gateway with no Stage](images/tea_api_gateway_no_stage.png)
+![errors with lambda functions](images/errors_with_lambda_functions.png)

+The steps to correct the issue vary slightly between the two scenarios.

+## How to resolve the errors

-## How to resolve the error
+Both recoveries start by updating your Thin Egress Api Gateway. You will see that it no
+longer has a Stage:

+![Thin Egress Api Gateway with no Stage](images/tea_api_gateway_no_stage.png)

### Add new Api Gateway Stage

-First step in resolving the error is to add a Stage to your Api Gateway matching the
-`$MATURITY` of your deployment. Click on `Create Stage`. In the new window type in
-your MATURITY value and select the latest `Deployment` from the dropdown. All other
-values can be left as their default. Like this:
+Add a Stage to your Api Gateway matching the `$MATURITY` of your deployment. Click on
+`Create Stage`. In the new window type in your MATURITY value and select the latest
+`Deployment` from the dropdown. All other values can be left as their default. Like
+this:

![Create new Api Gateway Stage](images/create_new_api_gateway_stage.png)
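
For DAACs that prefer the command line over the console, the step above can be sketched as follows. This is a hedged illustration, not part of CIRRUS: it only builds the `aws apigateway get-deployments` and `aws apigateway create-stage` calls, and the function names and the placeholder IDs (`abc123`, `dep456`, `dev`) are hypothetical.

```python
import shlex
from typing import List


def get_deployments_command(rest_api_id: str) -> List[str]:
    """Build the AWS CLI call that lists deployments so the latest can be chosen."""
    return ["aws", "apigateway", "get-deployments", "--rest-api-id", rest_api_id]


def create_stage_command(rest_api_id: str, maturity: str, deployment_id: str) -> List[str]:
    """Build the AWS CLI call that recreates the missing stage.

    The stage name must match the MATURITY value of the deployment.
    """
    return [
        "aws", "apigateway", "create-stage",
        "--rest-api-id", rest_api_id,
        "--stage-name", maturity,
        "--deployment-id", deployment_id,
    ]


if __name__ == "__main__":
    # Placeholder values; substitute the IDs from your own Api Gateway.
    print(shlex.join(get_deployments_command("abc123")))
    print(shlex.join(create_stage_command("abc123", "dev", "dep456")))
```

All other stage settings are left at their defaults, which mirrors the console instructions.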

### Rollback Cloudformation using the Advanced option

-In the CloudFormation Stack Options select `Continue update rollback`
+Both scenarios now require you to roll back the CloudFormation stack so another
+`make cumulus` can be run. The scenarios differ in the Advanced troubleshooting options
+that should be selected when running the rollback.

![Continue update rollback](images/cloudformation_continue_update_rollback.png)

+### Scenario 1 roll back

+In the CloudFormation Stack Options select `Continue update rollback`

Select `Advanced troubleshooting`, select the checkboxes to `skip` all the
resources, and then click the `Continue update rollback` button.

@@ -69,6 +82,20 @@ Your stack should now be in the `UPDATE_ROLLBACK_COMPLETE` state

![Update Rollback Complete](images/cloudformation_update_rollback_complete.png)

+### Scenario 2 roll back

+In the CloudFormation Stack Options select `Continue update rollback`

+Select `Advanced troubleshooting`, select the checkboxes to `skip` the Lambda
+resources only (do not select the `Express Stage`), and then click the
+`Continue update rollback` button.

+![Skip Lambda resources](images/skip_lamba_resources.png)

+Your stack should now be in the `UPDATE_ROLLBACK_COMPLETE` state

+![Update Rollback Complete](images/cloudformation_update_rollback_complete.png)
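
The difference between the two roll backs comes down to which failed resources are skipped. The sketch below illustrates that choice and builds the equivalent `aws cloudformation continue-update-rollback` call. It is not part of CIRRUS: the function names and the sample logical IDs are hypothetical, and it assumes that "Lambda resources" means any resource whose CloudFormation type starts with `AWS::Lambda::`.

```python
import shlex
from typing import Iterable, List, Tuple

# (logical resource id, CloudFormation resource type)
Resource = Tuple[str, str]


def resources_to_skip(failed: Iterable[Resource], scenario: int) -> List[str]:
    """Pick the logical IDs to skip for the given scenario.

    Scenario 1 skips every failed resource; scenario 2 skips only Lambda
    resources (assumed here to be types starting with "AWS::Lambda::").
    """
    if scenario == 1:
        return [logical_id for logical_id, _ in failed]
    if scenario == 2:
        return [
            logical_id
            for logical_id, resource_type in failed
            if resource_type.startswith("AWS::Lambda::")
        ]
    raise ValueError(f"unknown scenario: {scenario}")


def continue_rollback_command(stack_name: str, skip: List[str]) -> List[str]:
    """Build the AWS CLI call for 'Continue update rollback'."""
    command = [
        "aws", "cloudformation", "continue-update-rollback",
        "--stack-name", stack_name,
    ]
    if skip:
        command += ["--resources-to-skip", *skip]
    return command


if __name__ == "__main__":
    # Made-up failed resources; read the real ones from the stack's Events tab.
    failed = [
        ("ExampleLambda", "AWS::Lambda::Function"),
        ("ExampleStage", "AWS::ApiGateway::Stage"),
    ]
    skip = resources_to_skip(failed, scenario=2)
    print(shlex.join(continue_rollback_command("example-thin-egress-app", skip)))
```

Printing the command rather than executing it leaves room to compare the skip list against the console before anything is changed.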

### Re-run `make cumulus`

You should now be able to run `make cumulus` successfully.
Binary file added docs/images/errors_with_lambda_functions.png
Binary file added docs/images/invalid_stage_identifier.png
Binary file added docs/images/skip_lamba_resources.png
3 changes: 2 additions & 1 deletion tf/locals.tf
@@ -4,6 +4,7 @@ locals {
aws_account_id_last4 = substr(data.aws_caller_identity.current.account_id, -4, 4)

default_tags = {
-Deployment = local.prefix
+Deployment = local.prefix,
+DAR = "YES"
}
}
2 changes: 1 addition & 1 deletion tf/versions.tf
@@ -1,3 +1,3 @@
terraform {
-required_version = ">= 0.13"
+required_version = ">= 1.5"
}
