New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve informational logging for migration steps #1568

Merged
merged 11 commits into from Jan 16, 2019

Conversation

2 participants
@chrisgilmerproj
Copy link
Contributor

chrisgilmerproj commented Jan 9, 2019

Description

Fairly often we get errors like this: https://circleci.com/gh/transcom/mymove/47693

It's hard to know what's going on in these scripts because they are completely silent, which means you can't easily tell at what point the script failed. I've gone through two scripts and added information that should make this more clear and included some helpful error parsing information.

Reviewer Notes

Do the messages make sense?

Code Review Verification Steps

  • Request review from a member of a different team.
  • Have the Pivotal acceptance criteria been met for this change?

References

@chrisgilmerproj chrisgilmerproj self-assigned this Jan 9, 2019

echo
aws ecs describe-tasks --tasks "$task_arn" --cluster "$cluster"
echo
show_logs "$task_arn"
echo
echo "If the log stream does not exist check the task description above for 'stoppedReason'"

This comment has been minimized.

@chrisgilmerproj

chrisgilmerproj Jan 9, 2019

Contributor

We often get the message:

An error occurred (ResourceNotFoundException) when calling the GetLogEvents operation: The specified log stream does not exist.

This confuses people as to the reason for the failure. I hoped that providing this line would make it more clear what to do.

command: bin/do-exclusively --job-name ${CIRCLE_JOB} bin/rds-snapshot-app-db $APP_ENVIRONMENT
- run:
name: Run migrations
command: bin/do-exclusively --job-name ${CIRCLE_JOB} bin/ecs-run-app-migrations-container config/app-migrations.container-definition.json ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com/app-migrations:git-${CIRCLE_SHA1} $APP_ENVIRONMENT

This comment has been minimized.

@chrisgilmerproj

chrisgilmerproj Jan 9, 2019

Contributor

I think this is a good change but is there an issue with spitting it into two steps?

This comment has been minimized.

@pjdufour-truss

pjdufour-truss Jan 9, 2019

Contributor

No bueno, if the snapshot fails, wont it still run the migrations?

This comment has been minimized.

@chrisgilmerproj

chrisgilmerproj Jan 9, 2019

Contributor

I was looking into that. I think I confirmed that time doesn't swallow error messages. The error we were seeing was really a failure of ECS to obtain the correct network settings to set up the task for migration. However, its probably worth some more investigation into the snapshot code to ensure that when it fails we don't continue any processes.

This comment has been minimized.

@chrisgilmerproj

chrisgilmerproj Jan 9, 2019

Contributor

The new check looks at the snapshot description and compares the status text.

@chrisgilmerproj chrisgilmerproj requested a review from pjdufour-truss Jan 9, 2019

@chrisgilmerproj

This comment has been minimized.

Copy link
Contributor

chrisgilmerproj commented Jan 9, 2019

I deployed this to experimental and I think that it worked as expected: https://circleci.com/gh/transcom/mymove/53468

@chrisgilmerproj

This comment has been minimized.

Copy link
Contributor

chrisgilmerproj commented Jan 16, 2019

@chrisgilmerproj

This comment has been minimized.

Copy link
Contributor

chrisgilmerproj commented Jan 16, 2019

Successful deploy!

@chrisgilmerproj chrisgilmerproj merged commit 3c48591 into master Jan 16, 2019

12 checks passed

ci/circleci: acceptance_tests Your tests passed on CircleCI!
Details
ci/circleci: build_app Your tests passed on CircleCI!
Details
ci/circleci: build_migrations Your tests passed on CircleCI!
Details
ci/circleci: build_tools Your tests passed on CircleCI!
Details
ci/circleci: client_test Your tests passed on CircleCI!
Details
ci/circleci: integration_tests_mymove Your tests passed on CircleCI!
Details
ci/circleci: integration_tests_office Your tests passed on CircleCI!
Details
ci/circleci: integration_tests_tsp Your tests passed on CircleCI!
Details
ci/circleci: pre_deps_golang Your tests passed on CircleCI!
Details
ci/circleci: pre_deps_yarn Your tests passed on CircleCI!
Details
ci/circleci: pre_test Your tests passed on CircleCI!
Details
ci/circleci: server_test Your tests passed on CircleCI!
Details

@chrisgilmerproj chrisgilmerproj deleted the cg_162792722_fix_rds_snapshot_script branch Jan 16, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment