Improve Docker stack notifications further and fix ES indexing when DB is not refreshed #6045

Merged: 7 commits, Sep 28, 2020
27 changes: 15 additions & 12 deletions Jenkinsfile
@@ -16,6 +16,7 @@ pipeline {
 STACK_PREFIX = 'cfgov'
 NOTIFICATION_CHANNEL = 'cfgov-deployments'
 LAST_STAGE = 'Init'
+DEPLOY_SUCCESS = false
 }

 parameters {
@@ -43,6 +44,7 @@ pipeline {
 steps {
 script {
 env.STACK_NAME = dockerStack.sanitizeStackName("${env.STACK_PREFIX}-${JOB_BASE_NAME}")
+env.STACK_URL = dockerStack.getStackUrl(env.STACK_NAME)
 env.CFGOV_HOSTNAME = dockerStack.getHostingDomain(env.STACK_NAME)
 env.IMAGE_NAME_LOCAL = "${env.IMAGE_REPO}:${env.IMAGE_TAG}"
 env.IMAGE_NAME_ES_LOCAL = "${env.IMAGE_ES_REPO}:${env.IMAGE_TAG}"
@@ -125,6 +127,7 @@ pipeline {
 timeout(time: 30, unit: 'MINUTES') {
 dockerStack.deploy(env.STACK_NAME, 'docker-stack.yml')
 }
+DEPLOY_SUCCESS = true
 }
 echo "Site available at: https://${CFGOV_HOSTNAME}"
 }
@@ -152,22 +155,22 @@ pipeline {
 post {
 success {
 script {
-if (env.GIT_BRANCH != 'main') {
-notify("${NOTIFICATION_CHANNEL}", ":white_check_mark: [**${env.GIT_BRANCH}**](${env.CHANGE_URL}) by ${env.CHANGE_AUTHOR} deployed via [Jenkins](${env.BUILD_URL}) and available at https://${env.CFGOV_HOSTNAME}/")
-}
-else {
-notify("${NOTIFICATION_CHANNEL}", ":white_check_mark: **main** branch stack deployed via [Jenkins](${env.BUILD_URL}) and available at https://${env.CFGOV_HOSTNAME}/")
-}
+author = env.CHANGE_AUTHOR ? "by ${env.CHANGE_AUTHOR}" : "branch"
+changeUrl = env.CHANGE_URL ? env.CHANGE_URL : env.GIT_URL
+notify("${NOTIFICATION_CHANNEL}",
+""":white_check_mark: **${STACK_PREFIX} [${env.GIT_BRANCH}]($changeUrl)** $author [deployed](https://${env.CFGOV_HOSTNAME}/)!
+\n:jenkins: [Details](${env.RUN_DISPLAY_URL}) :mantelpiece_clock: [Pipeline History](${env.JOB_URL}) :docker-dance: [Stack URL](${env.STACK_URL}) """)
 }
 }

 unsuccessful {
 script{
-if (env.GIT_BRANCH != 'main') {
-notify("${NOTIFICATION_CHANNEL}", ":x: [**${env.GIT_BRANCH}**](${env.CHANGE_URL}) by ${env.CHANGE_AUTHOR} failed at stage **${LAST_STAGE}** \n:jenkins-devil: [Failure Details](${env.RUN_DISPLAY_URL}) :mantelpiece_clock: [Pipeline History](${env.JOB_URL})")
-}
-else {
-notify("${NOTIFICATION_CHANNEL}", ":x: **main** branch stack deployment failed at stage **${LAST_STAGE}** \n:jenkins-devil: [Failure Details](${env.RUN_DISPLAY_URL}) :mantelpiece_clock: [Pipeline History](${env.JOB_URL})")
-}
+author = env.CHANGE_AUTHOR ? "by ${env.CHANGE_AUTHOR}" : "branch"
+changeUrl = env.CHANGE_URL ? env.CHANGE_URL : env.GIT_URL
+deployText = DEPLOY_SUCCESS ? "[deployed](https://${env.CFGOV_HOSTNAME}/) but failed" : "failed"
+notify("${NOTIFICATION_CHANNEL}",
+""":x: **${STACK_PREFIX} [${env.GIT_BRANCH}]($changeUrl)** $author $deployText at stage **${LAST_STAGE}**
+\n:jenkins-devil: [Details](${env.RUN_DISPLAY_URL}) :mantelpiece_clock: [Pipeline History](${env.JOB_URL}) :docker-dance: [Stack URL](${env.STACK_URL}) """)
 }
 }
 }
3 changes: 2 additions & 1 deletion docker-stack.yml
@@ -87,8 +87,9 @@ services:

 if [ $$django_tables_exist -gt 0 ] || [ $$REFRESH_DB == 'true' ]; then
 ./refresh-data.sh
-./cfgov/manage.py update_index
 fi
+
+./cfgov/manage.py update_index
Member

@marcesher is this a deliberate change?

Contributor Author

@chosak Yes. I added an explanation to the commit but haven't added it to the PR yet. I discovered the bug while testing the notification changes. I will update the PR description with the explanation momentarily.

Contributor Author

@chosak If you'd prefer I pull that into a separate PR, I'm happy to do so. Whatever the team norms are for a case like this, I'll gladly abide.

Member

I'll defer to others more familiar with the Elasticsearch changes -- I don't know enough about those tests to understand why this would be necessary again. Is it because the ES content gets wiped out with each redeploy, unlike the Postgres content (if REFRESH_DB is unset)?

Contributor Author (@marcesher, Sep 28, 2020)

@chosak That's exactly it. My understanding is that prior to #6014, the ES image would not be redeployed, so its indexes would persist across multiple builds of a given PR, and the logic in docker-entrypoint.sh made sense. #6014 changed that, but I failed to account for it in docker-entrypoint.sh, which is why post-deploy functional tests have been consistently failing on the steps that involve searching. It also explains why those tests pass on Build 1 but fail on subsequent builds.

I did talk with @Scotchester this morning about an alternative approach, which we use for Collab: build the ES image in a separate repo, on a separate cadence (daily? weekly?), and change the docker-stack.yml file to pull that image (from internal hub), similar to how it used to pull from external hub. I haven't thought hard enough, though, about how well the Collab model would work in cf.gov's case. We'd need to discuss it as a team.

Perhaps we merge this change for now, to alleviate all the functional test failures, and then take up the alternate approach separately?
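
The ordering change described in this comment can be sketched in isolation. This is a minimal illustration, not the actual entrypoint: `plan_startup_steps` is a hypothetical helper that only prints the names of the steps that would run, and its two arguments stand in for `django_tables_exist` and `REFRESH_DB` from the diff. The condition is mirrored from the diff as written, without interpreting what a positive `django_tables_exist` means.

```shell
#!/bin/sh
# Hypothetical sketch: print the startup steps that would run, one per line.
#   $1 - stand-in for django_tables_exist (an integer)
#   $2 - stand-in for REFRESH_DB ("true" or anything else)
plan_startup_steps() {
    tables_flag="$1"
    refresh_db="$2"

    # The data refresh stays conditional, mirroring the diff.
    if [ "$tables_flag" -gt 0 ] || [ "$refresh_db" = "true" ]; then
        echo "refresh-data"
    fi

    # Indexing now runs on every start, because a redeployed ES image
    # does not carry over the indexes built by earlier builds.
    echo "update_index"
}

# No refresh requested: only the indexing step is planned.
plan_startup_steps 0 false
# Refresh requested: the data refresh is planned first, then indexing.
plan_startup_steps 0 true
```

Keeping the indexing step outside the conditional trades a longer startup for search tests that no longer depend on which build number is running, which is the trade-off discussed in the replies below.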

Contributor

The good news is that updating the index will not take long if there are no changes to it (if I remember correctly).

Contributor

Wait 🤦 that's not true at all if the image is rebuilt and there is no index. I do think we need to take a look at that alternate approach sooner rather than later.

Contributor Author

Roger that, Scott. Just this morning, we prioritized solving the Jenkins job-dsl rebuild problem that we discussed elsewhere. I anticipated it would be a tough nut to crack, but fortunately we solved it unexpectedly quickly. That serendipitously leaves an opening for us to prioritize this in our current sprint as well. I'll talk to the team about it. 🙌

Would you be OK merging this PR now, knowing that we'll prioritize the alternate approach this sprint?

Contributor

Yeah, for sure! Merge when ready :)


httpd -d cfgov/apache -D FOREGROUND"
