Skip to content
This repository has been archived by the owner on Nov 30, 2022. It is now read-only.

Parallelize CI safe checks #717

Merged
merged 2 commits into from
Jun 27, 2022
Merged

Parallelize CI safe checks #717

merged 2 commits into from
Jun 27, 2022

Conversation

sanders41
Copy link
Contributor

@sanders41 sanders41 commented Jun 26, 2022

Purpose

Parallelize the safe CI checks to reduce run times.

Currently the safe CI checks take an average of around 13 minutes to run. I did some testing on a private fork and was getting an average run time of around 8:45 running the jobs in parallel, which is about a 33% reduction in time.

Changes

  • Modify the tests in safe_pr_checks.yml to run in parallel rather than serially.
  • Run linting outside of docker. Doing this further reduces run times. Pylint is the longest running job in this section at about 2 minutes.

Checklist

  • Update CHANGELOG.md file
    • Merge in main so the most recent CHANGELOG.md file is being appended to
    • Add description within the Unreleased section in an appropriate category. Add a new category from the list at the top of the file if the needed one isn't already there.
    • Add a link to this PR at the end of the description with the PR number as the text. example: #1
  • Applicable documentation updated (guides, quickstart, postman collections, tutorial, fidesdemo, database diagram.
  • If docs updated (select one):
    • documentation complete, or draft/outline provided (tag docs-team to complete/review on this branch)
    • documentation issue created (tag docs-team to complete issue separately)
  • Good unit test/integration test coverage
  • This PR contains a DB migration. If checked, the reviewer should confirm with the author that the down_revision correctly references the previous migration before merging
  • The Run Unsafe PR Checks label has been applied, and checks have passed, if this PR touches any external services

Ticket

Fixes #

@ThomasLaPiana
Copy link
Contributor

@sanders41 another solution is to copy how we do it over on fidesctl, but this does definitely have an immediate positive impact as-is 🙂

@sanders41
Copy link
Contributor Author

sanders41 commented Jun 26, 2022

@ThomasLaPiana I tried copying it but was getting errors, I think because of the way the tag is set, IMAGE_TAG := $(shell git fetch --force --tags && git describe --tags --dirty --always).

Because of this I tried the way I have it now, and to my surprise it was significantly faster than building the image once and sharing it. The reason is the upload/down load of the image to storage is slow, and each job could build it faster itself rather than waiting on it to be built, uploaded, then downloading it. Not the result I was expecting, but I couldn't argue with the numbers. With the pre-built image the first post-build job would fail at around 13 minutes. When done the way it is here all tests are done in around 8 minutes 45 seconds.

@seanpreston seanpreston requested review from PSalant726 and removed request for PSalant726 June 27, 2022 15:05
@seanpreston seanpreston self-assigned this Jun 27, 2022
Copy link
Contributor

@PSalant726 PSalant726 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried breaking out the python setup, dependency installation, and database initialization into their own job(s), and making those jobs dependencies of the others that require them? That seems like a nice way to DRY things up and further reduce run times.

Otherwise, this LGTM.

@sanders41
Copy link
Contributor Author

@PSalant726 I am already doing this everywhere that doesn't use a database. I considered also doing this everywhere, i.e. only running external services in Docker (i.e. postgres, redis, etc) and not the app itself. It would make things much faster, but I don't know how everyone feels about doing this. Thoughts?

@seanpreston seanpreston merged commit 2ec8fd2 into main Jun 27, 2022
@seanpreston seanpreston deleted the parallelize-safe-checks branch June 27, 2022 17:33
@seanpreston
Copy link
Contributor

Have merged this now to reap the benefits, we can add only running external services in Docker as a separate incremenn

sanders41 added a commit that referenced this pull request Sep 22, 2022
* Parallelize CI safe checks

* Update CHANGELOG

Co-authored-by: Paul Sanders <pau@ethyca.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants