Skip to content

Conversation

@vab
Copy link
Contributor

@vab vab commented Mar 3, 2025

Purpose and background context

This PR adds the infrastructure deploy workflows to the this application, dsc. It adds new github automation, and updates the project's Makefile with the new workflows.

How this addresses that need:

* Add dev build workflow
* Add stage build workflow
* Add prod promotion workflow
* Update Makefile w/ new workflows
* Add CODEOWNERS
* Update README.md

Side effects of this change:

None

Changes to be committed:
new file: .github/CODEOWNERS
new file: .github/workflows/dev-build.yml
new file: .github/workflows/prod-promote.yml
new file: .github/workflows/stage-build.yml
modified: Makefile
modified: README.md

How can a reviewer manually see the effects of these changes?

There will be no immediate effect. However, after the PR is merged with main, future code updates should allow the running of the deployment workflows.

Includes new or updated dependencies?

NO

Changes expectations for external applications?

NO

What are the relevant tickets?

https://mitlibraries.atlassian.net/browse/IN-1148

Developer

  • All new ENV is documented in README
  • All new ENV has been added to staging and production environments
  • All related Jira tickets are linked in commit message(s)
  • Stakeholder approval has been confirmed (or is not needed)

Code Reviewer(s)

  • The commit message is clear and follows our guidelines (not just this PR message)
  • There are appropriate tests covering any new functionality
  • The provided documentation is sufficient for understanding any new functionality introduced
  • Any manual tests have been performed or provided examples verified
  • New dependencies are appropriate or there were no changes

Why these changes are being introduced:

This commit adds the infrastructure deploy workflows to the this application,
asati. It adds new github automation, and updates the project's Makefile with
the new workflows. It also updates repo documentation.

How this addresses that need:

    * Add dev build workflow
    * Add stage build workflow
    * Add prod promotion workflow
    * Update Makefile w/ new workflows
    * Add CODEOWNERS
    * Update README.md

Side effects of this change:

None

Changes to be committed:
      new file:   .github/CODEOWNERS
      new file:   .github/workflows/dev-build.yml
      new file:   .github/workflows/prod-promote.yml
      new file:   .github/workflows/stage-build.yml
      modified:   Dockerfile
      modified:   Makefile
      modified:   README.md
@vab vab self-assigned this Mar 3, 2025
Dockerfile Outdated
RUN pip install --no-cache-dir --upgrade pip pipenv

RUN apt-get update && apt-get upgrade -y && apt-get install -y git
RUN apt-get update && apt-get dist-upgrade -y && apt-get autoremove -y && apt-get install -y git

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since DataEng is building off the python:3.12-slim container, I don't think it's necessary to do the dist-upgrade here. I would leave any decisions regarding changes to the Dockerfile to the data engineers, especially @ghukill.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend that they do a dist-upgrade and autoremove. The dist-upgrade will pick up changes related, through dependencies, to any packages that would be updated through an apt-get upgrade. This will help avoid ABI/API conflicts between packages and libraries. The autoremove command will uninstall packages that have been superseded during the upgrade or dist-upgrade process. This will save disk space (helping control our costs) and prevent either leaving insecure software installed in the container or a developer accidentally using software that has been replaced.

Performing our Dockerfile updates this way will bring us into line with the way AWS does their OS/Container updates. This would help prevent a situation where we could have containers that differ from our previous containers in installed software if we do a completely fresh build of a project or clone a project.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is supposed to be related just to the additional of the GitHub Actions workflows for publishing container images to AWS ECR, so any change to DataEng's Dockerfile feels out-of-scope for this PR.

Specific to this line in the Dockerfile, DataEng is using a pre-built Python container (python:3.12-slim), not a full Ubuntu distribution. So,the container is already highly optimized (see the Dockerfile that actually creates the source image here).

It does make sense to me to do apt-get upgrade to catch any out-of-date packages, but an apt-get dist-upgrade goes beyond what is necessary for the Python application to actually run.

The dependencies that matter for the Python application are all handled by pipenv install line below (since DataEng is tracking all their Python dependencies in the Pipfile in their application code).

Copy link

@ghukill ghukill Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for raising this up @vab , and the detailed reponse @cabutlermit. I agree with Christopher's last three points about a specific image used, likely not needing to think/touch the underlying distribution, and that we spend quite a bit of time in our application code managing dependencies.

To my thinking -- and perhaps this is incorrect -- by tethering ourselves closely to the Docker image (and whatever updates it may receieve, knowing we get the updated form each CI build), we are effectively saying that our dependencies work with this particular image more than python 3.12 or Ubuntu 22.04 (arbitrary examples) more generally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ghukill I have made the requested change. Can you approve this PR for merge? It will unblock me on some tickets. Thanks!

Why these changes are being introduced:

This commit makes a minor CODEOWNERS updated based on PR review.

Side effects of this change:

None

Changes to be committed:
      modified:   .github/CODEOWNERS
@coveralls
Copy link

coveralls commented Mar 3, 2025

Pull Request Test Coverage Report for Build 13706817815

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 95.251%

Totals Coverage Status
Change from base Build 13596240336: 0.0%
Covered Lines: 722
Relevant Lines: 758

💛 - Coveralls

Why these changes are being introduced:

This commit makes a minor update to the README.md file that removes this
repository's classification as an infrastructure repository.

Side effects of this change:

None

Changes to be committed:
      modified:   README.md
@vab vab requested a review from ghukill March 3, 2025 18:28
vab added 2 commits March 6, 2025 14:33
Why these changes are being introduced:

Reversion of Dockerfile requested in code review.

Side effects of this change:

None

Changes to be committed:
      modified:   Dockerfile
Why these changes are being introduced:

Revert changes to Dockerfile requested by code review.

Side effects of this change:

None

Changes to be committed:
      modified:   Dockerfile
@vab vab merged commit 7b119cc into main Mar 6, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants