Skip to content

Conversation

@potiuk
Copy link
Member

@potiuk potiuk commented Apr 20, 2022

We have the new Python-based breeze and we want to replace all
the functionality used from the old Breeze with the new one.

The most important is image management. It is used in many
places and this PR replaces all the places and removes image
builidng, pushing, pulling to use the new Breeze everywhere.

That includes:

  • Building and pushing CI image on CI
  • Building and pushing PROD image on CI which includes:
    • building Airflow packages
    • building Provider packages
    • waiting for images in parallel (both CI and and PROD)
    • veifying the images (both CI and PROD)

All those commands have been moved to breeze.py and that required
some modification in the Breeze parameter handling - mainly related
to adding spaces when help was displayed because long list of
packages looked very bad in help output.

It's been easier to implement it in one big PR as using
image building and pulling was deeply embedded in many scripts.

With this change all the scripts in CI use directly breeze
and combine using command line parameters with evn variables
directly in the job that execute the breeze commands which
makes it much easier to understand what is going on and
repeat it locally using Breeze.

Fixes: #22825
Fixes: #23077
Fixes: #23076
Fixes: #22829
Fixes: #22828
Fixes: #22826
Fixes: #20961
Fixes: #23102


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@potiuk potiuk changed the title Use new Breese for building, pulling and verifying the images. Use new Breeze for building, pulling and verifying the images. Apr 20, 2022
@potiuk potiuk force-pushed the move-package-preparation-to-new-breeze branch 5 times, most recently from b758f81 to 24d157a Compare April 20, 2022 17:37
@potiuk potiuk marked this pull request as ready for review April 20, 2022 17:56
@potiuk potiuk requested review from dstandish and eladkal April 20, 2022 17:56
@potiuk potiuk force-pushed the move-package-preparation-to-new-breeze branch from 24d157a to 9e9c5cd Compare April 20, 2022 18:08
@potiuk
Copy link
Member Author

potiuk commented Apr 20, 2022

Apologies for such a HUGE PR, but I could not find a better way of replacing old with new Breeze's build/pull/push image.

I could keep on replacing piece by piece but keeping the two in parallel in CI was really burdensome. That's why I replaced all the related build/pull/push tasks in one go. I am still testing it (I will do some tests with "main" push to my own repo). But it should be generally working.

It removes another 1000ish lines of bash code (YAY!) and paves the way to remove nearly all of them in subsequent PRs.

Most of all It unifies the way how CI and local development works - rather than having scripts on CI and local breeze commands, everythong is done via python-based breeze commands (including running stuff in parallel on CI which does not use Gnu Parallel but Python Multiprocessing)

It's way neater, nicere and much easier to reproduce locally. Example - most complex "building PROD image" is this:

      - name: "Build & Push PROD image ${{ matrix.python-version }}:${{ env.IMAGE_TAG_FOR_THE_BUILD }}"
        run: |
          rm -f "./dist/"*.{whl,tar.gz}
          rm -f "./docker-context-files/"*.{whl,tar.gz}
          breeze prepare-provider-packages \
              --package-list-file scripts/ci/installed_providers.txt" \
              --package-format wheel
          breeze prepare-airflow-package --package-format wheel
          mv "./dist/"*.{whl,tar.gz} "./docker-context-files"
          breeze build-prod-image --push-image --install-from-docker-context-files
        env:
          PYTHON_MAJOR_MINOR_VERSION: ${{ matrix.python-version }}
          UPGRADE_TO_NEWER_DEPENDENCIES: ${{ needs.build-info.outputs.upgradeToNewerDependencies }}
          DOCKER_CACHE: ${{ needs.build-info.outputs.cacheDirective }}
          IMAGE_TAG: ${{ env.IMAGE_TAG_FOR_THE_BUILD }}

Which you can very easily reproduce locally with the new breeze. It combines (for clarity of the commands) both evn-vars and flags, but all the relevant env vars used are now added directly in each step rather than in job or in workflow, which makes it much more obvious what's going on.

@mik-laj - I also removed the "special" preparation of virtualenv for docker tests - the docker tests now can be executed via breeze command and they will automatically use breeze's virtualenv (which already containts pytest, pytest-xdistts and requests). Same with "imaage-verification". With the new "verify-image" and "verify-prod-image" you can verify both "breeze-built" image as well as you can specify the image you want to test manually (say breeze verify-prod-image --image-name apache/airfow:2.2.5

Ach and spaces added to separate options improved the readability of "choice" fields vastly in help.

@potiuk
Copy link
Member Author

potiuk commented Apr 20, 2022

This is the currernt set of commands supported now:

image

@potiuk
Copy link
Member Author

potiuk commented Apr 20, 2022

cc: @Bowrna @edithturn -> getting much closer to replace all of it with the new Breeze :)

Copy link
Contributor

@eladkal eladkal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is massive!
🎉

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If you plan to build a number of images, probably better solution is to set up a hardware remote builder
If you plan to build a number of images, probably better solution is to set up a hardware remote builder

@potiuk
Copy link
Member Author

potiuk commented Apr 20, 2022

This is massive! tada

Oh yeah. Unfortunately. Now You know why your PRs were neglected a bit recently ;). The nice thing about it that it really implements all the remaining bits of the "breeze" infrastructure (for example running stuff in parallel is now embedded in some Breeze's commands which make the maintenance stuff much nicer and easier to perform by anyone without too much of a learning curve. After we finish - pretty much everything what happens in CI will be just the right sequence of breeze commands.

@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Apr 20, 2022
@potiuk potiuk force-pushed the move-package-preparation-to-new-breeze branch 3 times, most recently from a61c0f8 to 7b1c13b Compare April 21, 2022 00:31
@potiuk potiuk force-pushed the move-package-preparation-to-new-breeze branch 14 times, most recently from f3074aa to 261b7a1 Compare April 22, 2022 20:37
We have the new Python-based breeze and we want to replace all
the functionality used from the old Breeze with the new one.

The most important is image management. It is used in many
places and this PR replaces all the places and removes image
builidng, pushing, pulling to use the new Breeze everywhere.

That includes:

* Building and pushing CI image on CI
* Building and pushing PROD image on CI which includes:
  * building Airflow packages
  * building Provider packages
  * waiting for images in parallel (both CI and and PROD)
  * veifying the images (both CI and PROD)

All those commands have been moved to breeze.py and that required
some modification in the Breeze parameter handling - mainly related
to adding spaces when help was displayed because long list of
packages looked very bad in help output.

It's been easier to implement it in one big PR as using
image building and pulling was deeply embedded in many scripts.

With this change all the scripts in CI use directly breeze
and combine using command line parameters with evn variables
directly in the job that execute the breeze commands which
makes it much easier to understand what is going on and
repeat it locally using Breeze.

Fixes: apache#22825
Fixes: apache#23077
Fixes: apache#23076
Fixes: apache#22829
Fixes: apache#22828
Fixes: apache#22826
Fixes: apache#20961
Fixes: apache#23102
Fixes: apache#21098
@potiuk potiuk force-pushed the move-package-preparation-to-new-breeze branch from 261b7a1 to 065a5c7 Compare April 22, 2022 20:38
@potiuk
Copy link
Member Author

potiuk commented Apr 23, 2022

All right. All green. Merging.

I tested:

That should cover all scenarios - there is slight chance of problems with cache refreshing but we can handle it if we see a problem.

@potiuk potiuk merged commit 8b6b084 into apache:main Apr 23, 2022
@potiuk potiuk deleted the move-package-preparation-to-new-breeze branch April 23, 2022 07:12
potiuk added a commit to potiuk/airflow that referenced this pull request Apr 23, 2022
This is an aftermath of apache#23104 after switchig to docs building
by breeze, failure of build documentation did not trigger failure
of the docs build (but it did trigger main failure of pushing
the documentation).

This change improves and simplifies the return code processing and
propagation in the commands executed by breeze - thanks to common
returncode, stdout, stderr available in both CompletedProcess
and CalledProcessError and returning fake CompletedProcess in dry_run
mode, we can also satisfy MyPy type check by returning non-optional
Union of those two types which simplifies returncode processing.

This change fixes the error in the docs (lack of empty lines before
auto-generated extras).

All commands have been reviewed to see if the returncode is
correctly handled where needed.
potiuk added a commit that referenced this pull request Apr 23, 2022
This is an aftermath of #23104 after switchig to docs building
by breeze, failure of build documentation did not trigger failure
of the docs build (but it did trigger main failure of pushing
the documentation).

This change improves and simplifies the return code processing and
propagation in the commands executed by breeze - thanks to common
returncode, stdout, stderr available in both CompletedProcess
and CalledProcessError and returning fake CompletedProcess in dry_run
mode, we can also satisfy MyPy type check by returning non-optional
Union of those two types which simplifies returncode processing.

This change fixes the error in the docs (lack of empty lines before
auto-generated extras).

All commands have been reviewed to see if the returncode is
correctly handled where needed.
potiuk added a commit to potiuk/airflow that referenced this pull request Apr 23, 2022
This is another small aftermath after the apache#23104 - this could not
be tested during PRs because generate-constraints only run in
main in apache/airflow repo and a problem crept in that I have
forgotten to add --run-in-parallel for those breeze commands,
which resulted in missing the python version to generate
constraints for.

This change adds --run-in-parallell and list of python versions
to work on so that generate constraints might start work again.
potiuk added a commit that referenced this pull request Apr 24, 2022
This is another small aftermath after the #23104 - this could not
be tested during PRs because generate-constraints only run in
main in apache/airflow repo and a problem crept in that I have
forgotten to add --run-in-parallel for those breeze commands,
which resulted in missing the python version to generate
constraints for.

This change adds --run-in-parallell and list of python versions
to work on so that generate constraints might start work again.
@jedcunningham jedcunningham added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Apr 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:dev-tools changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) full tests needed We need to run full set of tests for this PR to merge

Projects

None yet

3 participants