Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert change detection to a Python script #129627

Merged
merged 14 commits into from
Feb 5, 2025

Conversation

AA-Turner
Copy link
Member

The change detection workflow is becoming increasingly complex.

We have grown from a fairly simple grep command to skip documentation files (#19983) to a large and complex shell script (embedded within a YAML document), indeed one sufficiently complex to merit a dedicated workflow file (#122336)

A potted history of significant revisions is thus:

There are further proposed changes, such as skipping Windows tests on changes to the Unix build configuration:

Even having recently improved readability of the central grep command (#128754), this workflow remains difficult to correctly modify (#128450).

This PR converts the core logic into a short Python script, Tools/build/compute-changes.py, which determines which workflows to run. I imagine this will make it easier to introduce future conditional workflows, which we should probably adopt more of to reduce time and resources spent waiting for CI.

I have reused the "changed files" logic introduced in #108065, meaning we can also combine the duplicative MSI and Docs changes steps, which reduces the overall work done.

I've tested this quite a bit on my fork and detection works well, as does workflow dispatch.

A

cc @webknjaz (sorry for the ping; I can't request-review)

@AA-Turner
Copy link
Member Author

@webknjaz
Copy link
Contributor

webknjaz commented Feb 4, 2025

I'll try to review later today. Thanks for the ping!

Copy link
Member

@hugovk hugovk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this! I was thinking of something like this given the wrangling of #128450 :)

Mostly nits here:

AA-Turner and others added 3 commits February 4, 2025 10:55
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
* Rename `config_hash` to `config-hash`
* Rename `run_tests` to `run-tests`
* Rename `run-win-msi` to `run-windows-msi`
* Rename `run_hypothesis` to `run-hypothesis`
* Rename `run_cifuzz` to `run-ci-fuzz`
Co-authored-by: 🇺🇦 Sviatoslav Sydorenko (Святослав Сидоренко) <wk.cvs.github@sydorenko.org.ua>
@AA-Turner AA-Turner requested a review from webknjaz February 5, 2025 12:52
@webknjaz
Copy link
Contributor

webknjaz commented Feb 5, 2025

@AA-Turner were the CI jobs cancelled by accident? It's not immediately obvious what happened in there…

else
echo "Branch too old for CIFuzz tests; or no C files were changed"
echo "run-cifuzz=false" >> "$GITHUB_OUTPUT"
fi
- name: Compute hash for config cache key
id: config-hash
run: |
echo "hash=${{ hashFiles('configure', 'configure.ac', '.github/workflows/build.yml') }}" >> "$GITHUB_OUTPUT"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this would belong in the same script… In the past I'd just use sha512 to compute hashes in python: https://github.com/ansible/awx-plugins/blob/0d569b5/.github/workflows/ci-cd.yml#L222C16-L222C52.

If not, I'd question if this reusable workflow should even be called “change detection”. I think I tend to call the computation job “pre-setup” or something, since the changes isn't the only thing being detected…

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd question if this reusable workflow should even be called “change detection”.

"reusable-choose-workflows"?

I wonder if this would belong in the same script

We could move it to compute-changes.py, sure. It seems it should be possible to replicate the same output as hashFiles

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"reusable-choose-workflows"?

"reusable-build-settings"? "reusable-settings"? "reusable-workflow-run-context"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could move it to compute-changes.py, sure. It seems it should be possible to replicate the same output as hashFiles

Totally, although, it doesn't really matter if it's the same given the context where it's being used. It just has to be unique and predictable. Hashing is only used because you can't put entire file contents into cache keys 🤷‍♂️

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we change the key, all current caches will be invalidated?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but they won't be picked up after this PR anyway because they depend on the contents of .github/workflows/build.yml which you're changing here. You're already invalidating the cache.

@AA-Turner AA-Turner merged commit 7d9a22f into python:main Feb 5, 2025
41 checks passed
@AA-Turner AA-Turner deleted the change-detection-py branch February 5, 2025 17:19
@cmaloney
Copy link
Contributor

cmaloney commented Feb 5, 2025

As a note, this change made it so PRs need to merge in main to get the script otherwise the github workflow tries running a script which doesn't exist and fails. I really like having it in Python, but is a hazard that breaking changes to the script may break outstanding PRs until they merge in main.

@AA-Turner
Copy link
Member Author

Do you have a link to an example run? Not ideal, but it should be fixable by clicking the update branch button.

@cmaloney
Copy link
Contributor

cmaloney commented Feb 5, 2025

Update branch button definitely worked for my case and was straightforward to figure out and get working.

#129560 is a PR opened before this change, I made more commits and pushed, that caused this run after compute-changes.py was committed https://github.com/python/cpython/actions/runs/13163983268/job/36739530516, then I did "update branch" and https://github.com/python/cpython/actions/runs/13164025409/job/36739667617 run happened and succeeded

@webknjaz
Copy link
Contributor

webknjaz commented Feb 6, 2025

That's weird. actions/checkout usually gets the merged-in changes into jobs..

@hugovk
Copy link
Member

hugovk commented Feb 6, 2025

Yeah, we often have to "Update branch" after making CI checks stricter, for example, recently when making Blurb validation stricter.

@miss-islington-app
Copy link

Thanks @AA-Turner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

@miss-islington-app
Copy link

Thanks @AA-Turner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

@miss-islington-app
Copy link

Sorry, @AA-Turner, I could not cleanly backport this to 3.12 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 7d9a22f50923309955a2caf7d57013f224071e6e 3.12

@miss-islington-app
Copy link

Sorry, @AA-Turner, I could not cleanly backport this to 3.13 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 7d9a22f50923309955a2caf7d57013f224071e6e 3.13

@bedevere-app
Copy link

bedevere-app bot commented Feb 20, 2025

GH-130367 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Feb 20, 2025
@bedevere-app
Copy link

bedevere-app bot commented Feb 20, 2025

GH-130370 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Feb 20, 2025
@webknjaz
Copy link
Contributor

Yeah, we often have to "Update branch" after making CI checks stricter, for example, recently when making Blurb validation stricter.

I meant that this is supposed to be happening automatically. Just restarting the jobs should be enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants