CI: testcase prioritization in macOS workflow #7335

Draft
ShubhamDesai wants to merge 2 commits into OSGeo:main from ShubhamDesai:macos_ci

@ShubhamDesai
Contributor

@ShubhamDesai ShubhamDesai commented Apr 19, 2026

Hello everyone,
The intention of this PR is to reduce CI time by implementing test case prioritization in the macOS workflow.
Recently, my paper ["PrioTestCI: Efficient Test Case Prioritization in GitHub Workflows for CI Optimization"](https://ieeexplore.ieee.org/document/11334426) was accepted at the Automated Software Engineering Conference (IEEE/ACM) 2025. In this paper, we applied test case prioritization to a fork of the pytest repository and saw a significant reduction in CI runtime (81.55% on average). Detailed results can be found in the paper.
I believe the same approach could be applied to the GRASS repo, so over the past few months I have been working on implementing it here.
How it works
First run on a PR: all test cases run as usual. The results (passed and failed) are stored as GitHub Artifacts.

On subsequent commits to the same PR:

1. Check the artifacts from the previous run for any failed test cases.
2. If failed tests exist, re-run those first.
   - If they still fail → stop early and provide immediate feedback to the developer.
   - If they now pass → continue running the remaining test cases.

Why run failed tests first? When a contributor pushes follow-up commits, they are usually trying to fix an issue. There is a high chance the same test case will fail again. By running those first, we can stop early and free up the runner so other PRs get a chance to run.
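
The flow described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `run_prioritized` and the `run_test` callback are hypothetical names, and the sketch assumes that all previously failed tests are re-run before deciding whether to stop.

```python
from typing import Callable, Dict, Iterable


def run_prioritized(
    all_tests: Iterable[str],
    previously_failed: Iterable[str],
    run_test: Callable[[str], bool],
) -> Dict[str, object]:
    """Run previously failed tests first and stop early if any still fails.

    run_test is a hypothetical callback that returns True when a test passes.
    """
    all_tests = list(all_tests)
    known = set(all_tests)

    # Re-run only earlier failures that still exist in the current suite,
    # keeping their original order and dropping duplicates.
    first = [t for t in dict.fromkeys(previously_failed) if t in known]
    results = {t: run_test(t) for t in first}

    still_failing = [t for t, passed in results.items() if not passed]
    if still_failing:
        # Early stop: the remaining tests are not executed at all.
        return {"stopped_early": True, "failed": still_failing, "results": results}

    # The earlier failures pass now, so run every test not executed yet.
    for t in all_tests:
        if t not in results:
            results[t] = run_test(t)

    failed = [t for t, passed in results.items() if not passed]
    return {"stopped_early": False, "failed": failed, "results": results}
```

On a first run, `previously_failed` is empty, so the function simply runs the whole suite in its usual order.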
No mixing of results across PRs
Each PR gets its own isolated artifact storage scoped by PR ID. For example, if there are two open PRs (PR-1135 and PR-1127), the artifacts are stored in separate folders (pr_1135/ and pr_1127/). When fetching previous results, we only retrieve the artifacts for that specific PR.
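
As a minimal sketch of this scoping: the `pr_<id>/` folder naming comes from the description above, while the `artifacts` root folder and both function names are assumptions made for illustration.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def artifact_dir(pr_number: int, root: str = "artifacts") -> PurePosixPath:
    """Return the folder holding results for one PR, e.g. artifacts/pr_1135."""
    if pr_number <= 0:
        raise ValueError("pr_number must be a positive integer")
    return PurePosixPath(root) / f"pr_{pr_number}"


def select_artifacts(
    paths: Iterable[str], pr_number: int, root: str = "artifacts"
) -> List[str]:
    """Keep only the stored paths that belong to the given PR.

    Whole path components are compared, so pr_1135 never matches pr_11350.
    """
    prefix = artifact_dir(pr_number, root).parts
    return [p for p in paths if PurePosixPath(p).parts[: len(prefix)] == prefix]
```

Comparing whole path components rather than string prefixes is what keeps PRs with similar numbers from sharing results.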
Deployment strategy — looking for your opinions
The main concern is stability: what if something goes wrong with the prioritization logic and CI itself breaks? I want to minimize downtime and would appreciate your feedback on the following strategies:
Strategy A (Simple rollback): If anything goes wrong, revert to the previous macOS workflow and disable the new one. This is the simplest approach.
Strategy B (Automatic fallback flag): Use a fallback flag that starts as false at the beginning of each run. The workflow performs a series of checks (Is the GitHub API responding? Did the previous artifact download successfully? Did the JSON parse correctly?). If any check fails, the flag flips to true and the workflow automatically falls back to running tests using the original logic — no manual intervention needed. This works independently for each test section (pytest and gunittest), so a failure in one section doesn't affect the other.
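
A rough sketch of how Strategy B could look for one test section. The `fetch` callback and the `{"failed": [...]}` JSON layout are assumptions for illustration and do not describe the actual scripts in this PR.

```python
import json
from typing import Callable, List, Optional, Tuple


def load_previous_failures(
    fetch: Callable[[], Optional[str]],
) -> Tuple[List[str], bool]:
    """Return (failed_tests, fallback) for one test section.

    fetch is a hypothetical callback that downloads the previous results and
    returns them as JSON text, or None when no artifact could be retrieved.
    fallback starts as False and flips to True as soon as any check fails.
    """
    fallback = False
    failed: List[str] = []
    try:
        raw = fetch()  # may raise OSError if the API or download fails
        if raw is None:
            fallback = True
        else:
            data = json.loads(raw)  # raises ValueError on malformed JSON
            failed = [str(t) for t in data["failed"]]
    except (OSError, ValueError, KeyError, TypeError):
        # Any failed check means: ignore prioritization, use the original logic.
        fallback = True
        failed = []
    return failed, fallback
```

Calling this once per section (pytest, gunittest) gives each section its own flag, so a bad artifact in one does not force the other into fallback.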
These are the deployment strategies I have so far. I am also actively looking into other approaches. Please let me know your thoughts and suggestions.
References
📄 Paper: PrioTestCI — IEEE Xplore
📊 Workflow Diagram: View Diagram

@github-actions github-actions bot added the CI (Continuous integration), docs, and markdown (Related to markdown, markdown files) labels Apr 19, 2026
@@ -0,0 +1 @@
This PR intends to implement testcase prioritization in macos workflow.
\ No newline at end of file
[pre-commit] reported by reviewdog 🐶

Suggested change
This PR intends to implement testcase prioritization in macos workflow.
This PR intends to implement testcase prioritization in macos workflow.

@echoix
Member

echoix commented Apr 19, 2026

What about using the "debug logging" flag that you can use for manual reruns in GitHub Actions? I think it might be useful for that kind of fallback, providing an escape hatch.

We might want to have clean runs for the main branch and the release branches that we backport to, especially when doing releases. We would want to avoid cache poisoning.

@echoix
Member

echoix commented Apr 20, 2026

Your idea also reminds me of Codecov ATS (Automated Test Selection). I don't know where it stands now, or whether it still exists at all.

@echoix
Member

echoix commented Apr 20, 2026

I think you forgot to commit some files to your PR

@github-actions github-actions bot added the macOS (macOS specific) label May 15, 2026
name: testreport-macOS
path: testreport
retention-days: 3
retention-days: 3
\ No newline at end of file
[pre-commit] reported by reviewdog 🐶

Suggested change
retention-days: 3
retention-days: 3

@ShubhamDesai
Contributor Author

> I think you forgot to commit some files to your PR

@echoix @wenzeslaus I have pushed the code and tested it against my fork.
If possible, please have a look at it and let me know what you think.
If you have specific scenarios in mind, please test them on your end as well and let me know about any issues.

I have added flags, so if anything goes wrong we can enable them and avoid any downtime.

I also used AI assistance for the scripting, so if there is anything you don't like, please let me know and I will change it.
Also, the gunittest thorough section at the end is the most impactful since it takes around 40 minutes. If you'd prefer a more incremental approach, we could start by applying prioritization only to that section and keep the other sections unchanged.

I am also open to suggestions, thoughts, and criticism.

@ShubhamDesai
Contributor Author

> What about using the "debug logging" flag that you can use for manual reruns in GitHub Actions? I think it might be useful for that kind of fallback, providing an escape hatch.
>
> We might want to have clean runs for the main branch and the release branches that we backport to, especially when doing releases. We would want to avoid cache poisoning.

I haven't implemented this for specific branches yet, but I can do that. We can use flags for that as well.
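
One possible shape for such a branch-aware flag, as a sketch only. The function name, the `disabled` switch, and the `releasebranch_*` pattern are assumptions for illustration; the rule that pushes always run clean is one interpretation of the request above, not something implemented in this PR.

```python
import fnmatch
from typing import Sequence


def prioritization_enabled(
    event_name: str,
    base_ref: str,
    disabled: bool = False,
    clean_base_patterns: Sequence[str] = ("releasebranch_*",),
) -> bool:
    """Decide whether a run may reuse results from an earlier run.

    disabled is a hypothetical manual escape hatch. clean_base_patterns lists
    base branches (assumed naming) whose PRs should always get a full run.
    """
    if disabled:
        return False
    # Pushes to main or to release branches are not PR runs: always run clean.
    if event_name != "pull_request":
        return False
    # Backport PRs that target a protected branch also get a full run.
    return not any(fnmatch.fnmatchcase(base_ref, p) for p in clean_base_patterns)
```

In a workflow, `event_name` and `base_ref` could be read from the `GITHUB_EVENT_NAME` and `GITHUB_BASE_REF` environment variables that GitHub Actions sets by default.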
