Skip to content

Add "Prune Deployed Branches" workflow with Python-based logic#205

Merged
litlfred merged 5 commits intomainfrom
copilot/prune-deployed-branches-workflow
Mar 3, 2026
Merged

Add "Prune Deployed Branches" workflow with Python-based logic#205
litlfred merged 5 commits intomainfrom
copilot/prune-deployed-branches-workflow

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Mar 2, 2026

gh-pages accumulates a commit per deployed branch preview with no mechanism to clean up stale entries. This adds a workflow_dispatch-triggered workflow that removes one or all subdirectories under branches/ on gh-pages and squashes the branch history down to a single commit.

Workflow (prune-branches.yml)

  • Single job; triggered manually from the Actions UI
  • Three inputs: target_branch (optional), squash_only (boolean, default false), and confirm (must be the literal string CONFIRM as a safety gate)
  • Sparse-checkouts only prune_branches.py from the default branch, then checks out gh-pages to a separate path — no overlap between the two
  • GitHub Actions inputs passed exclusively via env: block; never interpolated into the shell command string

Script (input/scripts/prune_branches.py)

All logic is in Python; the YAML is a thin launcher.

Three pruning modes:

squash_only target_branch Effect
false (empty) Remove all of branches/, squash history
false my-feature Remove branches/my-feature/, squash history
true (ignored) Keep all of branches/ intact, squash history only

The squash_only mode is a dedicated boolean input rather than a magic sentinel string, so branch names like NONE are handled normally.

Injection / tampering protections:

  • Inputs read from os.environ only — not sys.argv — so shell metacharacters in workflow inputs are inert
  • GIT_ENV_BLOCKLIST strips GIT_DIR, GIT_WORK_TREE, GIT_INDEX_FILE, and related vars from every subprocess call, preventing git-redirection via injected env vars
  • All subprocess calls use explicit list args with shell=False
  • Path traversal rejected in two layers: lexical (.., separators) then canonical (os.path.realpath + os.path.commonpath must stay inside branches/)
  • validate_current_branch() aborts unless git rev-parse --abbrev-ref HEAD returns exactly gh-pages

History squash:
Creates an orphan branch from the post-deletion working tree and force-pushes it to origin/gh-pages, collapsing all accumulated deployment history to a single commit. The squash-only mode uses force=True so the orphan commit is created even when no files were deleted.

Dynamic branch listing:
GitHub Actions workflow_dispatch only supports statically-defined choice lists. As the best practical alternative, the script prints every subdirectory of branches/ at the start of each run so operators can identify valid names before re-running with a specific target.

Original prompt

This section details on the original issue you should resolve

<issue_title>Depopulate gh-pages/branches</issue_title>
<issue_description>How about instead of a script we'd make it a GitHub workflow called "Prune deployed branches" that will remove either everything under the branches/directory under GH pages Branch or a selected feature branch via a workflow dispatch on the GitHub actions user interface.

The workflow should be dispatched from the GitHub user interface. Script should be very cautious and ensure that it is executing on the GH pages Branch and that it is executing in the proper directory

Squash commit to remove the history and do whatever is possible to reduce the size of the branch and the commit log/history and keep the repo small </issue_description>

Comments on the Issue (you are @copilot in this section)


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI and others added 2 commits March 2, 2026 13:27
Co-authored-by: litlfred <662242+litlfred@users.noreply.github.com>
Co-authored-by: litlfred <662242+litlfred@users.noreply.github.com>
Copilot AI changed the title [WIP] Add GitHub workflow to prune deployed branches Add "Prune Deployed Branches" workflow with Python-based logic Mar 2, 2026
@litlfred
Copy link
Copy Markdown
Collaborator

litlfred commented Mar 2, 2026

does this remove the branch history @copilot so that git repo is reduced in size

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 2, 2026

does this remove the branch history @copilot so that git repo is reduced in size

Yes — the squash_and_push function in prune_branches.py uses the orphan-branch technique to collapse the entire gh-pages commit history down to a single commit:

  1. git checkout --orphan gh-pages-squashed — creates a new branch with no parent commits (zero history), containing only the current working tree
  2. One commit is made on that orphan branch
  3. git push --force origin HEAD:gh-pages — replaces gh-pages on GitHub with that single-commit history

After the force-push, all the old deployment commits become unreachable/dangling objects on GitHub's servers. GitHub runs garbage collection automatically on orphaned objects, which permanently removes them and reclaims the storage. There is no way to trigger GitHub's GC manually from a workflow, but the old history is gone from all refs immediately after the push and will be fully purged on the next GC cycle.

The net result: gh-pages goes from potentially thousands of deployment commits to a single commit regardless of how many branches were deployed before.

@litlfred
Copy link
Copy Markdown
Collaborator

litlfred commented Mar 2, 2026

can you also add a prune none - preserves all gh-pages but removes its hustory

@litlfred
Copy link
Copy Markdown
Collaborator

litlfred commented Mar 2, 2026

@copilot

Co-authored-by: litlfred <662242+litlfred@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 2, 2026

can you also add a prune none - preserves all gh-pages but removes its hustory

Done in commit 393b51f. Set target_branch to the literal value NONE in the workflow UI to squash the gh-pages history without deleting any branch previews. The three modes are now:

target_branch value Effect
(empty) Remove all of branches/, squash history
my-feature Remove branches/my-feature/, squash history
NONE Keep all of branches/ intact, squash history only

@litlfred
Copy link
Copy Markdown
Collaborator

litlfred commented Mar 3, 2026

what if there is a branch named NONE ? @copilot

…anch name collision

Co-authored-by: litlfred <662242+litlfred@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 3, 2026

what if there is a branch named NONE ? @copilot

Good catch. Fixed in 1bece91 — the NONE string sentinel is gone. Squash-only mode is now a dedicated squash_only boolean checkbox in the workflow UI, completely independent of target_branch. A branch literally named NONE can now be pruned normally.

The three modes remain:

squash_only target_branch Effect
false (empty) Remove all of branches/, squash history
false my-feature Remove branches/my-feature/, squash history
true (ignored) Keep all of branches/ intact, squash history only

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 3, 2026

✅ Deployment Successful

Branch: copilot/prune-deployed-branches-workflow
Commit: 1bece91
Deployed to: https://WorldHealthOrganization.github.io/smart-base/branches/prune-deployed-branches-workflow

🌐 Open Preview
🔍 View Build Log


Deployment completed successfully! The site is now live at the URL above.

@litlfred litlfred marked this pull request as ready for review March 3, 2026 11:31
Copilot AI review requested due to automatic review settings March 3, 2026 11:31
@litlfred litlfred merged commit c6ddc8c into main Mar 3, 2026
3 checks passed
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a manually triggered GitHub Actions maintenance workflow to clean up stale branch preview deployments under gh-pages/branches/ and optionally squash the gh-pages history down to a single commit to reduce repository size over time.

Changes:

  • Introduces input/scripts/prune_branches.py to validate inputs, delete one/all branches/* previews, and squash+force-push gh-pages.
  • Adds .github/workflows/prune-branches.yml (workflow_dispatch) to run the pruning/squash operation with safety gating (confirm=CONFIRM) and controlled input passing via env.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
input/scripts/prune_branches.py Implements pruning + orphan-commit squash logic with input validation and git subprocess wrappers.
.github/workflows/prune-branches.yml Adds a workflow_dispatch launcher that checks out gh-pages and runs the pruning script with guarded inputs.

Comment on lines +61 to +65
"GIT_CONFIG_NOSYSTEM",
"GIT_CONFIG_COUNT",
"GIT_CONFIG_KEY_0",
"GIT_CONFIG_VALUE_0",
})
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The git env sanitization blocklist only removes GIT_CONFIG_KEY_0 / GIT_CONFIG_VALUE_0, but git supports GIT_CONFIG_KEY_<n>/VALUE_<n> for arbitrary indices (controlled by GIT_CONFIG_COUNT) and other config-related env vars (e.g. GIT_CONFIG_PARAMETERS). As written, injected config entries beyond index 0 would still reach git subprocesses. Consider stripping all env vars with a GIT_CONFIG_ prefix (and any other git-redirection vars you intend to defend against), or adjust the security claims in the docstring to match what’s actually blocked.

Copilot uses AI. Check for mistakes.
Comment on lines +65 to +69
- name: Checkout prune script (default branch)
uses: actions/checkout@v4
with:
sparse-checkout: |
input/scripts/prune_branches.py
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This step is labeled as checking out the prune script from the “default branch”, but actions/checkout will use the workflow’s selected ref unless ref: is set. Because workflow_dispatch lets operators choose a branch, this can end up running a different prune_branches.py than intended. Set ref: ${{ github.event.repository.default_branch }} (or explicitly main) for this checkout, and/or prevent dispatches from non-default branches.

Copilot uses AI. Check for mistakes.
Comment on lines +22 to +26
name: Prune Deployed Branches

on:
workflow_dispatch:
inputs:
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This workflow force-pushes gh-pages, so concurrent manual runs can race and overwrite each other’s result (last writer wins). Add a concurrency group (e.g. keyed on the workflow name) with cancel-in-progress: false to ensure only one prune operation can run at a time.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Depopulate gh-pages/branches

3 participants