fix: clean up leftover gh-pages branch files on deploy (#204) #207

Merged
JohannesHoppe merged 2 commits into main from fix/issue-204-submodule-cleanup
Apr 22, 2026

@JohannesHoppe JohannesHoppe commented Apr 22, 2026

Summary

Fixes #204. When deploying from a repo with submodules (and/or top-level dotfiles like .github/, .gitignore, .gitmodules), v3 leaves those files behind on the gh-pages branch instead of replacing the branch with just the dist contents. v2 replaced the branch contents correctly.

Root cause

gh-pages@6.3.0's "Removing files" step calls globby.sync(options.remove, { cwd }) without dot: true, so dotfiles and dot-named directories from the gh-pages branch are never matched and never removed. Submodules are also out of scope for a filesystem-based glob. Our dist is then copied on top and committed along with the stale files.

Upstream fix tschaub/gh-pages#612 (merged 2025-08-09) addresses this by changing the default remove pattern to '**/*' AND adding dot: true to the globby call. But that PR is not released: gh-pages@6.3.0 is still the latest on npm (latest dist-tag), no newer git tag exists, and no next/rc/preview dist-tag includes it. Upstream main is 63 commits ahead of v6.3.0, mixing the fix with a major internal rewrite (globby → tinyglobby), so a drop-in patch release is unlikely any time soon.

Approach

Register a beforeAdd hook (gh-pages exposes this as a user-pluggable callback that fires after copy, before git-add). The hook:

  1. Walks our dist directory to build the set of expected file paths.
  2. Runs git ls-files -z — at that point the index still contains everything from the gh-pages branch that gh-pages' broken remove step failed to remove (dotfiles, submodule gitlinks).
  3. Diffs the two lists and runs git rm -- <leftovers> on the difference.

git rm handles submodule gitlinks correctly, so the reporter's build gitlink is also cleaned up. The hook is skipped when the user sets add: true (that mode explicitly preserves existing files).
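
The diff step above can be sketched as a few pure functions. This is a minimal illustration, not the hook shipped in this PR: the helper names (`parseLsFiles`, `findLeftovers`, `buildGitRmArgs`) are hypothetical, and the dist file list is assumed to hold paths relative to the dist root.

```typescript
// Hypothetical helper names; the actual hook in this PR may be structured differently.

/** Parse the NUL-separated output of `git ls-files -z` into a list of paths. */
function parseLsFiles(output: string): string[] {
  return output.split("\0").filter((entry) => entry.length > 0);
}

/** Paths git still has indexed that are not part of the expected dist file set. */
function findLeftovers(indexed: string[], distFiles: Iterable<string>): string[] {
  // git reports forward slashes, so normalize dist paths the same way
  // (assumption: dist paths may come from a Windows directory walk).
  const expected = new Set<string>();
  for (const file of distFiles) {
    expected.add(file.replace(/\\/g, "/"));
  }
  return indexed.filter((path) => !expected.has(path));
}

/** Argument vector for `git rm -- <leftovers>`, or null when nothing is left over. */
function buildGitRmArgs(leftovers: string[]): string[] | null {
  return leftovers.length > 0 ? ["rm", "--", ...leftovers] : null;
}

// Sample index modeled on the #204 scenario: dotfiles, a `build` gitlink, and one dist file.
const indexed = parseLsFiles(
  ".gitignore\0.gitmodules\0.github/workflows/deploy.yml\0build\0index.html\0"
);
const leftovers = findLeftovers(indexed, ["index.html"]);
const rmArgs = buildGitRmArgs(leftovers);
```

A submodule gitlink shows up in `git ls-files` as a single path entry (here `build`), which is why it ends up in the same leftover list as the dotfiles. Returning null for an empty list lets the caller skip spawning git entirely.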

Because the hook operates at the git level (not via globby), it's unaffected by upstream's pending globby → tinyglobby swap. When gh-pages eventually ships a release containing #612, this hook becomes a no-op and can be removed.

gh-pages@6.3.0's "Removing files" step calls globby without `dot: true`,
so dotfiles (.gitignore, .gitmodules, .github/) and submodule gitlinks
from the gh-pages branch are not removed before our dist is copied on
top. They then get re-committed and leak into the deploy.

Upstream fix tschaub/gh-pages#612 (merged 2025-08-09) adds both
`remove: '**/*'` and `dot: true`, but is unreleased as of v6.3.0. We
can't reach the globby call from options alone.

Workaround: register a `beforeAdd` hook that runs after gh-pages'
broken remove + our file copy. The hook asks git what it still has
indexed (`git ls-files -z`), diffs against the set of files in our
dist, and `git rm`s the leftovers. `git rm` handles submodule gitlinks
correctly, so the `build` gitlink from the reporter's scenario is also
cleaned up.

Skipped when the user opts into `add: true` (additive mode explicitly
preserves existing files).

Adds a spawn-level regression test in engine.gh-pages-behavior.spec.ts
that mocks `git ls-files` to return leftover dotfiles + a submodule
gitlink, runs engine.run() end-to-end through real gh-pages, and
asserts `git rm` targets the leftovers while leaving dist files alone.

Fixes #204

Prior tests covered the cleanup hook at the spawn-mock level only: they
verified that the right git commands get issued, but not that the
resulting gh-pages commit is actually clean. Add a real-filesystem,
real-git test that:

  1. Creates a local bare repo.
  2. Seeds a gh-pages branch containing the exact conditions from #204:
     dotfiles (.gitignore, .gitmodules), nested dot-directory contents
     (.github/workflows/deploy.yml), a submodule gitlink, and a stale
     non-dot file.
  3. Runs engine.run() (no mocks of child_process, fs, or gh-pages).
  4. Inspects `git ls-tree -r gh-pages` on the bare repo and asserts
     the final tree contains ONLY index.html.

Also adds a baseline test that calls gh-pages.publish() directly
(bypassing our hook) and asserts the dotfiles + submodule DO leak —
i.e. the upstream bug is real on the installed gh-pages version. If a
future gh-pages release fixes the bug, that baseline test will fail as
a clear signal that our workaround can be removed.

No production changes; the hook implementation and its PR #206-era
callback wrapper are exercised end-to-end by the new tests.
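
The tree inspection in step 4 can be sketched as a small parser. This is an illustration under stated assumptions, not the test code from this PR: `parseLsTree` is a hypothetical helper, the object IDs are placeholders, and the input is assumed to be default (non `-z`) `git ls-tree -r` output with unquoted paths.

```typescript
// Hypothetical helper; assumes each line is "<mode> <type> <object>\t<path>".
interface TreeEntry {
  mode: string;
  type: string;
  path: string;
}

function parseLsTree(output: string): TreeEntry[] {
  const entries: TreeEntry[] = [];
  for (const line of output.split("\n")) {
    const tab = line.indexOf("\t");
    if (tab === -1) continue; // skip blank or malformed lines
    const [mode, type] = line.slice(0, tab).split(" ");
    entries.push({ mode, type, path: line.slice(tab + 1) });
  }
  return entries;
}

// Placeholder object IDs, for illustration only.
const fakeId = "a".repeat(40);

// A tree that still contains the leak: a dotfile and a submodule gitlink (mode 160000).
const leakyTree = parseLsTree(
  `100644 blob ${fakeId}\t.gitignore\n` +
    `160000 commit ${fakeId}\tbuild\n` +
    `100644 blob ${fakeId}\tindex.html\n`
);

// The tree the end-to-end test expects after the cleanup hook has run.
const cleanTree = parseLsTree(`100644 blob ${fakeId}\tindex.html\n`);

const leakyPaths = leakyTree.map((entry) => entry.path);
const cleanPaths = cleanTree.map((entry) => entry.path);
const gitlinks = leakyTree.filter((entry) => entry.mode === "160000");
```

Comparing `cleanPaths` against `["index.html"]` corresponds to the "ONLY index.html" assertion, while the baseline test would expect something like `leakyPaths` from a direct gh-pages publish.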
@JohannesHoppe JohannesHoppe merged commit 7ee64bb into main Apr 22, 2026
@JohannesHoppe JohannesHoppe deleted the fix/issue-204-submodule-cleanup branch April 22, 2026 21:44

Development

Successfully merging this pull request may close these issues.

v3: 'Removing files' step doesn't properly clean existing gh-pages content when repo contains submodules
