fix: prevent gh-pages repo bloat from doc preview artifacts #1309

Merged: kevalmorabia97 merged 3 commits into main from fix/gh-pages-docs-bloat on Apr 21, 2026

kevalmorabia97 (Collaborator) commented Apr 21, 2026

What does this PR do?

Type of change: Bug fix

Fixes gh-pages branch bloat that grew from ~26 MB to ~441 MB in four weeks (nvbug 6099503). Three compounding causes were identified and addressed:

  1. Sphinx .doctrees/ cache published to gh-pages: sphinx-build was writing its build cache inside build/html/, which was then uploaded verbatim. This accounts for ~3.3 GB uncompressed across history.
  2. JamesIves/github-pages-deploy-action appending a commit on every push: main-site files accumulated forever with single-commit: false (default).
  3. PR preview deploying on every synchronize event for all PRs: rossjrw/pr-preview-action re-deployed the full site for every push to any PR regardless of whether docs changed (e.g. PR #1128, "add: DFlash block diffusion speculative decoding", triggered 64 preview deploys × ~11 MB each).
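
As a rough check on the third cause, the per-PR cost can be estimated with simple arithmetic. The sketch below assumes every preview deploy stores a full ~11 MB copy of the site, so it is an upper bound on uncompressed size (git compression and blobs shared between deploys are ignored). The function name is hypothetical.

```python
def preview_history_mb(deploys: int, site_mb: float) -> float:
    """Upper-bound size, in MB, of preview content written by one PR."""
    return deploys * site_mb


# Figures quoted above for PR #1128: 64 deploys of roughly 11 MB each.
print(preview_history_mb(64, 11))  # 704
```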

Changes:

  • Pass -d /tmp/doctrees to sphinx-build so .doctrees/ is never written into build/html/
  • Add paths: [docs/**, modelopt/**] filter to pull_request trigger so the docs workflow only runs on PRs that touch docs or source code
  • Set single-commit: true on the deploy action so main-site pushes squash into one commit
  • Deduplicate docs build: deploy-preview now downloads the artifact from build-docs instead of running a second sphinx-build
  • Set retention-days: 1 on the artifact since it is only needed for the duration of the workflow run
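
The first change can be illustrated with a minimal sketch of how a docs session might assemble its sphinx-build command. The actual noxfile.py is not shown in this conversation, so the helper name and the source and output paths below are assumptions; only the `-d /tmp/doctrees` argument comes from this PR.

```python
from pathlib import Path


def sphinx_build_command(source_dir, html_dir, doctree_dir="/tmp/doctrees"):
    """Build a sphinx-build argv whose doctree cache lives outside html_dir.

    Without -d, sphinx-build stores the cache in <html_dir>/.doctrees,
    so anything that uploads html_dir verbatim also uploads the cache.
    """
    html = Path(html_dir).resolve()
    doctrees = Path(doctree_dir).resolve()
    # Refuse a doctree path that is the output directory or nested inside it.
    if doctrees == html or html in doctrees.parents:
        raise ValueError("doctree_dir must be outside html_dir")
    return [
        "sphinx-build",
        "-b", "html",             # HTML builder
        "-d", str(doctree_dir),   # doctree cache location
        str(source_dir),
        str(html_dir),
    ]


# Hypothetical paths, for illustration only.
print(sphinx_build_command("docs/source", "docs/build/html"))
```

The guard is optional; it only makes the intent explicit if someone later changes the cache path.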

The one-time cleanup (force-push squashed orphan to gh-pages) was already applied separately — repo is now ~59 MB for a full clone vs ~441 MB before.

Usage

N/A — CI/workflow change only.

Testing

  • Workflow logic reviewed manually.
  • The one-time cleanup was verified: git rev-list --objects --disk-usage origin/gh-pages now reports ~28 MB; full clone is ~59 MB.
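
For repeating the measurement later, the git command above can be wrapped in a small helper. This is a sketch, not part of the PR: the function names are hypothetical, it assumes a git version that supports `--disk-usage`, and it treats 1 MB as 10^6 bytes.

```python
import subprocess


def parse_disk_usage(output: str) -> float:
    """Convert the byte count printed by `git rev-list --disk-usage` to MB."""
    return int(output.strip()) / 1_000_000


def branch_disk_usage_mb(ref: str = "origin/gh-pages", repo: str = ".") -> float:
    """Return the on-disk size, in MB, of all objects reachable from ref."""
    result = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--disk-usage", ref],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_disk_usage(result.stdout)


if __name__ == "__main__":
    # Run from a clone that has already fetched origin/gh-pages.
    print(f"{branch_disk_usage_mb():.1f} MB")
```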

Before your PR is "Ready for review"

  • Is this change backward compatible?: ✅
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
  • Did you write any new necessary tests?: N/A
  • Did you update Changelog?: N/A

Additional Information

nvbug 6099503

Summary by CodeRabbit

  • Chores
    • Optimized documentation build and preview workflow: PR previews now run only for doc-related changes, builds complete faster with shorter timeouts, and previews depend on completed doc builds.
    • Simplified artifact handling with unconditional uploads during runs and shorter retention.
    • Standardized Pages deployment to produce a single consolidated commit.
    • Stabilized docs build by directing temporary build state to a fixed temp path.

…99503)

- Pass `-d /tmp/doctrees` to sphinx-build so .doctrees/ cache is never
  written into build/html and never uploaded to gh-pages
- Add `paths` filter to pull_request trigger so the docs workflow only
  runs on PRs touching docs/** or modelopt/**
- Set `single-commit: true` on JamesIves deploy action so main-site
  pushes squash into one commit instead of accumulating forever
- Deduplicate docs build: deploy-preview now downloads the artifact
  produced by build-docs instead of running a second sphinx-build
- Set retention-days: 1 on the artifact since it is only needed for the
  duration of the workflow run

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
@kevalmorabia97 kevalmorabia97 requested a review from a team as a code owner April 21, 2026 18:52
coderabbitai Bot (Contributor) commented Apr 21, 2026

📝 Walkthrough

CI docs workflow updated: added path-filtered PR detection and a changes job, guarded builds on closed PRs, reduced job timeouts, changed artifact upload/retention and deploy conditions (including single-commit Pages deploy). Sphinx doctree output redirected to /tmp/doctrees in the nox docs session.

Changes

| Cohort / File(s) | Summary |
|---|---|
| GitHub Actions workflow: `.github/workflows/pages.yml` | Added a PR-only `changes` job using `dorny/paths-filter` to detect edits under `docs/**`, `modelopt/**`, or the workflow file. `build-docs` now skips when the PR action is `closed`, its timeout is reduced from 30 to 10 min, and the artifact upload runs unconditionally with `retention-days: 1`. `deploy-preview` depends on `build-docs` and `changes`, its `if` now runs for PRs when the action is `closed` or when `needs.changes.outputs.docs == 'true'`, and it downloads the `docs-html` artifact for the preview. `deploy-gh-pages` now sets `single-commit: true`. |
| Docs build (nox): `noxfile.py` | Sphinx invocation in the `docs` nox session now passes `-d /tmp/doctrees`, directing doctree output to a fixed temporary path. |
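
The job conditions summarized above can be modeled as two small predicates. This is an illustrative sketch of the described gating only, not the workflow's actual expressions; the function names are hypothetical, and the `docs` output is kept as a string because job outputs in GitHub Actions are strings.

```python
def should_build_docs(event_name: str, action: str) -> bool:
    """build-docs is skipped only when a pull request is being closed."""
    return not (event_name == "pull_request" and action == "closed")


def should_deploy_preview(event_name: str, action: str, docs_output: str) -> bool:
    """deploy-preview runs for PRs on close (cleanup) or when docs changed."""
    if event_name != "pull_request":
        return False
    return action == "closed" or docs_output == "true"


# A push to a PR that touches no docs-relevant files: build, but no preview.
print(should_build_docs("pull_request", "synchronize"))               # True
print(should_deploy_preview("pull_request", "synchronize", "false"))  # False
```

Keeping the build unconditional while gating only the preview means the docs build can stay a required check on every PR.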

Sequence Diagram(s)

sequenceDiagram
  participant PR as Pull Request
  participant Actions as GitHub Actions
  participant Paths as Paths Filter
  participant Builder as build-docs job
  participant Artifact as Artifact Storage
  participant Preview as deploy-preview job
  participant Pages as GitHub Pages

  PR->>Actions: push / open / update
  Actions->>Paths: run `dorny/paths-filter`
  Paths-->>Actions: outputs.docs
  Actions->>Builder: run build-docs (skip if action == "closed")
  Builder->>Artifact: upload `docs-html` (retention-days:1)
  Artifact-->>Preview: provide artifact when needed
  Actions->>Preview: run deploy-preview (depends on build-docs + changes)
  Preview->>Pages: deploy to gh-pages (single-commit:true)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately describes the main change: preventing gh-pages repository bloat from documentation preview artifacts through workflow optimization. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Security Anti-Patterns | ✅ Passed | PR contains only workflow and noxfile.py configuration changes with no security anti-patterns detected in package code. |


coderabbitai Bot (Contributor) left a comment

🧹 Nitpick comments (1)
.github/workflows/pages.yml (1)

54-59: Consider guarding artifact download against build failure.

When build-docs fails (not just skipped), no artifact is uploaded, but the download step will still attempt to run—producing a confusing "artifact not found" error instead of clearly indicating the build failure.

Proposed improvement to check build outcome
      - name: Download docs artifact
-       if: github.event.action != 'closed'
+       if: github.event.action != 'closed' && needs.build-docs.result == 'success'
        uses: actions/download-artifact@v4
        with:
          name: docs-html
          path: docs/build/html
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/pages.yml around lines 54 - 59, The "Download docs
artifact" step tries to fetch "docs-html" even when the build-docs job failed,
causing an "artifact not found" error; update that step's conditional to run
only when the build-docs job succeeded (for example replace or extend its if
with needs.build-docs.result == 'success'), so the Download docs artifact step
runs only when the "build-docs" job produced and uploaded the "docs-html"
artifact.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 8022e0fb-c96b-43e1-a50a-7b0502ee2f13

📥 Commits

Reviewing files that changed from the base of the PR and between 5ffb848 and f30871a.

📒 Files selected for processing (2)
  • .github/workflows/pages.yml
  • noxfile.py

shengliangxu (Collaborator) left a comment

LGTM, and let's see if it solves the issue.

codecov Bot commented Apr 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.90%. Comparing base (2fef374) to head (77f3432).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1309      +/-   ##
==========================================
- Coverage   75.60%   74.90%   -0.70%     
==========================================
  Files         462      464       +2     
  Lines       49960    50230     +270     
==========================================
- Hits        37771    37624     -147     
- Misses      12189    12606     +417     
| Flag | Coverage Δ |
|---|---|
| gpu | 58.59% <ø> (-0.51%) ⬇️ |
| regression | 14.85% <ø> (+0.07%) ⬆️ |
| unit | 52.33% <ø> (-0.08%) ⬇️ |


@kevalmorabia97 kevalmorabia97 enabled auto-merge (squash) April 21, 2026 19:06
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
github-actions Bot (Contributor) commented Apr 21, 2026

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-04-21 20:19 UTC

Remove the workflow-level paths filter so build-docs always runs as a
required CI check on every PR. Add a lightweight 'changes' job using
dorny/paths-filter to detect whether docs-relevant files changed, and
condition deploy-preview on that output so previews are still only
deployed for PRs that touch docs/**, modelopt/**, or pages.yml.

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/pages.yml:
- Around line 50-54: The deploy-preview filter list currently omits noxfile.py
so changes to the Sphinx invocation (nox -s docs in noxfile.py) won't trigger
the preview; update the filters block to include 'noxfile.py' alongside
'docs/**', 'modelopt/**', and '.github/workflows/pages.yml' so edits to
noxfile.py (which modify docs/build/html output) will correctly gate
deploy-preview.
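
The effect of the suggested filter change can be approximated in a few lines. This sketch uses Python's `fnmatch` as a stand-in for the glob matching in `dorny/paths-filter`; the two differ in detail (for example, `*` in `fnmatch` also matches `/`), so it is an approximation of the idea and not the action's behavior. The function name and sample paths are hypothetical.

```python
from fnmatch import fnmatchcase

# Filter list from the workflow, plus the suggested noxfile.py entry.
DOCS_FILTERS = [
    "docs/**",
    "modelopt/**",
    ".github/workflows/pages.yml",
    "noxfile.py",
]


def docs_changed(changed_files, patterns=DOCS_FILTERS):
    """Return True if any changed path matches any docs-relevant pattern."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


print(docs_changed(["noxfile.py"]))               # True
print(docs_changed(["tests/unit/test_core.py"]))  # False
```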

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 2cecc1c4-05c4-4f76-99c2-8b551b4efc0a

📥 Commits

Reviewing files that changed from the base of the PR and between 6d16719 and 77f3432.

📒 Files selected for processing (1)
  • .github/workflows/pages.yml

Comment thread .github/workflows/pages.yml
@kevalmorabia97 kevalmorabia97 merged commit c51c176 into main Apr 21, 2026
41 checks passed
@kevalmorabia97 kevalmorabia97 deleted the fix/gh-pages-docs-bloat branch April 21, 2026 20:19