INFR: Migrate to github artifacts for gh-pages (deleting gh-pages branch) #282

@mmcky

GitHub Pages Deployment Strategy: Analysis and Migration Guide

Prepared for QuantEcon Infrastructure Team
Date: January 2026
Related Issues: #261


Table of Contents

  1. Executive Summary
  2. Background
  3. Deployment Approaches Compared
  4. Pros and Cons Analysis
  5. Recommendation
  6. Technical Deep Dive
  7. Migration Guide
  8. Post-Migration Procedures
  9. Appendix: Complete Workflow Examples

1. Executive Summary

This report compares two approaches to GitHub Pages deployment:

  • Traditional gh-pages branch method (currently used by lecture-python.myst)
  • Modern artifact-based deployment with deployment environments (used by 2026-tom-course)

Decision: We recommend migrating to artifact-based deployment for all QuantEcon lecture repositories.

Key reasons:

  • Eliminates repository bloat from accumulated deployment history
  • Maintains fast clone times regardless of deployment frequency
  • Uses official GitHub-maintained actions with first-party support
  • Deployed site persists indefinitely (artifacts expiring does NOT affect the live site)

Migration impact: Zero downtime when following the documented procedure. The gh-pages branch and all its history can be safely deleted after migration, reclaiming significant repository space.


2. Background

2.1 Current State

lecture-python.myst uses the traditional approach:

  • peaceiris/actions-gh-pages@v4 pushes built HTML to a gh-pages branch
  • GitHub Pages serves content from this branch
  • Each deployment creates a new commit, accumulating history
  • Repository: 727+ commits on main, 182 releases

2026-tom-course uses the modern approach:

  • actions/upload-pages-artifact and actions/deploy-pages
  • Content deployed via GitHub's deployment environment system
  • No persistent branch; artifacts stored separately
  • Clean repository structure

2.2 The Problem with gh-pages Branches

The gh-pages branch bloat is a well-documented issue across the GitHub ecosystem:

| Repository | Reported Size | Cause |
|---|---|---|
| Mozilla VPN Client | 1.5 GB+ | WASM binaries in gh-pages history |
| Eclipse Theia | 2.6 GB | API documentation history |
| Scratch GUI | 2 GB+ | Built JavaScript bundles |

This problem compounds over time, and affects every contributor's clone operation, for lecture repositories with:

  • Large built outputs (Jupyter Book HTML, CSS, JS)
  • Frequent updates
  • Binary files that don't delta compress well
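
The growth pattern can be reproduced locally. The following is a minimal sketch, assuming git is installed; the function name simulate_deploys, the file name bundle.bin, and the 200 KB file size are illustrative choices, not taken from any QuantEcon repository. Each simulated deployment overwrites one incompressible file, so the working tree stays the same size while .git keeps every past copy.

```shell
# Simulate N deployments that each replace one incompressible file, then
# print "<commit count> <size of .git in KB>" for the throwaway repository.
simulate_deploys() {
  n="$1"
  repo="$(mktemp -d)"
  git -C "$repo" init -q
  i=1
  while [ "$i" -le "$n" ]; do
    # Random bytes stand in for built assets that delta-compress poorly.
    head -c 200000 /dev/urandom > "$repo/bundle.bin"
    git -C "$repo" add bundle.bin
    git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
      -c commit.gpgsign=false commit -q -m "deploy $i"
    i=$((i + 1))
  done
  commits="$(git -C "$repo" rev-list --count HEAD)"
  git_kb="$(du -sk "$repo/.git" | cut -f1)"
  rm -rf "$repo"
  echo "$commits $git_kb"
}

simulate_deploys 5
```

The second number grows roughly in proportion to the number of deployments, even though only one 200 KB file ever exists in the working tree.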


3. Deployment Approaches Compared

3.1 How Each Approach Works

gh-pages Branch Method:

Build → Commit to gh-pages branch → GitHub detects change → Serves content
        ↓
        Accumulates in git history forever

Artifact-Based Method:

Build → Upload artifact → Deploy to Pages infrastructure → Serves content
        ↓                         ↓
        Expires after its         Persists indefinitely
        retention period          (until replaced)
        (configurable)

3.2 Detailed Comparison

| Aspect | gh-pages Branch | Artifact Deployment |
|---|---|---|
| Repository Size | Grows with each deployment; can reach GB+ | No impact; artifacts stored separately |
| Clone Time | Degrades over time; gh-pages fetched by default | Always fast; no extra branches |
| Site Persistence | Permanent (in git history) | Permanent (in Pages infrastructure) |
| Deployment History | Full git history preserved | Limited to artifact retention (~90 days) |
| Security | Branch protection rules | Environment protection + OIDC verification |
| Workflow Complexity | Single action | Multi-step: configure, upload, deploy |
| GitHub Support | Third-party (peaceiris) | Official GitHub actions |
| Rollback Speed | Instant (git reset) | Minutes (rebuild or artifact redeploy) |
| Inspect Deployed Files | Yes (browse gh-pages branch) | No (only via artifact download) |

3.3 Critical Clarification: Artifact Expiration

Important: Artifact expiration does NOT affect the live site.

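  • Workflow artifacts (downloadable from the Actions tab) expire after their retention period, which is set by retention-days on actions/upload-pages-artifact (default: 1 day) and capped by the repository's maximum (90 days for public repositories)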
  • Deployed site content persists indefinitely in GitHub Pages infrastructure

Once actions/deploy-pages succeeds, content is copied to GitHub's hosting. The original artifact is just a staging mechanism—its expiration has no effect on the live site.

You could deploy once and never touch the repository for years—the site would remain live.


4. Pros and Cons Analysis

4.1 gh-pages Branch Method

Advantages:

  • Complete deployment history preserved in git
  • Instant rollback via git operations (git reset, git revert)
  • Simpler workflow configuration (single action)
  • Well-documented with extensive community examples
  • Can inspect deployed files directly in branch

Disadvantages:

  • Repository size bloat: Each deployment adds commits; large sites grow to GB+
  • Clone time degradation: All contributors download gh-pages history
  • Binary files compress poorly: Built JS bundles, images don't delta well
  • Requires periodic maintenance: Force-pushing or orphan branches needed
  • Third-party action dependency

4.2 Artifact-Based Deployment

Advantages:

  • Zero repository bloat: Artifacts stored separately, expire automatically
  • Fast clones forever: Repository size constant regardless of deployment frequency
  • First-party GitHub support: Official actions maintained by GitHub
  • Enhanced security: OIDC token verification, environment protection rules
  • Modern best practice: Recommended by GitHub documentation
  • Cleaner repository structure (no orphan branch)
  • gh-pages branch can be deleted: All history removed, space reclaimed

Disadvantages:

  • Limited deployment history (artifact retention period)
  • More complex workflow setup (multiple steps/jobs)
  • Cannot directly inspect deployed files in repository
  • Requires repository settings change
  • Rollback requires rebuilding or dedicated workflow

5. Recommendation

5.1 Decision

We recommend adopting artifact-based deployment as the standard for all QuantEcon lecture repositories.

5.2 Justification

  1. Sustainable infrastructure: Repository size remains constant regardless of how many times you deploy

  2. Better contributor experience: New contributors always get fast clones

  3. Official support: First-party GitHub actions ensure long-term maintenance and compatibility

  4. The trade-off is acceptable: Limited deployment history is mitigated by:

    • Source code history remains complete
    • Artifact retention (up to 90 days on public repositories, when configured) covers typical rollback needs
    • Any version can be rebuilt from source
    • Deployed site persists indefinitely
  5. Clean break: Deleting gh-pages branch removes all accumulated bloat permanently

5.3 Implementation Strategy

Phase 1 — New Projects:

  • Use artifact-based deployment for all new repositories
  • 2026-tom-course serves as reference implementation

Phase 2 — Existing Projects:

  • Evaluate current gh-pages branch size
  • Migrate following the guide below
  • Delete gh-pages branch after successful migration

6. Technical Deep Dive

6.1 Rollback Strategies

With gh-pages Branch (Old Method)

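# Instant rollback to previous deployment
git checkout gh-pages
git reset --hard HEAD~1
git push --force origin gh-pages

# Rollback to the last deployment made before a specific date.
# gh-pages@{2025-01-01} reads the local reflog, which a fresh clone lacks,
# so look the commit up by commit date instead.
git fetch origin gh-pages
target=$(git rev-list -1 --before=2025-01-01 origin/gh-pages)
# Guard against an empty result: pushing ":gh-pages" would delete the branch.
[ -n "$target" ] && git push --force origin "$target":refs/heads/gh-pages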

With Artifact Deployment (New Method)

Option A: Re-run Previous Workflow

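# Find the run ID (deploy-pages.yml is the workflow created in Section 7.3)
gh run list --workflow=deploy-pages.yml --limit=10

# Re-run (rebuilds from that commit's source; GitHub allows re-runs
# for up to 30 days after the original run)
gh run rerun <run-id>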

Option B: Deploy from Tag

# Trigger workflow on specific tag
gh workflow run deploy-pages.yml --ref publish-2025dec15

Option C: Dedicated Rollback Workflow

See Section 9.2 for complete workflow file.

# Rollback to specific run's artifact
gh workflow run rollback.yml -f run_id=12345678

6.2 Can We Safely Delete gh-pages After Migration?

Yes. Once migrated to artifact-based deployment:

  1. The deployed site content lives in GitHub's Pages infrastructure
  2. It is completely independent of any repository branch
  3. Deleting gh-pages removes all accumulated history
  4. Repository size decreases (after GitHub garbage collection)
  5. The live site continues serving without interruption

The only thing you lose is the ability to roll back instantly via git. You retain:

  • Full source code history (rebuild any version)
  • Artifact history (for as long as the configured retention period)
  • The rollback workflow option

7. Migration Guide

7.1 Pre-Migration Checklist

# Check gh-pages branch size
git fetch origin gh-pages
git rev-list --count origin/gh-pages  # Number of commits

# Estimate size impact
git clone --single-branch --branch gh-pages \
  https://github.com/QuantEcon/your-repo.git gh-pages-only
du -sh gh-pages-only/.git
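
As an alternative to a second clone, the size that gh-pages adds can be estimated inside an existing clone. This is a rough sketch: the function name unique_disk_bytes is hypothetical, and the result reflects how objects are currently packed on disk, so treat it as an estimate rather than an exact figure.

```shell
# Sum the on-disk size (in bytes) of objects reachable from one ref
# but not from another.
unique_disk_bytes() {
  only_in="$1"
  not_in="$2"
  git rev-list --objects "$only_in" --not "$not_in" \
    | cut -d' ' -f1 \
    | git cat-file --batch-check='%(objectsize:disk)' \
    | awk '{ total += $1 } END { print total + 0 }'
}

# Example, run inside a clone that has fetched both branches:
#   unique_disk_bytes origin/gh-pages origin/main
```

A large result here means most of the clone cost comes from deployment history rather than from lecture sources.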

Document current configuration:

  • Custom domain (e.g., python.quantecon.org)
  • HTTPS enforcement setting
  • Any special build requirements (GPU runners, etc.)

7.2 Migration Sequence

Critical: Follow this exact order to avoid downtime.

Step 1: Create new workflow file (keep old one)
           ↓
Step 2: Change Pages source to "GitHub Actions"  ← MUST do before Step 3
           ↓
Step 3: Run new workflow → deploys via artifacts
           ↓
Step 4: Verify site works correctly
           ↓
Step 5: Remove old peaceiris deployment step
           ↓
Step 6: Delete gh-pages branch (reclaims space)

7.3 Step-by-Step Instructions

Step 1: Create New Workflow File

Create .github/workflows/deploy-pages.yml:

name: Build & Deploy to GitHub Pages

on:
  push:
    tags:
      - 'publish*'
  workflow_dispatch:  # Manual trigger for testing

permissions:
  contents: read
  pages: write
  id-token: write

concurrency:
  group: "pages"
  cancel-in-progress: false

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      # Your build steps here...
      - name: Build site
        run: jupyter-book build lectures --path-output ./

      - name: Setup Pages
        uses: actions/configure-pages@v5

      - name: Upload artifact
        uses: actions/upload-pages-artifact@v4
        with:
          path: '_build/html'
          retention-days: 90  # public repositories cap artifact retention at 90 days

  deploy:
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    runs-on: ubuntu-latest
    needs: build
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
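
Before the upload step it can help to sanity check the build output. The sketch below is one possible pre-upload check, not part of the official actions: the function name check_pages_dir is hypothetical, and the 1 GB ceiling reflects GitHub's documented limit for published Pages sites at the time of writing.

```shell
# Fail early if the directory to be published has no index.html at its
# root or is larger than the assumed 1 GB Pages site limit.
check_pages_dir() {
  dir="$1"
  limit_kb=$((1024 * 1024))   # 1 GB expressed in KB
  if [ ! -f "$dir/index.html" ]; then
    echo "missing $dir/index.html" >&2
    return 1
  fi
  size_kb="$(du -sk "$dir" | cut -f1)"
  if [ "$size_kb" -gt "$limit_kb" ]; then
    echo "site is ${size_kb} KB, above the ${limit_kb} KB limit" >&2
    return 1
  fi
  echo "ok: ${size_kb} KB"
}

# Example, as a workflow step after the build:
#   check_pages_dir _build/html
```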

Step 2: Change GitHub Pages Source Setting

  1. Go to repository Settings → Pages
  2. Under "Build and deployment":
    • Change Source from "Deploy from a branch" to "GitHub Actions"
  3. Click Save

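Note: This automatically creates a github-pages environment with protection rules. These rules may restrict deployments to the default branch; if a tag-triggered run (publish*) is rejected, add a matching tag rule under Settings → Environments → github-pages.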
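
The same setting can also be changed from the command line through the REST API's Pages endpoint, which accepts a build_type of "workflow" or "legacy". The wrapper below is a sketch under assumptions: set_pages_source and the DRY_RUN variable are hypothetical names, the caller needs admin rights on the repository, and switching back to "legacy" may also require specifying a source branch.

```shell
# Switch the Pages build type via the REST API instead of the web UI.
# With DRY_RUN=1 (the default in this sketch) the command is only printed.
set_pages_source() {
  repo="$1"         # e.g. QuantEcon/your-repo
  build_type="$2"   # "workflow" (GitHub Actions) or "legacy" (branch)
  case "$build_type" in
    workflow|legacy) ;;
    *) echo "build_type must be workflow or legacy" >&2; return 2 ;;
  esac
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "gh api -X PUT repos/$repo/pages -f build_type=$build_type"
  else
    gh api -X PUT "repos/$repo/pages" -f "build_type=$build_type"
  fi
}

set_pages_source QuantEcon/your-repo workflow
```

Run with DRY_RUN=0 only after the new workflow file is merged, so the order in Section 7.2 is preserved.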

Step 3: Test New Workflow

# Manual trigger (if workflow_dispatch enabled)
gh workflow run deploy-pages.yml

# Or create test tag
git tag publish-test-migration
git push origin publish-test-migration
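
To check in advance whether a tag name would trigger the workflow, the publish* filter can be approximated locally. This is only an approximation of GitHub's filter matching, based on the documented rule that * does not match the / character; matches_publish_filter is a hypothetical helper name.

```shell
# Return success if a tag name would match the workflow's 'publish*' filter.
matches_publish_filter() {
  case "$1" in
    */*) return 1 ;;        # "*" in Actions filters does not cross "/"
    publish*) return 0 ;;
    *) return 1 ;;
  esac
}

for tag in publish-test-migration publish-2025dec15 v1.0 publish/2025; do
  if matches_publish_filter "$tag"; then
    echo "$tag: triggers deploy"
  else
    echo "$tag: ignored"
  fi
done
```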

Step 4: Verify Site

  • Homepage loads correctly
  • All pages accessible
  • Custom domain works
  • HTTPS enforced
  • No broken assets

Step 5: Remove Old Deployment Step

Edit your workflow to remove:

# DELETE THIS:
- name: Deploy to GitHub Pages
  uses: peaceiris/actions-gh-pages@v4
  with:
    github_token: ${{ secrets.GITHUB_TOKEN }}
    publish_dir: _build/html/
    cname: python.quantecon.org

Step 6: Delete gh-pages Branch

Via GitHub CLI:

gh api -X DELETE repos/:owner/:repo/git/refs/heads/gh-pages

Via Command Line:

git push origin --delete gh-pages

Via GitHub UI:

  1. Go to repository → click "X branches"
  2. Find gh-pages
  3. Click trash icon

7.4 Custom Domain Handling

If you have a custom domain (e.g., python.quantecon.org):

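Option A: Configure via Settings (Recommended)

  • Settings → Pages → Custom domain → enter domain
  • GitHub stores the domain in the repository's Pages settings; with the "GitHub Actions" source, no CNAME file is created or required

Option B: Include in Build Output

According to GitHub's documentation, a CNAME file is ignored when the site is published from a custom GitHub Actions workflow, so this step alone does not set the domain. It is harmless to keep for parity with the old gh-pages output, but Option A is what takes effect.

- name: Add CNAME
  run: echo "python.quantecon.org" > _build/html/CNAME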

8. Post-Migration Procedures

8.1 Verify Migration Success

# Check deployment status
gh api repos/:owner/:repo/pages

# List environments
gh api repos/:owner/:repo/environments

# Verify no gh-pages branch
git fetch --prune
git branch -r | grep gh-pages  # Should return nothing
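
The grep above relies on local remote-tracking refs being pruned. A sketch that asks the remote directly is shown below; remote_has_branch is a hypothetical helper name, and it relies on git ls-remote --exit-code returning a non-zero status when no matching ref exists.

```shell
# Return success if the remote still has the named branch.
remote_has_branch() {
  remote="$1"
  branch="$2"
  git ls-remote --exit-code --heads "$remote" "refs/heads/$branch" \
    >/dev/null 2>&1
}

# Example:
#   if remote_has_branch origin gh-pages; then
#     echo "gh-pages still exists"
#   else
#     echo "gh-pages is gone"
#   fi
```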

8.2 Monitor Repository Size

GitHub runs garbage collection periodically. Size reduction may take up to 24 hours.

# Check repository size (the API reports this value in KB)
gh api repos/:owner/:repo --jq '.size'

For large repositories, contact GitHub Support to request manual garbage collection.
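
Because the size field is reported in kilobytes, a small conversion makes before and after comparisons easier to read. The helper name kb_to_mib is hypothetical, and the sketch assumes awk is available.

```shell
# Convert a size in KB (as returned by the repository API) to MiB.
kb_to_mib() {
  LC_ALL=C awk -v kb="$1" 'BEGIN { printf "%.1f\n", kb / 1024 }'
}

kb_to_mib 2048

# Example with the live value:
#   kb_to_mib "$(gh api repos/:owner/:repo --jq '.size')"
```

Recording the value before deleting gh-pages and again a day later gives a simple measure of the space reclaimed.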

8.3 Rollback If Needed

If migration fails and gh-pages still exists:

  1. Settings → Pages → Source → "Deploy from a branch"
  2. Select gh-pages
  3. Site immediately serves from gh-pages again

If gh-pages already deleted:

  1. Keep/restore old workflow with peaceiris/actions-gh-pages
  2. Change Pages source to "Deploy from a branch"
  3. Run old workflow (recreates gh-pages branch)

9. Appendix: Complete Workflow Examples

9.1 Full Production Workflow

Adapted for QuantEcon lecture repositories:

name: Build & Deploy Lectures

on:
  push:
    tags:
      - 'publish*'
  workflow_dispatch:
    inputs:
      debug:
        description: 'Enable debug mode'
        required: false
        default: 'false'

permissions:
  contents: read
  pages: write
  id-token: write

concurrency:
  group: "pages"
  cancel-in-progress: false

jobs:
  build:
    # For GPU builds, use custom runner:
    # runs-on: "runs-on=${{ github.run_id }}/family=g4dn.2xlarge/image=quantecon_ubuntu2404"
    runs-on: ubuntu-latest
    
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup Miniconda
        uses: conda-incubator/setup-miniconda@v3
        with:
          auto-update-conda: true
          auto-activate-base: true
          miniconda-version: 'latest'
          python-version: "3.11"
          environment-file: environment.yml
          activate-environment: quantecon

      - name: Build HTML
        shell: bash -l {0}
        run: |
          jupyter-book build lectures --path-output ./

      - name: Build Download Notebooks
        shell: bash -l {0}
        run: |
          jupyter-book build lectures --path-output ./ \
            --builder=custom --custom-builder=jupyter
          mkdir -p _build/html/_notebooks
          cp _build/jupyter/*.ipynb _build/html/_notebooks/

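      # Optional: with the "GitHub Actions" source, Pages ignores CNAME files;
      # the custom domain itself is set under Settings → Pages
      - name: Configure Custom Domain
        run: echo "python.quantecon.org" > _build/html/CNAME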

      - name: Setup Pages
        uses: actions/configure-pages@v5

      - name: Upload Pages Artifact
        uses: actions/upload-pages-artifact@v4
        with:
          path: '_build/html'
          retention-days: 90  # public repositories cap artifact retention at 90 days

  deploy:
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    runs-on: ubuntu-latest
    needs: build
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4

9.2 Rollback Workflow

Save as .github/workflows/rollback.yml:

name: Rollback Deployment

on:
  workflow_dispatch:
    inputs:
      run_id:
        description: 'Workflow run ID to rollback to (find via: gh run list)'
        required: true
        type: string

permissions:
  contents: read
  pages: write
  id-token: write
  actions: read

jobs:
  rollback:
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - name: Download artifact from previous run
        uses: actions/download-artifact@v4
        with:
          name: github-pages
          path: ./rollback-artifact
          github-token: ${{ secrets.GITHUB_TOKEN }}
          run-id: ${{ inputs.run_id }}

      - name: Extract artifact
        run: |
          mkdir -p ./pages-content
          tar -xvf ./rollback-artifact/artifact.tar -C ./pages-content

      - name: Upload for deployment
        uses: actions/upload-pages-artifact@v4
        with:
          path: ./pages-content

      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4

Usage:

# Find available runs
gh run list --workflow=deploy-pages.yml --limit=20

# Trigger rollback
gh workflow run rollback.yml -f run_id=12345678

Summary

| Question | Answer |
|---|---|
| Should we migrate? | Yes: artifact-based deployment is the modern best practice |
| Will the site go down? | No: zero downtime if following the migration sequence |
| Can we delete gh-pages? | Yes: safely delete after migration; site persists independently |
| What do we lose? | Instant git-based rollback (mitigated by rollback workflow) |
| What do we gain? | Sustainable repo size, fast clones, official GitHub support |

Final recommendation: Proceed with migration for all lecture repositories. Use 2026-tom-course as the reference implementation and migrate lecture-python.myst and other repositories following this guide.
