
chore: regenerate only changed colab notebooks in CI and make target #413

@andreatgretel

Context

The make generate-colab-notebooks target and the check-colab-notebooks CI workflow currently regenerate every Colab notebook on each run, even when only one source file changed. As a result, PRs pick up cell-ID-only diffs in unrelated notebooks (e.g. PR #403 touched only notebook 4's source, but its diff includes 188 lines of cell-ID changes across notebooks 1-3 and 5-6).

The CI already filters cell-ID diffs to avoid false failures, but the unnecessary regeneration still creates noisy commits.

Proposal

1. CI workflow: regenerate only changed source files

In .github/workflows/check-colab-notebooks.yml, detect which docs/notebook_source/*.py files changed and pass them to the script's existing --files flag:

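- name: Get changed notebook sources
  id: changed
  run: |
    # Assumes the checkout step fetched enough history to reach the base commit (e.g. fetch-depth: 0).
    BASE="${{ github.event.pull_request.base.sha || 'HEAD~1' }}"
    # Quote the pathspec so git (not the shell) expands it, skip deleted files,
    # and join the basenames onto one line so the output stays a single key=value entry.
    FILES=$(git diff --name-only --diff-filter=d "$BASE" -- 'docs/notebook_source/*.py' | xargs -r -n1 basename | xargs || true)
    echo "files=$FILES" >> "$GITHUB_OUTPUT"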

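- name: Generate Colab notebooks
  env:
    # Pass the list through the environment instead of interpolating it into the script.
    CHANGED_FILES: ${{ steps.changed.outputs.files }}
  run: |
    # Regenerate only the changed sources; fall back to all notebooks when none were detected.
    if [ -n "$CHANGED_FILES" ]; then
      make generate-colab-notebooks FILES="$CHANGED_FILES"
    else
      make generate-colab-notebooks
    fi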
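
To try the detection step outside CI, the path-to-argument conversion can be sketched as a small shell helper. This is a minimal sketch rather than the workflow's actual code: the `to_files_arg` name and the `origin/main` base ref in the commented usage are assumptions for illustration.

```shell
# Read newline-separated paths on stdin and print their basenames
# joined by single spaces on one line (no trailing newline).
to_files_arg() {
  local out="" path
  while IFS= read -r path; do
    # Skip blank lines so an empty diff yields an empty result.
    [ -n "$path" ] || continue
    out="${out:+$out }$(basename "$path")"
  done
  printf '%s' "$out"
}

# Hypothetical local usage (base ref is an assumption):
#   git diff --name-only --diff-filter=d origin/main -- 'docs/notebook_source/*.py' | to_files_arg
```

An empty result corresponds to the fallback branch in the workflow, where all notebooks are regenerated.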

2. Makefile: add a FILES parameter

generate-colab-notebooks:
	@echo "📓 Generating Colab-compatible notebooks..."
ifdef FILES
	uv run --group docs python docs/scripts/generate_colab_notebooks.py --files $(FILES)
else
	uv run --group docs python docs/scripts/generate_colab_notebooks.py
endif
	@echo "✅ Colab notebooks created in docs/colab_notebooks/"
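
The branching on FILES can be checked in isolation with a throwaway Makefile. The sketch below assumes GNU Make; the `gen` target, the echoed strings, and the file names are hypothetical stand-ins for the real recipe. It also shows that an empty FILES value takes the `else` branch, because `ifdef` tests for a non-empty value.

```shell
# Write a throwaway Makefile that mirrors the ifdef/else structure above,
# echoing which branch was taken instead of running the generator script.
demo_dir="$(mktemp -d)"
printf 'gen:\nifdef FILES\n\t@echo "subset: $(FILES)"\nelse\n\t@echo "all notebooks"\nendif\n' > "$demo_dir/Makefile"

# FILES given on the command line: the ifdef branch runs.
subset_out="$(make -s -f "$demo_dir/Makefile" gen FILES="1-a.py 4-b.py")"
# FILES not given: the else branch runs.
all_out="$(make -s -f "$demo_dir/Makefile" gen)"
# FILES given but empty: ifdef is false, so the else branch runs.
empty_out="$(make -s -f "$demo_dir/Makefile" gen FILES="")"

echo "$subset_out"
echo "$all_out"
echo "$empty_out"
rm -rf "$demo_dir"
```

Because an empty FILES falls through to regenerating everything, the CI step could in principle always pass FILES and drop its own if/else, though keeping the explicit branch is easier to read.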

3. Remove the cell-ID diff filter

Once only changed notebooks are regenerated, the cell-ID filtering hack in the CI diff check becomes unnecessary and can be removed (or kept as a safety net).

Benefits

  • Cleaner PR diffs (no unrelated notebook churn)
  • Faster CI (regenerate 1 notebook instead of 6)
  • Simpler diff check (no need to filter cell IDs)
