ci: deploy docs via GitHub Actions to py.sdk.modelcontextprotocol.io #2632
maxisbey wants to merge 1 commit into
Replace the manual gh-deploy workflow with the GitHub Actions Pages artifact pipeline (configure-pages -> upload-pages-artifact -> deploy-pages) and deploy on push to main.

The previous workflow pushed the built site to the gh-pages branch with mkdocs gh-deploy --force. With branch-based Pages, a custom domain only works via a CNAME file in the branch, which a force-push overwrites on every publish. Switching to artifact-based deployment lets the custom domain live as a persistent repository Pages setting instead, so it survives every deploy, and drops the contents: write permission in favour of the scoped pages: write and id-token: write permissions.

Also point site_url at https://py.sdk.modelcontextprotocol.io/ so canonical links and the sitemap match the SDK's published domain.

Closes #2614.
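For orientation, a minimal sketch of what an artifact-based Pages workflow along these lines might look like. This is not the PR's actual file: the action version tags, the astral-sh/setup-uv step, the docs dependency group, and the two-job split are all assumptions for illustration (the PR pins its actions to commit SHAs).

```yaml
# Hypothetical sketch of .github/workflows/deploy-docs.yml; not the PR's actual file.
name: Deploy docs

on:
  push:
    branches: [main]
  workflow_dispatch:  # retained as a manual trigger

# Scoped permissions in place of contents: write.
permissions:
  contents: read
  pages: write
  id-token: write

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4            # version tags are assumptions
      - uses: astral-sh/setup-uv@v5          # assumed way of installing uv
      - run: uv sync --frozen --group docs   # "docs" group name is an assumption
      - uses: actions/configure-pages@v5
      - run: uv run --frozen --no-sync mkdocs build
      - uses: actions/upload-pages-artifact@v3
        with:
          path: site                         # mkdocs writes the built site here

  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - id: deployment
        uses: actions/deploy-pages@v4
```

Nothing in this flow pushes to a branch, which is why a custom domain configured in the repository's Pages settings is not touched by a deploy.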
    concurrency:
      group: deploy-docs
      cancel-in-progress: true
🟡 GitHub's official Pages starter workflows set cancel-in-progress: false for the deploy concurrency group, with the rationale that production Pages deployments should be allowed to complete rather than be cancelled mid-flight — and the TypeScript SDK workflow this PR cites as prior art does the same. Consider flipping this to false so a second push to main queues behind an in-progress deploy instead of cancelling it.
Extended reasoning...
What it is. The new deploy-docs.yml declares a concurrency group with cancel-in-progress: true. GitHub's own Pages starter workflows (e.g. actions/starter-workflows/pages/mkdocs.yml, static.yml, jekyll.yml) all set this to false, with an explicit inline comment: "Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued. However, do NOT cancel in-progress runs as we want to allow these production deployments to complete." The TypeScript SDK workflow that this PR's description cites as prior art (modelcontextprotocol/typescript-sdk#1584) also follows the false pattern.
The code path. With cancel-in-progress: true, two pushes to main in quick succession will cause the second run to cancel the first. If the first run is already inside the actions/deploy-pages step — after the artifact has been registered with the Pages API but before the deployment has been promoted — the cancellation aborts a production deployment mid-flight.
Why nothing else prevents it. The concurrency group correctly serialises runs (only one at a time), but cancel-in-progress: true overrides the default queueing behaviour. There's no other guard in the workflow that distinguishes "safe to cancel" (still building the site) from "don't cancel" (already calling the Pages deploy API).
Impact. In practice the blast radius is small: GitHub Pages deployments are largely atomic on the server side, so the most likely outcome of a cancelled deploy is that the previous version keeps serving, not a half-applied site. The newer push will deploy strictly newer content anyway. The realistic bad case is narrow — deploy A is cancelled, then deploy B fails for an unrelated reason, leaving you with no successful deploy when A would have landed. Still, it deviates from both GitHub's documented guidance and the prior art the PR explicitly follows.
Step-by-step example.
- Commit X is pushed to main → run A starts, builds the site, reaches actions/deploy-pages.
- Commit Y is pushed to main 30 seconds later → run B starts in the deploy-docs concurrency group.
- Because cancel-in-progress: true, GitHub cancels run A mid-way through the deploy-pages step.
- Run B builds and deploys commit Y. If B succeeds, all is well. If B fails (transient runner issue, broken build, etc.), the docs are now stale and there is no record of A having succeeded, even though A had already finished building and was deploying.
Fix. One-line change:
    concurrency:
      group: deploy-docs
      cancel-in-progress: false

This matches GitHub's starter templates and the TS SDK workflow: the in-flight deploy is allowed to finish, intermediate queued runs are skipped, and only the latest queued run is kept.
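If cancelling superseded builds is still desirable, one possible variation is to scope concurrency per job so that only the build can be cancelled. This is not something proposed in this thread, and it assumes the workflow is split into separate build and deploy jobs, which the PR may not do. The group names and placeholder steps below are hypothetical.

```yaml
# Sketch only: assumes separate build and deploy jobs; group names are made up.
jobs:
  build:
    runs-on: ubuntu-latest
    concurrency:
      group: deploy-docs-build
      cancel-in-progress: true    # a newer push may cancel a stale build
    steps:
      - run: echo "build the site and upload the Pages artifact here"

  deploy:
    needs: build
    runs-on: ubuntu-latest
    concurrency:
      group: deploy-docs-deploy
      cancel-in-progress: false   # an in-flight deploy is allowed to finish
    steps:
      - run: echo "call actions/deploy-pages here"
```

The trade-off is a more complex workflow; the one-line change above remains the simpler route.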
Motivation and Context
Fixes #2614. The Python SDK docs link on https://modelcontextprotocol.io/docs/sdk points to https://py.sdk.modelcontextprotocol.io/, which returns 404. Every other SDK serves correctly at its *.sdk.modelcontextprotocol.io domain.

The DNS is already fully configured in modelcontextprotocol/dns (py.sdk CNAME → modelcontextprotocol.github.io, plus the GitHub Pages domain verification TXT records) and resolves correctly. The 404 is entirely a Pages-side gap: no Pages site claims the custom domain (gh api repos/modelcontextprotocol/python-sdk/pages shows cname: null).

The root cause is the deployment mechanism. Docs are published with mkdocs gh-deploy --force, which force-pushes the built site to the gh-pages branch. With branch-based Pages, a custom domain only works if a CNAME file is present in that branch, and a force-push overwrites it on every publish. This is a known wart of the gh-deploy model.

This PR migrates to the GitHub Actions Pages artifact pipeline (configure-pages → upload-pages-artifact → deploy-pages), which is GitHub's recommended modern model and what the TypeScript SDK already uses:
- The custom domain becomes a persistent repository Pages setting rather than a file that gets overwritten, so it survives every deploy.
- Drops contents: write (a bot pushing commits) in favour of scoped, short-lived pages: write + id-token: write (OIDC).
- No gh-pages branch; atomic deploys with a real github-pages environment.
- Deploys on push to main instead of only via manual workflow_dispatch, so docs no longer drift from main (workflow_dispatch is retained as a manual trigger).

site_url is also updated to https://py.sdk.modelcontextprotocol.io/ so canonical links and sitemap.xml match the published domain. This subsumes and supersedes #2615 (which changed only site_url: necessary but not sufficient on its own).
Prior art: this mirrors the TypeScript SDK's docs deployment (modelcontextprotocol/typescript-sdk#1109 introduced GitHub Pages deployment; modelcontextprotocol/typescript-sdk#1584 moved it to the artifact-based deploy-pages model used here).

Maintainer action required after merge
The workflow change alone does not flip the Pages mechanism — a repo admin
must do the following one-time settings change (the manual-publish path
required a maintainer too, so this adds no new gate). Recommended order to
avoid a broken intermediate state:
1. Switch the Pages source to GitHub Actions (build_type from legacy to workflow).
2. Run the workflow (a push to main or a manual workflow_dispatch) and confirm the first Actions deployment succeeds.
3. Set the custom domain to py.sdk.modelcontextprotocol.io, Save. (Equivalently: gh api -X PUT repos/modelcontextprotocol/python-sdk/pages -f cname=py.sdk.modelcontextprotocol.io.) DNS is already verified, so this is near-instant.
No DNS changes are needed. After step 3 the custom domain persists across all future deploys. The old modelcontextprotocol.github.io/python-sdk/ path keeps working: GitHub 301-redirects it to the custom domain.
Verify:
The stale gh-pages branch can be deleted once the custom domain is confirmed serving.
How Has This Been Tested?
uv run --frozen --no-sync mkdocs build was run locally: it builds in strict mode, outputs to site/ (matching the artifact path:), and the generated index.html canonical tag and sitemap.xml correctly use https://py.sdk.modelcontextprotocol.io/. The Pages source/custom-domain toggle is a repo setting and is covered in the maintainer checklist above.
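For reference, the configuration this check exercises would look roughly like the fragment below. Only the site_url value is taken from this PR; that strict mode is enabled in mkdocs.yml (rather than through a --strict flag elsewhere) is an assumption.

```yaml
# mkdocs.yml (fragment, sketch)
site_url: https://py.sdk.modelcontextprotocol.io/  # drives the canonical links and sitemap.xml
strict: true  # assumption: strict mode may instead be enabled some other way
```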
Breaking Changes
No. User-facing docs URLs are unchanged (the new domain is where the docs
were always meant to be served; the old github.io path redirects).
Types of changes
Checklist
Additional context
Pages actions are pinned to commit SHAs to match the repository's existing
workflow convention.
AI Disclaimer