DOC: Enable --strict on docs build, fix converter anchors + broken internal refs (closes #1741) #1745
Open
romanlutz wants to merge 10 commits into
…d PDF in CI
The Read the Docs build for PyRIT has been failing silently because:
1. doc/myst.yml referenced two API pages that do not exist:
- api/pyrit_setup_initializers.md - gen_api_md.py does not emit a
separate page for pyrit.setup.initializers (its parent pyrit.setup
has its own API members, so the script doesn't expand into
submodules)
- api/pyrit_ui.md - the pyrit/ui/ module doesn't exist
Both entries produced RTD-fatal 'Table of contents entry does not exist'
errors (added in microsoft#1469 and microsoft#1472 in March 2026).
2. The AML troubleshooting notebooks under doc/getting_started/troubleshooting/
referenced images via ./../../assets/aml_*.png which resolves to doc/assets/
(a directory that doesn't exist). The paired .py files correctly used
./../../../assets/ (3 ../, resolving to the repo-root assets/ where the
images actually live). The .ipynb / .py pair was out of sync. The missing
images caused xdvipdfmx to abort during the PDF export on RTD.
3. The build-book GitHub Actions workflow only ran 'jupyter-book build --all
--html' (HTML-only), so PDF-only regressions silently slipped past CI
while RTD failed.
This change:
- removes the two missing TOC entries from doc/myst.yml
- syncs the .ipynb AML image paths to ./../../../assets/ matching the .py
files
- adds a 'docs-build-all' Makefile target that runs HTML + PDF together
(mirroring RTD's 'jupyter-book build --all' behaviour)
- updates .github/workflows/docs.yml to install texlive-xetex / latexmk
and use 'make docs-build-all' so PDF regressions are caught in CI
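
The path mismatch described in item 2 can be checked by hand. The following is a minimal sketch, not part of the change itself; `aml_example.png` is a hypothetical file name standing in for the `aml_*.png` images, and the resolution is done purely on path strings:

```python
import posixpath

# Directory that holds the AML troubleshooting notebooks (from the commit message).
NOTEBOOK_DIR = "doc/getting_started/troubleshooting"


def resolve_from_notebook(relative_ref: str) -> str:
    """Resolve a reference relative to the notebook directory, repo-root based."""
    return posixpath.normpath(posixpath.join(NOTEBOOK_DIR, relative_ref))


# "aml_example.png" is a hypothetical name used only for illustration.
stale_ref = resolve_from_notebook("./../../assets/aml_example.png")
fixed_ref = resolve_from_notebook("./../../../assets/aml_example.png")

print(stale_ref)  # doc/assets/aml_example.png  (directory that doesn't exist)
print(fixed_ref)  # assets/aml_example.png      (repo-root assets/)
```

Two `../` segments stop at `doc/`, while three reach the repo root, which is why only the `.py` form found the images.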
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…books

The 4 converter notebooks (1_text_to_text, 2_audio, 3_image, 4_video) used
'<a id="X"></a>' HTML anchors to mark headings as link targets. MyST parses
these as anchor-only links and reports them as 'Link has no URL' errors
during the docs build.

Replace all 8 anchors across the 4 .ipynb files (and their paired .py source
files) with MyST-native (name)= target syntax placed immediately before the
heading:

    <a id="non-llm-converters"></a>  ->  (non-llm-converters)=
    ## Non-LLM Converters            ->  ## Non-LLM Converters

Existing cross-reference links of the form [Text](#slug) continue to resolve
correctly against both the explicit target and MyST's auto-generated heading
slug, so no link rewrites are needed.

The jupytext '"main_language": "python"' metadata key was added to
2_audio_converters.ipynb and 4_video_converters.ipynb by
'jupytext --update --to ipynb'. This is a harmless drift fix between the .py
jupytext header and the .ipynb metadata block.

This unblocks issue microsoft#1741 (enabling '--strict' on the docs build to
catch silent RTD breakage). The '--strict' flip itself is deferred: adding it
now would surface ~30 additional pre-existing errors unrelated to the
converter notebooks (auth-required external URLs, broken refs to
renamed/deleted files, mystmd v2 LaTeX-node limitations on PDF export, missing
heading depths in auto-generated API md). Each category needs its own decision
(fix vs. suppress via 'error_rules' in myst.yml), which is out of scope for
the prerequisite anchor fix this PR delivers. See the PR description for the
full breakdown.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
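
The anchor rewrite can be expressed as a single regular-expression substitution. This is a sketch for illustration, not the script used in this commit; it assumes each anchor sits on its own line directly above its heading, as in the example, and the function name is hypothetical:

```python
import re

# Matches an anchor-only tag such as <a id="non-llm-converters"></a>.
ANCHOR_RE = re.compile(r'<a id="([^"]+)"></a>')


def anchors_to_myst_targets(markdown: str) -> str:
    """Rewrite HTML id anchors into MyST (name)= targets; headings stay untouched."""
    return ANCHOR_RE.sub(r"(\1)=", markdown)


before = '<a id="non-llm-converters"></a>\n## Non-LLM Converters\n'
after = anchors_to_myst_targets(before)
print(after)
```

Because the target name is copied from the old `id`, links of the form `[Text](#non-llm-converters)` keep pointing at the same slug.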
These were real broken links in the docs that the current
`make docs-build` was silently emitting as MyST '⛔' errors (build
still exited 0). With '--strict' enabled (next commit), they would
become hard build failures, so fix the underlying refs:
- `doc/blog/2026_04_14_scoring_scorers.md` (2 occurrences):
`../code/scoring/8_scorer_metrics.ipynb` →
`../code/scoring/7_scorer_metrics.ipynb` (notebook renumbered).
- `doc/blog/2025_02_11.md`:
`../code/scenarios/9_baseline_only.ipynb` → linked file was
deleted; repoint to `../code/scenarios/0_scenarios.ipynb#baseline-execution`
which now hosts the Baseline Execution section.
- `doc/getting_started/troubleshooting/{deploy_hf_model_aml,download_and_register_hf_model_aml,score_aml_endpoint}.ipynb`:
`../setup/populating_secrets.md` → `../populating_secrets.md`.
Drift fix: each paired `.py` already had the correct path; the
`.ipynb` halves were stale. Synced via `jupytext --update --to ipynb`.
- `doc/contributing/5_unit_tests.md`:
`../../tests/unit/target/test_tts_target.py` →
`../../tests/unit/prompt_target/target/test_tts_target.py`
(the test directory layout changed).
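
Stale references like the ones above can also be found without running the full build. The following is a rough sketch, assuming plain inline Markdown links only; it is not how MyST resolves references, and the function name is hypothetical:

```python
import re
from pathlib import Path

# Inline Markdown links: [text](target). Image links (![...]) are skipped.
LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)\)")
# Anything that starts with a URL scheme (https:, mailto:, ...).
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def find_broken_relative_links(md_file: Path) -> list[str]:
    """Return relative link targets in md_file that do not exist on disk."""
    broken = []
    for target in LINK_RE.findall(md_file.read_text(encoding="utf-8")):
        if SCHEME_RE.match(target) or target.startswith("#"):
            continue  # external URL or same-page anchor: nothing to check on disk
        path_part = target.split("#", 1)[0]  # drop any #fragment before checking
        if not (md_file.parent / path_part).exists():
            broken.append(target)
    return broken
```

A call such as `find_broken_relative_links(Path("doc/contributing/5_unit_tests.md"))` would list targets whose files are missing; it does not validate `#fragment` anchors inside existing files.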
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ions

Closes microsoft#1741.

Append '--strict' to both 'docs-build' and 'docs-build-all' in the Makefile
so MyST errors (broken links, missing TOC entries, malformed syntax, etc.)
hard-fail CI instead of being silently emitted as '⛔' lines in a green build.
Errors hidden in a green build are how RTD broke unnoticed before PR
microsoft#1740.

To make '--strict' actually pass today, add a small, well-documented
'error_rules' block to 'doc/myst.yml' that ignores four categories of errors
which are inherently outside the project's control or intentionally
non-resolvable:

1. 'tex-renders' (blanket): mystmd v2.x has no LaTeX renderer for the MyST
   layout directives the project uses extensively ('grid', 'tabSet',
   'details', 'mermaid'). They render fine in HTML. PDF export emits
   'Unhandled LaTeX conversion for node of X' per node. Remove when upstream
   support lands.
2. 'link-resolves' for auth-required external API docs:
   'platform.openai.com/**', 'api.openai.com/**',
   'cognitiveservices.azure.com/**'. These always return 401/403 from CI. The
   URLs themselves are correct, just unverifiable without credentials.
3. 'link-resolves' for intentional placeholder URLs: 'pyrit.shared.foo'
   (style-guide example), 'account.blob.core.windows.net/...' (docstring
   placeholder shape), 'PyRIT/releases/vx.y.z/**' (release-process template
   that operators substitute at release time).
4. 'link-resolves' for stale URLs in immutable historical blog posts and one
   current memory doc whose external target moved
   ('PyRIT/.../pdf_converter.ipynb' was deleted, 'microsoft.github.io/PyRIT/**'
   old paths from before the v2 site restructure,
   'dbeaver.com/.../sqlite.html' link rot). Marked TODO for editorial cleanup
   in a follow-up.

Each suppression carries a comment explaining *why* the URL is suppressed and
what would need to change to remove the suppression, so the allow-list stays
honest.

Verified locally:
- 'cd doc && jupyter-book build --all --html --strict' → exit 0
- Same with a deliberately broken link added to 'doc/index.md' → exit 1,
  error reported, build stops. Confirms '--strict' is effective.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
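
The local check can be wrapped in a small helper that turns the build's exit code into a boolean. This is a sketch under the assumption that `jupyter-book` is on `PATH` and the docs live in `doc/`; the helper name and the injectable `runner` parameter are hypothetical, added so the logic can be exercised without running a real build:

```python
import subprocess
from typing import Callable

# Command taken from the "Verified locally" note above.
STRICT_BUILD_CMD = ["jupyter-book", "build", "--all", "--html", "--strict"]


def strict_build_passes(doc_dir: str = "doc", runner: Callable = subprocess.run) -> bool:
    """Run the strict HTML build inside doc_dir and report whether it exited 0."""
    completed = runner(STRICT_BUILD_CMD, cwd=doc_dir)
    return completed.returncode == 0
```

Calling `strict_build_passes()` from the repo root mirrors `cd doc && jupyter-book build --all --html --strict`; a non-zero exit (for example after adding a broken link) makes it return `False`.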
Replaces the broad link-resolves suppression lists with actual URL fixes:
- doc/contributing/3_style_guide.md: wrap pyrit.shared.foo in backticks (was
  parsed as a URL)
- doc/contributing/10_release_process.md: wrap release-template URLs (vx.y.z
  placeholders) in backticks so they render as code, not links
- doc/blog/2025_06_06.md: repoint deleted pdf_converter.ipynb link to current
  pdf_converter.py source
- doc/blog/2025_01_27.md: repoint 3 footnote URLs from stale
  microsoft.github.io/PyRIT/ paths to current relative paths / source files
- doc/code/memory/4_manually_working_with_memory.md: update dbeaver SQLite
  docs URL
- pyrit/models/storage_io.py: wrap example Azure Blob URLs in RST
  double-backticks so the auto-generated API page doesn't parse them as links

doc/myst.yml error_rules now contains only the truly unfixable cases:
tex-renders (mystmd v2 LaTeX renderer gap) and auth-required external APIs
(platform.openai.com, api.openai.com, cognitiveservices.azure.com).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
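
To illustrate what a glob-style allow-list of this kind does, here is a small sketch. It uses Python's `fnmatch` as a stand-in matcher, which is an assumption for illustration only and not how mystmd evaluates `error_rules`; the helper name is hypothetical:

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Two of the patterns quoted in the commit messages; matching logic is illustrative.
SUPPRESSED_PATTERNS = ["platform.openai.com/**", "api.openai.com/**"]


def is_suppressed(url: str) -> bool:
    """Return True if the URL's host + path matches any allow-listed glob."""
    parts = urlsplit(url)
    host_and_path = f"{parts.netloc}{parts.path}"
    return any(fnmatchcase(host_and_path, pattern) for pattern in SUPPRESSED_PATTERNS)


print(is_suppressed("https://platform.openai.com/docs/api-reference"))  # True
print(is_suppressed("https://github.com/microsoft/PyRIT"))              # False
```

Keeping the list this narrow means any other unreachable link still fails the strict build.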
…docs-strict-build
The merge brought in main's docs-build-all (from PR microsoft#1740) while our branch already had it (since we merged the same PR microsoft#1740 branch earlier). The duplicate-second-wins behavior of make would have silently dropped --strict from docs-build-all. Removed the duplicate (without --strict). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
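
A duplicated target like this is easy to miss after a merge. The following sketch flags targets defined more than once; the recipe lines in `sample` are simplified placeholders rather than the project's actual Makefile, and the regex only handles simple single-target rule lines:

```python
import re
from collections import Counter

# A rule line starts at column 0 with a target name followed by ':' (but not ':=').
TARGET_RE = re.compile(r"^([A-Za-z0-9_.-]+)\s*:(?!=)", re.MULTILINE)


def duplicate_targets(makefile_text: str) -> list[str]:
    """Return the names of targets that are defined more than once."""
    counts = Counter(TARGET_RE.findall(makefile_text))
    return sorted(name for name, n in counts.items() if n > 1)


# Simplified stand-in for the merged Makefile: docs-build-all appears twice,
# and the second definition (without --strict) would override the first.
sample = (
    "docs-build:\n\tjupyter-book build --all --html --strict\n"
    "docs-build-all:\n\tjupyter-book build --all --strict\n"
    "docs-build-all:\n\tjupyter-book build --all\n"
)
print(duplicate_targets(sample))  # ['docs-build-all']
```

Special targets such as `.PHONY` may legitimately repeat, so a real check would need to exclude them.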
…tHub source Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rter section Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Closes #1741.
Enables `--strict` on `make docs-build` and `make docs-build-all` so MyST errors fail the build instead of being silently logged (which is how RTD broke unnoticed and motivated PR #1740). Fixes every broken URL the strict build surfaces so the suppression list stays small.

Changes
1. Fix `<a id="...">` HTML anchors in converter notebooks (8 anchors)

MyST parses these as `<a>` tags with no `href` and emits `Link has no URL` errors that bubble up to RTD. Replaced with MyST `(name)=` explicit targets; existing `[Text](#anchor)` cross-references continue to work.

Files (`.ipynb` + paired `.py`):
- `doc/code/converters/1_text_to_text_converters`: `non-llm-converters`, `llm-based-converters`
- `doc/code/converters/2_audio_converters`: `text-to-audio`, `audio-to-text`, `audio-to-audio`
- `doc/code/converters/3_image_converters`: `text-to-image`, `image-to-image`
- `doc/code/converters/4_video_converters`: `image-to-video`

2. Fix broken internal cross-references (4 files, 7 refs)

- `doc/blog/2026_04_14_scoring_scorers.md`: `8_scorer_metrics.ipynb` → `7_scorer_metrics.ipynb` (file renamed)
- `doc/blog/2025_02_11.md`: `9_baseline_only.ipynb` → `0_scenarios.ipynb#baseline-execution` (file deleted, content moved)
- `doc/getting_started/troubleshooting/{deploy_hf_model_aml,download_and_register_hf_model_aml,score_aml_endpoint}.ipynb`: `../setup/populating_secrets.md` → `../populating_secrets.md` (re-synced from .py source-of-truth via jupytext)
- `doc/contributing/5_unit_tests.md`: `tests/unit/target/` → `tests/unit/prompt_target/target/` (dir renamed)

3. Fix stale external URLs and docstring URL formatting (rather than suppressing)

- `doc/contributing/3_style_guide.md`: wrap `pyrit.shared.foo` in backticks (was parsed as a URL)
- `doc/contributing/10_release_process.md`: wrap release-template URLs (`vx.y.z` placeholders) in backticks so they render as code, not links
- `doc/blog/2025_06_06.md`: repoint deleted `pdf_converter.ipynb` link to current `pdf_converter.py` source
- `doc/blog/2025_01_27.md`: repoint 3 stale `microsoft.github.io/PyRIT/...` footnote URLs to current relative paths / source files
- `doc/code/memory/4_manually_working_with_memory.md`: update dbeaver SQLite docs URL (`/guides/sql_editors/sqlite.html` → `/dbeaver/Database-driver-SQLite/`)
- `pyrit/models/storage_io.py`: wrap example Azure Blob URLs in RST double-backticks (matches the style already used in `pyrit/backend/mappers/attack_mappers.py`)

4. Enable `--strict` on both docs targets

- `Makefile`: append `--strict` to the `jupyter-book build` lines in `docs-build` and `docs-build-all`

5. Add minimal `error_rules` suppressions in `doc/myst.yml`

After the URL fixes above, only two genuinely unavoidable categories remain:
- `tex-renders` (global ignore): mystmd v2 has no LaTeX renderer for the `grid` / `tabSet` / `details` / `mermaid` directives we use; HTML is fine, PDF export skips those nodes.
- `link-resolves` for `platform.openai.com/**`, `api.openai.com/**`, `cognitiveservices.azure.com/**`: auth-required endpoints that return 401/403 to CI link checkers.

Each entry has an inline comment explaining why it's suppressed.

Verification
- `make docs-build` and `make docs-build-all` exit 0 with `--strict`.
- Adding a deliberately broken link `[BAD](./missing.md)` to `doc/index.md` → strict build exits 1 (`Site has 1 error`); reverted.
- The auto-generated API page (`doc/api/pyrit_models.md`) picks up the new docstring formatting via `gen_api_md.py`.

PR #1740 has already been merged into `main`; this branch was merged with `main` (no rebase) so the diff is clean.