[ci/doc] Guard against annotated subpackages that escape the API walk#64514
[ci/doc] Guard against annotated subpackages that escape the API walk#64514dstrodtman wants to merge 6 commits into
Conversation
The API-doc consistency checker enforced only one direction of the contract: every annotated public API must be documented. Extend it with two deterministic, Sphinx-free policies over the same parsed doc surface. Policy 02 (docs subset of code): every documented name must resolve to a live object, and a name that resolves to a deprecated or private object fails. Resolution is strict -- a miss is reported instead of being swallowed into the raw name string the way get_canonical_name does -- and each documented name's annotation is read live from the resolved object rather than the placeholder PUBLIC_API stamped by the autosummary and autoclass parsers. This catches stale doc entries (renamed, deleted, or misspelled symbols) that previously only the full Sphinx render flagged. Documented but un-annotated methods are accepted, since they are public by virtue of their annotated class. Policy 04 (no duplicate documentation): a canonical name may not appear in more than one autosummary or autoclass block in a team's walked doc surface. An intentional-duplicate allowlist holds ray.actor.ActorMethod.bind (documented in both the Core and Compiled Graph APIs), mirroring conf.py's DuplicateObjectFilter. Both policies run per team in _check_team and fail the team on violation. Adds unit tests for resolution, live re-introspection, and both policy splits, plus _check_team-level tests over the mock module and a new py_test target. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
Found by running the new policies end-to-end for the core team against a matching ray build: - API.resolve() walked names with getattr first, so a re-exported function that shadows a same-named submodule (e.g. ray.util.placement_group, the function, shadows the ray.util.placement_group submodule) caused the remaining tokens to fail to resolve -- a false "documented API does not resolve". Import the dotted path first and fall back to getattr, matching how Sphinx autosummary resolves the name. - ray.remote (canonical ray._private.worker.remote) is intentionally cross-listed under both Tasks and Actors in ray-core/api/core.rst, since @ray.remote defines both. Add it to core's intentional-duplicate allowlist next to ActorMethod.bind. With these, the core team passes all three policies. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
Address review feedback: a malformed doc entry (an empty name, or one with
a leading/double dot that splits to an empty first token) would make
importlib.import_module("") raise ValueError and crash the checker. Guard
the empty-first-token case and catch ValueError alongside ImportError so a
malformed entry is reported as unresolved instead. Adds regression cases.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
Policy 02 derived an entry's identity from get_canonical_name() (a getattr-only walk) while reading its annotation from resolve() (an import-first walk). When a dotted segment is shadowed -- e.g. a re-exported function sharing a name with a submodule -- those two walks can land on different objects, so the annotation could be read off one object while the reported / white-list-checked name came from another. Resolve each documented name once and derive the canonical name from that object (new API.canonical_name_of), so identity and annotation share one walk. Policy 04 uses the same path, which also collapses two spellings of a shadowed object to one canonical key. get_canonical_name() is unchanged and still backs Policy 01. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
The API-doc consistency check reads autosummary stub .rst files (the per-class method tables the hand-written API pages `.. include::`). Those stubs were produced only as a side effect of `make -C doc/ html`, so every Python-touching premerge PR paid for a full Sphinx render just to generate them. Extract the stub generation into doc/source/api_autogen.py, shared by the render and a standalone entry point: - conf.py imports the autogen file list, the filename map, and the custom autosummary Jinja filters from the new module (importing it registers the filters), keeping the render and the standalone path in lockstep. - generate_api_stubs() raises when generation produces no files, and _autogen_apis no longer swallows failures in a try/except + warning. A broken autogen step now fails the build instead of emitting an empty fixture the consistency check then reads as "nothing to do". - lint.sh's api_policy_check runs `python doc/source/api_autogen.py` instead of `make -C doc/ html`, removing the full render from the premerge check while preserving the contract guarantee. Stubs are generated after installing Ray so they reflect the checkout's source. Verified end-to-end: the standalone script generates the same stubs the render produced (82 files for the data API), and the consistency check reads them. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
There was a problem hiding this comment.
Code Review
This pull request optimizes the API documentation consistency check by generating only the necessary autosummary stub files via a new standalone script (api_autogen.py), eliminating the need to build the entire HTML documentation. It also introduces Policy 02 (ensuring documented APIs resolve to public code) and Policy 04 (preventing duplicate documentation), along with a coverage guard to catch unwalked annotated subpackages. The feedback suggests adding exception handling around dynamic attribute inspection and module iteration in cmd_check_api_discrepancy.py to prevent crashes from lazy imports or invalid paths.
| for attr in dir(mod): | ||
| obj = getattr(mod, attr, None) | ||
| origin = getattr(obj, "__module__", None) | ||
| if not origin or (origin != module and not origin.startswith(f"{module}.")): | ||
| continue | ||
| if hasattr(obj, "_annotated"): | ||
| return (True, True) |
There was a problem hiding this comment.
Accessing attributes dynamically via getattr on a module can trigger lazy imports or properties that raise exceptions (such as ModuleNotFoundError for missing optional dependencies). Wrapping the attribute inspection in a try-except block ensures that optional dependency failures or buggy attributes do not crash the entire API consistency check.
| for attr in dir(mod): | |
| obj = getattr(mod, attr, None) | |
| origin = getattr(obj, "__module__", None) | |
| if not origin or (origin != module and not origin.startswith(f"{module}.")): | |
| continue | |
| if hasattr(obj, "_annotated"): | |
| return (True, True) | |
| for attr in dir(mod): | |
| try: | |
| obj = getattr(mod, attr, None) | |
| origin = getattr(obj, "__module__", None) | |
| if not origin or (origin != module and not origin.startswith(f"{module}.")): | |
| continue | |
| if hasattr(obj, "_annotated"): | |
| return (True, True) | |
| except Exception: | |
| continue |
| children = [] | ||
| for info in pkgutil.iter_modules(pkg.__path__, prefix=f"{package}."): | ||
| if info.name.rsplit(".", 1)[-1].startswith("_"): | ||
| continue | ||
| children.append(info.name) |
There was a problem hiding this comment.
Calling pkgutil.iter_modules on pkg.__path__ can raise exceptions if any of the paths in __path__ are invalid, non-existent, or inaccessible (e.g., due to permission issues or virtual environment configurations). Wrapping this in a try-except block makes the subpackage discovery more robust.
| children = [] | |
| for info in pkgutil.iter_modules(pkg.__path__, prefix=f"{package}."): | |
| if info.name.rsplit(".", 1)[-1].startswith("_"): | |
| continue | |
| children.append(info.name) | |
| children = [] | |
| try: | |
| for info in pkgutil.iter_modules(pkg.__path__, prefix=f"{package}."): | |
| if info.name.rsplit(".", 1)[-1].startswith("_"): | |
| continue | |
| children.append(info.name) | |
| except Exception: | |
| pass |
## Why The code<->docs API consistency check only sees APIs that its module walk reaches, and the walk only follows submodules a parent `__init__` actually imports. A public, `@PublicAPI`-annotated subpackage that its parent does not import is invisible to the check, so its APIs can silently rot undocumented. `ray.serve.llm` and `ray.data.llm` are both such surfaces today: they are documented and annotated, but no walk reaches them. ## What - Add `ray.serve.llm` to the serve `head_modules` so its public surface is walked. It is import-safe in the docbuild image (transformers is a soft `try_import`, vLLM is only imported lazily). - Add a cross-team guard that enumerates one level of subpackages under every configured head module and fails when an annotated subpackage is neither reached by any walk nor knowingly excluded. A reviewed allowlist records the intentional exclusions and why. - Allowlist `ray.data.llm`: it eagerly imports the vLLM/SGLang engine processor configs, which do a hard `import transformers` at module load, and transformers is absent from the docbuild image, so it cannot be imported by this check. Its surface stays documented in `doc/source/data/api/llm.rst`. - Track reachable module names on `Module` so the guard can tell "walked" from "merely name-prefixed" (`ray.data.llm` is prefixed by `ray.data` yet never imported). - Unit-test the guard's decision core and helpers against the mock module. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
17028a4 to
02f54be
Compare
|
Thanks for the review. Two notes: Scope: this PR only adds three things — Both inline suggestions adopted in the latest push:
Both are consistent with the guard's own principle: an optional-dependency failure should downgrade coverage of one module, never crash the whole consistency check. |
Why
The API doc consistency check (
ci/ray_ci/doc/cmd_check_api_discrepancy.py)only sees APIs that its module walk reaches, and the walk only follows
submodules a parent
__init__actually imports. A public,@PublicAPI-annotated subpackage that its parent does not import is invisibleto the check — its APIs can silently rot undocumented.
ray.serve.llmandray.data.llmare both such surfaces today: documented andannotated, but reached by no walk.
What
ray.serve.llmby adding it to the servehead_modules. It isimport-safe in the docbuild image (
transformersis a softtry_import, andvLLM is only imported lazily), so it can be its own walk root.
under every configured head module and fails when an annotated subpackage is
neither reached by any walk nor knowingly excluded. A reviewed allowlist
records the intentional exclusions and the reason for each.
ray.data.llm. Unlikeray.serve.llm, it eagerly imports thevLLM/SGLang engine processor configs, which do a hard
import transformersat module load.
transformersis absent from the docbuild image(
python/deplocks/docs/docbuild_depset_py3.11.lock), soray.data.llmcannot be imported by this check. Its surface stays documented in
doc/source/data/api/llm.rst.Moduleso the guard distinguishes "actuallywalked" from "merely name-prefixed" (
ray.data.llmis prefixed byray.datayet never imported by
ray/data/__init__.py).Stacking
This PR is stacked on top of the API-ref consistency-policy work and should be
reviewed/merged after it (its diff transitively shows those commits until they
land):
Kept as a draft until the base PRs merge; it will be rebased onto
masterafterward.
Checks
The decisive check is doc: check API doc consistency — it exercises the new
guard,
ray.serve.llminhead_modules, and theray.data.llmallowlistagainst the real docbuild image.
🤖 Generated with Claude Code