Merged
158 changes: 83 additions & 75 deletions src/rustdoc-internals.md
@@ -1,22 +1,23 @@
# Rustdoc Internals
# Rustdoc internals

This page describes [`rustdoc`]'s passes and modes. For an overview of `rustdoc`,
see the ["Rustdoc overview" chapter](./rustdoc.md).
This page describes [`rustdoc`]'s passes and modes.
For an overview of `rustdoc`, see the ["Rustdoc overview" chapter](./rustdoc.md).

[`rustdoc`]: https://github.com/rust-lang/rust/tree/HEAD/src/tools/rustdoc

## From Crate to Clean
## From crate to clean

In [`core.rs`] are two central items: the [`rustdoc::core::DocContext`]
`struct`, and the [`rustdoc::core::run_global_ctxt`] function. The latter is
where `rustdoc` calls out to `rustc` to compile a crate to the point where
`rustdoc` can take over. The former is a state container used when crawling
through a crate to gather its documentation.
`struct`, and the [`rustdoc::core::run_global_ctxt`] function.
The latter is where `rustdoc` calls out to `rustc` to compile a crate to the point where
`rustdoc` can take over.
The former is a state container used when crawling through a crate to gather its documentation.

The main process of crate crawling is done in [`clean/mod.rs`] through several
functions with names that start with `clean_`. Each function accepts an `hir`
or `ty` data structure, and outputs a `clean` structure used by `rustdoc`. For
example, [this function for converting lifetimes]:
functions with names that start with `clean_`.
Each function accepts an `hir`
or `ty` data structure, and outputs a `clean` structure used by `rustdoc`.
For example, [this function for converting lifetimes]:

```rust,ignore
fn clean_lifetime<'tcx>(lifetime: &hir::Lifetime, cx: &mut DocContext<'tcx>) -> Lifetime {
@@ -34,17 +35,19 @@ fn clean_lifetime<'tcx>(lifetime: &hir::Lifetime, cx: &mut DocContext<'tcx>) ->
```

Also, `clean/mod.rs` defines the types for the "cleaned" [Abstract Syntax Tree
(`AST`)][ast] used later to render documentation pages. Each usually accompanies a
(`AST`)][ast] used later to render documentation pages.
Each usually accompanies a
`clean_*` function that takes some [`AST`][ast] or [High-Level Intermediate
Representation (`HIR`)][hir] type from `rustc` and converts it into the
appropriate "cleaned" type. "Big" items like modules or associated items may
Representation (`HIR`)][hir] type from `rustc` and converts it into the appropriate "cleaned" type.
"Big" items like modules or associated items may
have some extra processing in its `clean` function, but for the most part these
`impl`s are straightforward conversions. The "entry point" to this module is
`impl`s are straightforward conversions.
The "entry point" to this module is
[`clean::utils::krate`][ck0], which is called by [`run_global_ctxt`].

The first step in [`clean::utils::krate`][ck1] is to invoke
[`visit_ast::RustdocVisitor`] to process the module tree into an intermediate
[`visit_ast::Module`]. This is the step that actually crawls the
[`visit_ast::RustdocVisitor`] to process the module tree into an intermediate [`visit_ast::Module`].
This is the step that actually crawls the
[`rustc_hir::Crate`], normalizing various aspects of name resolution, such as:

* handling `#[doc(inline)]` and `#[doc(no_inline)]`
@@ -57,14 +60,14 @@ The first step in [`clean::utils::krate`][ck1] is to invoke
they're defined as a reexport or not

After this step, `clean::krate` invokes [`clean_doc_module`], which actually
converts the `HIR` items to the cleaned [`AST`][ast]. This is also the step where cross-
crate inlining is performed, which requires converting `rustc_middle` data
structures into the cleaned [`AST`][ast].
converts the `HIR` items to the cleaned [`AST`][ast].
This is also the step where cross-crate inlining is performed,
which requires converting `rustc_middle` data structures into the cleaned [`AST`][ast].

The other major thing that happens in `clean/mod.rs` is the collection of doc
comments and `#[doc=""]` attributes into a separate field of the [`Attributes`]
`struct`, present on anything that gets hand-written documentation. This makes it
easier to collect this documentation later in the process.
`struct`, present on anything that gets hand-written documentation.
This makes it easier to collect this documentation later in the process.

The primary output of this process is a [`clean::types::Crate`] with a tree of [`Item`]s
which describe the publicly-documentable items in the target crate.
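
As a rough illustration of the `clean_*` pattern described in this section, the sketch below converts a toy input tree into a toy "cleaned" tree and gathers doc attributes into a separate field along the way. All names here (`HirItem`, `CleanItem`, `Visibility`, `clean_item`) are hypothetical stand-ins and are not `rustc`'s or `rustdoc`'s real types.

```rust
// Toy stand-ins for compiler-side data; these are NOT rustc's real HIR types.
#[derive(Debug, Clone, PartialEq)]
enum Visibility {
    Public,
    Private,
}

#[derive(Debug, Clone)]
struct HirItem {
    name: String,
    vis: Visibility,
    // Raw attributes as (key, value) pairs, e.g. ("doc", "Some text.").
    attrs: Vec<(String, String)>,
    children: Vec<HirItem>,
}

// Toy "cleaned" item: documentation text lives in its own field.
#[derive(Debug, PartialEq)]
struct CleanItem {
    name: String,
    vis: Visibility,
    docs: String,
    children: Vec<CleanItem>,
}

// Hypothetical `clean_*`-style conversion: one input node in, one cleaned node out.
fn clean_item(item: &HirItem) -> CleanItem {
    // Collect every `doc` attribute into one block of text, one line per attribute.
    let docs = item
        .attrs
        .iter()
        .filter(|(key, _)| key == "doc")
        .map(|(_, value)| value.as_str())
        .collect::<Vec<_>>()
        .join("\n");

    CleanItem {
        name: item.name.clone(),
        vis: item.vis.clone(),
        docs,
        // Children are converted recursively with the same function.
        children: item.children.iter().map(clean_item).collect(),
    }
}
```

Calling `clean_item` on the root of a toy tree yields a `CleanItem` tree of the same shape, with non-`doc` attributes dropped and the `doc` text joined in source order.
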
@@ -90,13 +93,14 @@ which describe the publicly-documentable items in the target crate.
### Passes Anything But a Gas Station (or: [Hot Potato](https://www.youtube.com/watch?v=WNFBIt5HxdY))

Before moving on to the next major step, a few important "passes" occur over
the cleaned [`AST`][ast]. Several of these passes are `lint`s and reports, but some of
them mutate or generate new items.
the cleaned [`AST`][ast].
Several of these passes are `lint`s and reports, but some of them mutate or generate new items.

These are all implemented in the [`librustdoc/passes`] directory, one file per pass.
By default, all of these passes are run on a crate, but the ones
regarding dropping private/hidden items can be bypassed by passing
`--document-private-items` to `rustdoc`. Note that unlike the previous set of [`AST`][ast]
`--document-private-items` to `rustdoc`.
Note that, unlike the previous set of [`AST`][ast]
transformations, the passes are run on the _cleaned_ crate.

Here is the list of passes as of <!-- date-check --> March 2023:
@@ -105,8 +109,7 @@ Here is the list of passes as of <!-- date-check --> March 2023:
flag.

- `check-doc-test-visibility` runs `doctest` visibility–related `lint`s. This pass
runs before `strip-private`, which is why it needs to be separate from
`run-lints`.
runs before `strip-private`, which is why it needs to be separate from `run-lints`.

- `collect-intra-doc-links` resolves [intra-doc links](https://doc.rust-lang.org/nightly/rustdoc/write-documentation/linking-to-items-by-name.html).

@@ -121,8 +124,8 @@ Here is the list of passes as of <!-- date-check --> March 2023:

- `bare_urls` detects links that are not linkified, e.g., in Markdown such as
`Go to https://example.com/.` It suggests wrapping the link with angle brackets:
`Go to <https://example.com/>.` to linkify it. This is the code behind the <!--
date-check: may 2022 --> `rustdoc::bare_urls` `lint`.
`Go to <https://example.com/>.` to linkify it.
This is the code behind the <!-- date-check: may 2022 --> `rustdoc::bare_urls` `lint`.

- `check_code_block_syntax` validates syntax inside Rust code blocks
(<code>```rust</code>)
@@ -131,33 +134,37 @@ Here is the list of passes as of <!-- date-check --> March 2023:
in doc comments.

- `strip-hidden` and `strip-private` strip all `doc(hidden)` and private items
from the output. `strip-private` implies `strip-priv-imports`. Basically, the
goal is to remove items that are not relevant for public documentation. This
pass is skipped when `--document-hidden-items` is passed.
from the output.
`strip-private` implies `strip-priv-imports`.
Basically, the goal is to remove items that are not relevant for public documentation.
This pass is skipped when `--document-hidden-items` is passed.

- `strip-priv-imports` strips all private import statements (`use`, `extern
crate`) from a crate. This is necessary because `rustdoc` will handle *public*
crate`) from a crate.
This is necessary because `rustdoc` will handle *public*
imports by either inlining the item's documentation to the module or creating
a "Reexports" section with the import in it. The pass ensures that all of
these imports are actually relevant to documentation. It is technically
only run when `--document-private-items` is passed, but `strip-private`
a "Reexports" section with the import in it.
The pass ensures that all of these imports are actually relevant to documentation.
It is technically only run when `--document-private-items` is passed, but `strip-private`
accomplishes the same thing.

- `strip-private` strips all private items from a crate which cannot be seen
externally. This pass is skipped when `--document-private-items` is passed.
externally.
This pass is skipped when `--document-private-items` is passed.

There is also a [`stripper`] module in `librustdoc/passes`, but it is a
collection of utility functions for the `strip-*` passes and is not a pass
itself.
collection of utility functions for the `strip-*` passes and is not a pass itself.

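To make the stripping behaviour concrete, here is a minimal sketch of a pass that drops private and hidden items from a toy item tree unless the corresponding flag is set. The types `Item` and `StripOptions` and the function `strip` are hypothetical; the real passes operate on `rustdoc`'s cleaned items and are organized differently.

```rust
// Toy cleaned item; rustdoc's real `Item` carries far more information.
#[derive(Debug, Clone, PartialEq)]
struct Item {
    name: String,
    is_public: bool,
    is_doc_hidden: bool,
    children: Vec<Item>,
}

// Hypothetical options mirroring the two command-line flags named in the text.
#[derive(Clone, Copy)]
struct StripOptions {
    document_private_items: bool,
    document_hidden_items: bool,
}

// Returns `None` when the item should be dropped; otherwise returns the item
// with its children filtered by the same rule.
fn strip(item: Item, opts: StripOptions) -> Option<Item> {
    let private_dropped = !item.is_public && !opts.document_private_items;
    let hidden_dropped = item.is_doc_hidden && !opts.document_hidden_items;
    if private_dropped || hidden_dropped {
        return None;
    }
    let Item { name, is_public, is_doc_hidden, children } = item;
    let children = children
        .into_iter()
        .filter_map(|child| strip(child, opts))
        .collect();
    Some(Item { name, is_public, is_doc_hidden, children })
}
```

With both options set to `false`, only public, non-hidden items survive; setting `document_private_items` keeps private items but still removes hidden ones.
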
[`librustdoc/passes`]: https://github.com/rust-lang/rust/tree/HEAD/src/librustdoc/passes
[`stripper`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustdoc/passes/stripper/index.html

## From Clean To HTML
## From clean to HTML

This is where the "second phase" in `rustdoc` begins. This phase primarily lives
This is where the "second phase" in `rustdoc` begins.
This phase primarily lives
in the [`librustdoc/formats`] and [`librustdoc/html`] folders, and it all starts with
[`formats::renderer::run_format`]. This code is responsible for setting up a type that
[`formats::renderer::run_format`].
This code is responsible for setting up a type that
`impl FormatRenderer`, which for `HTML` is [`Context`].

This structure contains methods that get called by `run_format` to drive the
@@ -168,8 +175,8 @@ doc rendering, which includes:
* `after_krate` generates other global resources like `all.html`

In `item`, the "page rendering" occurs, via a mixture of [Askama] templates
and manual `write!()` calls, starting in [`html/layout.rs`]. The parts that have
not been converted to templates occur within a series of `std::fmt::Display`
and manual `write!()` calls, starting in [`html/layout.rs`].
The parts that have not been converted to templates occur within a series of `std::fmt::Display`
implementations and functions that pass around a `&mut std::fmt::Formatter`.

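The driver-and-renderer split can be pictured with the following sketch. Only the method names `item` and `after_krate` are taken from the text; the trait `ToyRenderer`, the driver `run_toy_format`, the types `ToyItem` and `PageList`, and all signatures are assumptions made for illustration and do not match `FormatRenderer`'s real interface.

```rust
// Toy item tree; rustdoc's real cleaned items are much richer.
struct ToyItem {
    name: String,
    children: Vec<ToyItem>,
}

// Hypothetical, pared-down renderer interface. Only the method names `item`
// and `after_krate` come from the text; the signatures are assumptions.
trait ToyRenderer {
    fn item(&mut self, item: &ToyItem);
    fn after_krate(&mut self);
}

// Hypothetical driver: visits every item depth-first, then lets the renderer
// emit its crate-wide resources.
fn run_toy_format<R: ToyRenderer>(renderer: &mut R, root: &ToyItem) {
    fn walk<R: ToyRenderer>(renderer: &mut R, item: &ToyItem) {
        renderer.item(item);
        for child in &item.children {
            walk(renderer, child);
        }
    }
    walk(renderer, root);
    renderer.after_krate();
}

// Example renderer that only records which "pages" it would have written.
struct PageList {
    pages: Vec<String>,
}

impl ToyRenderer for PageList {
    fn item(&mut self, item: &ToyItem) {
        self.pages.push(format!("{}.html", item.name));
    }

    fn after_krate(&mut self) {
        self.pages.push("all.html".to_string());
    }
}
```

The point of the split is that the driver owns the traversal order, so a different output format only needs another implementation of the trait.
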
The parts that actually generate `HTML` from the items and documentation start
@@ -183,11 +190,13 @@ pieces like "how should I print a where clause as part of some other item".

Whenever `rustdoc` comes across an item that should print hand-written
documentation alongside, it calls out to [`html/markdown.rs`] which interfaces
with the Markdown parser. This is exposed as a series of types that wrap a
string of Markdown, and implement `fmt::Display` to emit `HTML` text. It takes
special care to enable certain features like footnotes and tables and add
with the Markdown parser.
This is exposed as a series of types that wrap a
string of Markdown, and implement `fmt::Display` to emit `HTML` text.
It takes special care to enable certain features like footnotes and tables and add
syntax highlighting to Rust code blocks (via `html/highlight.rs`) before
running the Markdown parser. There's also a function [`find_codes`] which is
running the Markdown parser.
There's also a function [`find_codes`] which is
called by `find_testable_code` that specifically scans for Rust code blocks so
the test-runner code can find all the `doctest`s in the crate.

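The "type that wraps a string of Markdown and implements `fmt::Display`" idea can be sketched as follows. `ToyMarkdown` and `escape_html` are hypothetical names, and this toy version only understands blank-line-separated paragraphs; the real types in `html/markdown.rs` hand the text to an actual Markdown parser.

```rust
use std::fmt;

// Hypothetical wrapper type: holds Markdown source and renders HTML on display.
struct ToyMarkdown<'a>(&'a str);

// Minimal HTML escaping for text content. `&` must be replaced first so the
// entities produced for `<` and `>` are not escaped a second time.
fn escape_html(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

impl fmt::Display for ToyMarkdown<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Treat each blank-line-separated chunk as one paragraph.
        for paragraph in self.0.split("\n\n") {
            let text = paragraph.trim();
            if text.is_empty() {
                continue;
            }
            writeln!(f, "<p>{}</p>", escape_html(text))?;
        }
        Ok(())
    }
}
```

Because the wrapper implements `Display`, it can be dropped straight into a `write!()` call or a template, for example `write!(out, "{}", ToyMarkdown(docs))`.
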
@@ -208,11 +217,11 @@ the test-runner code can find all the `doctest`s in the crate.
[video]: https://www.youtube.com/watch?v=hOLAGYmUQV0

It's important to note that `rustdoc` can ask the compiler for type information
directly, even during `HTML` generation. This [didn't used to be the case], and
directly, even during `HTML` generation.
This [didn't used to be the case], and
a lot of `rustdoc`'s architecture was designed around not doing that, but a
`TyCtxt` is now passed to `formats::renderer::run_format`, which is used to
run generation for both `HTML` and the
(unstable as of <!-- date-check --> March 2023) JSON format.
run generation for both `HTML` and the (unstable as of <!-- date-check --> Nov 2025) JSON format.

This change has allowed other changes to remove data from the "clean" [`AST`][ast]
that can be easily derived from `TyCtxt` queries, and we'll usually accept
@@ -222,18 +231,17 @@ is complicated from two other constraints that `rustdoc` runs under:
* Docs can be generated for crates that don't actually pass type checking.
This is used for generating docs that cover mutually-exclusive platform
configurations, such as `libstd` having a single package of docs that
cover all supported operating systems. This means `rustdoc` has to be able
to generate docs from `HIR`.
cover all supported operating systems.
This means `rustdoc` has to be able to generate docs from `HIR`.
* Docs can inline across crates. Since crate metadata doesn't contain `HIR`,
it must be possible to generate inlined docs from the `rustc_middle` data.

The "clean" [`AST`][ast] acts as a common output format for both input formats. There
is also some data in clean that doesn't correspond directly to `HIR`, such as
synthetic `impl`s for auto traits and blanket `impl`s generated by the
`collect-trait-impls` pass.
The "clean" [`AST`][ast] acts as a common output format for both input formats.
There is also some data in clean that doesn't correspond directly to `HIR`, such as
synthetic `impl`s for auto traits and blanket `impl`s generated by the `collect-trait-impls` pass.

Some additional data is stored in
`html::render::context::{Context, SharedContext}`. These two types serve as
Some additional data is stored in `html::render::context::{Context, SharedContext}`.
These two types serve as
ways to segregate `rustdoc`'s data for an eventual future with multithreaded doc
generation, as well as just keeping things organized:

@@ -247,44 +255,44 @@ generation, as well as just keeping things organized:
[didn't used to be the case]: https://github.com/rust-lang/rust/pull/80090
[`SharedContext`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustdoc/html/render/context/struct.SharedContext.html

## Other Tricks Up Its Sleeve
## Other tricks up its sleeve

All this describes the process for generating `HTML` documentation from a Rust
crate, but there are couple other major modes that `rustdoc` runs in. It can also
be run on a standalone Markdown file, or it can run `doctest`s on Rust code or
standalone Markdown files. For the former, it shortcuts straight to
crate, but there are couple other major modes that `rustdoc` runs in.
It can also be run on a standalone Markdown file, or it can run `doctest`s on Rust code or
standalone Markdown files.
For the former, it shortcuts straight to
`html/markdown.rs`, optionally including a mode which inserts a Table of
Contents to the output `HTML`.

For the latter, `rustdoc` runs a similar partial-compilation to get relevant
documentation in `test.rs`, but instead of going through the full clean and
render process, it runs a much simpler crate walk to grab *just* the
hand-written documentation. Combined with the aforementioned
render process, it runs a much simpler crate walk to grab *just* the hand-written documentation.
Combined with the aforementioned
"`find_testable_code`" in `html/markdown.rs`, it builds up a collection of
tests to run before handing them off to the test runner. One notable location
in `test.rs` is the function `make_test`, which is where hand-written
tests to run before handing them off to the test runner.
One notable location in `test.rs` is the function `make_test`, which is where hand-written
`doctest`s get transformed into something that can be executed.

Some extra reading about `make_test` can be found
[here](https://quietmisdreavus.net/code/2018/02/23/how-the-doctests-get-made/).

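As a very rough picture of what "transformed into something that can be executed" means, the sketch below wraps a snippet in `fn main` when it does not already define one. The function `wrap_doctest` is hypothetical and heavily simplified; the real `make_test` does considerably more than this.

```rust
// Hypothetical, heavily simplified stand-in for a doctest wrapper.
// It is NOT rustdoc's `make_test`; it only shows the basic wrapping idea.
fn wrap_doctest(source: &str) -> String {
    // If the snippet already defines `main`, leave it untouched.
    // (A plain substring check is an assumption made for brevity.)
    if source.contains("fn main") {
        return source.to_string();
    }
    // Otherwise, indent every line and place it inside a generated `main`.
    let mut out = String::from("fn main() {\n");
    for line in source.lines() {
        out.push_str("    ");
        out.push_str(line);
        out.push('\n');
    }
    out.push_str("}\n");
    out
}
```

A snippet that consists only of statements, such as `let x = 1;`, comes back wrapped in `fn main() { ... }`, while a snippet that already contains `fn main` is returned unchanged.
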
## Testing Locally
## Testing locally

Some features of the generated `HTML` documentation might require local
storage to be used across pages, which doesn't work well without an `HTTP`
server. To test these features locally, you can run a local `HTTP` server, like
this:
storage to be used across pages, which doesn't work well without an `HTTP` server.
To test these features locally, you can run a local `HTTP` server, like this:

```bash
```console
$ ./x doc library
# The documentation has been generated into `build/[YOUR ARCH]/doc`.
$ python3 -m http.server -d build/[YOUR ARCH]/doc
```

Now you can browse your documentation just like you would if it was hosted
on the internet. For example, the url for `std` will be `rust/std/`.
Now you can browse your documentation just like you would if it was hosted on the internet.
For example, the url for `std` will be `rust/std/`.

## See Also
## See also

- The [`rustdoc` api docs]
- [An overview of `rustdoc`](./rustdoc.md)