Add sphinx_astropy.ext.example extension for building an example gallery #29

jonathansick · 2019-12-06T04:28:41Z

This PR implements a functional example gallery Sphinx extension, as initially described in astropy/astropy#7242. It supersedes #22.

The premise is that documentation pages often contain snippets of content that are useful in their own right, outside the context of the page where they were originally written. These snippets can be examples, how-tos, and so on. This Sphinx extension provides a way of surfacing these pieces of content into a centralized gallery.

From a documentation author's perspective, the main API is the example directive, which demarks example content:

.. example:: Title of the example
   :tags: first-tag, another-tag

   Content of the example.

   More content in the example.

This content is not part of the example.

The example directive does not change the visual appearance of the example content in the documentation text.

During the Sphinx build, though, that content is copied into a new, auto-generated page at /examples/title-of-the-example.html. An index page at /examples/index.html lists all examples. There are also pages for each tag that list the examples with that tag: /examples/tags/first-tag.html and /examples/tags/another-tag.html.

Demo

This is a demo of an example gallery generated from examples identified in the astropy.io and astropy.nddata packages.

http://astropy-example.jsick.codes.s3-website-us-east-1.amazonaws.com/examples/index.html

Configurability

There are three configuration variables:

astropy_examples_enabled. If True, the example gallery (i.e., the pages in /examples/) is generated. This defaults to False so that projects can begin using the example directive without changing the appearance of the built documentation. This provides a means of incremental adoption.
astropy_examples_dir. The directory where the example gallery is published. The default is examples.
astropy_examples_h1. Configures the character to use for "h1" headlines in reStructuredText. Defaults to #.
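
As a minimal sketch, a project's conf.py might opt in to the gallery like this. The three variable names and their defaults come from the list above; listing the extension explicitly is an assumption and may be redundant if the sphinx-astropy configuration already loads it.

```python
# conf.py (sketch): opt in to the example gallery.

# Assumption: the extension is loaded by its module path. This may be
# unnecessary if sphinx_astropy's own configuration already includes it.
extensions = ["sphinx_astropy.ext.example"]

# Generate the gallery pages (the default is False).
astropy_examples_enabled = True

# Directory where the example gallery is published (default shown).
astropy_examples_dir = "examples"

# Character used for "h1" headlines in generated reStructuredText (default shown).
astropy_examples_h1 = "#"
```

Leaving astropy_examples_enabled at its default keeps the example directive usable in the source without producing any gallery pages.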

Processing overview

During the builder-inited phase, the preprocess_examples function scans every doc in the Sphinx source tree to find example directives. This function generates a standalone reStructuredText page for each example, a page for each tag, and the main index page. By creating these pages in the builder-inited phase, they can be parsed later on during the regular build.

When the example directive is handled, the directive passes the content through, but wrapped in a custom ExampleMarkerNode. In HTML, this node becomes a <div> with class astropy-example-source and an id attribute that identifies the example.

The standalone example pages are templated to include an example-content directive. These directives add a <div> to the HTML page with a class of astropy-example-content and an id attribute that identifies the example that belongs there.

The build-finished phase is when the example content is copied onto the standalone example pages. Using BeautifulSoup4, the extension copies content within div.astropy-example-source tags and then replaces the div.astropy-example-content tags with that content.

Next steps

Beyond this PR, the next steps for this extension are:

Create a more sophisticated browsing interface on the example gallery's landing pages. We want a card layout with a thumbnail image and teaser text for each example.
Publish data about the standalone examples so that they can be listed from learn.astropy.org.
Better support for incremental Sphinx builds. Right now, the extension clears out the example directory on each build, but a more selective approach to rebuilds can be taken.

This package will gather all Sphinx extensions related to the example gallery. There will be two main parts: a directive that marks examples, and extensions that index and render those examples. This boilerplate includes the standard Sphinx setup function for the extension.

This includes the sphinx_astropy.ext.example Sphinx extension by default in Sphinx builds.

This directive marks the scope of example content in the original documentation, and lets authors add a title and tags. Currently it is a pass-through directive: it parses the content of the directive and adds it back to the document. Examples are persisted in the build environment for later post-processing (to build the example gallery). Examples are keyed by their unique ID (a slugified version of the title). The dict items contain metadata and the content of the example (to later generate standalone example pages). The directive also collects the title and tags as metadata.

This target node lets us backlink to the example in the main documentation. In HTML, the link is an id on the first element of the example content. Example IDs are unique since they're the keys of the env.sphinx_astropy_examples dictionary in the environment.

sphinx.testing.fixtures lets us build Sphinx sites from pytest and then inspect the built site. There can be multiple test sites; each test site is a directory in the sphinx_astropy/test/roots/ directory. This is the same pattern that Sphinx uses for its own tests, so this is likely the easiest way to test our Sphinx extensions (see http://www.sphinx-doc.org/en/master/devguide.html#unit-testing). Note that I had to add the pytest_plugins line, which loads the Sphinx pytest plugin, to a new conftest.py file at the root of the project, not to sphinx_astropy/tests/conftest.py. The reason for this is outlined in https://docs.pytest.org/en/latest/deprecations.html#pytest-plugins-in-non-top-level-conftest-files (pytest deprecated pytest_plugins in non-top-level conftest files). The rest of the pytest configuration is done in sphinx_astropy/tests/conftest.py, which is where you'd expect most configuration to go. This configuration is largely based on Sphinx's: https://github.com/sphinx-doc/sphinx/blob/master/tests/conftest.py

The example-marker.rst file contains several instances of the example directive, testing different conditions (having tags, or not, and having different types of content in the example).

This test generates a site in the XML format since then it's easy to search for nodes and their attributes. Unfortunately this test strategy doesn't work for Sphinx <1.7, because the pytest fixtures aren't available. Thus I have pytest skip these tests for Sphinx <1.7. I think this is still the best way to test sphinx extensions and will continue to be so in the future because this is how Sphinx tests itself.

There is a new test case (test-example-gallery-duplicates) because otherwise the SphinxError would always be raised for regular testing of the example directive.

As the docstring comment says, I found a weird case: with the numpydoc extension enabled in the test environment, I would get false alarms about duplicate examples already in the build environment. These duplicate examples came from other test functions. Somehow the environment is being preserved across builds now that numpydoc is activated. To make the ExampleMarkerDirective robust against this case, it now makes sure that the duplicate instance is from a different document and line number before raising a SphinxError.
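
The location-aware duplicate check described above could look roughly like the following. This is a sketch, not the extension's code: the function name, the "docname" and "lineno" keys, and the DuplicateExampleError stand-in (used here instead of SphinxError so the snippet runs without Sphinx) are all assumptions.

```python
class DuplicateExampleError(Exception):
    """Stand-in for the SphinxError that the extension raises."""


def check_for_existing_example(examples, example_id, docname, lineno):
    """Raise only if example_id was already registered from another location."""
    existing = examples.get(example_id)
    if existing is None:
        return
    same_place = (existing["docname"], existing["lineno"]) == (docname, lineno)
    if not same_place:
        raise DuplicateExampleError(
            f"Example {example_id!r} at {docname}:{lineno} duplicates "
            f"{existing['docname']}:{existing['lineno']}"
        )


examples = {"my-example": {"docname": "index", "lineno": 10}}

# Seeing the same directive again (a preserved environment) is not an error.
check_for_existing_example(examples, "my-example", "index", 10)

# The same ID from a different document is a real duplicate.
try:
    check_for_existing_example(examples, "my-example", "other", 3)
    raised = False
except DuplicateExampleError:
    raised = True
```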

If the tests are failing, it's useful to see the debug-level logging. Note: the '2' verbosity enables DEBUG-level logging. I can't find a cleaner alias to this. Also, it would be nicer to make this the default while using the sphinx pytest mark, but I can't find an easy way to do that.

Now the test-example-gallery root is using an autodoc+numpydoc processing pipeline in its build configuration. This confirms that the directive does work in a docstring as expected.

The purge_doc callback is required to remove examples from a cached environment if a document is removed in a subsequent build. Otherwise the examples in the cached environment from previous builds would continue to exist in subsequent builds. The tests simulate an env-purge-doc event and separately ensure that purge_examples got registered as an env-purge-doc callback.
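
A purge callback of this kind might be sketched as follows. The (app, env, docname) signature is the one Sphinx uses for env-purge-doc; the "docname" key on each cached example is an assumption, and a SimpleNamespace stands in for the BuildEnvironment so the snippet runs without Sphinx.

```python
from types import SimpleNamespace


def purge_examples(app, env, docname):
    """env-purge-doc callback: drop cached examples that came from docname."""
    if not hasattr(env, "sphinx_astropy_examples"):
        return
    # Assumption: each cached example records its source page under "docname".
    env.sphinx_astropy_examples = {
        example_id: example
        for example_id, example in env.sphinx_astropy_examples.items()
        if example["docname"] != docname
    }


# Stand-in for a BuildEnvironment that carries two cached examples.
env = SimpleNamespace(sphinx_astropy_examples={
    "a": {"docname": "page-one"},
    "b": {"docname": "page-two"},
})
purge_examples(None, env, "page-one")
remaining = sorted(env.sphinx_astropy_examples)
```

In an extension's setup function, such a callback would be registered with app.connect("env-purge-doc", purge_examples).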

This refactoring allows _check_for_existing_example to be used outside the ExampleMarkerDirective, like in an env-merge-info event callback.

This env-merge-info callback handles merging sphinx_astropy_examples from parallel build environments when Sphinx is run in parallel read mode. The tests run a full-scale integration-type build with Sphinx running in parallel (-j 4).
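
As a rough sketch of such a merge callback: the (app, env, docnames, other) signature is the one Sphinx uses for env-merge-info, where other is the environment from a parallel subprocess. The body is an assumption, and SimpleNamespace objects stand in for the environments.

```python
from types import SimpleNamespace


def merge_examples(app, env, docnames, other):
    """env-merge-info callback: fold a subprocess's examples into env."""
    if not hasattr(env, "sphinx_astropy_examples"):
        env.sphinx_astropy_examples = {}
    other_examples = getattr(other, "sphinx_astropy_examples", {})
    for example_id, example in other_examples.items():
        # This sketch keeps the first entry seen for an ID; a fuller
        # callback would run a duplicate check before storing.
        env.sphinx_astropy_examples.setdefault(example_id, example)


main_env = SimpleNamespace(sphinx_astropy_examples={"a": {"docname": "one"}})
worker_env = SimpleNamespace(sphinx_astropy_examples={"b": {"docname": "two"}})
merge_examples(None, main_env, ["two"], worker_env)
merged_ids = sorted(main_env.sphinx_astropy_examples)
```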

These should be separate things so that the "example ID" is the slugified version of the title (which is unique by design). Then the ref ID has the example-src prefix to be a unique reference ID to the example's source location. Also adds a ref_id field to the example's dict in the build environment.

This way the translation between a title and an example ID is codified into an API that can be used in multiple places.
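
The two identifiers could be derived along these lines. This is a sketch under assumptions: format_title_to_example_id is a hypothetical name, the slug rules are a guess, and joining the example-src prefix with a hyphen is also a guess. Only the name format_example_id_to_source_ref_id appears in this PR's commits.

```python
import re


def format_title_to_example_id(title):
    """Slugify a title into an example ID (hypothetical implementation)."""
    # Lowercase, then collapse each run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def format_example_id_to_source_ref_id(example_id):
    """Build the reference ID for the example's source location."""
    # Assumption: the prefix and the example ID are joined with a hyphen.
    return "example-src-" + example_id


example_id = format_title_to_example_id("Title of the example")
ref_id = format_example_id_to_source_ref_id(example_id)
```

With these rules, the title used in the PR description maps to the ID seen in its generated page path, title-of-the-example.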

The content_node key in the example data stores a copy of the parsed docutils nodes for the example. It turns out that it's easier to use the parsed content here rather than parse it during the process_pending_example_nodes() callback where a "state" is not readily available for parsing reStructuredText.

The strategy behind detect_examples() is to identify examples in the reStructuredText source before Sphinx parses them by using a regular expression. This lets us create stubs for example pages before Sphinx does its regular parsing.
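
A regular-expression scan of this kind might look like the sketch below. The pattern and the return shape are assumptions, not the extension's actual detect_examples() implementation; the sketch assumes the :tags: option, when present, sits on the line directly after the directive.

```python
import re

# Assumption: the directive starts a line (optionally indented) and the
# :tags: option, if any, is on the very next line.
EXAMPLE_PATTERN = re.compile(
    r"^[ \t]*\.\. example::[ \t]*(?P<title>\S.*?)[ \t]*$"
    r"(?:\n[ \t]+:tags:[ \t]*(?P<tags>.*?)[ \t]*$)?",
    re.MULTILINE,
)


def detect_examples(source):
    """Yield (title, tags) for each example directive in rst source text."""
    for match in EXAMPLE_PATTERN.finditer(source):
        raw_tags = match.group("tags") or ""
        tags = [tag.strip() for tag in raw_tags.split(",") if tag.strip()]
        yield match.group("title"), tags


rst = """\
.. example:: Title of the example
   :tags: first-tag, another-tag

   Content of the example.

.. example:: Untagged example

   More content.
"""
found = list(detect_examples(rst))
```

Because the scan works on raw text, it only sees directives written in the source files themselves.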

This configuration lets us control the directory where the example gallery is generated.

The ExamplePage class builds upon ExampleSource, but now contains the concerns about rendering a standalone example page.

Templates for the standalone example pages, landing pages, tag browsing pages, and so on, can be Jinja-formatted templates. This implementation is adapted from sphinx.ext.autosummary, which has similar needs. By doing this, the user can customize the templates at the builder, theme, or project level. The renderer is integrated into the ExamplePage class (ExamplePage.render), which automatically detects and uses a template named 'astropy_example/examplepage.rst'. The extension ships with a default implementation of the template. The tests demonstrate rendering a standalone example page. This isn't hooked up to piping in the content of the example yet, though.

This commit puts together the work on detecting examples from source (detect_examples) and the work on rendering stubs for standalone example pages (ExamplePage), and runs them as a pipeline during the builder-inited event. This event happens early in the Sphinx build process so that we can create example pages before Sphinx actually begins to read and process these pages. This also adds a new config variable, astropy_examples_h1, which customizes the underline character used for making titles for "h1" headings in reStructuredText.
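
For illustration, applying astropy_examples_h1 when a stub page's title is written might look like this. The helper name render_h1 is hypothetical, and the sketch underlines only, since an overline is not mentioned here.

```python
def render_h1(title, h1_char="#"):
    """Return an rst top-level heading with a matching-length underline."""
    underline = h1_char * len(title)
    return f"{title}\n{underline}\n"


heading = render_h1("Title of the example")
alternate = render_h1("Examples", h1_char="=")
```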

The landing page is the index.rst for the example gallery. It provides a toctree for all the examples. The LandingPage class is implemented similarly to the ExamplePage class in that it takes page data and is responsible for computing paths, docnames, and rendering for itself. In the future the template for the landing page could be enhanced into a tiled gallery view, for example. The test verifies that the index.rst file's reStructuredText is rendered correctly. Since there's now a toctree, individual example pages don't need the `:orphan:` field.

The TagPage is like a specialized version of the LandingPage that indexes examples that have a given tag. The TagPage.generate_tag_pages constructor simultaneously makes tag pages for the set of tags given the population of examples, and also provides references to those tag pages with each relevant example page.

This provides a nice way to categorize examples and to provide discovery of other tags. This is implemented purely in the Jinja templating layer.

This makes it easier to use from Jinja to test lengths.

This demonstrates how to use Jinja templating to provide links from a standalone example page back to the original source page and to pages for each associated tag.

This directive inserts parsed content for the example from the application environment.

The source pages are read *before* standalone example pages are read to ensure that they can be parsed into the environment and are available to the ExampleContentDirective.
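
A reordering callback for env-before-read-docs could be sketched as below. Sphinx passes (app, env, docnames) and allows the docnames list to be reordered in place. The body is an assumption, and a SimpleNamespace stands in for the Sphinx application.

```python
from types import SimpleNamespace


def reorder_example_page_reading(app, env, docnames):
    """env-before-read-docs callback: read gallery pages after source pages."""
    prefix = app.config.astropy_examples_dir.strip("/") + "/"
    gallery = [name for name in docnames if name.startswith(prefix)]
    sources = [name for name in docnames if not name.startswith(prefix)]
    # The event expects the list to be modified in place.
    docnames[:] = sources + gallery


app = SimpleNamespace(config=SimpleNamespace(astropy_examples_dir="examples"))
docnames = ["examples/index", "examples/my-example", "index", "io/fits"]
reorder_example_page_reading(app, None, docnames)
```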

With Sphinx <= 1.7, :download: roles with external (i.e., https:// urls) download links don't work. This prevents these tests from running.

Named equation reference links do not seem to work with Sphinx 1.7. As well, the format for download links from the Matplotlib plot extension with Sphinx 1.7 is different compared to more recent Sphinx versions, so it's best to skip that test since the tests would need to be customized for Sphinx 1.7.

Since substitution_reference nodes are resolved just _after_ the ExampleMarkerDirective is run, substitution_reference nodes could be part of the example content that is republished on the standalone example page. Thus we also need to copy the substitution_definition nodes to include them in the standalone example page. Based on experimentation, it seems that traversing the document's nodes and the directive content's nodes together only gets substitution definitions that are written above or within the example directive. This is a caveat that will need to be added to the documentation.

This commit takes the substitution_definitions field captured by the ExampleMarkerDirective and inserts those substitution_definition nodes into the standalone example page. The key part of this is to ensure we call document.note_substitution_def because these new substitution_defs weren't already parsed.

Normally the document.note_footnote and document.note_footnote_refs (and their autonumbered counterparts) are called when reStructuredText is parsed. However, since the footnotes and footnote_refs are pre-parsed in a standalone example page, we need to manually note them. This state is consumed by the docutils Footnotes transform.

It will be useful for ExampleSource.docname to be a real docname so it can be used with the BuildEnvironment APIs that translate between docnames and paths. Now the example page template can use a new, separate attribute abs_docname that is useful in the doc role.

This provides the actual docname, for use with Sphinx APIs.

Now the ExampleMarkerDirective wraps the generated example in a custom container node, ExampleMarkerNode. This node has visitors that render it as a <div> in HTML with a class of astropy-example-source. This div+class marks the content of an example so that it can be copied by the HTML postprocessor into the standalone example pages. NOTE: is_node_registered() is backported here from Sphinx 1.8+. This small function is needed for tests, and will no longer be needed here if Sphinx 1.8 becomes the minimum supported version of the extension.

This node, ExampleContentNode, just provides an empty <div> on the HTML page. The class is "astropy-example-content" and the ID is the ID of the example (which will be used to look up the example content). Because we're no longer adding example content into the standalone example page as part of the regular Sphinx/docutils build, we no longer need to access and process any of the docutils nodes related to the original example content. Many tests are skipped because the expected content isn't rendered given this change. These tests will be reactivated later when examples are being published via an HTML postprocessor.

Content nodes and metadata are no longer stored in the `sphinx_astropy_examples` attribute of the BuildEnvironment. Now that examples are copied and rendered as an HTML post-processing step, we don't need to keep this metadata in the Sphinx build. Consequently, some additional hooks can also go:

- merge_examples in env-merge-info
- purge_examples in env-purge-doc
- reorder_example_page_reading in env-before-read-docs

This also means that the extension can be marked as safe for reading in parallel. Some tests no longer apply because the BuildEnvironment.sphinx_astropy_examples attribute is no longer set. In many cases, I've marked these to be skipped; we'll want to reactivate these tests in the future given the new scheme.

Metadata is cached to the sphinx_astropy_examples attribute of the BuildEnvironment as part of the example gallery preprocessing step during builder-inited (recall that before, this caching was done as part of the example directive). This is necessary so that the postprocessing step knows where to find the source for each example and the associated standalone example page. This is now also the best way to check if there are duplicate examples since all examples are combined at this point. Notes on tests:

1. It seems this SphinxError is raised during the equivalent of the test setup with the pytest.mark.sphinx pytest extension. This means that a full build is the best way to simulate this.
2. Because docstrings are not scanned by the preprocessor, examples embedded in docstrings are no longer part of the example gallery. Thus I've dropped the associated test that looked for the example from the example_func docstring.

Will be used to go from example ID to the reference label for the example's source.

This is used by the example extension to manipulate built HTML pages to insert examples into standalone pages.

This implements a new approach for populating standalone example pages. Rather than republish the examples to standalone example pages as part of the Sphinx build, this approach operates on the HTML that Sphinx has built. The postprocess_examples function operates as a hook for the build-finished Sphinx event. For every example in the cached sphinx_astropy_examples attribute of the BuildEnvironment, this method extracts the example from the source page, adapts any relative links and image sources, and inserts the example's <div> into the standalone example page. If config.astropy_examples_enabled is False, then do not run post-processing.

With the large change in sphinx_astropy.ext.example to populate examples by post-processing HTML rather than copying docutils nodes, we had to temporarily disable many of the unit tests. This commit reworks the tests to work with the new processing strategy. The pytest.mark.sphinx mark doesn't work with the build-finished event (not sure why), which means we can't use that approach to test the HTML output. Instead, we use the CLI-based build of the example projects. Since a lot of test functions consume those builds, I've turned them into session-scoped fixtures. Now that BeautifulSoup4 is a dependency of sphinx_astropy, the tests use BeautifulSoup to check the build output. This approach is a lot cleaner than directly using the built-in html.parser module. Finally, I've also organized the tests around different types:

1. Unit tests that don't depend on a Sphinx build.
2. Unit tests that use the ``sphinx`` pytest mark. These tests operate on the Sphinx application instance after a build and test environment persistence.
3. Tests that run a Sphinx build through its command-line interface and analyze the resulting HTML product.

A link to an anchor, like #section-id, now works like this:

1. If the ID exists within the scope of the example, then the href is left as-is. This is what the reader expects, and it minimizes the disruption of following a link in an example to a different part of the example.
2. If the ID doesn't exist in the example, but only on the source page, then the href is adapted to point back to the source page.
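
The two anchor rules can be sketched in plain Python as follows. The function name, its parameters, and the use of site-relative page paths are all assumptions for illustration; the extension itself works on the built HTML with BeautifulSoup4.

```python
import posixpath


def adapt_anchor_href(href, example_ids, source_path, example_path):
    """Return the href to use on the standalone example page.

    href: the original value, such as "#section-id".
    example_ids: set of id attributes found inside the example's HTML.
    source_path, example_path: site-relative paths of the two pages.
    """
    if not href.startswith("#"):
        return href  # Not a same-page anchor; outside this sketch's scope.
    if href[1:] in example_ids:
        return href  # The target travels with the example, so keep it local.
    # The target exists only on the source page, so link back to that page.
    relative = posixpath.relpath(source_path, posixpath.dirname(example_path))
    return relative + href


ids = {"step-one"}
local = adapt_anchor_href("#step-one", ids, "io/fits.html", "examples/my-example.html")
remote = adapt_anchor_href("#see-also", ids, "io/fits.html", "examples/my-example.html")
```

Here local stays a same-page anchor, while remote becomes a relative link from the example page back to the source page.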

We want to support incremental rebuilds so that only those examples on pages that were changed are re-scanned and rebuilt. However, that involves finer-grained cache invalidation. To make the example extension work with incremental rebuilds we're starting with the *simplest* thing, which is to start the example gallery fresh on each build. The sphinx_astropy_examples attribute on the BuildEnvironment was already being reset on each build; now the example directory in the source tree itself is deleted and re-created on each Sphinx build.
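
The "start fresh" step amounts to something like the sketch below. The helper name reset_examples_dir is hypothetical, and a temporary directory stands in for the Sphinx source tree.

```python
import shutil
import tempfile
from pathlib import Path


def reset_examples_dir(srcdir, examples_dir="examples"):
    """Delete and re-create the gallery directory inside the source tree."""
    path = Path(srcdir) / examples_dir
    if path.exists():
        shutil.rmtree(path)
    path.mkdir(parents=True)
    return path


with tempfile.TemporaryDirectory() as srcdir:
    # Simulate a stub left over from a previous build.
    stale = Path(srcdir) / "examples" / "old-example.rst"
    stale.parent.mkdir()
    stale.write_text("stale stub")

    fresh = reset_examples_dir(srcdir)
    leftover = list(fresh.iterdir())
```

The cost of this approach is that every stub page is regenerated on each build, which is the trade-off the paragraph above accepts for now.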

Since Sphinx 1.7 is the minimum required version, it's no longer necessary to avoid testing against Sphinx 1.6.

The original 'roots' terminology is based on how Sphinx organizes its own tests. Specifically, that terminology also carries over into the pytest.mark.sphinx marker's testroot parameter. To match sphinx-automodapi, this commit renames the 'roots' directory to 'cases' (as in, test cases). The pytest.mark.sphinx marker still works, but I've added an extra "casesdir" pytest fixture to refer to the cases directory with the canonical terminology.

This is a draft of "getting started" documentation for sphinx_astropy.ext.example. Right now it isn't part of a documentation build, but could be once a Sphinx project is set up.

jonathansick · 2019-12-19T15:31:39Z

@astrofrog do you want to review this as-is, or should I first transplant this into a standalone Python package and separate GitHub repo?

I was thinking about what the package should be named, in that case. PyPI sphinx-examples is available. I was also thinking about the comment from the coord meeting that we aren't necessarily building an "example gallery" (assuming that a gallery implies images/plots). Can we incorporate some terminology into the name that reflects its nature of transplanting existing content? Maybe sphinx-highlights? What do you think @kelle ?

jonathansick · 2019-12-19T16:02:01Z

I remember that "example library" was an emerging consensus from the Coord meeting. sphinx-example-library could be the PyPI package?

bsipocz · 2019-12-19T17:25:35Z

I rather like the mix of the two from above: sphinx-example-highlights. Calling it a library overloads the word with its everyday meaning, and that's a bit confusing...

astrofrog · 2020-04-25T21:01:07Z

@jonathansick - can we close this PR now that the extension lives in https://github.com/astropy/sphinx-example-index?

jonathansick · 2020-05-01T14:12:47Z

Yep, moving this over to the sphinx-example-index repo right now.

jonathansick added 30 commits December 4, 2019 21:44

Add example extension to v1 sphinx config

4264474

This includes the sphinx_astropy.ext.example Sphinx extension by default in Sphinx builds.

Add an example-gallery test site

d52a3e4

The example-marker.rst file contains several instances of the example directive, testing different conditions (having tags, or not, and having different types of content in the example).

Test example directive's persistence to app env

acc769e

Test for examples with duplicate titles

655d8e3

There is a new test case (test-example-gallery-duplicates) because otherwise the SphinxError would always be raised for regular testing of the example directive.

Test an example directive in numpydoc docstring

81b3d37

Now the test-example-gallery root is using an autodoc+numpydoc processing pipeline in its build configuration. This confirms that the directive does work in a docstring as expected.

Refactor check for existing examples

47834a5

This refactoring allows _check_for_existing_example to be used outside the ExampleMarkerDirective, like in an env-merge-info event callback.

Add merge_examples callback for parallel builds

8d4f597

This env-merge-info callback handles merging sphinx_astropy_examples from parallel build environments when Sphinx is run in parallel read mode. The tests run a full-scale integration-type build with Sphinx running in parallel (-j 4).

Refactor example ID generation to function

d8b56d2

This way the translation between a title and an example ID is codified into an API that can be used in multiple places.

Detect examples by regex

5c6403f

The strategy behind detect_examples() is to identify examples in the reStructuredText source before Sphinx parses them by using a regular expression. This lets us create stubs for example pages before Sphinx does its regular parsing.

Add astropy_examples_dir config variable

496c0c1

This configuration lets us control the directory where the example gallery is generated.

Add the ExamplePage class

8318c9a

The ExamplePage class builds upon ExampleSource, but now contains the concerns about rendering a standalone example page.

Landing and tag pages list tags of each example

89bcf7e

This provides a nice way to categorize examples and to provide discovery of other tags. This is implemented purely in the Jinja templating layer.

Change tag_pages to a list property

d50461d

This makes it easier to use from Jinja to test lengths.

Link from example pages tags and original

d434bbe

This demonstrates how to use Jinja templating to provide links from a standalone example page back to the original source page and to pages for each associated tag.

Add ExampleContentDirective

69e2177

This directive inserts parsed content for the example from the application environment.

Add env-before-read-docs cb to reorder docs

e310889

The source pages are read *before* standalone example pages are read to ensure that they can be parsed into the environment and are available to the ExampleContentDirective.

jonathansick added 22 commits December 4, 2019 22:00

Sphinx 1.7 does not support external downloads

739ec9a

With Sphinx <= 1.7, :download: roles with external (i.e., https:// urls) download links don't work. This prevents these tests from running.

Add test with an internally-defined substitution

abe42af

Add ExamplePage.docname attribute

efd0a9e

This provides the actual docname, for use with Sphinx APIs.

Add format_example_id_to_source_ref_id

ee65eae

Will be used to go from example ID to the reference label for the example's source.

Add dependency on beautifulsoup4

d65dda0

This is used by the example extension to manipulate built HTML pages to insert examples into standalone pages.

Drop importskip on versions < Sphinx 1.7

cc08a83

Since Sphinx 1.7 is the minimum required version, it's no longer necessary to avoid testing against Sphinx 1.6.

Announce new example extension in change log

00349f9

Add a user documentation page for the example ext

f12be5f

This is a draft of "getting started" documentation for sphinx_astropy.ext.example. Right now it isn't part of a documentation build, but could be once a Sphinx project is set up.

jonathansick mentioned this pull request Mar 29, 2020

Initial packaging astropy/sphinx-example-index#1

Merged

jonathansick closed this May 1, 2020

jonathansick mentioned this pull request May 1, 2020

Port initial extension astropy/sphinx-example-index#2

Draft
