We wanted a configurable and easy-to-use Sphinx API documentation generator for our C++ projects. To achieve this we leaned on others for inspiration:
- Breathe (https://github.com/michaeljones/breathe): Excellent extension and the default choice for many.
- Gasp (https://github.com/troelsfr/Gasp): Gasp inspired us by allowing templates to control the output. Unfortunately, development of Gasp seems to have stopped.
So what is wurfdocs:

- Essentially we picked up where Gasp left off. We have borrowed the idea of templates to make the output highly configurable.
- We made it easy to use by automatically running Doxygen to generate the initial API documentation.
- We parse the Doxygen XML into an easy-to-use Python dictionary, which can be consumed in the templates.
- We prepared the extension for other backends (replacing Doxygen), e.g. https://github.com/foonathan/standardese, once they become ready.
We are still very much in the initial development phase - all things are subject to change.
- Parsing Doxygen XML: We do not support everything yet (and probably never will). We are still missing some crucial elements, like proper parsing of the text elements in comments, parameter descriptions, etc.
To use the extension, the following steps are needed:
Install the extension, e.g.::

    pip install wurfdocs
If you already have a Sphinx documentation setup, go to step 4; otherwise continue with step 3.

Generate the initial Sphinx documentation by running::

    mkdir docs
    cd docs
    python sphinx-quickstart
You will need to enter some basic information about your project such as the project name etc.
Open the ``conf.py`` generated by ``sphinx-quickstart`` and add the following::

    # Append or insert 'wurfdocs' in the extensions list
    extensions = ['wurfdocs']

    # Wurfdocs options - relative to your docs dir
    wurfdocs = {
        'source_path': '../src',
        'parser': {'type': 'doxygen', 'download': True}
    }
Note: if you do not want to automatically download Doxygen, set ``download`` to ``False``. In that case ``wurfdocs`` will try to invoke plain ``doxygen`` without specifying any path or similar. This means that ``doxygen`` must be available in the path.

To generate the API documentation for a class, open an ``.rst`` file, e.g. ``index.rst`` if you ran ``sphinx-quickstart``. Say we want to generate docs for a class called ``machine`` in the namespace ``project::coffee``. To do this we add the following directive to the rst file:
.. wurfdocs:: class_synopsis.rst
   :selector: project::coffee::machine
Such that ``index.rst`` becomes something like::

    Welcome to Coffee's documentation!
    ==================================

    .. toctree::
       :maxdepth: 2
       :caption: Contents:

    .. wurfdocs:: class_synopsis.rst
       :selector: project::coffee::machine

    Indices and tables
    ==================

    * :ref:`genindex`
    * :ref:`modindex`
    * :ref:`search`

To do this we use the ``class_synopsis.rst`` template.
To use this on readthedocs.org you need to have the ``wurfdocs`` Sphinx extension installed. This can be done by adding a ``requirements.txt`` in the documentation folder; readthedocs.org can be configured to use the ``requirements.txt`` when building a project. Simply put ``wurfdocs`` into the ``requirements.txt``.
- Edit ``NEWS.rst``, ``wscript`` and ``src/wurfdocs/wurfdocs.py`` (set the correct ``VERSION``).
- Run ``./waf upload``.
The tests will run automatically by passing ``--run_tests`` to waf::

    ./waf --run_tests
This follows what seems to be "best practice" advice, namely to install the package in editable mode in a virtualenv.
A bunch of the tests use a class called ``Record``, defined in ``test/record.py``. The ``Record`` class is used to store output as files from different parsing and rendering operations.
E.g. say we want to make sure that a parser function returns a certain ``dict`` object. Then we can record that ``dict``::

    recorder = record.Record(
        filename='test.json',
        recording_path='/tmp/recording',
        mismatch_path='/tmp/mismatch')

    recorder.record(data={'foo': 2, 'bar': 3})
If ``data`` changes compared to a previous recording, a mismatch will be detected. To update a recording, simply delete the recording file.
You will also notice that a bunch of the tests take a parameter called ``testdirectory``. The ``testdirectory`` is a pytest fixture which represents a temporary directory on the filesystem. When running the tests you will notice these temporary test directories pop up under the ``pytest_temp`` directory in the project root.
You can read more about that here:

- The Sphinx documentation on creating extensions: http://www.sphinx-doc.org/en/stable/extdev/index.html#dev-extensions
- An extension is a Python module. When an extension loads, Sphinx will import it and execute its ``setup()`` function.
- Understanding how to put together docutils nodes seems pretty difficult. One suggestion from the mailing list was to look at the following document: https://github.com/docutils-mirror/docutils/blob/master/test/functional/expected/standalone_rst_pseudoxml.txt
- While researching how to do this, three potential approaches emerged:
- Use the standard Sphinx approach and operate with the doctree.
- Create RST based on jinja templates
- Create HTML based on jinja templates
- Inspiration: Sphinx extensions that served as inspiration while developing this extension:

  - Understanding how to write stuff with docutils: http://agateau.com/2015/docutils-snippets/
  - Creating a custom directive: http://www.xavierdupre.fr/blog/2015-06-07_nojs.html
  - Nice-looking Sphinx extensions: https://github.com/bokeh/bokeh/tree/master/bokeh/sphinxext

- This part of the documentation was useful in order to understand the need for ViewLists etc. in a directive's ``run(...)`` function: http://www.sphinx-doc.org/en/stable/extdev/markupapi.html
- This link provided inspiration for the text json format: https://github.com/micnews/html-to-article-json
- More xml->json for the text: https://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html
We want to support different "backends" like Doxygen to parse the source code. To make this possible we define an internal source code description format. We then translate e.g. Doxygen XML into this format and use it to render the API documentation. This way a different "backend", e.g. Doxygen2, could be used as the source code parser and the API documentation could still be generated.
In order to be able to reference the different entities in the API we need to assign them a name. We use a similar approach here as described in standardese. This means that the ``unique-name`` of an entity is the name with all scopes, e.g. ``foo::bar::baz``.
- For functions you need to specify the signature (parameter types and, for member functions, cv-qualifier and ref-qualifier), e.g. ``foo::bar::baz::func()`` or ``foo::bar::baz::func(int a, char*) const``. See cppreference for more information.
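As a sketch of the naming rules above, a hypothetical helper for composing a ``unique-name`` from scope, name, and signature could look like this (``unique_name`` is illustrative only, not part of wurfdocs, and it ignores ref-qualifiers and templates):

```python
def unique_name(scope, name, signature=None, is_const=False):
    """Compose a unique-name such as 'foo::bar::baz::func(int) const'.

    scope      -- list of enclosing scope names, outermost first
    name       -- the unqualified entity name
    signature  -- list of parameter types for functions, or None for
                  non-function entities (namespaces, classes, ...)
    is_const   -- whether a member function is const-qualified
    """
    qualified = '::'.join(scope + [name]) if scope else name
    if signature is not None:
        qualified += '({})'.format(', '.join(signature))
        if is_const:
            qualified += ' const'
    return qualified
```

For non-function entities the signature is simply omitted, so ``unique_name(['ns1'], 'shape')`` yields ``'ns1::shape'``.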
The internal structure is a dict with the different API entities. The ``unique-name`` of the entity is the key, and the entity itself (also a Python dictionary) is the value, e.g.::

    api = {
        'unique-name': { ... },
        'unique-name': { ... },
        ...
    }
To make this a bit more concrete, consider the following code::

    namespace ns1
    {
        class shape
        {
            void print(int a) const;
        };

        namespace ns2
        {
            struct box
            {
                void hello();
            };

            void print();
        }
    }
Parsing the above code would produce the following API dictionary::

    api = {
        'ns1': { 'type': 'namespace', ... },
        'ns1::shape': { 'type': 'class', ... },
        'ns1::shape::print(int) const': { 'type': 'function', ... },
        'ns1::ns2': { 'type': 'namespace', ... },
        'ns1::ns2::box': { 'type': 'struct', ... },
        'ns1::ns2::box::hello()': { 'type': 'function', ... },
        'ns1::ns2::print()': { 'type': 'function', ... }
    }
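As a sketch of how a template might consume such a flat dictionary, the following hypothetical helper walks an entity's members by type (the miniature ``api`` dict mirrors the structure above with most fields omitted; ``members_of_type`` is illustrative, not a wurfdocs API):

```python
# Hypothetical miniature of the API dictionary shown above
api = {
    'ns1': {'type': 'namespace', 'name': 'ns1',
            'members': ['ns1::shape', 'ns1::ns2']},
    'ns1::shape': {'type': 'class', 'name': 'shape',
                   'members': ['ns1::shape::print(int) const']},
    'ns1::shape::print(int) const': {'type': 'function', 'name': 'print'},
}


def members_of_type(api, unique_name, entity_type):
    """Return unique-names of an entity's members with the given type.

    Members are stored as unique-name keys, so looking up a member is
    just another dictionary access on the same flat api dict.
    """
    members = api.get(unique_name, {}).get('members', [])
    return [m for m in members if api.get(m, {}).get('type') == entity_type]
```

Because every entity is keyed by its ``unique-name``, templates never need to walk a tree; any entity is one lookup away.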
The different entity types expose different information about the API. We will document the different types in the following.
Python dictionary representing a C++ namespace::

    info = {
        'type': 'namespace',
        'name': 'unqualified-name',
        'parent': 'unique-name' | None,
        'members': ['unique-name', 'unique-name', ...]
    }
Python dictionary representing a C++ class or struct::

    info = {
        'type': 'class' | 'struct',
        'name': 'unqualified-name',
        'location': {
            'file': 'filename.h',
            'line-start': 10,
            'line-end': 23
        },
        'scope': 'unique-name' | None,
        'members': ['unique-name', 'unique-name', ...],
        'briefdescription': 'some text',
        'detaileddescription': 'some text'
    }
Python dictionary representing a C++ function::

    info = {
        'type': 'function',
        'name': 'unqualified-name',
        'location': {
            'file': 'filename.h',
            'line': 10
        },
        'scope': 'unique-name' | None,
        'return_type': 'sometype',
        'is_const': True | False,
        'is_static': True | False,
        'access': 'public' | 'protected' | 'private',
        'briefdescription': 'some text',
        'detaileddescription': 'some text',
        'parameters': [
            { 'type': 'sometype', 'name': 'somename' },
            { 'type': 'sometype', 'name': 'somename' }
        ]
    }
Text information is stored in a list of paragraphs::

    description = {
        'has_content': True | False,
        'paragraphs': [
            { 'type': 'text' | 'code', ... }
        ]
    }

    text = {
        'type': 'text',
        'content': 'hello',
        'italic': True | False,
        'bold': True | False,
        'link': 'unique-name'
    }

    code = {
        'type': 'code',
        'content': 'void print();'
    }
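As an illustration of how a renderer might consume this structure, here is a hypothetical helper that flattens a description into plain text (it assumes only the fields shown above and is not part of wurfdocs):

```python
def render_description(description):
    """Flatten the paragraph list of a description into a plain string.

    Text paragraphs are joined with spaces; code paragraphs are set off
    on their own lines. Formatting flags (italic, bold, link) are ignored
    in this plain-text sketch.
    """
    if not description.get('has_content'):
        return ''
    parts = []
    for paragraph in description['paragraphs']:
        if paragraph['type'] == 'text':
            parts.append(paragraph['content'])
        elif paragraph['type'] == 'code':
            parts.append('\n' + paragraph['content'] + '\n')
    return ' '.join(parts)
```

A real template would instead map ``italic``/``bold``/``link`` onto RST markup, but the traversal shape is the same.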
Equivalent C++ function signatures can be written in a number of different ways::

    void hello(const int *x); // x is a pointer to const int
    void hello(int const *x); // x is a pointer to const int
We can also move the asterisk (``*``) to the left::

    void hello(const int* x); // x is a pointer to const int
    void hello(int const* x); // x is a pointer to const int
So we need some way to normalize the function signature when transforming it to the ``unique-name``. We cannot simply rely on string comparisons.
According to numerous Google searches, it is hard to write a regex for this. Instead we will try to use a parser:
- Python parser: https://github.com/erezsh/lark
- C++ Grammar: http://www.externsoft.ch/media/swf/cpp11-iso.html#parameters_and_qualifiers
We only need to parse the function parameter list, denoted *parameters-and-qualifiers* in the grammar: http://www.externsoft.ch/media/swf/cpp11-iso.html#parameters_and_qualifiers
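To make the normalization problem concrete, here is a rough, hypothetical token-based normalizer for the simple cases shown above. It is only a sketch: it deliberately does not handle multi-word types (``unsigned int``), type-only parameters without pointers, templates, or ref-qualifiers; for those a grammar-driven parser such as lark is the right tool.

```python
import re


def normalize_parameter(param):
    """Very rough normalization of a single C++ parameter declaration.

    Handles only simple cases: east/west const placement on the base
    type, asterisk spacing, and dropping a trailing parameter name.
    Multi-word base types like 'unsigned int' would be mis-handled.
    """
    # Split into identifier tokens and pointer/reference symbols
    tokens = re.findall(r'[A-Za-z_][A-Za-z0-9_:]*|\*|&', param)
    # Drop a trailing identifier (assumed to be the parameter name)
    if len(tokens) > 1 and re.match(r'[A-Za-z_]', tokens[-1]) \
            and tokens[-1] not in ('const', 'volatile'):
        tokens = tokens[:-1]
    # Move a leading const after the base type ("west" -> "east" const)
    if len(tokens) > 1 and tokens[0] == 'const':
        tokens = [tokens[1], 'const'] + tokens[2:]
    return ' '.join(tokens)


def normalize_signature(params):
    """Build a normalized '(...)' parameter list from raw parameters."""
    return '(' + ', '.join(normalize_parameter(p) for p in params) + ')'
```

With this, the equivalent spellings from the examples above collapse to the same string, so they map to the same ``unique-name``.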
Since we are going to be using Doxygen's XML output as input to the extension, we need a place to store it. We will use the same approach as Breathe and store it in ``_build/.doctree/wurfdocs``. Note that this location is available in the Sphinx application object as ``sphinx.application.Sphinx.doctreedir``.
- Source directory: In Sphinx the source directory is where our .rst files are located. This is what you pass to ``sphinx-build`` when building your documentation. We will use this in our extension to find the C++ source code and output customization templates.
- Why use an ``src`` folder (https://hynek.me/articles/testing-packaging/): tl;dr, you should run your tests in the same environment as your users would run your code. So by placing the source files in a non-importable folder you avoid accidentally having access to resources not added to the Python package your users will install.
- Python packaging guide: https://packaging.python.org/distributing/