
Add support for include directives #80

Merged · 8 commits · Feb 26, 2020
Conversation

@chrisjsewell (Member) commented Feb 26, 2020

MyST now supports the `include` directive! See tests/test_sphinx/sourcedirs/includes and tests/test_sphinx/test_sphinx_builds for an example sphinx project with includes, and its build outputs.

@choldgraf and @akhmerov, regarding the discussion in executablebooks/MyST-NB#32 about having multiple kernels on a single HTML page: you should be able to use this out of the box with the current jupyter-sphinx and multiple MyST files:

```{include} nb1.ipynb.md
```

```{include} nb2.ipynb.md
```

Note: `**/*.ipynb.md` should be in the `exclude_patterns`, so that the files are not executed twice.
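For example, in `conf.py` (`exclude_patterns` is standard Sphinx configuration; the glob is the one from the note above):

```python
# conf.py: stop the included files from also being built (and hence executed)
# as standalone pages
exclude_patterns = ["**/*.ipynb.md"]
```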

Note, though, that the `include` directive only works for documents of the same type (e.g. md -> md or rst -> rst) if you want to actually parse the file content. So it would be interesting to consider if/how something like an `include-nb` directive could work, for stitching multiple notebooks into one MyST file:

```{include-nb} nb1.ipynb
```

```{include-nb} nb2.ipynb
```

@chrisjsewell (Member Author)

@mmcky this was triggered by your question, even though it wasn't actually for includes 😆

@akhmerov (Contributor)

Where does the restriction to the same filetype come from?

@chrisjsewell (Member Author)

> Where does the restriction to the same filetype come from?

Because the include is done at the parser (docutils) level, and is essentially just injecting more text into the block of text that the parser must tokenize. It has no 'knowledge' of the filetype; it just reads the text from the file you specify and parses it. For example, this would still be read as rST, even though it has a different extension:

.. include:: include.html

@chrisjsewell (Member Author) commented Feb 26, 2020

Following on from the above comment, it's also important to note that any relative file references in the included document will be read as relative to the importing/parent file. See this Stack Overflow answer: https://stackoverflow.com/a/50261574/5033292.
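As a hypothetical illustration (the file layout here is invented):

```
docs/
├── index.md           # contains: {include} sections/intro.md
└── sections/
    ├── intro.md       # contains: ![](figure.png)
    └── figure.png     # not found: the reference resolves to docs/figure.png
```

To work, `intro.md` would have to reference `sections/figure.png`, i.e. write its paths as if from the including file.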

(FYI, in LaTeX you also have the import package, which 'fixes' this issue with relative paths.)

Although this is a docutils/sphinx-related matter, maybe it would be helpful to cover this behavior in the MyST documentation?

@choldgraf (Member)

I'm +1 on documenting this feature in the docs!

Am I correct in thinking that we needed this PR because the `include` directive is a special case of other directives?

@chrisjsewell (Member Author) commented Feb 26, 2020

> Am I correct in thinking that we needed this PR because the `include` directive is a special case of other directives?

Well, it was one of the 'edge-case' directives that didn't work with the minimal docutils 'Mock' classes that cover most of the core docutils/sphinx directives, because they call extra attributes/methods that hadn't yet been implemented. With enough mocked methods it probably wouldn't be a special case, but I actually think I've done a better job than the docutils code of ensuring that errors are reported correctly, pointing to the correct included document and the correct source line number.

In this PR I've also added some more methods to cover the glossary and csv-table directives, which just about covers them all now (see tests/test_renderers/test_roles_directives.py for the final outliers).


@chrisjsewell (Member Author)

> I'm +1 on documenting this feature in the docs!

Added #81 for that, so I'll merge this, unless you want to check over any of the code/CircleCI artefacts, @choldgraf?

@akhmerov (Contributor)

Does it actually cover the use case of executablebooks/MyST-NB#32 (combining multiple kernels in one file)?

@chrisjsewell (Member Author) commented Feb 26, 2020

> Does it actually cover the use case of executablebooks/MyST-NB#32 (combining multiple kernels in one file)?

For text-based documents, i.e. {kernel1.md, kernel2.md, ...} -> merge.md, it should, yes. This should be easy for you to check with rST or MyST and jupyter-sphinx at your earliest convenience 😄. For notebooks, {nb1.ipynb, nb2.ipynb, ...} -> merge.md, it would require an additional directive on the MyST-NB side, which could be written along the lines of the Include class, but specifically calling the notebook parser, to inject the resulting AST into the merge.md document node.
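To make that concrete, here is a very rough sketch of how such an `include-nb` directive might look. This is not part of this PR: it assumes jupytext can convert the notebook to MyST text (so that the nested parse goes through the active MyST parser), and all names here are illustrative.

```python
from pathlib import Path

import jupytext
from docutils import nodes
from docutils.parsers.rst import Directive
from docutils.statemachine import StringList


class IncludeNotebook(Directive):
    """Hypothetical ``include-nb``: include a notebook via text conversion."""

    required_arguments = 1  # path to the .ipynb file

    def run(self):
        # resolve the path relative to the including document
        parent = Path(self.state.document["source"]).parent
        nb_path = parent / self.arguments[0]

        # convert the notebook to MyST-flavoured text (assumes jupytext
        # supports this target format)
        notebook = jupytext.read(str(nb_path))
        text = jupytext.writes(notebook, fmt="md:myst")

        # hand the text back to the active parser, attributing the source
        # lines to the notebook so errors point at the right file
        container = nodes.container()
        lines = StringList(text.splitlines(), source=str(nb_path))
        self.state.nested_parse(lines, self.content_offset, container)
        return container.children
```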

@akhmerov (Contributor)

Yeah, I was wondering about notebooks, or otherwise files that don't have a directive for specifying the kernel, but provide the kernel either in the front matter or in metadata.

It seems that the AST resulting from the inclusion of several such files should contain several instances of the jupyter-sphinx kernel nodes or equivalent. That, in turn, would seem to require postprocessing the output of the parsing.

@chrisjsewell (Member Author) commented Feb 26, 2020

> It seems that the AST resulting from the inclusion of several such files should contain several instances of the jupyter-sphinx kernel nodes or equivalent.

Well, that's only if you actually require a 'kernel node'. Surely you only need this if you intend to execute the notebooks after the parsing phase?

MyST-NB doesn't currently use any such node, because it assumes the notebooks are pre-executed. Then, if you're (hopefully at some point) using the jupyter-cache, you retrieve outputs from it at the parsing phase, with the cache either having been populated before sphinx-build, or, as I could envisage, populated in one of sphinx's early (pre-parsing) phases, whereby:

  1. You get the list of updated notebooks from Sphinx (or text-based documents that can be converted to notebooks via jupytext). This is all of them, not just the ones that Sphinx deems outdated.
  2. You 'stage' them in the cache, so it can work out which ones require re-execution.
  3. The executor is called and the cache updated.
  4. Continue the sphinx build as normal.

Each 'included' notebook, and its outputs, is then dealt with at the time it is parsed/included. The only post-processing is what MyST-NB already does with mime bundles.
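As a sketch of how steps 1-3 could look in code (the jupyter-cache calls here are my assumption of its interface, so treat this as illustrative rather than a settled design):

```python
# hypothetical pre-build execution phase (steps 1-3 above)
from jupyter_cache import get_cache
from jupyter_cache.executors import load_executor


def execute_before_build(notebook_paths, cache_path=".jupyter_cache"):
    cache = get_cache(cache_path)
    # 1. stage every document that is, or can be converted to, a notebook
    #    (text-based files would first go through jupytext)
    for path in notebook_paths:
        cache.stage_notebook_file(str(path))
    # 2./3. execute whatever the cache deems outdated and store the outputs
    executor = load_executor("basic", cache)
    executor.run_and_cache()
    # 4. the normal sphinx build then pulls outputs from the cache at parse time
```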

@akhmerov (Contributor)

I thought that querying the cache would happen after the parsing—at least that was in your previous proposal. This is also why I thought that keeping the kernel node is important, since otherwise one cannot obtain the cache key.

You're saying it'd be done differently now?

@chrisjsewell (Member Author)

> You're saying it'd be done differently now?

Yep, I reserve the right to flip-flop in my proposals 😆 I (now) think it works better with 'getting the execution done early', because (a) it makes it easier to separate the execution from the documentation generation, and (b) there is less post-processing to deal with.

@akhmerov (Contributor)

On the other hand, if the execution and treatment of outputs is done before parsing, then it needs to be implemented once per file format. Say one wants to keep some rST files?

@chrisjsewell (Member Author)

> On the other hand, if the execution and treatment of outputs is done before parsing, then it needs to be implemented once per file format.

Why once per file format? You just gather all the files that can be converted to notebooks (e.g. that have a jupytext converter) and stage them in the cache. Note that we are doing an initial 'fast' parse here, whereby we just want to identify the kernel and the code cells, and don't need to tokenize the whole document. This may make it a bit slower than doing a single parse plus post-processing, but it makes the process a lot more robust and easy(ish) to implement.

Thinking about it now, during this 'fast' parse phase we would store a mapping of each document's docname -> hash (the one for code+kernel) in the sphinx env. Then at the parsing stage we could quickly retrieve it, without the need for a second parse.
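A minimal sketch of what that 'fast' parse could look like, assuming MyST files with a kernelspec in the front matter and code-cell fences (illustrative only, not MyST-NB code):

```python
import hashlib
import re
from pathlib import Path

# match fenced code cells and the kernel name, without a full parse
CODE_CELL = re.compile(r"^```\{code-cell\}.*?^```", re.MULTILINE | re.DOTALL)
KERNEL = re.compile(r"^kernelspec:.*?\bname:\s*(\S+)", re.MULTILINE | re.DOTALL)


def code_kernel_hash(path):
    """Hash over the kernel + code cells only: prose edits don't change it."""
    text = Path(path).read_text()
    kernel = KERNEL.search(text)
    cells = CODE_CELL.findall(text)
    payload = (kernel.group(1) if kernel else "") + "\n".join(cells)
    return hashlib.sha256(payload.encode()).hexdigest()

# stored per document in the sphinx env during the pre-parse phase, e.g.
#   env.nb_hashes[docname] = code_kernel_hash(doc_path)
```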

> Say one wants to keep some rST files?

Not sure what you mean by 'keep' here? Nothing is actually changed in the sphinx source folder; just the cache is updated.

@akhmerov (Contributor)

> Not sure what you mean by 'keep' here?

As in "keep using". Jupytext doesn't have an rst reader, AFAIK, and therefore the workflow you propose would mean implementing this. At the same time generating a sphinx AST is the one thing all formats for the EBP have in common.

@chrisjsewell (Member Author) commented Feb 26, 2020

> At the same time, generating a sphinx AST is the one thing all the formats for the EBP have in common.

Well we can discuss this more on MyST-NB, but for now, I'm going to merge this 😄

chrisjsewell merged commit 46bf7e1 into develop on Feb 26, 2020