Documentation build hangs indefinitely for pdfhtml on master #1308

choldgraf · 2021-04-19T02:09:11Z

Describe the bug

In #1283 we re-worked the documentation structure a bit. This seems to have introduced a bug where the singlepage PDF build via HTML now hangs indefinitely.

To Reproduce

Build the book PDF via HTML with

tox -e docs-pdfhtml-update

this should hang indefinitely

The text was updated successfully, but these errors were encountered:

mmcky · 2021-04-27T01:07:19Z

This seems to be an issue during the html -> pdf step using pyppeteer as

jb build docs --builder=custom --custom-builder=singlehtml

the sphinx build for singlehtml seems to be working as well as the html builder.

However when I run the builder=pdfhtml the html that is used in the _build/html folder doesn't load due to some error so must be an issue in the javascript loading. The left hand index is missing when I open _build/html/intro.html when building with pdfhtml

~~@choldgraf does the pdfhtml use the singlehtml builder but at the page level?~~ Scrap that it builds into a single file

choldgraf · 2021-04-27T01:20:20Z

the plot thickens :-) I think you're right that this is probably some kinda infinite recursion kinda thing, or javascript library problem or something

mmcky · 2021-04-27T05:21:08Z

So I commented out the tableofcontents directives 42d8934 and the pdfhtml builder executes and produces the pdf

I then tried adding singlehtml=(skip,None) to the tableofcontents node in https://github.com/executablebooks/jupyter-book/tree/test-tests to add it to the singlehtml context and the next issue seems that a node called fontawesome isn't registered with singlehtml builder and writer.

I have tried a few things this afternoon in #1307. I think the main issue is tableofcontents node isn't registered with singlehtml builder causing an unstable build process (which sometimes appears to work just fine). I think we may need to review all the extensions used by jupyter-book and match them up to the builders we use.

mmcky · 2021-04-27T05:54:38Z

I also checked out ca79899 (just before the docs rework) which is building fine according to the cli output but taking a closer look at _build/pdf/book.pdf it also appears to be missing content.

book.pdf

choldgraf · 2021-04-27T16:44:58Z

Hmmm - I didn't realize that singlehtml and html builders needed to be manually added to a node, that's super annoying :-/

chrisjsewell · 2021-04-27T16:51:32Z

Note tableofcontents doesn't now now need to be "registered" in sphinx-external-toc.
fontawesome is probably from sphinx-panels

mmcky · 2021-04-27T23:17:27Z

thanks @chrisjsewell yeah in #1307 I found the fontawesome nodes but then I was left with one I can't find _TabInput.

Note tableofcontents doesn't now now need to be "registered" in sphinx-external-toc

~~Roger that -- where does that node get registered then?~~ I see they will be registered here in jupyter-book 👍 like in #1307 (?)

chrisjsewell · 2021-04-27T23:26:03Z

No the key thing here is that you only need to register a node if they will actually reach the builder.
Since tableofcontents nodes are all now replaced/removed in a transform (that is always applied, as opposed to the current post-transform that is not), they can never reach the builder.
Hence they are not registered anywhere

mmcky · 2021-04-28T03:22:59Z

Documenting a list of packages that will need to be updated for singlehtml:

sphinx-inline-tabs for _TabInput, _TabLabel nodes
jupyter-sphinx for all the nodes associated with jupyter output such as JupyterWidgetViewNode
sphinx-panels for fa node to support fontawesome
~~sphinx-external-toc for tableofcontents~~ no longer required

mmcky · 2021-04-28T03:27:35Z

@chrisjsewell @choldgraf happy to start coordinating updates for these packages via PR's but this might take some time to resolve for #1293 as these are split across a diverse set of repos that aren't within the executablebooks organisation.

I think this would be the best long term strategy (i.e. adding support for singlehtml in the packages themselves) but wondering if you guys thought a high priority transform might be able to attach html settings to singlehtml if they aren't found from the extension.

Update: Talked this idea through with @AakashGfude. It might not be feasible as the nodes are loaded during the app setup phase so once the singlehtml builder is instantiated and transforms run it will likely have no knowledge of the app.add_node() phase

mmcky · 2021-04-28T10:23:42Z

I did a bit of digging and it looks like this could actually be done. I am just not sure if assigning singlehtml with nodes from html is a robust thing to do.

After the sphinx.app is instantiated we do have access to:

app.registry.translation_handlers

which records the handlers for html, and singlehtml (amongst others).

So we could loop through any nodes missing from singlehtml and add the nodes from the html registry.

Then we can go trough all the packages and update them with singlehtml specific methods (if needed).

Update: Bummer -- it looks like sphinx doesn't save the actual node just the node name attribute and leaves the node to be added to docutils

mmcky · 2021-04-28T11:31:43Z

The nodes could be recovered from docutils:

            if builder == "singlehtml":
                singlehtml_node_names = app.registry.translation_handlers['singlehtml'].keys()
                html_nodes = app.registry.translation_handlers['html']
                #Recover registered nodes from docutils
                registered_nodes = {}
                from sphinx.util import docutils
                for node in docutils.additional_nodes:
                    registered_nodes[node.__name__] = node
                # Set missing nodes with visit,depart methods from html
                for node_name, methods in html_nodes.items():
                    if node_name not in singlehtml_node_names:
                        try:
                            app.registry.add_translation_handlers(registered_nodes[node_name], singlehtml=methods)
                        except:
                            raise ValueError(f"{node_name} is missing from the docutils registry")

and could be added here but still not sure this is a good idea or not.

This does seem to fix a lot of missing content from the html generation step using singlehtml

The resulting intro.html file is 6.4Mb

mmcky · 2021-04-29T10:19:34Z

@chrisjsewell any thoughts on the above proposal to add any missing singlehtml nodes from html registry? I think most of the time they will share visit/depart methods for building the html so it seems like a pretty safe option to me. As packages are updated one-by-one with singlehtml support these nodes wouldn't be replaced so if there are differences it will take precedence.

mmcky · 2021-04-29T23:38:03Z

I have checked in with the sphinx-dev community about this idea:

https://groups.google.com/g/sphinx-dev/c/J4if9AvxNaA/m/76yAj4vCDwAJ

mmcky · 2021-05-02T09:17:25Z

@chrisjsewell @choldgraf I heard from Takeshi (on the dev community channel) and we may be able to have a look at adjusting this upstream in sphinx.

choldgraf · 2021-05-02T23:59:28Z

Great news! Hopefully he is cool putting a fix into 3.x

mmcky · 2021-05-03T05:15:18Z

@choldgraf I am trying to dig in to the original issue here with pdfhtml when you merged #1283 but when using

tox -e docs-pdfhtml-update

I find I get the NotImplementedError: Unknown node: TableOfContentsNode error rather than the hanging behaviour. Any ideas?

mmcky · 2021-05-03T05:59:25Z

I have changed the pyppeteer wait settings for PR #1293 which seems pretty reasonable which resolves part of this issue to get pdfhtml to pass ci. However the missing nodes are still required to get all content included in the pdf file.

There is discussion upstream to add html nodes based on format rather than builder.

sphinx-doc/sphinx#9160

This would however target sphinx>=4.1

mmcky · 2021-05-04T02:15:58Z

With the latest pyppeteer settings the test for pdfhtml is now passing.
I have opened a separate issue regarding the missing content as linked above.

choldgraf · 2021-05-04T02:40:38Z

Think we could tie this issue to that PR so that it's closed when the PR is merged, rather than closing this issue straight away?

choldgraf added bug Something isn't working 🔥 priority labels Apr 19, 2021

choldgraf self-assigned this Apr 19, 2021

choldgraf mentioned this issue Apr 19, 2021

🐛FIX: pdfhtml build error #1307

Closed

chrisjsewell mentioned this issue Apr 24, 2021

♻️ REFACTOR: Delegate ToC logic to sphinx-external-toc #1293

Merged

14 tasks

mmcky mentioned this issue Apr 27, 2021

FIX: add nodes for singlehtml builder (TEST) executablebooks/sphinx-panels#61

Open

choldgraf mentioned this issue Apr 30, 2021

🔧 MAIN: Pin sphinx-thebe and sphinx book theme #1312

Merged

mmcky mentioned this issue May 3, 2021

Using html visit/depart methods for html family of builders (i.e. singlehtml) sphinx-doc/sphinx#9160

Open

mmcky mentioned this issue May 4, 2021

Missing content when using pdfhtml #1315

Open

mmcky closed this as completed May 4, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Documentation build hangs indefinitely for pdfhtml on master #1308

Documentation build hangs indefinitely for pdfhtml on master #1308

choldgraf commented Apr 19, 2021

mmcky commented Apr 27, 2021 •

edited

Loading

choldgraf commented Apr 27, 2021

mmcky commented Apr 27, 2021 •

edited

Loading

mmcky commented Apr 27, 2021 •

edited

Loading

choldgraf commented Apr 27, 2021

chrisjsewell commented Apr 27, 2021

mmcky commented Apr 27, 2021 •

edited

Loading

chrisjsewell commented Apr 27, 2021 •

edited

Loading

mmcky commented Apr 28, 2021

mmcky commented Apr 28, 2021 •

edited

Loading

mmcky commented Apr 28, 2021 •

edited

Loading

mmcky commented Apr 28, 2021 •

edited

Loading

mmcky commented Apr 29, 2021

mmcky commented Apr 29, 2021

mmcky commented May 2, 2021

choldgraf commented May 2, 2021

mmcky commented May 3, 2021

mmcky commented May 3, 2021 •

edited

Loading

mmcky commented May 4, 2021

choldgraf commented May 4, 2021

Documentation build hangs indefinitely for pdfhtml on master #1308

Documentation build hangs indefinitely for pdfhtml on master #1308

Comments

choldgraf commented Apr 19, 2021

mmcky commented Apr 27, 2021 • edited Loading

choldgraf commented Apr 27, 2021

mmcky commented Apr 27, 2021 • edited Loading

mmcky commented Apr 27, 2021 • edited Loading

choldgraf commented Apr 27, 2021

chrisjsewell commented Apr 27, 2021

mmcky commented Apr 27, 2021 • edited Loading

chrisjsewell commented Apr 27, 2021 • edited Loading

mmcky commented Apr 28, 2021

mmcky commented Apr 28, 2021 • edited Loading

mmcky commented Apr 28, 2021 • edited Loading

mmcky commented Apr 28, 2021 • edited Loading

mmcky commented Apr 29, 2021

mmcky commented Apr 29, 2021

mmcky commented May 2, 2021

choldgraf commented May 2, 2021

mmcky commented May 3, 2021

mmcky commented May 3, 2021 • edited Loading

mmcky commented May 4, 2021

choldgraf commented May 4, 2021

mmcky commented Apr 27, 2021 •

edited

Loading

mmcky commented Apr 27, 2021 •

edited

Loading

mmcky commented Apr 27, 2021 •

edited

Loading

mmcky commented Apr 27, 2021 •

edited

Loading

chrisjsewell commented Apr 27, 2021 •

edited

Loading

mmcky commented Apr 28, 2021 •

edited

Loading

mmcky commented Apr 28, 2021 •

edited

Loading

mmcky commented Apr 28, 2021 •

edited

Loading

mmcky commented May 3, 2021 •

edited

Loading