Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Latex file is not updated when using the option numfig = True #11474

Closed
tombolano opened this issue Jun 28, 2023 · 5 comments · Fixed by #11939
Closed

Latex file is not updated when using the option numfig = True #11474

tombolano opened this issue Jun 28, 2023 · 5 comments · Fixed by #11939

Comments

@tombolano
Copy link

tombolano commented Jun 28, 2023

Describe the bug

Latex file is not updated when using the option numfig = True.

How to Reproduce

The bug can be reproduced with the following bash code:

# generate a default project
sphinx-quickstart --no-sep -p "myproject" -a test -r "" -l en
# add numfig = True option to conf.py
echo -e "\nnumfig=True" >> conf.py
# build latex & print main latex file modification time
make latex && ls _build/latex/myproject.tex --full-time
# now add a change to index.rst
echo -e "\n*new change*" >> index.rst
# build latex & print main latex file modification time again
make latex && ls _build/latex/myproject.tex --full-time

In the code above, after updating index.rst, when running make latex && ls _build/latex/myproject.tex --full-time the file myproject.tex is not updated.

The output of make latex in the terminal seems normal (I see no difference with respect to running without setting numfig = True, just that the tex file is not updated),

The workaround for this is running sphinx-build the -E option (don't use a saved environment): sphinx-build -E -b latex . _build/latex/

Environment Information

Platform:              linux; (Linux-6.2.14-300.fc38.x86_64-x86_64-with-glibc2.37)
Python version:        3.11.3 (main, Apr  5 2023, 00:00:00) [GCC 13.0.1 20230401 (Red Hat 13.0.1-0)])
Python implementation: CPython
Sphinx version:        7.0.1
Docutils version:      0.19
Jinja2 version:        3.1.2
Pygments version:      2.14.0

Sphinx extensions

No response

Additional context

No response

@zoranbosnjak
Copy link

We have noticed the same problem when upgrading from sphinx 3.2.1 (the problem was not present) to 7.0.0.

Since the steps to reproduce are exact and there exists working/non-working pair of releases, it should be possible to use git bisect or something alike to narrow down to the change, where the problem was introduced. From there, it is usually easy to resolve it.

@tombolano
Copy link
Author

I ran git bisect as suggested by @zoranbosnjak, I used master as the good commit and v5.0.0 as the bad. I used the following code to run it automatically:

#!/bin/bash

# setup virtual environment and install sphinx
python -m venv .venv
source .venv/bin/activate
pip install -e .

mkdir .venv/test
cd .venv/test

# generate a default project
sphinx-quickstart --no-sep -p "myproject" -a test -r "" -l en

# add numfig = True option to conf.py
echo -e "\nnumfig=True" >> conf.py

# build latex & get latex file modification time
make latex
file_modification_time_1="$(stat _build/latex/myproject.tex --printf=%y)"

# add a change to index.rst
echo -e "\n*new change*" >> index.rst

# build latex & get main latex file modification time again
make latex
file_modification_time_2="$(stat _build/latex/myproject.tex --printf=%y)"

# clean virtual environment
deactivate
cd ../..
rm -rf .venv

# print result
if [ "$file_modification_time_1" == "$file_modification_time_2" ]; then
	exit 1 # bad
else
	exit 0 # good
fi

The result was:

a9b0f2708b67a355c5f4969ccbbe9bd695518805 is the first bad commit
commit a9b0f2708b67a355c5f4969ccbbe9bd695518805
Author: Adam Turner <9087854+aa-turner@users.noreply.github.com>
Date:   Wed Jan 4 04:22:20 2023 +0000

    Cache doctrees more efficiently

 sphinx/environment/__init__.py         | 23 ++++++++++++++---------
 sphinx/environment/adapters/toctree.py |  2 +-
 sphinx/testing/util.py                 |  4 ++++
 tests/test_build_gettext.py            |  3 ++-
 tests/test_build_html.py               |  2 +-
 tests/test_build_latex.py              |  2 +-
 tests/test_ext_math.py                 |  4 ++--
 7 files changed, 25 insertions(+), 15 deletions(-)
bisect found first bad commit

That commit introduced some changes related to the reading of the toctree from cache, but it is not clear to me where exactly is the bug.

@zoranbosnjak
Copy link

@AA-Turner, since commit a9b0f27, the make latex does not detect change in .rst input files (when the numfig = True in the configuration) and so the documentation project is not recompiled correctly. Could you please check?

Also suggesting to add/extend a testcase, to cover this scenario (make latex, change, make latex again... check by the timestamp if the output is updated).

@ban2st
Copy link

ban2st commented Jan 29, 2024

Hi, I dug into this and actually uncovered what's causing the issue (at least I think so):

After all documents have been read, the Builder.build method is invoking the BuildEnvironment.check_dependents methods.
image

This in turn emits the env-get-updated event that causes the EnvironmentCollector.get_updated_docs to be called. The implementation of this method in TocTreeCollector is then calling a assign_figure_numbers method which does nothing except when setting numfig = True:
image

Then _walk_doc is calling BuildEnvironment.get_doctree on EVERY file in the worktree.
So every single doctree is read into a bytes array and stored in the _pickled_doctree_cache.
image

The real issue now is that all of this happens before the environment is pickled and written to disk for further incremental builds:
image

So the environment gets written to a file while it contains the data of all doctrees in the _pickled_doctree_cache map.
(this was actually what brought me onto track here, as my environment.pickle file was far too huge).
Now during the next incremental build, the environment gets unpickled while still having cached all the existing doctree files in the worktree!
As _pickled_doctree_cache is not meant to be pickled and reloaded, this essentially breaks the update build.

An easy fix is to prevent it from being pickled in the BuildEnvironment.__getstate__ method:
image

On top of that, the issue is even harder to spot as it can be obscured by the _write_doc_doctree_cache which is used later in the write stage of the build before falling back to the _pickled_doctree_cache.
The _write_doc_doctree_cache is only filled if the reading of the source files is done in serial.
This is always the case if the number of changed source files <= 5.
So the bug can only be really spotted if you've enabled parallel build AND more then 5 of your source files have changes.

@AA-Turner Can you verify this analysis? :)

@picnixz
Copy link
Member

picnixz commented Feb 4, 2024

Oh, I indeed think that the _pickled_doctree_cache attribute should not be present in the serialized environment since whenever we are querying get_doctree we are getting a fresh doctree. Actually, what happens is that we are never loading the 'latest' pickled file (if you print what is being deserialized you'll just after check_dependents, you'll see that the documents are not those you'd expect).

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 14, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants