LANGUAGE environment variable inconsistently affects output of objects.inv #9778

Open
lamby opened this issue Oct 26, 2021 · 1 comment

lamby commented Oct 26, 2021

Describe the bug

Hi,

Not entirely sure where the bug is here, but it seems like there is something up with language handling when generating the objects.inv file. The context to all this is that I'm working on Reproducible Builds, and some update has suddenly rendered a lot of packages that use Sphinx unreproducible: that is, they generate different output depending on the surrounding environment.

In particular, I discovered this by comparing two builds: the first with the LANGUAGE="en_GB:en" environment variable and the second with LANGUAGE="et_EE:et". All of the documentation is identical except that a single entry in the objects.inv file appears to be translated. This is despite the output including the following logging message in both builds:

dumping search index in English (code: en)... done
dumping object inventory... done

(NB. code: en here in both builds)

Decoding this zlib-encoded file, I can see that the difference is a translation one:

# Sphinx inventory version 2
# Project: OpenDrop
# Version: 
# The remainder of this file is compressed using zlib.

developers/index std:doc -1 developers/index.html Developer notes
-genindex std:label -1 genindex.html Index
+genindex std:label -1 genindex.html Indeks
getting_started/index std:doc -1 getting_started/index.html Getting Started
index std:doc -1 index.html Overview
modindex std:label -1 py-modindex.html Module Index
py-modindex std:label -1 py-modindex.html Python Module Index
search std:label -1 search.html Search Page
usage/conan std:doc -1 usage/conan.html Contact Angle
usage/ift std:doc -1 usage/ift.html Interfacial Tension

... and, indeed, "Indeks" is in the Estonian .po file:

#: sphinx/builders/latex/__init__.py:194 sphinx/domains/std.py:604
#: sphinx/templates/latex/latex.tex_t:97
#: sphinx/themes/basic/genindex-single.html:30
#: sphinx/themes/basic/genindex-single.html:55
#: sphinx/themes/basic/genindex-split.html:11
#: sphinx/themes/basic/genindex-split.html:14
#: sphinx/themes/basic/genindex.html:11 sphinx/themes/basic/genindex.html:34
#: sphinx/themes/basic/genindex.html:67 sphinx/themes/basic/layout.html:147
#: sphinx/writers/texinfo.py:498
msgid "Index"
msgstr "Indeks"

This is just confusing though because why isn't "Module Index" translated as well? "Mooduli indeks" is also there in the Estonian .po:

#: sphinx/domains/std.py:605
msgid "Module Index"
msgstr "Mooduli indeks"

... so I suppose the bug here is either that "Index" gets translated whilst "Module Index" does not... or the other way around. This is why I use "inconsistently" in the title of this issue.

Playing around with the code, I am pretty certain that the translated entry comes from sphinx/domains/std.py. Could it be that the data in initial_data is being prematurely translated? Either way, I was expecting the documentation and the inventory entries to be identical, regardless of the LANGUAGE environment variable.
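To make "prematurely translated" concrete, here is a minimal sketch of how resolving one string early and another late could produce exactly this kind of mixed output. It is not Sphinx's code: `Translator`, `LazyLabel`, `build_labels` and the tiny catalog are hypothetical names made up for illustration, and the sketch makes no claim about which strings Sphinx actually resolves at which point.

```python
from typing import Dict


class Translator:
    """Hypothetical stand-in for a gettext catalog lookup."""

    CATALOGS: Dict[str, Dict[str, str]] = {
        "et": {"Index": "Indeks", "Module Index": "Mooduli indeks"},
    }

    def __init__(self, language: str) -> None:
        self.language = language

    def gettext(self, message: str) -> str:
        # Unknown languages or messages fall back to the original string.
        return self.CATALOGS.get(self.language, {}).get(message, message)


class LazyLabel:
    """Defers the lookup until the label is rendered as text."""

    def __init__(self, translator: Translator, message: str) -> None:
        self.translator = translator
        self.message = message

    def __str__(self) -> str:
        return self.translator.gettext(self.message)


def build_labels() -> Dict[str, str]:
    # Start with whatever the environment says (here: Estonian).
    translator = Translator("et")
    eager = translator.gettext("Index")           # resolved immediately
    lazy = LazyLabel(translator, "Module Index")  # resolved on str()

    # The project's configured language is applied only afterwards.
    translator.language = "en"
    return {"genindex": eager, "modindex": str(lazy)}


print(build_labels())
# → {'genindex': 'Indeks', 'modindex': 'Module Index'}
```

The eagerly resolved label keeps the language that was active when it was created, while the deferred one follows the later setting, which matches the shape of the difference seen in objects.inv.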

How to Reproduce

Build the same documentation twice, once with LANGUAGE="en_GB:en" exported and once with LANGUAGE="et_EE:et", then compare the results, specifically the objects.inv file.
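For the comparison itself, a small sketch along these lines can decode two inventories and list the entries that differ. It assumes the version 2 layout shown above (four plain-text `#` header lines followed by a zlib-compressed body); the function names `decode_inventory` and `changed_entries` are my own, not part of Sphinx.

```python
import sys
import zlib
from pathlib import Path
from typing import List

HEADER_LINES = 4  # assumption: "version 2" inventories have four '#' lines


def decode_inventory(raw: bytes) -> List[str]:
    """Return the header lines plus the decompressed entries as text."""
    pos = 0
    for _ in range(HEADER_LINES):
        pos = raw.index(b"\n", pos) + 1
    header = raw[:pos].decode("utf-8").splitlines()
    body = zlib.decompress(raw[pos:]).decode("utf-8").splitlines()
    return header + body


def changed_entries(first: bytes, second: bytes) -> List[str]:
    """List lines present in only one of the two inventories."""
    a, b = decode_inventory(first), decode_inventory(second)
    removed = [f"-{line}" for line in a if line not in b]
    added = [f"+{line}" for line in b if line not in a]
    return removed + added


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_inv.py build-en/objects.inv build-et/objects.inv
    first, second = (Path(p).read_bytes() for p in sys.argv[1:])
    print("\n".join(changed_entries(first, second)))
```

Run against the two builds described here, this should print only the `genindex` line pair if the rest of the inventory is identical.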

Expected behavior

No response

Your project

I'm using opendrop, but this will occur with any package

Screenshots

No response

OS

Linux

Python version

3.9

Sphinx version

4.2.0

Sphinx extensions

No response

Extra tools

No response

Additional context

No response

@lamby lamby added the bug label Oct 26, 2021

lamby commented Oct 27, 2021

  • This is likely about more than just the LANGUAGE environment variable. It should apply to the same set of variables that gettext consults (i.e. LANGUAGE, LC_ALL, LC_MESSAGES and LANG).

  • Oh, and just to braindump a nasty local hack:

diff --git a/sphinx/locale/__init__.py b/sphinx/locale/__init__.py
index 8fc6c1519..8bd7d0314 100644
--- a/sphinx/locale/__init__.py
+++ b/sphinx/locale/__init__.py
@@ -10,6 +10,7 @@
 
 import gettext
 import locale
+import os
 from collections import UserString, defaultdict
 from gettext import NullTranslations
 from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple, Union
@@ -129,6 +130,10 @@ def init(locale_dirs: List[Optional[str]], language: Optional[str],
     else:
         languages = None
 
+        # Don't fall back to consulting LANG, etc. if we want a reproducible build.
+        if 'SOURCE_DATE_EPOCH' in os.environ:
+            languages = []
+
     # loading
     for dir_ in locale_dirs:
         try:
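Why an empty list helps can be seen with the standard library alone. In the sketch below, `gettext.find()` consults LANGUAGE, LC_ALL, LC_MESSAGES and LANG only when `languages` is None; an explicit empty list skips that lookup. The helper `find_catalog`, the `demo` domain and the placeholder catalog file are hypothetical, and I am assuming (as the hack above implies) that Sphinx hands its `languages` value on to gettext.

```python
import gettext
import os
import tempfile
from pathlib import Path
from typing import Dict, List, Optional

LOCALE_VARS = ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG")


def find_catalog(env: Dict[str, str],
                 languages: Optional[List[str]]) -> Optional[str]:
    """Return the language directory gettext.find() selects, or None."""
    # Clear the locale variables, remembering them so they can be restored.
    saved = {name: os.environ.pop(name, None) for name in LOCALE_VARS}
    os.environ.update(env)
    try:
        with tempfile.TemporaryDirectory() as localedir:
            # find() only checks that the file exists, so an empty
            # placeholder for an Estonian catalog is enough here.
            mo = Path(localedir, "et", "LC_MESSAGES", "demo.mo")
            mo.parent.mkdir(parents=True)
            mo.touch()
            found = gettext.find("demo", localedir=localedir,
                                 languages=languages)
            return Path(found).parts[-3] if found else None
    finally:
        for name in LOCALE_VARS:
            os.environ.pop(name, None)
            if saved[name] is not None:
                os.environ[name] = saved[name]


# languages=None: the environment decides which catalog is picked.
print(find_catalog({"LANGUAGE": "et_EE:et"}, None))
print(find_catalog({"LANG": "et_EE.UTF-8"}, None))
# languages=[]: the environment is never consulted.
print(find_catalog({"LANGUAGE": "et_EE:et"}, []))
```

The first two calls should report `et` and the last one `None`, which is the behaviour the hack relies on when SOURCE_DATE_EPOCH is set.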

@lamby lamby changed the title from "LANGUAGE environement variable inconsistently affects output of objects.inv" to "LANGUAGE environment variable inconsistently affects output of objects.inv" Oct 29, 2021