Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade sphinx to ~7.1.2 #1954

Merged
merged 1 commit into from
Aug 18, 2024
Merged

Conversation

EliahKagan
Copy link
Contributor

@EliahKagan EliahKagan commented Aug 18, 2024

This upgrades Sphinx from 4.3.2 to 7.1.2.

The old pinned version and its explicitly constrained dependencies are retained for now for Python 3.7 so that documentation can be built even with 3.7. (This could maybe be removed soon as a preliminary step toward eventually dropping 3.7 support.)

For Python 3.8 and higher, version 7.1.2 is used, allowing later patch versions but constrained to remain 7.1.*. This is so the same versions are likely to be selected on all Python version from 3.8 and higher, to minimize small differences in generated documentation that different versions could give, and also to simplify debugging.

The reason to upgrade Sphinx now is to support Python 3.13, which shall be (and, in the pre-releases available, is) incompatible with versions of Sphinx below 6.2. This is because those earlier Sphinx versions use the deprecated imghdr module, which 3.13 removes:

On old versions of Sphinx, that gives the error:

Extension error:
Could not import extension sphinx.builders.epub3 (exception: No module named 'imghdr')

Using Sphinx 6.2 is sufficient to avoid this, but there do not seem to be any disadvantages for GitPython to remain below 7.1.2.

The reason we did not upgrade Sphinx before is that, last time we considered doing so, we ran into a problem of new warnings (that we treat as errors). This is detailed in the "Can we upgrade Sphinx?" section of #1802, especially the "What Sphinx 5 is talking about" subsection. The problem is warnings about Actor when it comes in through type annotations:

WARNING: more than one target found for cross-reference 'Actor': git.objects.util.Actor, git.util.Actor

So this includes other changes to fix that problem as well. The solution used here is to import Actor even when TYPE_CHECKING is false, and write it unquoted in annotations, as Actor rather than "Actor". This allows Sphinx to discern where it should consider it to be located, for the purpose of linking to its documentation.

This does not incur overhead, because:

  • The affected modules already have imports from git.util, so also importing Actor from git.util does not cause any modules to be imported that were not imported otherwise, nor even to be imported at a different time.
  • Even if that that had not been the case, most modules in GitPython including git.util have members imported them into the top-level git module in git.__init__ when git is imported (and thus when any Python submodule of git is imported).

The only disadvantage is the presence of the additional name in those modules at runtime, which a user might inadvertently use and thus write brittle code that could break if it is later removed. But:

  • The affected modules define __all__ and do not include "Actor" in __all__, so it is non-public.
  • There are many places in GitPython (and most Python libraries) where the onus is already on the author of code that uses the library to avoid doing this.

The reasons for the approach taken here, rather than any of several others, were:

  1. I did not write out the annotations as git.util.Actor to resolve the ambiguity because annotations should, and for some uses must, also be interpretable as expressions. But while from git.util import Actor works and makes Actor available, git.util.Actor cannot be used as an expression even after import git.util. This is because, even after such an import, git.util actually refers to git.index.util. This is as detailed in the warnings issued when it is accessed, originally from an overly broad * import but retained because removing it could be a breaking change. See git/__init__.py for details.

  2. The reason I did not write out the annotations as git.objects.util.Actor to resolve the ambiguity is that not all occurrences where Sphinx needed to be told which module to document it as being from were within the git.objects Python submodule. Two of the warnings were in git/objects/tag.py, where annotating it that way would not be confusing. But the other two were in git/index/base.py.

  3. Although removing Actor from git.objects.util.__all__ would resolve the ambiguity, this should not be done, because:

    • This would be a breaking change.
    • It seems to be there deliberately, since git.objects.util contains other members that relate to it directly.
  4. The reasons I did not take this opportunity to move the contents of git/util.py to a new file in that directory and make git/util.py re-export the contents, even though this would allow a solution analogous to (1) but for the new module to be used, while also simplifying importing elsewhere, were:

    • That seems like a change that should be done separately, based either on the primary benefit to users or on a greater need for it.
    • If and when that is done, it may make sense to change the interface as well. For example, git/util.py has a number of members that it makes available for use throughout GitPython but that are deliberately omitted from __all__ and are meant as non-public outside the project. One approach would be to make a module with a leading _ for these "internal" members, and another public ones with everything else. But that also cannot be decided based on the considerations that motivate the changes here.
    • The treatment of HIDE_WINDOWS_KNOWN_ERRORS, which is public in git/util.py and which currently does have an effect, will need to be considered. Although it cannot be re-bound by assigning to git.util.HIDE_WINDOWS_KNOWN_ERRORS because the git.util subexpression would evaluate to git.index.util, there may be code that re-binds it in another way, such as by accessing the module through sys.modules. Unlike functions and classes that should not be monkey-patched from code outside GitPython's test suite anyway, this attribute may reasonably be assigned to, so it matters what module it is actually in, unless the action of assigning to it elsewhere is customized dynamically to carry over to the "real" place.
  5. An alternative solution that may be reasonable in the near future is to modify reference.rst so duplicate documentation is no longer emitted for functions and classes that are defined in one place but imported and "re-exported" elsewhere. I suspect this may solve the problem, allowing the Actor imports to go back under if TYPE_CHECKING: and to annotate with "Actor" again while still running make -C doc html with no warnings.

    However, this would entail design decisions about how to still document those members. They should probably have links to where they are fully documented. So an entry for Actor in the section of reference.rst for git.objects.util would still exist, but be very short and refer to the full autodoc item for Actor the section for git.util. Since a :class: reference to git.objects.util.Actor should still go to the stub that links to git.util.Actor, it is not obvious that solving the duplication in documentation generated for reference.rst ought to be done in a way that would address the ambiguity Sphinx warns about, even if it can be done in such a way.

    Therefore, that should also be a separate consideration and, if warranted, a separate change.


This builds successfully:

  • Locally in Ubuntu 22.04 LTS and Windows 10, tested with Python 3.12.5 and Python 3.13.0rc1.
  • On CI in this PR, on all three platforms and all Python versions already under test. Note that the older version of Sphinx and its dependencies continues to be used on Python 3.7. I advocate that support for building GitPython's documentation on Python 3.7 be dropped very soon to simplify the situation, but I have not included that in this PR.
  • On separate experimental CI on Python 3.13.0rc1 on Ubuntu and macOS. On Windows there is a test failure unrelated to these changes that occurs before the documentation step would run. Because I did heavy local testing of 3.13.0rc1, I have not set the unit test step to continue-on-error to observe the documentation step, though that could be done if necessary. Here's a CI run including 3.13.0rc1 without upgrading Sphinx, and here's a CI run with including 3.13.0rc1 as well as the changes from this PR.
  • On Read the Docs, in the preview build. Using a newer version of Sphinx changes the appearance subtly, but in my opinion it is an improvement. Here is the RTD preview build and here is the content there.

Notice how in the API reference each section name on the left side is now expandable, and the items inside it are then expandable, and so forth. This adds another way of finding what one is looking for. I didn't deliberately change anything here to achieve that; it comes along with the version upgrade.

The old pinned version and its explicitly constrained dependencies
are retained for now for Python 3.7 so that documentation can be
built even with 3.7. (This could maybe be removed soon as a
preliminary step toward evenutally dropping 3.7 support.)

For Python 3.8 and higher, version 7.1.2 is used, allowing later
patch versions but constrained to remain 7.1.*. This is so the same
versions are likely to be selected on all Python version from 3.8
and higher, to minimize small differences in generated documentation
that different versions could give, and also to simplify debugging.

The reason to upgrade Sphinx now is to suppport Python 3.13, which
shall be (and, in the pre-releases available, is) incompatible with
versions of Sphinx below 6.2. This is because those earlier Sphinx
versions use the deprecated `imghdr` module, which 3.13 removes:

- https://docs.python.org/3.13/whatsnew/3.13.html#whatsnew313-pep594
- python/cpython#104818

On old versions of Sphinx, that gives the error:

    Extension error:
    Could not import extension sphinx.builders.epub3 (exception: No module named 'imghdr')

Using Sphinx 6.2 is sufficient to avoid this, but there do not seem
to be any disadvantages for GitPython to remain below 7.1.2.

The reason we did not upgrade Sphinx before is that, last time we
considered doing so, we ran into a problem of new warnings (that we
treat as errors). This is detailed in the "Can we upgrade Sphinx?"
section of gitpython-developers#1802, especially the "What Sphinx 5 is talking about"
subsection. The problem is warnings about `Actor` when it comes
in through type annotations:

    WARNING: more than one target found for cross-reference 'Actor': git.objects.util.Actor, git.util.Actor

So this includes other changes to fix that problem as well. The
solution used here is to import `Actor` even when `TYPE_CHECKING`
is `false`, and write it unquoted in annotations, as `Actor` rather
than `"Actor"`. This allows Sphinx to discern where it should
consider it to be located, for the purpose of linking to its
documentation.

This does not incur overhead, because:

- The affected modules already have imports from `git.util`, so also
  importing `Actor` from `git.util` does not cause any modules to
  be imported that were not imported otherwise, nor even to be
  imported at a different time.

- Even if that that had not been the case, most modules in GitPython
  including `git.util` have members imported them into the top-level
  `git` module in `git.__init__` when `git` is imported (and thus
  when any Python submodule of `git` is imported).

The only disadvantage is the presence of the additional name in
those modules at runtime, which a user might inadvertently use and
thus write brittle code that could break if it is later removed.
But:

- The affected modules define `__all__` and do not include
  `"Actor"` in `__all__`, so it is non-public.

- There are many places in GitPython (and most Python libraries)
  where the onus is already on the author of code that uses the
  library to avoid doing this.

The reasons for this approach, rather than any of several others,
were:

1. I did not write out the annotations as `git.util.Actor` to
   resolve the ambiguity because annotations should, and for some
   uses must, also be interpretable as expressions. But while
   `from git.util import Actor` works and makes `Actor` available,
   `git.util.Actor` cannot be used as an expression even after
   `import git.util`. This is because, even after such an import,
   `git.util` actually refers to `git.index.util`. This is as
   detailed in the warnings issued when it is accessed, originally
   from an overly broad `*` import but retained because removing it
   could be a breaking change. See `git/__init__.py` for details.

2. The reason I did not write out the annotations as
   `git.objects.util.Actor` to resolve the ambiguity is that not
   all occurrences where Sphinx needed to be told which module to
   document it as being from were within the `git.objects` Python
   submodule. Two of the warnings were in `git/objects/tag.py`,
   where annotating it that way would not be confusing. But the
   other two were in `git/index/base.py`.

3. Although removing `Actor` from `git.objects.util.__all__` would
   resolve the ambiguity, this should not be done, because:

   - This would be a breaking change.

   - It seems to be there deliberately, since `git.objects.util`
     contains other members that relate to it directly.

4. The reasons I did not take this opportunity to move the contents
   of `git/util.py` to a new file in that directory and make
   `git/util.py` re-export the contents, even though this would
   allow a solution analogous to (1) but for the new module to be
   used, while also simplifying importing elsewhere, were:

   - That seems like a change that should be done separately, based
     either on the primary benefit to users or on a greater need
     for it.

   - If and when that is done, it may make sense to change the
     interface as well. For example, `git/util.py` has a number of
     members that it makes available for use throughout GitPython
     but that are deliberately omitted from `__all__` and are meant
     as non-public outside the project. One approach would be to make
     a module with a leading `_` for these "internal" members, and
     another public ones with everything else. But that also cannot
     be decided based on the considerations that motivate the
     changes here.

   - The treatment of `HIDE_WINDOWS_KNOWN_ERRORS`, which is public
     in `git/util.py` and which currently *does* have an effect,
     will need to be considered. Although it cannot be re-bound by
     assigning to `git.util.HIDE_WINDOWS_KNOWN_ERRORS` because the
     `git.util` subexpression would evaluate to `git.index.util`,
     there may be code that re-binds it in another way, such as by
     accessing the module through `sys.modules`. Unlike functions
     and classes that should not be monkey-patched from code
     outside GitPython's test suite anyway, this attribute may
     reasonably be assigned to, so it matters what module it is
     actually in, unless the action of assigning to it elsewhere
     is customized dynamically to carry over to the "real" place.

5. An alternative solution that may be reasonable in the near
   future is to modify `reference.rst` so duplicate documentation
   is no longer emitted for functions and classes that are defined
   in one place but imported and "re-exported" elsewhere. I suspect
   this may solve the problem, allowing the `Actor` imports to go
   back under `if TYPE_CHECKING:` and to annotate with `"Actor"`
   again while still running `make -C doc html` with no warnings.

   However, this would entail design decisions about how to still
   document those members. They should probably have links to where
   they are fully documented. So an entry for `Actor` in the
   section of `reference.rst` for `git.objects.util` would still
   exist, but be very short and refer to the full autodoc item for
   `Actor` the section for `git.util`. Since a `:class:` reference
   to `git.objects.util.Actor` should still go to the stub that
   links to `git.util.Actor`, it is not obvious that solving the
   duplication in documentation generated for `reference.rst` ought
   to be done in a way that would address the ambiguity Sphinx
   warns about, even if it *can* be done in such a way.

   Therefore, that should also be a separate consideration and, if
   warranted, a separate change.
@EliahKagan EliahKagan marked this pull request as ready for review August 18, 2024 07:14
@EliahKagan EliahKagan mentioned this pull request Aug 18, 2024
Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this initiative, and the incredible amount of attention to detail!

Independently of what made the changes necessary, the direct import led to more readable type definitions.

@Byron Byron merged commit 651fcf0 into gitpython-developers:main Aug 18, 2024
22 checks passed
@EliahKagan EliahKagan deleted the py313-doc branch August 18, 2024 08:55
@EliahKagan
Copy link
Contributor Author

Should support for building documentation on Python 3.7 be removed now, or kept in for a while? This would make the documentation step in the CI workflow conditional on not being 3.7, and would allow:

-sphinx >= 7.1.2, < 7.2 ; python_version >= "3.8"
+sphinx >= 7.1.2, < 7.2
-sphinx == 4.3.2 ; python_version < "3.8"
-sphinxcontrib-applehelp >= 1.0.2, <= 1.0.4 ; python_version < "3.8"
-sphinxcontrib-devhelp == 1.0.2 ; python_version < "3.8"
-sphinxcontrib-htmlhelp >= 2.0.0, <= 2.0.1 ; python_version < "3.8"
-sphinxcontrib-qthelp == 1.0.3 ; python_version < "3.8"
-sphinxcontrib-serializinghtml == 1.1.5 ; python_version < "3.8"
 sphinx_rtd_theme
 sphinx-autodoc-typehints

I don't think this is urgent, since special-casing 3.7 can remain in place for a while, possibly as long as we otherwise test/support 3.7.

However, I also suspect not much is gained by continuing to support building documentation on 3.7, because:

  • No one developing GitPython is going to only have access to Python 3.7.
  • The current situation (since this PR) already special-cases 3.7, on which the generated documentation would not be quite the same (due to the much earlier major version of Sphinx).

@Byron
Copy link
Member

Byron commented Aug 18, 2024

I'd love to get the patch above applied, thanks for raising awareness!

EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request Aug 18, 2024
This removes the specially cased alternative lower versions of
`sphinx` and its dependencies that, since gitpython-developers#1954, were only for
Python 3.7. As discussed in comments there, this simplifies the
documentation dependencies and avoids a situation where the version
of Python used to build the documentation has a noticeable effect
on the generated result.

This also conditions running the "Documentation" step in the main
CI test workflow (`pythonpackage.yml`) on the Python version not
being 3.7 (otherwise the job would always fail).

The only change this makes to the support status of GitPython on
Python 3.7 is to no longer support building documentation on 3.7.
GitPython can still be installed and used on 3.7 (though usually
this would not be a good idea, outside of testing, since Python 3.7
itself has not been supported by the Python Software Foundation for
quite some time). In addition, the documentation, which can be
built on any version >= 3.8 (including 3.13 starting in gitpython-developers#1954) is
no less relevant to usage on Python 3.7 than it was before.
@EliahKagan
Copy link
Contributor Author

I've opened #1956 for this. Note that, in addition to the above-shown patch, it also modifies the main CI test workflow so it doesn't attempt to build documentation on Python 3.7.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants