Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace all wildcard imports with explicit imports #1880

Merged
merged 34 commits into from
Mar 19, 2024

Commits on Mar 18, 2024

  1. Add a script to validate refactored imports

    This script can be removed after imports are refactored and checked
    to see that module contents are the same as before or otherwise
    non-broken.
    
    This script assumes that module contents are the same as the
    contents of their dictionaries, and that all modules in the project
    get imported as a consequence of importing the top-level module.
    These are both the case currently for GitPython, but they do not
    hold for all projects, and may not hold for GitPython at some point
    in the future.
    EliahKagan committed Mar 18, 2024
    Configuration menu
    Copy the full SHA
    1e5a944 View commit details
    Browse the repository at this point in the history
  2. Add regression tests of the git.util aliasing situation

    Although this situation is not inherently desirable, for backward
    compatibility it cannot change at this time. It may be possible to
    change it in the next major version of GitPython, but even then it
    should not be changed accidentally, which can easily happen while
    refactoring imports.
    
    This tests the highest-risk accidental change (of those that are
    currently known) of the kind that the temporary modattrs.py script
    exists to help safeguard against. That script will be removed when
    the immediately forthcoming import refactoring is complete, whereas
    these test cases can be kept.
    
    For information about the specific situation this helps ensure
    isn't changed accidentally, see the new test cases' docstrings, as
    well as the next commit (which will test modattrs.py and these test
    cases by performing an incomplete change that would be a bug until
    completed).
    
    This commit adds three test cases. The first tests the unintuitive
    aspect of the current situation:
    
    - test_git_util_attribute_is_git_index_util
    
    The other two test the intuitive aspects of it, i.e., they test
    that changes (perhaps in an attempt to preserve the aspect needed
    for backward compatibility) do not make `git.util` unusual in new
    (and themselves incompatible) ways:
    
    - test_git_index_util_attribute_is_git_index_util
    - test_separate_git_util_module_exists
    
    The latter tests should also clarify, for readers of the tests, the
    limited nature of the condition the first test asserts.
    EliahKagan committed Mar 18, 2024
    Configuration menu
    Copy the full SHA
    5b2771d View commit details
    Browse the repository at this point in the history
  3. Incompletely change git.index imports to test modattrs.py

    This also checks if the regression tests in test_imports.py work.
    
    This replaces wildcard imports in git.index with explicit imports,
    and adds git.index.__all__. But this temporarily omits from it the
    public attributes of git.index that name its Python submodules and
    are thus are present as an indirect effect of importing names
    *from* them (since doing so also imports them).
    
    This partial change, until it is completed in the next commit,
    represents the kind of bug that modattrs.py seeks to safeguard
    against, and verifies that a diff between old and current output of
    modattrs.py clearly shows it. This diff is temporarily included as
    ab.diff, which will be deleted in the next commit.
    
    The key problem that diff reveals is the changed value of the util
    attribute of the top-level git module. Due to the way wildcard
    imports have been used within GitPython for its own modules,
    expressions like `git.util` (where `git` is the top-level git
    module) and imports like `from git import util` are actually
    referring to git.index.util, rather than the util Python submodule
    of the git module (conceptually git.util), which can only be
    accessed via `from git.util import ...` or in `sys.modules`.
    Although this is not an inherently good situation, in practice it
    has to be maintained at least until the next major version, because
    reasonable code that uses GitPython would be broken if the
    situation were changed to be more intuitive.
    
    But the more intuitive behavior automatically happens if wildcard
    imports are replaced with explicit imports of the names they had
    originally intended to import (in this case, in the top-level git
    module, which has not yet been done but will be), or if __all__ is
    added where helpful (in this case, in git.index, which this does).
    Adding the names explicitly is sufficient to restore the old
    behavior that is needed for backward compatibility, and has the
    further advantage that the unintuitive behavior will be explicit
    once all wildcard imports are replaced and __all__ is added to each
    module where it would be helpful. The modattrs.py script serves to
    ensure incompatible changes are not made; this commit checks it.
    
    The tests in test_imports.py also cover this specific anticipated
    incompatibility in git.util, but not all breakages that may arise
    when refactoring imports and that diff-ing modattrs.py output would
    help catch.
    EliahKagan committed Mar 18, 2024
    Configuration menu
    Copy the full SHA
    fc86a23 View commit details
    Browse the repository at this point in the history
  4. Fix git.index imports

    - Add the base, fun, typ, and util Python submodules of git.index
      to git.index.__all__ to restore the old behavior.
    
    - Import them explicitly for clarity, even though they are bound to
      the same-named attributes of git.index either way. This should
      help human readers know what the entries in __all__ are referring
      to, and may also help some automated tools.
    
      This is a preliminary decision and not necessarily the one that
      will ultimately be taken in this import refactoring. It *may*
      cause some kinds of mistakes, such as those arising from
      ill-advised use of wildcard imports in production code *outside*
      GitPython, somewhat worse in their effects.
    
    - Remove the ab.diff file that showed the problem this solves.
    
    This resolves the problem deliberately introduced in the previous
    incomplete commit, which modattrs.py output and test_imports test
    results confirm.
    EliahKagan committed Mar 18, 2024
    Configuration menu
    Copy the full SHA
    4badc19 View commit details
    Browse the repository at this point in the history
  5. Configuration menu
    Copy the full SHA
    1c9bda2 View commit details
    Browse the repository at this point in the history
  6. Configuration menu
    Copy the full SHA
    8b51af3 View commit details
    Browse the repository at this point in the history
  7. Replace wildcard imports in git.refs

    This uses explicit imports rather than wildcard imports in git.refs
    for names from its submodules.
    
    A comment suggested that the import order was deliberate. But each
    of the modules being imported from defined __all__, and there was
    no overlap in the names across any of them.
    
    The other main reason to import in a specific order is an order
    dependency in the side effects, but that does not appear to apply,
    at least not at this time.
    
    (In addition, at least for now, this adds explicit imports for the
    Python submodules of git.refs, so it is clear that they can always
    be accessed directly in git.refs without further imports, if
    desired. For clarity, those appear first, and that makes the order
    of the "from" imports not relevant to such side effects, due to the
    "from" imports no longer causing modules to be loaded for the first
    time. However, this is a much less important reason to consider the
    other imports' reordering okay, because these module imports may
    end up being removed later during this refactoring; their clarity
    benefit might not be justified, because if production code outside
    GitPython ill-advisedly uses wildcard imports, the bad effect of
    doing so could be increased.)
    EliahKagan committed Mar 18, 2024
    Configuration menu
    Copy the full SHA
    b25dd7e View commit details
    Browse the repository at this point in the history
  8. Configuration menu
    Copy the full SHA
    b32ef65 View commit details
    Browse the repository at this point in the history
  9. Add git.repo.__all__ and make submodules explicit

    - No need to import Repo "as Repo". Some tools only recognize this
      to give the name conceptually public status when it appears in
      type stub files (.pyi files), and listing it in the newly created
      __all__ is sufficient to let humans and all tools know it has
      that status.
    
    - As very recently done in git.refs, this explicitly imports the
      submodules, so it is clear they are available and don't have to
      be explicitly imported. (Fundamental to the way they are used is
      that they will end up being imported in order to import Repo.)
    
      However, also as in git.refs, it may be that the problems this
      could cause in some inherently flawed but plausible uses are too
      greater for it to be justified. So this may be revisited.
    EliahKagan committed Mar 18, 2024
    Configuration menu
    Copy the full SHA
    0ba06e9 View commit details
    Browse the repository at this point in the history
  10. Improve order of imports and __all__ in git.objects.*

    But not yet in git.objects.submodule.* modules.
    EliahKagan committed Mar 18, 2024
    Configuration menu
    Copy the full SHA
    c946906 View commit details
    Browse the repository at this point in the history
  11. Configuration menu
    Copy the full SHA
    4e9a2f2 View commit details
    Browse the repository at this point in the history
  12. Configuration menu
    Copy the full SHA
    c58be4c View commit details
    Browse the repository at this point in the history
  13. Don't patch IndexObject and Object into git.objects.submodule.util

    Such patching, which was introduced in 9519f18, seems no longer to
    be necessary. Since git.objects.submodule.util.__all__ has existed
    for a long time without including those names, they are not
    conceptually public attributes of git.objects.submodule.util, so
    they should not be in use by any code outside GitPython either.
    
    The modattrs.py script shows the change, as expected, showing these
    two names as no longer being in the git.objects.submodule.util
    module dictionary, in output of:
    
        python modattrs.py >b
        git diff --no-index a b
    
    However, because the removal is intentional, this is okay.
    EliahKagan committed Mar 18, 2024
    Configuration menu
    Copy the full SHA
    01c95eb View commit details
    Browse the repository at this point in the history
  14. Fix git.objects.__all__ and make submodules explicit

    The submodules being made explicit here are of course Python
    submodules, not git submodules.
    
    The git.objects.submodule Python submodule (which is *about* git
    submodules) is made explicit here (as are the names imported from
    that Python submodule's own Python submodules) along with the other
    Python submodules of git.objects.
    
    Unlike some other submodules, but like the top-level git module
    until gitpython-developers#1659, git.objects already defined __all__ but it was
    dynamically constructed. As with git.__all__ previously (as noted
    in gitpython-developers#1859), I have used https://github.com/EliahKagan/deltall here
    to check that it is safe, sufficient, and probably necessary to
    replace this dynamically constructed __all__ with a literal list of
    strings in which all of the original elements are included. See:
    https://gist.github.com/EliahKagan/e627603717ca5f9cafaf8de9d9f27ad7
    
    Running the modattrs.py script, and diffing against the output from
    before the current import refactoring began, reveals two changes to
    module contents as a result of the change here:
    
    - git.objects no longer contains `inspect`, which it imported just
      to build the dynamically constructed __all__. Because this was
      not itself included in that __all__ or otherwise made public or
      used out of git.objects, that is okay. This is exactly analogous
      to the situation in 8197e90, which removed it from the top-level
      git module after gitpython-developers#1659.
    
    - The top-level git module now has has new attributes blob, commit,
      submodule, and tree, aliasing the corresponding modules. This
      has happened as a result of them being put in git.objects.__all__
      along with various names imported from them. (As noted in some
      prior commits, there are tradeoffs associated with doing this,
      and it may be that such elements of __all__ should be removed,
      here and elsewhere.)
    EliahKagan committed Mar 18, 2024
    Configuration menu
    Copy the full SHA
    f89d065 View commit details
    Browse the repository at this point in the history
  15. Make git.objects.util module docstring more specific

    So git.objects.util is less likely to be confused with git.util.
    
    (The modattrs.py script reveals the change in the value of
    git.objects.util.__doc__, as expected.)
    EliahKagan committed Mar 18, 2024
    Configuration menu
    Copy the full SHA
    3786307 View commit details
    Browse the repository at this point in the history
  16. Add __all__ and imports in git.objects.submodule

    This is the top-level of the git.objects.submodule subpackage, for
    its own Python submodules.
    
    This appears only not to have been done before because of a cyclic
    import problem involving imports that are no longer present. Doing
    it improves consistency, since all the other subpackages under the
    top-level git package already do this.
    
    The modattrs.py script reveals the four new attributes of
    git.objects.submodule for the four classes obtained by the new
    imports, as expected. (The explicit module imports do not change
    the attribues because they are the same attributes as come into
    existence due to those Python submodules being imported anywhere in
    any style.)
    EliahKagan committed Mar 18, 2024
    Configuration menu
    Copy the full SHA
    de540b7 View commit details
    Browse the repository at this point in the history
  17. Improve how imports and __all__ are written in git.util

    + Update the detailed comment about the unused import situation.
    EliahKagan committed Mar 18, 2024
    Configuration menu
    Copy the full SHA
    a05597a View commit details
    Browse the repository at this point in the history

Commits on Mar 19, 2024

  1. Configuration menu
    Copy the full SHA
    2053a3d View commit details
    Browse the repository at this point in the history
  2. Configuration menu
    Copy the full SHA
    b8bab43 View commit details
    Browse the repository at this point in the history
  3. Improve how second-level imports and __all__ are written

    These are in the modules that are directly under the top-level git
    module (and are not themselves subpackages, with their own
    submodules in the Python sense), except for:
    
    - git.util, where this change was very recently made.
    
    - git.compat, where no improvements of this kind were needed.
    
    - git.types, which currently has no __all__ and will only benefit from
      it if what should go in it is carefully considered (and where the
      imports themselves are grouped, sorted, and formatted already).
    EliahKagan committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    3d4e476 View commit details
    Browse the repository at this point in the history
  4. Make F401 "unused import" suppressions more specific

    In git.compat and git.util.
    
    This applies them individually to each name that is known to be an
    unused import, rather than to entire import statements (except
    where the entire statement fit on one line and everything it
    imported is known to be unused).
    
    Either Ruff has the ability to accept this more granular style of
    suppression, or I had been unaware of it from flake8 (used before).
    I have veriifed that the suppressions are not superfluous: with no
    suppressions, Ruff does warn. This commit makes no change in
    git.types because it looks like no suppression at all may be
    needed there anymore; that will be covered in the next commit.
    
    This also removes the old `@UnusedImport` comments, which had been
    kept before because they were more granular; this specificity is
    now achieved by comments the tools being used can recognize.
    EliahKagan committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    6318eea View commit details
    Browse the repository at this point in the history
  5. Remove unneeded F401 "Unused import" suppressions

    In git.types.
    
    It appears Ruff, unlike flake8, recognizes "import X as X" to mean
    that "X" should not be considered unused, even when it appears
    outside a .pyi type stub file where that notation is more commonly
    used.
    
    Those imports in git.types may benefit from being changed in some
    way that uses a syntax whose intent is clearer in context, and
    depending on how that is done it may even be necessary to bring
    back suppressions. If so, they can be written more specifically.
    (For now, changing them would express more than is known about what
    names that are only present in git.types becuase it imports them
    should continue to be imported, should be considered conceptually
    public, or should be condered suitable for use within GitPython.)
    EliahKagan committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    31bc8a4 View commit details
    Browse the repository at this point in the history
  6. Configuration menu
    Copy the full SHA
    abbe74d View commit details
    Browse the repository at this point in the history
  7. Replace wildcard imports in top-level git module

    - Use explicit imports instead of * imports.
    - Remove now-unneeded linter suppressions.
    - Alphabetize inside the try-block, though this will be undone.
    
    This currently fails to import due to a cyclic import error, so the
    third change, alphabetizing the imports, will have to be undone (at
    least in the absence of other changes). It is not merely that they
    should not be reordered in a way that makes them cross into or out
    of the try-block, but that within the try block not all orders will
    work.
    
    There will be more to do for backward compatibility, but the
    modattrs.py script, which imports the top-level git module, cannot
    be run until the cyclic import problem is fixed.
    EliahKagan committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    7745250 View commit details
    Browse the repository at this point in the history
  8. Restore relative order to fix circular import error

    This still uses all explicit rather than wildcard imports (and
    still omits suppressions that are no longer needed due to wildcard
    imports not being used). But it brings back the old relative order
    of the `from ... import ...` statements inside the try-block.
    
    Since this fixes the circular import problem, it is possible to run
    the modattrs.py script to check for changes. New changes since
    replacing wildcard imports, which are probably undesirable, are the
    removal of these attributes pointing to indirect Python submodules
    of the git module:
    
        base -> git.index.base
        fun -> git.index.fun
        head -> git.refs.head
        log -> git.refs.log
        reference -> git.refs.reference
        symbolic -> git.refs.symbolic
        tag -> git.refs.tag
        typ -> git.index.typ
    EliahKagan committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    64c9efd View commit details
    Browse the repository at this point in the history
  9. Add the nonpublic indirect submodule aliases back for now

    These should definitely never be used by code inside or outside of
    GitPython, as they have never been public, having even been omitted
    by the dynamic (and expansive) technique used to build git.__all__
    in the past (which omitted modules intentionally).
    
    However, to ease compatibility, for now they are back. This is so
    that the change of making all imports explicit rather than using
    wildcards does not break anything. However, code using these names
    could never be relied on to work, and these should be considered
    eligible for removal, at least from the perspective of code outside
    GitPython.
    
    That is for the indirect submodules whose absence as a same-named
    direct submodule or attribute listed in __all__ was readily
    discoverable.
    
    The more difficult case is util. git.util is a module, git/util.py,
    which is how it is treated when it appears immediately after the
    "from" keyword, or as a key in sys.modules. However, the expression
    `git.util`, and a `from git import util` import, instead access the
    separate git.index.util module, which due to a wildcard import has
    been an attribute of the top-level git module for a long time.
    
    It may not be safe to change this, because although any code
    anywhere is better off not relying on this, this situation hasn't
    been (and isn't) immediately clear. To help with it somewhat, this
    also includes a detailed comment over where util is imported from
    git.index, explaining the situation.
    EliahKagan committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    31f89a1 View commit details
    Browse the repository at this point in the history
  10. Configuration menu
    Copy the full SHA
    9bbbcb5 View commit details
    Browse the repository at this point in the history
  11. Add missing submodule imports in git.objects

    Since they are listed in __all__. (They are imported either way
    because names are imported from them, and this caused them to be
    present with the same effect.)
    
    Though they are proably about to be removed along with the
    corresponding entries in __all__.
    EliahKagan committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    00f4cbc View commit details
    Browse the repository at this point in the history
  12. Don't explicitly list direct submodules in __all__

    This is for non-toplevel __all__, as git.__all__ already did not
    do this.
    
    As noted in some of the previous commit messags that added them,
    omitting them might be a bit safer in terms of the impact of bugs
    bugs in code using GitPython, in that unexpected modules, some of
    which have the same name as other modules within GitPython, won't
    be imported due to wildcard imports from intermediate subpackages
    (those that are below the top-level git package/module but collect
    non-subpackage modules). Though it is hard to know, since some of
    these would have been imported before, when an __all__ was not
    defined at that level.
    
    However, a separate benefit of omitting them is consistency with
    git.__all__, which does not list the direct Python submodules of
    the git module.
    
    This does not affect the output of the modattrs.py script, because
    the attributes exist with the same objects as their values as a
    result of those Python submodules being imported (in "from" imports
    and otherwise), including when importing the top-level git module.
    EliahKagan committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    fcc7418 View commit details
    Browse the repository at this point in the history
  13. Pick a consistent type for __all__ (for now, list)

    This makes all __all__ everywhere in the git package lists. Before,
    roughly half were lists and half were tuples. There are reasonable
    theoretical arguments for both, and in practice all tools today
    support both. Traditionally using a list is far more common, and it
    remains at least somewhat more common. Furthermore, git/util.py
    uses a list and is currently written to append an element to it
    that is conditionally defined on Windows (though it would probably
    be fine for that to be the only list, since it reflects an actual
    relevant difference about it).
    
    The goal here is just to remove inconsistency that does not signify
    anything and is the result of drift over time. If a reason (even
    preference) arises to make them all tuples in the future, then that
    is also probably fine.
    EliahKagan committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    78055a8 View commit details
    Browse the repository at this point in the history
  14. Save diff of non-__all__ attributes across import changes

    This commits the diff of the output of the modattrs.py script
    between when the script was introduced in 1e5a944, and now (with
    the import refactoring finished and no wildcard imports remaining
    outside the test suite, and only there in one import statement for
    test helpers).
    
    Neither this diff nor modattrs.py itself will be retained. Both
    will be removed in the next commit. This is committed here to
    facilitate inspection and debugging (especially if turns out there
    are thus-far undetected regressions).
    EliahKagan committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    ecdb6aa View commit details
    Browse the repository at this point in the history
  15. Remove modattrs.py and related

    Now that it has served its purpose. (Of course, it can also be
    brought back from the history at any time if needed.)
    
    Changes:
    
    - Delete the modattrs.py script that was used to check for
      unintended changes to what module attributes existed and what
      objects they referred to, while doing the import refactoring.
    
    - Delete the ab.diff file showing the overall diff, which was
      temporarily introduced in the previous commit.
    
    - Remove the "a" and "b" temporary entries in .gitignore that were
      used to facilitate efficient use of modattrs.py while carrying
      out the import refactoring.
    EliahKagan committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    f705fd6 View commit details
    Browse the repository at this point in the history
  16. Improve test suite import grouping/sorting, __all__ placement

    There is only one __all__ in the test suite, so this is mostly the
    change to imports, grouping and sorting them in a fully consistent
    style that is the same as the import style in the code under test.
    EliahKagan committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    4a4d880 View commit details
    Browse the repository at this point in the history
  17. Configuration menu
    Copy the full SHA
    d524c76 View commit details
    Browse the repository at this point in the history