Replace imp module by importlib #11530

Merged
merged 2 commits into from
Apr 21, 2023

Conversation

vkuznet
Contributor

@vkuznet vkuznet commented Apr 4, 2023

Fixes #11434

Status

ready

Description

Replace import imp occurrences with import importlib

Is it backward compatible (if not, which system it affects?)

MAYBE

Related PRs

External dependencies / deployment changes

@vkuznet vkuznet requested a review from amaltaro April 4, 2023 12:59
@vkuznet vkuznet self-assigned this Apr 4, 2023
@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 17 new failures
  • Python3 Pylint check: failed
    • 75 warnings and errors that must be fixed
    • 6 warnings
    • 173 comments to review
  • Pylint py3k check: failed
    • 10 errors and warnings that should be fixed
    • 2 warnings
  • Pycodestyle check: succeeded
    • 146 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14168/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet removed the request for review from amaltaro April 4, 2023 13:28
@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 2 new failures
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 50 warnings and errors that must be fixed
    • 6 warnings
    • 169 comments to review
  • Pylint py3k check: failed
    • 10 errors and warnings that should be fixed
    • 2 warnings
  • Pycodestyle check: succeeded
    • 143 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14169/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 2 new failures
    • 1 tests no longer failing
  • Python3 Pylint check: failed
    • 47 warnings and errors that must be fixed
    • 6 warnings
    • 170 comments to review
  • Pylint py3k check: failed
    • 10 errors and warnings that should be fixed
    • 2 warnings
  • Pycodestyle check: succeeded
    • 145 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14170/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 tests no longer failing
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 47 warnings and errors that must be fixed
    • 6 warnings
    • 170 comments to review
  • Pylint py3k check: failed
    • 10 errors and warnings that should be fixed
    • 2 warnings
  • Pycodestyle check: succeeded
    • 145 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14171/artifact/artifacts/PullRequestReport.html

Contributor

@amaltaro amaltaro left a comment

@vkuznet Valentin, the python3 documentation suggests different replacements for imp.find_module and imp.load_module, see:
https://docs.python.org/3/library/imp.html#imp.load_module

Can you please clarify why you implemented these with importlib.machinery instead of importlib.util?
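
For reference, the replacements that the imp deprecation notes point to can be sketched roughly as below: importlib.util.find_spec() in place of imp.find_module(), and importlib.import_module() where the two imp calls were used together to import a module by name. This is only an illustration, not the code from this PR; the json module is a stand-in for whatever module name WMCore would look up.

```python
import importlib
import importlib.util

# "json" is only a stand-in module name for this illustration.
moduleName = "json"

# Rough counterpart of imp.find_module(): locate the module without importing it.
spec = importlib.util.find_spec(moduleName)
if spec is None:
    raise ImportError("module %s not found" % moduleName)

# Rough counterpart of imp.find_module() followed by imp.load_module().
module = importlib.import_module(moduleName)

print(spec.origin)
print(module.__name__)
```

Note that find_spec() searches the regular import path, so loading a file from an arbitrary directory needs a different approach.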

@vkuznet
Contributor Author

vkuznet commented Apr 5, 2023

Alan, I made the changes as close to the original implementation as possible. Here are the different ways of loading a module:

  • using the imp library
    cfgBaseName = os.path.basename(configPath).replace(".py", "")
    cfgDirName = os.path.dirname(configPath)
    modPath = imp.find_module(cfgBaseName, [cfgDirName])
    loadedConfig = imp.load_module(cfgBaseName, modPath[0],
                                   modPath[1], modPath[2])
  • using importlib.machinery implementation
    cfgBaseName = os.path.basename(configPath).replace(".py", "")
    cfgDirName = os.path.dirname(configPath)
    modSpecs = importlib.machinery.PathFinder().find_spec(cfgBaseName, [cfgDirName])
    module = modSpecs.loader.load_module()
  • using importlib.util implementation
    cfgBaseName = os.path.basename(configPath).replace(".py", "")
    spec = importlib.util.spec_from_file_location(cfgBaseName, configPath)
    module = importlib.util.module_from_spec(spec)
    # module_from_spec() only creates the module object; exec_module() runs its code
    spec.loader.exec_module(module)

The Python documentation does not explicitly say which method is better (even though it provides examples using importlib.util), and both importlib.machinery and importlib.util are appropriate ways to load modules. The former is more abstract and flexible, though.
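
The importlib.machinery variant above calls load_module(), which the importlib documentation marks as deprecated in favor of exec_module(). A minimal sketch of a PathFinder-based loader that avoids load_module() could look like this; the helper name load_config_with_pathfinder and the demo file are hypothetical and not part of WMCore.

```python
import importlib.machinery
import importlib.util
import os
import tempfile


def load_config_with_pathfinder(configPath):
    """Locate a Python file via PathFinder and return the executed module."""
    cfgBaseName = os.path.basename(configPath).replace(".py", "")
    cfgDirName = os.path.dirname(configPath)
    # Restrict the search to the directory holding the config file.
    modSpec = importlib.machinery.PathFinder.find_spec(cfgBaseName, [cfgDirName])
    if modSpec is None or modSpec.loader is None:
        raise ImportError("cannot find %s in %s" % (cfgBaseName, cfgDirName))
    # Create the module from the spec, then run its code.
    module = importlib.util.module_from_spec(modSpec)
    modSpec.loader.exec_module(module)
    return module


# Hypothetical demo: write a tiny config file and load it back.
with tempfile.TemporaryDirectory() as tmpDir:
    demoPath = os.path.join(tmpDir, "demo_pathfinder_config.py")
    with open(demoPath, "w") as fd:
        fd.write("maxRetries = 3\n")
    pathfinderConfig = load_config_with_pathfinder(demoPath)
    print(pathfinderConfig.maxRetries)
```

Unlike imp.load_module() and load_module(), this sketch does not register the module in sys.modules; a caller that relies on that would have to add the entry itself.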

Contributor

@todor-ivanov todor-ivanov left a comment

Hi @vkuznet
Thanks for this PR. I was about to ask the same question as Alan did. To me, the third implementation you suggested looks the best, namely this one:

  • using importlib.util implementation
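   cfgBaseName = os.path.basename(configPath).replace(".py", "")
   spec = importlib.util.spec_from_file_location(cfgBaseName, configPath)
   module = importlib.util.module_from_spec(spec)
   # module_from_spec() only creates the module object; exec_module() runs its code
   spec.loader.exec_module(module)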
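
For a file-path based loader along these lines, a minimal runnable sketch could look as follows. Note that module_from_spec() only creates the module object, so the sketch calls spec.loader.exec_module() to actually run the file's code. The helper name load_config and the demo file are hypothetical; this is not the code that was merged.

```python
import importlib.util
import os
import tempfile


def load_config(configPath):
    """Load a Python file by path and return the executed module object."""
    cfgBaseName = os.path.basename(configPath).replace(".py", "")
    spec = importlib.util.spec_from_file_location(cfgBaseName, configPath)
    if spec is None or spec.loader is None:
        raise ImportError("cannot build an import spec for %s" % configPath)
    module = importlib.util.module_from_spec(spec)
    # Without this call the module would exist but stay empty.
    spec.loader.exec_module(module)
    return module


# Hypothetical demo: write a tiny config file and load it back.
with tempfile.TemporaryDirectory() as tmpDir:
    demoPath = os.path.join(tmpDir, "demo_config.py")
    with open(demoPath, "w") as fd:
        fd.write("answer = 42\n")
    loadedConfig = load_config(demoPath)
    print(loadedConfig.answer)
```

This form needs no directory search at all, since the full file path is handed to spec_from_file_location() directly.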

@vkuznet
Contributor Author

vkuznet commented Apr 10, 2023

@amaltaro, @todor-ivanov could you please clarify how we should move forward with this PR? Neither of you explicitly said "let's change the fix to use importlib.util instead of importlib.machinery". If you feel this way, please say so explicitly; otherwise you leave me guessing about your decision. Or, if you are ok with the current implementation, let's merge it.

@amaltaro
Contributor

@vkuznet Valentin, my preference goes towards importlib.util, but given that you have already made all the relevant changes, including to the unit tests, I feel it's not worth changing which implementation/module we use.

I do have a concern though, instead of importing the whole importlib, should we actually import only what's going to be used, thus:

from importlib import machinery

?

@vkuznet
Contributor Author

vkuznet commented Apr 12, 2023

Alan, importlib has only one machinery.py module, therefore

from importlib import machinery

is equivalent to

import importlib.machinery

In both cases they provide identical content, e.g.:

>>> import importlib.machinery
>>> help(importlib.machinery)
Help on module importlib.machinery in importlib:

NAME
    importlib.machinery - The machinery of importlib: finders, loaders, hooks, etc.

MODULE REFERENCE
    https://docs.python.org/3.11/library/importlib.machinery.html

    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

FUNCTIONS
    all_suffixes()
        Returns a list of all recognized module suffixes for this process

DATA
    BYTECODE_SUFFIXES = ['.pyc']
    DEBUG_BYTECODE_SUFFIXES = ['.pyc']
    EXTENSION_SUFFIXES = ['.cpython-311-darwin.so', '.abi3.so', '.so']
    OPTIMIZED_BYTECODE_SUFFIXES = ['.pyc']
    SOURCE_SUFFIXES = ['.py']

FILE
    /opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/machinery.py

vs

>>> from importlib import machinery
>>> help(machinery)
Help on module importlib.machinery in importlib:

NAME
    importlib.machinery - The machinery of importlib: finders, loaders, hooks, etc.

MODULE REFERENCE
    https://docs.python.org/3.11/library/importlib.machinery.html

    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

FUNCTIONS
    all_suffixes()
        Returns a list of all recognized module suffixes for this process

DATA
    BYTECODE_SUFFIXES = ['.pyc']
    DEBUG_BYTECODE_SUFFIXES = ['.pyc']
    EXTENSION_SUFFIXES = ['.cpython-311-darwin.so', '.abi3.so', '.so']
    OPTIMIZED_BYTECODE_SUFFIXES = ['.pyc']
    SOURCE_SUFFIXES = ['.py']

FILE
    /opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/machinery.py
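
A quicker way to check the equivalence claimed here, without comparing help() output, is to test object identity. A minimal sketch:

```python
import sys

import importlib.machinery
from importlib import machinery

# Both import forms resolve to the single module object cached in sys.modules.
sameObject = machinery is importlib.machinery
sameAsCache = sys.modules["importlib.machinery"] is machinery

print(sameObject, sameAsCache)
```

The two forms differ only in the names they bind in the importing namespace: `import importlib.machinery` binds `importlib`, while `from importlib import machinery` binds `machinery`.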

@amaltaro
Contributor

Sorry, I guess I wasn't clear enough in my previous message. What I wanted to say is that, if only the machinery module is used in a WMCore module, ideally we should change the import from:

import importlib

to

from importlib import machinery

From what I read, the lack of this change will not affect memory footprint though. So we can just leave it for another day as well.

@vkuznet
Contributor Author

vkuznet commented Apr 13, 2023

Alan, I would rather keep the code this way, as it makes clear what it is doing in the places that have been changed, i.e. the importlib.machinery prefix makes it explicit.

@amaltaro amaltaro self-requested a review April 13, 2023 13:18
Contributor

@amaltaro amaltaro left a comment

Sounds good to me, Valentin!
Given that we are addressing the final bits for an upcoming production deployment, let's merge it only after upgrading services in CMSWEB production.

@amaltaro amaltaro merged commit 0f27526 into dmwm:master Apr 21, 2023
Successfully merging this pull request may close these issues.

Replace imp by importlib library
4 participants