Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken module path and module name during import #11587

Merged

Conversation

todor-ivanov
Copy link
Contributor

Fixes #11585

Status

Ready

Description

With the current PR, we try to correctly resolve module's path and name when loading the WMAgent config template at wmagent-mod-config.

Is it backward compatible (if not, which system it affects?)

YES

Related PRs

None

External dependencies / deployment changes

None

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 3 warnings
    • 32 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 5 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14249/artifact/artifacts/PullRequestReport.html

Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@todor-ivanov please see my review along the code.
In addition to that, the issue that is supposed to be closed by this PR is still unassigned and in status:ToDo (instead of In Progress). Please do always update issues to properly reflect what you are working on.


cfgBaseName = os.path.basename(filename).replace(".py", "")
cfgDirName = os.path.dirname(filename)
if not cfgDirName:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Todor, in which scenario a given filename would NOT have a dirname? I am trying to understand if we really need this if/else statement below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If one directly tries to load by module name and the search path is supposed to be the default system python search path, rather than determined by the relative path coming with the filename.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this method expects a filename as input, then it should not work if a module name is passed instead.
But now I understand when cfgDirName would be None, given that we strip the filename.

try:
modRef = modSpecs.loader.load_module(cfgBaseName)
except Exception as ex:
msg = "Unable to load Configuration File:\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can probably shrink all these lines into only 2 (or maybe 3). There is no need to make them that short.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@todor-ivanov todor-ivanov force-pushed the bugfix_Config_BrokenImportlib_fix-11585 branch from 140996d to 5ab9738 Compare May 9, 2023 18:19
@todor-ivanov todor-ivanov requested a review from amaltaro May 9, 2023 18:19
@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 3 warnings
    • 32 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 5 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14251/artifact/artifacts/PullRequestReport.html

Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@todor-ivanov I think the code looks good, but I need your confirmation in one line of the code.

mod = modSpecs.loader.load_module("wmcore_config_input")
config = getattr(mod, 'config', None)

cfgBaseName = os.path.basename(filename).replace(".py", "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this cfgBaseName set to wmcore_config_input? Or a different question, what is filename meant to be here?

Copy link
Contributor Author

@todor-ivanov todor-ivanov May 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am not wrong, wmcore_config_input was a module name assigned during load time with the previous implementation with imp. This module name has never been persisted anywhere neither in the module definition nor the code .... Currently importlib uses the file's basename as module name, which I think is also the default for python in general. So it doesn't matter if one passes a module name or a file name here (we will strip the .py suffix anyway). The only thing that matters is if one includes the full path (cfgDirName in this case) with the filename. Depending on that, two different modes to import the module with importlib needs to be called.

  • one - using the so provided directory for the module search path
  • the alternative - using the default system's python module search path

Now that said, to answer your additional question:

Or a different question, what is filename meant to be here?

This may (and could) be any of the following 3 examples:

  • /data/srv/wmagent/current/config/wmagent/config-template.py
  • config-template.py
  • config-template
    In this particular case, depending on the environment configuration, under which the manage script is called. Meaning all 3 cases are possible, because this path/module name comes with an env. variable, so we must be able to support the three of them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still not sure whether stopping to import the default module - which was hardcoded in this script - named wmcore_config_input will be Okay. Based on your previous reply, I understand that we would be trying to import config-template module instead. Either it does not make sense that it worked before, or it isn't clear that it will work this time.

But let's give it a try and come back to this if needed. Thanks Todor.

Copy link
Contributor Author

@todor-ivanov todor-ivanov May 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alan , there is no such module wmcore_config_input. That was a random name assigned to the newly loaded module at load time. A name which was never referenced anywhere else in the code. It obviously have been put there only because the imp.load_module() method requires it as a mandatory parameter:

In [2]: help(imp.load_module)
load_module(name, file, filename, details)
    **DEPRECATED**
    Load a module, given information returned by find_module().
    The module name must include the full package name, if any.

And anyway we do not care as long as the module loads, even under a different name, since in the next line we very clearly define which object we use for fetching the config attribute from. This always succeeds, given that the path to the module is correct.

So let me summaries this the following way:

  • Using import imp and importing the module like bellow, works (which was the way, before the bug was introduced)
mod = imp.load_module("wmcore_config_input", open(filename, 'r'), filename, (".py", "r", imp.PY_SOURCE))

And this works, because you provide a file descriptor to an already opened for reading file in the second parameter, so you care not about a search path or anything.

  • Using import importlib and importing the module like bellow, does not work (which was actually the bug)
modSpecs = importlib.machinery.PathFinder().find_spec(filename)
mod = modSpecs.loader.load_module("wmcore_config_input")

And this one is broken, because the modSpec object is not a module reference, it is only an object which defines where and how to search for a module. And then the next line simply breaks, because:

  • there is no such module as wmcore_config_input - never have been any
  • the search path is wrong, because it is not fetched from the file reference path (could be relative or absolute path as well), but points to the system default.

And this even behaves differently in python 3.8 and 3.9 see bellow. For python 3.8 find_spec() returns a real but wrongly defined _frozen_importlib.ModuleSpec object, while for python 3.9 it returns None. So here is yet another very nasty way, how that would have changed behavior in the future e.g. once we were to try moving to python 3.9 and beyond.

python 3.8

(WMAgent.dock) [cmst1@unit01:/data/srv/wmagent/current]$ /home/cmst1/.local/bin/ipython
Python 3.8.16 (default, Feb  4 2023, 12:19:00) 
Type 'copyright', 'credits' or 'license' for more information
IPython 8.12.2 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import importlib

In [2]: modSpec = importlib.machinery.PathFinder().find_spec('/data/srv/wmagent/current/config/wmagent/config-template.py')

In [3]: modSpec
Out[3]: ModuleSpec(name='/data/srv/wmagent/current/config/wmagent/config-template.py', loader=<_frozen_importlib_external.SourceFileLoader object at 0x7f6c65357160>, origin='/usr/local/lib/python3.8/site-packages/py/__init__.py', submodule_search_locations=['/usr/local/lib/python3.8/site-packages/py'])

In [4]: mod = modSpec.loader.load_module("wmcore_config_input")
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[4], line 1
----> 1 mod = modSpec.loader.load_module("wmcore_config_input")

File <frozen importlib._bootstrap_external>:520, in _check_name_wrapper(self, name, *args, **kwargs)

ImportError: loader for /data/srv/wmagent/current/config/wmagent/config-template.py cannot handle wmcore_config_input

In [5]: type(modSpec)
Out[5]: _frozen_importlib.ModuleSpec

python 3.9

(WMCore.venv3) [cmst1@unit01 WMCore.venv3]$ ipython
Python 3.9.14 (main, Jan  9 2023, 00:00:00) 
Type 'copyright', 'credits' or 'license' for more information
IPython 8.13.2 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import importlib

In [2]: modSpec = importlib.machinery.PathFinder().find_spec('/data/srv/wmagent/current/config/wmagent/config-template.py')

In [3]: mod = modSpec.loader.load_module("wmcore_config_input")
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[3], line 1
----> 1 mod = modSpec.loader.load_module("wmcore_config_input")

AttributeError: 'NoneType' object has no attribute 'loader'

In [4]: type(modSpec)
Out[4]: NoneType

@todor-ivanov todor-ivanov requested a review from amaltaro May 9, 2023 20:09
Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite some concerns along the code, it looks good to me. Anyhow, we better test it with both RPM and PyPi based deployment and update it in the coming days.

@todor-ivanov
Copy link
Contributor Author

todor-ivanov commented May 10, 2023

Thanks @amaltaro,
It is tested line by line multiple times now. And it is working regardless of the packaging system, take a look at [1] (currently for both rpm and pypi: filename = '/data/srv/wmagent/current/config/wmagent/config-template.py').

It was broken up until now, at the very least because there is no such module wmcore_config_input let alone the fact that the search path was the system default, not the one coming with the agent's environment/the filename.

I am about to merge this one as well, and once CERN network is back, I'll cut a new tag, so that the new pypi packages get uploaded.

[1]

In [6]: import importlib

In [7]: import os

In [10]: filename = '/data/srv/wmagent/current/config/wmagent/config-template.py'

In [11]: cfgBaseName = os.path.basename(filename).replace(".py", "")

In [15]: cfgDirName = os.path.dirname(filename)

In [16]: cfgDirName
Out[16]: '/data/srv/wmagent/current/config/wmagent'

In [20]: modSpecs = importlib.machinery.PathFinder().find_spec(cfgBaseName, [cfgDirName])

In [24]: modRef = modSpecs.loader.load_module(cfgBaseName)

In [25]: modRef
Out[25]: <module 'config-template' from '/data/srv/wmagent/current/config/wmagent/config-template.py'>

In [26]: dir(modRef)
Out[26]: 
['Configuration',
 'PY3',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'acdcDBName',
 'bossAirPlugins',
 'config',
 'contactName',
 'couchURL',
 'databaseSocket',
 'databaseUrl',
 'glideInAcctGroup',
 'glideInAcctGroupUser',
 'globalDBSUrl',
 'globalDBSVersion',
 'globalLogLevel',
 'jobDumpDBName',
 'jobSummaryDBName',
 'localDBSVersion',
 'logDBName',
 'os',
 'retryAlgoParams',
 'serverHostName',
 'submitScript',
 'summaryStatsDBName',
 'wmbsServicePort',
 'workDirectory',
 'workflowArchiveTimeout',
 'workloadSummaryDB',
 'workloadSummaryURL',
 'workqueueDBName',
 'workqueueInboxDbName']

In [27]: config = getattr(modRef, 'config', None)

In [28]: config
Out[28]: <WMCore.Configuration.Configuration at 0x7f5b710cdaf0>

In [29]: dir(config)
Out[29]: 
['ACDC',
 'Agent',
 'AgentStatusWatcher',
 'Alert',
 'AnalyticsDataCollector',
 'ArchiveDataReporter',
 'BossAir',
 'CoreDatabase',
 'DBS3Upload',
 'DBSInterface',
 'DashboardReporter',
 'EmailAlert',
 'ErrorHandler',
 'General',
 'JobAccountant',
 'JobArchiver',
 'JobCreator',
 'JobStateMachine',
 'JobStatusLite',
 'JobSubmitter',
 'JobTracker',
 'JobUpdater',
 'RetryManager',
 'RucioInjector',
 'TaskArchiver',
 'WorkQueueManager',
 'WorkloadSummary',
 '__add__',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_instance',
 '_internal_components',
 '_internal_sections',
 '_internal_webapps',
 'commentedString_',
 'component_',
 'documentedString_',
 'getInstance',
 'listComponents_',
 'listSections_',
 'listWebapps_',
 'pythonise_',
 'section_',
 'webapp_']

@todor-ivanov todor-ivanov merged commit d138370 into dmwm:master May 10, 2023
nhduongvn pushed a commit to nhduongvn/WMCore that referenced this pull request Jan 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Broken usage of importlib
3 participants