-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve command-line argument support for Python configs in cmsRun and edm tools #42650
Conversation
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-42650/36703
|
A new Pull Request was created by @kpedro88 (Kevin Pedro) for master. It involves the following packages:
@cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
@cmsbuild, please test |
@cmsbuild, please abort |
test parameters:
|
@cmsbuild, please test |
@makortel does |
IIRC yes. We wanted to see the scope of the failures that occur presently. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Kevin for the PR. Chris and I took a first look jointly. I think this development is in good direction, but some of the details need to be ironed out.
In addition of the more detailed comments below, we want
- Test(s) that show that
cmsRun config.py nonexistent=foo
andcmsRun config.py --nonexistent
give meaningful error messages (and non-zero exit code) when theconfig.py
uses VarParsing or argparse, respectively. - Test(s) where the configuration file checks the sys.argv directly (without VarParsing or argparse). This kind of test could then clearly demonstrate the
cmsRun
optional arguments do not get passed to python.
FWCore/Framework/bin/cmsRun.cpp
Outdated
if (!(pythonOptValues.size() == 1 and pythonOptValues[0] == kPythonOptDefault)) | ||
pythonArgs = | ||
edm::vector_transform(pythonOptValues, [](const auto& arg) { return const_cast<char*>(arg.c_str()); }); | ||
pythonArgs.insert(pythonArgs.begin(), const_cast<char*>(fileName.c_str())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me it would be possible to avoid the const_cast
s by cleaning up python2 code from PyBind11ProcessDesc
and updating the interface in between. But that is probably beyond the scope of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to clean up our pybind interface if Core is in favor of modernizing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like just passing in an option to the constructor of PyBind11ProcessDesc
saying what the string represents (a file name or an actual python commands) would work.
If you'd like to improve our use of pybind that would be appreciated :).
FWCore/Framework/test/BuildFile.xml
Outdated
<test name="testFWCoreFrameworkVarparsing" command="cmsRun -n 1 ${LOCALTOP}/src/FWCore/Framework/test/test_varparsing.py threads=2"/> | ||
<test name="testFWCoreFrameworkVarparsingHelp" command="cmsRun -n 1 ${LOCALTOP}/src/FWCore/Framework/test/test_varparsing.py --help"/> | ||
<test name="testFWCoreFrameworkArgparse" command="cmsRun -n 1 ${LOCALTOP}/src/FWCore/Framework/test/test_argparse.py -n 2"/> | ||
<test name="testFWCoreFrameworkArgparseHelp" command="cmsRun -n 1 ${LOCALTOP}/src/FWCore/Framework/test/test_argparse.py --help"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests should ensure the cmsRun
behaves as expected instead of just "doesn't crash".
<test name="testFWCoreParameterSetVarparsingDump" command="edmConfigDump -o dump_varparsing.py ${LOCALTOP}/src/FWCore/ParameterSet/test/test_varparsing.py output=foo intprod=2"/> | ||
<test name="testFWCoreParameterSetVarparsingSplit" command="edmConfigSplit ${LOCALTOP}/src/FWCore/ParameterSet/test/test_varparsing.py output=foo intprod=2"/> | ||
<test name="testFWCoreParameterSetArgparseDump" command="edmConfigDump -o dump_argparse.py ${LOCALTOP}/src/FWCore/ParameterSet/test/test_argparse.py -o foo -i 2"/> | ||
<test name="testFWCoreParameterSetArgparseSplit" command="edmConfigSplit ${LOCALTOP}/src/FWCore/ParameterSet/test/test_argparse.py -o foo -i 2"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests should ensure the behavior is as expected rather than just "doesn't crash".
FWCore/Framework/bin/cmsRun.cpp
Outdated
std::vector<boost::program_options::option> final_opts_parser(std::vector<std::string>& args) { | ||
std::vector<boost::program_options::option> result; | ||
if (!args.empty() and args[0].size() >= kParameterSetSuffixLen and | ||
args[0].compare(args[0].size() - kParameterSetSuffixLen, kParameterSetSuffixLen, kParameterSetSuffix) == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We require that there is no restriction at all on the configuration file name. However, all cmsRun
optional parameters start with -
. Therefore, if the args[0]
doesn't begin with -
, it has to be the file name. The only way to use a configuration file name starting with -
would be to precede it with --
. So, if args[0] == "--"
and args.size() >= 2
, then args[1]
must be the file name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented this logic as requested, but when I added tests for it, they failed. The CMSSW pybind interface apparently does not handle config file names as specified here:
cmssw/FWCore/PythonParameterSet/src/PyBind11ProcessDesc.cc
Lines 74 to 79 in 7f25ad5
// if it ends with py, it's a file | |
if (config.substr(config.size() - 3) == ".py") { | |
readFile(config); | |
} else { | |
readString(config); | |
} |
FWCore/Framework/bin/cmsRun.cpp
Outdated
@@ -102,6 +105,30 @@ namespace { | |||
edm::EventProcessor* ep_; | |||
}; | |||
|
|||
std::vector<boost::program_options::option> final_opts_parser(std::vector<std::string>& args) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given how well this part of boost is documented, could you add some notes how the extra_style_parser
gets called during parsing? It would greatly help future maintenance. I'm guessing the function gets called for the remaining args
as long as it is non-empty, or the result
is not-default-constructed, but I could easily be wrong or miss some detail.
process.source = cms.Source("EmptySource") | ||
|
||
process.maxEvents.input = args.maxEvents | ||
process.options.numberOfThreads = args.numThreads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would make the ensuring the cmsRun
behavior easier if the options of the configuration file would have different behavior than what cmsRun
does (like with -n
). You could e.g. print out the value and check the printout, or have the configuration file to throw an exception if the value is unexpected.
-1 Failed Tests: UnitTests Unit TestsI found 96 errors in the following unit tests: ---> test TestIntegrationDropOnInput had ERRORS ---> test TestIntegrationEventHistory had ERRORS ---> test TestIntegrationEventSetup had ERRORS and more ... Comparison SummarySummary:
|
+core |
+1 |
+analysis |
@makortel can you also sign for heterogeneous? |
+heterogeneous |
This pull request is fully signed and it will be integrated in one of the next master IBs (but tests are reportedly failing). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @rappoccio, @antoniovilela (and backports should be raised in the release meeting by the corresponding L2) |
@cms-sw/orp-l2 Just to remind that this PR needs to be merged together with #42823 |
urgent
|
+1 Requires co-dependency from #42823 to resolve errors. |
merge |
PR description:
cmsRun
to use a stricter syntax (already required for the use of VarParsing), eliminating the-p
,--parameter-set
flag and treating the Python config file name as a required positional parameter only. This enables treating everything after that parameter as arguments to the Python config file. (In practice, this required writing a simple custom parser forboost::program_options
, which does not natively have the equivalent ofargparse
'sREMAINDER
type.)cmsRun
not to treatsys.exit(0)
in Python as an exception, but rather just to stop execution and exit with code 0. This allows usingcmsRun config.py --help
in an intuitive way.edmConfigDump
to handle command-line arguments (either VarParsing or argparse, or anything else, in principle). Here,argparse
'sREMAINDER
type is used.edmConfigSplit
andedmConfigHash
modified for the same purpose. No other edm tools were identified for which such a modification makes sense.edmParameterSetDump
seems like a prototype ofedmConfigDump
that had not substantially changed in over a decade (the somewhat fragmented history: part 1, part 2, part 3, part 4), so it is removed as redundant.Summary of interface changes:
cmsRun
:-p
,--parameter-set
flags removed.config_file.py
now a positional parameter that must come after any other arguments tocmsRun
and before any arguments to the Python config file itself.cmsRun
:-c
,--command
flag added to allow passing a config as a string containing Python code, rather than a file (following thepython -c
interface). (@Dr15Jones realized while reviewing this PR that such functionality had been available from the pybind11 interface setup before it was modified here.)cmsRun config_file.py ...
, inside the Python config file, the contents ofsys.argv
are["config_file.py",...]
(consistent with the result ofpython3 config_file.py ...
). Previously, when usingcmsRun
, the contents ofsys.argv
would be["cmsRun","config_file.py",...]
.PyBind11ProcessDesc
: constructors modified with an additionalbool isFile
parameter, to allow for any config file names (rather than just ending in.py
), as requested by Core.readConfig()
functions now usevector<string>
input rather thanint argc, char* argv[]
.Unit tests in packages with dependencies on headers modified here are updated accordingly. Other unit tests will be updated in separate PRs.
PR validation:
Unit tests in the modified packages pass, including new tests of the relevant behavior from the modified executables.
If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:
Having this functionality available in all active release cycles would be nice, and I'm happy to prepare full or partial backports. However, the interface changes and the impacts on tests from various packages make this a difficult proposition to support centrally. It's possible that minimal branches can be provided for users who want this functionality in their own working areas.