Description
Now that people have had a chance to use the classes, I'm thinking about simplifying their interface slightly.
Typically I find myself needing only 2 locations:
- a directory on /storage (or similar) to put condor files, dag files, and logs
- a directory on HDFS to put input/output files (already covered by hdfs_mirror_dir)
The JobSet class has many constructor arguments, especially for STDOUT/STDERR/LOG output (each of which has separate args for directory and filename). I did it this way to make it easy to use the same directory for all 3 but with different filenames. (Of course, one could just use os.path.join() to avoid this!)
Would it therefore be worth slimming down the interface? For example, having
JobSet(...
storage_dir="/storage/abc1234/ntuple_31_10_16/",
filename="cmsRun_091011.condor",
logname="logs/cmsRun.$(cluster).$(process).log",
...)
to replace
JobSet(...
filename='/storage/abc1234/ntuple_31_10_16/cmsRun_091011.condor',
out_dir='/storage/abc1234/ntuple_31_10_16/logs', out_file='cmsRun.$(cluster).$(process).out',
err_dir='/storage/abc1234/ntuple_31_10_16/logs', err_file='cmsRun.$(cluster).$(process).err',
log_dir='/storage/abc1234/ntuple_31_10_16/logs', log_file='cmsRun.$(cluster).$(process).log',
...)
and inferring the stdout/err files from the logname field?
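One way the inference could work, as a rough sketch only: the helper name `infer_job_paths` and the returned dictionary keys are hypothetical and not part of the existing classes. It assumes the stdout/stderr names are obtained by swapping a trailing `.log` extension for `.out`/`.err`, and that `filename` and `logname` are given relative to `storage_dir`.

```python
import os


def infer_job_paths(storage_dir, filename, logname):
    """Hypothetical helper: expand the slimmed-down args into full paths.

    Assumption: stdout/stderr share the log file's directory and stem,
    differing only in extension (.out / .err instead of .log).
    """
    log_path = os.path.join(storage_dir, logname)
    stem, ext = os.path.splitext(log_path)
    if ext != ".log":
        # No recognisable .log suffix: append the new extensions to the full name
        stem = log_path
    return {
        "filename": os.path.join(storage_dir, filename),
        "out": stem + ".out",
        "err": stem + ".err",
        "log": log_path,
    }


paths = infer_job_paths(
    storage_dir="/storage/abc1234/ntuple_31_10_16/",
    filename="cmsRun_091011.condor",
    logname="logs/cmsRun.$(cluster).$(process).log",
)
for key, value in paths.items():
    print(key, value)
```

With something like this, the three `*_dir`/`*_file` pairs collapse into one `logname` argument, at the cost of no longer allowing stdout, stderr, and the log to live in different directories.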
Basically, I don't want people to be put off by a multitude of args that are often set to the same or very similar values. However, I don't want to remove support for someone's particular workflow! (Suggestions for other simplifications are welcome as well.)