try to solve mg5 /tmp problem from ExternalLHEProducer #23004
The code-checks are being triggered in jenkins.
example of failure in production
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-23004/4402
A new Pull Request was created by @perrozzi for master. It involves the following packages: GeneratorInterface/LHEInterface @cmsbuild, @efeyazgan, @perrozzi can you please review it and eventually sign? Thanks. cms-bot commands are listed here
please test
The tests are being triggered in jenkins.
My two cents: since the /tmp issue is not specific to the gridpack business and could show up anywhere, we should find out whether having TMPDIR set is an actual requirement of the CMSSW environment, in which case it has to be set properly upstream of any specific CMSSW module. With this PR, we kill one bird with one stone, while there are other birds out there.
I have always thought that setting it in the condor pilot made the most sense. Is there any opposition to this? If we set it in CMSSW, doesn't it depend on which CMSSW version is used by the gridpack script?
Comparison job queued.
Comparison is ready
Comparison Summary:
+1 To answer Kenneth in a pedantic way, if:
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)
@perrozzi aaah, ok, I see. So run_generic_tarball_cvmfs.sh is still taken from the CMSSW of the job, not from the one specified in the gridpack. This looks like quite a simple and robust solution to me then.
To complement on this, @vlimant created an elog thread regarding solving this issue at the pilot level for a general solution. This was discussed today at the Submission Infrastructure meeting and there is a plan to test setting
@khurtado thanks for the update, please keep us posted. From our side it is indeed not mandatory to have it in CMSSW.
I've merged Kenyi's patch on qa and will start testing it today. If everything goes well, we could push this to production at the beginning of next week. Just keep in mind that the renewal of pilots will take a few days (2-3), so the change will likely be available in the full pool by the end of next week.
As the comment says: if TMPDIR is unset, set it to the condor scratch area if present, otherwise fall back to /tmp.
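The fallback order described in this comment can be sketched as follows. This is only an illustration in Python, not the code changed in this PR. It assumes that "the condor scratch area" means the `_CONDOR_SCRATCH_DIR` variable that HTCondor sets for a running job. The function names are hypothetical, and treating an empty TMPDIR as unset is a choice made here, not something stated in the thread.

```python
import os


def resolve_tmpdir(env):
    """Pick a TMPDIR value: keep an existing one, else condor scratch, else /tmp."""
    # Assumption: an empty TMPDIR is treated the same as an unset one.
    current = env.get("TMPDIR")
    if current:
        return current
    # Assumption: _CONDOR_SCRATCH_DIR is what "condor scratch area" refers to.
    scratch = env.get("_CONDOR_SCRATCH_DIR")
    if scratch:
        return scratch
    # Last resort when neither variable is available.
    return "/tmp"


def ensure_tmpdir(env=None):
    """Write the resolved value into the environment mapping and return it."""
    env = os.environ if env is None else env
    env["TMPDIR"] = resolve_tmpdir(env)
    return env["TMPDIR"]
```

Calling `ensure_tmpdir()` with no argument updates `os.environ` of the current process, so anything launched afterwards would see the resolved value.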
+1
In production (and gridpack creation) when using condor infrastructure, some failures have been observed which are related to the use of /tmp by MG5_aMC when the $TMPDIR environment variable is not set.
A solution has been provided by @khurtado (cms-sw/genproductions#1534 (comment)) and is used for new gridpacks.
However, so far, existing gridpacks have been patched one by one.
Different MG5_aMC versions require different patches, making the operation cumbersome.
Discussing with @fabiocos and @vlimant, it turns out that CMSSW doesn't touch this environment variable, so one can effectively set $TMPDIR upstream in the ExternalLHEProducer before the gridpack runs.
It has been cross-checked that the call of cmsenv doesn't overwrite the $TMPDIR variable.
@kdlong, @agrohsje, @efeyazgan, @perrozzi
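For illustration, the reason an upstream setting is enough is that a child process inherits the environment of whatever launches it, unless something in between overwrites the variable. A minimal Python sketch of that inheritance, using a stand-in child command rather than the real gridpack script (the helper name and the example path are hypothetical):

```python
import os
import subprocess
import sys


def tmpdir_seen_by_child(tmpdir):
    """Launch a child process with TMPDIR set and return the value it reports."""
    # Copy the parent's environment and set TMPDIR before launching the child,
    # as an upstream caller would do before starting the gridpack script.
    env = dict(os.environ)
    env["TMPDIR"] = tmpdir
    # Stand-in for the gridpack script: a child that only prints its TMPDIR.
    child = [sys.executable, "-c", "import os; print(os.environ['TMPDIR'])"]
    result = subprocess.run(child, env=env, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

For example, `tmpdir_seen_by_child("/some/scratch/dir")` returns the same path as seen from inside the child.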