Environment Isolation #4480
Comments
Hi Chris, As for DIRAC(OS) should be modified to not need $PYTHONPATH/$LD_LIBRARY_PATH: Daniela |
I don't understand what you mean by this? If |
I feel like a secretary here: Simon thinks you could be right and it might work, but without trying it we cannot say. I was in the middle of a lengthy comment, which might or might not be useful, but I've typed it now, so here it is: Option 3 would break in containers and assumes this all works in the first place, which is not always a given, so that would be a no from us. |
You can now give it a try by passing |
Simon and I just did a quick test. It seemed to work for us as expected, but there is still the danger of the user overriding the PATH and breaking it that way. E.g. experiments often set up their own version of python. (Cleaning the environment in the pilot should probably be a separate issue.) If we stick the PATH etc. into the wrapper script
then this would be perfectly self-contained and protected from accidental user interference. |
sorry but can you remind me what this |
This is an environment variable used to define an interactive client session used by the COMDIRAC commands |
Is there any progress? As I said, this would be a great step in the right direction for us. Daniela |
The last version of v7r0 (v7r0p21) is using DIRACOS v1r11 (DIRACGrid/DIRACOS#127) |
I am tempted to close this task (DIRACOS2, py3 etc), unless there are objections. |
I don't even remember the details of this, but given that I haven't seen any more issues looking like this recently, go ahead. |
Current situation
I've run tests in the DIRAC certification instance (DIRACOS) and in LHCb's production DIRAC instance (lcgbundle). The results can be split into two categories.
No isolation at all
When submitting with:
dirac-wms-job-submit (Vanilla DIRAC + DIRACOS)
DIRAC.Interfaces.API (Vanilla DIRAC + DIRACOS)
dirac-wms-job-submit (LHCbDIRAC + lcgbundle)
The job inherits the full DIRACOS environment. This is mainly a problem because PYTHONPATH and LD_LIBRARY_PATH are set, which makes it almost impossible to use non-DIRACOS binaries. For example:
PYTHONPATH also causes some weird things. It's less bad now that only DIRAC and extensions will be added, but it's still going to be weird from a user perspective:
Some other variables will also cause unexpected issues, like PYTHONOPTIMIZE causing assertions in analysis scripts to not trigger:
+ /cvmfs/lhcbdev.cern.ch/conda-experimental/bin/python -c 'assert False; print("All is okay")'
All is okay
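The PYTHONOPTIMIZE effect is easy to reproduce outside DIRAC. Below is a minimal sketch, assuming only a standard CPython interpreter; the helper name run_assert_check is made up for illustration and is not part of DIRAC.

```python
import os
import subprocess
import sys

# The same one-liner used in the job output above.
SNIPPET = 'assert False; print("All is okay")'


def run_assert_check(optimize):
    """Run SNIPPET in a child interpreter, with or without PYTHONOPTIMIZE.

    Hypothetical helper for illustration. Returns (returncode, stdout).
    """
    env = dict(os.environ)
    env.pop("PYTHONOPTIMIZE", None)  # start from a known state
    if optimize:
        env["PYTHONOPTIMIZE"] = "1"  # same effect as running "python -O"
    proc = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout.strip()


if __name__ == "__main__":
    print(run_assert_check(optimize=False))  # assertion fires, non-zero exit
    print(run_assert_check(optimize=True))   # assertion is stripped
```

With the variable set the assert statement is compiled away, so a job that relies on assertions for sanity checks silently carries on.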
Too much isolation
During the BiLD discussion it was mentioned that LHCb cleans the environment. This isn't the case for the vast majority of jobs, including all production jobs; however, it turns out it is for jobs submitted with LHCbDIRAC.Interfaces.API.DiracLHCb. The environment is cleaned completely and not even $PATH is set:
Surprisingly this is less broken than I expected:
+ echo 1
1
+ python --help
[help is shown]
Though it is still a strange environment:
It's also fundamentally broken. For example, $X509_USER_PROXY isn't set, causing authentication to fail:
Solution
DIRAC(OS) should be modified to not need $PYTHONPATH/$LD_LIBRARY_PATH by:
Making python look for Python libraries in $DIRAC (using sitecustomize). It would be better to pip install DIRAC so there is no need for anything special, but that is a longer-term goal.
Setting RPATH in all binaries relative to $ORIGIN, so that it is possible to use /path/to/diracos/bin/python directly without any environment being set. There is a nice write-up about when LD_LIBRARY_PATH is appropriate here.
Regardless of what we do here I expect this to happen as part of the move to Python 3 but that's a discussion for elsewhere.
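As a rough illustration of the sitecustomize idea, the sketch below adds $DIRAC to sys.path at interpreter start-up. It assumes DIRAC's Python modules are importable directly from $DIRAC; the function name and that layout are assumptions for illustration, not what DIRACOS actually ships.

```python
# sitecustomize.py (sketch): the site module imports this file automatically
# at interpreter start-up if it is found on the default search path.
import os
import sys


def add_dirac_to_path(environ=None, path=None):
    """Append $DIRAC to the module search path if set and not already present.

    Hypothetical helper; assumes DIRAC's modules are importable from $DIRAC.
    """
    environ = os.environ if environ is None else environ
    path = sys.path if path is None else path
    dirac_root = environ.get("DIRAC")
    if dirac_root and dirac_root not in path:
        path.append(dirac_root)
    return path


# Runs once per interpreter start, so no PYTHONPATH entry is needed.
add_dirac_to_path()
```

Appending rather than prepending is a deliberate choice in this sketch: it keeps the interpreter's own standard library ahead of anything found under $DIRAC.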
1. Clean the environment as much as possible
One solution would be to change the default to clean the environment like LHCbDIRAC.Interfaces.API.DiracLHCb currently does. If this is done we should add a whitelist of variables to keep: not only those which are authentication related but also things like TMPDIR. We should also make sure $PATH is set to the result of getconf PATH.
Having $DIRAC set would probably be useful to allow jobs to run source "${DIRAC}/bashrc" if required.
It does however prevent sites from setting variables to influence jobs. I know this has been used by LHCb sites to let them set OMP_NUM_THREADS=1 to prevent OpenMP from overwhelming the system in single threaded queues. That said, I still think cleaning the environment entirely is the right thing to do.
2. Just make DIRAC set fewer variables
If the dependency on $PYTHONPATH/$LD_LIBRARY_PATH being set is removed then most of the current issues go away automatically.
I dislike it though as it makes the grid less homogeneous. In particular I have had issues with some sites setting silly variable defaults like LD_LIBRARY_PATH=/usr/lib:/usr/lib64 or PYTHONPATH=/usr/lib64/lib/python2.7/site-packages.
3. Snapshot the environment before $DIRAC/bashrc is sourced
This is a more extreme version of 2 which I don't think gains very much and still needs the whitelist from 1. to ensure variables like $X509_USER_PROXY are kept. I don't see any benefit from doing this.
Conclusion
In summary, my opinion is that these changes should be made:
Make DIRACOS's python build look in $DIRAC for python modules
Set RPATH so LD_LIBRARY_PATH doesn't need to be set
Clean the environment, keeping a whitelist that includes $DIRAC and $X509_USER_PROXY
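The whitelist-based cleaning from option 1 could be sketched roughly as follows. The whitelist contents and the function names are assumptions for illustration only; the real list of variables to keep would need to be agreed on.

```python
import os
import subprocess

# Illustrative whitelist only: authentication plus a few useful extras.
KEEP = ("DIRAC", "X509_USER_PROXY", "TMPDIR")


def default_path():
    """Return the system default PATH, i.e. the output of `getconf PATH`."""
    try:
        out = subprocess.check_output(["getconf", "PATH"], universal_newlines=True)
        return out.strip()
    except (OSError, subprocess.CalledProcessError):
        return os.defpath  # Python's built-in fallback search path


def clean_environment(environ, keep=KEEP):
    """Build a job environment with only whitelisted variables plus PATH."""
    cleaned = {name: environ[name] for name in keep if name in environ}
    cleaned["PATH"] = default_path()
    return cleaned
```

A job wrapper could then launch the payload with something like subprocess.call(payload_command, env=clean_environment(os.environ)), where payload_command is whatever the job is meant to run. Anything not on the whitelist, including PYTHONPATH and LD_LIBRARY_PATH, never reaches the job.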