dask: load Python environment from CVMFS #43

Merged
merged 11 commits into from
Feb 22, 2021

zonca
Collaborator

@zonca zonca commented Nov 3, 2020

No description provided.

@zonca zonca changed the title load Python environment from CVMFS dask: load Python environment from CVMFS Nov 3, 2020
@zonca
Collaborator Author

zonca commented Nov 4, 2020

I have configured the dask worker and the dask scheduler to mount the CVMFS volumes and then modified the environment variables to load V03-06. The release needs to be hardcoded; I think that is fine for now, and I can update it once in a while.
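
A minimal sketch of what hardcoding the release can look like, assuming the environment is assembled in Python before the scheduler or worker process starts. The helper name `build_cvmfs_env` is hypothetical; the two directories are the ones that show up in the tracebacks and `sys.path` listings in this thread. `LD_LIBRARY_PATH` would follow the same pattern, but its directories are not listed in this conversation, so it is left out.

```python
import os

# Release root as it appears in the paths quoted in this thread.
CDMS_RELEASES = "/cvmfs/cdms.opensciencegrid.org/releases/centos7"


def build_cvmfs_env(release="V03-06", base_env=None):
    """Return a copy of base_env with the CDMS release prepended to PATH and PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    root = f"{CDMS_RELEASES}/{release}"
    additions = {
        "PATH": f"{root}/bin",
        "PYTHONPATH": f"{root}/lib/python3.6/site-packages",
    }
    for name, value in additions.items():
        current = env.get(name)
        # Prepend so the CVMFS release wins over anything already set.
        env[name] = value if not current else f"{value}{os.pathsep}{current}"
    return env
```

Keeping the release in a single argument means a later upgrade only touches one string.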

However, I get some dynamic library errors.

When trying to launch the scheduler, I get:

Traceback (most recent call last):
  File "/cvmfs/cdms.opensciencegrid.org/releases/centos7/V03-06/bin/dask-gateway-scheduler", line 6, in <module>
    from dask_gateway.dask_cli import scheduler
  File "/cvmfs/cdms.opensciencegrid.org/releases/centos7/V03-06/lib/python3.6/site-packages/dask_gateway/__init__.py", line 2, in <module>
    from .client import (
  File "/cvmfs/cdms.opensciencegrid.org/releases/centos7/V03-06/lib/python3.6/site-packages/dask_gateway/client.py", line 5, in <module>
    import ssl
  File "/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python3.6/ssl.py", line 101, in <module>
    import _ssl             # if we can't import it, let the error propagate
ImportError: libssl.so.10: cannot open shared object file: No such file or directory

However, if I try that in Jupyter it works fine; it correctly picks up libssl from /lib64:

bash-4.2$ ldd /cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python3.6/lib-dynload/_ssl.cpython-36m-x86_64-linux-gnu.so
        linux-vdso.so.1 =>  (0x00007ffe87b61000)
        libssl.so.10 => /lib64/libssl.so.10 (0x00007f7db4fdd000)

I also added /lib64 to LD_LIBRARY_PATH, but it still fails.
@bloer, would you have any suggestions?
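
To run the same `ldd` check inside the scheduler or worker container, a small helper like the one below could be used. This is a sketch, not part of this PR: the function names are hypothetical, and it assumes the `ldd` binary is available in the container.

```python
import subprocess


def parse_ldd_output(text):
    """Map each library name in ldd output to its resolved path, or None if not found."""
    resolved = {}
    for line in text.splitlines():
        if "=>" not in line:
            continue  # e.g. the dynamic loader itself, listed by absolute path
        name, _, target = line.partition("=>")
        target = target.strip()
        if target.startswith("not found"):
            resolved[name.strip()] = None
        else:
            # Drop the trailing load address, e.g. "(0x00007f7db4fdd000)".
            # Entries such as linux-vdso have no path and map to "".
            resolved[name.strip()] = target.split("(")[0].strip()
    return resolved


def missing_libraries(shared_object):
    """Run ldd on a file and return the libraries the loader cannot resolve."""
    out = subprocess.run(
        ["ldd", shared_object],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    ).stdout
    return [name for name, path in parse_ldd_output(out).items() if path is None]
```

Calling `missing_libraries` on the `_ssl` extension module path shown above, from inside the failing container, would show whether `libssl.so.10` is reported as `not found` there.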

@bloer

bloer commented Nov 4, 2020

My only guess is that maybe some nodes either don't have libssl or CVMFS isn't working quite right (buffer not big enough maybe?). Neither seems overly likely.

We'll be moving to a newer version of the base CVMFS environment within the next week or two, which may help if there's a problem with the distribution. Again, that seems unlikely.

@zonca
Collaborator Author

zonca commented Nov 4, 2020

Thanks @bloer,
there is something I don't understand that may be related:

When I run in the notebook, dask-scheduler works fine and finds libssl in the /lib64 folder, but /lib64 is not in LD_LIBRARY_PATH. How does it know to look there?

@zonca
Collaborator Author

zonca commented Nov 4, 2020

Oh sorry, it is explained in the ldconfig man page: https://man7.org/linux/man-pages/man8/ldconfig.8.html
/lib64 is one of the standard directories.

But why doesn't that work in the dask container? It is exactly the same Docker container I am running Jupyter in, so /lib64 is definitely there.

However, instead of calling the env.sh script from CVMFS, I am just setting PATH, LD_LIBRARY_PATH and PYTHONPATH.
@bloer do you think I need anything else from env.sh?
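
One way to test the loader directly, independent of `LD_LIBRARY_PATH`, is to ask it to open the library from Python. The sketch below is an assumption about how one might check this, not something from the PR; `can_load` is a hypothetical helper. `ctypes.CDLL` goes through the normal dynamic loader search, which includes the standard directories such as /lib64.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open library_name in this process."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the loader cannot find or open the shared object.
        return False
    return True
```

Running `can_load("libssl.so.10")` in the notebook and again inside the dask container would show whether the two really see the same libraries.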

@zonca
Collaborator Author

zonca commented Nov 4, 2020

Never mind! I was using the wrong container.
OK, now I have different errors that I'm investigating.

@zonca
Collaborator Author

zonca commented Nov 4, 2020

ok, I think it is now working!

  • I can launch a dask cluster
  • increase the number of workers
  • make sure the workers run from CVMFS, so they have all the CDMS packages (for now the dask workers always use V03-06; I will look for a better way):

def get_python_path():
    import sys
    return sys.path

client.run(get_python_path)
{'tls://10.233.64.172:35025': ['/home/jovyan/dask-worker-space/dask-worker-space/worker-c1v5u3xz',
  '/cvmfs/cdms.opensciencegrid.org/releases/centos7/V03-06/bin',
  '/cvmfs/cdms.opensciencegrid.org/releases/centos7/V03-06/lib/python3.6/site-packages',
  '/cvmfs/sft.cern.ch/lcg/releases/ROOT/6.18.00-885ca/x86_64-centos7-gcc8-opt/lib',
  '/cvmfs/sft.cern.ch/lcg/views/LCG_96python3/x86_64-centos7-gcc8-opt/lib',
  '/cvmfs/sft.cern.ch/lcg/views/LCG_96python3/x86_64-centos7-gcc8-opt/lib/python3.6/site-packages',
  '/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python36.zip',
  '/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python3.6',
  '/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python3.6/lib-dynload',
  '/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python3.6/site-packages'],
 'tls://10.233.65.62:43247': ['/home/jovyan/dask-worker-space/dask-worker-space/worker-dw8a74ch',
  '/cvmfs/cdms.opensciencegrid.org/releases/centos7/V03-06/bin',
  '/cvmfs/cdms.opensciencegrid.org/releases/centos7/V03-06/lib/python3.6/site-packages',
  '/cvmfs/sft.cern.ch/lcg/releases/ROOT/6.18.00-885ca/x86_64-centos7-gcc8-opt/lib',
  '/cvmfs/sft.cern.ch/lcg/views/LCG_96python3/x86_64-centos7-gcc8-opt/lib',
  '/cvmfs/sft.cern.ch/lcg/views/LCG_96python3/x86_64-centos7-gcc8-opt/lib/python3.6/site-packages',
  '/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python36.zip',
  '/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python3.6',
  '/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python3.6/lib-dynload',
  '/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python3.6/site-packages']}
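
With more workers, reading this output by eye gets tedious. A small check such as the following could flag any worker that is not running from the expected release. This is a sketch; `workers_missing_prefix` is a hypothetical helper that takes the dictionary returned by `client.run(get_python_path)`.

```python
def workers_missing_prefix(paths_by_worker, prefix):
    """Return the addresses of workers whose sys.path has no entry under prefix."""
    return sorted(
        address
        for address, paths in paths_by_worker.items()
        if not any(p.startswith(prefix) for p in paths)
    )
```

For the output above, `workers_missing_prefix(client.run(get_python_path), "/cvmfs/cdms.opensciencegrid.org/releases/centos7/V03-06")` would return an empty list, since both workers have the release on their path.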

Also the dashboard works fine:
(image: screenshot of the Dask dashboard)

@ziqinghong can you please test it and check if it works for you?

You should not have any packages installed in ~/.local/lib
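
A quick way to check this from a notebook is to list the per-user site-packages directory. This is a sketch; `user_site_entries` is a hypothetical helper, and it assumes the packages in question were installed with `pip install --user`, which places them under `~/.local/lib/pythonX.Y/site-packages`.

```python
import os
import site


def user_site_entries(user_site=None):
    """List what is installed in the per-user site-packages directory, if it exists."""
    directory = site.getusersitepackages() if user_site is None else user_site
    if not os.path.isdir(directory):
        return []
    return sorted(os.listdir(directory))
```

An empty list from `user_site_entries()` means nothing is installed in the user site directory.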

You can execute this test notebook, just replace js-XXX-YYY with supercdms:
https://gist.github.com/zonca/355a7ec6b5bd3f84b1413a8c29fbc877

Please report any errors here.

@zonca
Collaborator Author

zonca commented Jan 7, 2021

@pibion @ziqinghong, I just tested this again now and it works fine.
It would be nice if someone else could test it within the next 2 weeks;
I am especially interested to know whether the current setup fits your needs.

We can schedule a Zoom meeting dedicated to this if you think it would be useful (after you have given it a try).

@zonca zonca added this to To do in CDMS JupyterHub on XSEDE Feb 5, 2021
@zonca
Collaborator Author

zonca commented Feb 22, 2021

ok, added docs to the usual README, pointing to #51 for later testing

@zonca zonca merged commit 5151220 into master Feb 22, 2021
CDMS JupyterHub on XSEDE automation moved this from To do to Done Feb 22, 2021
@zonca zonca deleted the dask_gateway_cvmfs branch February 22, 2021 18:18

2 participants