PulsarEmbeddedJobRunner not passing tool's environment_variable to interactive tools #10093

Closed
abretaud opened this issue Aug 12, 2020 · 3 comments

@abretaud
Contributor

Hi there,
I'm trying to configure an instance to run interactive tools on a Slurm cluster using the PulsarEmbeddedJobRunner.
I followed https://training.galaxyproject.org/training-material/topics/admin/tutorials/interactive-tools/tutorial.html to configure things:

# pulsar_app.yml
managers:
  _default_:
    type: queued_drmaa
    native_specification: "-p galaxy -c 2 --mem=6000"
# job_conf.xml
        <destination id="interactive_tools" runner="pulsar_embedded">
          <param id="docker_enabled">true</param>
          <!-- If you have not set 'outputs_to_working_directory: true' in galaxy.yml you can remove the docker_volumes setting. -->
          <param id="docker_volumes">$defaults</param>
          <param id="docker_sudo">false</param>
          <param id="docker_net">bridge</param>
          <param id="docker_auto_rm">true</param>
          <param id="require_container">true</param>
          <param id="docker_set_user"></param>
          <param id="docker_run_extra_arguments">--memory 6g</param>
          <param id="container_monitor_result">callback</param>
        </destination>

The jobs are launched correctly as Slurm jobs, and I can run an Ethercalc GxIT without problems.
However, I noticed that the environment variables defined in the tools' XML are not set properly: they are just empty inside the container. Ethercalc doesn't need them, so it works, but Jupyter doesn't like it much, and AskOmics just refuses to start.

Jupyter's example: https://github.com/galaxyproject/galaxy/blob/dev/tools/interactive/interactivetool_jupyter_notebook.xml#L11
Inside the container, env gives something like this:

HISTORY_ID=
MINICONDA_VERSION=4.6.14
REMOTE_HOST=
LANGUAGE=en_US.UTF-8
HOSTNAME=a6e0821abc91
DEBUG=false
XDG_CACHE_HOME=/home/jovyan/.cache/
NOTEBOOK_PASSWORD=none
HOME=/home/jovyan
CONDA_VERSION=4.7.10
JULIA_DEPOT_PATH=/opt/julia
NB_USER=jovyan
GALAXY_URL=
JULIA_VERSION=1.2.0
PATH=/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
NB_GID=100
GALAXY_SLOTS=1
CORS_ORIGIN=none
LANG=en_US.UTF-8
GALAXY_WEB_PORT=
API_KEY=
DEBIAN_FRONTEND=noninteractive
SHELL=/bin/bash
CONDA_DIR=/opt/conda
LC_ALL=en_US.UTF-8
PWD=/opt/galaxy_dev/galaxy-dist/database/jobs_directory/_interactive/19032/working
JULIA_PKGDIR=/opt/julia
NB_UID=1000
DOCKER_PORT=none
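
For anyone who wants to check a dump like this quickly, here is a minimal Python sketch that lists the variables that are present but empty. The function name `empty_env_vars` and the shortened sample are illustrative only; they are not part of Galaxy or Pulsar.

```python
from typing import List


def empty_env_vars(env_output: str) -> List[str]:
    """Return the names of variables whose value is empty in `env`-style output."""
    empty = []
    for line in env_output.splitlines():
        name, sep, value = line.partition("=")
        # Only consider NAME=VALUE lines; skip anything without an '='.
        if sep and name and value == "":
            empty.append(name)
    return empty


# Shortened copy of the dump above (assumption: one NAME=VALUE pair per line).
sample = """HISTORY_ID=
MINICONDA_VERSION=4.6.14
REMOTE_HOST=
GALAXY_URL=
GALAXY_SLOTS=1
GALAXY_WEB_PORT=
API_KEY=
"""

print(empty_env_vars(sample))
# ['HISTORY_ID', 'REMOTE_HOST', 'GALAXY_URL', 'GALAXY_WEB_PORT', 'API_KEY']
```

The empty ones are exactly the variables the tool XML is supposed to provide, while the image's own variables (MINICONDA_VERSION, etc.) are fine.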

In the Galaxy logs I've seen something like this:

docker run -e "GALAXY_SLOTS=$GALAXY_SLOTS" -e "HISTORY_ID=$HISTORY_ID" -e "REMOTE_HOST=$REMOTE_HOST" -e "GALAXY_WEB_PORT=$GALAXY_WEB_PORT" -e "GALAXY_URL=$GALAXY_URL" -e "API_KEY=$API_KEY" -p 8888 --name 6926de21ad044a96b61adf968c7e5e92 [...]

In the generated command.sh script, env_setup_commands is empty, so I guess the env vars are lost somewhere along the way, but I can't find where exactly. Any ideas?

(I also tested without Slurm, using queued_python: same problem.)
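
One way to read the docker run line from the logs: each `-e "NAME=$NAME"` is expanded by the shell that runs the job script, so if nothing exported NAME beforehand, the container receives an empty string. The sketch below reproduces only that expansion step, assuming `/bin/sh` exists on the host. `rendered_env_flag` is a hypothetical helper written for this illustration, not a Galaxy or Pulsar function.

```python
import subprocess


def rendered_env_flag(name, job_env):
    """Expand "NAME=$NAME" the way the job shell would, given its environment."""
    # Builds e.g.: printf "%s" "HISTORY_ID=$HISTORY_ID"
    script = f'printf "%s" "{name}=${name}"'
    result = subprocess.run(
        ["/bin/sh", "-c", script],
        env=job_env,          # stands in for the environment of command.sh
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Nothing exported before docker run: the value expands to an empty string.
print(rendered_env_flag("HISTORY_ID", {}))
# HISTORY_ID=

# Exported beforehand: the value reaches the -e flag.
print(rendered_env_flag("HISTORY_ID", {"HISTORY_ID": "abc123"}))
# HISTORY_ID=abc123
```

That would match what is seen in the container: the variables exist (docker was told to set them) but carry no value.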

@abretaud
Contributor Author

I've tried making a change to pass the env vars to the Pulsar job. They are now added to the script, but it fails because the tmp files containing the env var values are not transferred:

$ cat /blabla/jobs_directory/_interactive/19052/command.sh 
#!/bin/bash

_galaxy_setup_environment() {
    local _use_framework_galaxy="$1"
    _GALAXY_JOB_DIR="/blabla/jobs_directory/_interactive/19052"
    _GALAXY_JOB_HOME_DIR="/blabla/jobs_directory/_interactive/19052/home"
    _GALAXY_JOB_TMP_DIR=""
    HISTORY_ID=`cat "$_GALAXY_JOB_DIR/tmpdivgry_d"`; export HISTORY_ID
REMOTE_HOST=`cat "$_GALAXY_JOB_DIR/tmppfz4n30w"`; export REMOTE_HOST
GALAXY_WEB_PORT=`cat "$_GALAXY_JOB_DIR/tmpwx31xvm6"`; export GALAXY_WEB_PORT
GALAXY_URL=`cat "$_GALAXY_JOB_DIR/tmpz0zbuy5v"`; export GALAXY_URL
API_KEY=`cat "$_GALAXY_JOB_DIR/tmpqtzofr09"`; export API_KEY

=> "No such file or directory" for each var

I guess we'd need to tell Pulsar to transfer these files, somewhere around https://github.com/abretaud/galaxy/blob/pulsar_env/lib/galaxy/tools/evaluation.py#L558, but I see no way to copy them to the root of the remote job dir.

(By the way, I had to add the _GALAXY_JOB_DIR line to https://github.com/galaxyproject/pulsar/blob/master/pulsar/managers/util/job_script/DEFAULT_JOB_FILE_TEMPLATE.sh#L7 to get it into the command.sh above.)
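
To confirm which value files are missing on the remote side, a rough diagnostic sketch: it scans a command.sh for the `` NAME=`cat "$_GALAXY_JOB_DIR/<file>"` `` pattern shown above and reports the referenced files that do not exist in a given job directory. The regex is based only on the lines in this comment, and `missing_env_files` is a hypothetical helper, not something Galaxy or Pulsar ships.

```python
import re
import tempfile
from pathlib import Path

# Matches lines such as:
#   HISTORY_ID=`cat "$_GALAXY_JOB_DIR/tmpdivgry_d"`; export HISTORY_ID
ENV_FILE_RE = re.compile(r'(\w+)=`cat "\$_GALAXY_JOB_DIR/([^"]+)"`')


def missing_env_files(script_text, job_dir):
    """Map variable name -> referenced file name for files absent from job_dir."""
    missing = {}
    for var, filename in ENV_FILE_RE.findall(script_text):
        if not (Path(job_dir) / filename).is_file():
            missing[var] = filename
    return missing


# Two lines copied from the command.sh excerpt above.
script = '''
    HISTORY_ID=`cat "$_GALAXY_JOB_DIR/tmpdivgry_d"`; export HISTORY_ID
REMOTE_HOST=`cat "$_GALAXY_JOB_DIR/tmppfz4n30w"`; export REMOTE_HOST
'''

with tempfile.TemporaryDirectory() as job_dir:
    # Simulate a job dir where only one of the two files was transferred.
    (Path(job_dir) / "tmpdivgry_d").write_text("abc123")
    print(missing_env_files(script, job_dir))
    # {'REMOTE_HOST': 'tmppfz4n30w'}
```

Run against the real remote job directory, every variable would show up as missing here, which is what the "No such file or directory" errors indicate.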

@mvdbeek
Member

mvdbeek commented Feb 23, 2021

Wow, we should have fixed this earlier, but I think #11349 should fix this.

@mvdbeek mvdbeek closed this as completed Feb 23, 2021
@abretaud
Contributor Author

Cool, many thanks :) I'll test it soon
