
Reusing containers created with Pyxis not working #28

Closed
Juanjdurillo opened this issue Sep 30, 2020 · 10 comments


Juanjdurillo commented Sep 30, 2020

Reusing a container launched with srun and the Pyxis plugin does not seem to work with the latest version.

When using the --container-name flag of srun, the container filesystem name is prefixed with "pyxis" and an id. Every new srun command results in a new container filesystem (with a different id) despite using the same name. Using that prefixed name does not work either.


flx42 commented Sep 30, 2020

Hi @Juanjdurillo,

Are you running the different srun commands within an sbatch, a salloc, or even below another srun? In that case, --container-name should work, since it's the same job ID for Slurm.

However, if you do two separate srun commands directly from the login node, they are effectively different jobs for Slurm, and it's expected that --container-name doesn't carry over. The latest version of pyxis cleans up all containers after a job completes. This is what we were already doing on our cluster, but through a Slurm job epilog; now it's built into pyxis directly. If you don't clean up named containers, they tend to accumulate quickly and consume a lot of space on the compute nodes.

If you want to go back to the old behavior, you can set epilog=0 in the pyxis options; see https://github.com/NVIDIA/pyxis/wiki/Setup#slurm-plugstack-configuration for an example (epilog is not documented yet, as it's not part of a released version).
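
For illustration, the plugstack.conf line would then look something like this (the plugin path here is just an example; adjust it to wherever spank_pyxis.so is installed on your system):

# /etc/slurm/plugstack.conf.d/pyxis.conf (sketch): load pyxis and disable the epilog cleanup
required /usr/local/lib/slurm/spank_pyxis.so epilog=0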

@Juanjdurillo

Hi @flx42, thanks for your answer. The srun commands were done within a salloc. However, the effect is that a new container is created: even if a previous container with the same name is already there, it is ignored and a new one is created.

> However, if you do two separate srun commands directly from the login node, they are effectively different jobs for Slurm, and it's expected that --container-name doesn't carry over. The latest version of pyxis cleans up all containers after a job completes. This is what we were already doing on our cluster, but through a Slurm job epilog; now it's built into pyxis directly. If you don't clean up named containers, they tend to accumulate quickly and consume a lot of space on the compute nodes.

I am not sure that is the case. The containers created with that prefix are never cleaned up (they persist in my enroot data folder) and are never reused.


flx42 commented Oct 1, 2020

I'm not able to reproduce the problem you are describing right now. You can git checkout v0.8.1 for now, or check whether disabling the epilog cleanup solves your problem, as described above.

Could you list the commands you are using and the error message (if any) that you are seeing? And what version of Slurm are you using?


Juanjdurillo commented Oct 2, 2020

My workflow is as follows:
salloc <desired nodes config>

and afterwards, I do
srun ... --container-name=mycontainer ...

Sometimes I finish and, within the same allocation, issue another srun command to run the code with a different dataset.

But sometimes what I want to do is get an allocation at another point in time and repeat these steps. If I look at https://github.com/NVIDIA/pyxis, reusing the container should be possible even across allocations (at least the documentation does not state otherwise).

However, if I look at the code on master, at slurm_spank_user_init:

int slurm_spank_user_init(spank_t sp, int ac, char **av)
and
ret = xasprintf(&container_name, "pyxis_%u_%s", context.job.jobid, context.args->container_name);

I'd say that reusing containers is only possible within the same Slurm job now (which is different from what the previous documentation states). Is this the case?
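
To illustrate with hypothetical job IDs: under this naming scheme, the same --container-name used in two different allocations would end up as two distinct enroot containers, e.g.:

$ enroot list
pyxis_4_mycontainer
pyxis_7_mycontainer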

However, even if my new assumption is correct, this does not solve my use case, as I am experiencing problems reusing containers even within the same Slurm job ID (i.e., within a single salloc). This would only make sense if enroot_container_get did not provide the right information. That function went through a big refactor in the last update, but everything seems fine to me. The only place where I suspect a potential error might come from (only because I do not know what is happening in that function) is:

pyxis/enroot.c, line 159 in f3ea9a7:

log_fd = pyxis_memfd_create("enroot-log", MFD_CLOEXEC);

Unfortunately, as much as I would like to test this and provide a patch if an error is found, I cannot, because I am simply a user of a system that provides this configuration.

@Juanjdurillo

The errors were happening with:

  • pyxis: current master (latest commit f3ea9a7)
  • Slurm: 20.11.0-0pre1
  • enroot: 3.1.1-1 amd64 (enroot and enroot+caps packages)


flx42 commented Oct 2, 2020

Thanks for the detailed feedback.

I'm still unsure what your sequence of commands looks like; does the following work for you?

$ salloc -N1
salloc: Granted job allocation 4
salloc: Waiting for resource configuration
salloc: Nodes ioctl are ready for job

$ srun --container-image ubuntu which vmtouch
pyxis: importing docker image ...
pyxis: creating container filesystem ...
pyxis: starting container ...
srun: error: ioctl: task 0: Exited with exit code 1

$ srun --container-image ubuntu --container-name ctr bash -c "apt-get update && apt-get install -y vmtouch"
pyxis: importing docker image ...
pyxis: creating container filesystem ...
pyxis: starting container ...
[...]

$ srun --container-name ctr which vmtouch
pyxis: reusing existing container filesystem
pyxis: starting container ...
/usr/bin/vmtouch

Perhaps you are doing things differently: are you allocating multiple nodes? Or perhaps you were actually using a different job but landing on the same node? You can echo $SLURM_JOB_ID if you're not sure whether you have a single job or multiple different jobs.
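
For example (hypothetical job IDs), two separate allocations from the login node get different job IDs, so a named container created in one is not reused by the other:

$ salloc -N1
salloc: Granted job allocation 4
$ echo $SLURM_JOB_ID
4
$ exit
salloc: Relinquishing job allocation 4
$ salloc -N1
salloc: Granted job allocation 5
$ echo $SLURM_JOB_ID
5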

> I'd say that reusing containers is only possible within the same Slurm job now (which is different from what the previous documentation states). Is this the case?

Yes, it's a change in the latest code (after 0.8.1, so not in a tagged release yet). On our cluster we were doing the cleanup of named containers manually with a Slurm epilog; not doing so would quickly fill the local storage. The latest changes integrate this cleanup into pyxis directly, and the cleanup is done in a job epilog.
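
For reference, a rough sketch of what that kind of epilog cleanup could look like (not our actual script; it assumes the pyxis_<jobid>_<name> naming discussed above and that the epilog runs as root and can act as the job user):

#!/bin/bash
# Slurm job epilog sketch: remove the enroot containers that pyxis created
# for this job, assuming they follow the pyxis_<jobid>_<name> naming scheme.
for ctr in $(sudo -u "$SLURM_JOB_USER" enroot list | grep "^pyxis_${SLURM_JOB_ID}_"); do
    sudo -u "$SLURM_JOB_USER" enroot remove -f "$ctr"
done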

> Unfortunately, as much as I would like to test this and provide a patch if an error is found, I cannot, because I am simply a user of a system that provides this configuration.

I understand, and thanks for taking the time to look at this. In the meantime, you should recommend that your admin go back to pyxis 0.8.1.

@Juanjdurillo

Thanks to you for having a look!

> I'm still unsure what your sequence of commands looks like; does the following work for you?
>
> $ salloc -N1
> salloc: Granted job allocation 4
> salloc: Waiting for resource configuration
> salloc: Nodes ioctl are ready for job
>
> $ srun --container-image ubuntu which vmtouch
> pyxis: importing docker image ...
> pyxis: creating container filesystem ...
> pyxis: starting container ...
> srun: error: ioctl: task 0: Exited with exit code 1
>
> $ srun --container-image ubuntu --container-name ctr bash -c "apt-get update && apt-get install -y vmtouch"
> pyxis: importing docker image ...
> pyxis: creating container filesystem ...
> pyxis: starting container ...
> [...]
>
> $ srun --container-name ctr which vmtouch
> pyxis: reusing existing container filesystem
> pyxis: starting container ...
> /usr/bin/vmtouch

Here is the thing: if I try the sequence of commands you suggested, the last srun does not actually reuse the ctr container. I suspect that if the enroot_container_get function mentioned before has no error, then it probably has to do with the Slurm configuration on my side (perhaps assigning a new job ID after every job step?).
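
One way to check this from within a single allocation (hypothetical output) would be to print the job and step IDs seen by each step; if Slurm behaves normally, only the step ID should increase between srun invocations:

$ srun bash -c 'echo job=$SLURM_JOB_ID step=$SLURM_STEP_ID'
job=4 step=0
$ srun bash -c 'echo job=$SLURM_JOB_ID step=$SLURM_STEP_ID'
job=4 step=1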

> I understand, and thanks for taking the time to look at this. In the meantime, you should recommend that your admin go back to pyxis 0.8.1.

Yes, this is the solution we adopted. I very much appreciate your help on the matter. I would also appreciate it if reusing containers across Slurm jobs remained possible in the future :-)
Thanks!

@Juanjdurillo

I am closing this issue as resolved, since the suggested tagged version works fine for us.


flx42 commented Oct 2, 2020

> then it probably has to do with the Slurm configuration on my side (perhaps assigning a new job ID after every job step?)

I'm not aware of anything like that, weird! I'll do further research soon.


flx42 commented Oct 23, 2020

@Juanjdurillo FYI, this might also solve your issue: 5a7d900

See #30
