Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incompatibility with charliecloud@0.34 #4463

Open
nschan opened this issue Oct 31, 2023 · 12 comments
Open

Incompatibility with charliecloud@0.34 #4463

nschan opened this issue Oct 31, 2023 · 12 comments

Comments

@nschan
Copy link
Contributor

nschan commented Oct 31, 2023

Bug report

Running nextflow@23.10.0 together with charliecloud@0.34 yields an error:

error: can't run directory images from storage (hint: run by name)

Expected behavior and actual behavior

The expected behaviour is that nextflow is able to start the container.
This is related to a change in charlieclouds behaviour (hpc/charliecloud#1505)

Steps to reproduce the problem

This problem should occur in any case where charlielcloud > 0.33 is used as the container engine for nextflow. I do not think that this problem is specific to the nextflow version used.

Program output

Command exit status:
  1

Command output:
  (empty)

Command error:
  ch-run[64707]: error: can't run directory images from storage (hint: run by name) (ch-run.c:320)

Environment

  • Nextflow version: 23.10.0
  • Java version: openjdk 11.0.13 2021-10-19
  • Operating system: Linux
  • Bash version: GNU bash, version 4.4.23(1)

Additional context

@nschan
Copy link
Contributor Author

nschan commented Nov 2, 2023

I realized that this issue is tied to $CH_IMAGE_STORAGE. ch-run refuses to allow images from its storage directory by path, and accepts these only by name.
One possible workaround is to unset CH_IMAGE_STORAGE and use NXF_CHARLIECLOUD_CACHEDIR pointing to the same directory before attempting to run a pipeline.

@reidpr
Copy link

reidpr commented Nov 7, 2023

Charliecloud project lead here. Thanks for this bug report!

This behavior is by design. In general, the storage directory has never been a stable API, so folks should not depend on its structure. However, we've only been erroring on ch-run against images in the storage directory since Charliecloud 0.31, I think.

You can now run images in storage by specifying their name, e.g. instead of ch-run $CH_IMAGE_STORAGE/img/foo ... you would say ch-run foo .... (Note that image names are encoded, so you can't reliably say ch-run $(basename $CH_IMAGE_STORAGE/img/foo).)

However, ch-run -w foo is disallowed because this is quite likely to corrupt the storage directory. In this case you would have to export to another directory with ch-convert, then ch-run -w that other directory. (However, best practice is to avoid -w and bind-mount output directories and whatnot.)

@reidpr
Copy link

reidpr commented Nov 7, 2023

An alternative to ch-convert would be to build a small derived image containing the desired mount directory.

See also the embarrassingly old hpc/charliecloud#96.

@nschan
Copy link
Contributor Author

nschan commented Jan 12, 2024

Since charliecloud will soon provide --write-fake, are there plans to add support for this to nextflow?
I guess it would require some checks for the charliecloud version used, or removing support for versions that do not support --write-fake. As I understand versions <0.33 are fine with -w from the cache directory, but >0.33 not. Presumably --write-fake would be supported at some point in the future (>0.35 ?)

@reidpr
Copy link

reidpr commented Jan 12, 2024

Unfortunately, no version of Charliecloud is fine with -w in the storage directory, though consequences vary. We didn't enforce this until recently, however.

--write-fake will be available in 0.36, to be released soon.

@nschan
Copy link
Contributor Author

nschan commented Jan 12, 2024

I guess I should have said 'works, although it is not a great idea' :). I think using --write-fake for upcoming versions is the best solution, the main issue I see is how older versions of charliecloud could be supported, or if that is even necessary. @phue what are your thoughts on this?

@phue
Copy link
Contributor

phue commented Jan 15, 2024

The sole purpose of that -w flag was to enable bind-mounting of the nextflow work directory when using public container images. Now that charliecloud soon supports overlayfs, it should be dropped.
And yes, compatibility with earlier versions is problematic, but that has happened before because of another breaking change

The shared (multi-user) cache directory you are using is another can of worms though, I currently don't see how that can work properly with the rather recent git-based layer caching used in charliecloud.
I think the way forward is to just drop the CharliecloudCache class and leave it to the user to pre-pull images if desired.

@nschan
Could you try these config settings with the current charliecloud git, to see if that already works with the latest nextflow version?

charliecloud {
  cacheDir = false
  readOnlyInputs = true
  runOptions = "--write-fake"
}

@nschan
Copy link
Contributor Author

nschan commented Jan 15, 2024

I generally agree that breaking backwards compatibility could be justified in this case.

I see the issues of the multi-user cache.. In practice that cache is pretty much only used by me, but we will move away from a shared cache.

Anyhow, the approach you suggested does not currently work (nextflow@23.10.1, charliecloud@0.36~pre+78aae5e), but I assume that is something for @reidpr ?

Command error:
  ch-run[47171]: error: can't read: /ch/environment: Stale file handle (ch_misc.c:259 116)

When i try to run the container manually with -W or --write-fake I get this:

ch-run[35682]: error: can't overlay: Operation not permitted (ch_core.c:323 1)

This is using quay.io/biocontainers/agat:1.1.0--pl5321hdfd78af_1

Edit:
Since you mention dropping CharliecloudCache, it is anyway somewhat necessary to pre-pull images, since in many cases pipelines will start a bunch of different processes in the first step, which often leads to #3566

Another edit:
hpc/charliecloud/pull/1793 mentions kernel >5.11 ideally >6.6. The hpc system I am using is running on 4.12.14, so I guess that explains why --write-fake does not work for me...

@nschan
Copy link
Contributor Author

nschan commented Jan 16, 2024

To sum up:

I think the fact that relying on --write-fake would break compatibility with older kernels is very problematic. Not supporting older charliecloud versions is one thing, but dropping support for older kernels seems a bit drastic.

I think an alternative could be something along the lines of building write-able process specific containers. This would hopefully solve most of the problems associated with running containers from a cache. I admit my understanding of containers is pretty basic, but as I read the current implementation all parallel processes that use the same container image run in the same container (irrespective $CH_IMAGE_STORAGE set). Wouldn't it be better to have one dedicated container per process? Using a cache and ch-convert (as sugested by @reidpr earlier) seems like a reasonable approach to me.
This would keep things backwards compatible, avoid the current issues with -w, and might actually be the correct way to run containers..?
It would require some adjustments to the CharliecloudBuilder, essentially adding a step of ch-convert to create the container. I am not sure if the new container should be stored in the current work directory, or if that will create a horrible mess when the work directory including the container is mounted into the container?

Of course guess adding -w --unsafe instead of -w would be another (strongly discouraged) option.

@reidpr
Copy link

reidpr commented Jan 16, 2024

Anyhow, the approach you suggested does not currently work (nextflow@23.10.1, charliecloud@0.36~pre+78aae5e), but I assume that is something for @reidpr ?

Command error:
  ch-run[47171]: error: can't read: /ch/environment: Stale file handle (ch_misc.c:259 116)

“Stale file handle” is something about inode re-use. I know it often comes up in NFS. If it’s a Charliecloud bug we definitely want to fix it; is there a reproducer without Nextflow?

I read the current implementation all parallel processes that use the same container image run in the same container (irrespective $CH_IMAGE_STORAGE set). Wouldn't it be better to have one dedicated container per process?

Processes on the same node that are in different containers cannot use certain system calls to communicate, and this has performance impact. That’s the motivation for --join, though it’s not the default.

Note that multiple containers can be running the same image. I’d be skeptical of a separate image per process. That’d be a lot of images on a wide computer.

Another thing to consider is hpc/charliecloud#1408. This would be a supported way to create a container modified by a shell command (e.g., to add mount points).

One could also ch-convert, then manually adjust the resulting directory.

I hope that helps.

@nschan
Copy link
Contributor Author

nschan commented Jan 17, 2024

I could not reproduce the "stale file handle" outside nextflow, it might also have other, filesystem related reasons.

Processes on the same node that are in different containers cannot use certain system calls to communicate, and this has performance impact. That’s the motivation for --join, though it’s not the default.

I was referring to nextflow processes (tasks), which I think should anyway not communicate with each other while running.

What is not really clear to me, is what would happen if I have several (simultaneous) jobs running containers of the same image and for some reason input-specific files are created in a certain folder of the image (e.g. /opt/someprogram/somefile.tmp) with -w. As I understand all jobs see the what was written to /opt/program by other jobs. If the contents of somefile.tmp are specific for the task input then this could turn out to be somewhat of a problem, no?

If that could happen I think this would argue in favour of using ch-convert for each nextflow task. If that is not an issue, then I think it should be ok for nextflow to just store images in $workDir/charliecloud as it currently does when $CH_IMAGE_STORAGE and $NXF_CHARLIECLOUD_CACHEDIR are not set.

@reidpr
Copy link

reidpr commented Jan 18, 2024

Looks like I misunderstood. Thanks for clarifying.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants