Add support for charliecloud containers. #18

Closed

olifre commented Nov 12, 2017

As mentioned on the mailing list, I spent some time developing support for Charliecloud:
https://github.com/hpc/charliecloud

More details:

  • condor_starter detects ch-run and advertises it, including the version.
  • CHARLIECLOUD_JOB = true forces the job to run inside Charliecloud.
  • CHARLIECLOUD_CONTAINER_EXPR specifies the container directory.
  • CHARLIECLOUD overrides the default path to the ch-run binary.
    • Defaults to /usr/bin/ch-run.
  • CHARLIECLOUD_MOUNT_HOME enables or disables bind mounting of the
    user's home directory.
  • HTCondor automatically bind mounts $_CONDOR_SCRATCH_DIR and sets the
    initial working directory to it. The target location has to exist in the
    container and can be specified via CHARLIECLOUD_TARGET_DIR.
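
A minimal sketch of how the knobs listed above might be combined in an HTCondor configuration file. The knob names are the ones proposed in this pull request; every value (the container path, the target directory, the choice to skip the home directory) is a hypothetical placeholder. The quoting of CHARLIECLOUD_CONTAINER_EXPR assumes it is evaluated as a ClassAd expression, in the same way as the corresponding Singularity knob, which the description here does not confirm.

```
# Hypothetical example values; adjust to the local setup.

# Path to the ch-run binary (the description gives /usr/bin/ch-run as the default).
CHARLIECLOUD = /usr/bin/ch-run

# Force jobs to run inside Charliecloud.
CHARLIECLOUD_JOB = true

# Directory holding the extracted container image (placeholder path;
# assumed to be a ClassAd expression, hence the quoted string).
CHARLIECLOUD_CONTAINER_EXPR = "/path/to/extracted/container"

# Do not bind mount the user's home directory into the container.
CHARLIECLOUD_MOUNT_HOME = false

# Location inside the container where $_CONDOR_SCRATCH_DIR is bind mounted.
# It has to exist in the container already (placeholder path).
CHARLIECLOUD_TARGET_DIR = /srv
```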

Initial documentation is also included. The features and usability are close to what the Singularity support offers, and the implementation is almost identical in many parts.

Significant differences Charliecloud <=> Singularity:

  • Charliecloud runs with user privileges at all times. There's nothing setuid root and no root-owned daemon, so it requires kernel user namespaces support to work.
  • Charliecloud focuses on running containers only. There's no "bootstrapping code". Charliecloud relies on Docker to build containers.
  • Charliecloud only supports "extracted" images, not big image files (since loop-mounting would require root).
  • Charliecloud does not support the PID namespace (since doing that right would require its own init process).
  • Charliecloud has a significantly smaller codebase with focus on security.

As with Singularity, interactive jobs and condor_ssh_to_job do not work yet.
However, an easy solution would be to use nsenter, which can run with user privileges. I have suggested that on the mailing list:
https://www-auth.cs.wisc.edu/lists/htcondor-users/2017-October/msg00116.shtml
but sadly did not receive any feedback as of yet.

Also, the PID namespace feature of HTCondor could be used with Charliecloud, but as with interactive jobs, this should best be handled outside the container.


olifre commented Nov 12, 2017

In case somebody wants to try this out:

I still plan to improve the packages, and it would be nice to get them into the distros proper at some point.

@timtheisen

Greg Thain and Oliver Freyermuth had a discussion at the 2018 European HTCondor Workshop. So, we can close this PR.

@timtheisen timtheisen closed this Sep 14, 2018

olifre commented Sep 14, 2018

Just to ensure this is not misunderstood in case somebody stumbles upon it:
The idea is to have proper OCI support in HTCondor in the future, and until then, not waste too much energy by increasing the number of separately maintained container implementations.

Since the Charliecloud implementation is pretty close to the Singularity implementation, which has its own issues (interactive jobs etc.) that are being looked at and worked on, the idea is to drop this separate implementation for now and let the devs focus their energy on improving Docker and Singularity. The final goal is OCI support, which would also allow using Singularity 3.0, which should be OCI compatible.
Making the Charliecloud command line compatible with an OCI implementation should not be too complex at that point.
