-
Notifications
You must be signed in to change notification settings - Fork 117
[feat] Improve Sarus backend #1737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[feat] Improve Sarus backend #1737
Conversation
|
Can I test this patch? |
| if self.image is None: | ||
| raise ContainerError('no image specified') | ||
|
|
||
| if not self.commands: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The creator of an images should know best how to execute the command. E.g. for Spack the ENTRYPOINT needs to be honored.
I'd like to argue that a container should also include the execution script.
ChristianKniep
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put some inline comments into the PR to explain what I am doing.
reframe/core/containers.py
Outdated
| #: :default: :class:`False` | ||
| with_mpi = fields.TypedField(bool) | ||
|
|
||
| #: Skip pull of images |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This violates the reproducibility; thus, not a keeper.
But (being devil's advocate), an image name like qnib/spack-gromacs:latest violates reproducibility as well. :)
To make an image reproducibile you would want to specify an image using the content hash.
docker image inspect qnib/spack-gromacs-shp2:c5n_18xl_off |jq '.[]|.RepoDigests'
[
"qnib/spack-gromacs-shp2@sha256:4f5ae272e343395de2259e2745ab24f20a34688343f9f233cec895915f1704f5"
]
and not qnib/spack-gromacs-shp2:c5n_18xl_off
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That might be a thing btw. inspecting the actual hash and putting it into the job output as an strict description.
For some images like ubuntu:18.04 you actually specify a manifest list (image index) and not a 'real' image.
| super().launch_command() | ||
| run_opts = ['--mount=type=bind,source="%s",destination="%s"' % | ||
| mp for mp in self.mount_points] | ||
| run_opts.append('--workdir='+self.workdir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cleaner then just changing the dir with cd
|
|
||
| run_opts += self.options | ||
| run_cmd = self._command + ' run %s %s bash -c ' % (' '.join(run_opts), | ||
| run_cmd = self._command + ' run %s %s' % (' '.join(run_opts), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, the entrypoint you specify here (bash -c) might mess with the container image and renders it unusable.
|
Hello @ChristianKniep, Thank you for updating!
Do see the ReFrame Coding Style Guide Comment last updated at 2021-02-03 10:47:12 UTC |
|
@ChristianKniep make sure you adjust also the unittests |
vkarak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we could also extend the Docker backend in a similar way, couldn't we?
| #: | ||
| #: :type: boolean | ||
| #: :default: :class:`False` | ||
| with_metahub = fields.TypedField(bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about replacing this with registry so that we allow users to specify any container registry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually the registry is part of the string that is the image name.
ubuntu is actually docker.io/library/ubuntu:latest
The first part is the registry, the second the org and the last piece the image name.
I might check the prefix and react accordingly.
But in a greater scheme of things the question becomes how do you deal with container images compared to binaries installed on a system?
| 'docker pull -q mh.qnib.org/'+self.image, | ||
| 'docker save -o '+tmpdir+'/image.tar mh.qnib.org/'+self.image, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't sarus pull work here? I don't understand very much the purpose of the save and load.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, sarus pull seems to need a API call that the proxy Metahub is not implementing (yet). working with the SARUS folks to fix that soon(-ish).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, you should put this logic on the test as a prerun step. When sarus can pull directly, you just remove the workaround from the particular test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, you can use skopeo to copy the image directly as a docker-archive and load it with sarus
Codecov Report
@@ Coverage Diff @@
## master #1737 +/- ##
==========================================
- Coverage 87.27% 87.24% -0.03%
==========================================
Files 46 46
Lines 7708 7717 +9
==========================================
+ Hits 6727 6733 +6
- Misses 981 984 +3
Continue to review full report at Codecov.
|
| super().__init__() | ||
| self.with_mpi = False | ||
| self._command = 'sarus' | ||
| self.with_metahub = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not needed since you can specify some hooks to be run before each step of the test pipeline.
|
|
||
| def launch_command(self): | ||
| super().launch_command() | ||
| run_opts = ['--mount=type=bind,source="%s",destination="%s"' % |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| run_opts = ['--mount=type=bind,source="%s",destination="%s"' % | |
| run_opts = [f'--mount=type=bind,source="{mp}",destination="{mp}"' for mp in self.mount_points] |
| super().launch_command() | ||
| run_opts = ['--mount=type=bind,source="%s",destination="%s"' % | ||
| mp for mp in self.mount_points] | ||
| run_opts.append('--workdir='+self.workdir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| run_opts.append('--workdir='+self.workdir) | |
| run_opts.append(f'--workdir={self.workdir}') |
| return run_cmd + "'" + '; '.join( | ||
| ['cd ' + self.workdir] + self.commands) + "'" | ||
| if self.commands: | ||
| return run_cmd + " '" + '; '.join(self.commands) + "'" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are not going to use bash -c then we don't need the single quotes nor we need the ;. I think it's better if the user passes the whole command verbatim.
|
|
||
| run_opts += self.options | ||
| run_cmd = self._command + ' run %s %s bash -c ' % (' '.join(run_opts), | ||
| run_cmd = self._command + ' run %s %s' % (' '.join(run_opts), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| run_cmd = self._command + ' run %s %s' % (' '.join(run_opts), | |
| run_cmd = f"{self._command} run {' '.join(run_opts)} {self.image}" |
|
@ChristianKniep I can take this issue and work on it. Is that fine? |
|
@teojgo Sure, let me know if I can help |
|
I am closing this, since it's replaced by #1775. |
Some changes to improve the use of SARUS #1736 . @Madeeks