With the current push, I have the most basic skeleton of running a hello world batch Job (starting with this before venturing into other CRD types, because there are many small problems to solve) with Snakemake via Kueue. This is the log from running the snakemake step in the container:
The Kubernetes Job succeeds: JobStatus.SUCCEEDED corresponds to the Job's succeeded pod count reaching its requested completions.
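As a minimal sketch (not the operator's actual code), the completion check described above boils down to comparing the Job's `.status.succeeded` count against `.spec.completions`:

```python
def job_succeeded(status_succeeded, spec_completions):
    """Sketch of the success check: a Job counts as succeeded once its
    .status.succeeded pod count reaches the requested .spec.completions.
    Both fields can be unset; completions defaults to 1 for a basic Job."""
    return (status_succeeded or 0) >= (spec_completions or 1)
```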
But of course, since Snakemake expects output files and we have neither a shared filesystem nor a way to get them back immediately, the Snakemake run fails. There are several problems:
I'm using a ConfigMap for the Snakefile, which disregards any additional context from the local working directory
The working directory where the job runs needs write access, so I use an emptyDir volume to avoid setting up anything complicated for development.
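For illustration, the writable working directory via an emptyDir volume looks roughly like this fragment of a Job's pod spec, expressed here as a Python dict with manifest-style keys (the image name is a placeholder, not what the prototype necessarily uses):

```python
# Hypothetical pod-spec fragment: an emptyDir volume mounted at the working
# directory gives the Snakemake step somewhere writable, with no PVC or
# storage class to provision during development.
pod_spec = {
    "containers": [{
        "name": "snakemake",
        "image": "snakemake/snakemake",  # placeholder image for illustration
        "workingDir": "/workdir",
        "volumeMounts": [{"name": "workdir", "mountPath": "/workdir"}],
    }],
    "volumes": [{"name": "workdir", "emptyDir": {}}],
}
```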
Thus we need solutions for:
Handling a working directory and getting it into a pod or some volume for pods.
That volume needing write access
The results of the run needing to be passed back / saved somewhere.
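One way to sketch the three points above is to wrap the job's command with stage-in and stage-out steps, so the working directory is pulled from a remote before the run and results are pushed back afterwards. This is only a sketch assuming a `gsutil`-style CLI is available in the container; the function and URL here are hypothetical:

```python
def wrap_with_staging(cmd, workdir_url, workdir="/workdir"):
    """Sketch: build a shell command that copies the working directory in
    from a remote URL (e.g. gs://bucket/run), runs the given command, and
    copies results back out. Assumes gsutil exists in the container image."""
    stage_in = f"gsutil -m cp -r {workdir_url}/* {workdir}/"
    stage_out = f"gsutil -m cp -r {workdir}/* {workdir_url}/"
    return f"{stage_in} && {cmd} && {stage_out}"
```

The wrapped string would become the container's command, e.g. `wrap_with_staging("snakemake --cores 1", "gs://bucket/run1")`.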
I wanted to open these questions for discussion. For early prototyping, I'm thinking we could require output files to be sent to some remote (e.g., gs://), and then maybe there could be a way to not expect an actual file but to accept some text result in the logs instead?
On a higher level, this is likely going to be a bigger problem as we try to support many different environments with Snakemake, where everyone uses a different cloud / different storage / different file types. My experience with Kubernetes is that setting up storage is really hard (and adds developer and user toil that we should want to avoid). For Google Life Sciences + Snakemake we assume Google Storage, and for the working directory we upload it at submission and download it on startup. We could do something similar here, but then we face the challenge of needing to support multiple clouds. Paulo from Nextflow is working on Fusion, which can at minimum do a flexible FUSE-style mount, but it's not an open source project. Services like Google Filestore are also really nice, but again, we can't assume being on Google Cloud.
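To stay cloud-agnostic rather than baking in Google Storage, one option is to dispatch the staging step on the remote URL's scheme. A minimal sketch, under the assumption that each supported backend has a copy CLI in the container image (the function names and the tool table here are hypothetical):

```python
# Map URL scheme -> recursive-copy command prefix. Extending support to a
# new cloud would mean registering one more entry here.
STAGING_TOOLS = {
    "gs": "gsutil -m cp -r",            # Google Cloud Storage
    "s3": "aws s3 cp --recursive",      # Amazon S3
}

def stage_in_command(remote_url, local_dir):
    """Sketch: build the stage-in command for whichever cloud the remote
    working-directory URL points at, failing loudly on unknown schemes."""
    scheme = remote_url.split("://", 1)[0]
    try:
        tool = STAGING_TOOLS[scheme]
    except KeyError:
        raise ValueError(f"no staging tool registered for scheme {scheme!r}")
    return f"{tool} {remote_url} {local_dir}"
```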
Let me know your thoughts - and let's chat tomorrow @johanneskoester!
This is handled with a remote (e.g., AWS). I still haven't gotten this working on the MPI Operator - if the design means the output is generated elsewhere while it's expected to be on the launcher, we might run into an issue - will try testing again.
@johanneskoester and @alculquicondor I want to bring you in for discussion and updates here!