Some programs that use large file-based databases, like jackhmmer, see a significant performance increase when the database can be loaded into memory via a shm (e.g. `/dev/shm`) or tmpfs volume.
I would like to specify the size of a ramdisk dependent on the memory allocated to the container, e.g. if I allocate 64 GB to the container, I would pass `--shm-size=64g`.
The size needs to be dynamic so that it matches the memory available to the container. Bind-mounting the host's `/dev/shm` is problematic here, because I do not want processes to compete for the host machine's shared memory (i.e. when multiple containers are scheduled onto the same instance).
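For a manually launched container, the matching can be sketched with a small shell snippet. `MEM_GB` is a placeholder here; in a scheduler it would come from the task's memory request:

```shell
# Derive a matching --shm-size from the memory granted to the container.
# MEM_GB is an illustrative placeholder for the task's memory request.
MEM_GB=64
SHM_SIZE="${MEM_GB}g"
echo "docker run --memory=${MEM_GB}g --shm-size=${SHM_SIZE} ..."
```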
On AWS, because Nextflow creates job definitions itself and does not support the `containerOptions` process directive, I do not think it is possible to provision a dynamically sized shared memory volume without manually creating a job definition (which I would like to avoid).
By default, Docker allocates 64 MB to `/dev/shm`, but this can be configured using `--shm-size` ([ref](https://docs.docker.com/engine/reference/run/)). The size cannot be changed from within the container, as that requires remounting the volume.
AWS Batch supports specifying `sharedMemorySize` in the job definition, which simply passes through to Docker's `--shm-size`.
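For illustration, the relevant fragment of a Batch job definition might look like the sketch below. The field names come from the Batch `RegisterJobDefinition` API (`sharedMemorySize` is in MiB); the 65536 values are examples only:

```shell
# Illustrative containerProperties fragment for an AWS Batch job
# definition; linuxParameters.sharedMemorySize (MiB) maps to --shm-size.
JOB_DEF='{
  "containerProperties": {
    "memory": 65536,
    "linuxParameters": { "sharedMemorySize": 65536 }
  }
}'
echo "$JOB_DEF"
```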
Usage scenario
The use of a ramdisk in AlphaFold's Colab notebook for running jackhmmer can be seen here (creating `/tmp/ramdisk`). There is a similar recommendation on GitHub for speeding up HHblits.
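A ramdisk of that kind boils down to a sized tmpfs mount. Since mounting requires root, this sketch only prints the commands; the 9G size is illustrative, chosen to fit the database:

```shell
# Sketch of a /tmp/ramdisk setup similar to the AlphaFold Colab notebook.
# Mounting tmpfs needs root, so we only print the commands here.
RAMDISK=/tmp/ramdisk
SIZE=9G   # illustrative; size it to hold the database
echo "mkdir -p ${RAMDISK}"
echo "mount -t tmpfs -o size=${SIZE} tmpfs ${RAMDISK}"
```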
Suggested implementation
Add a process directive `shmSize`, and update the AWS Batch plugin's `newSubmitRequest(TaskRun task)` (ref) and the local container executor.
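Hypothetical usage of the proposed directive; neither the directive nor this syntax exists yet, the name and placement below are only this issue's suggestion:

```groovy
// Proposed (not yet existing) shmSize directive, sized to match the
// process's memory grant so the database fits entirely in /dev/shm.
process JACKHMMER {
    memory '64 GB'
    shmSize '64 GB'   // would be passed through as --shm-size / sharedMemorySize

    """
    jackhmmer --cpu ${task.cpus} query.fasta /dev/shm/uniref90.fasta
    """
}
```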