cohort script does not match new docker image entry point #240

Closed
tirohia opened this issue Aug 30, 2019 · 2 comments

tirohia commented Aug 30, 2019

Using run_cohort_from_csv.sh to generate docker commands, it generates a file with the following command:

docker run --ulimit nofile=20000:20000 \
    -v "/blue/project/cancer-seq-pipeline/data/references/hg19:/data/reference/" \
    -v "/blue/project/gridssTest/src/perSampleScripts:/data/blacklist/" \
    -v "/blue/project/gridssTest/src/perSampleScripts:/data/assembly/" \
    -v "/blue/project/cancer-seq-pipeline/data/intermediate/P1033:/data/output/" \
    -v "/blue/project/cancer-seq-pipeline/data/intermediate/P1033:/data/bam1" \
    -v "/blue/project/cancer-seq-pipeline/data/intermediate/P1033:/data/bam2" \
    -v "/blue/project/cancer-seq-pipeline/data/intermediate/P1033:/data/bam3" \
    gridss/gridss \
    TMP_DIR="/data/output/" \
    REFERENCE_SEQUENCE="/data/reference/ucsc.hg19.fasta" \
    BLACKLIST="/data/blacklist/ENCFF001TDO.bed" \
    ASSEMBLY="/data/assembly/P1033.assembly.bam" \
    OUTPUT="/data/output/P1033.vcf" \
    INPUT=/data/bam1/P1033A.control.aligned.bam \
    INPUT=/data/bam2/P1033C.test.aligned.bam \
    INPUT=/data/bam3/P1033G.test.aligned.bam \
    2>&1 | tee -a /blue/project/gridssTest/src/perSampleScripts/gridss.P1033.$HOSTNAME.$$.log

which when run gives me:

Usage: gridss.sh --reference <reference.fa> --output <output.vcf> --assembly <assembly.bam> [--threads n] [--jar gridss.jar] [--workingdir <directory>] [--jvmheap <threads * 4>g] [--blacklist <blacklist.bed>] input1.bam [input2.bam [...]]
Specify assembly bam location using the --assembly command line argument. Assembly location must be in a writeable directory.

As best I can tell, the generated file passes the arguments in an incompatible format, e.g. ASSEMBLY="/data/assembly/P1033.assembly.bam" instead of --assembly /data/assembly/P1033.assembly.bam.
If I manually change REFERENCE_SEQUENCE, BLACKLIST, ASSEMBLY, and OUTPUT to the corresponding -- format, and remove the INPUT= prefix from the input files, the docker container will run.
I can't specify a TMP_DIR in this manner, though. If I leave TMP_DIR as is, the container runs and immediately exits. Using --tmp_dir instead results in an unknown argument error. Removing all references to the tmp directory gives a docker command that runs, but appears to write all intermediate files to /var/lib/docker/overlay2. This would probably be fine, but the disk that docker is installed on is small and is also where all our other running containers live, so it ends up filling the entire disk and falling over with an out of disk space error.

Ideally I'd like to be able to specify the tmp directory, unless it's already possible to do so and I'm doing it wrong.
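
The manual conversion described above could be scripted roughly as follows. This is a minimal sketch, not part of GRIDSS or of run_cohort_from_csv.sh: the function name convert_args is hypothetical, and the flag names are taken only from the usage message quoted above. TMP_DIR is dropped because that usage message lists no matching flag.

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate legacy KEY=VALUE arguments into the
# "--flag value" form shown in the gridss.sh usage message.
convert_args() {
    local arg
    local out=() inputs=()
    for arg in "$@"; do
        case "$arg" in
            REFERENCE_SEQUENCE=*) out+=(--reference "${arg#*=}") ;;
            OUTPUT=*)             out+=(--output "${arg#*=}") ;;
            ASSEMBLY=*)           out+=(--assembly "${arg#*=}") ;;
            BLACKLIST=*)          out+=(--blacklist "${arg#*=}") ;;
            INPUT=*)              inputs+=("${arg#*=}") ;;  # inputs become positional
            TMP_DIR=*)            ;;                        # no matching flag, so dropped
            *)                    out+=("$arg") ;;          # pass anything else through
        esac
    done
    # Positional input BAMs go last, as in the usage message.
    out+=("${inputs[@]}")
    printf '%s\n' "${out[*]}"
}
```

For example, convert_args OUTPUT=/data/output/P1033.vcf INPUT=/data/bam1/P1033A.control.aligned.bam prints the --output flag followed by the bare BAM path. Paths containing spaces would need extra quoting before the result is pasted into a command line.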

@d-cameron changed the title from "Unable to specify tmp directory." to "cohort script does not match new docker image entry point" Sep 1, 2019
@d-cameron d-cameron self-assigned this Sep 1, 2019
@d-cameron

To simplify the command line usage, the driver gridss.sh script sets TMP_DIR to match WORKING_DIR so it should never need to be specified now.
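
Since TMP_DIR follows the working directory, one way to keep intermediate files off the docker overlay filesystem may be to point --workingdir (listed in the usage message above) at a bind-mounted path. The sketch below only builds and prints such a command, it does not run docker. It assumes that a working directory under a -v mount is written through to the host, and for brevity it places the BAMs and the assembly under a single /data/output/ mount, which differs from the layout the cohort script generates.

```shell
#!/usr/bin/env bash
# Sketch only: build and print (not run) a docker command that passes
# --workingdir. Host paths are the ones quoted in this issue; the
# single-mount layout is a simplification for illustration.
sample_dir="/blue/project/cancer-seq-pipeline/data/intermediate/P1033"
ref_dir="/blue/project/cancer-seq-pipeline/data/references/hg19"

cmd=(docker run --ulimit nofile=20000:20000
    -v "$ref_dir:/data/reference/"
    -v "$sample_dir:/data/output/"
    gridss/gridss
    --reference /data/reference/ucsc.hg19.fasta
    --output /data/output/P1033.vcf
    --assembly /data/output/P1033.assembly.bam
    --workingdir /data/output/
    /data/output/P1033A.control.aligned.bam
    /data/output/P1033C.test.aligned.bam
    /data/output/P1033G.test.aligned.bam)

# Print the command, shell-quoted, for inspection before running it.
printf '%q ' "${cmd[@]}"
echo
```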

The bigger issue is that the latest docker image uses the driver script, whereas the cohort script was designed for use with the docker image that invokes java directly.

We'll be imminently releasing a pre-print of a gridss+downstream tools somatic pipeline and I'm currently working on changing the docker internals so the docker images for just gridss and gridss-purple-linx are synchronised. I'll get a fix for this issue in while I'm working on them.

@d-cameron

gridss/gridss:2.6.0 docker image now available.
