Hi,
I've posted this issue on the Cromwell GitHub too.
I'm running the ENCODE ATAC-seq pipeline on an SGE cluster.
We don't allow hard links at my facility (BeeGFS filesystem). I've therefore been trying to use the localization parameters in the Cromwell configuration file, but to no avail. The backend file is definitely being read, since I get error messages when I put an unsupported keyword in the localization array.
I've tried this with several versions of Cromwell (30.2, 31, 32).
Here is the script generated by Cromwell from my WDL file:
# make the directory which will keep the matching files
mkdir /sandbox/users/foucal-a/test_atac-pipe/cromwell-executions/atac/f4fd93fa-6f3a-42a6-94f2-459901d245c4/call-trim_adapter/shard-0/execution/glob-4f26c666d13d1cb48973da7f646a7de2
# symlink all the files into the glob directory
( ln -L merge_fastqs_R?_*.fastq.gz /sandbox/users/foucal-a/test_atac-pipe/cromwell-executions/atac/f4fd93fa-6f3a-42a6-94f2-459901d245c4/call-trim_adapter/shard-0/execution/glob-4f26c666d13d1cb48973da7f646a7de2 2> /dev/null ) || ( ln merge_fastqs_R?_*.fastq.gz /sandbox/users/foucal-a/test_atac-pipe/cromwell-executions/atac/f4fd93fa-6f3a-42a6-94f2-459901d245c4/call-trim_adapter/shard-0/execution/glob-4f26c666d13d1cb48973da7f646a7de2 )
# list all the files that match the glob into a file called glob-[md5 of glob].list
ls -1 /sandbox/users/foucal-a/test_atac-pipe/cromwell-executions/atac/f4fd93fa-6f3a-42a6-94f2-459901d245c4/call-trim_adapter/shard-0/execution/glob-4f26c666d13d1cb48973da7f646a7de2 > /sandbox/users/foucal-a/test_atac-pipe/cromwell-executions/atac/f4fd93fa-6f3a-42a6-94f2-459901d245c4/call-trim_adapter/shard-0/execution/glob-4f26c666d13d1cb48973da7f646a7de2.list
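I get the error when the script tries to link all the files into the glob directory. The comment in the script says "symlink", but both `ln` calls run without `-s`, so they create hard links.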
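To make the difference concrete, here is a minimal sketch that can be run in a scratch directory on the affected filesystem. It is not what Cromwell generates: the file names and the `glob-hard` / `glob-soft` directories are made up for illustration. It tries the same kind of `ln` call as the generated script, then builds the glob directory with `ln -s` instead.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch layout standing in for a task's execution directory.
workdir="$(mktemp -d)"
cd "$workdir"
touch merge_fastqs_R1_sample.fastq.gz merge_fastqs_R2_sample.fastq.gz
mkdir glob-hard glob-soft

# Same shape as the generated command: without -s, ln creates hard links.
# On a filesystem that forbids hard links, this is the call that fails.
if ln merge_fastqs_R?_*.fastq.gz glob-hard/ 2>/dev/null; then
  hard_ok=yes
else
  hard_ok=no
fi

# Symbolic-link variant: -s creates symlinks, and absolute targets keep
# them valid when resolved from inside the glob directory.
for f in merge_fastqs_R?_*.fastq.gz; do
  ln -s "$PWD/$f" "glob-soft/$f"
done

# List the glob directory, as the generated script does.
ls -1 glob-soft > glob-soft.list

echo "hard links allowed here: $hard_ok"
cat glob-soft.list
```

If `hard_ok` comes back as `no` while the symlink loop succeeds, that points at the hard-link call itself rather than at the pipeline's output files.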
Here is the WDL code:
scatter( i in range(length(fastqs_)) ) {
    # trim adapters and merge trimmed fastqs
    call trim_adapter { input :
        fastqs = fastqs_[i],
        adapters = if length(adapters_)>0 then adapters_[i] else [],
        paired_end = paired_end,
    }
    # align trimmed/merged fastqs with bowtie2
    call bowtie2 { input :
        idx_tar = bowtie2_idx_tar,
        fastqs = trim_adapter.trimmed_merged_fastqs, #[R1,R2]
        paired_end = paired_end,
        multimapping = multimapping,
    }
}
With the task definition:
task trim_adapter { # trim adapters and merge trimmed fastqs
    # parameters from workflow
    Array[Array[File]] fastqs # [merge_id][read_end_id]
    Array[Array[String]] adapters # [merge_id][read_end_id]
    Boolean paired_end
    # mandatory
    Boolean? auto_detect_adapter # automatically detect/trim adapters
    # optional
    Int? min_trim_len # minimum trim length for cutadapt -m
    Float? err_rate # Maximum allowed adapter error rate
                    # for cutadapt -e
    # resource
    Int? cpu
    Int? mem_mb
    Int? time_hr
    # Commenting this line as a test. Problem with hard link
    String? disks
    command {
        python $(which encode_trim_adapter.py) \
            ${write_tsv(fastqs)} \
            --adapters ${write_tsv(adapters)} \
            ${if paired_end then "--paired-end" else ""} \
            ${if select_first([auto_detect_adapter,false]) then "--auto-detect-adapter" else ""} \
            ${"--min-trim-len " + select_first([min_trim_len,5])} \
            ${"--err-rate " + select_first([err_rate,'0.1'])} \
            ${"--nth " + select_first([cpu,2])}
    }
    output {
        # WDL glob() globs in an alphabetical order
        # so R1 and R2 can be switched, which results in an
        # unexpected behavior of a workflow
        # so we prepend merge_fastqs_'end'_ (R1 or R2)
        # to the basename of original filename
        # this prefix will be later stripped in bowtie2 task
        Array[File] trimmed_merged_fastqs = glob("merge_fastqs_R?_*.fastq.gz")
    }
    runtime {
        cpu : select_first([cpu,2])
        memory : "${select_first([mem_mb,'12000'])} MB"
        time : select_first([time_hr,24])
        disks : select_first([disks,"local-disk 100 HDD"])
    }
}
Can you completely remove "hard-link" from the backend file and try again? Also, please post this issue on the cromwell github repo too. They might have some insights about this. https://github.com/broadinstitute/cromwell/issues
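For reference, a backend fragment with "hard-link" removed might look roughly like the sketch below. This is an illustration, not the poster's file: the provider name `sge` is a placeholder, and only the `localization` and `duplication-strategy` arrays are shown. These keys control how input files are localized and how cached outputs are duplicated. The generated script above suggests the glob step issues its own `ln`, so this change may not affect that step.

```hocon
backend {
  providers {
    # "sge" is a placeholder; use the provider name from your own backend file.
    sge {
      config {
        filesystems {
          local {
            # Strategies are tried in order; "hard-link" is left out.
            localization: ["soft-link", "copy"]
            caching {
              duplication-strategy: ["soft-link", "copy"]
            }
          }
        }
      }
    }
  }
}
```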
My backend.conf :
I wonder if there is something wrong with my config files or if Cromwell's localization is at fault.