OOM error in markDuplicates #519

Closed
gianfilippo opened this issue Feb 7, 2024 · 3 comments
@gianfilippo

Hi,

I am trying to run the MAE and rnaVariantCalling modules and I am getting an OOM error in markDuplicates (see below).

I am submitting this as a slurm job, and for the last run I allocated 10 cores and 180 GB of memory. I do not recall (though I may be wrong) ever having to allocate this much memory when running a GATK-based pipeline on RNA-seq data.
Should I just allocate more memory, or should I manage it through the config file?

Thanks

Error in rule markDuplicates:
jobid: 247
input: DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/bam/661T/661T_Aligned.sortedByCoord.out.bam, DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/bam/661T/661T_Aligned.sortedByCoord.out.bam.bai
output: DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/bam/661T/661T_Aligned.sortedByCoord.dupMarked.out.bam, DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/bam/661T/661T_Aligned.sortedByCoord.dupMarked.out.bai
log:
DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/logs/markDuplicates/661T.log (check log file(s) for error details)
shell:

    gatk MarkDuplicates -I DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/bam/661T/661T_Aligned.sortedByCoord.out.bam -O DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/bam/661T/661T_Aligned.sortedByCoord.dupMarked.out.bam -M DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/picard-tools-marked-dup-metrics.txt --CREATE_INDEX true --TMP_DIR "/tmp" --VALIDATION_STRINGENCY SILENT 2> DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/logs/markDuplicates/661T.log

    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job markDuplicates since they might be corrupted:
DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/bam/661T/661T_Aligned.sortedByCoord.dupMarked.out.bam, DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/bam/661T/661T_Aligned.sortedByCoord.dupMarked.out.bai

Below is the end of the sample-specific log file DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/logs/markDuplicates/661T.log

INFO 2024-02-05 23:22:39 MarkDuplicates Sorting list of duplicate records.
INFO 2024-02-05 23:22:42 MarkDuplicates After generateDuplicateIndexes freeMemory: 17094205248; totalMemory: 25199378432; maxMemory: 32178700288
INFO 2024-02-05 23:22:42 MarkDuplicates Marking 29761681 records as duplicates.
INFO 2024-02-05 23:22:42 MarkDuplicates Found 3318 optical duplicate clusters.
INFO 2024-02-05 23:22:42 MarkDuplicates Reads are assumed to be ordered by: coordinate
INFO 2024-02-05 23:23:26 MarkDuplicates Written 10,000,000 records. Elapsed time: 00:00:44s. Time for last 10,000,000: 44s. Last read position: chr4:73,408,762
INFO 2024-02-05 23:24:10 MarkDuplicates Written 20,000,000 records. Elapsed time: 00:01:28s. Time for last 10,000,000: 44s. Last read position: chr8:108,203,053
INFO 2024-02-05 23:25:00 MarkDuplicates Written 30,000,000 records. Elapsed time: 00:02:18s. Time for last 10,000,000: 49s. Last read position: chr14:94,378,547
INFO 2024-02-05 23:25:44 MarkDuplicates Written 40,000,000 records. Elapsed time: 00:03:02s. Time for last 10,000,000: 43s. Last read position: chr19:58,355,146
INFO 2024-02-05 23:26:18 MarkDuplicates Written 50,000,000 records. Elapsed time: 00:03:36s. Time for last 10,000,000: 33s. Last read position: chrM:8,968
INFO 2024-02-05 23:26:40 MarkDuplicates Writing complete. Closing input iterator.
INFO 2024-02-05 23:26:40 MarkDuplicates Duplicate Index cleanup.
INFO 2024-02-05 23:26:40 MarkDuplicates Getting Memory Stats.
INFO 2024-02-05 23:26:40 MarkDuplicates Before output close freeMemory: 288414536; totalMemory: 335544320; maxMemory: 32178700288
INFO 2024-02-05 23:26:40 MarkDuplicates Closed outputs. Getting more Memory Stats.
INFO 2024-02-05 23:26:40 MarkDuplicates After output close freeMemory: 188927600; totalMemory: 234881024; maxMemory: 32178700288
[Mon Feb 05 23:26:40 EST 2024] picard.sam.markduplicates.MarkDuplicates done. Elapsed time: 9.16 minutes.
Runtime.totalMemory()=234881024
Using GATK jar $HOME/.conda/envs/drop_env_133/share/gatk4-4.4.0.0-0/gatk-package-4.4.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar $HOME/.conda/envs/drop_env_133/share/gatk4-4.4.0.0-0/gatk-package-4.4.0.0-local.jar MarkDuplicates -I DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/bam/661T/661T_Aligned.sortedByCoord.out.bam -O DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/bam/661T/661T_Aligned.sortedByCoord.dupMarked.out.bam -M DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/picard-tools-marked-dup-metrics.txt --CREATE_INDEX true --TMP_DIR /tmp --VALIDATION_STRINGENCY SILENT
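For reference, no -Xmx appears in the java command above, so the maxMemory in the log (32178700288 bytes, roughly 30 GB) is whatever default heap cap the JVM picked on that node, not the 180 GB I requested from slurm. If the Java heap itself ever needs to be raised, something like the following should work (a sketch reusing the same paths; --java-options is GATK's standard mechanism for passing JVM flags, and the 64g value is just an example):

    # explicitly raise the JVM heap cap for MarkDuplicates (example value)
    gatk --java-options "-Xmx64g" MarkDuplicates \
        -I DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/bam/661T/661T_Aligned.sortedByCoord.out.bam \
        -O DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/bam/661T/661T_Aligned.sortedByCoord.dupMarked.out.bam \
        -M DROP/Analysis_60M_60F_ExternalSamples/processed_data/rnaVariantCalling/out/picard-tools-marked-dup-metrics.txt \
        --CREATE_INDEX true --TMP_DIR "/tmp" --VALIDATION_STRINGENCY SILENT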

@gianfilippo
Author

UPDATE:
I issued the same "gatk MarkDuplicates" command from the log file on an interactive node with only 32 GB of memory, and it completed.
Maybe the problem is with the DROP/snakemake default settings for memory management, but I am not sure how to change them. Any suggestions?
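In case it helps anyone else: plain snakemake can declare or cap job memory from the command line, so something along these lines might be worth trying (a sketch that assumes DROP is driven by a regular snakemake invocation and that the relevant rules consume a mem_mb resource; the values are placeholders):

    # give every job a default mem_mb resource (placeholder value)
    snakemake --cores 10 --default-resources mem_mb=30000

    # or cap the total memory snakemake is allowed to schedule at once
    snakemake --cores 10 --resources mem_mb=160000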

@vyepez88
Collaborator

vyepez88 commented Feb 9, 2024

Hi, I think there was an issue with the maskMultiVCF step because a path couldn't be accessed; it is working now, so that could have been the cause.
180 GB for 10 samples should be more than enough.
You could also add specific resource allocations to the headers of the submission scripts, as sketched below.
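For slurm, a minimal sketch of such a header could look like this (placeholder values, not DROP defaults):

    #!/bin/bash
    #SBATCH --job-name=drop_rnaVariantCalling   # placeholder job name
    #SBATCH --cpus-per-task=10                  # cores, matching the original run
    #SBATCH --mem=180G                          # memory, matching the original run
    #SBATCH --time=24:00:00                     # placeholder walltime

    # ...followed by the usual pipeline invocation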

@gianfilippo
Author

Hi, I just completed a rerun and it worked. Thanks!
