Disk quota exceeded - a way to reduce the number of read partitions? #1410
Hi,
Sorry about this. The massive number of files is a 'feature' of running
Trinity. It's better than in the original version, but it can still be an
issue.
One way to reduce it is to reduce the number of lowly expressed transcripts
that get assembled. You can do this by using
Trinity --min_kmer_cov 2
Another way is to limit the assembly to longer transcripts. The default
minimum length cutoff (--min_contig_length) is 200, but setting it to 300
will make a big difference.
Hope this helps.
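The two suggestions above could be combined in a single rerun along these lines. This is only a sketch under stated assumptions, not the maintainers' recommended invocation: the read file names and the output directory are hypothetical placeholders, the memory and CPU values are copied from the original post, the other options from that post are left out for brevity, and a fresh output directory is assumed so that the earlier partial run is not resumed. The helper `run_or_print` is made up for this example; it only prints the command unless `DRY_RUN=0` is set.

```shell
# Print the command by default; run it only when DRY_RUN=0 is set.
run_or_print() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    set -- echo "$@"
  fi
  "$@"
}

# Hypothetical placeholders; substitute real paths.
LEFT="${LEFT:-reads_1.fq}"
RIGHT="${RIGHT:-reads_2.fq}"
OUT="${OUT:-trinity_assembly_minlen300}"

# --min_kmer_cov 2 drops k-mers seen only once (fewer lowly expressed transcripts).
# --min_contig_length 300 raises the length cutoff from its default of 200.
run_or_print Trinity \
  --seqType fq \
  --max_memory 230G \
  --CPU 8 \
  --left "$LEFT" \
  --right "$RIGHT" \
  --SS_lib_type FR \
  --min_kmer_cov 2 \
  --min_contig_length 300 \
  --output "$OUT"
```

Printing first makes it easy to check the flags before committing cluster time to the run.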
On Fri, May 3, 2024 at 2:10 PM EvaMarkovic wrote:
Hi!
I'm running Trinity with the following options/parameters:
Trinity --seqType fq --NO_SEQTK --max_memory 230G --CPU 8 --left "comma sep FW fastq files" --right "comma sep REV fastq files" --SS_lib_type FR --trimmomatic --quality_trimming_params "SLIDINGWINDOW:4:15 LEADING:10 TRAILING:10 MINLEN:25" --output .../trinity_assembly/ --bflyHeapSpaceMax 24G --bflyCPU 8
All of the steps of Phase 1 completed successfully, but during Phase 2, I
encountered the following error(s):
warning, cmd: /mnt/netapp1/Optcesga_FT2_RHEL7/2020/software/Compiler/gcccore/system/trinityrnaseq/2.13.2/trinityrnaseq-v2.13.2/util/support_scripts/../../Trinity --single "/mnt/lustre/scratch/nlsas/home/csic/gfy/mcs/eva/test/PC/trinity_assembly/read_partitions/Fb_0/CBin_423/c42371.trinity.reads.fa" --output "/mnt/lustre/scratch/nlsas/home/csic/gfy/mcs/eva/test/PC/trinity_assembly/read_partitions/Fb_0/CBin_423/c42371.trinity.reads.fa.out" --CPU 1 --max_memory 1G --run_as_paired --SS_lib_type F --seqType fa --trinity_complete --full_cleanup --NO_SEQTK --bflyHeapSpaceMax 6G --bflyCPU 60 --no_distributed_trinity_exec --no_salmon failed with ret: 512, going to retry.
mkdir: cannot create directory '/mnt/lustre/scratch/nlsas/home/csic/gfy/mcs/eva/test/PC/trinity_assembly/read_partitions/Fb_0/CBin_4/c498.trinity.reads.fa.out': Disk quota exceeded
The obvious problem is "Disk quota exceeded". I am running Trinity on a
cluster where each user has a limit of 3 TB of data and 240,000 files. With
my dataset, Trinity creates approximately 160,000 files in the
read_partitions directory, which (together with the other files I have)
exceeds the file-count quota and stops Trinity from finishing Phase 2. (The
storage itself is fine; I have 500 GB free.) Is there a way to reduce the
number of read partitions that are created and avoid this error?
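To see how close a run is to a file-count limit like the one described above, the entries under the output tree can be counted and compared with the quota. The following is a rough sketch: `count_entries` and `headroom` are helper names made up for this example, the directory path is a placeholder, and it assumes that every file and directory counts once toward the limit (how a given cluster counts is site-specific). With the figures from the post, 240000 - 160000 leaves 80000 entries for everything else the user owns.

```shell
# Count every entry (files and directories) under a directory tree.
# Assumes no file names contain newlines.
count_entries() {
  find "$1" | wc -l | tr -d ' '
}

# headroom QUOTA USED: entries that still fit under the quota.
headroom() {
  echo $(( $1 - $2 ))
}

QUOTA=240000   # per-user file limit quoted in the post
# Hypothetical placeholder path; point it at the real Trinity output.
PARTITIONS_DIR="${PARTITIONS_DIR:-trinity_assembly/read_partitions}"

if [ -d "$PARTITIONS_DIR" ]; then
  used=$(count_entries "$PARTITIONS_DIR")
  echo "entries under $PARTITIONS_DIR: $used"
  echo "left under quota (ignoring other files): $(headroom "$QUOTA" "$used")"
fi
```

Running this periodically during Phase 2 gives an early warning before the quota is actually hit.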
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas <http://broad.mit.edu/~bhaas>
On Sat, May 4, 2024 at 7:15 AM EvaMarkovic wrote:
Thank you for your response and suggestions! I tried setting --min_kmer_cov to 2, and it did indeed decrease the number of files, but unfortunately there are still too many. I will try raising the min length cutoff next. Thanks so much again!
I think the min contig length is going to be the largest contributor to the
number of files... probably an exponential relationship there.
On Sun, May 5, 2024 at 4:39 AM EvaMarkovic wrote:
I set the min contig length to 300 and it reduced the number of files roughly threefold, which put me well below my quota. Thank you so much again!!
Great to hear!