In silico normalization problems with Trinityv2.4.0 #481
Comments
Hi,
The reads that it's having trouble finding are in the file:
/home/manager/Transcriptoma/Jatropha/Totrans/Trinity_Jatropha/insilico_read_normalization/Root0Trimm.fastq_ext_all_reads.normalized_K25_C50_pctSD10000.fa.missing_accs
Is there something unusual about how these read names are formatted in this
file as compared to how they're found in the input fastq files?
~b
On Mon, May 28, 2018 at 7:57 PM, UlisesArcos wrote:
I am trying to use Trinity (trinityrnaseq) to practice bioinformatics with a single-end transcriptome, but when I run Trinity I get this problem:
-sorting each stats file by read name.
Thread 1 terminated abnormally: Error, not all specified records have been retrieved (missing 600) from /home/manager/Transcriptoma/Jatropha/Totrans/Root0Trimm.fastq /home/manager/Transcriptoma/Jatropha/Totrans/Root001Trimm.fastq /home/manager/Transcriptoma/Jatropha/Totrans/Root100Trimm.fastq, see file: /home/manager/Transcriptoma/Jatropha/Totrans/Trinity_Jatropha/insilico_read_normalization/Root0Trimm.fastq_ext_all_reads.normalized_K25_C50_pctSD10000.fa.missing_accs for list of missing entries at /home/manager/Downloads/trinityrnaseq-Trinity-v2.4.0/util/insilico_read_normalization.pl line 548.
Error encountered with thread.
Error, at least one thread died at /home/manager/Downloads/trinityrnaseq-Trinity-v2.4.0/util/insilico_read_normalization.pl line 419.
Error, cmd: /home/manager/Downloads/trinityrnaseq-Trinity-v2.4.0/util/insilico_read_normalization.pl --seqType fa --JM 50G --max_cov 50 --CPU 8 --output /home/manager/Transcriptoma/Jatropha/Totrans/Trinity_Jatropha/insilico_read_normalization --max_pct_stdev 10000 --SS_lib_type F --single /home/manager/Transcriptoma/Jatropha/Totrans/Root0Trimm.fastq,/home/manager/Transcriptoma/Jatropha/Totrans/Root001Trimm.fastq,/home/manager/Transcriptoma/Jatropha/Totrans/Root100Trimm.fastq died with ret 7424 at /home/manager/Downloads/trinityrnaseq-Trinity-v2.4.0/Trinity line 2462.
…
I tried to fix the problem but have not been able to.
Could someone help me?
Many Thanks!
|
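One way to follow up on Brian's question is to compare the names in the .missing_accs file with the read names in the input fastq directly. Below is a minimal Python sketch, assuming the .missing_accs file holds one read name per line and that names are compared after dropping the leading '@', anything after the first whitespace, and a trailing /1 or /2. The function names are hypothetical and are not part of Trinity.

```python
import re


def core_name(header: str) -> str:
    """Reduce a read header to a bare name: no '@', no comment, no /1 or /2."""
    fields = header.strip().lstrip("@").split()
    name = fields[0] if fields else ""
    return re.sub(r"/[12]$", "", name)


def fastq_names(lines):
    """Yield the bare read name from the 1st line of every 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 0 and line.strip():
            yield core_name(line)


def split_missing(missing_lines, fastq_lines):
    """Return (found, absent): missing accessions that do / do not occur in the FASTQ."""
    present = set(fastq_names(fastq_lines))
    wanted = [core_name(line) for line in missing_lines if line.strip()]
    found = [name for name in wanted if name in present]
    absent = [name for name in wanted if name not in present]
    return found, absent


if __name__ == "__main__":
    # Toy data for illustration only; real use would pass open file handles.
    fastq = [
        "@READ:1:1000 1:N:0:ACGT\n", "ACGT\n", "+\n", "FFFF\n",
        "@READ:1:2000/1\n", "TTGA\n", "+\n", "FFFF\n",
    ]
    missing = ["READ:1:1000\n", "READ:9:9999\n"]
    found, absent = split_missing(missing, fastq)
    print("found in fastq:", found)
    print("absent from fastq:", absent)
```

Names that show up as "found" only after this normalization would point to a formatting difference; names that are absent altogether would point to something else, such as intermediate files left over from an earlier run.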
I use a file named Root0Trimm.fastq as input, so the name in the Trinity output is correct. I still cannot make Trinity work. Is there something I can do with the file Root0Trimm.fastq_ext_all_reads.normalized_K25_C50_pctSD10000.fa.missing_accs? |
closing old posts. If this is still an active issue we can continue to explore it |
I get a similar problem. Is it possible to post the solution to this problem? Does it have anything to do with the fastq format? |
It usually has to do with the formatting of the input fastq records. Can you show the top of your files? (ie. head *fastq) |
Thanks Brian for your quick response. My data was generated with NextSeq sequencing; I did not extract it from SRA. I am using the latest version of Trinity (v2.8.5). The head of my file looks like:
[mc4719@shake Combined_lanes]$ head Sample_A01_S1_R2_001.fastq
I tried to modify the header of the reads with respect to the forward and reverse strand, but I still get an error:
[mc4719@shake Combined_lanes]$ head trinity_Sample_A01_S1_R2_001.fastq |
I see the problem is the space in the original read def line; Try making this line look like this: instead of: and likewise for the /1 entries. After this, try running Trinity in a new clean workspace so that it doesn't try to reuse any of the earlier intermediate outputs from the earlier run. best, ~b |
Hi, it seems to be working now. Solution for my specific headers (forward reads for example): if files are .gz:
In the reverse reads you change the second step ({ print $1""$2"/2" } instead of { print $1""$2"/1" }). Thanks! |
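The header change discussed here can also be sketched in Python. This is not the command used in the comment above; it is a minimal illustration that assumes Illumina-style headers of the form "@name comment", keeps only the part before the first whitespace, and appends /1 or /2. The names retag_header and retag_fastq are hypothetical.

```python
def retag_header(header: str, mate: int) -> str:
    """Drop the comment after the first whitespace and append /1 or /2."""
    if mate not in (1, 2):
        raise ValueError("mate must be 1 or 2")
    name = header.rstrip("\n").split()[0]
    # Avoid doubling a suffix that is already present.
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return f"{name}/{mate}"


def retag_fastq(lines, mate: int):
    """Yield FASTQ lines with only the header (1st of every 4 lines) rewritten."""
    for i, line in enumerate(lines):
        if i % 4 == 0 and line.strip():
            yield retag_header(line, mate) + "\n"
        else:
            # Sequence, '+' and quality lines pass through untouched.
            yield line


if __name__ == "__main__":
    record = ["@R:1:1000 1:N:0:ACGT\n", "ACGT\n", "+\n", "FFFF\n"]
    print("".join(retag_fastq(record, 1)), end="")
```

For gzipped input, the same generator can be fed from gzip.open(path, "rt"). Writing the result to a new file, rather than editing in place, keeps the original reads available if the rewrite needs to be redone.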
Hello Brian, do lines beginning with +A also need to be amended? For example, would the following be a valid fastq read?
@A00127:170:H2H3NDSXY:1:1101:22146:1235/1
ATTTGACTCAGAATCTGTGAGGCTCTCTTGATCGAATCAGCACTCAACAATTTCCCGGTTTCACTTTTGTCCAAAAGGGCTGGATGATAACAAAGTGCAAG
+A00127:170:H2H3NDSXY:1:1101:22146:1235 1:N:0:GGTAGAATTA+TGTAAGGTGG
FFFFFF:FFFFFF,F,,F,,F,:,,FFFFFFFFFFF,:F:,FF,,FF:FFF,,F,,,,F,FFFFF::F::F:FFFFFF,:F,,FF,F:,,FFFFF:F:FFF
Thanks,
Ian |
Hi Ian,
It's just the read name (1st of the 4 fastq lines / record) that needs to have the /1 or /2.
The latest version of Trinity should be more flexible around this requirement... if the /1 or /2 is there, it'll check that it matches up with expected read directionality, or it'll add it where it's missing.
best,
~brian
|
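To illustrate the point that only the first line of each record carries the /1 or /2, here is a minimal Python sketch that tallies header suffixes for a file expected to hold left (1) or right (2) reads. It is a hypothetical helper, not Trinity's own check, and it assumes well-formed 4-line records.

```python
from collections import Counter


def tally_suffixes(lines, expected: int) -> Counter:
    """Count headers whose /1 or /2 suffix is ok, mismatched, or missing."""
    counts = Counter()
    for i, line in enumerate(lines):
        if i % 4 != 0 or not line.strip():
            continue  # sequence, '+' and quality lines are never inspected
        name = line.split()[0]
        if name.endswith(f"/{expected}"):
            counts["ok"] += 1
        elif name.endswith(("/1", "/2")):
            counts["mismatch"] += 1
        else:
            counts["missing"] += 1
    return counts


if __name__ == "__main__":
    record = [
        "@A00127:170:H2H3NDSXY:1:1101:22146:1235/1\n",
        "ACGT\n",
        "+A00127:170:H2H3NDSXY:1:1101:22146:1235 1:N:0:GGTAGAATTA+TGTAAGGTGG\n",
        "FFFF\n",
    ]
    print(dict(tally_suffixes(record, expected=1)))
```

The '+' line in the example still carries the original comment, and the tally ignores it, which mirrors the answer above: only the name on the first line matters.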
Hello! I'm having a similar issue and looked into my trimmed reads. I just wanted to verify...
My read headers (for forward reads) look like this:
@a00920:888:HWTTMDSX2:4:1101:2627:1000 2:N:0:ACAATCCGTG+GGATTCTGTC
TAACTATCCTATCTTCTGATGACAGTTTAGCTCTTCAGAATCAAGAAACGCTTCTTAAGCTGAAACATCCTAAACCATCTAGATCTTTATCATTTCCTGAAACAATAACTTCTTTTACTGAAGTTTTGTTAGTCAAGTGAAGATGACGT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F:FFF:FFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFF:,FFFFFFFFFFF:
@a00920:888:HWTTMDSX2:4:1101:12192:1000 2:N:0:ACAATCCGTG+GGATTCTGTC
TTTTAAGTTAAAAGGCTTAAATAGGTATAAAAGACGAGAAGACCCCATAGAGTTTAATTTTTTATTTTATTTTAGAATAATTTTATAACCAAAAAATTGGTTGGGGTAACTTAAAGATAAAATAAATTCTTTAAATGTGTAATCATTGAT
+
So should I remove the space to make the headers look like this?
@a00920:888:HWTTMDSX2:4:1101:2627:1000/1
@a00920:888:HWTTMDSX2:4:1101:12192:1000/1
Thank you and Happy New Year!
Margot |
Hi Margot,
The ' 2:N:0:' part of the read name indicates that it's the 'right' or /2 read of the fragment. The 'left' or /1 read would have ' 1:N:0'
hope this helps,
~brian
|
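The rule Brian describes can be sketched as a small Python helper that reads the mate number from the comment field of an Illumina-style header. This assumes the comment follows the first whitespace and starts with "1:" or "2:", as in the headers quoted above; infer_mate is a hypothetical name, not a Trinity function.

```python
import re
from typing import Optional

# Comment field layout assumed here: <read>:<filtered Y/N>:<control number>:<index>
_COMMENT = re.compile(r"^([12]):[YN]:\d+")


def infer_mate(header: str) -> Optional[int]:
    """Return 1 or 2 from a header like '@name 2:N:0:INDEX', else None."""
    parts = header.strip().split(None, 1)
    if len(parts) < 2:
        return None  # no comment field, so nothing to infer from
    match = _COMMENT.match(parts[1])
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    header = "@a00920:888:HWTTMDSX2:4:1101:2627:1000 2:N:0:ACAATCCGTG+GGATTCTGTC"
    print(infer_mate(header))
```

Running such a check on the first header of each input file is a quick way to catch a left/right mix-up before any read names are rewritten.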
Thanks for pointing this out, I'll look into the raw reads. Once this issue is corrected, would the correction (removal of space, addition of /1 or /2 to suffix) be the correct way to address this? Thanks again!! |
The normal fastq format for the read name, if it fits the earlier example,
should be totally fine. If you need to reformat something like from SRA
because they do some read name mangling (for lack of a better word), then
forcing the /1 or /2 would be easiest. You just don't want to change it to
/1 or /2 to be trinity-compatible if the reads are really coming from
different paired read arrangements. Otherwise, just run it all as
single-end.
best,
~b
|
Hi Brian,
Okay good news--I checked the read files and I had accidentally copied and pasted a reverse read earlier. The forward reads are labeled with 1:N:0
Bad news...I'm not sure why I'm getting this error if the headers look fine.
Here is an example error:
Thread 6 terminated abnormally: Error, not all specified records have been retrieved (missing 15054418) from /PR09I1_2_val_2.fq.gz, see file: /trinity_out_dir/insilico_read_normalization/PR09I1_2_val_2.fq.gz.normalized_K25_maxC200_minC1_pctSD10000.fq.missing_accs for list of missing entries at /usr/local/bin/trinityrnaseq/util/insilico_read_normalization.pl line 552.
Thread 5 terminated abnormally: Error, not all specified records have been retrieved (missing 15054418) from /PR09I1_1_val_1.fq.gz, see file: /trinity_out_dir/insilico_read_normalization/PR09I1_1_val_1.fq.gz.normalized_K25_maxC200_minC1_pctSD10000.fq.missing_accs for list of missing entries at /usr/local/bin/trinityrnaseq/util/insilico_read_normalization.pl line 552.
Error encountered with thread.
Error encountered with thread.
Error, at least one thread died at /usr/local/bin/trinityrnaseq/util/insilico_read_normalization.pl line 423.
Do you have any suggestions for how to fix this? Thank you for your help, I really appreciate it!
Margot |
I see. It's having a difficult time recovering reads of interest.
Can you show me the top entries in the file:
/trinity_out_dir/insilico_read_normalization/PR09I1_2_val_2.fq.gz.normalized_K25_maxC200_minC1_pctSD10000.fq.missing_accs
and then also the top entries from:
/trinity_out_dir/insilico_read_normalization/PR09I1_2_val_2.fq.gz
?
This should give us more insights.
best,
~b
|
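For gzipped files such as PR09I1_2_val_2.fq.gz, the top entries can be printed without decompressing the whole file. Here is a minimal Python sketch, assuming gzip compression is indicated by a .gz extension; head_lines is a hypothetical helper and the command-line handling is for illustration only.

```python
import gzip
import sys
from itertools import islice


def head_lines(path: str, n: int = 8):
    """Return the first n lines of a plain-text or gzip-compressed file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # islice stops after n lines, so large files are not read in full.
        return [line.rstrip("\n") for line in islice(handle, n)]


if __name__ == "__main__":
    # Usage: python head_lines.py FILE [FILE ...]
    for path in sys.argv[1:]:
        print(f"==> {path} <==")
        print("\n".join(head_lines(path)))
```

Eight lines correspond to two complete fastq records, which is usually enough to see how the read names are formatted.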
Here are the top entries for:
/trinity_out_dir/insilico_read_normalization/PR09I1_2_val_2.fq.gz.normalized_K25_maxC200_minC1_pctSD10000.fq.missing_accs
A00877:895:HWWVNDSX2:3:2225:9959:20995
A00253:590:HYCJ2DSX2:1:2402:2808:24518
A00253:581:HYM7GDSX2:1:1402:3206:5196
A00920:888:HWTTMDSX2:4:1563:28980:11819
A00917:820:HYMNJDSX2:3:1102:4426:34491
A00877:895:HWWVNDSX2:3:2242:15393:19038
A00253:581:HYM7GDSX2:1:2353:20437:15687
A00920:888:HWTTMDSX2:4:2446:15067:9455
A00253:590:HYCJ2DSX2:1:1443:9417:8625
It doesn't seem to find the second file; I wonder if it hasn't been created yet?
Thanks! |
This is peculiar. The read names look fine there. It's basically not finding those read names in your original fastq file "PR09I1_2_val_2.fq.gz".
If you find those read names in your original fastq files, then I can't explain why it's not able to recover them.
You could try rerunning in a clean workspace. If it continues to give you trouble, I could try to run your data through on my system to troubleshoot anything directly.
best,
~b
|
Hi Brian,
I just grepped the trimmed and raw read files and didn't find the entries I posted for /trinity_out_dir/insilico_read_normalization/PR09I1_2_val_2.fq.gz.normalized_K25_maxC200_minC1_pctSD10000.fq.missing_accs
It seems like these reads are missing--what would you recommend next?
EDIT: Sorry, I will try re-running in a new workspace and let you know how it turns out...thanks so much for your help.
Thank you!! |
I think you're going to want to rerun from scratch in a new working directory, to be sure that no earlier intermediate outputs are reused in this new run. That should solve the problem. Where the mystery read names are coming from is beyond me, but if it happens again, we'll continue to explore it further.
best of luck,
~brian
|
Hi Brian, the jobs have finished. I just needed to run them in a new directory. Thanks so much for your help!!!! |
Great, thanks for letting me know!
|