Trinity transcript_abundance.pl error using RSEM #305

Closed
amartin44 opened this issue Jul 7, 2017 · 9 comments
@amartin44 amartin44 commented Jul 7, 2017

Hello,

I am trying to estimate transcript abundance from a transcriptome generated from paired-end RNA-seq data using Trinity. However, I am running into a consistent error while using the align_and_estimate_abundance.pl script that I cannot seem to debug. After submitting a PBS job using the following options:

perl align_and_estimate_abundance.pl --transcripts Trinity.fasta --seqType fq
--samples_file fundulus_bioreplicates.txt --SS_lib_type RF --est_method RSEM
--aln_method bowtie --trinity_mode --prep_reference --output_dir rsem_outdir
--thread_count 20

the reference files are successfully prepared. Subsequently, the program begins to make .bam files for the different biological replicates, but eventually the entire job fails in the rsem-calculate-expression phase. The error is as follows:

Read HWI-ST1234:250:C96G9ACXX:2:1101:3338:2119: The adjacent two lines do not represent the two mates of a paired-end read! (RSEM assumes the two mates of a paired-end read should be adjacent)
Error, cmd: rsem-calculate-expression --paired-end -p 4 --forward-prob 0 --no-bam-output --bam fundulus_RSEM.bowtie.bam (...) died with ret: 65280 at align_and_estimate_abundance.pl line 766.

I have verified that both reads in the pair with the tag "HWI-ST1234:250:C96G9ACXX:2:1101:3338:2119" exist in the .bam file. Has anyone experienced a similar error and had success debugging it? I am relatively new in the bioinformatics field and would greatly appreciate any insight on what might be going wrong.

Thanks!

Alex.
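
One way to check the adjacency that the RSEM message complains about is to walk the alignment records two at a time. The following is a minimal sketch, not part of Trinity or RSEM. It assumes the input is SAM text lines (for example, what `samtools view` prints for the BAM file), and the function name `first_non_adjacent_pair` is hypothetical. It accepts either mate order, because this thread does not say which order RSEM requires.

```python
FIRST_MATE = 0x40   # SAM flag bit: first mate of the pair
SECOND_MATE = 0x80  # SAM flag bit: second mate of the pair


def first_non_adjacent_pair(sam_lines):
    """Return (record_index, read_name) of the first bad pair, or None if all pairs are adjacent."""
    records = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        records.append((fields[0], int(fields[1])))  # (QNAME, FLAG)

    for i in range(0, len(records), 2):
        pair = records[i:i + 2]
        if len(pair) < 2:
            return i, pair[0][0]  # trailing record with no partner
        (name_a, flag_a), (name_b, flag_b) = pair
        mate_bits = {flag_a & 0xC0, flag_b & 0xC0}
        # Either mate order is accepted here; only same name + one of each mate is required.
        if name_a != name_b or mate_bits != {FIRST_MATE, SECOND_MATE}:
            return i, name_a
    return None


def toy_record(name, flag):
    """Build a made-up SAM line; only the first two fields matter for this check."""
    return "\t".join([name, str(flag), "t1", "1", "255", "10M", "=", "50", "60",
                      "ACGTACGTAC", "IIIIIIIIII"])


# Toy data: r1 and r2 are interleaved, so the first two lines are not one read's mates.
demo = [toy_record("r1", 99), toy_record("r2", 99),
        toy_record("r1", 147), toy_record("r2", 147)]
print(first_non_adjacent_pair(demo))  # (0, 'r1')
```

Both mates being present somewhere in the file, as verified above, is not enough to pass this check; they have to sit on neighbouring lines.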

Member

@brianjohnhaas brianjohnhaas commented Jul 8, 2017

Author

@amartin44 amartin44 commented Jul 10, 2017

Thanks for getting back to me so quickly Brian; Bowtie2 seems to be working fine.

Member

@brianjohnhaas brianjohnhaas commented Jul 10, 2017

@DarioS DarioS commented Jul 12, 2017

It only recently began happening (BenLangmead/bowtie#52). Perhaps Bowtie will be updated soon and output ordered alignments, as it did before. You could also downgrade to version 1.1 in the meantime, because Bowtie 1 is much faster (simpler algorithm) than Bowtie 2.

Collaborator

@patrick-douglas patrick-douglas commented Feb 2, 2019

Hello,
I'm having a similar issue. I'm running the following command line:

$align_and_estimate_abundance --transcripts $fasta_ref --seqType fq --samples_file $samples_file --est_method RSEM --aln_method bowtie2 --trinity_mode --thread_count $threads --gene_trans_map $gene_trans_map_file

However, during processing I'm seeing a lot of warnings like the ones below:

....
Warning: Read SN1054:328:HGF77BCX2:2:2216:20693:63217 is ignored due to at least one of the mates' length < seed length (= 25)!
Warning: Read SN1054:328:HGF77BCX2:2:2216:20745:33963 is ignored due to at least one of the mates' length < seed length (= 25)!
Warning: Read SN1054:328:HGF77BCX2:2:2216:20790:49442 is ignored due to at least one of the mates' length < seed length (= 25)!
Warning: Read SN1054:328:HGF77BCX2:2:2216:20941:75533 is ignored due to at least one of the mates' length < seed length (= 25)!
Warning: Read SN1054:328:HGF77BCX2:2:2216:21025:31564 is ignored due to at least one of the mates' length < seed length (= 25)!
Warning: Read SN1054:328:HGF77BCX2:2:2216:21060:10374 is ignored due to at least one of the mates' length < seed length (= 25)!
Warning: Read SN1054:328:HGF77BCX2:2:2216:21140:89164 is ignored due to at least one of the mates' length < seed length (= 25)!
Warning: Read SN1054:328:HGF77BCX2:2:2216:21158:14723 is ignored due to at least one of the mates' length < seed length (= 25)!
Warning: Read SN1054:328:HGF77BCX2:2:2216:21183:10997 is ignored due to at least one of the mates' length < seed length (= 25)!
Warning: Read SN1054:328:HGF77BCX2:2:2216:21183:15602 is ignored due to at least one of the mates' length < seed length (= 25)!
Warning: Read SN1054:328:HGF77BCX2:2:2216:21205:62769 is ignored due to at least one of the mates' length < seed length (= 25)!
Warning: Read SN1054:328:HGF77BCX2:2:2216:21232:39489 is ignored due to at least one of the mates' length < seed length (= 25)!
Warning: Read SN1054:328:HGF77BCX2:2:2216:21283:5726 is ignored due to at least one of the mates' length < seed length (= 25)!
Warning: Read SN1054:328:HGF77BCX2:2:2216:21284:94154 is ignored due to at least one of the mates' length < seed length (= 25)!
...
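
To see how many read pairs would trigger this warning, one can count the pairs in which either mate is shorter than the seed length. The following is a minimal sketch under stated assumptions: the FASTQ files use exactly four lines per record, both files list the mates in the same order, and the helper names `read_lengths` and `count_short_pairs` are hypothetical. The value 25 is taken from the warnings above.

```python
from itertools import islice

SEED_LENGTH = 25  # value reported in the RSEM warnings above


def read_lengths(fastq_lines):
    """Yield the sequence length of each record in a 4-line-per-record FASTQ stream."""
    # The sequence is the second line of every 4-line record.
    for seq in islice(fastq_lines, 1, None, 4):
        yield len(seq.strip())


def count_short_pairs(left_lines, right_lines, seed_length=SEED_LENGTH):
    """Return (short_pairs, total_pairs), where a short pair has a mate below seed_length."""
    total = short = 0
    for left_len, right_len in zip(read_lengths(left_lines), read_lengths(right_lines)):
        total += 1
        if min(left_len, right_len) < seed_length:
            short += 1
    return short, total


# Toy data: the second pair has a 10-base left mate.
left = ["@r1/1", "A" * 30, "+", "I" * 30, "@r2/1", "A" * 10, "+", "I" * 10]
right = ["@r1/2", "C" * 30, "+", "I" * 30, "@r2/2", "C" * 30, "+", "I" * 30]
print(count_short_pairs(left, right))  # (1, 2)
```

With real data, the two arguments can be open file handles for the left and right FASTQ files. Note that these lines are warnings; the fatal error in this log comes later, from rsem-scan-for-paired-end-reads.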

The command line ends with an error:

ROUND = 7029, SUM = 8427703.99999993, bChange = 0.00107337, totNum = 1
ROUND = 7030, SUM = 8427703.99999993, bChange = 0.00107345, totNum = 1
ROUND = 7031, SUM = 8427703.99999993, bChange = 0.00107354, totNum = 1
ROUND = 7032, SUM = 8427703.99999993, bChange = 0.00107363, totNum = 1
ROUND = 7033, SUM = 8427703.99999994, bChange = 0.00107372, totNum = 1
ROUND = 7034, SUM = 8427703.99999993, bChange = 0.0010738, totNum = 1
ROUND = 7035, SUM = 8427703.99999993, bChange = 0.000896156, totNum = 0
Expression Results are written!
Time Used for EM.cpp : 0 h 10 m 05 s

rm -rf RSEM.temp

CMD: touch RSEM.isoforms.results.ok
CMD: set -o pipefail && bowtie2 --no-mixed --no-discordant --gbar 1000 --end-to-end -k 200  -q -X 800 -x /home/me/sequenciamento_Tambaqui_INCA_Agosto_2018/REF/Trinity.fasta.bowtie2 -1 /home/me/sequenciamento_Tambaqui_INCA_Agosto_2018/FASTQ/FEMEA_f-3_r1.trimmed.fastq -2 /home/me/sequenciamento_Tambaqui_INCA_Agosto_2018/FASTQ/FEMEA_f-3_r2.trimmed.fastq -p 12 | samtools view -F 4 -S -b | samtools sort -n -o bowtie2.bam 
22421893 reads; of these:
  22421893 (100.00%) were paired; of these:
    14977154 (66.80%) aligned concordantly 0 times
    4042750 (18.03%) aligned concordantly exactly 1 time
    3401989 (15.17%) aligned concordantly >1 times
33.20% overall alignment rate
[bam_sort_core] merging from 15 files and 1 in-memory blocks...
CMD: touch bowtie2.bam.ok
CMD: convert-sam-for-rsem bowtie2.bam bowtie2.bam.for_rsem
samtools sort -n -@ 1 -m 1G -o bowtie2.bam.for_rsem.tmp.bam bowtie2.bam
[bam_sort_core] merging from 11 files and 1 in-memory blocks...

rsem-scan-for-paired-end-reads 1 bowtie2.bam.for_rsem.tmp.bam bowtie2.bam.for_rsem.bam
.
Number of first and second mates in read SN1054:328:HGF77BCX2:1:1101:1094:41616's full alignments (both mates are aligned) are not matched!
"rsem-scan-for-paired-end-reads 1 bowtie2.bam.for_rsem.tmp.bam bowtie2.bam.for_rsem.bam" failed! Plase check if you provide correct parameters/options for the pipeline!
Error, cmd: convert-sam-for-rsem bowtie2.bam bowtie2.bam.for_rsem died with ret: 65280 at /usr/local/bin/trinityrnaseq-Trinity-v2.8.3/util/align_and_estimate_abundance.pl line 790.

How can I fix this?
Thank you in advance.
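
The "not matched" message suggests that, for the named read, the number of first-mate records differs from the number of second-mate records among alignments where both mates are aligned. The following is a minimal sketch for listing such reads. It is a guess at the check based only on the wording of the message, not RSEM's actual implementation. The input is assumed to be SAM text lines, and the function name `mismatched_mate_counts` is hypothetical.

```python
from collections import Counter

UNMAPPED = 0x4        # SAM flag bit: this record is unmapped
MATE_UNMAPPED = 0x8   # SAM flag bit: the mate is unmapped
FIRST_MATE = 0x40     # SAM flag bit: first mate of the pair
SECOND_MATE = 0x80    # SAM flag bit: second mate of the pair


def mismatched_mate_counts(sam_lines):
    """Map read name -> (first_mate_count, second_mate_count) where the two counts differ."""
    first, second = Counter(), Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & (UNMAPPED | MATE_UNMAPPED):
            continue  # only count alignments where both mates are aligned
        if flag & FIRST_MATE:
            first[qname] += 1
        elif flag & SECOND_MATE:
            second[qname] += 1
    names = sorted(set(first) | set(second))
    return {n: (first[n], second[n]) for n in names if first[n] != second[n]}


# Toy data: r2 has two first-mate records but only one second-mate record.
demo = ["r1\t99\tt1", "r1\t147\tt1", "r2\t99\tt1", "r2\t99\tt2", "r2\t147\tt1"]
print(mismatched_mate_counts(demo))  # {'r2': (2, 1)}
```

Looking at the records of any read this reports may help narrow down why the counts differ in the name-sorted BAM.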

Collaborator

@patrick-douglas patrick-douglas commented Feb 2, 2019

> Hello,
>
> I am trying to estimate transcript abundance from a transcriptome generated from paired-end RNA-seq data using Trinity. However, I am running into a consistent error while using the transcript_abundance.pl script that I cannot seem to debug. After submitting a PBS job using the following options:
>
> perl align_and_estimate_abundance.pl --transcripts Trinity.fasta --seqType fq
> --samples_file fundulus_bioreplicates.txt --SS_lib_type RF --est_method RSEM
> --aln_method bowtie --trinity_mode --prep_reference --output_dir rsem_outdir
> --thread_count 20
>
> the reference files are successfully prepared. Subsequently, the program begins to make .bam files for the different biological replicates, but eventually the entire job fails in the rsem-calculate-expression phase. The error is as follows:
>
> Read HWI-ST1234:250:C96G9ACXX:2:1101:3338:2119: The adjacent two lines do not represent the two mates of a paired-end read! (RSEM assumes the two mates of a paired-end read should be adjacent)
> Error, cmd: rsem-calculate-expression --paired-end -p 4 --forward-prob 0 --no-bam-output --bam fundulus_RSEM.bowtie.bam (...) died with ret: 65280 at align_and_estimate_abundance.pl line 766.
>
> I have verified that both reads in the pair with the tag "HWI-ST1234:250:C96G9ACXX:2:1101:3338:2119" exist in the .bam file. Has anyone experienced a similar error and had success debugging it? I am relatively new in the bioinformatics field and would greatly appreciate any insight on what might be going wrong.
>
> Thanks!
>
> Alex.

Could you send the script that you use to submit this task on PBS?
I tried to use this tool on a cluster, but without success.

@xiutinghua xiutinghua commented May 27, 2019

Hello,
I am trying to estimate transcript abundance from a transcriptome generated from paired-end RNA-seq data using Trinity, and I'm running the command line below. I submitted three samples in the SGV, but only this sample (SES-MC-3_FRAS190026479-1a) failed.

~/software/trinityrnaseq-Trinity-v2.8.4/util/align_and_estimate_abundance.pl --transcripts Sspon.cds.fasta --seqType fq --left SES-MC-3_FRAS190026479-1a_1QC.fq.gz --right SES-MC-3_FRAS190026479-1a_2QC.fq.gz --est_method RSEM --output_dir MC --aln_method bowtie --thread_count 4 --prep_reference

The command line ends with an error:

.Number of first and second mates in read A00808:49:HKYHNDSXX:2:1103:9462:31375's full alignments (both mates are aligned) are not matched!

"rsem-scan-for-paired-end-reads 1 bowtie.bam.for_rsem.tmp.bam bowtie.bam.for_rsem.bam" failed! Plase check if you provide correct parameters/options for the pipeline!
Error, cmd: convert-sam-for-rsem bowtie.bam bowtie.bam.for_rsem died with ret: 65280 at /public1/home/stu_huaxiuting/software/trinityrnaseq-Trinity-v2.8.4/util/align_and_estimate_abundance.pl line 790.

Thanks
xiuting
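
The `ret: 65280` in these error lines looks like a raw wait status rather than a plain exit code. Assuming the Perl script reports the value returned by `system()` (an assumption, not confirmed in this thread), the exit code is the high byte of that number. A minimal sketch of the arithmetic, with a hypothetical helper name `decode_wait_status`:

```python
def decode_wait_status(status):
    """Split a 16-bit POSIX-style wait status into its parts."""
    return {
        "exit_code": status >> 8,            # high byte: the child's exit code
        "signal": status & 0x7F,             # low 7 bits: terminating signal, 0 if none
        "core_dumped": bool(status & 0x80),  # bit 7: core dump flag
    }


print(decode_wait_status(65280))
# {'exit_code': 255, 'signal': 0, 'core_dumped': False}
```

Under that assumption, 65280 only says that the failing command exited with code 255. The informative part is the message printed just before it.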

Member

@brianjohnhaas brianjohnhaas commented May 27, 2019

@polaxgr polaxgr commented Feb 14, 2020

Hello,

I also ran into a problem running:

./align_and_estimate_abundance.pl --seqType fq --single '..x.fastq' --transcripts '.../Trinity.fasta' --output_dir '/..output' --est_method RSEM --aln_method bowtie2 --trinity_mode --prep_reference

It ran correctly up until:


samtools sort -n -@ 1 -m 1G -o bowtie2.bam.for_rsem.tmp.bam bowtie2.bam

[bam_sort_core] merging from 114 files and 1 in-memory blocks...

rsem-scan-for-paired-end-reads 1 bowtie2.bam.for_rsem.tmp.bam bowtie2.bam.for_rsem.bam
..................................................................................................
Finished!

Conversion is completed. bowtie2.bam.for_rsem.bam will be checked by 'rsem-sam-validator'.
rsem-sam-validator bowtie2.bam.for_rsem.bam
...................................................................................................................................................................................................................................................................................................................................................................................................................................................
The input file is valid!

Then it produced an error:

CMD: rsem-calculate-expression     -p 4 --fragment-length-mean 200 --fragment-length-sd 80   --no-bam-output --bam bowtie2.bam.for_rsem.bam /Trinity_trimmed.fasta.RSEM RSEM 
rsem-parse-alignments /trinity_trimmed/Trinity_trimmed.fasta.RSEM RSEM.temp/RSEM RSEM.stat/RSEM bowtie2.bam.for_rsem.bam 1 -tag XM
Cannot open /trinity_trimmed/Trinity_trimmed.fasta.RSEM.grp! It may not exist.
"rsem-parse-alignments /home/app/Trinity_trimmed.fasta.RSEM RSEM.temp/RSEM RSEM.stat/RSEM bowtie2.bam.for_rsem.bam 1 -tag XM" failed! Plase check if you provide correct parameters/options for the pipeline!
Error, cmd: rsem-calculate-expression     -p 4 --fragment-length-mean 200 --fragment-length-sd 80   --no-bam-output --bam bowtie2.bam.for_rsem.bam /Trinity_trimmed.fasta.RSEM RSEM  died with ret: 65280 at ./align_and_estimate_abundance.pl line 729.

Any help? Should I use bowtie (1) or salmon?

Edit: I saw here https://groups.google.com/forum/#!topic/trinityrnaseq-users/LEBWTtzUKkI that you said you have to run --prep_reference first. I have that option in my command. Should I first run the command with --prep_reference and then the rest?
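
Since the error says the `.grp` file cannot be opened, a quick first check is whether that file exists at the reference prefix the command actually uses. The following is a minimal sketch. The function name `missing_reference_files` and the example prefix are hypothetical, and `.grp` is the only suffix taken from the log above; add other suffixes only if you know your RSEM version writes them.

```python
from pathlib import Path


def missing_reference_files(prefix, suffixes=(".grp",)):
    """Return the expected reference files that do not exist for this prefix."""
    expected = [Path(str(prefix) + suffix) for suffix in suffixes]
    return [str(path) for path in expected if not path.is_file()]


# Hypothetical prefix; replace it with the reference prefix from your own command.
prefix = "Trinity_trimmed.fasta.RSEM"
for path in missing_reference_files(prefix):
    print(f"missing: {path}")
```

If the file is reported missing, the reference was either not prepared or was written under a different path than the one passed to rsem-calculate-expression.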
