Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is --outFilterMultimapNmax set to 1 OK for RSEM? #97

Closed
GIS-liuxl opened this issue Apr 24, 2017 · 6 comments
Closed

Is --outFilterMultimapNmax set to 1 OK for RSEM? #97

GIS-liuxl opened this issue Apr 24, 2017 · 6 comments

Comments

@GIS-liuxl
Copy link

Dear rpd team,

In RSEM paper, author mentioned that when using an alternative aligner (default Bowtie)

aligners must be configured to report all valid alignments of a read, and not just a single "best" alignment.

In our latest release (2017-01), we set --outFilterMultimapNmax to 1 for STAR, which means we only keep those uniquely mapped reads (am I right?). May I know it is OK for using RSEM downstream?

bless~

@justinjj24
Copy link
Contributor

Hi,
It's fine to use no big deal since we keep only uniquely mapped reads. STAR mapper uses bowtie2 aligner and provides the rsem compatible transcriptome bam file.

Justin

@GIS-liuxl
Copy link
Author

Hi @justinjj24,

Thank you very much! So, the transcriptome bam file actually provides multiple alignment information of a read in genome reference mapping even if we only keep the uniquely mapping read (in transcriptome mapping).

bless~

@justinjj24
Copy link
Contributor

No. In our pipeline star map the reads against the genome and keep only the uniquely mapped reads and not multimap or best alignment of the multimap, later convert the output genomic alignments in transcriptomic coordinates (Aligned.toTranscriptome.out.bam) for rsem quantification which only recognizes transcriptome coordinate based bam. So in both genomic and transcriptome bam contain only uniquely mapped reads.

But you may see NH flag in the transcriptome bam (coordinates are by transcripts, not genes) referring reads with multimap eg."NH:i:3" mapped to three different isoforms of the same gene and not map to different location in the genome.

@GIS-liuxl
Copy link
Author

Hi @justinjj24

Oh, I see. Maybe I can understand in this way: the "NH:i:3" case in our pipeline is the counterpart of "all valid alignments" in RSEM paper. The RSEM paper actually means alignment to genome when it mentioned "all valid alignments".

bless~

@justinjj24
Copy link
Contributor

RSEM paper suggests when you output multimap from other aligner and feed the bam to rsem for quantification it prefers to choose the best alignment based on their own method, not just the aligner chooses the best alignment by alignment quality of the mapped read.

@GIS-liuxl
Copy link
Author

@justinjj24

Thank you very much for your help!

bless~

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants