Add support to hisat2 #221

Closed
pgonzale60 opened this Issue May 26, 2016 · 17 comments

pgonzale60 commented May 26, 2016

Hi Phil,

According to JHU "HISAT2 is a successor to both HISAT and TopHat2". It has the option of giving a report as report.txt, but I think the most clearly useful information is what it outputs to stderr:

hisat2_PE.txt
hisat2_SE.txt

I think the most important statistic is the alignment rate, although the alignment % broken down by pairing would also be nice.

Bests,

Pablo

@ewels ewels added the module: new label May 27, 2016

Owner

ewels commented May 27, 2016

Hi Pablo,

Thanks for the suggestion - remarkably, the two stderr outputs that you attach are identical in format to those produced by Bowtie2. So MultiQC actually already works with HISAT2; however, it will label results as coming from Bowtie2.

These logs are really difficult for MultiQC to understand (I actually wrote a blog post on this topic yesterday). From what I can see, there is no way to distinguish these HISAT2 logs from Bowtie2 logs, or to tell what input file they came from. Hopefully the reports that MultiQC generates will still be useful though - see this example which is based on your example files.
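For illustration, here is a minimal Python sketch of pulling the overall alignment rate out of a Bowtie2-style stderr summary. The sample log text and its numbers are made up for the example (they are not taken from the attached files), and `overall_alignment_rate` is a hypothetical helper, not MultiQC's actual parser.

```python
import re

# Assumed Bowtie2-style single-end summary; the numbers are illustrative only.
SAMPLE_LOG = """\
10000 reads; of these:
  10000 (100.00%) were unpaired; of these:
    650 (6.50%) aligned 0 times
    8823 (88.23%) aligned exactly 1 time
    527 (5.27%) aligned >1 times
93.50% overall alignment rate
"""

# Matches the final summary line, e.g. "93.50% overall alignment rate".
OVERALL_RATE = re.compile(r"^\s*([\d.]+)% overall alignment rate", re.MULTILINE)


def overall_alignment_rate(log_text):
    """Return the overall alignment rate as a float, or None if the line is absent."""
    match = OVERALL_RATE.search(log_text)
    return float(match.group(1)) if match else None


print(overall_alignment_rate(SAMPLE_LOG))
```

Note that nothing in this text names the tool or the input file, which is exactly the limitation described above.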

The report.txt file would be much easier to parse; however, I can't see any useful fields within that file (input files, alignment rates).

If you have any ideas on how to improve this situation, let me know!

Phil

Owner

ewels commented May 27, 2016

I've asked the authors of HISAT2 whether the reporting of summary statistics can be improved. See infphilo/hisat2#48

pgonzale60 commented May 27, 2016

That's great! Seems like it will be possible in the near future.

Also, I checked the open and closed issues of bowtie and didn't find a reason to separate each sample summary in individual tracks. Wouldn't it be easier to inspect them if they were aggregated as in the featureCounts and STAR modules?

Owner

ewels commented May 27, 2016

Yup, fingers crossed! 👌

When you say "separate each sample into individual tracks" - do you mean the bar plots? This is true in the example above because one is paired end and one is single end. The categories in the logs (and plots) are different, and not really comparable as I do some nasty stuff with halving the counts of individual mate alignments for PE alignments. Typically samples will be all SE or all PE, then they will be grouped into a single plot.
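As a rough sketch of the halving idea (an assumption about the arithmetic, not the module's actual code): in a paired-end log, pairs that do not align as pairs are reported per mate, so each such pair contributes two mates. Halving the mate counts puts them on the same per-pair scale as the other categories. The category names and counts below are hypothetical.

```python
def pair_level_counts(pair_counts, mate_counts):
    """Combine pair-level and mate-level counts on a common per-pair scale.

    Mate counts are halved because every pair reported at the mate level
    contributes two mates to the log.
    """
    combined = dict(pair_counts)
    for category, n_mates in mate_counts.items():
        combined[category] = n_mates / 2
    return combined


# Illustrative numbers only: 1000 pairs in total, 80 of which are reported per mate.
pairs = {"concordant_once": 800, "concordant_multi": 100, "discordant_once": 20}
mates = {"mate_once": 30, "mate_multi": 10, "mate_unaligned": 120}

combined = pair_level_counts(pairs, mates)
print(combined)
print(sum(combined.values()))
```

With the halving applied, the categories sum back to the total number of pairs, so they can share one bar in a plot. Single-end logs have no such step, which is why the two kinds of sample are not directly comparable.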

Phil

pgonzale60 commented May 27, 2016

Sorry, I didn't know. I promise I won't ask for a modification without trying it with a "real" example. Thanks!

Bests,
Pablo

Owner

ewels commented May 27, 2016

hah, no problem! Better to ask :)

Owner

ewels commented May 30, 2016

Hi @pgonzale60,

I'll close this issue until new logs are available, then we can reopen.

Phil

@ewels ewels closed this May 30, 2016

@ewels ewels reopened this May 31, 2017

Owner

ewels commented May 31, 2017

HISAT2 summary stats output now updated, see infphilo/hisat2#48 (comment)

Now just waiting for the release..

pgonzale60 commented Jun 16, 2017

Hi, Phil!

Nice to see this happening! The update seems to be officially released (https://ccb.jhu.edu/software/hisat2/index.shtml). Could you please add support for the new summary?

Bests,
Pablo

Owner

ewels commented Jun 16, 2017

Yup, I saw 😁 I've just installed the new version, need to do some test runs to generate some log output and then I'll see if I can write up the module. If you have any log files already generated from the new version that would be helpful!

Phil

Owner

ewels commented Jun 16, 2017

ah, just re-read the release summary:

  • Implemented --new-summary option to output a new style of alignment summary, which is easier to parse for programming purposes.
  • Implemented --summary-file option to output alignment summary to a file in addition to the terminal (e.g. stderr).

So the new release still doesn't give a nice log file output by default 😞
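Since the new summary has to be requested explicitly, a pipeline would need to pass both options on every run. A minimal sketch, assuming a single-end run; `hisat2_command` and all file names are hypothetical, and the list is only built here, not executed:

```python
def hisat2_command(index, reads, sam_out, summary_path):
    """Build a HISAT2 argument list that requests the new-style summary
    and writes it to a file as well as to stderr."""
    return [
        "hisat2",
        "-x", index,          # basename of the HISAT2 index
        "-U", reads,          # unpaired (single-end) reads
        "-S", sam_out,        # SAM output file
        "--new-summary",      # new, easier-to-parse summary style
        "--summary-file", summary_path,
    ]


cmd = hisat2_command("genome_index", "sample1.fq.gz", "sample1.sam", "sample1.hisat2.txt")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Writing one summary file per sample, named after the sample, also gives a report tool something to infer the sample name from.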

ewels added a commit to ewels/MultiQC_TestData that referenced this issue Jun 16, 2017

pgonzale60 commented Jun 17, 2017

Yes, that's not so cool. Still, this will allow MultiQC to distinguish hisat2 from bowtie2 :)
I see you added example output from single end libraries. I attach an example of paired end output.

examp_hisat2_newSummary-PE.txt
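A sketch of how the two log styles might be told apart. The marker strings are assumptions about what each summary contains (a `HISAT2 summary stats:` header in the new style, a `reads; of these:` first line in the old Bowtie2-like style) and should be checked against real output; `guess_aligner` is a hypothetical helper.

```python
# Assumed marker strings; verify against real log files before relying on them.
NEW_STYLE_MARKER = "HISAT2 summary stats:"
OLD_STYLE_MARKER = "reads; of these:"


def guess_aligner(log_text):
    """Guess which tool produced a summary, based on assumed marker strings."""
    if NEW_STYLE_MARKER in log_text:
        return "HISAT2"
    if OLD_STYLE_MARKER in log_text:
        # The old-style summary is shared by both tools, so it stays ambiguous.
        return "Bowtie2 or HISAT2 (old-style summary)"
    return None


print(guess_aligner("HISAT2 summary stats:\n..."))
print(guess_aligner("10000 reads; of these:\n..."))
```

The new-style check comes first so that a log carrying the explicit header is never reported as ambiguous.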

Owner

ewels commented Jun 19, 2017

Great stuff, thanks!

ewels added a commit to ewels/MultiQC_TestData that referenced this issue Jun 19, 2017

@ewels ewels added this to Ready to be worked on in New Modules Jun 29, 2017

@ewels ewels closed this in 552e341 Jul 5, 2017

@ewels ewels removed this from Ready to be worked on in New Modules Jul 5, 2017

zmiimz commented Apr 27, 2018

If the logs are indistinguishable, would it not be clearer to add an appropriate header description like “Bowtie2 / Hisat2”?

Owner

ewels commented May 18, 2018

Yes, we could do I suppose. Do you think this is a big enough problem to warrant a change?

zmiimz commented May 21, 2018

I think that the clarity of the MultiQC report is worth any change in the code.

Owner

ewels commented May 24, 2018

Ok 👍 Added as a new issue: #765
