Add Salmon and/or Kallisto #171

Closed
ewels opened this issue Mar 17, 2019 · 13 comments
Labels
feature-request help wanted Extra attention is needed

Comments

@ewels
Member

ewels commented Mar 17, 2019

It's been at the back of my mind for a while now that it would be nice to add the option to run Salmon and/or Kallisto. This could be instead of STAR/HISAT2 + featureCounts, or even in addition to them.

Looking through forks of this pipeline, some people have already implemented this, notably @kerimoff with Salmon (diff here) and @lconde-ucl with Kallisto (diff here).

Thoughts and feedback welcome! Should this be a separate pipeline (rnaquant?) or would it be good in this rnaseq pipeline?

@ewels ewels added feature-request help wanted Extra attention is needed labels Mar 17, 2019
@kerimoff

@ewels thanks for opening this issue. I was going to talk to you about this, but I thought it was not ready yet.
I have done quite a bit of work to add:

  • Salmon (transcript expression)
  • DEXSeq (exon expression)
  • LeafCutter (splicing expression)
  • TxRevise (customized Salmon index quantification); ignore this for now :)

It works at the moment, and we use it on our local HPC. But it needs some more work, such as:

  • a single container (I did not change the main rnaseq container, and used a specific container per process)
  • CI stuff
  • and some cosmetic changes

As for whether to add a separate pipeline: I don't think a new pipeline is needed. These new quantification methods can be added as optional features to the existing pipeline. So, if it is run without any of the "new" flags (e.g. "--run-transcript-quant"), it will run only gene_expression, as it does now.
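The opt-in behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the pipeline's actual code: the flag name `--run-transcript-quant` comes from the comment above, while `build_parser`, `planned_steps`, and the step names are hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical command-line interface; only the flag name is taken
    # from the discussion, everything else is an assumption.
    parser = argparse.ArgumentParser(description="Sketch of opt-in quantification")
    parser.add_argument(
        "--run-transcript-quant",
        action="store_true",
        help="also run transcript-level quantification (off by default)",
    )
    return parser


def planned_steps(argv):
    """Return the list of quantification steps that would run for argv."""
    args = build_parser().parse_args(argv)
    # Gene-level quantification always runs, as it does today.
    steps = ["gene_expression"]
    # The new method is added only when the user explicitly asks for it.
    if args.run_transcript_quant:
        steps.append("transcript_quant")
    return steps


if __name__ == "__main__":
    print(planned_steps([]))
    print(planned_steps(["--run-transcript-quant"]))
```

Because the flag defaults to off, existing invocations keep their current behaviour, which is the point being made above.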

P.S.: I wish I could come to the Tübingen Hackathon to discuss this but, unfortunately, I am too busy developing a QTL mapping pipeline :D

@olgabot
Contributor

olgabot commented Apr 2, 2019

Ooh, DEXSeq and LeafCutter! How exciting! +1 for beyond-the-gene-model

@lpantano
Contributor

Hi there!

@kerimoff, I would really like to have Salmon here. Is there a branch with the tools you already have, even if they are not part of the container?

I would be happy to help in any way to get this done; I would love to run this pipeline with that option.

Thanks

@kerimoff

Hi @lpantano,
Check out https://github.com/kerimoff/rnaseq
As I mentioned above, it is ready to use with container support. We use Salmon only for transcript expression, but you can sum the transcript-level estimates to get gene expression.

Let me know if you have further questions,
Enjoy! ;)
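
The suggestion above, summing transcript-level values to obtain gene-level expression, can be sketched as follows. This is a minimal illustration and not code from the fork: the function name `sum_to_gene_level`, the transcript-to-gene mapping, and all IDs and values are hypothetical.

```python
from collections import defaultdict


def sum_to_gene_level(transcript_values, tx2gene):
    """Aggregate transcript-level values into gene-level totals.

    transcript_values: dict mapping transcript ID -> estimated abundance
    tx2gene:           dict mapping transcript ID -> gene ID
    Transcripts without a gene assignment are skipped.
    """
    gene_values = defaultdict(float)
    for transcript_id, value in transcript_values.items():
        gene_id = tx2gene.get(transcript_id)
        if gene_id is None:
            continue  # no known parent gene for this transcript
        gene_values[gene_id] += value
    return dict(gene_values)


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    values = {"tx1": 10.0, "tx2": 5.5, "tx3": 2.0}
    mapping = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
    print(sum_to_gene_level(values, mapping))
```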

@lpantano
Contributor

Cool, thanks!

Is the idea still to integrate it into this repo?

@kerimoff

> Cool, thanks!
>
> Is the idea still to integrate it into this repo?

As soon as possible ;)

@olgabot
Contributor

olgabot commented Jun 3, 2019

Hello! I was in the middle of adding HTSeq because I thought it solved the problem of the union exon model. However, it doesn't:

> In the case of RNA-Seq, the features are typically genes, where each gene is considered here as the union of all its exons.

(from the HTSeq documentation)
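
To make the union exon model in the quoted passage concrete, here is a small sketch that collapses the exons of all of a gene's transcripts into one merged set of intervals. It is an illustration only, not how HTSeq is implemented; the function name `union_exon_length` and the coordinates are hypothetical, and intervals are assumed to be half-open `(start, end)` pairs.

```python
def union_exon_length(exons):
    """Return the total length covered by the union of exon intervals.

    exons: iterable of (start, end) pairs, half-open, from any number of
    transcripts of the same gene. Overlapping exons are merged, so a base
    shared by several isoforms is counted once.
    """
    merged = []
    for start, end in sorted(exons):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return sum(end - start for start, end in merged)


if __name__ == "__main__":
    # Two isoforms sharing part of their first exon (made-up coordinates).
    isoform_1 = [(100, 200), (400, 500)]
    isoform_2 = [(150, 300), (400, 500)]
    print(union_exon_length(isoform_1 + isoform_2))
```

Under this model, reads are assigned to the gene as a whole, so isoform-specific differences are not visible in the counts, which is the limitation raised in this comment.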

Since I was already working on upgrading the gene expression quantifier, have used Salmon before, AND @kerimoff already has it working, I can integrate that fairly quickly. I have not used RSEM before and would need more time to integrate it. What do you think?

@apeltzer
Member

apeltzer commented Jun 3, 2019

👍 from me! It's also nice because there are MultiQC module(s) for both Salmon and RSEM, making the reporting very useful. If there is something missing in these modules, we could add that as well.

@apeltzer
Member

apeltzer commented Jun 3, 2019

Also, Bioconda recipes are available for both. As a start, you could create a PR that updates the environment again, dropping HTSeq but including Salmon + RSEM.

@lpantano
Contributor

lpantano commented Jun 3, 2019

I agree this is a very important feature. I did this in my repo as well while I was waiting to see what happens here. Whoever can do it first should do it. And if one tool is quicker to integrate than the other, I would do that one first.

@apeltzer
Member

apeltzer commented Jun 3, 2019

I guess you could work on adding it jointly if you already have experience with it. I'm on vacation fairly soon, so I won't be able to help much from June 7th to the 20th, but I'm happy to review afterward. Also, I think there are a couple of others who could have a look and review while I'm gone 👍

@lpantano
Contributor

lpantano commented Jun 3, 2019

@olgabot, I am totally OK with you doing it, but I would do it in another PR, adding one feature at a time, if that makes sense. Happy to help if needed. I have it added like this here:

let's do it!

@drpatelh
Member

drpatelh commented Jul 8, 2019

Added Salmon in #221

@drpatelh drpatelh closed this as completed Jul 8, 2019