IUC Contribution Fest - Metagenomics Tools and Workflows #299

Closed
jmchilton opened this issue Oct 6, 2015 · 66 comments

Comments

@jmchilton
Member

Remote Contribution Fest happens on 30 Nov and 1 Dec

We are planning a Remote Contribution Fest on 30th of November and 1st of December for developers to work mainly on Galaxy metagenomic tools.

The remote nature of this gives people who don't have the opportunity to come to GCC hackathons (which have always been productive and a lot of fun) a chance to participate in a Galaxy hackathon. Having a well defined topic will allow us to accomplish a lot and let people who don't have particular tasks in mind find something to work on very quickly.

We are collecting ideas to work on, but we expect to attract the most participation by simply getting tool developers interested in getting help adding collection support to their existing tools and workflows to show up and participate.

If you are interested in participating in the hackathon but not in actual tool development, we will assemble a list of smaller, manageable Python and JavaScript tasks to work on. Documentation is also chronically lacking for collections, so we could use help there, and no actual coding would be required.

We encourage ideas or advice about how to organize this, so please let us know. A core group will be available on IRC all day, and we will have four Google Hangouts across those days to organize, answer questions, and report progress.

We will do our best to coordinate and make this hackathon a nice and productive experience and we would like to especially focus on working reasonable hours and discourage overnighters.

All forms of contribution are welcome!

Original idea from @bgruening (galaxyproject/tools-devteam#26 (comment)). See #239 for information on the last hackathon.

Things to work on

Google Maps link - please add yourself

https://www.google.com/maps/d/edit?mid=zQipez5uCQ8I.khoUIhze9Fek&usp

Google Hangout

https://hangouts.google.com/call/iems2ffkpd4s7cyjoq4mi3hgwma

IRC

https://wiki.galaxyproject.org/News/IRCPublicLogProposal

@hexylena
Member

hexylena commented Oct 6, 2015

👍 to biom, I'd be happy to do a phinch viz plugin/tool if a biom datatype was available

@jmchilton
Member Author

@erasche - add to the TODO list and sign yourself up 😄

@bgruening
Member

It would be lovely if we could get @jennaj on board and maybe create a workflow that we put into the Tool Shed and myexperiments.org.

@jennaj
Member

jennaj commented Oct 6, 2015

@bgruening Funny this should come up here - it was also brought up last Fri at a data hack meeting. Metagenomics is definitely on the table for the next cycle. Maybe we can combine forces :) I'll email the data hacks with a link to this.

@bgruening
Member

Thanks @jennaj!

@nekrut
Contributor

nekrut commented Oct 7, 2015

Perhaps we need a bit of a big picture. Improving individual tools is necessary, but we should focus on entire types of MG workflows (e.g., what are potential complete workflows for doing human microbiome studies?). This is best accomplished using actual biological projects. Can you guys make a list of the analyses you are currently supporting? I'll describe mine.

@lparsons
Contributor

lparsons commented Oct 7, 2015

@nekrut Great point and a very good first step. I'm looking to analyze 16S data to generate OTU percentages per sample and a PCA plot for diversity analysis, following a workflow similar to that outlined in http://nbviewer.ipython.org/github/biocore/qiime/blob/1.9.1/examples/ipynb/illumina_overview_tutorial.ipynb

@robldavidson

I like the idea of linking up the GalaxyScientists (@jennaj's data hackers) with the IUC for this - it would be good to have a project to foster some dialogue between these two groups.

@tiagoantao
Contributor

I am not a metagenomics person, but I have seen people struggling with the mothur tool (if you install it, you will see what I mean). My impression is that a few workflows would help a lot. If you look at the mothur documentation, you could easily get inspiration. For example here: http://www.mothur.org/wiki/Analysis_examples

PS - You could consider doing a RAD-seq hackathon in the future. I would like to help on that one...

@bebatut
Member

bebatut commented Oct 8, 2015

I'm currently developing a Galaxy instance dedicated to gut microbiota data processing (metagenomics and metatranscriptomics). I made some wrappers for tools such as HUMAnN, MetaPhlAn, SortMeRNA, Reago, PRINSEQ, for which I did not find wrappers fitting my requirements. The aim is to construct a workflow to process data from intestinal microbiota.
I would like to join you on this hackathon

@bgruening
Member

@bebatut this is awesome. Please join this hackathon if you like, and we could review your wrappers.
Btw.: SortMeRNA and PRINSEQ wrappers are already here

@tiagoantao excellent suggestion!

@nsoranzo
Member

nsoranzo commented Oct 8, 2015

@bebatut MetaPhlAn and SortMeRNA are also on the Tool Shed. Please do not reinvent the wheel; if something is missing, please contribute to the already available tools!

https://toolshed.g2.bx.psu.edu/view/rnateam/sortmerna/
https://toolshed.g2.bx.psu.edu/view/dannon/metaphlan/

@bgruening Should PRINSEQ also be published on the MTS, or is it not ready yet?
@dannon What about adding metaphlan wrapper code to the IUC github repo?

@bgruening
Member

@nsoranzo PRINSEQ needs some more love, but it was well on its way. Happy to work with @bebatut on this during the hackathon.

@bebatut
Member

bebatut commented Oct 8, 2015

@bgruening @nsoranzo I saw the wrappers for PRINSEQ and SortMeRNA, but they do not correspond to what I wanted. For example, not all the parameters are accessible with the SortMeRNA wrapper.
I developed a wrapper for MetaPhlAn 2, not for MetaPhlAn...

@bgruening
Member

@bebatut let's merge our wrappers during the hackathon and join (maintenance) forces!

@tiagoantao
Contributor

I think that there is a bug in the Kraken data manager.
Try the following:
admin > local data > Run data manager tools > Kraken

Select partial library to download > Human

Depending on my Galaxy installation, I get different errors. The most common is not being able to download the human chromosomes. The other options work well.

@nekrut
Contributor

nekrut commented Oct 9, 2015

https://test.galaxyproject.org/u/sjcc/w/16s

Here is the kind of stuff we are currently doing:

Input > Quality Trim (FASTQ Quality Trimmer) > Join R1 + R2 (fastq-join) > remove reads with ambiguous bases (Manipulate FASTQ) > remove chimeras (cd-hit-dup) > assign reads to taxon (Kraken) > filter Kraken by confidence level (Kraken-filter) > translate taxon id number into taxon name (Kraken-translate) > report Kraken results (Kraken-report)

Filter Kraken report to see what data you want (Filter) <-- I use this to isolate a taxonomic level (i.e. P, C, O, F, G, S)
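
For readers unfamiliar with the report layout, the rank filtering step can be sketched in a few lines of Python. This is only an illustration, not what the Galaxy Filter tool runs. It assumes the standard six-column, tab-separated Kraken report (percentage, reads in clade, reads assigned directly, rank code, NCBI taxonomy ID, indented name). The function name `filter_report_by_rank` and the example read counts are made up.

```python
import csv
import io

# Rank codes used in the fourth column of a Kraken report.
RANK_CODES = {"U", "D", "K", "P", "C", "O", "F", "G", "S", "-"}


def filter_report_by_rank(lines, rank):
    """Yield (percent, clade_reads, taxid, name) for rows at the given rank.

    Hypothetical helper for illustration; assumes the six-column report layout.
    """
    if rank not in RANK_CODES:
        raise ValueError(f"unknown rank code: {rank!r}")
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 6:
            continue  # skip blank or malformed lines
        percent, clade_reads, _direct_reads, code, taxid, name = row[:6]
        if code == rank:
            # Names are indented to show tree depth; strip that padding.
            yield float(percent), int(clade_reads), int(taxid), name.strip()


# Tiny report with made-up read counts, for illustration only.
EXAMPLE_REPORT = (
    " 10.00\t10\t10\tU\t0\tunclassified\n"
    " 90.00\t90\t0\t-\t1\troot\n"
    " 60.00\t60\t5\tP\t1239\t    Firmicutes\n"
    " 30.00\t30\t2\tP\t976\t    Bacteroidetes\n"
    " 25.00\t25\t25\tG\t816\t          Bacteroides\n"
)

phyla = list(filter_report_by_rank(io.StringIO(EXAMPLE_REPORT), "P"))
for percent, reads, taxid, name in phyla:
    print(f"{name}\t{taxid}\t{reads}\t{percent:.2f}")
```

Passing "G" or "S" instead of "P" would isolate genus or species rows in the same way.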

Rarefaction analysis (Vegan Rarefaction)
Diversity (Vegan Diversity)
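
As a rough picture of what the last two steps compute, here is a minimal Python sketch of the Shannon index and of rarefying a sample by random subsampling without replacement. It is not the Vegan implementation (Vegan is an R package and its results may differ in detail). The function names and the per-taxon counts are assumptions made up for the example.

```python
import math
import random


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement; return per-taxon counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand counts into one taxon index per read, then draw without replacement.
    reads = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    picked = rng.sample(reads, depth)
    rarefied = [0] * len(counts)
    for taxon in picked:
        rarefied[taxon] += 1
    return rarefied


# Made-up per-taxon read counts for one sample.
sample = [50, 30, 15, 4, 1]
rng = random.Random(42)  # fixed seed so the subsampling is repeatable
for depth in (10, 50, 100):
    sub = rarefy(sample, depth, rng)
    richness = sum(1 for c in sub if c > 0)
    print(depth, richness, round(shannon_index(sub), 3))
```

Plotting observed richness against depth for several depths gives a simple rarefaction curve for the sample.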

@bgruening
Member

@tiagoantao can you move this bug report to a proper issue, so it does not get lost?
@nekrut thanks, very useful. We could work on materializing this into a TS workflow with documentation. Anything on your wishlist, preferably an "easy" tool without *.loc files?

@jmchilton
Member Author

I think we as the Galaxy community should also consider a Brew-related hackathon (https://github.com/Homebrew/homebrew-science/issues/2876), and I would like it if we could schedule one or the other for November. Any particular thoughts on whether we should do this one in November and the Homebrew one in January, or vice versa?

@bgruening
Member

November for the metagenomic Codefest and January for the next. Until then we can make the brew idea more concrete.

@bgruening
Member

What about the 5th and 6th of November? Too early?

@jmchilton
Member Author

@bgruening I'll be at a conference on the 5th and traveling on the 6th. I don't know if Anton intends to participate directly in the hackathon, but he will be at the same conference. Obviously I have much less to contribute to a metagenomics hackathon than to a collections one, though, so feel free to continue without me. I could do the 19th and 20th or the 9th and 10th.

@jennaj
Member

jennaj commented Oct 16, 2015

I second Nov 9th and 10th. Some of us who are not otherwise in PA might be there then, which could be nice as long as the primary reason for being there does not conflict (it occurs later in the week).

@blankenberg
Member

BIOM xref: galaxyproject/galaxy#941

@lparsons
Contributor

I could do either of: 5th and 6th, 9th and 10th, or 19th and 20th. That said, the 9th and 10th are looking rather nice.

@jmchilton
Member Author

@bgruening Can you do the 9th and 10th?

@bgruening
Member

If so, I will be sitting next to you and will have a hard time motivating people from Freiburg remotely.
But yes, I'm ok with this.

@jmchilton
Member Author

Oh - yeah I suppose that might not be the best use of your limited time here. Forgot about that.

@bgruening
Member

30th of November and 1st of December?

@jmchilton
Member Author

This works for me.

@bgruening
Member

@IyadKandalaft what do you think? #419

@oxyko

oxyko commented Nov 26, 2015

@bgruening My concrete questions are:

  1. Is there going to be a kick-off meeting over the hangouts?
  2. How do we join the four Google Hangouts mentioned above on the day of the hackathon?

@bgruening
Member

  1. A kick-off meeting is hard because we span so many time zones. If you like, I will give an introduction and show a little bit of Planemo. Just let me know.
  2. There will be only one hangout. I will post a link on Friday or over the weekend, and everyone can connect (let's see how many people can join such a hangout :))

@oxyko

oxyko commented Nov 26, 2015

Sounds great! I would appreciate a bit of an intro, since I haven't done remote hackathons before.

@bgruening
Member

Sure, we can do this. In general don't hesitate to ask questions at any time.

@yvanlebras
Contributor

Hi everyone,

just an update concerning:
1/ Mothur. We have, in the GUGGO Tool Shed, quite old Mothur (1.32) toolshedized tools developed by colleagues from the CNRS / OSUR Biogenouest Environmental genomics core facility in partnership with GenOuest.

I don't know if this can be, totally or in part, of interest to the Mothur community, but I remember that the engineer who worked on it, Mathieu Bahin, made large improvements to the existing (old) Mothur Galaxy tools.... Maybe someone from the "Mothur Galaxy" community (@IyadKandalaft ?) can efficiently evaluate this?
An introduction (to remote hackathons and/or to metagenomics pipelines) would be of interest.

2/ Qiime. The Ifremer colleagues have developed, totally from scratch if I am not mistaken, Galaxy tools using Qiime 1.9. They used the qiimetogalaxy script to automatically create the basis of the Galaxy descriptors.... Once again, maybe this can be of interest to the "Qiime Galaxy" community (@lparsons).... or not ;)

@pjbriggs
Contributor

Apologies for arriving late: I've now got the okay from my superiors to contribute to the toolfest next week. We have a number of users who would like to use Qiime within Galaxy so I'd like to offer effort towards making an "official" toolshed version of the Qiime tools as originally proposed by @lparsons.

We have a list of specific Qiime components that we're interested in, all of which already have versions in Lance's github repo. So I'll be available to contribute general programming effort to work on any identified by the organisers (if there is such a list). However, if there's a better way to contribute then please let me know - mainly we just want to help move this forward.

Aside from that: a while ago I developed a data manager for the Mothur toolsuite, which is available on the test toolshed but never got developed further. If this is of interest to anyone here, again please let me know.

@bgruening
Member

@yvanlebras nice, I have added it to: #419
@pjbriggs I have added the data manager to the list of potential TODOs: #419
@pjbriggs I guess you and @lparsons can team up for a Qiime team :)

@yvanlebras
Contributor

@erasche @jmchilton the biom datatype proposed for the FROGS pipeline by @fescudie, who will participate in the hackathon, is maybe good enough to start on a first phinch viz plugin/tool?

@tiagoantao
Contributor

Just a side comment from someone who does not do metagenomics but sometimes helps other people doing it: the mothur tool is very, very complex to use at this stage. My impression is that a couple of well-thought-out workflows would help a lot (even if the tool keeps its current format).

For example an implementation of some of their SOPs:
http://www.mothur.org/wiki/MiSeq_SOP
You can find a list here:
http://www.mothur.org/wiki/Analysis_examples

@IyadKandalaft

As one of the developers on the galaxy mothur tools, I agree that we need to include some workflows. We have some workflows in-house that are based off the SOP. I'd consider adding them into a separate repository with a dependency on mothur.

@bgruening
Member

@shujianbu do you have time tomorrow to help with Phinch integration?

@yvanlebras
Contributor

Hi everyone,

Just a few slides that we will use to briefly present the initiative, our pipelines and some To-Dos.... : Presentation slides

I will update this document during the hackathon event. Don't hesitate to comment / ask questions. In France this will begin in 17 minutes ;)

@bgruening
Member

I updated the original post with links to a google hangout and our IRC channel, feel free to join!

@shujianbu

Hi Bjorn,

What's the context/scope of the project integration? I haven't maintained the Phinch code for a while.

Thanks,


@bgruening
Member

@shujianbu we want to integrate Phinch into Galaxy. The idea is to put Phinch in a Docker container and integrate it into Galaxy. Every biom dataset that is created in Galaxy can then be visualised with Phinch without leaving the analysis platform.

Is/will Phinch be maintained? I have had a PR pending for a few months.

@bgruening
Member

@yvanlebras awesome. When do you start?

I will also update the main issue above regularly with news. Here is our first PR that adds the Mash package: #427
Anyone up to write the wrappers?

@bgruening
Member

@bebatut #428

@bgruening
Member

Work will continue in various branches, but I will close this one :)
A summary of this event can be found here: https://gist.github.com/bgruening/734ff541e9d1b02f7b09
