Project 4: Develop an interactive application to help understand alpha and beta diversity metrics choices #4

Open
abaghela opened this Issue Aug 2, 2017 · 11 comments

abaghela commented Aug 2, 2017

Develop an interactive application to facilitate informed sequencing quality control decisions for downstream analysis on many samples

There's a saying in computer science, "garbage in, garbage out": the quality of your input determines the quality of downstream analyses. Genome sequencing has decreased in cost, so experiments can include many more samples. Manually checking each sample is time consuming and less precise. So I propose developing a web application or tool where you can drop in your samples and interactively explore their quality.

This tool could be built by various means. One option would be to develop a Shiny R application, which would require knowledge of R, the Shiny package, and possibly HTML/CSS/JavaScript. Another would be to rely on web development standards (HTML/CSS/JS) to build something like an Electron application that is cross-platform and user friendly.

This idea stems from my experience dealing with 16S rRNA sequencing samples. I had a single experiment collect about 200 samples, for a total of about 400 samples with paired-end sequencing. Manually viewing all 400 is time consuming. Additionally, further analysis of sequencing reads typically requires some trimming, because quality diminishes toward the end of longer reads. This tool could also be designed to recommend an ideal trim length based on your specifications: either a hard threshold that trims all samples to the same length, or a dynamic threshold on a per-sample basis. The right choice will depend on whether the downstream tools can handle varying read lengths.
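
As a rough illustration of the hard versus per-sample trimming idea, here is a minimal base R sketch. The quality matrix, the sample names, the `trim_length` helper, and the cutoff of 25 are all made up for illustration; a real tool would read per-position quality from the sequencing files.

```r
# Hypothetical input: one row per sample, one column per read position,
# each cell the mean Phred quality at that position (values are invented).
quality <- rbind(
  sample_a = c(38, 37, 36, 34, 30, 26, 21, 18),
  sample_b = c(38, 38, 37, 36, 35, 31, 27, 22)
)

# Number of leading positions whose mean quality stays at or above min_q.
# The default cutoff of 25 is an arbitrary assumption, not a recommendation.
trim_length <- function(mean_q, min_q = 25) {
  below <- which(mean_q < min_q)
  if (length(below) == 0) length(mean_q) else below[1] - 1
}

# Dynamic threshold: each sample gets its own trim length.
per_sample <- apply(quality, 1, trim_length)

# Hard threshold: every sample is trimmed to the shortest per-sample length.
hard <- min(per_sample)

print(per_sample)
print(hard)
```

With these invented numbers the per-sample lengths differ by one position, and the hard threshold gives up that extra position in `sample_b` in exchange for uniform read lengths.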

Team Lead: Eric Leung | leunge@ohsu.edu | @erictleung | Grad Student | Oregon Health & Science University, USA |

erictleung commented Sep 10, 2017

Sooo I may have found someone's solution to my proposed project called MultiQC (GitHub link). It was published just over a year ago and is even more robust and has more functionality than just for my 16S rRNA use case. A quick Biostars/Google search could have saved me time 😅

@abaghela if you allow me, I have another proposition for a project I could lead that is specific to microbiome analysis. Let me know if you have any concerns with this new proposed project. Thanks.


Title: Develop an interactive application to help understand alpha and beta diversity metrics choices

Problem: There are many alpha and beta diversity metrics for analyzing microbial ecology or microbiome data. Alpha diversity describes the diversity within a single sample, for example an estimate of the total number of species in it. Beta diversity describes the differences between samples. Below are some examples of the number of metrics you can use.
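
To make "alpha diversity" concrete, here is a small base R sketch of three commonly used metrics: observed richness, the Shannon index, and the Simpson index (in its 1 - sum(p^2) form). The count vector and function names are invented for illustration; in practice you would more likely call an existing package than write these by hand.

```r
# Made-up counts for one sample; taxon names are placeholders.
counts <- c(taxon_a = 50, taxon_b = 30, taxon_c = 15, taxon_d = 5, taxon_e = 0)

# Observed richness: how many taxa were seen at least once.
observed_richness <- function(x) sum(x > 0)

# Shannon index: -sum(p * log(p)) over taxa that are present.
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

# Simpson index in the "1 - D" form: 1 - sum(p^2).
simpson <- function(x) {
  p <- x / sum(x)
  1 - sum(p^2)
}

print(observed_richness(counts))
print(shannon(counts))
print(simpson(counts))
```

Richness only counts which taxa are present, while Shannon and Simpson also respond to how evenly the counts are spread, which is one reason the metrics can rank the same samples differently.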

Plot from "Alpha diversity graphics" page for phyloseq showing various alpha diversity metrics to choose from http://joey711.github.io/phyloseq/plot_richness-examples

Below are just a few of the beta diversity metrics to choose from:

> library(phyloseq)
> unlist(distanceMethodList)
    UniFrac1     UniFrac2        DPCoA          JSD     vegdist1     vegdist2
   "unifrac"   "wunifrac"      "dpcoa"        "jsd"  "manhattan"  "euclidean"
    vegdist3     vegdist4     vegdist5     vegdist6     vegdist7     vegdist8
  "canberra"       "bray" "kulczynski"    "jaccard"      "gower"   "altGower"
    vegdist9    vegdist10    vegdist11    vegdist12    vegdist13    vegdist14
  "morisita"       "horn"  "mountford"       "raup"   "binomial"       "chao"
   vegdist15   betadiver1   betadiver2   betadiver3   betadiver4   betadiver5
       "cao"          "w"         "-1"          "c"         "wb"          "r"
  betadiver6   betadiver7   betadiver8   betadiver9  betadiver10  betadiver11
         "I"          "e"          "t"         "me"          "j"        "sor"
 betadiver12  betadiver13  betadiver14  betadiver15  betadiver16  betadiver17
         "m"         "-2"         "co"         "cc"          "g"         "-3"
 betadiver18  betadiver19  betadiver20  betadiver21  betadiver22  betadiver23
         "l"         "19"         "hk"        "rlb"        "sim"         "gl"
 betadiver24        dist1        dist2        dist3   designdist
         "z"    "maximum"     "binary"  "minkowski"        "ANY"
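
As a hedged illustration of why the choice matters, the base R sketch below computes two dissimilarities by hand for a pair of invented samples: Bray-Curtis, which uses abundances, and a presence/absence Jaccard distance. The sample vectors and function names are assumptions for this example only, and the hand-written functions are not meant to reproduce any package's defaults.

```r
# Two made-up samples over the same four taxa.
s1 <- c(90, 0, 5, 5)
s2 <- c(0, 10, 5, 5)

# Bray-Curtis dissimilarity: sum of absolute differences over total counts.
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Jaccard distance on presence/absence only:
# 1 - (taxa present in both) / (taxa present in at least one).
jaccard_presence <- function(x, y) {
  a <- x > 0
  b <- y > 0
  1 - sum(a & b) / sum(a | b)
}

print(bray_curtis(s1, s2))
print(jaccard_presence(s1, s2))

# Base R's dist() with method = "binary" applies the same presence/absence rule.
print(dist(rbind(s1, s2), method = "binary"))
```

The two samples share half of their taxa, so the presence/absence distance is 0.5, but the one dominant taxon in `s1` pushes Bray-Curtis well above that.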

With so many metrics to choose from, how do you know which is the "best" and how will your data affect the calculation of these metrics?

Proposed Project: Create an interactive Shiny application that shows how your chosen alpha or beta diversity metrics change with simulated or real data. Some of these metrics are sensitive to species counted only once or twice (singletons and doubletons), so it will be useful to see how different distributions of counts change these metrics and your interpretation of them. The application should be designed to give an intuitive understanding of how these metrics work.
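
One way to see this sensitivity is with the Chao1 richness estimator, which is built from the number of taxa seen exactly once (F1) and exactly twice (F2). The sketch below uses the bias-corrected form, S_obs + F1(F1 - 1) / (2(F2 + 1)); the three communities are invented and differ only in their rarest taxa.

```r
# Bias-corrected Chao1: observed richness plus a term driven by
# singletons (f1) and doubletons (f2).
chao1 <- function(x) {
  s_obs <- sum(x > 0)
  f1 <- sum(x == 1)
  f2 <- sum(x == 2)
  s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
}

# Made-up communities: same six taxa, only the three rarest counts differ.
communities <- list(
  no_singletons    = c(20, 10, 5, 2, 2, 2),
  two_singletons   = c(20, 10, 5, 1, 1, 2),
  three_singletons = c(20, 10, 5, 1, 1, 1)
)

observed  <- sapply(communities, function(x) sum(x > 0))
estimated <- sapply(communities, chao1)

print(rbind(observed, estimated))
```

Observed richness is 6 in every case, while the Chao1 estimate moves from 6 to 9 as a few counts drop from 2 to 1, which is the kind of behaviour the application could let people explore interactively.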

Possible Requirements:

  • Knowledge of R programming
  • Knowledge of (or willingness to learn) the Shiny R package
  • Local computer is sufficient to develop
  • RStudio installed (this will make it easier to make Shiny apps)
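
For anyone who has not used Shiny before, here is a minimal sketch of the kind of application this project has in mind; it is not the project's actual code. The `simulate_community` helper, the input names, and the geometric-series simulation are all assumptions made for this example. The app only launches in an interactive session with the shiny package installed.

```r
# Hypothetical simulation: relative abundances follow a geometric series.
# ratio = 1 gives a perfectly even community; smaller values add dominance.
simulate_community <- function(n_taxa, ratio) {
  p <- ratio^(seq_len(n_taxa) - 1)
  p / sum(p)
}

# Shannon index on counts or proportions.
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

if (interactive() && requireNamespace("shiny", quietly = TRUE)) {
  ui <- shiny::fluidPage(
    shiny::titlePanel("Shannon diversity on a simulated community"),
    shiny::sliderInput("n_taxa", "Number of taxa", min = 2, max = 50, value = 10),
    shiny::sliderInput("ratio", "Evenness (1 = even)", min = 0.1, max = 1, value = 0.7),
    shiny::plotOutput("abundance_plot")
  )

  server <- function(input, output) {
    output$abundance_plot <- shiny::renderPlot({
      p <- simulate_community(input$n_taxa, input$ratio)
      barplot(p,
              main = sprintf("Shannon index: %.3f", shannon(p)),
              xlab = "Taxon rank", ylab = "Relative abundance")
    })
  }

  shiny::runApp(shiny::shinyApp(ui, server))
}
```

Keeping the simulation and the metric as plain functions outside the app makes them easy to check on their own before wiring them to sliders.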

abaghela commented Sep 11, 2017

@erictleung Hi Eric, we approve your change in project. We are looking forward to this new one!

abaghela changed the title from "Project 4: Develop an interactive application to facilitate informed sequencing quality control decisions for downstream analysis on many samples" to "Project 4: Develop an interactive application to help understand alpha and beta diversity metrics choices" on Sep 11, 2017

ampatzia commented Oct 1, 2017

Assignments are out, really looking forward to collaborating on this 👍
@erictleung Need any help with preparation?

erictleung commented Oct 2, 2017

@ampatzia thanks for your interest! I've created a bare repository to put this project in. I plan on getting a base Shiny application up later this week so people can get up and running, along with some ideas of what could be in the application itself. If I come up with anything else, I'll let you know! 😄

erictleung commented Oct 4, 2017

Some good articles to use while working on this project will be http://shiny.rstudio.com/articles/. It has lots of content on getting started, building the structure, frontend and backend sides of the application, and improving it.

jakelever commented Oct 10, 2017

Hey team lead, we've been gathering Github IDs for your team members. We see that you've already started a repo for this project. So could you please add the following people as collaborators to that project?

aimirza
amanji
rnoronha00
ampatzia
vnsriniv

Once the people are added, it'd be a great idea to start a discussion on that repo with information to get your team members started (e.g. some small suggested reading, things to look up, etc). We will also be adding everyone to Slack and creating a specific channel for each project. This may be an easier way to communicate.

We'll forward on any remaining Github IDs through this issue.

Thanks, Jake
on behalf of the Hackseq organising committee

erictleung commented Oct 11, 2017

@jakelever thanks!

jakelever commented Oct 11, 2017

Hi, one more Github ID for you:

cabrerad

Thanks, Jake

jakelever commented Oct 13, 2017

And one last one: scatcher125

Cheers, Jake

erictleung commented Oct 14, 2017

@jakelever both added! Thanks for the update.

jakelever commented Oct 17, 2017

And actually one more Github ID: szhan
