- Gene expression time course in human cells (here)
- Sandra's dataset
- Options for different distance/dissimilarity measures
- Options for different types of clustering (hierarchical or K-partitioning)
- Produce a "silhouette plot" (http://www.sciencedirect.com/science/article/pii/0377042787901257?via%3Dihub) to diagnose if the chosen number of clusters makes sense
- More interactivity between graphs, as we discussed: choosing a gene to highlight in the graph, clicking one graph to trigger an action that changes an adjacent graph, etc.
- GO enrichment
- motif enrichment
- For Arabidopsis, potential TF binding enrichment using the DAP-seq dataset?
- There is also the following database, which can be useful for plant TFs: http://plantregmap.cbi.pku.edu.cn/tf_enrichment.php
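
As a rough illustration of the "different distance/dissimilarity measures" option above, here is a minimal sketch in plain Python. The function names and the `DISTANCES` lookup table are hypothetical, chosen only for this example; Euclidean distance and 1 - Pearson correlation are two common choices for expression profiles, not a decision the notes have made.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_dissimilarity(a, b):
    """1 - Pearson correlation: 0 for identical shape, 2 for opposite shape."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("constant profile: correlation is undefined")
    return 1.0 - cov / math.sqrt(var_a * var_b)


# Hypothetical lookup table that a UI drop-down could map onto.
DISTANCES = {"euclidean": euclidean, "pearson": pearson_dissimilarity}


def distance_matrix(profiles, measure="euclidean"):
    """Full pairwise dissimilarity matrix for a list of profiles."""
    dist = DISTANCES[measure]
    return [[dist(p, q) for q in profiles] for p in profiles]
```

Correlation-based dissimilarity groups genes by the shape of their time course regardless of magnitude, whereas Euclidean distance is also sensitive to absolute expression level, which is one reason to expose the choice to the user.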
What can we add to these already existing apps?
https://kcvi.shinyapps.io/START/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5291987/
http://biit.cs.ut.ee/clustvis/
Interactive heatmaps: https://blog.rstudio.com/2015/06/24/d3heatmap/
Coupled events in plots: https://plot.ly/r/shiny-coupled-events/
(Essential functionality is highlighted in bold)
-
Input - restrict/allow options based on the input organism
- if time left: implement solutions for non-model species
-
We start with a DEG list (generated by some other tool; provide links to tools)
- supported types of input data: FPKM, TPM, log FC, Z-score
- data transformation
- generate heatmap (different methods)
- cluster genes by expression patterns
- support options for clustering (number of clusters, distance measure, etc)
- silhouette plots to verify/diagnose the number of clusters
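
To make the silhouette diagnostic concrete, the following is a minimal sketch of the per-gene silhouette width s(i) = (b - a) / max(a, b), where a is the mean distance to the other members of the gene's own cluster and b is the smallest mean distance to any other cluster. It assumes Euclidean distance and Python 3.8+ (for `math.dist`); the function name `silhouette_widths` is hypothetical. A silhouette plot is these widths drawn as bars, grouped by cluster.

```python
import math


def silhouette_widths(points, labels):
    """Return the silhouette width of every point, in input order."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    widths = []
    for i, p in enumerate(points):
        # Mean distance from point i to each cluster, excluding itself.
        mean_dist = {}
        for c in clusters:
            members = [q for j, q in enumerate(points)
                       if labels[j] == c and j != i]
            if members:
                mean_dist[c] = sum(math.dist(p, q) for q in members) / len(members)
        own = labels[i]
        if own not in mean_dist:
            # Singleton cluster: s(i) is set to 0 by convention.
            widths.append(0.0)
            continue
        a = mean_dist[own]
        b = min(d for c, d in mean_dist.items() if c != own)
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths


# Toy example: two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_widths(points, [0, 0, 1, 1])
bad = silhouette_widths(points, [0, 1, 0, 1])
print(sum(good) / len(good), sum(bad) / len(bad))
```

Widths near 1 indicate a well-placed gene, widths near 0 a gene between clusters, and negative widths a gene that is probably in the wrong cluster. Comparing the average width across several candidate cluster counts is one way to choose the number of clusters.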
-
Interactivity
- search by gene name and highlight the gene in its row
- table for genes with name, annotation, expression values
- support custom annotation for non-model species (gff files)
- GO enrichment analysis on a cluster - multiple methods/algorithms
- motif/promoter prediction for clusters (genome sequence required)
- extract/highlight TFs from the cluster (annotation required) (*)
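
As one possible method for the GO enrichment step above, here is a minimal sketch of a one-sided hypergeometric test (equivalent to a one-sided Fisher's exact test) for a single GO term in a single cluster. This is an assumption about what "enrichment" could mean here, not a method the notes have settled on; the function name and parameters are hypothetical. It needs Python 3.8+ for `math.comb`.

```python
from math import comb


def hypergeom_enrichment_pvalue(k, cluster_size, annotated, background):
    """P(X >= k): probability of seeing at least k annotated genes in a
    cluster of `cluster_size` genes drawn without replacement from
    `background` genes, of which `annotated` carry the GO term."""
    if not 0 <= k <= cluster_size <= background or not 0 <= annotated <= background:
        raise ValueError("inconsistent counts")
    total = comb(background, cluster_size)
    upper = min(cluster_size, annotated)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible configurations contribute nothing to the tail.
    tail = sum(
        comb(annotated, i) * comb(background - annotated, cluster_size - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Toy example: 10 background genes, 3 annotated, cluster of 3, all 3 annotated.
print(hypergeom_enrichment_pvalue(3, 3, 3, 10))
```

In practice the test would be run for every GO term in every cluster, so the resulting p-values would need a multiple-testing correction before being shown to the user.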
-
Recover results
- get code to repeat everything with reduced interactivity, buttons for single methods
- save graphs
- save table outputs (clustering results + expression + annotation)
-
Future work:
- open an issue and create a branch: issue+N
- work on one issue at a time
Data