GitHub - danknights/mwas: Perform microbiome-wide association studies with a variety of models

Package Name	Developers	Date
MWAS R Package v0.9.3	Hu Huang, Emmanuel Montassier, Pajau Vangay, Gabe Al Ghalith, Dan Knights	03-01-2015

The MWAS (microbiome-wide association study) package is an R-based toolbox for microbiome studies, developed by members of the Knights Lab at the University of Minnesota, Twin Cities. It provides three main functional modules: learning a predictive model, making predictions on unknown microbiome data, and visualizing the results. The latest update is version 0.9.3 (03-2015).

MWAS is developed in R, but it also provides a Unix command-line interface as a simplified option for those who are not familiar with R.

(Examples are in preparation. Full document will be available soon.)

1. Installing MWAS

Click the above link (section title) for more details.

Use the following command to set MWAS_DIR in the Terminal (or an equivalent command window; /MWAS_directory should be your actual directory):
echo "export MWAS_DIR=$HOME/MWAS_directory" >> ~/.bash_profile
If the required packages cannot be installed or loaded automatically, you may need to install the dependencies separately. Most dependencies are installed when the corresponding function is first run; the one exception is the optparse package. Follow the steps below to install it (this is needed only before you use the command-line functions for the first time):
- Open the R console in Terminal (or use RStudio).
- Install the package: install.packages("optparse")
- You should now be able to use the MWAS functions.
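
The variable written to ~/.bash_profile only becomes visible in a new terminal session, or after running `source ~/.bash_profile` in the current one. As a quick sanity check before running any command-line function, something like the following sketch could be used. The function name `check_mwas_dir` is hypothetical and not part of MWAS; it only assumes that the analysis script lives at bin/mwas_analysis.R under MWAS_DIR, as in the command-line example in this README.

```shell
# Hypothetical helper (not part of MWAS): verify that MWAS_DIR is set
# and points at a directory containing bin/mwas_analysis.R.
check_mwas_dir() {
  if [ -z "${MWAS_DIR:-}" ]; then
    echo "MWAS_DIR is not set" >&2
    return 1
  fi
  if [ ! -f "$MWAS_DIR/bin/mwas_analysis.R" ]; then
    echo "bin/mwas_analysis.R not found under $MWAS_DIR" >&2
    return 1
  fi
  echo "MWAS_DIR looks OK: $MWAS_DIR"
}
```

Calling `check_mwas_dir` returns a non-zero status and prints a message to stderr when the setup is incomplete, so it can also guard scripts, for example `check_mwas_dir && Rscript ...`.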

(Detailed testing information is available here.)

2. MWAS "learn" Module

Command-line version (in Terminal)
Rscript $MWAS_DIR/bin/mwas_analysis.R -w learn -M SVM -C linear -i $MWAS_DIR/data/taxa/GG_100nt_even10k-adults_L7.biom -m $MWAS_DIR/data/gg-map-adults.txt -o example/svm_output -c COUNTRY -f -v FDR -s 0.05

-w: learn mode
-M: classifier type
-C: kernel type for SVM
-i: input file
-m: mapfile
-c: category name
-o: output directory
-f: perform feature selection
-v: feature selection method: fdr or rf
-s: threshold for feature selection (determines the number of features)
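
Because the -s threshold determines how many features are kept, it can be useful to compare a few values. The sketch below only prints the commands (a dry run) so they can be inspected before anything is executed. The function name `mwas_learn_cmds`, the threshold values, and the per-threshold output directory names are illustrative assumptions; every other flag is copied from the example above.

```shell
# Hypothetical helper (not part of MWAS): print one "learn" command per
# feature-selection threshold passed as an argument. Nothing is executed.
mwas_learn_cmds() {
  if [ -z "${MWAS_DIR:-}" ]; then
    echo "MWAS_DIR is not set" >&2
    return 1
  fi
  for s in "$@"; do
    # Output directory name is an assumption; pick any naming scheme you like.
    echo "Rscript $MWAS_DIR/bin/mwas_analysis.R -w learn -M SVM -C linear" \
         "-i $MWAS_DIR/data/taxa/GG_100nt_even10k-adults_L7.biom" \
         "-m $MWAS_DIR/data/gg-map-adults.txt" \
         "-o example/svm_output_s$s -c COUNTRY -f -v FDR -s $s"
  done
}
```

For example, `mwas_learn_cmds 0.01 0.05 0.1` prints three commands, one per threshold, each of which can then be copied into the terminal and run.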

R version (in R Console)

If you are familiar with R, you can work with your data more flexibly. Here is the same example as in the command-line version.

# Load all MWAS library functions (replace MWAS_DIR with the path to your MWAS directory)
file.sources <- list.files("MWAS_DIR/lib", pattern="\\.R$", full.names=TRUE, ignore.case=TRUE)
invisible(sapply(file.sources, source, .GlobalEnv))

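# Relative paths below assume the working directory is the MWAS directory.
opts <- list()
opts$mode <- "learn"            # -w: learn mode
opts$method <- "SVM"            # -M: classifier type
opts$input_fp <- "data/taxa/GG_100nt_even10k-adults_L7.biom"  # -i: input file
opts$map_fp <- "data/gg-map-adults.txt"                       # -m: mapfile
opts$category <- "COUNTRY"      # -c: category name
opts$outdir <- "example/svm_learn"  # -o: output directory
opts$nfolds <- 5                # number of folds, presumably for cross-validation (not set in the command-line example)
opts$method_param <- "linear"   # -C: kernel type for SVM
opts$ftMethod <- "FDR"          # -v: feature selection method
opts$is_feat <- TRUE            # -f: perform feature selection
opts$feat_param <- 0.05         # -s: threshold for feature selection

train_params <- import.train.params(opts)  # build the training parameters from the options
best_model <- train.mwas(train_params)     # train the model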

3. MWAS "predict" Module

4. MWAS Visualization Module

5. MWAS feature "statistics" Module

6. Example 1: Learning a predictive model

7. Example 2: Prediction from an unknown dataset

8. Example 3: Taxon Statistical Analysis and Visualization

9. Common Errors and Solutions

