Merge pull request #323 from SchlossLab/improve-docs
Improve description of `run_ml()` and its args
kelly-sovacool committed Jan 20, 2023
2 parents 4df6fe7 + 23c2aa4 commit f73f8a8
Showing 80 changed files with 276 additions and 262 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
 Package: mikropml
 Title: User-Friendly R Package for Supervised Machine Learning Pipelines
-Version: 1.5.0
+Version: 1.5.0.9000
 Date: 2023-01-15
 Authors@R:
 c(person(given = "Begüm",
4 changes: 4 additions & 0 deletions NEWS.md
@@ -1,3 +1,7 @@
+# mikropml development version
+
+- Minor documentation improvements (#323, @kelly-sovacool).
+
 # mikropml 1.5.0

 - New example showing how to plot feature importances in the `parallel` vignette (#310, @kelly-sovacool).
33 changes: 25 additions & 8 deletions R/run_ml.R
@@ -1,13 +1,15 @@
 #' Run the machine learning pipeline
 #'
-#' This function runs machine learning (ML), evaluates the best model,
+#' This function splits the data set into a train & test set,
+#' trains machine learning (ML) models using k-fold cross-validation,
+#' evaluates the best model on the held-out test set,
 #' and optionally calculates feature importance using the framework
 #' outlined in Topçuoğlu _et al._ 2020 (\doi{10.1128/mBio.00434-20}).
-#' Required inputs are a dataframe with an outcome variable and other columns
-#' as features, as well as the ML method.
+#' Required inputs are a data frame (must contain an outcome variable and all
+#' other columns as features) and the ML method.
 #' See `vignette('introduction')` for more details.
 #'
-#' @param dataset Dataframe with an outcome variable and other columns as features.
+#' @param dataset Data frame with an outcome variable and other columns as features.
 #' @param method ML method.
 #' Options: `c("glmnet", "rf", "rpart2", "svmRadial", "xgbTree")`.
 #' - glmnet: linear, logistic, or multiclass regression
@@ -73,13 +75,28 @@
 #'
 #' - `trained_model`: Output of [caret::train()], including the best model.
 #' - `test_data`: Part of the data that was used for testing.
-#' - `performance`: Dataframe of performance metrics. The first column is the cross-validation performance metric, and the last two columns are the ML method used and the seed (if one was set), respectively. All other columns are performance metrics calculated on the test data. This contains only one row, so you can easily combine performance dataframes from multiple calls to `run_ml()` (see `vignette("parallel")`).
-#' - `feature_importance`: If feature importances were calculated, a dataframe where each row is a feature or correlated group. The columns are the performance metric of the permuted data, the difference between the true performance metric and the performance metric of the permuted data (true - permuted), the feature name, the ML method, the performance metric name, and the seed (if provided). For AUC and RMSE, the higher perf_metric_diff is, the more important that feature is for predicting the outcome. For log loss, the lower perf_metric_diff is, the more important that feature is for predicting the outcome.
-#'
+#' - `performance`: Data frame of performance metrics. The first column is the
+#' cross-validation performance metric, and the last two columns are the ML
+#' method used and the seed (if one was set), respectively.
+#' All other columns are performance metrics calculated on the test data.
+#' This contains only one row, so you can easily combine performance
+#' data frames from multiple calls to `run_ml()`
+#' (see `vignette("parallel")`).
+#' - `feature_importance`: If feature importances were calculated, a data frame
+#' where each row is a feature or correlated group. The columns are the
+#' performance metric of the permuted data, the difference between the true
+#' performance metric and the performance metric of the permuted data
+#' (true - permuted), the feature name, the ML method,
+#' the performance metric name, and the seed (if provided).
+#' For AUC and RMSE, the higher perf_metric_diff is, the more important that
+#' feature is for predicting the outcome. For log loss, the lower
+#' perf_metric_diff is, the more important that feature is for
+#' predicting the outcome.
 #'
 #' @section More details:
 #'
-#' For more details, please see [the vignettes](http://www.schlosslab.org/mikropml/articles/).
+#' For more details, please see
+#' [the vignettes](http://www.schlosslab.org/mikropml/articles/).
 #'
 #' @export
 #' @author Begüm Topçuoğlu, \email{topcuoglu.begum@@gmail.com}
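
Reviewer note: the revised description of `run_ml()` (split, cross-validate, evaluate on the held-out set, return a one-row `performance` data frame) can be illustrated with a short sketch. It assumes the mikropml package is installed and uses `otu_mini_bin`, the small example dataset shipped with the package, whose outcome column is `dx`. The `kfold`, `cv_times`, and seed values are arbitrary choices made only to keep the run short; they are not recommendations from this commit.

```r
# Assumption: mikropml is installed; otu_mini_bin is its bundled example
# dataset with the outcome stored in the `dx` column.
library(mikropml)

# One call: split into train/test sets, tune by k-fold cross-validation,
# then evaluate the best model on the held-out test set.
results <- run_ml(
  otu_mini_bin,
  "glmnet",
  outcome_colname = "dx",
  kfold = 2,     # small values here only to keep the example fast
  cv_times = 5,
  seed = 2019
)

# The performance element is a one-row data frame.
results$performance

# Because each call yields one row, results from several seeds
# can be stacked into a single data frame.
seeds <- c(1, 2, 3)
perf_all <- do.call(rbind, lapply(seeds, function(s) {
  run_ml(
    otu_mini_bin,
    "glmnet",
    outcome_colname = "dx",
    kfold = 2,
    cv_times = 5,
    seed = s
  )$performance
}))
nrow(perf_all)
```

Base `rbind()` is used here to avoid an extra dependency; any row-binding helper would serve the same purpose for combining runs.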
2 changes: 1 addition & 1 deletion docs/dev/CODE_OF_CONDUCT.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion docs/dev/CONTRIBUTING.html
2 changes: 1 addition & 1 deletion docs/dev/LICENSE-text.html
2 changes: 1 addition & 1 deletion docs/dev/LICENSE.html
2 changes: 1 addition & 1 deletion docs/dev/SUPPORT.html
2 changes: 1 addition & 1 deletion docs/dev/articles/index.html
2 changes: 1 addition & 1 deletion docs/dev/articles/introduction.html
2 changes: 1 addition & 1 deletion docs/dev/articles/paper.html
43 changes: 30 additions & 13 deletions docs/dev/articles/parallel.html
2 changes: 1 addition & 1 deletion docs/dev/articles/preprocess.html
2 changes: 1 addition & 1 deletion docs/dev/articles/tuning.html
2 changes: 1 addition & 1 deletion docs/dev/authors.html
2 changes: 1 addition & 1 deletion docs/dev/index.html
